Transfer your data
If you want to use your own data, you should transfer the FASTQ files into your project folder /shared/projects/YourProjectName
before doing your analysis. Alternatively the workflow allows you to download data from SRA simply giving the SRRxxx
IDs, see below metadata.tsv.
FASTQ names
The workflow is expecting gzip-compressed FASTQ files with names formatted as
- SampleName_R1.fastq.gz
and SampleName_R2.fastq.gz
for pair-end data,
- SampleName.fastq.gz
for single-end data.
If your files are not fitting this format, please see how to correct the names of a batch of FASTQ files.
Generate md5sum
It is highly recommended to check the md5sum for big files. If your raw FASTQ files are on your computer in PathTo/MethylProject/Fastq/
, you type in a terminal:
You@YourComputer:~$ cd PathTo/MethylProject
You@YourComputer:~/PathTo/MethylProject$ md5sum Fastq/* > Fastq/fastq.md5
Copy to the cluster
You can then copy the Fastq
folder to the cluster using rsync
, replacing username
by your login:
You@YourComputer:~/PathTo/MethylProject$ rsync -avP Fastq/ username@core.cluster.france-bioinformatique.fr:/shared/projects/YourProjectName/Raw_fastq
In this example the FASTQ files are copied from PathTo/MethylProject/Fastq/
on your computer into a folder named Raw_fastq
in your project folder on IFB core cluster. On iPOP-UP cluster, only the address is different:
You@YourComputer:~/PathTo/MethylProject$ rsync -avP Fastq/ username@ipop-up.rpbs.univ-paris-diderot.fr:/shared/projects/YourProjectName/Raw_fastq
Feel free to name your folders as you want! You will be asked to enter your password, and then the transfer will begin. If it stops before the end, rerun the last command, it will only add the incomplete/missing files.
Check md5sum
After the transfer, connect to the cluster (IFB, iPOP-UP) and check the presence of the files in Raw_fastq
using ls
or ll
command.
Check that the transfer went fine using md5sum
.