My pooled paired-end (PE) RNA-Seq data was demultiplexed by the sequencing facility, so the data I receive is a directory with subdirectories for each sample, each containing the R1 and R2 FASTQ files for each lane (i.e., the main directory "FISH_RNA_SEQ" has 96 folders, each labeled by sample, like "SpA.Treatment1.Rep1", etc.). If I am in the subdirectory for a particular sample, I can concatenate across lanes and write to a file in a new directory, so that I have two files per sample (R1/R2), using this for loop:
for SUFFIX in R1_001.fastq R2_001.fastq
do
cat *L001_$SUFFIX *L002_$SUFFIX *L003_$SUFFIX > ../test.cat.DDIG/samplename_cat_$SUFFIX
done
However, this requires me to run it manually for each of the 96 samples, going into each subdirectory and typing in the desired output name. Since I will have to repeat this in the future, does anyone have suggestions for how to use a nested for loop (or some other way) to do this automatically from the main directory for all subdirectories, naming the output files by the subdirectory (i.e., sample name)?
Working from the main directory, I was testing something like:
for dir in *; do
(for SUFFIX in R1_001.fastq R2_001.fastq
do
cat *L001_$SUFFIX *L002_$SUFFIX *L003_$SUFFIX > ../test.cat.DDIG/test_cat_$SUFFIX
done)
But this doesn't seem to work, and it doesn't solve the problem of naming the output files according to the sample names. Any suggestions appreciated!
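For reference, here is a minimal sketch of one way the nested loop could be written. It assumes bash, exactly three lanes (L001 to L003) per sample, and uncompressed .fastq files named as in the loop above. The function name concat_lanes and its two arguments are hypothetical, chosen only for illustration. The outer loop walks the sample subdirectories, takes each subdirectory name as the sample name, and skips the output directory in case it sits inside the main directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from any library): for every sample sub-directory
# of main_dir, concatenate the per-lane FASTQ files and write
# <sample>_cat_<suffix> into out_dir.
concat_lanes() {
    local main_dir=$1 out_dir=$2
    local dir sample suffix
    mkdir -p "$out_dir"
    for dir in "$main_dir"/*/; do
        [ -d "$dir" ] || continue                # glob matched nothing
        [ "$dir" -ef "$out_dir" ] && continue    # do not treat the output dir as a sample
        sample=$(basename "$dir")                # sub-directory name = sample name
        for suffix in R1_001.fastq R2_001.fastq; do
            # Lanes are listed explicitly so R1 and R2 are joined in the same order.
            # "$dir" already ends in a slash because of the */ glob.
            cat "$dir"*L001_"$suffix" "$dir"*L002_"$suffix" "$dir"*L003_"$suffix" \
                > "$out_dir/${sample}_cat_${suffix}"
        done
    done
}
```

Run from the directory that contains FISH_RNA_SEQ, a call such as `concat_lanes FISH_RNA_SEQ FISH_RNA_SEQ/test.cat.DDIG` should leave two files per sample, e.g. SpA.Treatment1.Rep1_cat_R1_001.fastq and SpA.Treatment1.Rep1_cat_R2_001.fastq. If a sample is missing a lane, cat will report the unmatched pattern as a missing file, so it is worth checking the output for errors.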