SEQanswers

Old 10-17-2017, 07:59 AM   #1
JQL
Member
 
Location: MO, USA

Join Date: Apr 2011
Posts: 83
shell script for copying files with consecutive numbers

Hi all,

I have 170 fastq files from a MiSeq run. They are named 1_Sxx_L001…, 2_Sxx_L001…, 3_Sxx_L001… and so on, numbered consecutively.

I only need files 1 to 120. How can I copy those quickly? I was thinking about using a for loop, but I keep getting errors.

I could try this, but it is inefficient and error-prone:
cp [1-9]_*.gz;
cp [1-9][0-9]_*.gz;
cp [1-9][0-1][0-9]_*.gz;
cp 120_*.gz

thanks in advance.

example file names:

100_S122_L001_R1_001.fastq.gz 126_S145_L001_R2_001.fastq.gz 152_S168_L001_R1_001.fastq.gz 23_S71_L001_R2_001.fastq.gz 4_S36_L001_R1_001.fastq.gz 75_S34_L001_R2_001.fastq.gz
100_S122_L001_R2_001.fastq.gz 127_S155_L001_R1_001.fastq.gz 152_S168_L001_R2_001.fastq.gz 24_S82_L001_R1_001.fastq.gz 4_S36_L001_R2_001.fastq.gz 76_S45_L001_R1_001.fastq.gz

Last edited by JQL; 10-17-2017 at 08:02 AM.
Old 10-17-2017, 08:10 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,046

Where are you copying them from or to?

You can easily generate the numbers you need with a bash for loop like this:
Code:
for i in `seq 1 120`; do echo $i; done
Old 10-17-2017, 08:16 AM   #3
JQL

Those files belong to 3 projects. Let's say I want to copy the 120 files to the directory called project1.

for i in `seq 1 120`
do
cp "$1_*.fastq.gz" ~/project1/ ## it doesn't work.
done

thanks

Old 10-17-2017, 08:19 AM   #4
GenoMax

That should be a $i (not 1).
Code:
for i in `seq 1 120`
do
cp $i\_*.fastq.gz ~/project1/
done
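
Another way to write the same loop, if the backslash looks odd, is to put braces around the variable name so that bash knows where the name ends. Below is a minimal sketch that can be tried safely: it runs in scratch directories made with mktemp and uses empty stand-in files whose names only mimic the real ones. The names src and dest and the S numbers are made up for the demo.

```shell
# Scratch demo with empty stand-in files, not real reads.
src=$(mktemp -d)    # hypothetical source directory
dest=$(mktemp -d)   # hypothetical stand-in for ~/project1/
cd "$src" || exit 1

# Create paired-end stand-ins for samples 1..170 (S numbers are invented).
for i in $(seq 1 170); do
    touch "${i}_S${i}_L001_R1_001.fastq.gz" "${i}_S${i}_L001_R2_001.fastq.gz"
done

# ${i}_ makes it explicit that the variable is i and the underscore is literal.
# The glob stays outside the quotes so that it is still expanded.
for i in $(seq 1 120); do
    cp "${i}"_*.fastq.gz "$dest"/
done

copied=$(ls "$dest" | wc -l)
echo "$copied"      # 240 in this demo (120 samples x 2 reads)
```

With the real data you would keep only the second loop and point it at ~/project1/.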

Last edited by GenoMax; 10-17-2017 at 08:38 AM. Reason: Escaped _
Old 10-17-2017, 08:37 AM   #5
JQL

How come I copied all 170 files (they are PE, so 340) over? It should be 240. Was it due to the "_"?

$ for i in `seq 1 120`; do cp $i_*fastq.gz ~/project1/; done

$ ls |wc -l
340
Old 10-17-2017, 08:40 AM   #6
GenoMax

The _ should have been escaped. I have corrected the code above. Replace "cp" with "echo" first to make sure everything looks right.

Last edited by GenoMax; 10-17-2017 at 08:44 AM.
Old 10-17-2017, 08:58 AM   #7
JQL

yes, it was the "_".

thanks!

$ ls | wc -l
240
Old 10-20-2017, 11:30 AM   #8
JQL

What does the underscore mean when it is not escaped?

Old 10-20-2017, 11:39 AM   #9
GenoMax

Run the following to see the difference:
Code:
for i in `seq 1 120`
do
echo cp $i_*.fastq.gz ~/project1/
done
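
The expansion can also be looked at in isolation, without touching any files. Letters, digits and underscores are all allowed in a shell variable name, so `$i_` is read as a variable called `i_`, not as `$i` followed by `_`. Assuming no variable named `i_` is set, it expands to nothing and the pattern becomes plain `*.fastq.gz`. A small sketch (the value 7 and the names unescaped, braced and escaped are only for illustration):

```shell
i=7
unset i_    # make sure no variable named i_ exists

# Glob patterns are not expanded in assignments, so these store the
# pattern text exactly as the shell would see it before matching files.
unescaped="$i_*.fastq.gz"    # $i_ is the (unset) variable i_
braced="${i}_*.fastq.gz"     # braces end the variable name after i
escaped=$i\_*.fastq.gz       # backslash ends the name, then is removed

printf '%s\n' "$unescaped"   # *.fastq.gz
printf '%s\n' "$braced"      # 7_*.fastq.gz
printf '%s\n' "$escaped"     # 7_*.fastq.gz
```

The first pattern matches every fastq.gz file in the directory, which is why all 340 files were copied.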
Old 10-21-2017, 04:19 PM   #10
JQL

I ran it before. It copied all the *.gz files over (more than 2 x 120).
Interestingly, when I removed the copies in project1,
they were copied again; I removed them and they were recopied.

Any explanation?
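
A possible explanation, assuming the loop was still running when the copies were removed: with the unescaped `$i_`, each of the 120 iterations runs `cp *.fastq.gz ~/project1/`, so the whole set is copied 120 times over and anything deleted from project1 reappears on the next pass. One way to catch this kind of slip is bash's nounset option (`set -u`, or `bash -u`), which turns a reference to an unset variable into an error. A sketch, again assuming `i_` is not set in the environment:

```shell
# Run the loop body once in a child bash with nounset enabled.
# The single quotes keep the current shell from expanding anything.
msg=$(bash -u -c 'i=7; echo cp $i_*.fastq.gz ~/project1/' 2>&1)
status=$?

echo "exit status: $status"   # non-zero: bash stopped before running echo
echo "message: $msg"          # bash complains about the variable i_
```

Putting `set -u` at the top of a script gives the same protection for the whole script.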

All times are GMT -8.

