SEQanswers

12-30-2012, 11:02 AM   #1
wbsimey
Junior Member
 
Location: san francisco

Join Date: Jul 2010
Posts: 9
Append two fastq files repeatedly

Hello, I am on an Ubuntu 12 system.
I am trying to write a bash loop to run
Code:
cat /data/rad1/ang_TP30124.fastq  /data/rad2/ang_TP30124.fastq >  /data/rad3/ang_TP30124.fastq
for 184 files.

I have 184 files with identical names in each of two directories (/data/rad1 and /data/rad2). I want to concatenate each pair of like-named fastq files into a single file of the same name in a third directory, /data/rad3.

I am just learning and have come up with:
Code:
topdir=/data/rad3
dir1=/data/rad1
dir2=/data/rad1

for f in $topdir/$dir1/*.fastq
do
    outf=$topdir/`basename $f .fastq`
    cp $f $outf
    cat $topdir/$dir2/`basename $f` >> $outf
done
which is not working. Any advice for a beginner?
12-30-2012, 11:53 AM   #2
BAMseek
Senior Member
 
Location: St. Louis, MO, USA

Join Date: Apr 2011
Posts: 124

Here are a couple of things I noticed.

1. dir2 should be set to /data/rad2
2. instead of $topdir/$dir1, I think it should just be $dir1 (likewise for $dir2)
3. `basename $f` on its own keeps the .fastq extension, so drop the `.fastq` suffix argument (which strips it); the output file then has the same name as the inputs

This would be my attempt at it:

Code:
topdir=/data/rad3
dir1=/data/rad1
dir2=/data/rad2

for f in "$dir1"/*.fastq
do
    outf="$topdir/$(basename "$f")"
    echo "$outf"    # print the target path to check it while debugging
    cp "$f" "$outf"
    cat "$dir2/$(basename "$f")" >> "$outf"
done
I threw in an echo command, just to illustrate the usefulness of printing out your variables when debugging to see if the variable gets set to what you think it should be.

Also, it might be more straightforward to do "cat a b > c" rather than "cp a c; cat b >> c;"

So maybe the body could be replaced with

Code:
outf="$topdir/$(basename "$f")"
cat "$f" "$dir2/$(basename "$f")" > "$outf"
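Putting the points above together, a slightly more defensive version of the whole loop might look like the sketch below. The function name merge_fastq_dirs, the check for a missing partner file, and the mkdir -p call are illustrative additions, not part of the original suggestion.

```shell
#!/usr/bin/env bash
# Sketch: concatenate same-named FASTQ files from two directories into a third.
# merge_fastq_dirs is a hypothetical helper name used only for illustration.
merge_fastq_dirs() {
    local dir1="$1" dir2="$2" outdir="$3"
    local f name
    mkdir -p "$outdir"                   # assumption: create the target if missing
    for f in "$dir1"/*.fastq; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=${f##*/}                    # same result as `basename "$f"` here
        if [ ! -f "$dir2/$name" ]; then
            echo "skipping $name: no partner in $dir2" >&2
            continue
        fi
        cat "$f" "$dir2/$name" > "$outdir/$name"
    done
}

# Example call with the directories from this thread:
# merge_fastq_dirs /data/rad1 /data/rad2 /data/rad3
```

Quoting every expansion keeps file names with spaces intact, and skipping unmatched files avoids writing a half-merged output.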
Hope that helps.
Justin
12-30-2012, 12:58 PM   #3
wbsimey
Junior Member
 
Location: san francisco

Join Date: Jul 2010
Posts: 9

Thank you Justin, that worked perfectly. As you suggested, I didn't need the 'cp' command.
12-31-2012, 01:38 PM   #4
malcook
Member
 
Location: 66206

Join Date: Sep 2009
Posts: 23
Time to learn xargs?

Here is a one-liner that uses xargs:

Code:
basename -a /data/rad1/*.fastq | xargs -t -I {} bash -c 'cat /data/rad1/"{}" /data/rad2/"{}" > /data/rad3/"{}"'
notes:
  • this works on my mac/OSX; if you're on linux the options for xargs might be different
  • also, you might need to upgrade your GNU coreutils to get a version of basename that supports -a
  • xargs also supports -P for running multiple cats in parallel; if you have multiple processors, this could speed things up
  • the double quotes are protection against odd characters (such as spaces) in your filenames, if any
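The -P idea from the notes above can be sketched as follows for a Linux system. This is an illustration under assumptions, not the one-liner from this post: it relies on GNU find (-printf) and GNU xargs (-0, -r, -P), the function name merge_fastq_parallel and the default of 4 jobs are made up for the example, and it passes each file name to bash as a positional argument instead of substituting {} into the command string.

```shell
#!/usr/bin/env bash
# Sketch: merge same-named FASTQ files from two directories, several at a time.
# Assumes GNU findutils; merge_fastq_parallel is a hypothetical helper name.
merge_fastq_parallel() {
    local dir1="$1" dir2="$2" outdir="$3" jobs="${4:-4}"
    mkdir -p "$outdir"
    # find prints bare file names, NUL-terminated, so no basename call is needed.
    # xargs appends one name per bash invocation; inside bash -c it is "$4".
    find "$dir1" -maxdepth 1 -type f -name '*.fastq' -printf '%f\0' |
        xargs -0 -r -n 1 -P "$jobs" \
            bash -c 'cat "$1/$4" "$2/$4" > "$3/$4"' _ "$dir1" "$dir2" "$outdir"
}

# Example call with the directories from this thread, 8 jobs at once:
# merge_fastq_parallel /data/rad1 /data/rad2 /data/rad3 8
```

Because cat is mostly limited by disk throughput, the gain from many parallel jobs may be modest; it is worth timing a few values of the job count before settling on one.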
12-31-2012, 02:08 PM   #5
wbsimey
Junior Member
 
Location: san francisco

Join Date: Jul 2010
Posts: 9

Thanks malcook. I will play with this, especially the xargs -P option as I have 32 cores to play with.
12-31-2012, 02:34 PM   #6
wbsimey
Junior Member
 
Location: san francisco

Join Date: Jul 2010
Posts: 9

Hi malcook, I tried to run this xargs one-liner, but I keep getting a basename error - "basename: invalid option -- 'a'"
I have the latest Ubuntu coreutils (8.13-3ubuntu3.1).
I looked at the basename man page and there are no listed options, only help and version.
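One possible workaround, when the installed basename rejects -a, is to produce the same list of names with shell parameter expansion. The sketch below is only a stand-in: the function name basenames is hypothetical, and unlike the real basename it does not handle paths with trailing slashes, which should not matter for names produced by a *.fastq glob.

```shell
#!/usr/bin/env bash
# Sketch: print the final path component of each argument, one per line,
# as a stand-in for `basename -a` on systems where -a is not supported.
# basenames is a hypothetical helper name.
basenames() {
    local f
    for f in "$@"; do
        printf '%s\n' "${f##*/}"   # strip everything up to the last slash
    done
}

# Example: feed the names into the xargs one-liner from this thread.
# basenames /data/rad1/*.fastq | xargs -t -I {} bash -c 'cat /data/rad1/"{}" /data/rad2/"{}" > /data/rad3/"{}"'
```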
All times are GMT -8.

