SEQanswers

Old 06-27-2012, 10:15 AM   #1
milesgr
Member
 
Location: NJ

Join Date: Jun 2010
Posts: 34
Default Qsub bwa pipeline...

This may seem like a relatively easy question, but here goes:

I am trying to create a relatively simple shell script that sits in a parent folder, goes through all subfolders, grabs all files with the extension .fq, and aligns them using BWA aln. The caveat is that I need to use qsub to do this, or I will make the admins very unhappy.

My initial script was something along the lines of the following:

result=(`find . -name "*.fq" -type f`)

for i1 in ${result[@]}
do

qsub -q queue_name -l nodes=1:ppn=8 -V test.sh

done

It should be noted that the first portion simply gets the filenames (and paths) and the second calls test.sh, which is shown below (note that the first two variable paths are truncated for forum purposes):

REFERENCE=/refpath
BWA_HOME=/bwapath

result=(`find . -name "*.fq" -type f`)

for i1 in ${result[@]}
do

"$BWA_HOME/bwa" aln -t 8 "$REFERENCE" "$i1" > "$i1.sai"

done

I am simply trying to get BWA to run using $i1 as the input filename and $i1.sai as the output name. The reason I ran the for loop both here and in the initial script is that there seems to be a problem holding on to $i1. If I run the bottom script as a standalone script, BWA begins running on the first file. The key is that I'm not certain how to get it to run qsub X times (where X is the number of .fq files), each time with a different file/output. Any help would be greatly appreciated. Thanks!
Old 06-28-2012, 10:44 AM   #2
milesgr
Member
 
Location: NJ

Join Date: Jun 2010
Posts: 34
Default

Anyone?
Old 06-28-2012, 11:07 AM   #3
bjchen
Junior Member
 
Location: New York

Join Date: Jan 2012
Posts: 9
Default

What if you change your test.sh to just take one .fq file at a time and you pass the file name to test.sh in the for-loop of your initial script?
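
A minimal sketch of what this suggestion could look like, assuming a Torque/PBS `qsub` that accepts `-v` to pass a variable into the job's environment. The function name `submit_all` and the variable name `FQ` are invented for illustration; `queue_name`, `test.sh`, and the truncated paths are left as posted, and the resource request is spelled `nodes=1:ppn=8`, the usual Torque form.

```shell
#!/bin/sh
# Launcher sketch: one qsub call per .fq file, passing the file name via -v.
# submit_all and FQ are illustrative names, not part of the original scripts.
submit_all() (
    # Use an absolute path, since a job may not start in the submit directory.
    root=$(cd "${1:-.}" && pwd) || exit 1
    find "$root" -name '*.fq' -type f |
    while IFS= read -r fq; do
        # </dev/null keeps qsub from consuming the file list the loop is reading.
        qsub -q queue_name -l nodes=1:ppn=8 -v "FQ=$fq" test.sh < /dev/null
    done
)

# test.sh would then shrink to a single alignment, with no find and no loop:
#   REFERENCE=/refpath
#   BWA_HOME=/bwapath
#   "$BWA_HOME/bwa" aln -t 8 "$REFERENCE" "$FQ" > "$FQ.sai"
```

Running `submit_all .` from the parent folder would submit one job per file. Since `-v` takes a comma-separated list, a path containing a comma would need different handling.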
Old 06-28-2012, 08:09 PM   #4
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

If you are interested, check out our pipeline (www.keatslab.org). It's a qsub-based program. We build two flavours: one reads all the fastq pairs in a folder, aligns them, then calls variants; the other takes an input file that allows the proper RG tag assignment required for GATK variant calling. A nice thing is that it will merge multiple lanes of the same library together for you.
Old 06-29-2012, 03:01 AM   #5
alec
Member
 
Location: Cambridge, MA

Join Date: Apr 2011
Posts: 18
Default

You'll probably find it more convenient to use a single job array than submitting many jobs in a loop.
In your launch script do a single qsub:
qsub -t 0-$(( ${#result[@]} - 1 )) test.sh
In test.sh the environment variable PBS_ARRAYID tells you which job is running:
i1=${result[$PBS_ARRAYID]}
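
One way to fill in the two halves of the job-array approach, as a sketch rather than a drop-in script. It assumes Torque, where `qsub -t` creates the array and each task sees its index in `PBS_ARRAYID`. Because `test.sh` runs as a separate process and does not inherit the launcher's `result` array, the sketch writes the sorted file list to disk once and has each task read its own line; `fq_list.txt`, `FQ_LIST`, `submit_array`, and `pick_file` are illustrative names.

```shell
#!/bin/sh
# Launcher side: write the file list once, then submit a single array job.
submit_array() (
    root=$(cd "${1:-.}" && pwd) || exit 1
    list=$root/fq_list.txt                      # hypothetical list file
    find "$root" -name '*.fq' -type f | sort > "$list"
    n=$(grep -c '' "$list")                     # number of lines in the list
    [ "$n" -gt 0 ] || { echo "no .fq files under $root" >&2; exit 1; }
    # Array indices run from 0 to n-1, one task per file.
    qsub -t "0-$((n - 1))" -q queue_name -l nodes=1:ppn=8 \
         -v "FQ_LIST=$list" test.sh
)

# Job side: map a 0-based array index to one line of the list.
# $1 is the list file, $2 is the index (PBS_ARRAYID inside a job).
pick_file() {
    sed -n "$(( $2 + 1 ))p" "$1"                # sed line numbers are 1-based
}

# Inside test.sh this could be used as:
#   fq=$(pick_file "$FQ_LIST" "$PBS_ARRAYID")
#   "$BWA_HOME/bwa" aln -t 8 "$REFERENCE" "$fq" > "$fq.sai"
```

Sorting the list keeps the index-to-file mapping stable, which plain `find` output does not guarantee between runs. Other schedulers name the option and the variable differently, so both would need checking against the local installation.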
Old 07-02-2012, 08:33 AM   #6
milesgr
Member
 
Location: NJ

Join Date: Jun 2010
Posts: 34
Default

Got it working using:

sh "/filepath/test.sh $i1" | qsub -q queue_name -l nodes=1:ppn=8

Not positive why I had to go about it in such a roundabout way.
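
A guess at why the piped form works, offered as a sketch: when no script file is named, Torque's `qsub` reads the job script from standard input, so a one-line script can carry the file name with it. The line as posted would need something like `echo` or `printf` to produce that text, which is an assumption here; `submit_one` is a made-up name, and `/filepath/test.sh` stays truncated as in the post. For this to work, `test.sh` would read the file name from `$1` instead of running `find`.

```shell
#!/bin/sh
# Build a one-line job script and pipe it to qsub, which reads the script
# from standard input when no script file is given on the command line.
submit_one() {
    # $1: path to a single .fq file; it is baked into the generated script text
    printf 'sh /filepath/test.sh "%s"\n' "$1" |
        qsub -q queue_name -l nodes=1:ppn=8
}

# In the launcher loop this could be called once per file:
#   for i1 in "${result[@]}"; do submit_one "$i1"; done
```

The file name is expanded when the text is generated, so each job receives its own copy of the path and nothing has to survive in the job's environment.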
All times are GMT -8.