SEQanswers

09-09-2012, 10:27 PM   #1
hosseinv
Junior Member
 
Location: Melbourne

Join Date: Aug 2012
Posts: 8
Merge multiple *.sai files in bwa?

I have Illumina paired-end reads from a single individual, distributed across 3 lanes as shown below:

C09FNACXX_EGLOB-140092_GTCAGT_L008_R1_001.fastq
C09FNACXX_EGLOB-140092_GTCAGT_L008_R1_002.fastq
C09FNACXX_EGLOB-140092_GTCAGT_L008_R2_001.fastq
C09FNACXX_EGLOB-140092_GTCAGT_L008_R2_002.fastq

D0H2WACXX_EGLOB-140092_GTCAGT_L001_R1_001.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L001_R1_002.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L001_R2_001.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L001_R2_002.fastq

D0H2WACXX_EGLOB-140092_GTCAGT_L002_R1_001.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L002_R1_002.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L002_R2_001.fastq
D0H2WACXX_EGLOB-140092_GTCAGT_L002_R2_002.fastq

They all represent one individual (140092).
I am mapping these with BWA.
Question:
Are R1_001 and R2_001 in each lane the two mates of the read pairs?
If yes, I can probably run the aln command as follows:
bwa aln refrencegenome.fa C09FNACXX_EGLOB-140092_GTCAGT_L008_R1_001.fastq > aln_L008_1_1.sai
bwa aln refrencegenome.fa C09FNACXX_EGLOB-140092_GTCAGT_L008_R1_002.fastq > aln_L008_1_2.sai
bwa aln refrencegenome.fa C09FNACXX_EGLOB-140092_GTCAGT_L008_R2_001.fastq > aln_L008_2_1.sai
bwa aln refrencegenome.fa C09FNACXX_EGLOB-140092_GTCAGT_L008_R2_002.fastq > aln_L008_2_2.sai
.
..
...
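Whether R1_001 and R2_001 really hold mates can be checked from the read names themselves. The following is only a minimal sketch: `read_names` and `same_read_order` are hypothetical helper names, and it assumes plain four-line FASTQ records in which mates share the first whitespace-separated field of the header (optionally followed by an old-style /1 or /2 suffix).

```shell
# Sketch only: read_names and same_read_order are hypothetical helpers.
# Assumes uncompressed FASTQ with exactly four lines per record.

# Print one read name per record: the first field of each header line,
# without the leading "@" and without a trailing /1 or /2.
read_names() {
    awk 'NR % 4 == 1 {
        name = $1
        sub(/^@/, "", name)
        sub(/\/[12]$/, "", name)
        print name
    }' "$1"
}

# Exit status 0 only if both files list the same names in the same order.
same_read_order() {
    names1=$(mktemp) && names2=$(mktemp) || return 2
    read_names "$1" > "$names1"
    read_names "$2" > "$names2"
    cmp -s "$names1" "$names2"
    status=$?
    rm -f "$names1" "$names2"
    return "$status"
}
```

For example, `same_read_order X_R1_001.fastq X_R2_001.fastq && echo paired` would print "paired" only when the two files agree record by record.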
If yes, then how can I merge them in bwa sampe?
The original command is: bwa sampe ref.fa aln1.sai aln2.sai R1.fq R2.fq > aln.sam
Here I am confused: I don't actually have aln1.sai aln2.sai R1.fq R2.fq; instead I will have
aln1_1.sai aln1_2.sai aln2_1.sai aln2_2.sai for each lane.
So how can I merge all of the *.sai files into a single final_aln.sam?

Maybe I should treat R1_001 and R2_001 as paired reads and run the bwa sampe command as follows:
bwa sampe ref.fa aln_L008_1_1.sai aln_L008_2_1.sai L008_R1_001.fastq L008_R2_001.fastq > aln_L008_1.sam
bwa sampe ref.fa aln_L008_1_2.sai aln_L008_2_2.sai L008_R1_002.fastq L008_R2_002.fastq > aln_L008_2.sam
.
..
..
then convert the SAM files to BAM files
and merge them with samtools, like this:
samtools merge final.bam aln_L008_1.bam aln_L008_2.bam aln_L001_1.bam aln_L001_2.bam aln_L002_1.bam aln_L002_2.bam
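The per-chunk plan above can be written as a loop. This is a hedged sketch rather than a tested pipeline: `print_chunk_commands` is a hypothetical helper that only prints the commands so they can be inspected first, it assumes the mate file name is obtained by replacing `_R1_` with `_R2_`, and the sorting step assumes samtools 1.3 or newer (`samtools sort -o`), since samtools merge expects sorted input. `ref.fa` and `final.bam` are placeholder names.

```shell
# Sketch only: prints the commands instead of running them.
# print_chunk_commands is a hypothetical helper; output names are placeholders.
print_chunk_commands() {
    ref=$1
    shift
    bams=""
    for r1 in "$@"; do
        # Assumption: the mate file differs only in the _R1_ / _R2_ tag.
        r2=$(printf '%s\n' "$r1" | sed 's/_R1_/_R2_/')
        # Chunk label without the read tag, e.g. X_L008_R1_001 -> X_L008_001.
        base=$(basename "$r1" .fastq | sed 's/_R1_/_/')
        echo "bwa aln $ref $r1 > ${base}_1.sai"
        echo "bwa aln $ref $r2 > ${base}_2.sai"
        echo "bwa sampe $ref ${base}_1.sai ${base}_2.sai $r1 $r2 > ${base}.sam"
        # Assumption: samtools >= 1.3, which sorts SAM input straight to BAM.
        echo "samtools sort -o ${base}.bam ${base}.sam"
        bams="$bams ${base}.bam"
    done
    # One merge over every sorted per-chunk BAM.
    echo "samtools merge final.bam$bams"
}
```

Calling `print_chunk_commands ref.fa *_R1_*.fastq` lists four commands per chunk followed by a single merge; the output can be reviewed and then piped to `sh` if it looks right.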


I hope the question is clear.
I will be thankful to anyone who could help me with this question.
Cheers, Hossein.

Last edited by hosseinv; 09-09-2012 at 11:38 PM.
09-10-2012, 10:13 AM   #2
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390

Or you could just concatenate the R1 fastq files together into one file and repeat in the same order for the R2 fastq files. Then you have two fastq files for the one individual and can go through the normal process of mapping the reads without having to merge alignments later.
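A minimal sketch of that concatenation, assuming uncompressed files whose mate names differ only by `_R1_` versus `_R2_`; `concat_mates` is a hypothetical helper name. Deriving each R2 name from its R1 name is one way to guarantee that both outputs are built in the same order.

```shell
# Sketch only: concat_mates is a hypothetical helper.
# Usage: concat_mates OUT_PREFIX R1_FILE [R1_FILE ...]
concat_mates() {
    out_prefix=$1
    shift
    # Start from empty output files.
    : > "${out_prefix}_R1.fastq"
    : > "${out_prefix}_R2.fastq"
    for r1 in "$@"; do
        # Assumption: the mate file differs only in the _R1_ / _R2_ tag.
        r2=$(printf '%s\n' "$r1" | sed 's/_R1_/_R2_/')
        [ -f "$r2" ] || { echo "missing mate file: $r2" >&2; return 1; }
        # Append each chunk and its mate in the same iteration,
        # so record N of the R1 output pairs with record N of the R2 output.
        cat "$r1" >> "${out_prefix}_R1.fastq"
        cat "$r2" >> "${out_prefix}_R2.fastq"
    done
}
```

After something like `concat_mates 140092 *_R1_*.fastq`, the two resulting files can go through bwa aln and a single bwa sampe call exactly as in the original two-file command.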
09-10-2012, 04:47 PM   #3
hosseinv
Junior Member
 
Location: Melbourne

Join Date: Aug 2012
Posts: 8

I've heard that combining fastq files is not really recommended, as there might be issues like fragment length penalties and ..., so it is best to merge BAM files. So I take it you agree with the approach I wrote above, yes?
09-11-2012, 12:49 AM   #4
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390

Quote:
Originally Posted by hosseinv
I've heard that combining fastq files is not really recommended, as there might be issues like fragment length penalties and ..., so it is best to merge BAM files. So I take it you agree with the approach I wrote above, yes?
Could you elaborate on this fragment length penalty issue? I've only seen this referenced with respect to Novoalign, and you're using bwa.

What I suggested is the most parsimonious solution, and I certainly wouldn't have suggested it if I wouldn't do it myself.
09-15-2012, 06:50 PM   #5
hosseinv
Junior Member
 
Location: Melbourne

Join Date: Aug 2012
Posts: 8

Hi Bukowski. As I mentioned, I had only heard about that, but you are right: it happens with Novoalign. It shouldn't be a problem with BWA.

Could you please point me to a website or manual that describes the BWA and SAMtools options in detail? The ones at http://bio-bwa.sourceforge.net/ and http://samtools.sourceforge.net/ are not always clear.

Thanks again for your attention.

Hosseinv