SEQanswers

Old 09-21-2012, 02:27 PM   #1
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130
BWA - fail to infer insert size: too few good pairs

I have some MiSeq data that I am trying to align with BWA to hg19.

This is the command:
bwa sampe -P <dir>/BWAIndex/genome.fa 1_1.sai 1_2.sai 1_1.fastq 1_2.fastq > out.sam

This is the output I am seeing:
Code:
[bwa_sai2sam_pe_core] convert to sequence coordinate... 
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 0.10 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.10 sec
[bwa_sai2sam_pe_core] print alignments... 1.06 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate... 
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 0.06 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.08 sec
[bwa_sai2sam_pe_core] print alignments... 1.13 sec
[bwa_sai2sam_pe_core] 524288 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate... 
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 0.01 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.01 sec
[bwa_sai2sam_pe_core] print alignments... 0.16 sec
[bwa_sai2sam_pe_core] 569473 sequences have been processed.
[main] Version: 0.6.2-r126
It outputs a SAM file, but nothing is aligned. The same data was processed through Illumina's BaseSpace service, which uses BWA, and it produced results, so there should not be anything wrong with the actual reads. I also tried aligning with Bowtie2 and got reasonable results, so the FASTQ files should not be corrupted.

I have no idea what's wrong and I can't seem to find any record of such an error anywhere online. How do I troubleshoot this?
Old 09-21-2012, 09:55 PM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

It means what it says. The software thinks that almost none of your reads aligned. Are you 100% sure the bwa aln steps are correct?
Old 09-22-2012, 10:03 AM   #3
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by swbarnes2 View Post
It means what it says. The software thinks that almost none of your reads aligned. Are you 100% sure the bwa aln steps are correct?
I am using this for bwa aln:
bwa aln <dir>/BWAIndex/genome.fa 1_1.fastq > 1_1.sai
bwa aln <dir>/BWAIndex/genome.fa 1_2.fastq > 1_2.sai

If there was a problem with bwa aln, I assume sai files would not get generated or at least there would be some error shown.
Old 09-22-2012, 01:52 PM   #4
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

What I meant was, double check to make sure that you used the same reference genome in all the commands, and that it is the right reference genome, and that you got all the file names right in all the commands.

One quick troubleshooting step is to make a single-end SAM from one FASTQ file and confirm that it looks okay.
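
A minimal sketch of how that single-end check could be quantified, assuming the SAM file is plain tab-separated text. The function name `sam_mapped_summary` is hypothetical. It only reads the FLAG column (column 2) and counts a record as unmapped when bit 0x4 is set, which is how the SAM format marks unmapped reads.

```shell
# Hypothetical helper: count mapped and unmapped records in a SAM file.
# Usage: sam_mapped_summary FILE.sam
sam_mapped_summary() {
    awk -F '\t' '
        /^@/ { next }                          # skip header lines
        {
            # Bit 0x4 of FLAG set means the read is unmapped.
            if (int($2 / 4) % 2 == 1) unmapped++
            else mapped++
        }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "$1"
}
```

For example, `sam_mapped_summary 1_1.sam` on the samse output. If nearly every record is unmapped even in single-end mode, the problem most likely sits before the pairing step (the `bwa aln` run, the index, or the binary itself) rather than in `sampe`.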
Old 09-22-2012, 05:52 PM   #5
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by swbarnes2 View Post
What I meant was, double check to make sure that you used the same reference genome in all the commands, and that it is the right reference genome, and that you got all the file names right in all the commands.

One quick troubleshooting step is to make a single-end SAM from one FASTQ file and confirm that it looks okay.
If I use the wrong genome or wrong file names, I get an error.

I will have to check how single-end SAM will work, but doing paired-end alignment using Bowtie2 works fine, so there should not be anything wrong with the reads.
Old 09-24-2012, 06:33 AM   #6
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by swbarnes2 View Post
One quick troubleshooting step is to make a single-end SAM from one FASTQ file and confirm that it looks okay.
I tried using bwa samse instead of sampe. I don't get any errors displayed, but it still fails to make any alignments.

This is the output in case it makes any difference:
Code:
> bwa samse <dir>/genome.fa 1_1.sai 1_1.fastq > 1_1.sam
[bwa_aln_core] convert to sequence coordinate... 1.40 sec
[bwa_aln_core] refine gapped alignments... 0.71 sec
[bwa_aln_core] print alignments... 0.55 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 1.37 sec
[bwa_aln_core] refine gapped alignments... 0.72 sec
[bwa_aln_core] print alignments... 0.65 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 1.25 sec
[bwa_aln_core] refine gapped alignments... 0.59 sec
[bwa_aln_core] print alignments... 0.09 sec
[bwa_aln_core] 569473 sequences have been processed.
[main] Version: 0.6.2-r126
Old 09-24-2012, 06:48 AM   #7
zee
NGS specialist
 
Location: Malaysia

Join Date: Apr 2008
Posts: 249

Quote:
Originally Posted by id0 View Post
I will have to check how single-end SAM will work, but doing paired-end alignment using Bowtie2 works fine, so there should not be anything wrong with the reads.
If you have the Bowtie2 PE result, then perhaps run "samtools flagstat" on the Bowtie2 BAM file to determine how many reads align, how many proper pairs to expect, etc. You may also infer the insert size using BAMtools or Picard's CollectInsertSizeMetrics.
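
As a rough cross-check on the insert size, here is a small sketch that averages the TLEN column (column 9) over properly paired records (FLAG bit 0x2) in SAM text. It is not a substitute for Picard or BAMtools; the function name `insert_size_mean` is hypothetical, and only records with a positive TLEN are counted so that each pair contributes once.

```shell
# Hypothetical helper: mean template length over properly paired records.
# Reads SAM text from FILE, or from standard input when no file is given.
insert_size_mean() {
    awk -F '\t' '
        /^@/ { next }                          # skip header lines
        # FLAG bit 0x2 = properly paired; positive TLEN = leftmost mate only.
        int($2 / 2) % 2 == 1 && ($9 + 0) > 0 { sum += $9; n++ }
        END {
            if (n) printf "pairs=%d mean_tlen=%.1f\n", n, sum / n
            else   print "pairs=0"
        }
    ' "${1:--}"
}
```

For a BAM file, the records can be piped in as text, for example `samtools view bowtie2.bam | insert_size_mean` (the file name here is only a placeholder).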
Old 09-24-2012, 08:12 AM   #8
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by zee View Post
If you have the Bowtie2 PE result, then perhaps run "samtools flagstat" on the Bowtie2 BAM file to determine how many reads align, how many proper pairs to expect, etc. You may also infer the insert size using BAMtools or Picard's CollectInsertSizeMetrics.
samtools flagstat produces good results (as far as I can tell):
Code:
1138946 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1093043 + 0 mapped (95.97%:nan%)
1138946 + 0 paired in sequencing
569473 + 0 read1
569473 + 0 read2
1052996 + 0 properly paired (92.45%:nan%)
1085424 + 0 with itself and mate mapped
7619 + 0 singletons (0.67%:nan%)
1750 + 0 with mate mapped to a different chr
1193 + 0 with mate mapped to a different chr (mapQ>=5)
Old 02-21-2013, 01:12 PM   #9
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55
Ever solve the problem?

Did you ever solve this problem? I am seeing the exact same problem with our MiSeq data, except the problem exists only for the reverse reads.

When I ran sampe, the forward reads aligned with no problem, but the reverse reads didn't. When I ran samse on each read file individually, the forward reads were again fine, but more than 90% of the reverse reads did not align.

This only happens for this one case that we have.

Any advice would be appreciated.
Old 03-05-2013, 09:34 AM   #10
AlliCox
Member
 
Location: Iowa City, IA

Join Date: Nov 2012
Posts: 10

I just posted a similar problem - I hadn't seen this - did you ever resolve the mis-paired alignments?
Thanks!
Old 03-05-2013, 09:40 AM   #11
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

Quote:
Originally Posted by AlliCox View Post
I just posted a similar problem - I hadn't seen this - did you ever resolve the mis-paired alignments?
Thanks!
My problem was that the reads were quite long (250 base pairs) and weren't suited for BWA, which from my understanding is a short-read aligner.

I ended up switching over to Bowtie2 which can handle longer reads and this solved my problem.

Hope that helps,
Old 03-06-2013, 01:09 PM   #12
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by AlliCox View Post
I just posted a similar problem - I hadn't seen this - did you ever resolve the mis-paired alignments?
Thanks!
My original BWA issue did not get resolved. I am using Mac OS X 10.7 and I have seen other Mac users report the same problem. When I switched from BWA 0.6.2 to BWA 0.5.9, it worked fine.

As I was writing this reply, I noticed BWA 0.7.0 was released last week. I am not sure if that one solves any of the Mac issues. See changelog here:
https://github.com/lh3/bwa/blob/master/NEWS

Quote:
Originally Posted by fongchun View Post
My problem was that the reads were quite long (250 base pairs) and weren't suited for BWA, which from my understanding is a short-read aligner.
I used BWA for 2x250bp reads and I did not encounter any problems. From BWA main page:

Quote:
Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp.
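
To see where a data set falls relative to the 200bp figure in that description, a quick read-length summary can help. This is only a sketch: the function name `fastq_length_summary` is hypothetical, the default limit of 200 is taken from the quoted text, and it assumes an uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: summarize read lengths in a four-line-per-record FASTQ.
# Usage: fastq_length_summary FILE.fastq [LIMIT]   (LIMIT defaults to 200)
fastq_length_summary() {
    awk -v limit="${2:-200}" '
        NR % 4 == 2 {                          # sequence line of each record
            len = length($0)
            n++
            if (n == 1 || len < min) min = len
            if (len > max) max = len
            if (len >= limit) over++
        }
        END {
            printf "reads=%d min=%d max=%d at_or_over_%d=%d\n", n, min, max, limit, over
        }
    ' "$1"
}
```

Running it on both mate files (for example `fastq_length_summary 1_1.fastq` and `fastq_length_summary 1_2.fastq`) shows how many reads reach the limit in each.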
Old 03-06-2013, 01:26 PM   #13
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

Quote:
Originally Posted by id0 View Post
I used BWA for 2x250bp reads and I did not encounter any problems.
Then I am back to square one in terms of not knowing why BWA didn't work for my situation. The only other thing I can think of is that the base qualities in the reverse reads were very poor after about 150 bases. This was actually a problem with the sequencing itself.

Maybe this played a role in the alignment process, but I don't know enough about BWA to comment on this.

Last edited by fongchun; 03-06-2013 at 01:32 PM.
Old 03-06-2013, 01:29 PM   #14
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

Actually it just came back to me that I used BWA-short and not BWA-SW. According to the manual, BWA-short is suited for reads shorter than 200 bp. This is likely why it failed.

I wanted to use BWA-SW, but from my understanding it doesn't handle paired-end reads. Bowtie2 does, however, which is why we ended up switching to it.

Quote:
Originally Posted by id0 View Post
I used BWA for 2x250bp reads and I did not encounter any problems. From BWA main page:
Out of curiosity, you ran BWA-short for that, right? Or did you get it to work with BWA-SW? I ask because I didn't see any options for paired-end reads.

Last edited by fongchun; 03-06-2013 at 01:32 PM.
Old 03-06-2013, 01:31 PM   #15
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by fongchun View Post
Actually it just came back to me that I used BWA-short and not BWA-SW. According to the manual, BWA-short is suited for reads shorter than 200 bp. This is likely why it failed.

I wanted to use BWA-SW, but from my understanding it doesn't handle paired-end reads. Bowtie2 does, however, which is why we ended up switching to it.
You can try trimming your FASTQ files and see how that goes. Sickle trims based on quality scores, so you would only lose low quality bases:
https://github.com/najoshi/sickle
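
If you only want a quick test of whether read length or the poor tail is the issue, a fixed-length trim is a cruder alternative. Note that this is not what Sickle does (Sickle trims by quality score); the sketch below simply keeps the first N bases and quality characters of every record. The function name `fastq_hard_trim` is hypothetical, and it assumes an uncompressed FASTQ with four lines per record.

```shell
# Hypothetical helper: keep only the first KEEP bases of every read.
# Usage: fastq_hard_trim FILE.fastq KEEP > trimmed.fastq
fastq_hard_trim() {
    awk -v keep="$2" '
        # Lines 2 and 4 of each record are the sequence and its qualities;
        # cut both to the same length so the record stays valid.
        NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, keep); next }
        { print }                              # header and "+" lines unchanged
    ' "$1"
}
```

Because every read is cut at the same position, the two mate files stay in sync, which is convenient for a paired-end rerun.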
Old 03-06-2013, 01:34 PM   #16
fongchun
Member
 
Location: Vancouver, BC

Join Date: May 2011
Posts: 55

Quote:
Originally Posted by id0 View Post
You can try trimming your FASTQ files and see how that goes. Sickle trims based on quality scores, so you would only lose low quality bases:
https://github.com/najoshi/sickle
We'll give that a try. Thanks!
Old 03-06-2013, 01:44 PM   #17
AlliCox
Member
 
Location: Iowa City, IA

Join Date: Nov 2012
Posts: 10

Quote:
Originally Posted by id0 View Post
My original BWA issue did not get resolved. I am using Mac OS X 10.7 and I have seen other Mac users report the same problem. When I switched from BWA 0.6.2 to BWA 0.5.9, it worked fine.

As I was writing this reply, I noticed BWA 0.7.0 was released last week. I am not sure if that one solves any of the Mac issues. See changelog here:
https://github.com/lh3/bwa/blob/master/NEWS

I used BWA for 2x250bp reads and I did not encounter any problems. From BWA main page:
Hi,
It looks like the last 3 versions of bwa, including the latest released last week, don't work on a Mac. I downloaded 0.5.10 and I think things are working now. It's definitely taking much longer to make the .sai file, so I see that as a good sign.
Thanks!
Old 06-10-2014, 07:27 AM   #18
wlangdon
Member
 
Location: ucl

Join Date: Nov 2012
Posts: 15

I just ran into this after I aligned human paired-end sequences against the tiny non-human genome included in the distribution kit.
Oops :-(
Bill
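
A quick way to catch that kind of mix-up before aligning is to look at how big the reference actually is. This is a minimal sketch with a hypothetical function name, `fasta_summary`; it counts the sequences and total bases in a plain-text FASTA file. A human reference such as hg19 should come out at roughly 3 billion bases, while a small test genome will be orders of magnitude smaller.

```shell
# Hypothetical helper: count sequences and total bases in a FASTA file.
# Usage: fasta_summary genome.fa
fasta_summary() {
    awk '
        /^>/ { seqs++; next }                  # one header line per sequence
        { bases += length($0) }                # every other line is sequence
        END { printf "sequences=%d total_bases=%.0f\n", seqs, bases }
    ' "$1"
}
```

The total is printed with `%.0f` rather than `%d` because some awk implementations limit `%d` to 32-bit integers, which a human genome would exceed.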

Tags
bwa, miseq

All times are GMT -8.