SEQanswers

Old 10-20-2009, 07:19 AM   #1
zlu
Member
 
Location: UK

Join Date: Nov 2008
Posts: 32
Default BWA sampe: weird pairing

When running bwa (0.5.4) sampe I got this line output to the screen over and over again:

[infer_isize] fail to infer insert size: weird pairing

Should I be worrying about this? Does it mean the pairing is not correct?

The following are the commands I used for alignment and sampe:

bwa aln -l 32 -t 2 -q 4 Genomes/Btau_UMD3.fa s_1_1_sequence.fq > Run20_s_1_1_sequence.sai &
bwa aln -l 32 -t 2 -q 4 Genomes/Btau_UMD3.fa s_1_2_sequence.fq > Run20_s_1_2_sequence.sai &

bwa sampe -a 253 -o 1000 Genomes/Btau_UMD3/Btau_UMD3.fa s_1_1_sequence.sai s_1_2_sequence.sai s_1_1_sequence.fq s_1_2_sequence.fq > Run20_s_1_pe.bwa.sam

Thank you.
Old 10-20-2009, 07:26 AM   #2
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Bwa fails to infer the insert size and will use "-a" to set the maximum insert size in pairing. This may happen if too few reads are mapped, or if the insert size distribution is bimodal or otherwise irregular. You should check the distribution after mapping.
Old 10-20-2009, 07:33 AM   #3
zlu
Member
 
Location: UK

Join Date: Nov 2008
Posts: 32
Default

The insert size I specified was obtained from an ELAND alignment. Do you think increasing -a will help? And is there a tag in the SAM file that reports pairs with the wrong insert size?
Old 10-20-2009, 07:35 AM   #4
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

You should draw the distribution.
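
One way to draw that distribution, as a minimal Python sketch. It assumes the SAM file written by bwa sampe stores the signed insert size in column 9 (ISIZE/TLEN) and counts each pair once by keeping only positive values. The function names `insert_sizes` and `summarize` and the 50 bp bin width are illustrative choices, not part of bwa.

```python
import statistics
from collections import Counter


def insert_sizes(sam_lines):
    """Yield the insert size once per pair whose mates map to the same reference."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        rnext = fields[6]
        tlen = int(fields[8])
        # Keep paired reads (0x1) where neither mate is unmapped (0x4, 0x8),
        # the mate is on the same reference ("="), and TLEN is positive so
        # each pair is counted only once.
        if flag & 0x1 and not flag & 0xC and rnext == "=" and tlen > 0:
            yield tlen


def summarize(sizes, bin_width=50):
    """Return (median, sorted histogram bins) for a collection of insert sizes."""
    sizes = list(sizes)
    if not sizes:
        return None, []
    hist = Counter((s // bin_width) * bin_width for s in sizes)
    return statistics.median(sizes), sorted(hist.items())


# Example use on the SAM file from the first post:
# with open("Run20_s_1_pe.bwa.sam") as handle:
#     median, hist = summarize(insert_sizes(handle))
#     for bin_start, count in hist:
#         print(bin_start, count)
```

A histogram with two peaks, or a median far from the value passed to -a, would point to the kind of library problem described above.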
Old 10-20-2009, 07:50 AM   #5
zlu
Member
 
Location: UK

Join Date: Nov 2008
Posts: 32
Default

Thanks for the quick reply.

Looking closer at the output from the sampe below, am I right to assume that there are 1649 out of 262144 processed reads where the insert size cannot be inferred correctly?

[bwa_read_seq] 0.0% bases are trimmed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] fail to infer insert size: weird pairing
[bwa_sai2sam_pe_core] time elapses: 160.55 sec
[bwa_sai2sam_pe_core] change of coordinates in 1649 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 1.39 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.72 sec
[bwa_sai2sam_pe_core] print alignments... 2.13 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
Old 12-14-2009, 01:18 PM   #6
elalo
Junior Member
 
Location: Montreal

Join Date: Dec 2009
Posts: 4
Default

Did you ever solve this issue? I'm encountering the same error message...
Old 12-14-2009, 02:38 PM   #7
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

This message is usually caused by bad libraries. You should check the quality of your library in the first place. As I replied above, bwa still works if -a is about right, but to set a proper -a, again, you should plot the distribution of insert size. This is not a major problem with bwa but with your input data.
Old 12-15-2009, 03:25 AM   #8
zlu
Member
 
Location: UK

Join Date: Nov 2008
Posts: 32
Default

My problem was actually due to an unequal number of paired reads in the input fastq files. I was doing some quality filtering, mainly artefact removal, on read1 and read2 separately, and this left the two files with different numbers of reads.
Old 12-15-2009, 03:56 AM   #9
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

No. You must make sure the two files contain the same set of pairs, in identical order in each file. Your input will fail with every aligner to date, as far as I know.
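
A quick way to check this before running sampe, sketched in Python. It assumes plain four-line FASTQ records (no wrapped sequence lines) and read names that differ only by an optional trailing /1 or /2. The helper names `read_names` and `first_mismatch` are made up for this example.

```python
import itertools
import re


def read_names(fastq_lines):
    """Yield the read name from every 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:
            # Drop the leading '@' and anything after the first whitespace.
            parts = line.strip()[1:].split()
            name = parts[0] if parts else ""
            # Drop a trailing /1 or /2 mate suffix so both files compare equal.
            yield re.sub(r"/[12]$", "", name)


def first_mismatch(fq1_lines, fq2_lines):
    """Return (record_index, name1, name2) for the first mismatch, or None."""
    pairs = itertools.zip_longest(read_names(fq1_lines), read_names(fq2_lines))
    for idx, (name1, name2) in enumerate(pairs):
        if name1 != name2:   # also catches one file ending before the other
            return idx, name1, name2
    return None


# Example use on the files from the first post:
# with open("s_1_1_sequence.fq") as fq1, open("s_1_2_sequence.fq") as fq2:
#     print(first_mismatch(fq1, fq2))
```

A result of None means the two files list the same reads in the same order. Anything else gives the zero-based record index where they first diverge, with None standing in for a file that ran out of reads.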
Old 12-15-2009, 08:09 AM   #10
elalo
Junior Member
 
Location: Montreal

Join Date: Dec 2009
Posts: 4
Default

Thank you both for your help! It was indeed an issue with my library...
Old 01-15-2010, 06:00 AM   #11
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default

Dear Heng,

I aligned my mate-pair data with BWA (0.5.5) and observed a weird pairing of reads, as explained below.

When I run bwa sampe, for one of the pairs I get:
Code:
HWUSI-EAS454:1:2:0:108#0        113     chr2    96713303        0       50M     =       96439877        -273426 GATCAGTGGACTTTATGTTAATGAAAAAGGAAATCATCCAGGGTGCATCT      :B?BC?A-357;67C@C<CC<9B>BC<BB>B:<7>B=-BCBBBC@BB@B@      XT:A:R  NM:i:2    SM:i:0  AM:i:0  X0:i:3  X1:i:0  XM:i:2  XO:i:0  XG:i:0  MD:Z:7T23C18
HWUSI-EAS454:1:2:0:108#0        177     chr2    96439877        23      50M     =       96713303        273426  GAGTCTCTTTTGCTGAGTGTTGTCATATATGGAGGTGATGCATGGAACTG      ?A95/5?@B;?:@7BB9959?'79BAC>@B?;@>;B(B8:/'>;9C:BBB      XT:A:U  NM:i:2    SM:i:23 AM:i:0  X0:i:1  X1:i:2  XM:i:2  XO:i:0  XG:i:0  MD:Z:28C12C8
So here the distance between the ends is 273426 bp, even though I (and BWA) know that "inferred external isize from 157719 pairs: 3054.215 +/- 185.122".

When I run BWA in single-end mode "bwa samse -n 30" for the same pair I get:
>HWUSI-EAS454:1:2:0:108#0 3 3
chr2 -96713303 2
chr2 -98220112 2
chr2 +96442725 2

on the left and

>HWUSI-EAS454:1:2:0:108#0 3 3
chr2 -96439877 2
chr2 +98222957 3
chr2 +96716152 3

on the right.

So my question is: why does BWA decide to pair the ends in such a weird way when it could pair them as:
left: chr2 +96442725 2
right: chr2 -96439877 2
with an insert size of ~2800 bp?
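
The candidate pairings can be enumerated with a small Python sketch. The hit lists are copied from the "bwa samse -n 30" output above; everything else is an assumption made for illustration: the function name `candidate_pairs`, the 10,000 bp cutoff, and the use of the distance between leftmost coordinates rather than the outer insert size. This is not how bwa itself chooses a pairing.

```python
from itertools import product

# (strand, leftmost position) of each hit on chr2, from the samse output above.
LEFT = [("-", 96713303), ("-", 98220112), ("+", 96442725)]
RIGHT = [("-", 96439877), ("+", 98222957), ("+", 96716152)]


def candidate_pairs(left, right, max_dist=10_000):
    """List opposite-strand hit pairs within max_dist, with their orientation."""
    out = []
    for (s1, p1), (s2, p2) in product(left, right):
        dist = abs(p1 - p2)
        if s1 == s2 or dist > max_dist:
            continue
        # FR: the forward-strand read has the smaller coordinate.
        # RF: the reverse-strand read has the smaller coordinate.
        fwd_pos = p1 if s1 == "+" else p2
        rev_pos = p2 if s1 == "+" else p1
        orientation = "FR" if fwd_pos < rev_pos else "RF"
        out.append((p1, p2, dist, orientation))
    return out


for pair in candidate_pairs(LEFT, RIGHT):
    print(pair)
# (96713303, 96716152, 2849, 'RF')
# (98220112, 98222957, 2845, 'RF')
# (96442725, 96439877, 2848, 'RF')
```

All three close combinations have the reverse-strand read at the smaller coordinate, so each of them is in RF rather than FR orientation.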

Also, why does the output of "bwa samse -n 30" contain no information about mapping quality? Why can't it be printed in SAM format as well?

Thank you in advance,
Valentina
Old 01-15-2010, 07:24 AM   #12
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Could you show the low and high boundaries from the bwa output? Something like:

[infer_isize] low and high boundaries: 330 and 670

EDIT: For a "proper read pair", you would expect the read with the smaller coordinate to be mapped to the forward strand, but in your example it is the opposite. I guess you are aligning reads from an Illumina long-insert library, where the "proper pair" has RF orientation. Bwa does not support such read pairs. As far as I know, Maq is still the best tool for such alignments.
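
The strand of each read can be read off the SAM FLAG column. A minimal Python sketch that decodes the two flags from the records in the previous post (113 and 177), using the bit meanings defined in the SAM format; the names in `SAM_FLAGS` and the function `decode_flag` are just labels chosen for this example.

```python
# Lowest eight SAM FLAG bits and a short label for each.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(113))  # ['paired', 'reverse', 'mate_reverse', 'first_in_pair']
print(decode_flag(177))  # ['paired', 'reverse', 'mate_reverse', 'second_in_pair']
```

Neither record has the proper-pair bit (0x2) set, and the read at the smaller coordinate (96439877, flag 177) is on the reverse strand.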

Last edited by lh3; 01-15-2010 at 07:29 AM.
Old 01-18-2010, 12:23 AM   #13
valeu
Member
 
Location: Paris

Join Date: Sep 2008
Posts: 69
Default

Low and high boundaries are: 2284 and 3824.

You are right, these are Solexa mate-pair data, which should be aligned as "RF" instead of "FR".

I have too much data to use Maq on them... Or I could run Bowtie first and then use Maq to align whatever was not aligned. But it is really a pity that I cannot use BWA for that.

Maybe you could add a parameter that would specify which type of mapping you expect? Like you can run Bowtie in "--rf" or "--fr" mode.

Thanks,
Valentina
Old 05-05-2011, 09:05 AM   #14
smehr12
Junior Member
 
Location: New York

Join Date: May 2011
Posts: 4
Default

Hi elalo,
How did you find out it was an issue with your library? How can I take care of this isize failure message?
Old 05-05-2011, 09:06 AM   #15
smehr12
Junior Member
 
Location: New York

Join Date: May 2011
Posts: 4
Default

Quote:
Originally Posted by zlu
When running bwa (0.5.4) sampe I got this line output to the screen over and over again:

[infer_isize] fail to infer insert size: weird pairing

Should I be worrying about this? Does it mean the pairing is not correct?

The following are the commands I used for alignment and sampe:

bwa aln -l 32 -t 2 -q 4 Genomes/Btau_UMD3.fa s_1_1_sequence.fq > Run20_s_1_1_sequence.sai & bwa aln -l 32 -t 2 -q 4 Genomes/Btau_UMD3.fa s_1_2_sequence.fq > Run20_s_1_2_sequence.sai &

bwa sampe -a 253 -o 1000 Genomes/Btau_UMD3/Btau_UMD3.fa s_1_1_sequence.sai s_1_2_sequence.sai s_1_1_sequence.fq s_1_2_sequence.fq > Run20_s_1_pe.bwa.sam

Thank you.
Hi zlu,

Would you mind telling me how you got rid of the failure message from bwa? I keep getting it.
Old 05-05-2011, 09:12 AM   #16
smehr12
Junior Member
 
Location: New York

Join Date: May 2011
Posts: 4
Default

Quote:
Originally Posted by lh3
You should draw the distribution.
Hi lh3,
By distribution, did you mean the library size distribution, or the reads that matched the subject?
Old 05-05-2011, 10:32 AM   #17
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279
Default

For the mate-pair data you can reverse complement the reads with the FastX package and then run them as a standard PE run, but with much larger inserts. This works fine unless you have a ton of reads crossing the circularized junction.
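
For readers without FastX at hand, the reverse-complement step can be sketched in plain Python. This is a stand-in for the FastX step, not its implementation: it assumes four-line FASTQ records, reverses the quality string along with the sequence, and uses made-up function names. Presumably it would be applied to both read files so that RF pairs become FR.

```python
# Translation table for complementing bases; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(name, seq, plus, qual):
    """Reverse complement the sequence and reverse the qualities to match."""
    return name, seq.translate(COMPLEMENT)[::-1], plus, qual[::-1]


def reverse_complement_fastq(lines):
    """Yield the lines of reverse-complemented 4-line FASTQ records."""
    records = [line.rstrip("\n") for line in lines]
    # Step through complete records only; a trailing partial record is ignored.
    for i in range(0, len(records) - 3, 4):
        yield from reverse_complement_record(*records[i:i + 4])


# Example use (output file name is hypothetical):
# with open("s_1_1_sequence.fq") as src, open("s_1_1_rc.fq", "w") as dst:
#     for line in reverse_complement_fastq(src):
#         dst.write(line + "\n")
```

The whole input is held in memory here, which keeps the sketch short; a streaming version would be preferable for full lanes.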
Old 01-03-2013, 07:46 AM   #18
mcgreevy
Junior Member
 
Location: Dublin

Join Date: Oct 2012
Posts: 7
Default error - weird pairing

Just thought I'd post this for posterity's sake. I was getting the "fail to infer insert size: weird pairing" error, but it was because the Perl script I was running to call bwa was using the same fastq file for both sides of the paired-end reads instead of the two separate files. Perhaps this will save someone some trouble.
Old 11-10-2015, 02:05 AM   #19
Warr
Junior Member
 
Location: Edinburgh

Join Date: Nov 2015
Posts: 1
Default

Quote:
Originally Posted by mcgreevy
Just thought I'd post this for posterity's sake. I was getting the "fail to infer insert size: weird pairing" error, but it was because the Perl script I was running to call bwa was using the same fastq file for both sides of the paired-end reads instead of the two separate files. Perhaps this will save someone some trouble.
Thank you for posting this, mcgreevy, you certainly saved me some trouble. I had a typo in my shell script.
All times are GMT -8.