01-30-2012, 02:04 AM   #1
Junior Member
Location: London

Join Date: Jan 2012
Posts: 4
Problem with BWA mapping of Illumina PE short insert size fragments (FFPE material)

I am having difficulties mapping paired-end Illumina reads from FFPE material (i.e. short insert size, approximately 80-150 bp) with BWA (both version 0.5.9 and the newest version, 0.6.1).
To enforce the short insert size, the following flags were used:
"$BWA sampe -a 500 -A -o 10000 -r "

Mapping completes without errors. However, when viewing the sorted.bam files, the paired reads are more often than not mapped to different chromosomes (as shown in the attached picture), even though the short insert size means the two reads of a pair overlap each other.
The majority of the reads contained adapter sequence, which was removed with cutadapt; this suggests that the insert size was smaller than 100 bp. Is this what is causing the mapping problems, and how would I go about getting around it?

Your reply would be much appreciated!
Attached Images
File Type: jpg seq2small.jpg (92.6 KB, 65 views)
LadyGray
09:34 AM, 01-30-2012   #2
Senior Member
Location: San Diego

Join Date: May 2008
Posts: 912

That doesn't seem too likely to me. I've got some data recently where the insert sizes are a little smaller and more variable than I'd like, and bwa has no problem assigning them to the right coordinates, with lots of overlap where necessary. But then again, my samples are bacteria, and there's a lot less genome for the aligner to play with.

3 Mb, isn't that pretty close to the telomere? Do you see the same problem in the middle of the chromosome?

First thing I'd do is spot-check both sequences for those cross-chromosome clusters. If you manually BLAST those sequences, can you get alignment positions that make more sense than what bwa assigned? The mapping quality seems to be high for those, so that's not the problem.
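One way to pull out candidates for that spot check is to scan the alignments as SAM text and list the reads whose mate was placed on a different reference. The sketch below is only an illustration, not something posted in this thread: the function name `cross_chromosome_pairs` and the `min_mapq` parameter are made up here, and it assumes tab-separated SAM records using the standard column order (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, ...).

```python
def cross_chromosome_pairs(sam_lines, min_mapq=0):
    """Yield (qname, rname, pos, rnext, pnext, mapq) for mapped, paired reads
    whose mate is mapped to a different reference sequence.

    Hypothetical helper for illustration; expects SAM text lines.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname = fields[0]
        flag = int(fields[1])
        rname = fields[2]
        pos = int(fields[3])
        mapq = int(fields[4])
        rnext = fields[6]
        pnext = int(fields[7])

        is_paired = bool(flag & 0x1)
        read_unmapped = bool(flag & 0x4)
        mate_unmapped = bool(flag & 0x8)
        if not is_paired or read_unmapped or mate_unmapped:
            continue
        # "=" means the mate is on the same reference; "*" means unknown.
        if rnext in ("=", "*") or rnext == rname:
            continue
        if mapq < min_mapq:
            continue
        yield qname, rname, pos, rnext, pnext, mapq
```

The lines could come from a SAM file opened in text mode, or from the output of `samtools view` on the sorted BAM. Raising `min_mapq` restricts the list to the high-confidence cross-chromosome pairs, which are the ones worth checking by hand.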
swbarnes2
10-22-2012, 01:20 AM   #3
Junior Member
Location: London

Join Date: Jan 2012
Posts: 4


I thought I'd give a late response with an update on what I did.

BWA PE mapping was not successful: paired reads mapped to different chromosomes in 80% of all cases, and the mapping quality of these was still high. This is probably due to the extremely short insert size in the FFPE material, which interferes with the mapping algorithm.

I also tried BWA 0.6.1, with the same result.

Adapter-trimmed sequences are too short to contain more information when mapped in pairs, whereas sequences longer than 100 bp do contain additional information when mapped in pairs. So I wrote a Perl script that separates trimmed reads into one file (for SE mapping) and untrimmed read pairs into two files (for PE mapping). It worked fine.
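The Perl script itself was not posted. As a rough illustration of the splitting idea, here is a minimal Python sketch, not the original script. It assumes two FASTQ files with four lines per record and mates in the same order, a full read length of 100 bp (the `full_length` parameter), and that a pair is sent to the single-end file as soon as either mate was trimmed. All function names are hypothetical.

```python
def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from a 4-line FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               plus.rstrip("\n"), qual.rstrip("\n"))


def write_fastq(handle, record):
    """Write one FASTQ record as four lines."""
    handle.write("\n".join(record) + "\n")


def split_by_trimming(r1_in, r2_in, se_out, pe1_out, pe2_out, full_length=100):
    """Route untrimmed pairs to the PE outputs and everything else to SE.

    Assumption: a read counts as trimmed if it is shorter than full_length.
    Reads trimmed to zero length are dropped.
    """
    counts = {"pairs_kept": 0, "single_reads": 0}
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        if len(rec1[1]) >= full_length and len(rec2[1]) >= full_length:
            write_fastq(pe1_out, rec1)
            write_fastq(pe2_out, rec2)
            counts["pairs_kept"] += 1
        else:
            for rec in (rec1, rec2):
                if rec[1]:  # skip reads with no sequence left
                    write_fastq(se_out, rec)
                    counts["single_reads"] += 1
    return counts
```

The five arguments are open text-mode file handles. The single-end file can then be aligned in SE mode and the two paired files in PE mode, as described above.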
LadyGray

All times are GMT -8.

Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO