SEQanswers

06-23-2011, 09:42 AM   #1
apratap
Member
 
Location: Bay Area

Join Date: Jan 2009
Posts: 58
BWA behaviour with Mate Pair data + Multi read mapping

Hi All

PS: This message was also posted on the BWA mailing list, but I did not get any response.


I cannot work out why BWA is or is not able to work natively
with mate-pair data. My second question is: what is the best workaround if I
want to obtain multiple mappings (if any) for a read? I am also pasting
the results of aligning mate-pair data both before and after reverse
complementing. The mapping gets better after reverse complementing, but I don't
quite understand why. I'd appreciate your input on both questions.


A. Before Reverse comp : Mate pair data <----- ------>

356766 + 0 in total (QC-passed reads + QC-failed reads)
282236 + 0 mapped (79.11%:nan%)
356766 + 0 paired in sequencing
178383 + 0 read1
178383 + 0 read2
126094 + 0 properly paired (35.34%:nan%)


B. After Reverse Comp : Mate pair data --------------->
<----------------------

356766 + 0 in total (QC-passed reads + QC-failed reads)
265575 + 0 mapped (74.44%:nan%)
356766 + 0 paired in sequencing
178383 + 0 read1
178383 + 0 read2
11146 + 0 properly paired (3.12%:nan%)


It seems the algorithm takes a major hit during the pairing of reads.
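
The two summaries above can be compared programmatically. The following is a minimal sketch, assuming the text is in the samtools-flagstat-style layout pasted above (`count + count label (percentages)`); the function names `parse_flagstat` and `properly_paired_percent` are hypothetical, and the counts are copied from runs A and B in this post.

```python
import re

# Matches lines such as "126094 + 0 properly paired (35.34%:nan%)".
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \(.*\))?$")


def parse_flagstat(text):
    """Return {label: QC-passed count} for every line in the expected layout."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            passed, _failed, label = match.groups()
            counts[label] = int(passed)
    return counts


def properly_paired_percent(counts):
    """Properly paired reads as a percentage of all reads paired in sequencing."""
    return 100.0 * counts["properly paired"] / counts["paired in sequencing"]


RUN_A = """356766 + 0 in total (QC-passed reads + QC-failed reads)
282236 + 0 mapped (79.11%:nan%)
356766 + 0 paired in sequencing
126094 + 0 properly paired (35.34%:nan%)"""

RUN_B = """356766 + 0 in total (QC-passed reads + QC-failed reads)
265575 + 0 mapped (74.44%:nan%)
356766 + 0 paired in sequencing
11146 + 0 properly paired (3.12%:nan%)"""

if __name__ == "__main__":
    for name, text in (("A", RUN_A), ("B", RUN_B)):
        pct = properly_paired_percent(parse_flagstat(text))
        print(f"Run {name}: {pct:.2f}% properly paired")
```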

Thanks!
-Abhi
06-23-2011, 01:25 PM   #2
Jon_Keats
Senior Member
 
Location: Phoenix, AZ

Join Date: Mar 2010
Posts: 279

What was your experimental design? Standard Illumina mate-pair? Read length? Also, how did you reverse complement, and what were the bwa command-line arguments?

I've done standard Illumina MP preps, reverse complemented with fastx-toolkit, and aligned with bwa using standard parameters, with good success. See below:

133724796 in total
0 QC failure
45170770 duplicates
121800898 mapped (91.08%)
133724796 paired in sequencing
66862398 read1
66862398 read2
102897270 properly paired (76.95%)
115119322 with itself and mate mapped
6681576 singletons (5.00%)
3387642 with mate mapped to a different chr
2495203 with mate mapped to a different chr (mapQ>=5)
06-23-2011, 01:37 PM   #3
apratap

Hi Jon

Thanks for your reply. The protocol is not standard: we are trying to sequence the ends of transcripts using a mate-pair technique.

The data that I get after linker removal is of variable read length, 60 +/- 20 bp. I reverse complement the reads following the basic definition: reverse the read, then complement it, and also reverse the quality string.
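
The reverse-complement step described here can be sketched as follows. This is a minimal illustration, assuming plain four-line FASTQ records and the bases A, C, G, T and N; the names `reverse_complement_record` and `reverse_complement_fastq` are hypothetical and this is not the fastx-toolkit implementation.

```python
# Translation table for complementing bases; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(seq, qual):
    """Reverse complement the bases and reverse the per-base qualities."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Complement every base, then reverse, so base i still pairs with quality i.
    return seq.translate(COMPLEMENT)[::-1], qual[::-1]


def reverse_complement_fastq(lines):
    """Yield FASTQ lines with each record reverse complemented.

    Assumes exactly four lines per record; a truncated final record is dropped.
    """
    it = iter(lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        new_seq, new_qual = reverse_complement_record(
            seq.rstrip("\n"), qual.rstrip("\n")
        )
        yield header.rstrip("\n")
        yield new_seq
        yield plus.rstrip("\n")
        yield new_qual


if __name__ == "__main__":
    record = ["@read1", "ACGTN", "+", "IIHG#"]
    print("\n".join(reverse_complement_fastq(record)))
```

Because the reads here vary in length after linker removal, the length check matters: a quality string that is not reversed together with its sequence would silently assign qualities to the wrong bases.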

One thing that could trick BWA is the variable fragment size, as it depends on the length of the transcripts that we are trying to capture.

As for BWA options, I have pretty much used the standard ones. At this point I am less concerned about the mapping percentage than about why the reads need to be reverse complemented before mapping with BWA, and how BWA handles multi-read mapping.

Thanks!
-Abhi
06-23-2011, 01:48 PM   #4
Jon_Keats

I'm assuming you are trying to get the 5' and 3' ends of each RNA species by circularizing the cDNA? Neat idea; the weird distribution when mapped to the genome will definitely give bwa some problems. You might want to try Tophat instead for the alignment.

For reads that map to multiple locations, bwa will report the other potential sites and will randomly select one unless the mate dictates the location; even then it should report the alternative options.
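
In SAM output from BWA, the alternative sites typically appear in the optional `XA:Z:` tag, formatted as `chr,pos,CIGAR,NM;` entries with a signed position giving the strand. The following is a minimal sketch for pulling them out of a SAM line, assuming the alignments were produced with settings that write this tag; the function name `alternative_hits` is hypothetical and the SAM line in the example is made up.

```python
def alternative_hits(sam_line):
    """Return [(chrom, strand, pos, cigar, edit_distance)] from the XA tag, if any."""
    fields = sam_line.rstrip("\n").split("\t")
    hits = []
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if not tag.startswith("XA:Z:"):
            continue
        for entry in tag[5:].split(";"):
            if not entry:
                continue  # the tag value ends with a trailing ';'
            chrom, signed_pos, cigar, nm = entry.rsplit(",", 3)
            strand = "-" if signed_pos.startswith("-") else "+"
            hits.append((chrom, strand, abs(int(signed_pos)), cigar, int(nm)))
    return hits


if __name__ == "__main__":
    # Made-up record: reported at chr1:100, with two alternative placements.
    line = "read1\t0\tchr1\t100\t0\t4M\t*\t0\t0\tACGT\tIIII\tX0:i:3\tXA:Z:chr2,-200,4M,1;chr3,+300,4M,0;"
    for hit in alternative_hits(line):
        print(hit)
```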
06-23-2011, 01:51 PM   #5
apratap

Any idea why BWA needs reads to be inward-facing (---> <---) for it to map them?
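
The orientations in question can be read off a SAM record. BWA's paired-end mode is built around standard paired-end libraries, where the two mates point toward each other (FR), whereas raw mate-pair reads point away from each other (RF). The following is a minimal sketch for classifying a pair, assuming both reads map to the same reference and ignoring overlapping reads; the function name `pair_orientation` is hypothetical.

```python
READ_REVERSE = 0x10  # SAM FLAG bit: this read aligned to the reverse strand
MATE_REVERSE = 0x20  # SAM FLAG bit: the mate aligned to the reverse strand


def pair_orientation(flag, pos, mate_pos):
    """Classify a pair as 'FR' (inward), 'RF' (outward) or 'tandem'.

    flag, pos and mate_pos are the FLAG, POS and PNEXT columns of one read.
    """
    read_reverse = bool(flag & READ_REVERSE)
    mate_reverse = bool(flag & MATE_REVERSE)
    if read_reverse == mate_reverse:
        return "tandem"  # both mates on the same strand
    # Find the strand of whichever mate starts further left on the reference.
    left_is_reverse = read_reverse if pos <= mate_pos else mate_reverse
    return "RF" if left_is_reverse else "FR"


if __name__ == "__main__":
    print(pair_orientation(99, 100, 300))  # forward read on the left
    print(pair_orientation(83, 100, 300))  # reverse read on the left
```

Counting these classes over a single-end or unfiltered alignment is one way to check whether a library was reverse complemented as intended before pairing.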

I guess Tophat will not work, as the read lengths are variable and, as far as I understand, the version I used requires read 1 and read 2 to be of equal length. In our case the read length varies depending on where the linker is identified.

-Abhi
06-23-2011, 04:38 PM   #6
Jon_Keats

Maybe try BWA in single-end mode, filter out the reads aligning to multiple locations, then manually pair the reads using Perl or something similar to find the mates that mark the ends of your RNA species.
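
The manual pairing suggested here can be sketched as follows, in Python rather than Perl. This is a minimal illustration, assuming read 1 and read 2 were aligned separately in single-end mode, that mates share a read name apart from an optional `/1` or `/2` suffix, and that a read counts as unique when it is mapped, meets a MAPQ cutoff, and carries no `XA:Z:` tag. The function names, the cutoff, and the example SAM lines are all hypothetical.

```python
def base_name(name):
    """Strip a trailing /1 or /2 so that mates share one key."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def unique_alignments(sam_lines, min_mapq=1):
    """Map read name -> (chrom, pos, strand) for reads that look uniquely mapped."""
    kept = {}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4 or mapq < min_mapq:
            continue  # unmapped, or mapping quality too low
        if any(tag.startswith("XA:Z:") for tag in fields[11:]):
            continue  # alternative hits reported, so treat as multi-mapped
        strand = "-" if flag & 0x10 else "+"
        kept[base_name(fields[0])] = (fields[2], int(fields[3]), strand)
    return kept


def pair_ends(read1_sam, read2_sam, min_mapq=1):
    """Return {name: (read1_hit, read2_hit)} where both mates map uniquely."""
    first = unique_alignments(read1_sam, min_mapq)
    second = unique_alignments(read2_sam, min_mapq)
    return {name: (first[name], second[name]) for name in sorted(first.keys() & second.keys())}


if __name__ == "__main__":
    read1 = ["a/1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII"]
    read2 = ["a/2\t16\tchr1\t900\t37\t4M\t*\t0\t0\tACGT\tIIII"]
    print(pair_ends(read1, read2))
```

Pairing after the fact sidesteps any assumption about insert size, which suits a library where the distance between mates follows transcript length.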

Tags
bwa, mate-pair
