SEQanswers

01-15-2012, 10:44 PM   #1
mitochy
Member
 
Location: one does not simply approximate location

Join Date: Dec 2011
Posts: 10
How to find out if mapping is correct/not

Hi all.
I'm really new to sequencing.
I'm trying to map short single-end RNA-seq reads to a reference genome using a short-read aligner (any will do). My question is: how do I find out whether the program is mapping my data correctly?
- Input is several million reads of E. coli K12 Illumina RNA-seq data, 35 bp in length.
- The reference is E. coli K12, ~4.8 Mbp.
- For example, I used BWA 0.50 with bwtsw, and it generated a SAM file with the position of each read.

How do I know (statistically) how correct the result in the SAM file is? Or is there no way to find out, so that we have to trust the correctness/ROC results obtained by mapping known/simulated data (like this one?)
01-16-2012, 04:53 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Have a look at the MAPQ field (5th column) of the SAM or BAM output file. I don't know how bwa calculates it (particularly for reads from transcripts rather than simpler DNA reads), so you may want to browse the source code if you really need to know the nitty-gritty.
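
As a rough illustration of what looking at that column involves, here is a minimal Python sketch that tallies MAPQ over the mapped reads in a SAM file. The function names and the file name `aln.sam` are made up for the example. The conversion to an error probability relies on the SAM specification's definition of MAPQ as -10*log10 of the probability that the mapping position is wrong (with 255 meaning "not available"); how closely a given aligner's scores follow that definition is a separate question.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count MAPQ values (5th SAM column) over mapped reads.

    `sam_lines` is any iterable of text lines in SAM format.
    Header lines (starting with '@') and unmapped reads (FLAG bit 0x4)
    are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        counts[int(fields[4])] += 1
    return counts


def mapq_to_error_prob(mapq):
    """Invert MAPQ = -10 * log10(P(wrong)); returns None for 255."""
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)
```

Something like `with open("aln.sam") as fh: print(sorted(mapq_histogram(fh).items()))` would then show how the reads are spread across MAPQ values, e.g. how many sit at 0.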
01-16-2012, 03:47 PM   #3
mitochy
Member
 
Location: one does not simply approximate location

Join Date: Dec 2011
Posts: 10

That is true. The thing is, the 5th column comes from bwa's own calculation, and maq or bowtie might give different results from their own calculations. So I guess we can only trust ROC comparisons based on simulated data.
01-17-2012, 01:09 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Pretty much. I know from experience that two different aligners can align the same sequence to the same location and give very different MAPQ scores. Since many aligners are open source, you could just look at how they generate the score and see whether it satisfies your requirements for the statistic (though I doubt it will). But, yeah, simulated data is probably what you'll need to go by unless you feel up to modifying an aligner.
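
To make the simulation approach concrete, here is a minimal Python sketch of the bookkeeping, assuming you already have the true origin of every simulated read (from whatever simulator you use) and the aligner's reported position and MAPQ. The function name, the input layout, and the 5 bp tolerance are all assumptions for illustration, not taken from any particular tool. It accumulates, from the highest MAPQ downwards, how many reads were mapped and how many of those landed in the wrong place, which is the kind of data an ROC-style curve is drawn from.

```python
def roc_by_mapq(truth, alignments, tolerance=5):
    """Cumulative mapped/wrong counts per MAPQ threshold.

    truth:      dict mapping read name -> (reference name, true position)
    alignments: iterable of (read name, reference name, position, mapq)
                for mapped reads only
    tolerance:  how many bases off a position may be and still count
                as correct (an arbitrary choice for this sketch)

    Returns a list of (mapq_threshold, n_mapped, n_wrong) tuples, where
    the counts cover all reads with MAPQ >= mapq_threshold.
    """
    per_mapq = {}  # mapq -> (total reads, wrongly placed reads)
    for name, ref, pos, mapq in alignments:
        true_ref, true_pos = truth[name]
        correct = ref == true_ref and abs(pos - true_pos) <= tolerance
        total, wrong = per_mapq.get(mapq, (0, 0))
        per_mapq[mapq] = (total + 1, wrong + (0 if correct else 1))

    curve = []
    n_mapped = n_wrong = 0
    for mapq in sorted(per_mapq, reverse=True):
        total, wrong = per_mapq[mapq]
        n_mapped += total
        n_wrong += wrong
        curve.append((mapq, n_mapped, n_wrong))
    return curve
```

Running this on the output of two aligners for the same simulated reads gives curves that can be compared directly, which sidesteps the fact that their MAPQ scales are not comparable.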

Tags
bowtie, bwa, mapping, maq, roc

All times are GMT -8.