SEQanswers

Old 10-29-2012, 04:26 AM   #1
diagen
Member
 
Location: Ned

Join Date: Jun 2012
Posts: 13
Help! Strange discrepancy in the mapping output

Hi,

I am analyzing the public RNA-seq dataset GSE29278 from the Ren lab with this pipeline: convert SRA to FASTQ, map to mm10 with BWA (default settings), convert SAM to BED with SAMtools. Remarkably, while the number of FASTQ reads is comparable across all data files, about half of them yield 10-20x fewer mapped reads than the others.
How can this be explained? Is it a problem with BWA, or is something else wrong?
I would appreciate any suggestions.

datafile #FastqReads #mappedReads
SRR207090.sra 53680296 10916922
SRR207091.sra 55117644 11007351
SRR207092.sra 137872756 25021157
SRR207093.sra 138580112 25982948
SRR207094.sra 122571612 24707083
SRR207095.sra 138402388 27051256
SRR207096.sra 96303468 19404082
SRR207097.sra 93965080 19379369
SRR207098.sra 138521432 26517816
SRR207099.sra 129879756 25531487
SRR207100.sra 101842656 20724671
SRR207101.sra 94289428 17374918
SRR207102.sra 99444472 20386170
SRR207103.sra 97366804 19698156
SRR207104.sra 89976804 18662045
SRR207105.sra 94502252 18551594
SRR207106.sra 134119212 25478228
SRR207107.sra 136421136 25550764
SRR207108.sra 112083680 21283621
SRR207109.sra 103706068 19259646
SRR392604.sra 81341016 1167997
SRR392605.sra 79523572 919123
SRR392606.sra 107383260 1493933
SRR392607.sra 108810768 1232292
SRR392608.sra 80765908 1247112
SRR392609.sra 78333052 890879
SRR392610.sra 72429160 1225045
SRR392611.sra 58510300 1299132
SRR392613.sra 121652996 1602188
SRR392614.sra 29990064 205181
SRR392615.sra 30096232 251824
SRR392616.sra 96376652 1586269
SRR392617.sra 79150596 1040652
SRR392618.sra 22528760 116048
SRR392619.sra 22024856 124722
Old 10-29-2012, 05:01 AM   #2
diagen
Member
 
Location: Ned

Join Date: Jun 2012
Posts: 13

So the mapping efficiency is around 20% for the first series of data files (SRR2070xx) and around 1% for the second series (SRR3926xx). Actually, even 20% is not good.
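
For reference, the percentages can be recomputed from the table in post #1. This is only a small sketch: the four rows are copied from that table, and the helper name `mapping_rate` is made up for illustration. It treats the #FastqReads column as the denominator exactly as posted.

```python
# (accession, FASTQ reads, mapped reads), copied from the table in post #1.
rows = [
    ("SRR207090", 53680296, 10916922),
    ("SRR207092", 137872756, 25021157),
    ("SRR392604", 81341016, 1167997),
    ("SRR392618", 22528760, 116048),
]


def mapping_rate(fastq_reads: int, mapped_reads: int) -> float:
    """Return mapped reads as a percentage of FASTQ reads (hypothetical helper)."""
    if fastq_reads <= 0:
        raise ValueError("fastq_reads must be positive")
    return 100.0 * mapped_reads / fastq_reads


# Percentage per accession, keyed by run name.
rates = {name: mapping_rate(total, mapped) for name, total, mapped in rows}

for name, rate in rates.items():
    print(f"{name}\t{rate:.1f}%")
```

Running the same calculation over every row of the table would show whether the split between the two series is as clean as it looks by eye.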
Old 10-30-2012, 03:10 AM   #3
diagen
Member
 
Location: Ned

Join Date: Jun 2012
Posts: 13

The problem was that the index sequences had not been removed from the reads.
Solution: remove the flanking 8 bp, or run bowtie in local alignment mode.
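
A rough sketch of the trimming step in Python, in case it helps someone with the same problem. The post above does not say which end of the read carries the 8 bp index, so both `trim_5prime` and `trim_3prime` are parameters, and the default of 8 bases from the 5' end is an assumption. The function names are made up for illustration. It assumes plain 4-line FASTQ records with no blank lines.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: header line, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records; an incomplete trailing record is dropped."""
    it = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups consecutive lines in fours.
    for header, seq, sep, qual in zip(it, it, it, it):
        yield header, seq, sep, qual


def trim_record(record: FastqRecord,
                trim_5prime: int = 8,
                trim_3prime: int = 0) -> FastqRecord:
    """Remove bases, and the matching quality values, from either end."""
    header, seq, sep, qual = record
    end = len(seq) - trim_3prime
    return header, seq[trim_5prime:end], sep, qual[trim_5prime:end]


def trim_fastq(lines: Iterable[str],
               trim_5prime: int = 8,
               trim_3prime: int = 0) -> Iterator[str]:
    """Yield trimmed FASTQ lines (without newlines) for every input record."""
    for record in read_fastq(lines):
        yield from trim_record(record, trim_5prime, trim_3prime)


# Tiny in-memory example: the first 8 bases stand in for the index sequence.
example = [
    "@read1\n",
    "ACGTACGTTTTTGGGG\n",
    "+\n",
    "IIIIIIIIJJJJJJJJ\n",
]
trimmed = list(trim_fastq(example))
print("\n".join(trimmed))
```

For real files, the same generator can be fed an open file handle and the output written line by line, so the whole FASTQ never has to sit in memory.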