#1 |
Member
Location: Seattle Join Date: Jul 2011
Posts: 98
|
I would like to remove reads that map to more than one location from BAM files I created using GSNAP. Does anyone know how to do this? I would prefer not to have to realign. Also, does anyone know what MAPQ 0 means in files that have been aligned using GSNAP? As I understand it, the meaning of MAPQ 0 can change depending on which aligner was used to generate the BAM file.
Thanks. Eric
#2 |
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
|
#3 |
Member
Location: Seattle Join Date: Jul 2011
Posts: 98
|
Fantastic! Thanks so much, Richard Finney.
As I understand this command, it filters out any read with a mapping quality (MAPQ) score below 1 and writes the remaining reads to a BAM file. Is it true that, regardless of which aligner you use to create your BAM file, a read that maps to more than one location will have a MAPQ score of 0?
#4 |
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
|
Not necessarily.
The tags (I think) are optional, and not all alignment programs go the extra mile to make sure the tags are thorough. I'm not sure how orthodox GSNAP is on this matter; you may wish to view the tags in the SAM output to make sure they're what they should be.
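One way to check what an aligner actually writes is to tally the MAPQ column and an optional tag across the file. The sketch below is only an illustration: it assumes plain SAM text as input and looks at the `NH:i:` tag (defined in the SAM specification as the number of reported alignments for the read), which a given aligner may or may not emit. The function name `tally_mapq_and_nh` is made up for this example.

```python
from collections import Counter


def tally_mapq_and_nh(sam_lines):
    """Count MAPQ values and NH tag values over SAM alignment lines.

    Assumes tab-separated SAM text. Header lines (starting with '@')
    and blank lines are skipped. A key of None in the NH counter means
    the alignment carried no NH tag at all.
    """
    mapq_counts = Counter()
    nh_counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        mapq_counts[int(fields[4])] += 1      # column 5 is MAPQ
        nh = None
        for tag in fields[11:]:               # optional fields start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        nh_counts[nh] += 1
    return mapq_counts, nh_counts


if __name__ == "__main__":
    import sys

    mapq_counts, nh_counts = tally_mapq_and_nh(sys.stdin)
    print("MAPQ distribution:", sorted(mapq_counts.items()))
    print("NH distribution:", sorted(nh_counts.items(), key=str))
```

Piping the text output of `samtools view` into the script is one way to feed it. If every alignment shows up under `None` in the NH tally, the aligner did not write that tag and MAPQ is the only signal available.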
#5 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
|
Typically, a read that maps to multiple locations with a similar (internal) score will get a MAPQ of 3 or less, since a MAPQ of 3 corresponds to roughly a 50% chance that a given alignment is correct. But it varies greatly by aligner; some will always give MAPQ 255 for any mapped read, for example.
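For reference, the SAM specification defines MAPQ as -10 * log10 of the probability that the mapping position is wrong, and reserves 255 to mean that no mapping quality is available. A small sketch of that conversion (the function names are invented for this example) shows where the "about 50%" figure for MAPQ 3 comes from:

```python
import math


def mapq_to_error_prob(mapq):
    """Probability that the mapping position is wrong, per the SAM definition."""
    return 10 ** (-mapq / 10)


def error_prob_to_mapq(p_wrong):
    """Inverse conversion: Phred-scaled mapping quality for an error probability."""
    return -10 * math.log10(p_wrong)


if __name__ == "__main__":
    # Print the implied chance that the alignment is correct for a few MAPQ values.
    for q in (0, 1, 3, 10, 20, 30):
        p_correct = 1 - mapq_to_error_prob(q)
        print(f"MAPQ {q:>2}: P(correct) = {p_correct:.3f}")
```

This only describes what the number is supposed to mean. Whether a particular aligner's MAPQ values actually follow this scale is a separate question, as noted above.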
More importantly, even if an aligner does assign a read to multiple locations, they are not necessarily equivalent; the primary might be much better than the secondaries (and as such perhaps get a MAPQ well above 3). There is no simple, universal way to remove all multi-mapping reads from a SAM file while processing it as a stream; you have to track read names to see how many times each one occurs. That process can be made efficient if the file is sorted by name. It's trivial to filter out all secondary alignments, though, with samtools.
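As a rough sketch of the name-tracking idea, the code below works on SAM text lines. It is an illustration under several assumptions, not a standard tool: the function names are invented, the whole input is held in memory rather than streamed, supplementary (chimeric) records are not counted as extra hits, and it can only detect multi-mappers if the aligner actually wrote the additional alignments into the file. The secondary-alignment check uses FLAG bit 0x100, which is the bit the SAM specification assigns to secondary alignments.

```python
from collections import Counter

# FLAG bits from the SAM specification
UNMAPPED = 0x4
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def _is_record(line):
    """True for alignment lines, False for headers and blank lines."""
    return bool(line.strip()) and not line.startswith("@")


def _mate_key(fields):
    """Identify a read by name plus which mate of a pair it is."""
    return (fields[0], int(fields[1]) & (READ1 | READ2))


def drop_secondary(sam_lines):
    """Yield everything except alignments flagged as secondary."""
    for line in sam_lines:
        if not _is_record(line):
            yield line
        elif not (int(line.split("\t")[1]) & SECONDARY):
            yield line


def drop_multimappers(sam_lines):
    """Yield only reads whose (name, mate) occurs in at most one mapped record."""
    lines = list(sam_lines)          # assumption: input fits in memory
    counts = Counter()
    for line in lines:
        if not _is_record(line):
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & (UNMAPPED | SUPPLEMENTARY):
            continue                 # not counted as an additional location
        counts[_mate_key(fields)] += 1
    for line in lines:
        if not _is_record(line) or counts[_mate_key(line.split("\t"))] <= 1:
            yield line


if __name__ == "__main__":
    import sys

    sys.stdout.writelines(drop_multimappers(sys.stdin))
```

For a name-sorted file, the same counting could be done one group of consecutive records at a time instead of loading everything, which is the efficiency gain mentioned above.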