SEQanswers

Old 07-14-2015, 01:53 PM   #1
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98
removing reads that map to more than one location from gsnap-aligned bam files

I would like to remove reads that map to more than one location from bam files I created using gsnap. Does anyone know how to do this? I would prefer not to have to realign. Also, does anyone know what MAPQ 0 means in files that have been aligned using gsnap? As I understand it, the meaning of MAPQ 0 can change depending on which aligner was used to generate the bam file.

Thanks.

Eric
Old 07-14-2015, 04:54 PM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 701

See here: https://www.biostars.org/p/56246/

samtools view -bq 1 file.bam > unique.bam
Old 07-14-2015, 06:13 PM   #3
efoss
Member
 
Location: Seattle

Join Date: Jul 2011
Posts: 98

Fantastic! Thanks so much, Richard Finney.

As I understand this command, it's saying to filter out anything with a mapping quality (MAPQ) score that is less than one and output that as a bam file. Is it true that, regardless of which aligner you use to create your bam file, a read that maps to more than one location will have a MAPQ score of 0?
Old 07-14-2015, 07:28 PM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 701

Not necessarily.
The tags (I think) are optional and not all alignment programs go the extra mile to make sure the tags are thorough.
I'm not sure how orthodox GSNAP is on this matter; you may wish to view the sam output tags to make sure they're what they should be.
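One way to act on that advice, as a rough sketch rather than anything GSNAP-specific: the SAM specification defines an optional NH:i tag for the number of reported alignments of a read. Assuming the file actually carries that tag (worth checking by eye first), the Python below reads SAM text lines and keeps only records with NH:i:1. The function names are made up for illustration.

```python
def nh_tag(sam_line):
    """Return the NH:i value of one SAM record, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("NH:i:"):
            return int(tag[5:])
    return None


def keep_single_hit(sam_lines):
    """Yield header lines and records whose NH tag equals 1.

    Records without an NH tag are dropped, which is a conservative
    choice made for this sketch and not a rule from the SAM spec.
    """
    for line in sam_lines:
        if line.startswith("@") or nh_tag(line) == 1:
            yield line
```

It could be fed from something like `samtools view -h file.bam` piped into a script that passes `sys.stdin` to `keep_single_hit`.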
Old 07-14-2015, 07:39 PM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Typically, a read that maps to multiple locations with a similar (internal) score will get a MAPQ of 3 or less. MAPQ is -10*log10 of the probability that the alignment is wrong, so 3 corresponds to roughly a 50% chance that a given alignment is correct, which is the most either of two equally good placements can have. But it varies greatly by aligner; some will always give MAPQ 255 for any mapped read, for example.
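To make the arithmetic behind the "3 or less" figure concrete, here is a small sketch of the conversion defined in the SAM specification, MAPQ = -10*log10(P(mapping is wrong)), rounded to the nearest integer. The function names are only illustrative, and as noted above an aligner is free to fill the field in differently.

```python
import math


def mapq_to_error_prob(mapq):
    """Probability that the mapping is wrong, per the SAM definition.

    Returns None for 255, which the SAM spec reserves for
    "mapping quality not available".
    """
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)


def error_prob_to_mapq(p_wrong):
    """Inverse conversion, rounded to the nearest integer."""
    return round(-10 * math.log10(p_wrong))


print(mapq_to_error_prob(3))    # roughly 0.50
print(error_prob_to_mapq(0.5))  # 3
```

So a read with two equally good placements has a 50% error probability for either one, which lands at MAPQ 3, and more placements push it lower.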

More importantly, even if an aligner does assign a read to multiple locations, they are not necessarily equivalent; the primary might be much better than the secondaries (and as such perhaps get a MAPQ well above 3). There is no simple, universal way to remove all multi-mapped reads from a SAM file while processing it as a stream, short of tracking read names to see how many times each occurs. You could make that tracking efficient if the file is sorted by name.
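The name-tracking idea could look roughly like this on a name-sorted SAM stream. This is a sketch under assumptions of mine, not a standard tool: records with the same read name are adjacent, supplementary (0x800) records are treated as pieces of a chimeric alignment and not as extra locations, and unmapped reads pass through untouched. `drop_multimapped` is a hypothetical name.

```python
from collections import Counter
from itertools import groupby

UNMAPPED = 0x4
READ1 = 0x40
READ2 = 0x80
SUPPLEMENTARY = 0x800


def _qname(line):
    return line.split("\t", 1)[0]


def drop_multimapped(sam_lines):
    """Yield headers plus all records of reads aligned to a single location.

    Assumes records sharing a read name are adjacent (name-sorted input).
    """
    for is_header, block in groupby(sam_lines, key=lambda l: l.startswith("@")):
        if is_header:
            yield from block
            continue
        for _, group in groupby(block, key=_qname):
            group = list(group)
            hits = Counter()  # alignment count per mate (0 = unpaired)
            for line in group:
                flag = int(line.split("\t", 2)[1])
                if flag & (UNMAPPED | SUPPLEMENTARY):
                    continue
                hits[flag & (READ1 | READ2)] += 1
            # Keep the read only if no mate has more than one alignment.
            if all(n == 1 for n in hits.values()):
                yield from group
```

Only one read's records are held in memory at a time, which is the efficiency gain from sorting by name. Note that this only sees locations the aligner actually reported.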

It's trivial to filter out all secondary alignments, though, with samtools.
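With samtools itself this is `samtools view -b -F 256 file.bam > primary.bam`, since 256 (0x100) is the SAM flag bit for a secondary alignment. For anyone who wants to see what that filter does, here is a minimal Python equivalent on SAM text lines; the function name is just illustrative.

```python
SECONDARY = 0x100  # SAM flag bit: "secondary alignment"


def drop_secondary(sam_lines):
    """Yield header lines and every record without the secondary bit set."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line
            continue
        flag = int(line.split("\t", 2)[1])
        if not flag & SECONDARY:
            yield line
```

Keep in mind that this leaves the primary alignment of a multi-mapped read in place, so it removes extra alignments, not the multi-mapped reads themselves.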