SEQanswers

Old 01-12-2015, 10:50 AM   #1
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default how to get reads with "mate mapped to a different chr"

In the output of
Code:
samtools flagstat input.bam
the last two lines are:

HTML Code:
xxx +0 with mate mapped to a different chr
xxx +0 with mate mapped to a different chr (mapQ>=5)
I'm wondering what's the FLAG for "mate mapped to a different chr"?
Old 01-12-2015, 11:04 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

There is no flag; you compare "rname" with "rnext". If they are the same, or "rnext" is "=", and they are both mapped (0x1 set, 0x4 and 0x8 not set) then they are on the same sequence. They are mapped to different sequences if both are mapped but rnext is not "=" and not equal to rname.
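A minimal sketch of that comparison in Python, working on raw SAM text lines (for example, piped from `samtools view input.bam`). The function name `mate_on_different_chr` and the two sample records are made up for illustration; the column positions and flag bits are the ones given in the SAM specification.

```python
PAIRED = 0x1         # template has multiple segments (read is paired)
UNMAPPED = 0x4       # this segment is unmapped
MATE_UNMAPPED = 0x8  # next segment in the template (the mate) is unmapped


def mate_on_different_chr(sam_line):
    """Return True if read and mate are both mapped, but to different references."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: FLAG
    rname = fields[2]       # column 3: RNAME
    rnext = fields[6]       # column 7: RNEXT ("=" means same as RNAME)
    if not flag & PAIRED:
        return False
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return False
    if rnext in ("=", "*"):
        return False
    return rnext != rname


# Two hypothetical alignment records (SEQ and QUAL left as "*").
same = "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*"
diff = "r2\t65\tchr1\t100\t60\t50M\tchr5\t900\t0\t*\t*"
print(mate_on_different_chr(same), mate_on_different_chr(diff))
```

This tests one record at a time, so a total built from it may not match the flagstat line exactly if flagstat treats secondary or supplementary records differently.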
Old 01-12-2015, 11:05 AM   #3
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default

Quote:
Originally Posted by Brian Bushnell View Post
There is no flag; you compare "rname" with "rnext". If they are the same, or "rnext" is "=", and they are both mapped (0x1 set, 0x4 and 0x8 not set) then they are on the same sequence.
oh, i see. Thanks!
Old 01-12-2015, 11:14 AM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 694
Default

Good question.

Consult the SAM format documentation.

https://samtools.github.io/hts-specs/SAMv1.pdf

Section 1.4 contains the bitwise flag descriptions.

Note that not all software properly sets these flags.

I guess you'd check the 0x4 and 0x8 flags (0x4 = segment unmapped, 0x8 = next segment in the template unmapped). If neither is set, i.e. both the read and its mate are mapped, then check whether field 3 (RNAME) differs from field 7 (RNEXT) [and that field 7 is not '*' and not '='].

There are various "FIXMATE" programs around; you may wish to use one.

I'm not sure about 0x800 flag (not same as 0x8). "Chimeric" I'm guessing doesn't always mean same chromosome.
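For reading the bitwise FLAG field by eye, a small decoder can help. This is only a sketch: the bit values follow the table in section 1.4 of the SAM specification, while the short names in `FLAG_BITS` and the function `decode_flag` are labels chosen here for illustration, not anything samtools defines.

```python
# Bit values from the SAM spec FLAG table; the names are informal labels.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "last_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


print(decode_flag(99))
print(decode_flag(2113))
```

Note that 0x8 and 0x800 are unrelated bits: the first describes the mate, the second marks the record itself as a supplementary alignment.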

Last edited by Richard Finney; 01-12-2015 at 11:24 AM.
Old 01-12-2015, 11:20 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by Richard Finney View Post
I'm not sure about 0x800 flag (not same as 0x8). "Chimeric" i'm guessing doesn't always mean same chromosome.
I don't think the answer is fully defined for either chimeric or secondary alignments.
Old 01-12-2015, 12:01 PM   #6
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default

Quote:
Originally Posted by Brian Bushnell View Post
I don't think the answer is fully defined for either chimeric or secondary alignments.
I'm actually pretty confused about what a "secondary alignment" is. Can you clarify it a little?

Also, in the attached image, I don't understand how the entry in the second column corresponds to the one in the first column. For example, 0x0001 and p are supposed to represent the same thing, I guess? What are the encoding conversion rules here? And where is the string representation used? In BAM files?
Attached Images
File Type: png SAM FLAG.png (51.0 KB, 11 views)
Old 01-12-2015, 12:09 PM   #7
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I have not seen that string notation before ("p", "P", etc.) and it's not part of the SAM specification as far as I know. Reads are supposed to have at most one primary alignment; if a read maps to multiple locations, it can have multiple secondary alignments (0x100 flag bit). But reads can also have multiple 0x800 "supplementary" alignments, a new feature of the SAM format which is rather confusing.
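As a rough illustration of the three categories, the sketch below labels a record from its FLAG alone. The function name `alignment_kind` is hypothetical, and checking 0x800 before 0x100 is an arbitrary choice made here for the case where both bits happen to be set.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def alignment_kind(flag):
    """Label a SAM record as unmapped, supplementary, secondary, or primary."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


for flag in (99, 355, 2113, 4):
    print(flag, alignment_kind(flag))
```

To keep only primary records on the command line, excluding both bits with `samtools view -F 0x900` should do the same thing.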
Old 01-12-2015, 12:15 PM   #8
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default

Quote:
Originally Posted by Brian Bushnell View Post
I have not seen that string notation before ("p", "P", etc) and it's not part of the SAM specification as far as I know. Reads are supposed to have at most one primary alignment; if a read maps to multiple locations, it can have multiple secondary alignments (0x100 flag bit). But reads can also have multiple 0x800 "supplementary" alignments, a new feature of the sam format which is rather confusing.
Yeah, the documentation is very limited and confusing. I had to search online for a few tutorials to get an OK understanding of the bitwise FLAG.

I'm confused.. what does supplementary alignment mean?
Old 01-12-2015, 12:19 PM   #9
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

In practice, if you run bwa, it means a chimeric alignment, in which there are multiple local alignments that are not very close to each other. I am not aware of any other aligners that generate it.
Old 01-12-2015, 01:38 PM   #10
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default

Quote:
Originally Posted by Brian Bushnell View Post
In practice, if you run bwa, it means a chimeric alignment, in which there are multiple local alignments that are not very close to each other. I am not aware of any other aligners that generate it.
I see. Thanks.
Old 01-12-2015, 02:28 PM   #11
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479
Default

Quote:
Originally Posted by Brian Bushnell View Post
I have not seen that string notation before ("p", "P", etc) and it's not part of the SAM specification as far as I know.
It's been deprecated. Back in the day, there was a multicharacter version of the FLAG field available from samtools. That was done away with after the conversion to htslib (I'm pretty sure it was still there in 0.1.19). The only thing I've ever seen use those is BS-Seeker2, in fact.
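For anyone who runs into those strings in old output, here is a sketch of the conversion. It assumes the character table was "pPuUrR12sfd", one character per bit starting from 0x1, which is how I recall it from the 0.1.x source; please check against the tool that produced your file. The function name `flag_to_string` is made up for this example.

```python
# Assumed legacy table: position i holds the character for bit (1 << i).
# p=paired, P=proper pair, u=unmapped, U=mate unmapped, r=reverse,
# R=mate reverse, 1=first, 2=second, s=secondary, f=QC fail, d=duplicate
FLAG_CHARS = "pPuUrR12sfd"


def flag_to_string(flag):
    """Convert a numeric FLAG into the legacy character notation."""
    return "".join(ch for i, ch in enumerate(FLAG_CHARS) if flag & (1 << i))


print(flag_to_string(0x1))
print(flag_to_string(99))
```

Under that assumption, 0x0001 and "p" in the attached table are the same bit written two ways, and there is no character for 0x800 because the supplementary bit was added later.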
Old 01-13-2015, 10:50 AM   #12
gene_x
Senior Member
 
Location: MO

Join Date: May 2010
Posts: 108
Default

Quote:
Originally Posted by dpryan View Post
It's been deprecated. Back in the day, there was a multicharacter version of the FLAG field available from samtools. That was done away with after the conversion to htslib (I'm pretty sure it was still there in 0.1.19). The only thing I've ever seen use those is BS-Seeker2, in fact.
I see. Good to know!
Old 07-18-2018, 04:43 AM   #13
splaisan
senior molecular biologist
 
Location: Belgium

Join Date: Jun 2009
Posts: 30
Default one way to get it

see https://www.biostars.org/p/17575/#327644
All times are GMT -8. The time now is 12:06 PM.

