  • gene_x
    Senior Member
    • May 2010
    • 108

    how to get reads with "mate mapped to a different chr"

    In the output of
    Code:
    samtools flagstat input.bam
    There are these last two lines:

    Code:
    xxx +0 with mate mapped to a different chr
    xxx +0 with mate mapped to a different chr (mapQ>=5)
    I'm wondering what the FLAG is for "mate mapped to a different chr"?
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    There is no flag; you compare "rname" with "rnext". If they are the same, or "rnext" is "=", and they are both mapped (0x1 set, 0x4 and 0x8 not set) then they are on the same sequence. They are mapped to different sequences if both are mapped but rnext is not "=" and not equal to rname.
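The comparison described above could be sketched in Python roughly as follows. This is only an illustration working on plain-text SAM lines (for example the output of `samtools view`); the function name `mate_on_different_chr` and the sample records are made up for the example, and it is not guaranteed to reproduce the exact count that `samtools flagstat` reports.

```python
# Bits from the SAM FLAG field that matter for this check.
PAIRED = 0x1         # template has multiple segments (read is paired)
UNMAPPED = 0x4       # this segment is unmapped
MATE_UNMAPPED = 0x8  # next segment in the template is unmapped


def mate_on_different_chr(sam_line):
    """Return True if the read and its mate are both mapped, to different references."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: FLAG
    rname = fields[2]      # column 3: RNAME
    rnext = fields[6]      # column 7: RNEXT

    if not flag & PAIRED:
        return False
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return False
    # "=" means same reference as RNAME; "*" means no information available.
    if rnext in ("=", "*"):
        return False
    return rnext != rname


# Hypothetical records, trimmed to the 11 mandatory columns.
same_chr = "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*"
other_chr = "r2\t65\tchr1\t100\t60\t50M\tchr5\t900\t0\t*\t*"

print(mate_on_different_chr(same_chr))
print(mate_on_different_chr(other_chr))
```

To use it on a real file, one option is to pipe `samtools view input.bam` into a small script that calls the function on each line of standard input and prints the lines for which it returns True.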

    • gene_x
      Senior Member
      • May 2010
      • 108

      #3
      Originally posted by Brian Bushnell View Post
      There is no flag; you compare "rname" with "rnext". If they are the same, or "rnext" is "=", and they are both mapped (0x1 set, 0x4 and 0x8 not set) then they are on the same sequence.
      oh, i see. Thanks!

      • Richard Finney
        Senior Member
        • Feb 2009
        • 701

        #4
        Good question.

        Consult the SAM format documentation.

        Section 1.4 contains the bitwise flag descriptions.

        Note that not all software properly sets these flags.

        I guess you'd check the 0x4 and 0x8 flags (0x4 = segment unmapped, 0x8 = next segment in the template unmapped). If neither is set (both reads mapped), then check whether field 3 (RNAME) differs from field 7 (RNEXT) [and field 7 is not '*' and not '='].

        There are various "FIXMATE" programs floating around that you may wish to use.

        I'm not sure about 0x800 flag (not same as 0x8). "Chimeric" I'm guessing doesn't always mean same chromosome.
        Last edited by Richard Finney; 01-12-2015, 12:24 PM.

        • Brian Bushnell
          Super Moderator
          • Jan 2014
          • 2709

          #5
          Originally posted by Richard Finney View Post
          I'm not sure about 0x800 flag (not same as 0x8). "Chimeric" i'm guessing doesn't always mean same chromosome.
          I don't think the answer is fully defined for either chimeric or secondary alignments.

          • gene_x
            Senior Member
            • May 2010
            • 108

            #6
            Originally posted by Brian Bushnell View Post
            I don't think the answer is fully defined for either chimeric or secondary alignments.
            I'm actually pretty confused about what is "secondary alignment".. can you clarify it a little bit?

            Also, in the attached image, I don't understand how the entry in the second column corresponds to the one in the first column. For example, 0x0001 and p are supposed to represent the same thing, I guess? What are the encoding conversion rules here? And where is the string representation used? In BAM files?

            • Brian Bushnell
              Super Moderator
              • Jan 2014
              • 2709

              #7
              I have not seen that string notation before ("p", "P", etc) and it's not part of the SAM specification as far as I know. Reads are supposed to have at most one primary alignment; if a read maps to multiple locations, it can have multiple secondary alignments (0x100 flag bit). But reads can also have multiple 0x800 "supplementary" alignments, a new feature of the sam format which is rather confusing.
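As a rough illustration of the two bits mentioned above, a FLAG value can be sorted into primary, secondary (0x100) or supplementary (0x800) with a couple of bitwise tests. The helper name `alignment_type` is invented for this sketch, and checking 0x800 before 0x100 is just a choice made here for records that happen to carry both bits.

```python
SECONDARY = 0x100      # secondary alignment
SUPPLEMENTARY = 0x800  # supplementary alignment


def alignment_type(flag):
    """Classify a SAM FLAG value by its 0x100 / 0x800 bits."""
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    # Neither bit set: this is the primary line for the read.
    return "primary"


# Example FLAG values: 99 has neither bit, 355 = 99 + 0x100, 2147 = 99 + 0x800.
for flag in (99, 355, 2147):
    print(flag, alignment_type(flag))
```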

              • gene_x
                Senior Member
                • May 2010
                • 108

                #8
                Originally posted by Brian Bushnell View Post
                I have not seen that string notation before ("p", "P", etc) and it's not part of the SAM specification as far as I know. Reads are supposed to have at most one primary alignment; if a read maps to multiple locations, it can have multiple secondary alignments (0x100 flag bit). But reads can also have multiple 0x800 "supplementary" alignments, a new feature of the sam format which is rather confusing.
                Yeah, the documentation is very limited and confusing. I have to search online for a few tutorials to get an OK understanding of bitwise FLAG..

                I'm confused.. what does supplementary alignment mean?

                • Brian Bushnell
                  Super Moderator
                  • Jan 2014
                  • 2709

                  #9
                  In practice, if you run bwa, it means a chimeric alignment, in which there are multiple local alignments that are not very close to each other. I am not aware of any other aligners that generate it.

                  • gene_x
                    Senior Member
                    • May 2010
                    • 108

                    #10
                    Originally posted by Brian Bushnell View Post
                    In practice, if you run bwa, it means a chimeric alignment, in which there are multiple local alignments that are not very close to each other. I am not aware of any other aligners that generate it.
                    I see. Thanks.

                    • dpryan
                      Devon Ryan
                      • Jul 2011
                      • 3478

                      #11
                      Originally posted by Brian Bushnell View Post
                      I have not seen that string notation before ("p", "P", etc) and it's not part of the SAM specification as far as I know.
                      It's been deprecated. Back in the day, there was a multicharacter version of the FLAG field available from samtools. That was done away with after the conversion to htslib (I'm pretty sure it was still there in 0.1.19). The only thing I've ever seen use those is BS-Seeker2, in fact.

                      • gene_x
                        Senior Member
                        • May 2010
                        • 108

                        #12
                        Originally posted by dpryan View Post
                        It's been deprecated. Back in the day, there was a multicharacter version of the FLAG field available from samtools. That was done away with after the conversion to htslib (I'm pretty sure it was still there in 0.1.19). The only thing I've ever seen use those is BS-Seeker2, in fact.
                        I see. Good to know!

                        • splaisan
                          senior molecular biologist
                          • Jun 2009
                          • 32

                          #13
                          one way to get it

                          see https://www.biostars.org/p/17575/#327644
                          http://www.bits.vib.be/index.php
