  • bowtie2 bit 2 of the flag

    Hello,

    I am wondering what this sentence means: "The second bit (2 in decimal, 0x2 in hexadecimal) is set if the read is part of a pair that aligned in a paired-end fashion."

    I have mate-pair reads mapped, with the following entries in the SAM file:

    Code:
    H8:C19M1ACXX:5:1101:6777:2322   97      Sc_hox  1694486 42      101M    =       1691781 -2806   AGCCGGGCTGGAGCTTACCTGGCTGACAGGAACTTCTCTGGCTAGCATATACGATTTTCGGTGCACCGTGTAGATCTGCTTGGAGATTACTAGTAAGTGTC   CCCFFFFFHHHHHJJJJJJHGHJJJIIIIJIIJIIJJJJIIJJJIJJJJJIIJGIIIJGHHHFFFEE>;A?CCDDDDCCDDDB:CCDCCACCCCBCB?ACD   AS:i:-10        XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:93G0G6     YS:i:0  YT:Z:DP
    H8:C19M1ACXX:5:1101:6777:2322   145     Sc_hox  1691781 42      101M    =       1694486 2806    TACTGCTGAAGGCAATATTAGTTCCCCTCACAATGCGAAACTAATGTTATAGATTATATTGAAAACTCATTGGTACTGTAAACTATATCATAAACATAGTA   CDDDDCDEEEEEEEFFFFFFHHHIIIJJJIHJJIJIIJJHJJIJJIIJIHHIJJJIJIIJJJIGHFJIIJJJJJGJJJJJJJJJJJJJHHHHHFFFFFCCC   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:101        YS:i:-10        YT:Z:DP
    The bit flags of those reads in binary (from left to right: 1 2 4 8 16 32 64 128) are:

    1 0 0 0 0 1 1 0
    1 0 0 0 1 0 0 1
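
The bit rows above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original post: the names `FLAG_BITS`, `decode_flag`, and `bit_row` are made up for illustration, and only the eight lowest SAM flag bits listed in the post are covered.

```python
# SAM flag bits in the same left-to-right order used in the post
# (1, 2, 4, 8, 16, 32, 64, 128). The names are illustrative labels.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM flag."""
    return [name for bit, name in FLAG_BITS if flag & bit]


def bit_row(flag):
    """Render the flag as a row of 0/1 values, lowest bit first."""
    return " ".join("1" if flag & bit else "0" for bit, _ in FLAG_BITS)


for flag in (97, 145):
    print(flag, "|", bit_row(flag), "|", ", ".join(decode_flag(flag)))
```

For both records, bit 0x2 (`proper_pair`) is absent from the decoded list, which is exactly what the question is about.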

    It seems that both reads are well aligned (full length), they are about 3 kb from each other, and they are on opposite strands...

    The only thing I cannot understand is why bit 2 is not set, because everything else looks fine to me...

    Cheers

  • #2
    Bit 2 denotes aligning as a "proper pair according to the aligner". Firstly, the alignments point away from each other. Secondly a 3kb fragment size is about 6-10x wider than expected. Are these mate-pair alignments? If so, you'll need to tell the aligner that.
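
The two checks described in this reply (orientation and fragment size) can be sketched as follows. This is an illustration under stated assumptions, not Bowtie 2's actual logic: `pair_orientation` and `looks_proper` are hypothetical helpers, only leftmost alignment positions are compared (so overlapping or containing mates are not handled), and the default limit of 500 is meant to mirror Bowtie 2's default maximum fragment length (`-X`/`--maxins`) with its default inward (`--fr`) orientation. For a mate-pair library one would typically pass something like `--rf` together with a larger `-X` to the aligner.

```python
def pair_orientation(flag, pos, mate_pos):
    """Classify a pair on the same reference as inward, outward or same_strand.

    Only the leftmost positions (SAM POS and PNEXT) are compared, so this
    is an approximation for overlapping mates.
    """
    reverse = bool(flag & 0x10)
    mate_reverse = bool(flag & 0x20)
    if reverse == mate_reverse:
        return "same_strand"
    # Work out which mate is on the forward strand and which on the reverse.
    fwd_pos, rev_pos = (mate_pos, pos) if reverse else (pos, mate_pos)
    # Forward read upstream of the reverse read: the mates face each other.
    return "inward" if fwd_pos <= rev_pos else "outward"


def looks_proper(flag, pos, mate_pos, tlen, expected="inward", max_insert=500):
    """Hypothetical check: expected orientation and fragment within the limit."""
    return (pair_orientation(flag, pos, mate_pos) == expected
            and abs(tlen) <= max_insert)


# First record from the question: flag 97, POS 1694486, PNEXT 1691781, TLEN -2806.
print(pair_orientation(97, 1694486, 1691781))
print(looks_proper(97, 1694486, 1691781, -2806))
print(looks_proper(97, 1694486, 1691781, -2806, expected="outward", max_insert=4000))
```

Under paired-end assumptions the record fails both checks; once outward orientation and a larger fragment limit are assumed, the same record passes.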

    • #3
      Originally posted by dpryan:
      Bit 2 denotes aligning as a "proper pair according to the aligner". Firstly, the alignments point away from each other. Secondly a 3kb fragment size is about 6-10x wider than expected. Are these mate-pair alignments? If so, you'll need to tell the aligner that.

      Cheers Ryan!

      Yes, these are mate pairs (that is why I expected this insert size). Is it really important for the mapper to know that the library is a mate-pair library? It has already been post-processed, so the only difference from a paired-end library is the insert size...

      I had not realized before that they do not point toward each other. If this is the reason, how should this observation be interpreted? About half of the pairs lack flag 2 (I have to check whether all of them point away from each other).

      • #4
        For mate pairs, it's expected that alignments point away from each other and have long insert sizes. Just make sure not to filter these out during a downstream analysis step.

        • #5
          You are right. Twice.

          Mate pairs should indeed point away from each other, but that is not the problem. And here is why you were right twice: Bowtie assumed that I had paired-end reads, so it marked as proper the fraction of pairs that lie close together and point toward each other.

          So when I plotted a histogram of insert sizes, I found that all proper pairs (with flag 2) have a very small insert size, and those which have a small insert size and still lack flag 2 point away from each other (mate-pair orientation): https://plus.google.com/u/0/10778698...81109861311357

          So my ultimate explanation is that I simply have poorly preprocessed mate pairs: about half of them are in fact just paired-end reads (i.e. they probably just had the adapter trimmed from one of the outer sides).
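
The grouping behind such a histogram can be sketched in plain Python. This is an assumption-laden illustration, not the poster's actual script: `insert_sizes_by_class` is a hypothetical helper that reads SAM text lines, keeps one record per pair (the first mate, both mates mapped to the same reference), and collects absolute TLEN values keyed by whether flag 2 is set and by orientation. The example line reuses the first record from the question, with SEQ and QUAL replaced by `*` for brevity.

```python
from collections import defaultdict


def insert_sizes_by_class(sam_lines):
    """Group |TLEN| by (flag 2 set?, orientation) for first mates only."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        pos, mate_pos, tlen = int(fields[3]), int(fields[7]), int(fields[8])
        # Keep the first mate only, and only if both mates are mapped
        # (bits 0x4 and 0x8 unset) to the same reference (RNEXT is "=").
        if not flag & 0x40 or flag & 0xC or fields[6] != "=":
            continue
        reverse, mate_reverse = bool(flag & 0x10), bool(flag & 0x20)
        if reverse == mate_reverse:
            orientation = "same_strand"
        else:
            fwd, rev = (mate_pos, pos) if reverse else (pos, mate_pos)
            orientation = "inward" if fwd <= rev else "outward"
        groups[(bool(flag & 0x2), orientation)].append(abs(tlen))
    return dict(groups)


# First record from the question, SEQ/QUAL abbreviated to "*".
example = ["H8:C19M1ACXX:5:1101:6777:2322\t97\tSc_hox\t1694486\t42\t101M\t=\t1691781\t-2806\t*\t*"]
print(insert_sizes_by_class(example))
```

Each list in the result can then be handed to any plotting tool to draw one histogram per class.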

          • #6
            What kind of reads are these? Nextera LMP libraries, for example, are expected to produce a substantial fraction of short inserts, and those reads need to be specially processed first. If processed correctly, the long- and short-inserts will not end up in the same file.

            • #7
              Brian, yes, I guess so, given the fraction of short insert sizes with convergent orientation. The reads were trimmed (and QC looked good), so I expected that the fraction of short inserts had already been filtered out; apparently it had not.

              Mystery solved. Thanks again, Ryan.
