  • bwa: “XT:A:U” and MAPQ of 0 at the same time

    I am mapping RNA-Seq data to the genome with BWA. The SAM output contains reads that carry the tag "XT:A:U" and, at the same time, a mapping quality of 0.

    What does this mean? I thought that "XT:A:U" means a unique best hit?! How does this go together with a MAPQ of 0?

    As far as I understand, the MAPQ value is also 0 if there are other possible alignments, even ones with a lower score. So a read might have an "XT:A:U" tag but at the same time a MAPQ of 0, meaning that there are many other possible alignments with a slightly worse score.

    Any comments on this?

    best,
    tefina
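One way to see how often the two values disagree is to cross-tabulate the XT type against MAPQ over a whole SAM file. The following is a minimal sketch, assuming tab-separated SAM text (for example the output of `samtools view`) and BWA-style `XT:A:` tags; the function names `xt_and_mapq` and `crosstab` are made up for this illustration.

```python
from collections import Counter


def xt_and_mapq(sam_line):
    """Return (XT type, MAPQ) for one SAM alignment line; XT is None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])          # column 5 of a SAM record is MAPQ
    xt = None
    for tag in fields[11:]:        # optional fields start at column 12
        if tag.startswith("XT:A:"):
            xt = tag[5:]
            break
    return xt, mapq


def crosstab(sam_lines):
    """Count records by (XT type, 'MAPQ=0' or 'MAPQ>0'), skipping header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        xt, mapq = xt_and_mapq(line)
        counts[(xt, "MAPQ=0" if mapq == 0 else "MAPQ>0")] += 1
    return counts
```

Calling `crosstab` on an open SAM file handle gives a quick count of, say, `("U", "MAPQ=0")` records, which is the combination asked about here.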

  • #2
    They are produced by different algorithms. MAPQ is produced by Smith-Waterman alignment, while the XT tags come from the Burrows-Wheeler Transform search.

    What I want to know is which (if either) I should "trust".

    -Cam

    • #3
      Another question regarding MAPQ and the XT:A tag

      Hi all,
      I have a similar question, but about the complementary situation:
      how come I have reads that are flagged with XT:A:R (which means they have multiple possible alignments) and at the same time MAPQ > 0?
      (Their SAM flags indicate that they are "mapped in correct orientation and within insert size", i.e. flags 99, 147, 83, 163.)

      Am I missing anything?
      I feel that I might have misunderstood it completely.
      Your help will be appreciated :-))
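For reference, the four flag values mentioned in this post can be decoded bit by bit. This is a small sketch using the FLAG bit meanings from the SAM specification; the short labels and the helper name `decode_flag` are chosen for this example only.

```python
# FLAG bits as defined in the SAM specification (labels are shorthand).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (99, 147, 83, 163):
    print(flag, decode_flag(flag))
```

All four values have the `proper_pair` bit (0x2) set, which is the bit that `samtools view -f 2` selects on. The flag says nothing about how many places the read could align to, so it does not by itself settle the XT:A:R question.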

      • #4
        Hi Morangal,

        I suspect that in these cases the reads can also be mapped correctly to a second location. MAPQ = 0 if reads map perfectly in two places. This can happen, for example, with short reads and multiple copies of a gene. Given that you can't actually tell which of the two locations produced the read, you can't have any confidence in claiming it mapped to one location or the other.

        Hope this helps,
        Cam
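As background to the point about confidence: the SAM specification defines MAPQ as a Phred-scaled probability that the reported position is wrong, MAPQ = -10 * log10(P_wrong), rounded to the nearest integer. The sketch below only illustrates that definition; the function name and the `cap` parameter are assumptions made for this example, and it is not how BWA computes its values.

```python
import math


def mapq_from_error_prob(p_wrong, cap=60):
    """Phred-scale a probability that the mapping position is wrong.

    `cap` is a hypothetical upper bound used here to avoid log10(0).
    """
    if p_wrong <= 0:
        return cap
    return min(cap, int(round(-10 * math.log10(p_wrong))))


print(mapq_from_error_prob(0.5))    # two equally good places: about 3
print(mapq_from_error_prob(0.001))  # 1-in-1000 chance of being wrong: 30
```

Under the strict definition a two-way tie comes out at about 3, not exactly 0. Many aligners, BWA among them as far as I know, simply report 0 for such ties by convention, which is consistent with the explanation above.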

        • #5
          Thanks Cam for your quick reply :-)
          But I still don't understand why those reads that are marked as "multiple locations" (i.e. XT:A:R) can get a MAPQ that is not 0 (some actually have a high MAPQ, e.g. 36).
          The way I understand it, if there is more than one possible alignment, then the mapping quality should be low or zero...

          anyone?

          • #6
            The XT:A:R tag is generated by a different algorithm from the MAPQ score, so they can give different answers!

            The question is which algorithm do you believe? BW alignment creates the XT tag, so it detected two locations of high mapping probability, while Smith-Waterman only detected one good match and so gave a high MAPQ score.

            So which do you trust?

            Cam

            • #7
              MAPQ=0, XT:A:U, BWA

              Have any of you resolved this? Or rather, do you have any suggestions on which value to follow, "XT:A:U" or the MAPQ? Below is an example from my BAM file, on which I ran

              samtools view -f 2 aln.bam |grep "XT:A:U" >tmp.bam
              I am looking for properly paired reads that are uniquely mapped, but I am having the same issue as described; please see below:

              HTML Code:
              HWI700710:177:d0u1eacxx:6:2301:4126:92443       99      PM_map_6-1 113189  0       101M    =       113411  299     CGACTCCAGCTTCTCCGACGCAGGTGAATTTTTGCACCAACAGGATTGGCGCTAGAGGAAGGGGCTGTGTCTTCGCAGTACCCTTGTTCTTAAAGGAGCGC   CCCFFFFFHHHGFGIGJJJIIJJJBFHGIJIJJJIJJIGGJCHIIIIJJJBHHEFDE@AEEDDDC=?BDDDDDDD@BDDDDDCCDC@CDEDCADD=A?ADB   XT:A:U  NM:i:2  SM:i:0 AM:i:0   X0:i:1  X1:i:14 XM:i:2  XO:i:0  XG:i:0  MD:Z:14T70A15
              HWI700710:177:d0u1eacxx:6:2107:10916:192451     163     PM_map_6-1 142637  0       80M     =       143393  854     AGCTTCTCGGAGTTGGAACATGCATTTTGATAAGGTGATCAAAACGTATGGCTTCATTAAGAATGGAGCAGAGCCCTGCA        CCCFFFFFHHHHFHIIJIJIIIJJJJJJJIIJJIJCAGGGIJJIJGDHIHHIJGIIIE@GHJJJFJCD.=>AEDDEB>=;        XT:A:U  NM:i:1  SM:i:0  AM:i:0  X0:i:1  X1:i:1015      XM:i:1   XO:i:0  XG:i:0  MD:Z:68A11
              HWI700710:177:d0u1eacxx:6:2302:5626:184755      147     PM_map_6-1 145836  0       5M1I71M =       145311  -601    CCTGATTCAACCATCAAAGAAGGATCAGCGCAGTACTAGCATACTGTGCTGATTTTCTGAAGCAGGACTTCTGATCG   HFHE?7)HGDJIJJJJIIHHIJJIJIHJJJJIJIIJJIHGGGHIFJJIJJJJJJJJJJJJJJJIHHHHHFFFFFCC@   XT:A:U  NM:i:1  SM:i:0  AM:i:0  X0:i:1  X1:i:783        XM:i:0  XO:i:1 XG:i:1   MD:Z:76
              HWI700710:177:d0u1eacxx:6:1201:17643:176102     99      PM_map_6-1 195460  0       98M     =       195850  466     GACCCAATGGCATACTTGAAATTCTTGGCAAACTTTTCGCGGGCTCTTTTCTTTTTAACTTAAAGCCCGTTCATGAGACGACCCCTTCTTCTTCTTCG      @<@DDDDDAD?DDBHIG+<EGGIG@FFHGG@G;CGHIF6FFEIGGGIHHHAEHC>>?>BCDCCA>35=?@?<:@@>CA@?><0<B?<?CC>ACACC@<      XT:A:U  NM:i:3  SM:i:0 AM:i:0   X0:i:1  X1:i:8  XM:i:3  XO:i:0  XG:i:0  MD:Z:35C0C40C20
              HWI700710:177:d0u1eacxx:6:1201:17643:176102     147     PM_map_6-1 195850  0       76M     =       195460  -466    CCCAAAGATGAAACTTCCTCAAAGAAGAAAATCAGTCCGTGCGCTGGAAGAAGGCATCCGCAGAAAAAGGCAAGCT    ;;;DDA>53;EAAA9CB;>FHCHHE?CGCDDBIGHAIGIHGGHDEIGCGDJIIHDFEFFGBIIHDFFFFDFFFCC@    XT:A:U  NM:i:2  SM:i:0  AM:i:0  X0:i:1  X1:i:39 XM:i:2  XO:i:0  XG:i:0 MD:Z:15C45C14
              HWI700710:177:d0u1eacxx:6:1304:4645:131533      163     PM_map_6-1 199485  0       70M1I10M        =       199953  569     AGCTTTCAAGAAGCCTTTAATCCAAGCTCACCAACTCCTTAAGTGCACATTCAAGTCTCCATCCACCTTGTAGAAAATCCA       @CCFFFFFHHHHHJIJJJJJIGEIJJIIIJJIIGGGGGIJJIJIIIIJJJEDHIJGGGHIEGHIIGGGGG).=@EGIFEHH       XT:A:U  NM:i:2  SM:i:0  AM:i:0  X0:i:1  X1:i:26 XM:i:1   XO:i:1  XG:i:1  MD:Z:39C40
              I am currently using BWA version 0.5.9rc1 (r1561)

              Any input would be appreciated.



              Cheers,
              Jo
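The records quoted in this post already hint at what is going on: every one has X0:i:1 (one best hit) together with a non-zero X1 (suboptimal hits, per the BWA manual), and SM:i:0. A small sketch for pulling those tags out is shown below; `parse_tags` is a hypothetical helper, and the input string is the optional-field part of the first record above with whitespace normalised.

```python
def parse_tags(tag_text):
    """Parse whitespace-separated SAM optional fields (TAG:TYPE:VALUE) into a dict."""
    tags = {}
    for item in tag_text.split():
        name, typ, value = item.split(":", 2)
        # Integer-typed tags ("i") become ints; everything else stays a string.
        tags[name] = int(value) if typ == "i" else value
    return tags


# Optional fields of the first record quoted in this post.
first = parse_tags(
    "XT:A:U NM:i:2 SM:i:0 AM:i:0 X0:i:1 X1:i:14 XM:i:2 XO:i:0 XG:i:0 MD:Z:14T70A15"
)
print(first["XT"], first["X0"], first["X1"], first["SM"])  # U 1 14 0
```

So XT:A:U with X0 = 1 says the best hit is unique, while X1 = 14 says that 14 slightly worse placements were also found. That matches the reading suggested in the first post, that a MAPQ of 0 can reflect nearby suboptimal alignments, although the exact MAPQ rule in this BWA version is not confirmed in this thread.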
