  • bwa mapping quality

    BWA approximates mapping quality in the following way:

    {
        ...
        if (p->c1 == 0) return 23;
        if (p->c1 > 1) return 0;
        if (p->n_mm == mm) return 25;
        if (p->c2 == 0) return 37;
        n = (p->c2 >= 255)? 255 : p->c2;
        return (23 < g_log_n[n])? 0 : 23 - g_log_n[n];
    }

    c1 and c2 are the numbers of best (top1) and second-best (top2) hits. The higher the mapQ, the lower the probability that the read alignment is wrong. I am confused: according to the function above, if c1 is greater than 1, why does it return a mapQ of 0?

    Thanks for any comments and answers.

  • #2
    Originally posted by totalnew
    if c1 is more than 1, why return the mapQ 0?
    It is ambiguous which hit is the best one, since there is more than one.


    • #3
      Originally posted by nilshomer
      It is ambiguous which hit is the best one, since there is more than one.
      Sorry, I still don't understand.


      • #4
        Originally posted by totalnew
        Sorry, I still don't understand.
        If a read aligns to two locations equally well, then you cannot unambiguously say which of the two is the correct location. Whenever there are two equally likely alignments, the mapping quality is zero.
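One way to see why the value has to be so low is the following sketch. It assumes that mapping quality follows the usual Phred convention and that one of n equally good hits is reported at random; neither assumption is spelled out in the quoted snippet.

```latex
Q = -10 \log_{10} P(\text{wrong}), \qquad
P(\text{wrong}) = 1 - \frac{1}{n} \;\ge\; \frac{1}{2} \quad (n \ge 2)
\quad\Longrightarrow\quad
Q \;\le\; -10 \log_{10} \tfrac{1}{2} \approx 3.01
```

So even in the most favorable ambiguous case (n = 2) the quality could be about 3 at best. The quoted function simply returns 0 whenever c1 > 1.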


        • #5
          That is clear enough, thanks a lot!


          • #6
            Originally posted by totalnew
            if (p->n_mm == mm) return 25;
            Where does this code come from?
            I don't understand the line: if (p->n_mm == mm) return 25;

            In my data,
            XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 gives a quality score of 25,
            XT:A:U NM:i:0 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 gives a quality score of 23.

            For my data (aligned with -n 2 -o 1 -e 2), I see:
            37 means NM <= 1, X0 == 1, X1 == 0;
            25 means NM == 2, X0 == 1, X1 == 0;
            23 means X1 == 1.

            Is this compatible with the rule above?
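As a rough check, the rule can be sketched in C and run against the three cases reported here. Everything beyond the quoted if/return lines is an assumption made for illustration: the read_hits_t struct, the names approx_mapq and check_forum_cases, the mapping of X0/X1/XM to c1/c2/n_mm, taking mm = 2 from -n 2, and the short log_n_small table, which assumes g_log_n[n] is roughly 10*log10(n) rounded to the nearest integer.

```c
#include <assert.h>

/* Hypothetical stand-in for the fields the quoted rule reads. */
typedef struct {
    int c1;    /* number of best hits (assumed to be the X0 tag) */
    int c2;    /* number of second-best hits (assumed to be the X1 tag) */
    int n_mm;  /* mismatches in the reported alignment (assumed XM) */
} read_hits_t;

/* Assumed stand-in for g_log_n: round(10 * log10(n)) for n = 1..8.
 * Index 0 is never read because c2 == 0 returns early. */
static const int log_n_small[] = {0, 0, 3, 5, 6, 7, 8, 8, 9};
enum { LOG_N_MAX = 8 };

/* Same decision order as the quoted snippet; mm is the maximum
 * number of differences allowed (assumed to come from -n). */
int approx_mapq(const read_hits_t *p, int mm)
{
    int n;
    if (p->c1 == 0) return 23;
    if (p->c1 > 1) return 0;       /* several equally good best hits */
    if (p->n_mm == mm) return 25;  /* one best hit, at the mismatch limit */
    if (p->c2 == 0) return 37;     /* one best hit, no second-best hit */
    /* The snippet clamps at 255; 8 is used here only because the
     * illustrative table is short. */
    n = (p->c2 >= LOG_N_MAX) ? LOG_N_MAX : p->c2;
    return (23 < log_n_small[n]) ? 0 : 23 - log_n_small[n];
}

/* The three cases reported above, with mm = 2. */
void check_forum_cases(void)
{
    read_hits_t unique_clean = {1, 0, 1};  /* NM <= 1, X0 == 1, X1 == 0 */
    read_hits_t at_limit     = {1, 0, 2};  /* NM == 2, X0 == 1, X1 == 0 */
    read_hits_t one_subopt   = {1, 1, 0};  /* X0 == 1, X1 == 1 */
    assert(approx_mapq(&unique_clean, 2) == 37);
    assert(approx_mapq(&at_limit, 2) == 25);
    assert(approx_mapq(&one_subopt, 2) == 23);
}
```

Under these assumptions the sketch gives 37, 25, and 23 for the three cases, so the observations look consistent with the rule. With the assumed table, a read with X0 == 1 and X1 == 2 would come out at 23 - 3 = 20.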


            • #7
              What is the probability that the second best hit was NOT found due to heuristics?

              Since all of these algorithms use heuristics, there is a good chance that some hits will be missed. When I evaluated BWA, I saw that the mapping quality of some sequences changed depending on the sensitivity setting I used. The bad part is that in some cases the mapping quality was higher under a combination of parameters that was supposed to be more sensitive. This was with one of the first versions of bwa, so I don't know whether it was a bug. It also affected only a few reads, which is part of the error rate mentioned in:

              "Simulation reveals that BWA may overestimate mapping quality due
              to this modification, but the deviation is relatively small. For example, BWA
              wrongly aligns 11 reads out of 1,569,108 simulated 70bp reads mapped with
              mapping quality 60." BWA paper

              My question is the following: since the reference genome is "static", would it be possible to fix this small error by tracing back which regions it generally arises in? I'm guessing it's some sort of sequence in repeat regions that is prone to errors, and that can be tricky because it will bias the mapping toward finding certain types of regions more than others.
