  • bwa mapping quality

    BWA approximates the mapping quality in this way:

    {.
    .
    .
    if (p->c1 == 0) return 23;
    if (p->c1 > 1) return 0;
    if (p->n_mm == mm) return 25;
    if (p->c2 == 0) return 37;
    n = (p->c2 >= 255)? 255 : p->c2;
    return (23 < g_log_n[n])? 0 : 23 - g_log_n[n];
    }

    c1 and c2 are the numbers of top1 and top2 hits. The higher the mapQ, the lower the probability that the read alignment is wrong. What confuses me is this: in the function above, why is a mapQ of 0 returned when c1 is greater than 1?

    Thanks for any comments and answers.

  • #2
    Originally posted by totalnew
    if c1 is more than 1, why return the mapQ 0?
    It is ambiguous which hit is the best, since there is more than one.

    • #3
      Originally posted by nilshomer
      It is ambiguous which hit is the best, since there is more than one.
      Sorry, I still don't understand.

      • #4
        Originally posted by totalnew
        Sorry, I still don't understand.
        If a read aligns to two locations equally well, you cannot say unambiguously which of the two is the correct location. Whenever there are two equally likely alignments, the mapping quality is zero.
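
To put a number on this: mapping quality is a Phred-scaled probability that the reported position is wrong. The following is a worked sketch under an idealised assumption that is not BWA's actual model, namely that the read truly comes from one of k equally good locations and one of them is reported at random:

```latex
Q = -10 \log_{10} P(\text{wrong}), \qquad
P(\text{wrong}) = 1 - \frac{1}{k}
\;\Rightarrow\;
k = 2:\quad Q = -10 \log_{10}(0.5) \approx 3.0
```

Even in this simplified picture the quality is only about 3 for two hits and drops further as k grows. The function quoted in the first post simply returns 0 whenever c1 > 1.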

        • #5
          That is clear enough, thanks a lot!

          • #6
            Originally posted by totalnew
            if (p->n_mm == mm) return 25;
            Where does this code come from?
            I don't understand the line if (p->n_mm == mm) return 25;

            For my data,
            XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 gives a quality score of 25,
            XT:A:U NM:i:0 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 gives a quality score of 23.

            For my data (aligned with -n 2 -o 1 -e 2):
            37 means NM<=1, X0==1, X1==0;
            25 means NM==2, X0==1, X1==0;
            23 means X1==1.

            Is this compatible with the rule above?
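
The three observations above can be checked against the quoted rule with a small self-contained sketch. Several things here are assumptions and not taken from the thread: the struct `hit_counts_t` and the helper `log_penalty` are hypothetical names, c1 and c2 are taken to correspond to the X0 and X1 tags, n_mm to the mismatch count shown in the tags, mm to the maximum number of differences allowed (2 with -n 2), and g_log_n[n] is assumed to be roughly 10*log10(n) rounded to the nearest integer.

```c
/* Hypothetical stand-in for the fields of the read record used by the rule. */
typedef struct {
    int c1;    /* number of best hits (assumed to be the X0 tag) */
    int c2;    /* number of second-best hits (assumed to be the X1 tag) */
    int n_mm;  /* number of mismatches in the reported alignment */
} hit_counts_t;

/* Assumed replacement for g_log_n[n]: 10*log10(n) rounded to the nearest
 * integer, computed without libm by stepping through powers of 10^(1/10). */
static int log_penalty(int n)
{
    const double step = 1.2589254117941673;  /* 10^(0.1)  */
    double threshold = 1.1220184543019633;   /* 10^(0.05) */
    int q = 0;
    while (threshold <= (double)n) {
        threshold *= step;
        ++q;
    }
    return q;
}

/* Same branch order as the function quoted in the first post. */
static int approx_mapq(const hit_counts_t *p, int mm)
{
    int n, penalty;
    if (p->c1 == 0) return 23;        /* no best hit counted */
    if (p->c1 > 1) return 0;          /* several equally good best hits */
    if (p->n_mm == mm) return 25;     /* unique hit at the mismatch limit */
    if (p->c2 == 0) return 37;        /* unique hit, no second-best hits */
    n = (p->c2 >= 255) ? 255 : p->c2; /* cap the second-best hit count */
    penalty = log_penalty(n);
    return (23 < penalty) ? 0 : 23 - penalty;
}
```

Under these assumptions and with mm = 2, a read with c1 = 1, c2 = 0, n_mm = 2 leaves at the n_mm == mm check with 25; a read with c1 = 1, c2 = 1, n_mm = 0 reaches the last line with a penalty of 0 and gets 23; and a read with c1 = 1, c2 = 0 and fewer than 2 mismatches leaves at the c2 == 0 check with 37. That matches the three cases listed above.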

            • #7
              What is the probability that the second best hit was NOT found due to heuristics?

              Since all these algorithms use heuristics, there is a good chance that some hits will be missed. When I evaluated BWA, the mapping quality of some sequences changed depending on the sensitivity setting I was using. Worse, in some cases the mapping quality was higher with a combination of parameters that was supposed to be more sensitive. This was with one of the first versions of BWA, so I don't know whether it was a bug. It also affected only a few reads, which is part of the error rate mentioned in:

              "Simulation reveals that BWA may overestimate mapping quality due
              to this modification, but the deviation is relatively small. For example, BWA
              wrongly aligns 11 reads out of 1,569,108 simulated 70bp reads mapped with
              mapping quality 60." BWA paper

              My question is the following: since the reference genome is "static", would it be possible to fix this small error by tracing back which regions it generally arises in? I'm guessing it comes from sequences in repeat regions that are prone to errors. That can be tricky, because it will bias the mapping toward finding certain types of regions more than others.
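
For scale, the figures in the quote from the BWA paper can be converted into an empirical Phred-scaled quality. This is plain arithmetic on the quoted numbers, using the standard definition Q = -10 log10(P_err), and says nothing about where the errors occur:

```latex
P_{\text{err}} = \frac{11}{1\,569\,108} \approx 7.0 \times 10^{-6},
\qquad
Q_{\text{empirical}} = -10 \log_{10} P_{\text{err}} \approx 51.5
```

Reads reported at mapping quality 60 (a nominal error rate of 10^-6) therefore behaved like roughly Q51 in that simulation, which is the overestimate the quote describes.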
