
  • Mapping quality value

    Hello everyone,

    I would like to know how important the mapping quality value is when aligning next-generation sequencing data with software such as bowtie, bwa or bfast. For example, if reads have a mapping quality of 60 or 50, should we discard them or keep them? How should we decide the mapping quality cutoff above which reads are kept?

    As far as I know mapping quality indicates the confidence that a read mapped to a particular position in the genome is correct.

    Please give your inputs.

    Thanks,
    Neha

  • #2
    Mapping quality is on a log scale, like Phred base quality scores. So a mapping quality score of 50 corresponds to an expected error of 1 in 100,000, or a mapping accuracy of 99.999%.

    I normally set my minimum mapping quality cutoff at 10, which corresponds to an expected error of 1 in 10, or a mapping accuracy of 90% (I use LifeScope 2.5.1 for our ABI data, but the principle is the same for any aligner that reports mapping quality as the log of the error probability).
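As a rough illustration of the log scale described above, here is a minimal Python sketch of the Phred-style conversion, Q = -10 * log10(P), where P is the probability that the mapping position is wrong. The function names are made up for this example and do not come from any aligner.

```python
import math


def mapq_to_error_probability(mapq: float) -> float:
    """Convert a Phred-scaled mapping quality to the probability the mapping is wrong."""
    return 10 ** (-mapq / 10)


def error_probability_to_mapq(p_error: float) -> float:
    """Convert an error probability (0 < p_error <= 1) back to a Phred-scaled quality."""
    return -10 * math.log10(p_error)


if __name__ == "__main__":
    # Print the expected error and accuracy for a few commonly discussed cutoffs.
    for q in (10, 20, 50, 60):
        p = mapq_to_error_probability(q)
        print(f"MAPQ {q}: expected error {p:.6g}, accuracy {(1 - p) * 100:.6g}%")
```

This only holds for aligners that really report MAPQ on this scale, so treat the numbers as nominal.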

    For bowtie2, just have a look at the manual for guidance.
    Michael Black, Ph.D.
    ScitoVation LLC. RTP, N.C.



    • #3
      Thank you so much mbblack for explaining it in such a nice way.



      • #4
        Thank you for your explanation, it is very helpful. I am using BWA for mapping, and when testing parameters I tend to hit a plateau in the number of reads aligned. Why is this? Is it due to mapping quality? Which parameter can I set to lower the mapping quality threshold so that more reads align to my reference? Thank you for any help you can provide.



        • #5
          First off, please note that the mapping score is defined as "Phred-scaled" by the SAM specification, but in practice that is impossible to calculate exactly, and the scores vary widely between aligners because each uses different heuristics to estimate them. For example, bowtie1 by default prints quality 255 for mapped reads and quality 0 for unmapped reads. To rigorously determine the best quality cutoff, you should look at ROC curves for that aligner and pick the point with an acceptable ratio of true positives to false positives for your application.
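To make the ROC idea concrete, here is a minimal sketch, assuming you have simulated reads whose true positions are known, so each alignment can be labelled correct or incorrect. The `(mapq, is_correct)` input format and the function name are assumptions for this example, not the output of any particular tool.

```python
def roc_points(alignments, cutoffs):
    """For each MAPQ cutoff, count correct (TP) and incorrect (FP) alignments kept.

    alignments: list of (mapq, is_correct) pairs from reads with known origin.
    cutoffs:    iterable of minimum MAPQ values to evaluate.
    Returns a list of (cutoff, true_positives, false_positives) tuples.
    """
    points = []
    for cutoff in cutoffs:
        # Keep only alignments at or above the cutoff.
        kept = [is_correct for mapq, is_correct in alignments if mapq >= cutoff]
        true_positives = sum(1 for ok in kept if ok)
        false_positives = len(kept) - true_positives
        points.append((cutoff, true_positives, false_positives))
    return points


if __name__ == "__main__":
    # Hypothetical toy data: (MAPQ, mapped to the correct position?)
    toy = [(0, False), (1, True), (3, False), (10, True), (20, True), (40, True)]
    for cutoff, tp, fp in roc_points(toy, cutoffs=[0, 5, 10, 30]):
        print(f"cutoff {cutoff}: {tp} correct, {fp} incorrect kept")
```

Plotting true positives against false positives over the cutoffs gives the curve. The cutoff to use is the one where the trade-off is acceptable for your application.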

          BWA's run time increases exponentially with higher sensitivity, and raising the sensitivity does not map many more reads than the defaults do (the plateau, as you noted). If you want something with higher sensitivity, please try BBMap. The sensitivity is configurable with the parameters "maxindel" and "minid", which set the maximum length of insertion or deletion allowed (default 16000) and the minimum percent identity of aligned reads (default 76, minimum 50). It's very fast. If you want something with higher sensitivity than BBMap, you'll have to go with BLAST, which is too slow to use on 3rd-gen libraries.

          Please ask if you have any trouble using it. Basically you just unzip it then run:
          (to index)
          bbmap.sh ref=reference.fasta
          (to map)
          bbmap.sh in=reads.fq out=mapped.sam

          It requires Java.
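Once you have a SAM file such as mapped.sam, a cutoff can be applied afterwards. Below is a minimal sketch in plain Python, assuming the standard SAM layout (header lines start with "@", FLAG is column 2, MAPQ is column 5). The function name and the default cutoff of 10 are only illustrative.

```python
def filter_sam_lines(lines, min_mapq=10):
    """Yield header lines plus mapped alignments whose MAPQ is at least min_mapq."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: mapping quality
        if flag & 0x4:
            # Bit 0x4 marks an unmapped read; skip it.
            continue
        if mapq >= min_mapq:
            yield line


if __name__ == "__main__":
    import sys

    # Usage (file names are examples): python filter_mapq.py < mapped.sam > filtered.sam
    for kept in filter_sam_lines(sys.stdin, min_mapq=10):
        sys.stdout.write(kept)
```

If samtools is installed, `samtools view -q 10` applies the same kind of MAPQ threshold. Keep in mind that the SAM specification reserves 255 to mean the mapping quality is not available, so a plain numeric cutoff will keep those reads.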
          Last edited by Brian Bushnell; 03-03-2014, 04:14 PM.

