  • How to calculate FPR from dwgsim_eval output?

    Hello,

    I am using dwgsim_eval to evaluate different mappers' performance on my simulated reads. I want to plot a ROC curve of the results, so I need to calculate the True Positive Rate (TPR) and False Positive Rate (FPR) for each mapping quality threshold. I guess TPR should be column 17 ("# (mc' / (mc' + mi' + mu')) | sensitivity: the fraction of reads that should be mapped that are mapped correctly at or greater than the threshold"). How should I calculate FPR, then? Thanks!

    BTW:
    I am also confused by the following three definitions:
    # mu | the number of reads unmapped that should be mapped be mapped at the threshold
    # um | the number of reads mapped that should be unmapped be mapped at the threshold
    # uu | the number of reads unmapped that should be unmapped be mapped at the threshold
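
One way to read the three definitions above is as cells of a small decision table over two questions: should the read map at all, and did the mapper place it? The sketch below is only an illustration of that reading. The function name `classify_read` and its boolean arguments are hypothetical and are not part of dwgsim_eval.

```python
def classify_read(should_map: bool, is_mapped: bool, is_correct: bool = False) -> str:
    """Return a dwgsim_eval-style category label for one simulated read.

    Hypothetical helper: the labels follow the column names quoted above,
    but the decision logic is an interpretation, not dwgsim_eval's code.
    """
    if should_map:
        if not is_mapped:
            return "mu"  # unmapped, but should have been mapped
        # mapped: either at the simulated position or somewhere else
        return "mc" if is_correct else "mi"
    # reads that should not map (e.g. random reads)
    return "um" if is_mapped else "uu"


if __name__ == "__main__":
    print(classify_read(True, False))    # a read from the genome left unmapped
    print(classify_read(False, True))    # a random read that got mapped
    print(classify_read(False, False))   # a random read left unmapped
```

Under this reading, "mu" is a missed read, "um" is a spurious mapping, and "uu" is a correctly rejected read.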

  • #2
    You can find your answer here: http://en.wikipedia.org/wiki/Sensiti...nd_specificity



    • #3
      Originally posted by RockChalkJayhawk
      Thanks RockChalkJayhawk! I've checked the definition of FPR (= FP / (FP + TN)), but my problem is that I am not sure which terms in the dwgsim_eval output correspond to FP and TN.

      The columns of the dwgsim_eval output are listed as follows:

      # thr | the minimum mapping quality threshold
      # mc | the number of reads mapped correctly that should be mapped at the threshold
      # mi | the number of reads mapped incorrectly that should be mapped be mapped at the threshold
      # mu | the number of reads unmapped that should be mapped be mapped at the threshold
      # um | the number of reads mapped that should be unmapped be mapped at the threshold
      # uu | the number of reads unmapped that should be unmapped be mapped at the threshold
      # mc' + mi' + mu' + um' + uu' | the total number of reads mapped at the threshold
      # mc' | the number of reads mapped correctly that should be mapped at or greater than that threshold
      # mi' | the number of reads mapped incorrectly that should be mapped be mapped at or greater than that threshold
      # mu' | the number of reads unmapped that should be mapped be mapped at or greater than that threshold
      # um' | the number of reads mapped that should be unmapped be mapped at or greater than that threshold
      # uu' | the number of reads unmapped that should be unmapped be mapped at or greater than that threshold
      # mc' + mi' + mu' + um' + uu' | the total number of reads mapped at or greater than the threshold
      # (mc / (mc' + mi' + mu')) | sensitivity: the fraction of reads that should be mapped that are mapped correctly at the threshold
      # (mc / mc' + mi') | positive predictive value: the fraction of mapped reads that are mapped correctly at the threshold
      # (um / (um' + uu')) | false discovery rate: the fraction of random reads that are mapped at the threshold
      # (mc' / (mc' + mi' + mu')) | sensitivity: the fraction of reads that should be mapped that are mapped correctly at or greater than the threshold
      # (mc' / mc' + mi') | positive predictive value: the fraction of mapped reads that are mapped correctly at or greater than the threshold
      # (um' / (um' + uu')) | false discovery rate: the fraction of random reads that are mapped at or greater than the threshold
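
If reads that should be unmapped are treated as the negatives, then um' plays the role of FP and uu' the role of TN, so FP / (FP + TN) becomes um' / (um' + uu'). This is one possible mapping, not something the tool's header states as FPR. The sketch below assumes the data lines are whitespace-separated and follow the column order listed above (19 columns, with the cumulative sensitivity in column 17). The names `COLUMNS`, `parse_eval_lines`, and `random_read_fpr` are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumed column order, taken from the header list quoted in this post.
# The "_cum" suffix stands for the primed (at-or-greater-than-threshold) columns.
COLUMNS = [
    "thr", "mc", "mi", "mu", "um", "uu", "total",
    "mc_cum", "mi_cum", "mu_cum", "um_cum", "uu_cum", "total_cum",
    "sens", "ppv", "fdr",
    "sens_cum", "ppv_cum", "fdr_cum",
]


def parse_eval_lines(lines: Iterable[str]) -> List[Dict[str, float]]:
    """Parse data lines into dicts, skipping blank lines and '#' header lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        rows.append(dict(zip(COLUMNS, map(float, fields))))
    return rows


def random_read_fpr(row: Dict[str, float]) -> float:
    """FP / (FP + TN), with FP = um' and TN = uu' (an interpretation)."""
    fp, tn = row["um_cum"], row["uu_cum"]
    denom = fp + tn
    # With no random reads in the simulation the rate is undefined; report 0.0.
    return fp / denom if denom else 0.0
```

Note that this rate is only meaningful if the simulation includes reads that should not map; otherwise um' + uu' is zero for every threshold.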



      • #4
        See the discussion here:


        What is a false positive here, though? A read that should be mapped can be "wrong" in two ways:
        #1 mapped to the wrong position
        #2 not mapped

        #2 would seem to be a FP, but #1 doesn't fit into the FP/TP/FN/TN scheme. Hence use positive predictive value (PPV): a read that does not map only hurts sensitivity, so PPV only has to consider the reads that do map. Plot a ROC-style curve of sensitivity vs. PPV.
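
A minimal sketch of the sensitivity vs. PPV idea, assuming you already have the cumulative counts mc', mi', and mu' for each mapping quality threshold. The function name `sens_ppv_points` and the input layout are hypothetical. The formulas are mc' / (mc' + mi' + mu') for sensitivity and mc' / (mc' + mi') for PPV.

```python
from typing import Iterable, List, Tuple


def sens_ppv_points(
    counts: Iterable[Tuple[float, float, float, float]]
) -> List[Tuple[float, float, float]]:
    """Turn (thr, mc', mi', mu') tuples into (thr, sensitivity, PPV) points.

    Hypothetical helper; zero denominators yield 0.0 rather than an error.
    """
    points = []
    for thr, mc, mi, mu in counts:
        should_map = mc + mi + mu   # every read that should have been mapped
        mapped = mc + mi            # reads that were actually mapped
        sens = mc / should_map if should_map else 0.0
        ppv = mc / mapped if mapped else 0.0
        points.append((thr, sens, ppv))
    return points


if __name__ == "__main__":
    # Made-up counts purely for illustration, not real mapper output.
    for thr, sens, ppv in sens_ppv_points([(0, 90, 10, 0), (30, 60, 0, 40)]):
        print(thr, sens, ppv)
```

The resulting points can be handed to any plotting tool, one curve per mapper, with one point per threshold.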
