Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Cuffdiff producing q-values > 1

    Hello at all,

    I use TopHat and Cufflinks to process some human RNA-seq data with an annotation gft-file for known genes (this means my pipeline is: TopHat -> Cuffdiff).

    In the resulting gene_expr.diff file I obtain q-values greater than 1:

    Code:
    > range(expr$q_value)
    [1] 0.00000 1.08198
    Since I thought that a q-value is a probability (fdr corrected p-value) it should not be greater than 1.

    Its clear to me that when the p-value is corrected there can be q-values greater than 1 but I thought its limited to 1 in this cases. I am confused about the values, 1.08 is a very small value assuming the q-value is not a probability anymore (then I would expect higher values) but its to big for a rounding error I think.

    Maybe I got there something wrong. Could someone explain it to me?

    Thanks in advance,
    Oliver

  • #2
    I got the same above 1 q value, would also want to know how q value is generated

    Comment


    • #3
      fdr

      Benjamini, Y. & Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B

      The equation does the correction and it is up to the user to take appropriate values. ie >0.05

      there may be a more resent paper too but this is the original.

      this paper should clear it up

      Comment


      • #4
        Hello severin, thanks for the paper. I looked at it, but they always correct the cutoff for the p-value and not the p-value itself. They don't even mention the q-value. So the paper may be useful for the background but I can't see that it explains my inital question.

        Comment


        • #5
          R script for fdr

          Here is an R script for false discovery rate based on that paper.

          #FALSE DISCOVERY RATE
          bh.fdr=function(p){
          #
          #This function computes q-values using Benjamini and Hochberg's (1995)
          #approach for controlling FDR. Courtesy of Dan Nettleton
          #
          m = length(p)
          k = 1:m
          ord = order(p)
          p[ord] = (p[ord] * m)/k
          qval = p
          qval[ord]=rev(cummin(rev(qval[ord])))
          return(qval)
          }

          Comment


          • #6
            Thank you severin!
            Now I can generate q value based on the cufflinks pvalue output, I think there is a bug in cufflinks when generating q values, possibly that they miss the line qval[ord]=rev(cummin(rev(qval[ord]))), which results in a >1 q value.

            Comment


            • #7
              Thanks for that script, severin.

              I have a question concerning the use of the cummin function. In this case it ensures that the q-value is not getting bigger than 1 (thats nice) but also that if once a q-value is reached, the following values can't get bigger but would be set to the lowest value known to this time point (actually this happens in reversed order). This means that some q-values are not the appropriate q-value for the according p-value. And it means also that they are ascending. At first glance this seem not to be legal. So for example if I have q-values (in p-value order):
              Code:
              0.01, 0.02, 0.06, 0.04, 0.07
              this procedure would result in
              Code:
              0.01, 0.02, 0.04, 0.04, 0.07
              .

              Having a cutoff of 0.05, in line 1, the third would not be accepted but the fourth. In line 2 the third is accepted (but due to correction it wouldn't be). This is what is confusing me: What happens to q-values that are again lower as one preceding value? Why is it okay to accept them anyway? I looked at the paper, but its not clear to me.

              Comment


              • #8
                hope this help ->
                Until mid-1990’s, control of the error rate under multiple testing was done, in general, using family-wise error rate (FWER). An example of this kind of correction is the Bonferroni correctio…

                Comment

                Latest Articles

                Collapse

                • seqadmin
                  Current Approaches to Protein Sequencing
                  by seqadmin


                  Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                  04-04-2024, 04:25 PM
                • seqadmin
                  Strategies for Sequencing Challenging Samples
                  by seqadmin


                  Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                  03-22-2024, 06:39 AM

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by seqadmin, 04-11-2024, 12:08 PM
                0 responses
                25 views
                0 likes
                Last Post seqadmin  
                Started by seqadmin, 04-10-2024, 10:19 PM
                0 responses
                28 views
                0 likes
                Last Post seqadmin  
                Started by seqadmin, 04-10-2024, 09:21 AM
                0 responses
                24 views
                0 likes
                Last Post seqadmin  
                Started by seqadmin, 04-04-2024, 09:00 AM
                0 responses
                52 views
                0 likes
                Last Post seqadmin  
                Working...
                X