  • #16
    Originally posted by sindrle
    You run DESeq2, you pick out 10 genes you want to look at including p values.
    Say 6 genes have p < 0.05.
    You then use p.adjust in R.
    What FDR do you choose and why?
    Which n do you set?
    If you pick the ten genes a priori, i.e., in a manner that is independent of the outcome, then you can run p.adjust only on the p values from these 10 genes.

    By a choice "a priori", I mean that you knew before doing the analysis that these genes are worth looking at and others are not. If, however, you have chosen these ten genes precisely because their expression data in this very experiment looked so interesting that you want them to be in your result list, then you need to run p.adjust on all genes.

    In the former case, you only wanted to look at these genes, so your test only has to reject the null hypothesis that precisely these genes show a signal that looks interesting but arose only by chance. In the latter case, you have to reject the null hypothesis that somewhere in your data, with its many genes, some of which will show strong signals merely due to chance fluctuations, there are ten genes that look extreme enough to appear interesting. This is much more likely to happen when it may be any 10 genes rather than a fixed set of 10 genes, so the signal has to be stronger to convince us that it is not mere chance. Hence the more stringent multiple-testing adjustment.
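
The two cases can be sketched as follows. This is only an illustration on simulated p values, not real DESeq2 output; the gene count, the seed, and the names `chosen_a_priori` and `chosen_post_hoc` are assumptions made for the example.

```r
set.seed(1)

# Simulated stand-in for the per-gene p values of a DESeq2 run:
# 10,000 genes, none of them truly differentially expressed.
n_genes <- 10000
pvals <- runif(n_genes)

# Case 1: ten genes fixed before looking at the data.
chosen_a_priori <- 1:10
padj_a_priori <- p.adjust(pvals[chosen_a_priori], method = "BH")

# Case 2: ten genes picked because they look best in this very data set.
chosen_post_hoc <- order(pvals)[1:10]

# Too lenient: adjusting only within the hand-picked set.
padj_wrong <- p.adjust(pvals[chosen_post_hoc], method = "BH")

# Appropriate: adjust across all genes, then look up the ten genes.
padj_right <- p.adjust(pvals, method = "BH")[chosen_post_hoc]

print(range(padj_wrong))
print(range(padj_right))
```

For the post hoc selection, the values adjusted across all genes can never be smaller than those adjusted within the ten picked genes, which is the "more stringent" adjustment described above.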

    Comment


    • #17
      Originally posted by sindrle
      Do you know how to do this in edgeR?
      See the "genefilter" package for some useful functions.
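
For readers who want the idea before reaching for a package, here is a rough base-R sketch of filtering before adjustment. It does not call genefilter or edgeR; the data are simulated, and `mean_count`, `min_mean` and the cutoff of 10 are hypothetical choices for illustration only.

```r
set.seed(42)

# Simulated stand-ins (not real edgeR output): a mean count and a p value
# per gene. In this toy data only well-covered genes carry any signal.
n_genes <- 5000
mean_count <- rexp(n_genes, rate = 1 / 50)
pvals <- runif(n_genes)
has_signal <- mean_count > 60 & (seq_len(n_genes) %% 5 == 0)
pvals[has_signal] <- rbeta(sum(has_signal), 0.1, 10)

# Filter on a statistic that does not use the group labels
# (here: overall mean count), then adjust only the genes that pass.
min_mean <- 10  # hypothetical cutoff
keep <- mean_count >= min_mean

padj_all <- p.adjust(pvals, method = "BH")
padj_filtered <- rep(NA_real_, n_genes)
padj_filtered[keep] <- p.adjust(pvals[keep], method = "BH")

cat("FDR < 0.1 without filtering:", sum(padj_all < 0.1), "\n")
cat("FDR < 0.1 with filtering:   ", sum(padj_filtered < 0.1, na.rm = TRUE), "\n")
```

Whether filtering gains anything depends on the data and on the cutoff, so the two counts should be compared rather than assumed.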

      Comment


      • #18
        Originally posted by rskr
        I don't think FDR is very important for RNA-seq.
        Sorry, I cannot let this stand as it is, as it might be misunderstood to mean that accounting for multiple hypothesis testing is optional in RNA-Seq data analysis. Of course, you always need to account for multiple hypothesis testing when you test many hypotheses (here: many genes).
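
A small simulation shows why the adjustment is not optional. It assumes 20,000 hypothetical genes with no true differential expression, so every p value is drawn uniformly from [0, 1]; the gene count and seed are arbitrary.

```r
set.seed(123)

# 20,000 hypothetical genes, none truly differentially expressed:
# under the null hypothesis, p values are uniform on [0, 1].
n_genes <- 20000
pvals <- runif(n_genes)

# Without adjustment, roughly 5% of the genes pass p < 0.05 by chance alone.
n_raw <- sum(pvals < 0.05)

# After Benjamini-Hochberg adjustment, few or none should remain.
n_bh <- sum(p.adjust(pvals, method = "BH") < 0.05)

cat("raw p < 0.05:      ", n_raw, "\n")
cat("BH-adjusted < 0.05:", n_bh, "\n")
```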

        On the flip side, you could get a fabulous FDR, by simply not sequencing very much.
        Um, no, you don't. Why should you?

        Comment


        • #19
          Originally posted by dpryan
          @rskr: Given that this is in the context of DESeq2 (I realize that the thread is titled with edgeR...), low-count genes are automatically dropped and power maximized (I have to admit that it's handy to not have to do this myself anymore). So, the low-coverage genes screwing the p-values critique doesn't apply.
          Low-coverage genes can still be significant, just not at the same rate as higher-coverage genes. It may be possible to filter out certain genes that have zero chance of being significant; however, the power to tell depends on the proportion as well as the coverage, so, as I said, FDR isn't so important.

          Comment


          • #20
            Originally posted by rskr
            [...], so as I said FDR isn't so important.
            So, what do you suggest to do instead?

            Comment


            • #21
              The p value is just a widely used joke. Its meaning is hard to grasp, and it implies assumptions that a lot of people don't know about.
              FDR is just a bigger joke. Most of the time, your best p value will simply be multiplied by the number of p values.
              So if you have 10 genes to test, giving you 10 p values, the best is multiplied by 10, the second best by 5, then by 3.3333, then by 2.5, then by 2, etc.

              Here is the code in R

              Code:
              # Benjamini-Hochberg adjusted p values for a vector of p values
              # that is already sorted in increasing order
              fdr = function(pval){
               size=length(pval)
               if(size<2) return(pval)
              
               # the p value of rank i is multiplied by size/i, so the best one is
               # multiplied by size and the worst one by size/size = 1
               scaled=pval*size/seq_len(size)
              
               # going from the worst p value to the best, no adjusted value may
               # exceed the one ranked after it; cap the result at 1
               return(pmin(1,rev(cummin(rev(scaled)))))
              }
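
The multipliers described in this post (number of tests divided by rank) can be checked against base R's `p.adjust`. The ten p values below are made up for illustration.

```r
# Ten hypothetical p values, sorted in increasing order.
pval <- c(0.001, 0.004, 0.010, 0.020, 0.030, 0.045, 0.200, 0.400, 0.700, 0.900)
n <- length(pval)

# Rank i is multiplied by n / i: 10, 5, 3.33, 2.5, 2, ... and finally 1.
multiplier <- n / seq_len(n)
scaled <- pval * multiplier

# Walk from the worst p value to the best so that no adjusted value
# exceeds the one ranked after it, and cap everything at 1.
by_hand <- pmin(1, rev(cummin(rev(scaled))))

# Base R's implementation, for comparison.
built_in <- p.adjust(pval, method = "BH")

print(round(multiplier, 2))
print(all.equal(by_hand, built_in))
```

Note that the worst p value is multiplied by n/n = 1, i.e. it is left unchanged.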

              Comment


              • #22
                Originally posted by Simon Anders
                So, what do you suggest to do instead?
                I don't know for sure. I first noticed the problem while doing some meta-analysis hypothesis testing on coverage, merging p-values of bases, with each base having a different statistical power. Using Fisher's meta-analysis procedure, it became obvious that the underpowered bases were dominating the test and that Fisher's test assumes all published results are adequately powered. It would be nice if that theory were also better. I came up with a heuristic involving weighted sums of -log(p-values) and information entropy as degrees of freedom, which has a certain appeal to it.
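
For reference, Fisher's method for combining p values is short enough to write out. The sketch below is not the entropy-based heuristic mentioned in this post, which is not described in enough detail to reproduce. As a swapped-in, established alternative it shows the weighted Stouffer (Liptak) method, where each test gets a weight. The example p values and the weights `w` are assumptions.

```r
# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution
# with 2k degrees of freedom when all k null hypotheses are true.
fisher_combine <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Weighted Stouffer (Liptak) method: convert each p value to a z-score and
# combine with weights, so that low-weight tests contribute less.
stouffer_combine <- function(p, w = rep(1, length(p))) {
  z <- qnorm(p, lower.tail = FALSE)
  pnorm(sum(w * z) / sqrt(sum(w^2)), lower.tail = FALSE)
}

# Hypothetical example: two tests with small p values plus eight tests
# with unremarkable p values.
p <- c(0.001, 0.002, rep(0.5, 8))
w <- c(10, 10, rep(1, 8))  # assumed weights, e.g. reflecting coverage

print(fisher_combine(p))
print(stouffer_combine(p))      # equal weights
print(stouffer_combine(p, w))   # first two tests weighted up
```

How to choose the weights is the open question; the values above are placeholders, not a recommendation.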

                Comment


                • #23
                  Originally posted by dpryan
                  See the "genefilter" package for some useful functions.
                  Interesting, thanks!
                  Another question, excuse my ignorance. Look at this code:

                  > FDR <- p.adjust(lrt$table$PValue, method="BH")
                  > sum(FDR < 0.05)

                  Is this the way to choose FDR < 0.1?

                  > FDR <- p.adjust(lrt$table$PValue, method="BH")
                  > sum(FDR < 0.1)

                  Comment
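
On the question in #23: the BH-adjusted values do not depend on the cutoff, so only the comparison changes. A minimal sketch, assuming a mock `lrt` object that imitates nothing but the `table$PValue` column used above (the simulated p values are not real edgeR output):

```r
set.seed(7)

# Mock object standing in for an edgeR likelihood ratio test result;
# only the table$PValue column is imitated here.
lrt <- list(table = data.frame(PValue = c(rbeta(200, 0.5, 20), runif(1800))))

# The adjustment is computed once and does not depend on the cutoff.
FDR <- p.adjust(lrt$table$PValue, method = "BH")

# Changing the cutoff only changes which genes are counted.
n_005 <- sum(FDR < 0.05)
n_010 <- sum(FDR < 0.1)

cat("FDR < 0.05:", n_005, "\n")
cat("FDR < 0.1: ", n_010, "\n")
```

The count at 0.1 can never be lower than the count at 0.05, since every gene passing the stricter cutoff also passes the looser one.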


                  • #24
                    Originally posted by swbarnes2
                    I feel that this is an appropriate contribution:

                    http://xkcd.com/882/
                    oh it's so appropriate

                    Comment
