Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • DESeq2: genes significant under more stringent model but NOT with standard model

    hello everybody,

    I noticed some inconsistency in the DESeq2 results when calling results() under the default and the alternative hypothesis option.
    Some genes are called signifcant above a certain threshold (e.g. LFC>0.5) but not significant when no threshold is applied (i.e. just differentially expressed no matter the strength).
    The issue is caused by independent filtering, which in my case filters out lowly-expressed genes under the default (i.e. sets their padj to NA) but does not so under the alternative hypothesis.

    Is this
    (a) just some negligible consequence of independent filtering and perhaps the tradeoff for having a greater number of DEG under the default
    (b) possibly a kind of bug and filtering should be the same for both
    or
    (c) am I doing/understanding something wrong?

    pls let me know if you require additional information
    btw: I like to add the stereotypical "I am a biologist and don't know much about statistics/informatics"-excuse
    bye



    Code:
    [COLOR="Blue"]> #without independent filtering
    > #standard method
    > res <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = FALSE)
    > summary(res, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 547, 4% 
    LFC < 0 (down)   : 896, 6.5% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> res <- na.omit(res)
    > res <- res[res[,6] < 0.01,]
    > #alternative hypothesis
    > resGA <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = FALSE, 
    +                  lfcThreshold=0.5, altHypothesis="greaterAbs")
    > summary(resGA, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 40, 0.29% 
    LFC < 0 (down)   : 9, 0.065% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> resGA <- na.omit(resGA)
    > resGA <- resGA[resGA[,6] < 0.01,]
    > #is resGA subset of res?
    > table(is.element(rownames(resGA), rownames(res)))[/COLOR]
    
    TRUE 
      49 
    [COLOR="blue"]> 
    > #with independent filtering
    > #standard method
    > res <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = TRUE)
    > summary(res, alpha=0.01)[/COLOR]
    
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 549, 4% 
    LFC < 0 (down)   : 911, 6.6% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 1382, 10% 
    (mean count < 15.2)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> res <- na.omit(res)
    > res <- res[res[,6] < 0.01,]
    > #alternative hypothesis
    > resGA <- results(dds, contrast = c("condition", "C", "A"), independentFiltering = TRUE, 
    +                  lfcThreshold=0.5, altHypothesis="greaterAbs")
    > summary(resGA, alpha=0.01)
    [/COLOR]
    out of 13821 with nonzero total read count
    adjusted p-value < 0.01
    LFC > 0 (up)     : 40, 0.29% 
    LFC < 0 (down)   : 9, 0.065% 
    outliers [1]     : 0, 0% 
    low counts [2]   : 0, 0% 
    (mean count < 0)
    [1] see 'cooksCutoff' argument of ?results
    [2] see 'independentFiltering' argument of ?results
    
    [COLOR="blue"]> resGA <- na.omit(resGA)
    > resGA <- resGA[resGA[,6] < 0.01,]
    > #is resGA subset of res?
    > table(is.element(rownames(resGA), rownames(res)))[/COLOR]
    
    FALSE  TRUE 
        2    47

  • #2
    hi,

    the answer is (a). In the first run, independent filtering at 15.2 increases the total # of DEG for an FDR cutoff of 10%. (Note that 0.1 is the default target FDR for independent filtering as specified by 'alpha' argument of results(). I find that 1% is a very low FDR cutoff, but if you want to use 1% you should change 'alpha'.) And the optimal filter for the second run is no filter. So you have found some genes which were filtered for having low counts in the first run, although they are significant even for a more stringent test.

    Comment


    • #3
      thank you for the explanation and especially for the hint to set 'alpha' in results()
      at least in my very case the inconsistency has disappeared now

      bye

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Current Approaches to Protein Sequencing
        by seqadmin


        Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
        04-04-2024, 04:25 PM
      • seqadmin
        Strategies for Sequencing Challenging Samples
        by seqadmin


        Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
        03-22-2024, 06:39 AM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 04-11-2024, 12:08 PM
      0 responses
      31 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 10:19 PM
      0 responses
      33 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 09:21 AM
      0 responses
      28 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-04-2024, 09:00 AM
      0 responses
      53 views
      0 likes
      Last Post seqadmin  
      Working...
      X