  • DESeq Variance Stabilizing Transformation

    Hello,

    I am looking for some feedback regarding the use of the variance-stabilization (VST) methods found in the DESeq2 package. Hopefully one of the authors will respond and the comments will be of help to others.

    For me, the purpose of applying this transformation is to generate moderated fold changes for clustering genes (not samples, as in the vignette).

    My data consists of a time series, where for each time point there is a "treated" sample and a "control" sample. Each sample (timepoint) consists of 4 biological replicates.

    I performed the VST on the entire data set and plotted the per-gene standard deviation against the rank of the mean*, for the shifted logarithm log2(n + 1) (left) and the variance stabilizing transformation (right). The VST does not appear to have a pronounced effect.



    However, if I set up a count data set that consists only of the samples from one timepoint (the first timepoint in the example below), perform the VST, and plot the standard deviation against the rank of the mean, the transformed values have a much better stabilized standard deviation.



    So my questions are: Is there any way to obtain better variance-stabilized data when considering the entire time series? Should I just perform the VST on a per-timepoint basis? After all, I will only be computing fold changes between treatment and control samples at the same timepoint.

    *The procedure was performed as per the DESeq2 manual:

    library(DESeq2)  # also makes counts() and assay() available

    dds <- estimateSizeFactors(dds)
    dds <- estimateDispersions(dds)
    vsd <- varianceStabilizingTransformation(dds)
    par(mfrow=c(1,2))
    # rowVars() returns the per-gene variance; take sqrt() of it for the standard deviation
    plot(rank(rowMeans(counts(dds))), genefilter::rowVars(log2(counts(dds)+1)), main="log2(x+1) transform")
    plot(rank(rowMeans(assay(vsd))), genefilter::rowVars(assay(vsd)), main="VST")
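
    The per-timepoint variant described above could be sketched roughly as follows. This is only an illustration on simulated data from makeExampleDESeqDataSet(), not the actual data set; the column name `timepoint` and its labels are hypothetical, and the `condition` levels "A" and "B" are simply what the simulation helper produces. Whether transforming each timepoint separately is preferable to transforming the whole series is exactly the open question here.

    ```r
    library(DESeq2)

    # Simulated stand-in for the real experiment: 1000 genes, 16 samples,
    # with a two-level 'condition' factor (A = first 8 samples, B = last 8).
    set.seed(1)
    dds <- makeExampleDESeqDataSet(n = 1000, m = 16)

    # Hypothetical timepoint labels (assumption, not from the original data).
    dds$timepoint <- factor(rep(c("t1", "t2"), times = 8))

    # Keep only the samples of the first timepoint.
    dds_t1 <- dds[, dds$timepoint == "t1"]

    # Re-estimate size factors on the subset, then transform it.
    dds_t1 <- estimateSizeFactors(dds_t1)
    vsd_t1 <- varianceStabilizingTransformation(dds_t1)

    # Difference of group means on the transformed (log2-like) scale,
    # usable as a moderated fold change for clustering genes.
    mat <- assay(vsd_t1)
    lfc_t1 <- rowMeans(mat[, vsd_t1$condition == "B"]) -
      rowMeans(mat[, vsd_t1$condition == "A"])
    ```

    The same steps would be repeated for each timepoint, giving one column of fold changes per timepoint.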

  • #2
    As far as I know, you have to tell DESeq to treat all expression values as if they came from a single condition by specifying method="blind" when estimating the dispersions.
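
    In DESeq2 (the package used in the question), the corresponding switch is the `blind` argument of varianceStabilizingTransformation() rather than method="blind". A minimal sketch on simulated data, assuming the design stored in the object is the one you would otherwise want to use:

    ```r
    library(DESeq2)

    # Simulated example data: 1000 genes, 8 samples, design ~ condition.
    set.seed(1)
    dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
    dds <- estimateSizeFactors(dds)

    # blind = TRUE: the design is ignored and dispersions are re-estimated
    # as if all samples were replicates of a single condition.
    vsd_blind <- varianceStabilizingTransformation(dds, blind = TRUE)

    # blind = FALSE: the dispersions estimated under the stored design are used.
    dds <- estimateDispersions(dds)
    vsd_design <- varianceStabilizingTransformation(dds, blind = FALSE)
    ```

    Both calls return an object of the same dimensions; only the dispersion trend behind the transformation differs.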



    • #3
      I have a slightly unrelated question about the plot.
      Why is the variance low for low means? Shouldn't it start high and decrease as the mean increases?
      I have a similar data set, and even if I filter for higher cpm the trend persists.
      Does anyone know why this is the case?



      • #4
        DESeq2 variance

        I guess it all depends on the type of data. For my NGS bacterial 16S rRNA data, the SD increases as the mean increases.


        • #5
          hi John,

          The VST helps to stabilize the variance over the mean, insofar as this can be captured by the parametric curve of dispersion over mean. You might also try the rlog transformation, which sometimes performs qualitatively better than the VST (for example, if the size factors vary a lot across samples).
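
          A rough way to compare the two transformations, sketched here on simulated data from makeExampleDESeqDataSet() (so the size factors will not vary much; with real data the range printed below is the thing to look at):

          ```r
          library(DESeq2)

          # Simulated example data: 1000 genes, 8 samples.
          set.seed(1)
          dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
          dds <- estimateSizeFactors(dds)

          # A wide range of size factors is the situation in which rlog
          # may behave better than the VST.
          print(range(sizeFactors(dds)))

          # Both transformations, ignoring the design (blind = TRUE).
          rld <- rlog(dds, blind = TRUE)
          vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

          # Per-gene standard deviation against the rank of the mean.
          par(mfrow = c(1, 2))
          plot(rank(rowMeans(assay(vsd))), apply(assay(vsd), 1, sd),
               xlab = "rank(mean)", ylab = "sd", main = "VST")
          plot(rank(rowMeans(assay(rld))), apply(assay(rld), 1, sd),
               xlab = "rank(mean)", ylab = "sd", main = "rlog")
          ```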



          • #6
            Hi guys,
            Is the VST functionality of DESeq still working? Most of the VST functions, including getVarianceStabilizedData(), do not seem to work in R version 3.0.1. Please help.



            • #7
              hi Ayana,

              Can you post the code that you think is not working? Please include the full code, the R output and the output of sessionInfo().

              The VST and rlog are both implemented in DESeq2, which we suggest you use over DESeq.



              • #8
                Originally posted by moritzhess
                As far as I know, you have to tell DESeq to treat all expression values as if they came from a single condition by specifying method="blind" when estimating the dispersions.
                Yes. And depending on the data, there may not always be a variance stabilising transformation. In particular, the error model on which the transformation is based assumes that for most genes the variance is dominated by technical noise and natural biological variation between replicates, and that the effects of true differential expression affect only a minority of genes. If that is not the case, then the whole concept does not really work.

                As Mike Love says, the variance stabilising transformation tends to be misled when the size factors vary strongly between samples, and (at least) in these cases the rlog transformation is preferable.
                Last edited by Wolfgang Huber; 04-30-2014, 11:16 AM.
                Wolfgang Huber
                EMBL



                • #9
                  @Him26: Note that in John's plots the y-axis is on a log scale.
                  If you make the same kind of plot with the sd computed on the original scale of the counts, then you would indeed expect it to increase with the mean.
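
                  A small base-R simulation can illustrate the raw-scale behaviour. This is only a sketch: the counts are drawn from a negative binomial with an assumed dispersion of 0.1 (size = 10) and made-up means, not taken from anyone's data.

                  ```r
                  set.seed(42)
                  n_genes <- 2000
                  n_samples <- 8

                  # Assumed true per-gene means, spread over several orders of magnitude.
                  mu <- 2^runif(n_genes, min = 0, max = 12)

                  # Negative binomial counts; row i has mean mu[i] in every sample.
                  counts <- matrix(
                    rnbinom(n_genes * n_samples, mu = rep(mu, n_samples), size = 10),
                    nrow = n_genes
                  )

                  # Per-gene sd on the original count scale and after log2(n + 1).
                  sd_raw <- apply(counts, 1, sd)
                  sd_log <- apply(log2(counts + 1), 1, sd)

                  # Rank correlation between mean and raw-scale sd.
                  rho_raw <- cor(rowMeans(counts), sd_raw, method = "spearman")
                  print(rho_raw)

                  par(mfrow = c(1, 2))
                  plot(rank(rowMeans(counts)), sd_raw, log = "y",
                       xlab = "rank(mean)", ylab = "sd", main = "original scale")
                  plot(rank(rowMeans(counts)), sd_log,
                       xlab = "rank(mean)", ylab = "sd", main = "log2(n + 1)")
                  ```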
                  Wolfgang Huber
                  EMBL
