  • Variance Estimation

    I have been using DESeq to analyze gene expression from SAGE samples. To decide how to compare samples, we have been using ECDF (empirical cumulative distribution function) plots to assess sample quality. I was wondering whether I could turn these plots into a single quantitative number by taking the integral of the ECDF. I haven't yet found a way to do this in DESeq. Is there a better program for this analysis?
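
For readers wondering what "the integral of the ECDF" would look like numerically, here is a minimal sketch in plain Python. It is not a DESeq feature; the function names and the example values are hypothetical, and it assumes the plotted values all lie in a known interval such as [0, 1].

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        return bisect_right(xs, x) / n

    return F


def ecdf_integral(sample, lower, upper):
    """Exact area under the ECDF step function on [lower, upper].

    Assumes every sample value lies inside [lower, upper].
    """
    xs = sorted(sample)
    if xs[0] < lower or xs[-1] > upper:
        raise ValueError("sample values must lie within [lower, upper]")
    # Each value x adds a step of height 1/n that extends from x to upper.
    return sum(upper - x for x in xs) / len(xs)


# Hypothetical values in [0, 1], for illustration only.
values = [0.1, 0.4, 0.4, 0.9]
area = ecdf_integral(values, 0.0, 1.0)
print(f"area under ECDF: {area:.3f}")
```

On [0, 1] this area equals one minus the sample mean, so a sample that is spread evenly over the interval gives a value near 0.5.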

  • #2
    A few weeks ago, we completely rewrote the DESeq vignette (manual). One of our changes was to remove everything about the ECDF plot of the variance residuals, as people kept misunderstanding its purpose (which was maybe never that clear anyway). It is not meant for checking the quality of samples.

    The point of the variance residual ECDF plots was to check whether the assumption holds well that genes of similar expression strength have similar variance, because the old DESeq version did not deal well with "variance outliers", i.e., genes with variance much larger than that of similar genes. See the new vignette to learn how we now simply take the maximum of the fitted value and the per-gene estimate to avoid making an error here.

    To judge the reproducibility of a protocol, i.e., the similarity of replicate samples, I now recommend the following two possibilities:

    (i) use the new 'estimateDispersions' function that now, by default, no longer does a local fit but a parametric fit, fitting a curve alpha = alpha_0 + alpha_1/mu on the dispersion alpha, or equivalently, a curve v = ( 1 + alpha_1 ) * mu + alpha_0 * mu^2 on the variance v. The value alpha_0 is a good measure of the overall (intensity-independent) variation between replicates, the value alpha_1 is a measure of the additional variance for weak genes. See vignette for details.

    (ii) use the variance-stabilizing transformation to make a sample-clustering heatmap, as described in the vignette, to see whether your replicates are more similar than samples from different treatment groups.
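
The equivalence of the two curves in (i) follows from the negative binomial relation v = mu + alpha * mu^2. A minimal sketch in plain Python that checks this numerically is given here; the function names and the coefficient values are hypothetical and are not taken from DESeq or from any real data set.

```python
import math


def dispersion_curve(mu, alpha0, alpha1):
    """Parametric dispersion curve: alpha = alpha_0 + alpha_1 / mu."""
    return alpha0 + alpha1 / mu


def variance_from_dispersion(mu, alpha):
    """Negative binomial variance for mean mu and dispersion alpha."""
    return mu + alpha * mu ** 2


def variance_curve(mu, alpha0, alpha1):
    """Equivalent variance curve: v = (1 + alpha_1) * mu + alpha_0 * mu^2."""
    return (1 + alpha1) * mu + alpha0 * mu ** 2


def curves_agree(mu, alpha0, alpha1):
    """Check that both ways of computing the variance give the same value."""
    via_dispersion = variance_from_dispersion(mu, dispersion_curve(mu, alpha0, alpha1))
    return math.isclose(via_dispersion, variance_curve(mu, alpha0, alpha1))


# Hypothetical coefficients, for illustration only.
alpha0, alpha1 = 0.05, 2.0
for mu in (1.0, 10.0, 100.0, 1000.0):
    alpha = dispersion_curve(mu, alpha0, alpha1)
    print(f"mu={mu:7.1f}  alpha={alpha:.4f}  agree={curves_agree(mu, alpha0, alpha1)}")
```

For large mu the alpha_1 / mu term fades and the dispersion approaches alpha_0, which is why alpha_0 reflects the intensity-independent variation while alpha_1 mostly affects weakly expressed genes.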

    Note that the new DESeq is available in the devel branch of Bioconductor, not yet in the release branch.



    • #3
      Hello Simon,
      the "Package Downloads" links on the Bioconductor homepage (http://www.bioconductor.org/packages...tml/DESeq.html) are wrong. They still link to version 1.5.18 but should link to 1.5.19. I don't know whether you have any control over that.

      Best,
      Mark Onyango



      • #4
        Do I need to delete the older version of DESeq? If so, where do you think it would be?



        • #5
          Hello Simon,
          could you please elaborate on why you switched from the local fit to a parametric fit as a default setting? I always found your idea for a more data-driven fit very sound.

          @KellerMac:
          It depends on what operating system you are using. If you use Windows, you can safely install the development version alongside the release version, as it will create its own library folder. So the two do not interfere.
          If you are using Linux (e.g. Ubuntu), you simply download the development sources of R into a folder of your choosing and compile them there. R won't be installed system-wide and can be started from that folder. All downloaded packages will be kept in that folder as well.
          So, all in all, there is no need to delete the current version of DESeq from your PC.



          • #6
            Error: could not find function "estimateDispersion"

            I'm getting:

            Error: could not find function "estimateDispersion"

            What have I done wrong?

            I'm running R on OS X. I've had no trouble using DESeq before, just with this new function.

            As far as I can tell, my DESeq is up to date.



            • #7
              Oops, I think I am just struggling with how to update DESeq. I am still at DESeq 1.4 and the "update" window is not doing anything...



              • #8
                OK, last one: something seems to be wrong with the files linked on the Bioconductor site.


                Am I wrong?



                • #9
                  You also need to use the development version of R (2.14) to be able to install the latest DESeq.



                  • #10
                    devel version

                    Found the relevant thread about needing to install the development version of R as well... done. Things are working for now!



                    • #11
                      I am sorry to awaken this thread, but I seem to have a problem with the latest release version of DESeq (1.6.1):

                      Whenever I try to execute the estimateDispersions function I receive the following error:

                      Parametric dispersion fit failed. Try a local fit and/or a pooled estimation. (See '?estimateDispersions')

                      As far as I can tell, this can only happen if the coefficients (or at least some of them) become negative during the fitting process. Using the local fit more or less cures this, but I still see some negative dispersion coefficients. My question therefore is: how can the coefficients become negative during fitting, and how do I properly handle or interpret them?



                      • #12
                        The problem with the fit has little to do with the negative values, because DESeq "lifts" all negative dispersion values to something slightly above zero. Rather, our new parametric fit routine still has some weaknesses that we are not yet fully sure how to straighten out. This is why the package recommends reverting to the old method if the new one fails. In practice, the difference between the two methods turned out to be not that large, anyway.

                        To nevertheless explain the negative values: a random variable that is distributed according to a negative binomial with mean µ and dispersion a has variance v = µ + a µ². DESeq estimates a from the data with a method-of-moments estimator, i.e., it estimates µ and v and then calculates a = (v - µ) / µ². (I'm skipping over a few subtleties here, which are explained in the supplement to our paper.) Especially for low µ, it may happen that the estimate for v is smaller than that for µ, and then the estimate for the dispersion a becomes negative. We know that a should be positive, and hence we need to replace all negative values with small positive ones before the test. However, I prefer to do this only after the fit, as it introduces a positive bias.
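
The method-of-moments estimate described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: it ignores size factors and the other subtleties mentioned, the replicate counts are made up, and the floor value used for "lifting" is a hypothetical choice, not the value DESeq uses.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate: a = (v - mu) / mu^2.

    Uses the sample mean and the unbiased sample variance of the counts.
    """
    mu = mean(counts)
    v = variance(counts)
    return (v - mu) / mu ** 2


def lift(a, floor=1e-8):
    """Replace a non-positive estimate with a small positive value.

    The floor of 1e-8 is a hypothetical choice for this sketch.
    """
    return max(a, floor)


# Made-up replicate counts for two genes, for illustration only.
overdispersed = [2, 15, 7, 30]    # sample variance well above the mean
underdispersed = [9, 10, 11, 10]  # sample variance below the mean

for name, counts in (("overdispersed", overdispersed), ("underdispersed", underdispersed)):
    a = moment_dispersion(counts)
    print(f"{name}: raw a = {a:.4f}, lifted a = {lift(a):.4g}")
```

The second gene shows how a negative raw estimate arises whenever the sample variance falls below the sample mean, which happens easily by chance with few replicates and low counts.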



                        • #13
                          Hi..
                          I'm running DESeq (2.11) on R (2.25.2) on a Windows platform. I'm getting the same error as chrisbala:
                          Error: could not find function "estimateDispersion"
                          Do I need to update anything or what am I doing wrong here?
                          Thanks!



                          • #14
                            Please install current versions of R and Bioconductor and try again.

