
  • MDS-plot diagnostics

    MDSplot-edgeR.pdf

    So, this really does not look good...
    Any suggestions on what's going on here?

  • #2
    MDSplot-edgeR-no-outleiers.pdf

    And here it is without the outliers.

    It looks much better. The next step is to see whether they affect my results much.


    But what will decide whether I keep them or not?



    • #3
      That has got to be the most outlying outlier I've ever seen.

      Perhaps that sample is from the wrong organ (our core facility has swapped samples before...which became very very obvious when I made the equivalent graph). Alternatively, what's the size factor on that sample? One sample having vastly fewer reads can also cause this sort of thing.

      There are algorithms to detect outliers, but if a sample doesn't stick out like a sore thumb on a graph like this then it should be kept in.
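One quick way to check the sequencing-depth explanation is to compare library sizes across samples. A minimal sketch in base R, assuming a genes-by-samples count matrix; the `counts` matrix here is simulated, and the 0.5 cutoff is an arbitrary illustration rather than an edgeR or DESeq2 default:

```r
# Simulated genes-by-samples count matrix (hypothetical data, not from this thread).
set.seed(1)
counts <- matrix(rpois(6000, lambda = 50), nrow = 1000,
                 dimnames = list(NULL, paste0("S", 1:6)))
counts[, "S6"] <- rpois(1000, lambda = 5)  # make one library much shallower

# Library size = total reads assigned to each sample.
lib_size <- colSums(counts)

# Depth relative to the median library; values far below 1 deserve a closer look.
rel_depth <- lib_size / median(lib_size)

# Flag samples with less than half the median depth (arbitrary cutoff).
low_depth <- names(rel_depth)[rel_depth < 0.5]
print(round(rel_depth, 2))
print(low_depth)
```

With real data, `counts` would be the raw count table before any normalization. A flagged sample is not automatically an outlier to discard, only one worth inspecting.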



      • #4
        Haha!
        Here are some comparisons...

        With outlier: D253-B2
        Variation: 49.21% BCV
        -1 232
        0 22578
        1 905

        Without outlier: D253-B2
        Variation: 49.99% BCV
        -1 120
        0 23038
        1 557

        Conclusion: including the outlier gives far more hits (1137 vs. 677 non-zero calls), which is worrisome...

        Here is the same comparison for the world-record outlier, N270-B2:

        With outliers: N208-B1 & N270-B2
        Variation: 48.87% BCV
        -1 399
        0 22131
        1 1185

        Without outliers: N208-B1 & N270-B2
        Variation: 49.78% BCV
        -1 147
        0 22800
        1 768

        About the same result...
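To put numbers on the two comparisons, the hit counts can be totalled and compared directly. A small sketch in base R using only the figures quoted in this post; the row labels are made up for readability, and the -1 and 1 rows are assumed to be down and up calls:

```r
# Counts copied from the tables in this post: down (-1) and up (1) calls.
hits <- rbind(
  "D253-B2 kept"              = c(down = 232, up = 905),
  "D253-B2 removed"           = c(down = 120, up = 557),
  "N208-B1 + N270-B2 kept"    = c(down = 399, up = 1185),
  "N208-B1 + N270-B2 removed" = c(down = 147, up = 768)
)

# Total number of non-zero calls per analysis.
total <- rowSums(hits)

# How many times more hits each analysis reports when the samples are kept.
ratio_d253 <- total[["D253-B2 kept"]] / total[["D253-B2 removed"]]
ratio_n    <- total[["N208-B1 + N270-B2 kept"]] / total[["N208-B1 + N270-B2 removed"]]

print(total)
print(round(c(D253 = ratio_d253, N208_N270 = ratio_n), 2))
```

Both ratios come out near 1.7, so in each case keeping the samples inflates the hit list by a similar factor.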


        Should I exclude all three? Or run some more tests and maybe keep everyone except the insane one?



        • #5
          You might consider using the sva package on your dataset just to see if it detects an obvious background variable to control for. I'm not surprised that removing D253-B2 gives fewer hits; that one sample was driving a lot of the results (I've never been that happy with how edgeR deals with outliers, which is why I usually use DESeq2). So, I wouldn't be too worried by that.

          For the other two samples, I suspect that sva will tell you that their variation is due to a component that can be compensated for.



          • #6
            Thank you very much for your response, very good as always.

            Let me just hijack my own thread here. How do you set the "prior.df" option?

            d <- estimateGLMTagwiseDisp(d, design, prior.df = ???)



            • #7
              I would have to search for that in the Bioconductor list (it's come up a few times, but I don't recall the answer since I rarely use edgeR). There's actually a way to avoid the issue, which is to use glmQLFTest(), which doesn't require that you estimate the tagwise dispersion (it ends up calling routines in limma that estimate the prior df). I've never actually tried that, but it should produce more conservative results.



              • #8
                I have searched to death...

                And read every thread that pops up. But the answer differs between the old and new versions. Also, Gordon Smyth says: prior.df = G_0 * df.residual

                df.residual = number of libraries - number of GLM coefficients

                But what is G_0??
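The df.residual part of that formula can be read off the design matrix. A minimal sketch in base R, assuming a hypothetical two-group design with four libraries per group (the thread does not state the real design); G_0 is left out because the thread never defines it:

```r
# Hypothetical design: 8 libraries in 2 groups (not the poster's real layout).
group <- factor(rep(c("control", "treated"), each = 4))
design <- model.matrix(~ group)

# Residual df = number of libraries minus number of estimable coefficients.
n_libraries <- nrow(design)
n_coefficients <- qr(design)$rank  # equals ncol(design) when the design is full rank
df_residual <- n_libraries - n_coefficients
print(df_residual)
```

For this toy design the result is 6 (8 libraries minus 2 coefficients). Substituting the real `design` gives the value to multiply by whatever G_0 turns out to be.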

                In the old version, n.prior = your libraries, which would help with the calculation. But I cannot find this option anymore.

                I'm running edgeR, DESeq2 and Cuffdiff 2.1.1 (soon also DEXSeq).

                I don't want to exclude any of them, since each has its strengths and weaknesses. How come you only rely on DESeq2?



                • #9
                  It's consistently produced the most reliable results for my datasets. Cuffdiff can rarely handle my experimental designs, so it's not even in the running.



                  • #10
                    I'm trying glmQLFTest, thanks.

                    How did you find out why DESeq2 was the most consistent?



                    • #11
                      I don't really know why it's turned out to give a bit more reliable results, though it tends to deal with outlier values pretty well (it'll flag and ignore such genes by default, though sometimes you need to disable this). We've done enough qPCR validations of additional samples to give me some comfort in that. Of course that's for the datasets that I work on, YMMV!



                      • #12
                        glmQLFTest does not work...

                        Error:

                        "Error in quantile.default(zresid, prob = prob) :
                        missing values and NaN's not allowed if 'na.rm' is FALSE
                        In addition: Warning message:
                        In fitFDistRobustly(var, df1 = df, covariate = covariate, winsor.tail.p = winsor.tail.p) :
                        small x values have been offset away from zero"

                        I've tried removing NAs:

                        # d$counts[is.na(d$counts)] <- 0
                        # apply(d$counts,2,function(x) sum(is.na(x)))

                        Did not work..
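The traceback places the failure inside the F-distribution fit, not in the counts themselves, so zeroing NAs in `d$counts` may not touch the cause. A cheap check that is still worth running is to count NAs per sample and drop genes with no reads in any library. The sketch below uses base R on a hypothetical toy matrix; whether this filtering avoids the specific error above is an assumption the thread does not confirm:

```r
# Hypothetical count matrix with one NA and one all-zero gene (not the poster's data).
counts <- matrix(c(10, 0, 5, 3,
                   12, 0, NA, 4,
                   9,  0, 6, 2),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("S", 1:3)))

# Number of missing values in each sample.
na_per_sample <- colSums(is.na(counts))

# Genes with zero reads in every library carry no information for the fit.
all_zero <- rowSums(counts, na.rm = TRUE) == 0

# Keep genes that have no NAs and at least one read somewhere.
keep <- !all_zero & rowSums(is.na(counts)) == 0
filtered <- counts[keep, , drop = FALSE]

print(na_per_sample)
print(rownames(filtered))
```

In this toy case gene2 (all zeros) and gene3 (contains an NA) are dropped. On a real object, the same filter would be applied before dispersion estimation.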

                        I've read an answer here, but I don't have the development version...



                        *Will soon uninstall edgeR out of annoyance*
