  • Summarizing PCA in DESeq2

    I am interested in knowing the proportion of variance that my components describe in the Principal Component Analysis in DESeq2. I have successfully run rlogTransformation and varianceStabilizingTransformation, and used plotPCA to see the clustering of my samples. Now I am interested in the standard deviation, proportion of variance, and cumulative proportion of this PCA, similar to the summary you would get if you ran:

    > pca <- princomp(data, scores=TRUE, cor=TRUE)
    > summary(pca)

    Any suggestions for getting this information, or for converting the rld SummarizedExperiment into a regular data frame or matrix so that I can run princomp and summary as usual?

  • #2
    hi,

    In the vignette, we have:

    The two functions return SummarizedExperiment objects, as the data are no longer counts. The assay function is used to extract the matrix of normalized values.



    • #3
      Hi,

      I'm not sure I understand your answer, Michael...

      I'm having the same issue: the PCA plot is fine (and quite nice in my case!), but I really want to get the percentage contribution of PC1 and PC2, as I do with every other PCA (unrelated to transcriptomics) I perform. The DESeq2 package has to calculate it at some point to be able to draw the graph, but I can't find a way to access it...

      Plus I'd love to be able to draw a 3D PCA plot (PC1, PC2, PC3), but I can't find any information on that in the DESeq2 user's guide.

      Any thoughts? Thank you!



      • #4
        hi Pauline,

        The previous question was how to get a matrix of values from the SummarizedExperiment. The answer is:

        Code:
        mat <- assay(rld)

        Your question is more of a general R question: once you have a matrix, how do you get the contribution of each PC?

        Inside the plotPCA function we have code similar to the following (with the 'select' variable used to pick out the top genes by variance):

        Code:
        ntop = 500  # plotPCA's 'ntop' argument (default 500): number of top-variance genes to use
        rv = apply(mat, 1, var)
        select = order(rv, decreasing=TRUE)[seq_len(min(ntop, length(rv)))]
        pca = prcomp(t(mat[select,]))

        Check the help file for ?prcomp. This base R function gives you a list containing the results of the PCA. You are interested in the standard deviations of each component:

        Code:
        variances = pca$sdev ^ 2          # variance of each PC
        total.variance = sum(variances)
        variances/total.variance          # proportion of variance explained by each PC

        I don't have a recommendation on how to make 3D plots (I find it hard to see what's going on in these).
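For the princomp-style summary asked about at the top of the thread (standard deviation, proportion of variance, cumulative proportion), `summary()` also works on a `prcomp` object. The following is only a minimal sketch: the simulated matrix stands in for `assay(rld)`, and its dimensions, the seed, and `ntop <- 500` are assumptions made for illustration.

```r
# Simulated stand-in for mat <- assay(rld): 1000 genes (rows) x 6 samples (columns)
set.seed(1)
mat <- matrix(rnorm(1000 * 6), nrow = 1000,
              dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:6)))

ntop <- 500  # assumed number of top-variance genes to keep
rv <- apply(mat, 1, var)
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
pca <- prcomp(t(mat[select, ]))

# Matrix with rows "Standard deviation", "Proportion of Variance",
# "Cumulative Proportion" and one column per principal component
pca_summary <- summary(pca)$importance
print(pca_summary)

# The same proportions computed by hand from the standard deviations
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
```

The `importance` matrix rounds the proportions to five decimals, so it can differ very slightly from the hand-computed `prop_var`.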



        • #5
          OK! Then I get it.
          Thank you for deciphering the "inside" of the plotPCA function. I got the contributions from each PC. And if I understand the R code you gave me and my R output well enough, the following output should be the proportion of variation explained by PC1 to PC9 (please do correct me if I'm wrong!):

          > variances/total.variance
          [1] 7.254373e-01 1.269088e-01 8.017268e-02 3.342993e-02 1.212378e-02 1.140305e-02 6.017477e-03 4.507064e-03 1.299178e-31
          >

          I think I'm finally going to enjoy my first "on-my-own" RNAseq analysis now ;-)
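As a rough check on that reading: the values are fractions that sum to about 1, so multiplying by 100 gives percentages. The ninth value (about 1.3e-31) is effectively zero, which is what one would expect if the data set had nine samples, since prcomp centers the data and n samples then carry variance in at most n - 1 components. The nine-sample count is an assumption inferred from the output length. A small sketch using the numbers copied from the output above:

```r
# Proportions copied from the output posted above
prop <- c(7.254373e-01, 1.269088e-01, 8.017268e-02, 3.342993e-02,
          1.212378e-02, 1.140305e-02, 6.017477e-03, 4.507064e-03,
          1.299178e-31)

percent <- round(100 * prop, 2)             # percentage of variance per component
cumulative <- round(100 * cumsum(prop), 2)  # running total across components

print(data.frame(PC = paste0("PC", seq_along(prop)), percent, cumulative))
```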



          • #6
            The simplest solution to your problems with DESeq2's plotPCA function?
            Don't use it.
            Use the PCA and plot.PCA functions in FactoMineR.
            I love the graph I got. Much more informative.



            • #7
              Problem is I already tried with FactoMineR because it's the one I use for morphometrics-related PCAs, but my count table is too big. I need to run it on a better computer than my office computer, but the waiting list for the clusters we use for huge jobs was too long... :-)



              • #8
                Hi, Dr. Love,

                Thank you very much for your explanation.

                I just checked your plotPCA function in DESeq2.

                Your code is rv = rowVars(assay(x)), which is actually different from what you posted here.

                I had already calculated:

                vsd = varianceStabilizingTransformation(dds)

                When I ran rv = rowVars(assay(vsd)), I got the error message: could not find function "rowVars"

                I could get the PCA plot though.

                Could you please explain why?

                Thanks



                • #9
                  rowVars is from genefilter, so just "genefilter::rowVars".

                  Note that he wrote that the code he posted was "similar", not identical, to that in plotPCA. The rv variable is just used to subset things for computing the PCA. The PCA will work just fine without subsetting.
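If loading another package just for rowVars is not wanted, the row variances can also be computed in base R. This is only a sketch on a simulated matrix standing in for `assay(vsd)` (the dimensions and seed are assumptions); it also runs the PCA on all rows, without the subsetting step:

```r
# Simulated stand-in for assay(vsd): 200 genes (rows) x 5 samples (columns)
set.seed(42)
mat <- matrix(rnorm(200 * 5), nrow = 200)

# Row-wise variance in base R, two equivalent ways
rv_apply <- apply(mat, 1, var)
rv_fast <- rowSums((mat - rowMeans(mat))^2) / (ncol(mat) - 1)

# PCA on all genes, skipping the top-variance subsetting
pca_all <- prcomp(t(mat))
print(round(pca_all$sdev^2 / sum(pca_all$sdev^2), 3))
```

The `rowSums` form avoids calling `var` once per row, which tends to matter on large matrices.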
