  • Donor effect and cuffdiff

    Hello all,

    I was wondering if there's a way to deal with donor/batch effect if one wants to use cufflinks/cuffdiff pipeline. I have some human RNA-seq data, and using cuffdiff as is basically generates no differenitally expressed genes.

    However if you use design ~ donor + condition in DESeq2, there are quite a few differentially expressed genes. PCA suggests there is a strong donor effect also.

    So is there a way to address it with cuffdiff?

    Thank you in advance.

  • #2
    There isn't. Cuffdiff doesn't handle anything other than simple designs. Stick with DESeq2.
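
    For reference, a minimal sketch of the ~ donor + condition design mentioned in the first post. The counts are simulated with rnbinom, and the donor labels, sample names and gene names are made up for illustration; real data would come from your own count matrix and sample table.

    ```r
    library(DESeq2)

    set.seed(1)
    # Hypothetical paired layout: 4 donors, each with a control and a treated sample
    donor     <- factor(rep(paste0("D", 1:4), each = 2))
    condition <- factor(rep(c("ctrl", "trt"), times = 4))
    coldata   <- data.frame(donor, condition, row.names = paste0("S", 1:8))

    # Simulated counts (assumption): gene-specific means, fixed dispersion
    n_genes <- 1000
    gene_mu <- 2^runif(n_genes, 4, 10)
    counts  <- matrix(rnbinom(n_genes * 8, mu = gene_mu, size = 10), nrow = n_genes,
                      dimnames = list(paste0("gene", 1:n_genes), rownames(coldata)))

    # Donor goes first as the nuisance term, the variable of interest goes last
    dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                                  design = ~ donor + condition)
    dds <- DESeq(dds)
    res <- results(dds)  # by default tests the last design variable (condition)
    ```

    Because the simulated data contain no real condition effect, few or no genes should come out as significant here; the point is only the shape of the design.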


    • #3
      I see. Thank you.


      • #4
        Another quick question: if you were to use ComBat on an RNA-seq dataset, would you use normalized log2-transformed counts, such as the output of the rlog function or the variance-stabilizing transformation from DESeq2?

        The sva tutorial wasn't all that helpful, in all honesty.


        • #5
          If you have known batches, just include the batch variable in the design for DESeq2.

          We don't recommend testing on transformed counts.

          If you have unknown batches, you can use svaseq or other packages. We are writing up a workflow which will be released in a few weeks and includes svaseq and RUVSeq.

          But briefly, add the SVA surrogate variables (columns of 'sv') to the colData, and then add these to the design. E.g., for two surrogate variables:

          Code:
          dds$SV1 <- svseq$sv[,1]
          dds$SV2 <- svseq$sv[,2]
          design(dds) <- ~ SV1 + SV2 + condition
          dds <- DESeq(dds)
          Last edited by Michael Love; 09-30-2014, 10:25 AM. Reason: markup
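
          For context, the svseq object used above comes from svaseq in the sva package. A minimal sketch of how it might be obtained, assuming a DESeqDataSet named dds with a condition column. The simulated dataset from makeExampleDESeqDataSet and the filtering threshold are placeholders, not recommendations:

          ```r
          library(DESeq2)
          library(sva)

          set.seed(1)
          # Placeholder data (assumption): simulated dataset with a 'condition' column
          dds <- makeExampleDESeqDataSet(n = 2000, m = 12)
          dds <- estimateSizeFactors(dds)

          # svaseq works on a matrix of normalized counts; drop very low-count genes
          dat <- counts(dds, normalized = TRUE)
          dat <- dat[rowMeans(dat) > 1, ]

          # Full model includes the variable of interest; null model is intercept only
          mod  <- model.matrix(~ condition, colData(dds))
          mod0 <- model.matrix(~ 1, colData(dds))

          # Estimate two surrogate variables (n.sv = 2 is an arbitrary choice here)
          svseq <- svaseq(dat, mod, mod0, n.sv = 2)
          dim(svseq$sv)  # one row per sample, one column per surrogate variable
          ```

          The columns of svseq$sv can then be attached to dds and added to the design exactly as in the snippet above.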


          • #6
            Originally posted by Michael Love
            If you have known batches, just include the batch variable in the design for DESeq2.

            We don't recommend testing on transformed counts.
            That's what I did right away, and it worked great. However, I wanted to have an expression table with the donor bias removed for PCA, visualization, etc.

            If you have unknown batches, you can use svaseq or other packages. We are writing up a workflow which will be released in a few weeks and includes svaseq and RUVSeq.

            But briefly, add the SVA surrogate variables (columns of 'sv') to the colData, and then add these to the design. E.g., for two surrogate variables:

            Code:
            dds$SV1 <- svseq$sv[,1]
            dds$SV2 <- svseq$sv[,2]
            design(dds) <- ~ SV1 + SV2 + condition
            dds <- DESeq(dds)
            Great, it would be ideal to incorporate it all into the DESeq2 pipeline, because lots of things are already very conveniently done in DESeq2. Thank you for pointing out the two packages; it should help.


            • #7
              limma has a function, removeBatchEffect, which easily removes batch effects from a matrix.

              (You'd want the input to be on the scale of log2 of counts, and the rlog or VST output is on the log2 scale.)
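
              A minimal sketch of that step, using limma's removeBatchEffect on a simulated log2-scale matrix. The matrix, donor labels and condition labels below are made up; in practice the input would be something like assay(vst(dds)) or assay(rlog(dds)). Passing the condition design is a choice made here so that the condition effect is protected from removal. The corrected matrix is meant for PCA and plotting, not for testing.

              ```r
              library(limma)

              set.seed(1)
              donor     <- factor(rep(paste0("D", 1:6), each = 2))
              condition <- factor(rep(c("ctrl", "trt"), times = 6))

              # Simulated log2-scale expression (assumption): noise plus a per-donor shift
              n_genes <- 200
              x <- matrix(rnorm(n_genes * 12, mean = 8, sd = 0.3), nrow = n_genes)
              donor_shift <- matrix(rnorm(n_genes * 6, sd = 1), nrow = n_genes)
              x <- x + donor_shift[, as.integer(donor)]

              # Keep the condition effect in the model while removing the donor shifts
              design    <- model.matrix(~ condition)
              corrected <- removeBatchEffect(x, batch = donor, design = design)

              # Per-gene donor means before and after: the spread collapses after correction
              donor_means <- function(m) sapply(levels(donor), function(d) rowMeans(m[, donor == d]))
              range(donor_means(x) - rowMeans(x))
              range(donor_means(corrected) - rowMeans(corrected))
              ```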


              • #8
                So, on a related topic (not sure if people are still reading this thread):

                Is there a way to evaluate the donor effect quantitatively? Intuitively, one could check whether, in PCA space, points from the same donor are much closer to each other than to other points, or something like that.

                The problem is, if there are too many donors, it's sometimes hard to tell.


                • #9
                  PCA projects into a lower-dimensional space, which is necessary for visualization, but which we think also lets us see more signal and less noise. You could calculate the distances between the samples (using the 1st and 2nd PCs, say, or more). You can then compute the average within-batch distance and the average within-condition distance.
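
                  A minimal sketch of that calculation in base R, on a simulated log2-scale matrix with a strong donor shift. The data, the labels and the helper name mean_pair_dist are made up for illustration; with real data the input would be the transformed counts used for the PCA plot.

                  ```r
                  set.seed(1)
                  donor     <- factor(rep(paste0("D", 1:6), each = 2))
                  condition <- factor(rep(c("ctrl", "trt"), times = 6))

                  # Simulated log2-scale expression (assumption): noise plus a per-donor shift
                  n_genes <- 500
                  x <- matrix(rnorm(n_genes * 12, sd = 0.3), nrow = n_genes)
                  donor_shift <- matrix(rnorm(n_genes * 6, sd = 1), nrow = n_genes)
                  x <- x + donor_shift[, as.integer(donor)]

                  # PCA on samples (rows = samples), then distances in the PC1/PC2 plane
                  pcs <- prcomp(t(x))$x[, 1:2]
                  d   <- as.matrix(dist(pcs))

                  # Hypothetical helper: mean distance over sample pairs that share a label
                  # ("within") versus pairs that do not ("between")
                  mean_pair_dist <- function(d, group) {
                    same  <- outer(as.character(group), as.character(group), "==")
                    pairs <- upper.tri(d)  # count each pair of samples once
                    c(within = mean(d[pairs & same]), between = mean(d[pairs & !same]))
                  }

                  by_donor     <- mean_pair_dist(d, donor)
                  by_condition <- mean_pair_dist(d, condition)

                  # A within/between ratio well below 1 suggests samples cluster by that label
                  by_donor["within"] / by_donor["between"]
                  by_condition["within"] / by_condition["between"]
                  ```

                  Using more PCs only requires changing the 1:2 column selection. Comparing the donor ratio with the condition ratio gives a single number per factor, which may be easier to read than a crowded PCA plot.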


                  • #10
                    Yep, that's something I was thinking about. Is there a ready-made solution, or do you suggest I do it myself?

                    Also, if you guys were to include PCA/batch analysis and removal functions in DESeq2, it would be a cool thing to have.


                    • #11
                      Usually, it's best to do this kind of thing yourself: all the functionality is there in R, and reimplementing existing functionality in Bioconductor is discouraged.

                      Hence we say in the help file for plotPCA: "Note that the source code of plotPCA is very simple and commented. Users should find it easy to customize this function."

                      We don't have a batch removal function because, to account for fold changes due to batch when testing counts, one just adds a variable to the design. And for removing shifts from transformed counts, limma has a function which does this for you.


                      • #12
                        Sure, I understand. I do PCA a little differently (using ggplot2 facilities), but it still shouldn't be hard to calculate. It seems like a useful thing to have; in experiments with some 60 donors, it's really not that easy to interpret the PCA plot in terms of bias...


                        • #13
                          By the way, check out the latest version of DESeq2 (1.6). I switched us to ggplot2 for the PCA plot and tried to make it easier to customize. See the vignette and the workflow here:

                          bioconductor.org/help/workflows/rnaseqGene/


                          • #14
                            Sweet, thank you.
