  • DESeq2 dynamics in a timecourse experiment

    Dear all,

    We are trying to analyse a time-course data set with several time points and replicates. We are looking for genes that change across all time points (TP), as well as genes that differ in a pairwise comparison between two time points.

    My question concerns how sensitive DESeq2 is to expression dynamics.
    Identifying significant changes across multiple time points isn't as simple as looking for a consistent, continuous change over all time points, because a gene can be high at one or two time points and off at the rest, and that is a significant biological difference.

    My question is therefore: is the DESeq2 time-course model design able to identify genes that go up and down over multiple time points in my time-course experiment?

    For example, if I have a gene that goes up after 1h, then returns to its normal level, and then goes down after 24h, will DESeq2 be able to identify it as significantly deregulated over all time points?
    Do I need to check all coefficients to get a complete overview of the genes that change over time?

    Thanks,

    Assa

  • #2
    For the question of, "What genes change over time?", what you want to do is an LRT test where the reduced model lacks the time component. This will give you everything that changes, regardless of whether it's a consistent change or only up/down at one time point.

    Other than that, I think people are typically encouraged to spend some time with the PCA plots. You can often make some guess as to useful pairwise comparisons from them.

    Edit: Note that fold changes from the LRT results probably aren't meaningful, but the p-values are.


    • #3
      Originally posted by dpryan
      For the question of, "What genes change over time?", what you want to do is an LRT test where the reduced model lacks the time component. This will give you everything that changes, regardless of whether it's a consistent change or only up/down at one time point.
      These are my full and reduced models:
      Code:
      library(DESeq2)
      # Create the DESeq2 data set from a count matrix.
      dds <- DESeqDataSetFromMatrix(countData = countTable, colData = phenotype, design = ~ replica + hours)
      # Likelihood ratio test against a reduced model without the time term.
      dds <- DESeq(dds, test = "LRT", reduced = ~ replica)
      What would the difference be, both in terms of my question and in the results I get, if I added the interaction term replica:hours to the two models, like this?

      Code:
      # Create the DESeq2 data set from a count matrix.
      dds <- DESeqDataSetFromMatrix(countData = countTable, colData = phenotype, design = ~ replica + replica:hours + hours)
      dds <- DESeq(dds, test = "LRT", reduced = ~ replica + replica:hours)
      Does the interaction term always have to be at the end of the model (as this is usually the part I'm interested in when using such interactions)?


      • #4
        The odds are quite good that you want the interaction. Without it you're just looking for consistent changes over time. So the full model would be "~replica*hours" and the reduced model just "~replica".

        The order of terms in the formula only matters for a few plotting functions (they usually default to coloring according to the last model component); the statistics will be the same regardless.


        • #5
          Originally posted by dpryan
          The odds are quite good that you want the interaction. Without it you're just looking for consistent changes over time. So the full model would be "~replica*hours" and the reduced model just "~replica".
          Is there a difference between these two models?
          full: ~replica * hours
          reduced: ~replica

          and

          full: ~replica + hours + replica:hours
          reduced: ~replica + replica:hours

          Assa


          • #6
            The only difference is the reduced model, which in one case has the interaction. It's then a matter of what exact question you want to ask. A reduced model of "~replica" will give you "all hours dependent changes, including those only occurring due to a hours:replica interaction", while the "~replica+replica:hours" excludes the aforementioned interaction.
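
One way to see what each of these formulas actually fits is to expand it with base R's model.matrix(). The sketch below is only an illustration under assumptions: it uses a hypothetical balanced toy design (3 replicas x 4 time points, not the data from this thread) and R's default treatment coding, and the object names are made up for the example.

```r
# Hypothetical balanced toy design: 3 replicas x 4 time points (not the thread's data).
toy <- expand.grid(replica = factor(1:3), hours = factor(c(0, 16, 24, 48)))

# Expand each formula into its design matrix.
full_star     <- model.matrix(~ replica * hours, data = toy)
full_explicit <- model.matrix(~ replica + hours + replica:hours, data = toy)
red_main      <- model.matrix(~ replica, data = toy)
red_inter     <- model.matrix(~ replica + replica:hours, data = toy)

# "a * b" is shorthand for "a + b + a:b", so both full models have the same columns.
same_full <- identical(colnames(full_star), colnames(full_explicit))

# Number of coefficients per model.
n_coef <- c(full = ncol(full_star),
            reduced_main = ncol(red_main),
            reduced_inter = ncol(red_inter))
print(same_full)
print(n_coef)
```

In this toy design the two full models are the same, and "~replica + replica:hours" expands to as many columns as the full model, because R codes the interaction with a full set of indicator columns when the "hours" main effect is absent. It therefore seems worth comparing column counts before relying on such a reduced model, since a likelihood ratio test needs the reduced model to have fewer coefficients than the full one.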


            • #7
              Originally posted by dpryan
              The only difference is the reduced model, which in one case has the interaction. It's then a matter of what exact question you want to ask. A reduced model of "~replica" will give you "all hours dependent changes, including those only occurring due to a hours:replica interaction", while the "~replica+replica:hours" excludes the aforementioned interaction.
              Thanks for the quick response.

              Maybe this is a very basic statistical question, but to be honest I am not quite sure where the difference lies.

              In this workflow I am trying to identify all genes that show a significant difference in behaviour across the time course, whether at one specific time point, at two consecutive time points, or at alternating time points.

              What exactly would the interaction term "replica:hours" tell me, or add to the information I already have?
              What changes can happen due to an hours:replica interaction?

              Does it mean that with this term I get the genes that are significantly DE because the intensity of gene X in replica1 of TP1 is higher than in ctrl1 of TP1, even though the same gene in the same condition, but in replica2 and ctrl2 of TP1, is not significant?
              So basically, if I exclude this interaction, is it possible to have replica-specific DE genes?

              Why would I want to have such genes at all (if I am correct in my assumption)?
              Last edited by frymor; 11-08-2015, 11:26 PM.


              • #8
                If "replica" is doing anything, then I presume that it can have a time-dependent effect. This is known as an "interaction" and is the "replica:hours" part of your model. Not including that in the reduced model says, "include any genes that might change due to this," which I presume you want, though without knowing the actual context that's just an assumption. If you just want genes that change as a function of time, ignoring any "replica"-dependent changes, then go ahead and keep "replica:hours" in your reduced model.

                BTW, I hope "replica" doesn't denote the biological replicates


                • #9
                  Originally posted by dpryan
                  BTW, I hope "replica" doesn't denote the biological replicates
                  Now I am confused. "replica" denotes the biological replicates I have.

                  This is my colData:
                  Code:
                  sampleName	hours	replica	batch
                  IFM_Myoblast_1	0	1	2014
                  IFM_Myoblast_2	0	2	2014
                  IFM_Myoblast_3	0	3	2014
                  IFM16h_1	16	1	2013
                  IFM16h_2	16	2	2015
                  IFM16h_3	16	3	2015
                  IFM24h_1	24	1	2013
                  IFM24h_2	24	2	2015
                  IFM24h_3	24	3	2015
                  IFM30h_1	30	1	2013
                  IFM30h_2	30	2	2014
                  IFM48h_1	48	1	2013
                  IFM48h_2	48	2	2014
                  IFM48h_3	48	3	2014
                  IFM72h_1	72	1	2013
                  IFM72h_2	72	2	2014
                  IFM90h_1	90	1	2013
                  IFM90h_2	90	2	2013
                  IFM100h_1	100	1	2013
                  IFM100h_2	100	2	2014
                  As you can see, "replica" denotes the biological replicates. I am assuming that the replicates are not significantly different from each other, and would therefore like to assume that there are no significant differences within each time point.

                  BUT

                  If such differences do exist, I would like them to be ignored (i.e. not taken into account when calculating the differential expression). For that reason I would like to add them to the reduced model, if I understood it correctly.


                  • #10
                    Don't include a term for replicates in your models.

                    Your full model should just be ~hours and the reduced ~1. Note that "hours" MUST be a factor in order for this to work. I see that you have a batch effect, in which case a full model of ~batch+hours and a reduced model of ~batch will hopefully work. Make sure that "batch" is also a factor.
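
The advice above can be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual analysis: the 12-sample layout is hypothetical, the counts are simulated with DESeq2's makeExampleDESeqDataSet() rather than taken from the thread, and the DESeq2 part only runs if the package is installed. The first half uses base R to show why "hours" needs to be a factor.

```r
# Time points and batches for a hypothetical 12-sample experiment (assumed values).
hours_num <- rep(c(0, 16, 24, 48), each = 3)
batch_num <- rep(c(2013, 2014, 2015), times = 4)

# As a number, "hours" yields a single slope; as a factor, one coefficient
# per non-reference time point.
cols_numeric <- colnames(model.matrix(~ hours_num))
hours <- factor(hours_num)
batch <- factor(batch_num)
cols_factor <- colnames(model.matrix(~ hours))
print(cols_numeric)
print(cols_factor)

# The LRT itself, run on simulated counts only if DESeq2 is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))
  set.seed(1)
  dds <- makeExampleDESeqDataSet(n = 500, m = 12)  # simulated counts, 12 samples
  colData(dds)$hours <- hours
  colData(dds)$batch <- batch
  design(dds) <- ~ batch + hours
  # Full model: batch + hours; reduced model: batch only.
  dds <- DESeq(dds, test = "LRT", reduced = ~ batch)
  res <- results(dds)  # p-values test for any change over time
  print(head(res[order(res$padj), ]))
}
```

With a numeric "hours" the model can only describe a straight-line trend, so a gene that peaks at one time point and returns to baseline could be missed; the factor coding gives each time point its own coefficient, which is what the LRT then tests jointly.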


                    • #11
                      Why shouldn't the biological replicates be included in the model?

                      I haven't included the batch, because we don't see any batch effect when plotting a PCA. But I will try it with that too, after removing the replica factor from the model.


                      • #12
                        If you include a coefficient for replicates then they won't be used as replicates anymore, rather they'll be treated as though they're different groups. This drastically reduces your statistical power. You should only ever include a factor for replicates when you have something like a case-control study and need specific samples to be paired to each other.
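
A rough way to see the cost of a replicate term is to count how many samples are left over after the coefficients have been fitted. The sketch below retypes the "hours" and "replica" columns from post #9 and uses base R only; the helper name leftover() is made up for this example, and the count is a simple samples-minus-columns figure, not a statement about DESeq2's internals.

```r
# hours and replica columns from post #9, retyped so the block is self-contained.
hours   <- c(0, 0, 0, 16, 16, 16, 24, 24, 24, 30, 30,
             48, 48, 48, 72, 72, 90, 90, 100, 100)
replica <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2,
             1, 2, 3, 1, 2, 1, 2, 1, 2)
coldata <- data.frame(hours = factor(hours), replica = factor(replica))

# Samples minus fitted coefficients: what is left to estimate variability.
leftover <- function(formula) {
  nrow(coldata) - ncol(model.matrix(formula, data = coldata))
}

df_left <- c(
  hours_only   = leftover(~ hours),
  plus_replica = leftover(~ replica + hours),
  interaction  = leftover(~ replica * hours)
)
print(df_left)
```

For this sample table, "~hours" leaves 12 samples' worth of information for estimating variability, "~replica + hours" leaves 10, and "~replica * hours" asks for more coefficients (24) than there are samples (20).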


                        • #13
                          Hi all,

                          Thanks, Devon, for all the help. It seems to work quite well.

                          Just to make sure I have understood how to create multi-factor designs: a colleague of mine asked me about a different experiment.
                          We have different mouse samples.
                          We have two conditions: wild-type and KO.
                          We have stimulated (s) and unstimulated (us) samples
                          (altogether 12 samples: 3x WTus, 3x KOus, 3x WTs and 3x KOs).

                          Code:
                          name	condition	stimulation
                          Vav_KO_1	KO	no
                          Vav_KO_2	KO	no
                          Vav_KO_2_C	KO	yes
                          Vav_KO_4_C	KO	yes
                          Vav_KO_5	KO	no
                          Vav_KO_5_C	KO	yes
                          Vav_WT_1	wildtype	no
                          Vav_WT_1_C	wildtype	yes
                          Vav_WT_2	wildtype	no
                          Vav_WT_2_C	wildtype	yes
                          Vav_WT_4	wildtype	no
                          Vav_WT_4_C	wildtype	yes
                          We are interested in two groups of genes:
                          1. Which genes are differentially regulated due to the condition (knock-out)? In other words, I would like to know which genes are influenced by the KO, and by the KO only.
                          2. Which genes are changed in the knock-out due to the stimulation?

                          As this is a pairwise comparison, I will use the Wald test.
                          The design formula I would use in this case is
                          Code:
                          ~condition + stimulation + condition:stimulation
                          or, as these two are identical,
                          Code:
                          dds <- DESeqDataSetFromMatrix(countData = countTable,
                                                        colData = Phenotype,
                                                        design = ~condition*stimulation)
                          dds <- DESeq(dds)
                          To identify the DE genes for each of these questions, I will use the results() command with the following parameters.
                          To answer the first question and identify the effect of the knock-out:
                          Code:
                          resultsWT.KO <- results(dds, contrast=c("condition", "KO", "wildtype"))
                          For the second question, the default results() command should suffice, if I understand it correctly. Running this command should give me the genes that change between the KO and the WT due to the stimulation of the samples (and not due to the KO):
                          Code:
                          resultsStimulations <- results(dds)
                          Are my assumptions correct?

                          Thanks for any help and corrections if needed,

                          Assa


                          • #14
                            1. This isn't actually a pair-wise comparison, it's a classical factorial design (sorry, I know this gets overly confusing...I swear this makes sense if someone just draws stuff on a white board!). For this, you just want the results() from "condition" (using a contrast as you did is nice since then you know whether WT or KO is the numerator).

                            2. That should work, though you might specify results(dds, name="condition:stimulation") or something like that to be absolutely sure.


                            • #15
                              thanks again,

                              Originally posted by dpryan
                              2. That should work, though you might specify results(dds, name="condition:stimulation") or something like that to be absolutely sure.
                              If I take one of the names from my resultsNames() vector:
                              Code:
                              > resultsNames(dds)
                              [1] "Intercept"                         "condition_wildtype_vs_trippleKO"  
                              [3] "stimulation_none_vs_Curdlan"       "conditionwildtype.stimulationnone"
                              it would be as such:

                              Code:
                              resultsStimulations <- results(dds, name="conditionwildtype.stimulationnone")
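
Which coefficient answers which question depends on the reference level of each factor. The base R sketch below retypes the sample table from post #13 and assumes, as an illustrative choice rather than anything stated in the thread, that wild-type and unstimulated samples should be the reference levels; the object names are made up for the example.

```r
# Sample table from post #13, retyped so the block is self-contained.
pheno <- data.frame(
  condition   = rep(c("KO", "wildtype"), each = 6),
  stimulation = c("no", "no", "yes", "yes", "no", "yes",   # Vav_KO_* samples
                  "no", "yes", "no", "yes", "no", "yes")   # Vav_WT_* samples
)

# Assumed choice: wild-type and unstimulated samples as reference levels.
# Without relevel(), R picks the alphabetically first level ("KO") as reference.
pheno$condition   <- relevel(factor(pheno$condition), ref = "wildtype")
pheno$stimulation <- relevel(factor(pheno$stimulation), ref = "no")

# Columns of the design matrix for the factorial model.
coef_names <- colnames(model.matrix(~ condition * stimulation, data = pheno))
print(coef_names)
```

With this coding, the "conditionKO" column is the KO versus wild-type difference in unstimulated samples, "stimulationyes" is the stimulation effect in wild-type, and the interaction column describes how much the stimulation effect differs in the KO. DESeq2 reports coefficient names in its own format via resultsNames(dds), as shown in the post above, so the exact strings passed to results() should be read from there.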
