
  • understanding design formula in DESeq2

    Hi,

    I am trying to understand the role of using an interaction term in the design formula of DESeq2. I have read this explanation: http://bioconductor.org/packages/dev...l#interactions

    This contains the following paragraph:

    The key point to remember about designs with interaction terms is that, unlike for a design ~genotype + condition, where the condition effect represents the overall effect controlling for differences due to genotype, by adding genotype:condition, the main condition effect only represents the effect of condition for the reference level of genotype (I, or whichever level was defined by the user as the reference level). The interaction terms genotypeII.conditionB and genotypeIII.conditionB give the difference between the condition effect for a given genotype and the condition effect for the reference genotype.
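    As a rough illustration of the quoted paragraph, here is a minimal sketch in plain Python (not DESeq2 itself, so it runs without R). It assumes a single gene with made-up mean log2 expression values per genotype and condition, and treats the interaction design as a saturated linear model in which every coefficient is an exact function of the cell means. The numbers and the helper `interaction_coefficients` are hypothetical; the coefficient names only mimic the style of the names in the quote.

```python
# Hypothetical mean log2 expression of one gene in each genotype x condition cell.
# These numbers are invented for illustration only.
cell_mean = {
    ("I", "A"): 5.0, ("I", "B"): 6.0,
    ("II", "A"): 5.5, ("II", "B"): 8.5,
    ("III", "A"): 4.0, ("III", "B"): 4.5,
}


def interaction_coefficients(cell_mean):
    """Coefficients of ~ genotype + condition + genotype:condition,
    with genotype I and condition A as reference levels (saturated model)."""
    ref = cell_mean[("I", "A")]
    coefs = {"Intercept": ref}
    # Main condition effect: B vs A within the reference genotype I only.
    coefs["condition_B_vs_A"] = cell_mean[("I", "B")] - ref
    for g in ("II", "III"):
        # Main genotype effect: genotype g vs I, within reference condition A.
        coefs[f"genotype_{g}_vs_I"] = cell_mean[(g, "A")] - ref
        # Interaction: condition effect in genotype g minus condition effect in I.
        effect_in_g = cell_mean[(g, "B")] - cell_mean[(g, "A")]
        coefs[f"genotype{g}.conditionB"] = effect_in_g - coefs["condition_B_vs_A"]
    return coefs


coefs = interaction_coefficients(cell_mean)
for name, value in coefs.items():
    print(f"{name}: {value:+.2f}")
```

    With these numbers the main condition effect is the B vs A difference in genotype I alone, and each interaction term is how much larger or smaller the condition effect is in genotype II or III than in genotype I.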

    I would be happy if someone could confirm the following statements, so that I know whether I understand this correctly:

    1) design = ~ condition + genotype + condition:genotype

    This does not simply look at differential expression between conditions (typically WT vs KO). Instead, it detects the genes that are differentially expressed between conditions AND differently between genotypes.

    2) design = ~ condition + genotype

    This detects differentially expressed genes while correcting for the genotype effect. In other words, it compares all the samples of condition A with all the samples of condition B, but corrects for the effect of genotype (much as one can correct for a batch effect).


    3) design = ~ condition

    Same as above, but without correcting for the genotype effect.
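    To make the contrast between statements 2) and 3) concrete, here is a small sketch in plain Python. It is only an analogy: it uses ordinary least squares on invented log2 values with one sample per genotype and condition, whereas DESeq2 fits a negative binomial GLM on counts. In a balanced layout like this one, both designs give the same estimate of the condition effect, but the additive design removes the genotype-to-genotype variation from the residuals. All values and variable names are hypothetical.

```python
# Hypothetical log2 expression values, one sample per genotype x condition cell.
# Invented numbers: genotype shifts the baseline, condition B adds about 1.
samples = [
    ("I", "A", 5.0), ("I", "B", 6.1),
    ("II", "A", 7.0), ("II", "B", 7.9),
    ("III", "A", 3.0), ("III", "B", 4.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand_mean = mean(y for _, _, y in samples)
cond_mean = {c: mean(y for _, cc, y in samples if cc == c) for c in ("A", "B")}
geno_mean = {g: mean(y for gg, _, y in samples if gg == g) for g in ("I", "II", "III")}

# In a balanced design, ~ condition and ~ genotype + condition both estimate
# the condition effect as the difference between the two condition means.
condition_effect = cond_mean["B"] - cond_mean["A"]

# Residual sum of squares when genotype is ignored (~ condition).
rss_condition_only = sum((y - cond_mean[c]) ** 2 for _, c, y in samples)

# Residual sum of squares for the additive fit (~ genotype + condition).
# For a balanced layout the fitted value is
# genotype mean + condition mean - grand mean.
rss_additive = sum(
    (y - (geno_mean[g] + cond_mean[c] - grand_mean)) ** 2 for g, c, y in samples
)

print(f"condition effect (B vs A): {condition_effect:.2f}")
print(f"residual SS, ~ condition: {rss_condition_only:.2f}")
print(f"residual SS, ~ genotype + condition: {rss_additive:.2f}")
```

    The estimated effect is the same in both fits here, but the unexplained variation is far smaller once genotype is in the model, which is what makes the test for the condition effect more sensitive. With an unbalanced design the two estimates would generally differ as well.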


    I would also like to know whether the following statement is correct:

    If we now consider batches instead of genotypes, and use a package for batch effect correction such as sva, we can say that:

    1) (~ condition + use of sva) is equivalent, in principle, to (~ condition + batch). The difference is that the package uses a different method.



    Question:

    If the above statements are true, is it correct to say that the following code is equivalent to a 2 by 2 comparison within each genotype using only ~condition:

    `results(dds, contrast=c("group", "IB", "IA"))
    results(dds, contrast=c("group", "IIB", "IIA"))
    results(dds, contrast=c("group", "IIIB", "IIIA"))`

    or does it only select genes that differ between all genotypes AND differ between conditions for genotype X (X=c("I", "II", "III"))?
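    One way to picture what the three calls above compare is the following plain Python sketch. It assumes a `~ group` design in which `group` is genotype and condition pasted together, and it uses invented fitted mean log2 values for a single gene; `group_mean` and `group_contrast` are hypothetical names, not DESeq2 functions.

```python
# Hypothetical fitted mean log2 expression of one gene per group,
# where group = genotype and condition pasted together.
group_mean = {
    "IA": 5.0, "IB": 6.0,
    "IIA": 5.5, "IIB": 8.5,
    "IIIA": 4.0, "IIIB": 4.5,
}


def group_contrast(numerator, denominator):
    """Log2 fold change corresponding to contrast=c("group", numerator, denominator)."""
    return group_mean[numerator] - group_mean[denominator]


per_genotype_effect = {
    geno: group_contrast(f"{geno}B", f"{geno}A") for geno in ("I", "II", "III")
}
for geno, lfc in per_genotype_effect.items():
    print(f"condition B vs A in genotype {geno}: {lfc:+.2f}")
```

    Each contrast only involves the two groups it names, so it measures the condition effect within that one genotype and does not require the gene to differ between genotypes. It is not identical to subsetting the samples and refitting with ~condition, though: as far as I understand, the dispersion estimates in the combined fit use all samples, so the p-values can differ even when the fold changes are similar.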

    Thanks a lot in advance.

  • #2
    Hi Nicolas

    Assertions 2-4 seem OK, but 1 is not correct. The best I could come up with to explain this is in the recent book: https://www.huber.embl.de/msmb/Chap-...ec:multifactor

    In particular, note that model formulae do not detect any genes. They are a concise way of specifying a model with multiple parameters ("betas"). The next step is to say which of these parameters, or which linear combination of them (a "contrast"), you care about, and *then* you look for genes with a large value of this (univariate) parameter.
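    The idea of a contrast as a linear combination of the betas can be sketched in a few lines of plain Python. The coefficient values below are invented for one hypothetical gene, the names only mimic the style DESeq2 uses for an interaction design with genotypes I, II, III and conditions A, B, and `apply_contrast` is a hypothetical helper.

```python
# Hypothetical fitted coefficients (log2 scale) for one gene under
# ~ genotype + condition + genotype:condition.
coef_names = [
    "Intercept",
    "genotype_II_vs_I",
    "genotype_III_vs_I",
    "condition_B_vs_A",
    "genotypeII.conditionB",
    "genotypeIII.conditionB",
]
beta = [5.0, 0.5, -1.0, 1.0, 2.0, -0.5]


def apply_contrast(weights, beta):
    """A contrast is a weighted sum of the model coefficients."""
    return sum(w * b for w, b in zip(weights, beta))


# Each contrast has one weight per coefficient, in the order of coef_names.
contrasts = {
    "condition effect in genotype I": [0, 0, 0, 1, 0, 0],
    "condition effect in genotype II": [0, 0, 0, 1, 1, 0],
    "condition effect, III minus II": [0, 0, 0, 0, -1, 1],
}

contrast_values = {
    label: apply_contrast(weights, beta) for label, weights in contrasts.items()
}
for label, value in contrast_values.items():
    print(f"{label}: {value:+.2f}")
```

    The same fitted model answers all three questions; only the weights change. Testing a gene then amounts to asking whether the chosen weighted sum is far from zero relative to its uncertainty.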

    Sorry, I didn't understand the "Question".

    Hope this helps (a little) -
    Wolfgang
    Last edited by Wolfgang Huber; 05-19-2019, 12:12 PM.
    Wolfgang Huber
    EMBL
