Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • DESEQ2 multiple factors + interaction analysis

    Hello
    First, I want to emphasize that reading his forum was already really extremely helpful
    overcoming some of the issues I had initially grasping the concept of designs in Deseq2.
    Nevertheless I am stuck now and could need some help:

    I have the following experimental setup (each 3 replicates):

    condition: TREATMENT vs CONTROL
    tissue: A vs B
    genotype: MUTANT vs WT

    Now, I first wanted to find all the diferentially expressed genes for each of the condition (was asked by collaborator) , e.g.

    A vs B in the wild-type and treated
    A vs B in the wild-type and treated

    and so on (it is as well a good point to compare with already published data).

    For this I created groups to simplify the analysis e.g.

    TREATMENT WT A, TREATMENT WT B, CONTROL WT A ,...

    and run the analysis with the simple design: ~group

    which gave me the following result names:

    Code:
     resultsNames(dds)
        [1] "Intercept"        "groupTREATMENT.WT.A" "groupTREATMENT.WT.B"....
    and so on.

    Which I could then compare then in the fashion:
    Code:
    cond1 <- results(dds, contrast=list("groupTREATMENT.WT.A","groupTREATMENT.WT.B"))
    I am happy till here, but now comes the part which I am not sure how to analyze best.
    If I want to figure out the interaction
    • [1] tissue A vs B + WT vs MUTANT in TREATMENT
      [2] tissue A vs B + WT vs MUTANT in CONTROL
      [3] TREATMENT vs CONTROL + WT vs MUTANT in tissue A
      [4] TREATMENT vs CONTROL + WT vs MUTANT in tissue B


    I thought that I could first re-level the genotype to make the mutant the reference and get only the genes upregulated in the WT.

    Code:
    dds$genotype <- relevel(dds$genotype, "MUTANT")
    For the design I thought the proper manner would be:

    For the 1st example

    Code:
     ~tissue + genotype + condition + genotype:tissue
    The thought was that I get the interaction between the genotype and the condition and can control for the tissue. But I am somehow on the wrong track:
    Code:
    resultsNames(dds)
    [1] "Intercept"                  "tissue_A_vs_B"                "genotype_WT_vs_MUTANT"        "condition_TREATMENT_vs_CONTROL"     "tissueA.genotypeWT"
    How would I now extract for the previous described scenario (1 and 2) extract the list of diff. exp. genes with contrast?

  • #2
    hi,

    If I understand your question, it sounds like you want to use a model with all interactions.

    You will have to turn off the LFC shrinkage (betaPrior=FALSE in the DESeq() call), as in a situation with two levels of interaction terms (first order interactions between two variables and second order interactions between three variables), shrinkage of effects becomes complicated, and we did not implement routines for this.

    By, "tissue A vs B + WT vs MUTANT in TREATMENT", do you mean, test for a difference in the interaction effect of tissue and genotype for the treatment group vs the control group?

    You should then use a design of ~ tissue*genotype*condition

    And this difference is tested with

    Code:
    results(dds, name="tissueB.genotypeMUTANT.conditionTREATMENT")
    (This requires that you relevel so that A, WT, and CONTROL are base levels of the respective factors.)

    If you want to test the interaction of tissue and genotype specific for the treatment group, that would be the interaction effect of tissue and genotype for the control group and the difference in the interaction effect for the treatment group added together:

    Code:
    results(dds, contrast=list(c("tissueB.genotypeMUTANT","tissueB.genotypeMUTANT.conditionTREATMENT")))

    For, "tissue A vs B + WT vs MUTANT in CONTROL", if you mean the interaction effect of tissue and genotype specific for the control group, this would be the effect

    Code:
    results(dds, name="tissueB.genotypeMUTANT")
    For you to visualize, it might help to examine the model matrix:

    Code:
    model.matrix(~ tissue*genotype*condition, colData(dds))
    which should be the same as the following, if you use betaPrior=FALSE:

    Code:
    attr(dds, "modelMatrix")
    Last edited by Michael Love; 01-20-2015, 08:58 AM. Reason: clarifying

    Comment


    • #3
      Thanks, that was helpful!
      Indeed I find sometimes the contrast list somehow confusing (syntax wise).
      Why would be

      test for an interaction of tissue and genotype specific for the treatment group
      --> tissueB.genotypeMUTANT.conditionTREATMENT

      whereas

      interaction effect of tissue and genotype specific for the control group
      --> tissueB.genotypeMUTANT

      Comment


      • #4
        hi,

        I've tried to clarify the above text, adding in between a third results table, which might have been the one you are interested in. This is just the nature of interactions. Interactions are additional effects for the groups which are not the reference level (or "base level"). So the tissue:genotype interaction for the control group is just the first order interaction, while the tissue:genotype interaction for the treatment group is the first order interaction plus an additional effect.

        Comment


        • #5
          I'm also struggling with the syntax of the design formula. I want to identify genes differentially expressed between two ancestries of two health statuses at each of 4 time points. I'm not sure if I should do two separate analyses or combine everything into one design:

          Ancestry: A or B
          Status: could be Control or Case
          Time_Point: 1,2,3,4

          I tried the following design to be able to compare each thing individually and also the interaction between any of the things:
          dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData3,design = ~ANCESTRY+Status+Time_Point + ANCESTRY:Time_Point + Status:Time_Point)

          dds <- DESeq(dds,parallel=TRUE)

          Which results in the following:
          resultsNames(dds)
          [1] "Intercept"
          [2] "ANCESTRYA"
          [3] "ANCESTRYB"
          [4] "StatusControl"
          [5] "StatusCase"
          [6] "Time_Point1"
          [7] "Time_Point2"
          [8] "Time_Point3"
          [9] "Time_Point4"
          [10] "ANCESTRYA.Time_Point1"
          [11] "ANCESTRYB.Time_Point1"
          [12] "ANCESTRYA.Time_Point2"
          [13] "ANCESTRYB.Time_Point3"
          [14] "ANCESTRYA.Time_Point3"
          [15] "ANCESTRYB.Time_Point3"
          [16] "ANCESTRYA.Time_Point4"
          [17] "ANCESTRYB.Time_Point4"
          [18] "StatusControl.Time_Point1"
          [19] "StatusCase.Time_Point1"
          [20] "StatusControl.Time_Point2"
          [21] "StatusCase.Time_Point2"
          [22] "StatusControl.Time_Point3"
          [23] "StatusCase.Time_Point3"
          [24] "StatusControl.Time_Point4"
          [25] "StatusCase.Time_Point4"

          This way I am able to identify genes differentially expressed between one time point and another regardless of status or ancestry:
          Time1v_2<-results(dds, contrast=c("Time_Point", "Time_Point1", "Time_Point2"),parallel=TRUE)

          Time2v_3<-results(dds, contrast=c("Time_Point", "Time_Point2", "Time_Point3"),parallel=TRUE)

          I am also able to find genes differentially expressed between Status A and Status B regardless of time point or ancestry:
          ControlvCase<-results(dds, contrast=c("Status", "Control", "Case"),parallel=TRUE)

          Similarly for ancestry regardless of status or time point:
          AncestryAvB<-results(dds, contrast=c("Ancestry", "A", "B"),parallel=TRUE)

          But I get confused when I want to identify genes differentially expressed between Control and Case at each specific time point, or Ancestry A and Ancestry B at each time point. Is the following the correct syntax for such a comparison?


          For genes differentially expressed in cases and controls at a particular time:
          ControlTime2vCaseTime2<-results(dds, contrast=list(c("StatusControl.Time_Point2", "StatusCase.Time_Point2")),parallel=TRUE)

          For ancestry differences at a particular time:
          AncestryATime2vAncestryBTime2<-results(dds, contrast=list(c("ANCESTRYEA.Time_Point3_BCG_24h", "ANCESTRYAJ.Time_Point3_BCG_24h")),parallel=TRUE)

          When I run these commands, I get no error and a results table, but I want to make sure those results are for the comparison I actually want.

          I have also read about the time series option, but am unsure how it would differ from the above design.

          Many thanks for any feedback!

          Comment

          Latest Articles

          Collapse

          • seqadmin
            Essential Discoveries and Tools in Epitranscriptomics
            by seqadmin




            The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
            04-22-2024, 07:01 AM
          • seqadmin
            Current Approaches to Protein Sequencing
            by seqadmin


            Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
            04-04-2024, 04:25 PM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by seqadmin, 04-11-2024, 12:08 PM
          0 responses
          59 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-10-2024, 10:19 PM
          0 responses
          57 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-10-2024, 09:21 AM
          0 responses
          53 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-04-2024, 09:00 AM
          0 responses
          56 views
          0 likes
          Last Post seqadmin  
          Working...
          X