  • multifactor analysis or pairwise comparison?

    I can't find a simple answer on the difference between a multifactor analysis and pairwise comparisons. For example:

    I have two factors to consider: condition (treatment or control) and genotype (A and B). If I do
    1. a pairwise comparison between control and treatment across both genotypes and
    2. a pairwise comparison between treatment and control in genotype A and treatment and control in genotype B
    I expect the intersection of these two pairwise comparisons to give the same results as a multifactor analysis that takes both factors, condition and genotype, into account (in DESeq2, something like design(ddsM) <- formula(~condition*genotype)), but I don't get the same results.

    In this case, what is the biological interpretation of each analysis?

  • #2
    Your wording is a little confusing, so it's hard to tell exactly what you're doing in methods (1) and (2). How about you just tell us the biological question you want to answer, and then one of us can tell you the simplest way to do it?

    In general, you get the same answer whether you treat each condition:genotype combination as a separate group and do comparisons between groups, or use a more classical factorial design (e.g., "~condition*genotype"). Some questions are simpler to answer with one design than with the other.

    I would strongly encourage you not to run multiple comparisons and then take the intersection of the results. You absolutely kill your statistical power when you do that. You also end up without a fold-change or p-value, which you need for both publishing and follow-up experiments.

    • #3
      My biological question is how genotypes A and B respond differently to the treatment. I need to take both factors into account. I want to know the difference between doing a multifactor analysis (condition*genotype) and doing treat vs. control in A and treat vs. ctl in B and then comparing the lists of genes. Basically, if I do the latter I get more differentially expressed genes than with the former.

      How do I interpret the results of the two approaches biologically?

      • #4
        You'll want to use the ~condition*genotype model and extract the interaction term from it. That will directly address your question. The second method is statistically invalid and does not answer any biological question.
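
One way to see why comparing the two gene lists can mislead is a small simulation (a minimal sketch, not taken from the thread). It simulates a gene whose treatment effect is identical in both genotypes, then checks how often the gene lands in only one of the two per-genotype lists versus how often the interaction term is called significant. Plain normal data with t.test() and lm() stand in for the count models in DESeq2, and the replicate number, effect size and 0.05 cutoff are arbitrary assumptions.

```r
set.seed(1)
n_sim  <- 2000  # number of simulated genes (assumption)
n_rep  <- 3     # replicates per condition:genotype cell (assumption)
effect <- 3     # treatment effect, identical in genotypes A and B

simulate_gene <- function() {
  d <- expand.grid(replicate_id = seq_len(n_rep),
                   condition = c("Ctr", "Treat"),
                   genotype = c("A", "B"))
  d$y <- rnorm(nrow(d), mean = ifelse(d$condition == "Treat", effect, 0))
  # Approach 2 from the thread: one treatment-vs-control test per genotype
  p_a <- t.test(y ~ condition, data = d[d$genotype == "A", ])$p.value
  p_b <- t.test(y ~ condition, data = d[d$genotype == "B", ])$p.value
  # Factorial model: test the interaction term directly
  fit <- lm(y ~ condition * genotype, data = d)
  p_int <- coef(summary(fit))["conditionTreat:genotypeB", "Pr(>|t|)"]
  c(only_one_list = (p_a < 0.05) != (p_b < 0.05),
    interaction   = p_int < 0.05)
}

# Fraction of simulated genes flagged by each approach
rates <- rowMeans(replicate(n_sim, simulate_gene()))
print(rates)
```

Under these assumptions the gene responds identically in A and B, yet it ends up in only one of the two lists far more often than the interaction test rejects. A difference between lists therefore says little about a difference in response.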

        • #5
          Thank you, I would like to know why the second option is statistically incorrect.

          And for sanity could you or someone confirm that this is the correct code in R (DESeq2 package):

          Code:
          countData=read.table(file.choose(),header=TRUE,row.names=1,sep="\t")
          condition=rep(rep(c("Ctr","Treat"),each=3),4)
          genotype=rep(rep(c("A","B"),each=6),2)
          colData=data.frame(condition,genotype,row.names=names(countData))
          dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData,design = ~ condition+genotype)
          ddsM=dds
          design(ddsM) <- formula(~condition*genotype)
          ddsM <- DESeq(ddsM)
          resM <- results(ddsM)
          ddsMN=ddsM
          ddsMN = estimateSizeFactors(ddsM)
          ddsMN = estimateDispersions(ddsM)
          resGenotype=nbinomLRT(ddsMN, full = design(ddsM), reduced =  formula(~condition*genotype), maxit = 1000)
          resGenotype <- results(resMTolFact, contrast=c("genotype","A","B"))

          • #6
            By intersecting lists you're not measuring anything, just asking, "if I take two sets of results and intersect them, what do I get?" That's not answering the question, "what's the additional effect of condition on genotype B?".

            Regarding the code, you can skip the LRT stuff and just use "results(ddsM)" and specify the interaction coefficient. For the actual LRT test, you're using the same formula for the full and reduced design. You want just "reduced = ~condition+genotype". "resMTolFact" is never defined. You can skip all of the "ddsMN" definition stuff.
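
To make the point about the reduced formula concrete, here is a minimal sketch that uses base R's glm() as a stand-in for DESeq2 (an assumption for illustration only). The sample layout follows the code in #5; the counts for the single gene are simulated Poisson data, and lrt() is a hypothetical helper. A likelihood ratio test compares a full model against a reduced one, so if both formulas are the same there is nothing left to test.

```r
set.seed(1)
# Same sample layout as in post #5: 24 samples, 6 per condition:genotype cell
condition <- factor(rep(rep(c("Ctr", "Treat"), each = 3), 4))
genotype  <- factor(rep(rep(c("A", "B"), each = 6), 2))

# Simulated counts for one gene: treatment doubles expression only in genotype B
mu <- 100 * ifelse(condition == "Treat" & genotype == "B", 2, 1)
y  <- rpois(length(mu), mu)

full    <- glm(y ~ condition * genotype, family = poisson)
reduced <- glm(y ~ condition + genotype, family = poisson)  # drops the interaction
same    <- glm(y ~ condition * genotype, family = poisson)  # the mistake: reduced == full

# Hypothetical helper: likelihood ratio test from the deviance difference
lrt <- function(full, reduced) {
  stat <- deviance(reduced) - deviance(full)
  df   <- df.residual(reduced) - df.residual(full)
  c(statistic = stat, df = df)
}

lrt_ok   <- lrt(full, reduced)  # 1 degree of freedom: the interaction term
lrt_same <- lrt(full, same)     # 0 degrees of freedom: nothing is being tested
p_ok <- pchisq(lrt_ok[["statistic"]], df = lrt_ok[["df"]], lower.tail = FALSE)
print(lrt_ok)
print(lrt_same)
```

With reduced = ~condition+genotype, the only term that differs between the two models is the interaction, which is the quantity of interest here.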

            • #7
              Hi Devon,

              Thank you for replying. If I understand correctly, I should avoid the LRT? I could use the interaction term like so:
              resGenotype <- results(ddsM, contrast=list("conditionTreat.genotypeB")) ??

              This raises a new question. If I define my condition variable as:

              Code:
               condition=rep(rep(c("A.ctl","A.treat","B.ctl","B.treat"),each=3),2)
              could the question of how A and B respond differently to the treatment also be answered like this?

              Code:
              AB <- DESeqDataSetFromMatrix(countData = countData,colData = colDataAB, design = ~ condition)
              ABpairwise <- DESeq(AB)
              resAB <- results(ABpairwise, contrast = c("condition", "A.treat", "B.treat"))

              • #8
                Code:
                results(ABpairwise, contrast=c(1, -1, -1, 1))
                You want the additional effect due to the interaction of treatment and genotype B. In other words, you need to take B.treat - B.ctl - (A.treat - A.ctl) (represented by a vector of "1, -1, -1, 1").

                Regarding the LRT, you can do that too, but you'd need to be a bit more comfortable with what that's actually testing.
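
The arithmetic behind the "1, -1, -1, 1" vector can be checked with a minimal base R sketch (lm() on simulated normal data as a stand-in for DESeq2; the data and variable names are assumptions). The weights are applied to the four group means in the order A.ctl, A.treat, B.ctl, B.treat, and the result is compared with the interaction coefficient of the factorial model.

```r
set.seed(1)
group <- factor(rep(c("A.ctl", "A.treat", "B.ctl", "B.treat"), each = 3))
# Simulated expression values; the group means are arbitrary assumptions
y <- rnorm(length(group), mean = c(0, 1, 0.5, 3)[as.integer(group)])

# One-factor "group" design without intercept: one coefficient per group mean
fit_group <- lm(y ~ 0 + group)
w <- c(1, -1, -1, 1)  # (B.treat - B.ctl) - (A.treat - A.ctl)
from_groups <- sum(w * coef(fit_group))

# Equivalent factorial design: split the group label back into two factors
genotype  <- factor(sub("\\..*", "", group))  # "A" / "B"
condition <- factor(sub(".*\\.", "", group))  # "ctl" / "treat"
fit_factorial <- lm(y ~ condition * genotype)
from_factorial <- coef(fit_factorial)[["conditiontreat:genotypeB"]]

print(c(from_groups = from_groups, from_factorial = from_factorial))
```

Both numbers agree, which is the sense in which the grouped and factorial designs give the same answer. One caveat: in DESeq2 a numeric contrast needs one weight per coefficient listed by resultsNames(), so when the design includes an intercept the vector has to be matched to that list before reusing these numbers.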
