SEQanswers

Go Back   SEQanswers > Applications Forums > RNA Sequencing



Similar Threads
Thread Thread Starter Forum Replies Last Post
DESeq2 - Pairwise comparisons in multifactor experiments JakobW RNA Sequencing 0 01-25-2016 07:33 AM
Deseq2 multifactor analysis genomica Bioinformatics 6 09-26-2015 01:03 AM
grouped vs pairwise comparison for DE analysis NitaC Bioinformatics 4 06-18-2013 10:04 AM

Reply
 
Thread Tools
Old 04-27-2016, 04:50 AM   #1
illinu
Member
 
Location: US

Join Date: Jul 2013
Posts: 55
Default multifactor analysis or pairwise comparison?

I can't find a simple answer to what's the difference between doing multifactor analysis or pairwise comparisons, example:

I have two factors to consider: condition (treatment or control) and genotype (A and B). If I do
1. a pairwise comparison between control and treatment across both genotypes and
2. a pairwise comparison between treatment and control in genotype A and treatment and control in genotype B
I expect the intersection of these two pairwise comparisons to give the same results as if I do mutlifactor analysis taking into account both factors: condition and genotype (in DESeq2 something like design(ddsM) <- formula(~condition*genotype)) but I don't

What's in this case the biological interpretation of both analyses?
illinu is offline   Reply With Quote
Old 04-27-2016, 05:29 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

Your wording is a little confused so it's hard to tell exactly what you're doing in method (1) and (2). How about you just tell us the biological question you want to answer and then one of us can then just tell you the simplest way to do it.

In general, you can get the same answer treating each condition:genotype combination as a separate group and doing comparisons as if you use a more classical factorial design (e.g., "~condition*genotype"). Some questions are simpler to answer with one design vs. the other.

I would strongly encourage you not to do multiple comparisons and then take the intersection of the results. You're absolutely killing your statistical power when you're doing that. You also then lack a fold-change or p-value, which you kind of need to both publishing and follow-up experiments.
dpryan is offline   Reply With Quote
Old 04-27-2016, 06:07 AM   #3
illinu
Member
 
Location: US

Join Date: Jul 2013
Posts: 55
Default

My biological question is how do genotype A and B respond differently to the treatment. I need to take both factors into account. I wan't to know what's the difference between doing multifactor (condition*genotype) or treat vs control in A and treat vs ctl in B and compare the list of genes. Basically if I do the latter I get more genes differentially expressed than in the former.

How do I interepret biologically the results of the two approaches?
illinu is offline   Reply With Quote
Old 04-27-2016, 06:11 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

You'll want to use the ~condition*genotype model and extract the interaction term from it. That will directly address your question. The second method is statistically invalid and not answering any biological question.
dpryan is offline   Reply With Quote
Old 04-27-2016, 06:51 AM   #5
illinu
Member
 
Location: US

Join Date: Jul 2013
Posts: 55
Default

Thank you, I would like to know why the second option is statistically incorrect.

And for sanity could you or someone confirm that this is the correct code in R (DESeq2 package):

Code:
countData=read.table(file.choose(),header=TRUE,row.names=1,sep="\t")
condition=rep(rep(c("Ctr","Treat"),each=3),4)
genotype=rep(rep(c("A","B"),each=6),2)
colData=data.frame(condition,genotype,row.names=names(countData))
dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData,design = ~ condition+genotype)
ddsM=dds
design(ddsM) <- formula(~condition*genotype)
ddsM <- DESeq(ddsM)
resM <- results(ddsM)
ddsMN=ddsM
ddsMN = estimateSizeFactors(ddsM)
ddsMN = estimateDispersions(ddsM)
resGenotype=nbinomLRT(ddsMN, full = design(ddsM), reduced =  formula(~condition*genotype), maxit = 1000)
resGenotype <- results(resMTolFact, contrast=c("genotype","A","B"))
illinu is offline   Reply With Quote
Old 04-27-2016, 11:03 PM   #6
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

By intersecting lists you're not measuring anything, just asking, "if I take two sets of results and intersect them, what do I get?" That's not answering the question, "what's the additional effect of condition on genotype B?".

Regarding the code, you can skip the LRT stuff and just use "results(ddsM)" and specify the interaction coefficient. For the actual LRT test, you're using the same formula for the full and reduced design. You want just "reduced = ~condition+genotype". "resMTolFact" is never defined. You can skip all of the "ddsMN" definition stuff.
dpryan is offline   Reply With Quote
Old 04-28-2016, 05:00 AM   #7
illinu
Member
 
Location: US

Join Date: Jul 2013
Posts: 55
Default

Hi Devon,

Thank you for replying. If I understand correct I should avoid the LRT test? I could use the interaction term like so:
resGenotype <- results(ddsM, contrast=list("conditionTreat.genotypeB")) ??

This arises a new question, if I define my conditions variable as:

Code:
 condition=rep(rep(c("A.ctl","A.treat","B.ctl","B.treat"),each=3),2)
To the question, how do A and B respond differently to the treatment could also be answered like this?

Code:
AB <- DESeqDataSetFromMatrix(countData = countData,colData = colDataAB, design = ~ condition)
ABpairwise <- DESeq(AB)
resAB <- results(ABpairwise, contrast = c("condition", "A.treat", "B.treat"))
illinu is offline   Reply With Quote
Old 04-28-2016, 05:38 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

Code:
results(ABpairwise, contrast=c(1, -1, -1, 1))
You want the additional effect due to the interaction of treatment and genotype B. In other words, you need to take B.treat - B.ctl - (A.treat - A.ctl) (represented by a vector of "1, -1, -1, 1").

Regarding the LRT, you can do that too, but you'd need to be a bit more comfortable with what that's actually testing.
dpryan is offline   Reply With Quote
Reply

Tags
biological interpretation, comparison, deseq2, multifactor analysis, pairwise

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 06:23 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO