Hi all,
I have just started using DESeq2 with a multi-factor design. As I'm new to this (and to statistics), I was hoping to get feedback / corrections on what I think the analysis should look like.
I have the following colData:
Code:
sample  genotype  treatment  library_batch
S1      WT        vehicle    B1
S2      WT        vehicle    B2
S3      WT        vehicle    B1
S4      WT        vehicle    B2
S5      WT        Drug       B2
S6      WT        Drug       B1
S7      WT        Drug       B2
S8      KO        vehicle    B1
S9      KO        vehicle    B2
S10     KO        vehicle    B1
S11     KO        vehicle    B2
S12     KO        Drug       B1
S13     KO        Drug       B2
S14     KO        Drug       B1
S15     KO        Drug       B2
Blocking for library batch effects, I am interested in finding the following DE genes:
Q1) DE genes between vehicle-treated "WT" and "KO" (i.e. the 'baseline' difference);
Q2) Genes that are DE following treatment within each genotype ("WT" and "KO");
Q3) Genes that show a genotype-dependent response to treatment.
Looking through the DESeq2 tutorials and forum posts, Q1 and Q2 would probably be most easily addressed by creating a combined 'genotype_treatment' factor and working with contrasts:
Code:
colData$condition <- paste(colData$genotype, colData$treatment, sep="_")
and using a design
Code:
design = ~ library_batch + condition
to extract results as:
Q1:
Code:
res1 <- results(dds, contrast=c("condition", "KO_vehicle", "WT_vehicle"))
Q2.1:
Code:
res2wt <- results(dds, contrast=c("condition", "WT_Drug", "WT_vehicle"))
Q2.2:
Code:
res2ko <- results(dds, contrast=c("condition", "KO_Drug", "KO_vehicle"))
As for Q3, I would then repeat the analysis including an interaction term:
Code:
design = ~ library_batch + genotype*treatment
to extract results as:
Code:
res3 <- results(dds, name="genotypeKO.treatmentvehicle")
Is this correct? I was wondering if there is a way to do this without running 2 separate analysis designs?
On a more general note, can anyone recommend a resource to get more familiar with these design formulas and what they 'mean'?
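To make the question about what the formulas 'mean' more concrete, here is a minimal sketch of how the interaction design could be inspected with base R's model.matrix(). It rebuilds the colData listed in this post and assumes "WT" and "vehicle" are the reference levels. That choice is my assumption; without explicit levels, R orders factor levels alphabetically. The names of the coefficients depend on this choice, so the name passed to results() is probably best checked against resultsNames(dds) rather than guessed.

```r
# colData rebuilt from the sample table in this post.
# ASSUMPTION: WT and vehicle are set as the reference (first) levels.
colData <- data.frame(
  genotype      = factor(rep(c("WT", "KO"), times = c(7, 8)),
                         levels = c("WT", "KO")),
  treatment     = factor(rep(c("vehicle", "Drug", "vehicle", "Drug"),
                             times = c(4, 3, 4, 4)),
                         levels = c("vehicle", "Drug")),
  library_batch = factor(c("B1", "B2", "B1", "B2", "B2", "B1", "B2",
                           "B1", "B2", "B1", "B2", "B1", "B2", "B1", "B2")),
  row.names     = paste0("S", 1:15)
)

# Design matrix of the interaction model: one column per fitted coefficient.
mm <- model.matrix(~ library_batch + genotype * treatment, data = colData)

# Coefficient names; the interaction column is built from the
# non-reference levels of genotype and treatment.
print(colnames(mm))

# Samples for which the interaction column equals 1.
interaction_col <- grep(":", colnames(mm), value = TRUE)
print(rownames(mm)[mm[, interaction_col] == 1])
```

With these reference levels the interaction column is non-zero only for the samples that are both KO and Drug-treated, which is how I currently read the interaction term: the extra treatment effect in KO beyond the treatment effect in WT.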