View Single Post
Old 09-16-2015, 07:15 AM   #3
EVELE
Junior Member
 
Location: France

Join Date: Jun 2015
Posts: 9
Default

It is not the same.
I can see differences when I apply both ways to my data.

Common part of code:
> times=factor(c(1,rep(2,2),rep(3,2),0,1,rep(2,2),rep(3,2),0), levels=c(0,1,2,3))
> condition = factor(c(rep("factors", 6), rep("controls",6)), levels=c("controls","factors"))

> colData = data.frame( condition=condition, times=times)
> rownames(colData)=colnames(duo)

> dds<-DESeqDataSetFromMatrix(countData= as.matrix(duo), colData= colData ,design=formula(~condition+times+condition:times))
> dds <- DESeq(dds, test="LRT", reduced = ~ condition + times)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

> resultsNames(dds)
[1] "Intercept" "condition_factors_vs_controls"
[3] "times_1_vs_0" "times_2_vs_0"
[5] "times_3_vs_0" "conditionfactors.times1"
[7] "conditionfactors.times2" "conditionfactors.times3"


The way without test="Wald"

> res1 <- results(dds, name="conditionfactors.times1")
> res2 <- results(dds, name="conditionfactors.times2")
> res3 <- results(dds, name="conditionfactors.times3")


# Here I filter padj <= 0.05
> res1 <- res1[!is.na(res1$padj) & res1$padj <=0.05,]
> res2 <- res2[!is.na(res2$padj) & res2$padj <=0.05,]
> res3 <- res3[!is.na(res3$padj) & res3$padj <=0.05,]

For the gene PTET.51.1.G0050374 I obtain:

> res1["PTET.51.1.G0050374",]

log2 fold change (MLE): conditionfactors.times1
LRT p-value: '~ condition + times + condition:times' vs '~ condition + times'
DataFrame with 1 row and 6 columns
baseMean log2FoldChange lfcSE stat pvalue
<numeric> <numeric> <numeric> <numeric> <numeric>
PTET.51.1.G0050374 481.4894 5.916279 0.5890732 140.9655 2.340315e-30
padj
<numeric>
PTET.51.1.G0050374 6.717171e-26



The way with test="Wald"

> res1W <- results(dds, name="conditionfactors.times1",test="Wald")
> res2W <- results(dds, name="conditionfactors.times2",test="Wald")
> res3W <- results(dds, name="conditionfactors.times3",test="Wald")

> res1W <- res1W[!is.na(res1W$padj) & res1W$padj <=0.05,]
> res2W <- res2W[!is.na(res2W$padj) & res2W$padj <=0.05,]
> res3W <- res3W[!is.na(res3W$padj) & res3W$padj <=0.05,]

For the same gene I obtain different values in stat pvalue and padj columns

> res1W["PTET.51.1.G0050374",]

log2 fold change (MLE): conditionfactors.times1
Wald test p-value: conditionfactors.times1
DataFrame with 1 row and 6 columns
baseMean log2FoldChange lfcSE stat pvalue
<numeric> <numeric> <numeric> <numeric> <numeric>
PTET.51.1.G0050374 481.4894 5.916279 0.5890732 10.04337 9.826083e-24
padj
<numeric>
PTET.51.1.G0050374 1.210279e-19

Furthermore, the number of genes (after applying the same filter) vary between res1W, res2W and res3W
> dim(res1W)[1]
[1] 46
> dim(res2W)[1]
[1] 2
> dim(res3W)[1]
[1] 345


while the numbers are the same for res1, res2, res3 (without test="Wald")

> dim(res1)[1]
[1] 780
> dim(res2)[1]
[1] 780
> dim(res3)[1]
[1] 780

I cannot see what's happening here, but I'd like to know which of the 2 methods I should use in order to obtain all genes having a time-specific differential expression and a padj value <=0.05 in my time-course experiment.

Thank you
EVELE is offline   Reply With Quote