lran2008 |
06-08-2015 05:04 AM |
DESeq2 experimental design question
Hi,
I am having some problems with my DEseq2 experimental design (RNA-seq raw read counts). Basically, I have two treatment (sf, ls). Within each treatment, there are six cows with 3 time points(14,7 and 28). Time point 14 is the control. I want to find DE genes, 28 vs 14, 7 vs 14.
I first tried a design: ~ treat + treat:cowf + treat:day
Code:
cow treat day cowf
14LS2 2 ls 14 1
14LS63 63 ls 14 2
14LS66 66 ls 14 3
14LS67 67 ls 14 4
14LS73 73 ls 14 5
14LS74 74 ls 14 6
28LS2 2 ls 28 1
28LS63 63 ls 28 2
28LS66 66 ls 28 3
28LS67 67 ls 28 4
28LS73 73 ls 28 5
28LS74 74 ls 28 6
7LS2 2 ls 7 1
7LS63 63 ls 7 2
7LS66 66 ls 7 3
7LS67 67 ls 7 4
7LS73 73 ls 7 5
7LS74 74 ls 7 6
14SF5355 5355 sf 14 1
14SF61 61 sf 14 2
14SF62 62 sf 14 3
14SF70 70 sf 14 4
14SF71 71 sf 14 5
14SF72 72 sf 14 6
28SF5355 5355 sf 28 1
28SF61 61 sf 28 2
28SF62 62 sf 28 3
28SF70 70 sf 28 4
28SF71 71 sf 28 5
28SF72 72 sf 28 6
7SF5355 5355 sf 7 1
7SF61 61 sf 7 2
7SF62 62 sf 7 3
7SF70 70 sf 7 4
7SF71 71 sf 7 5
7SF72 72 sf 7 6
Here is the script to find DE genes of 28 vs 14 in ls treatment:
Code:
countdata <- read.csv("readcounts_matrix.csv", sep=",",stringsAsFactors=FALSE, row.names=1)
coldata <- read.csv("coldata.csv", sep=",",stringsAsFactors=TRUE, row.names=1)
coldata$cow <- factor(coldata$cow)
coldata$day <- factor(coldata$day)
coldata$cowf <- factor(rep(c(1,2,3,4,5,6),times=6))
coldata$day <-relevel(coldata$day, "14")
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ treat + treat:cowf + treat:day)
dds <- estimateSizeFactors(dds)
nc <- counts(dds, normalized=TRUE)
filter <- rowSums(nc >= 10) >= 12
dds <- dds[filter,]
dds <- DESeq(dds,modelMatrixType="standard")
ls_d28vsd14 <- results(dds, name = "treatls.day28")
I am not sure if the above design is correct. Since the two treatment are independent, i used data from each treatment separately for DE analysis, which is easier as I can remove the treatment factor.
Below is for ls treatment with design ~cow + day
Code:
cow treat day cowf
14LS2 2 ls 14 1
14LS63 63 ls 14 2
14LS66 66 ls 14 3
14LS67 67 ls 14 4
14LS73 73 ls 14 5
14LS74 74 ls 14 6
28LS2 2 ls 28 1
28LS63 63 ls 28 2
28LS66 66 ls 28 3
28LS67 67 ls 28 4
28LS73 73 ls 28 5
28LS74 74 ls 28 6
7LS2 2 ls 7 1
7LS63 63 ls 7 2
7LS66 66 ls 7 3
7LS67 67 ls 7 4
7LS73 73 ls 7 5
7LS74 74 ls 7 6
Code:
ls_countdata = countdata[,1:18]
ls_coldata = coldata[1:18,]
ls_dds <- DESeqDataSetFromMatrix(countData = ls_countdata, colData = ls_coldata, design = ~ cow + day)
ls_dds <- estimateSizeFactors(ls_dds)
nc <- counts(ls_dds, normalized=TRUE)
filter <- rowSums(nc >= 10) >= 6
ls_dds <- ls_dds[filter,]
ls_dds$day <- relevel(ls_dds$day, "14")
ls_dds <- DESeq(ls_dds)
ls_res_28vs14 <- results(ls_dds,contrast=c("day", "28","14"))
However, results from the two methods above produced different results. Could anybody explain me which one I should use? Thanks
|