Dear community,
I have read several explanations of how to use DESeq2 for DGE, but I am entirely new to this as well as to R, and I am struggling to understand the different options well enough to be sure that my analysis is designed correctly to answer the biological questions of interest.
Briefly, I have tested a treatment over several time points and I have 2 biological replicates for each time point (0, 1 and 14 days) and treatment (SED, CON):
SED: T1(x2), T14(x2)
CON: T0(x2), T1(x2), T14(x2)
I do not have T0 samples for the treatment because I took the baseline control samples from each of the 2 replicates then initiated the treatment on "treatment" organisms.
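To make the layout concrete, here is a minimal base-R sketch that tabulates the samples per treatment and time point. The data frame `samples` is a hypothetical stand-in that hardcodes the layout listed above; it is not the real sample sheet.

```r
# Hypothetical sample sheet mirroring the layout above (2 replicates per cell).
samples <- data.frame(
  treatment = factor(c(rep("CON", 6), rep("SED", 4)), levels = c("CON", "SED")),
  time      = factor(c(0, 0, 1, 1, 14, 14, 1, 1, 14, 14), levels = c(0, 1, 14))
)

# Count replicates in each treatment x time cell.
layout <- table(samples$treatment, samples$time)
print(layout)

# List the cells that contain no samples at all.
empty <- which(layout == 0, arr.ind = TRUE)
print(empty)
```

The single empty cell (SED at T0) is worth keeping in mind for any design that includes a treatment-by-time interaction.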
I would like to answer the following questions:
1) Are there genes that typically respond to the treatment, irrespective of how long the treatment is applied (i.e. that vary generally with the treatment, regardless of time), and if so, which genes are they?
2) Which genes are differentially expressed after 24 h of treatment? After 2 weeks?
3) Is there an effect of time (e.g. does the expression of genes affected by the treatment increase over time)?
I have seen the comments in this thread, https://support.bioconductor.org/p/61801/, but, as I mentioned, I am a bit confused by all the different arguments.
Following the DESeq2 manual, I attempted to set up my design as a time series as follows:
Code:
dds <- DESeqDataSetFromMatrix(countData = countTable, colData = coldata, design = ~ time + treatment + time:treatment)
but I got an error message: "error: inv(): matrix appears to be singular".
I therefore tried (successfully) the following:
Code:
dds <- DESeqDataSetFromMatrix(countData = countTable, colData = coldata, design = ~ time + treatment)
dds <- DESeq(dds)
If I understand things correctly, this design will tell me which genes are differentially expressed between treatments across all time points in general; is this correct? However, from what I have read in the documentation, the questions I am asking of these data would probably best be answered using a likelihood ratio test. So this is where I start getting really confused...
So, how would I obtain differential expression results for the following tests:
SED 24h vs CON 24h; SED 2wks vs CON 2wks? SED vs CON (all time points together)?
And is it possible to compare the evolution in expression of certain genes to see if they fit a regression model (e.g. increase in expression of treatment-affected genes over time)?
I have tried the following (I list all steps for clarity):
Code:
> countTable <- read.table("141029_eXpress_ndndns_eX_uniq.csv", header=TRUE, row.names=1)
> coldata = data.frame(row.names=colnames(countTable),
+   treatment = as.factor(c("SED","SED","CON","CON","CON","SED","SED","CON","CON","CON")),
+   time = as.factor(c("1","14","0","1","14","1","14","0","1","14")),
+   colony = as.factor(c("A","A","A","A","A","H","H","H","H","H")))
> treatment = coldata$treatment
> time = coldata$time
> colony = coldata$colony
> time
[1] 1 14 0 1 14 1 14 0 1 14
Levels: 0 1 14
> class(time)
[1] "factor"
> treatment
[1] SED SED CON CON CON SED SED CON CON CON
Levels: CON SED
> class(treatment)
[1] "factor"
> library("DESeq2")
> dds <- DESeqDataSetFromMatrix(countData=countTable, colData=coldata, design = ~ colony + time + treatment)
> design(dds) = ~ colony + time + treatment + time:treatment
> dds = DESeq(dds, test="LRT", reduced = ~ time + treatment)
However, I get the same error message as above:
Code:
estimating size factors
estimating dispersions
gene-wise dispersion estimates
error: inv(): matrix appears to be singular
Error: inv(): matrix appears to be singular
Thank you very much in advance for your help! This is all very interesting, but the learning curve is quite steep!
Thanks,
Chris
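One plausible reading of the `inv(): matrix appears to be singular` error, offered as a sketch rather than a confirmed diagnosis: because there are no SED samples at T0, the design with the `time:treatment` interaction may contain more coefficients than the samples can distinguish. The base-R check below rebuilds the sample sheet from the post and compares the number of design-matrix columns with its rank. The combined `group` factor and the names `mm_full` and `mm_group` are illustrative assumptions, not part of the original analysis.

```r
# Sample sheet as listed in the post: 5 samples from colony A, then 5 from colony H.
coldata <- data.frame(
  treatment = factor(rep(c("SED", "SED", "CON", "CON", "CON"), 2)),
  time      = factor(rep(c("1", "14", "0", "1", "14"), 2), levels = c("0", "1", "14")),
  colony    = factor(rep(c("A", "H"), each = 5))
)

# Design with the interaction term, as attempted in the post.
mm_full <- model.matrix(~ colony + time + treatment + time:treatment, coldata)
rank_full <- qr(mm_full)$rank
cat("interaction design:", ncol(mm_full), "columns, rank", rank_full, "\n")

# Hypothetical alternative: one factor level per observed treatment/time cell.
coldata$group <- factor(paste0(coldata$treatment, "_T", coldata$time))
mm_group <- model.matrix(~ colony + group, coldata)
rank_group <- qr(mm_group)$rank
cat("group design:", ncol(mm_group), "columns, rank", rank_group, "\n")
```

A design is estimable only when its rank equals its number of columns. If the `group` design passes that check, a comparison such as SED vs CON at day 1 could then, assuming the standard DESeq2 `results()` interface, be requested with `results(dds, contrast = c("group", "SED_T1", "CON_T1"))` after fitting `~ colony + group`.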