Hello,
Please note that I have posted this question on another website, but I did not receive any advice. I would really appreciate some help, and apologize if this is against the rules. I can delete my post if needed.
I am using DESeq2 and I am a beginner with model matrices and contrasts. I have a design with three variables: tissue, time and phenotype. Tissue has three levels (X, Y and Z, where X is the reference level), time has three levels (12h, 24h and 48h, where 12h is the reference level), and phenotype has three levels (ctrl, A and B, where ctrl is the reference level). I have added the dput of the design table below my post (the row names are just the sample IDs). I am using the following formula to include all variables and interaction terms in the dds object:
~ tissue * time * phenotype
I am doing this because I am developing a function that runs DESeq2 with a special model matrix and returns a dds object. My goal is to generate a single dds object and then use contrasts to retrieve differential gene expression for any combination of these variables. For example, I might want differential gene expression for (tissue Y time 48h phenotype B) vs (tissue Y time 48h phenotype ctrl), or for (tissue X time 24h phenotype A) vs (tissue X time 24h phenotype ctrl). Please note that I DO know how to run DESeq2 for simple pairwise comparisons, but that is not the goal of this post. The question is not about choosing the appropriate biological question, design, or formula. It is about designing a function that can handle a request like the two examples above.
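To make the request concrete, here is a minimal sketch of the kind of helper described above, using base R only. It rebuilds a 3 x 3 x 3 design with three replicates per cell, sets the reference levels explicitly (the dput lists the phenotype levels as A, B, ctrl, so ctrl is assumed to have been releveled), and turns a "group vs group" request into a numeric contrast by subtracting the mean model-matrix rows of the two groups. The names `make_contrast`, `numerator` and `denominator` are hypothetical and not part of DESeq2.

```r
# Rebuild the design: 3 tissues x 3 times x 3 phenotypes x 3 replicates = 81 samples.
design <- expand.grid(
  replicate = 1:3,
  phenotype = c("ctrl", "A", "B"),
  time      = c("12h", "24h", "48h"),
  tissue    = c("X", "Y", "Z"),
  stringsAsFactors = FALSE
)
# Assumption: X, 12h and ctrl are the reference levels, so they are listed first.
design$tissue    <- factor(design$tissue,    levels = c("X", "Y", "Z"))
design$time      <- factor(design$time,      levels = c("12h", "24h", "48h"))
design$phenotype <- factor(design$phenotype, levels = c("ctrl", "A", "B"))

# Full model matrix with all main effects and interactions (27 columns).
mm <- model.matrix(~ tissue * time * phenotype, data = design)

# Hypothetical helper: numeric contrast for "numerator group vs denominator group".
# Each group is a named list such as list(tissue = "Y", time = "48h", phenotype = "B").
make_contrast <- function(mm, design, numerator, denominator) {
  group_mean <- function(group) {
    keep <- rep(TRUE, nrow(design))
    for (v in names(group)) {
      keep <- keep & design[[v]] == group[[v]]
    }
    stopifnot(any(keep))                 # the requested group must exist
    colMeans(mm[keep, , drop = FALSE])   # average model-matrix row of the group
  }
  group_mean(numerator) - group_mean(denominator)
}

ctr_Y48_B <- make_contrast(
  mm, design,
  numerator   = list(tissue = "Y", time = "48h", phenotype = "B"),
  denominator = list(tissue = "Y", time = "48h", phenotype = "ctrl")
)
ctr_X24_A <- make_contrast(
  mm, design,
  numerator   = list(tissue = "X", time = "24h", phenotype = "A"),
  denominator = list(tissue = "X", time = "24h", phenotype = "ctrl")
)

# Coefficients that enter each contrast with a non-zero weight.
print(names(ctr_Y48_B)[abs(ctr_Y48_B) > 1e-8])
print(names(ctr_X24_A)[abs(ctr_X24_A) > 1e-8])

# With DESeq2 (not run here), such a vector could presumably be passed as
#   results(dds, contrast = ctr_Y48_B)
# provided its length and order match resultsNames(dds), intercept included.
```

The idea is that the function never needs to know which levels are reference levels: the difference of the two mean rows drops every coefficient the groups share and keeps only the ones that distinguish them.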
So right now with this formula the resultsNames(dds) are:
tissueY
tissueZ
time24h
time48h
phenotypeA
phenotypeB
tissueY:time24h
tissueZ:time24h
tissueY:time48h
tissueZ:time48h
tissueY:phenotypeA
tissueZ:phenotypeA
tissueY:phenotypeB
tissueZ:phenotypeB
time24h:phenotypeA
time48h:phenotypeA
time24h:phenotypeB
time48h:phenotypeB
tissueY:time24h:phenotypeA
tissueZ:time24h:phenotypeA
tissueY:time48h:phenotypeA
tissueZ:time48h:phenotypeA
tissueY:time24h:phenotypeB
tissueZ:time24h:phenotypeB
tissueY:time48h:phenotypeB
tissueZ:time48h:phenotypeB
To extract differential expression between, for example, (tissue Y time 48h phenotype B) and (tissue Y time 48h phenotype ctrl), what would be the correct contrast? And what would it become if we are interested in one of the reference levels, for example differential expression between (tissue X time 24h phenotype A) and (tissue X time 24h phenotype ctrl), where tissue X was set as the reference?
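For reference, here is how the two comparisons would expand under standard treatment coding with the reference levels stated above. This assumes the special model matrix is equivalent to the one produced by the formula, and it uses the coefficient names from the list. Terms shared by both groups cancel, leaving only the phenotype terms and their interactions:

```latex
\begin{aligned}
\log_2 \mathrm{FC}(\text{Y, 48h: B vs ctrl}) &=
  \beta_{\text{phenotypeB}}
  + \beta_{\text{tissueY:phenotypeB}}
  + \beta_{\text{time48h:phenotypeB}}
  + \beta_{\text{tissueY:time48h:phenotypeB}} \\
\log_2 \mathrm{FC}(\text{X, 24h: A vs ctrl}) &=
  \beta_{\text{phenotypeA}}
  + \beta_{\text{time24h:phenotypeA}}
\end{aligned}
```

In the second case, tissue X is the reference level, so no tissue coefficient appears at all.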
Thank you for your help!
The design dput:
Code:
dput(design)
structure(list(tissue = structure(c(
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
.Label = c("X", "Y", "Z"), class = "factor"),
time = structure(c(
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
.Label = c("12h", "24h", "48h"), class = "factor"),
phenotype = structure(c(
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L),
.Label = c("A", "B", "ctrl"), class = "factor")),
class = "data.frame", row.names = c(
"s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9",
"s10", "s11", "s12", "s13", "s14", "s15", "s16", "s17", "s18",
"s19", "s20", "s21", "s22", "s23", "s24", "s25", "s26", "s27",
"s28", "s29", "s30", "s31", "s32", "s33", "s34", "s35", "s36",
"s37", "s38", "s39", "s40", "s41", "s42", "s43", "s44", "s45",
"s46", "s47", "s48", "s49", "s50", "s51", "s52", "s53", "s54",
"s55", "s56", "s57", "s58", "s59", "s60", "s61", "s62", "s63",
"s64", "s65", "s66", "s67", "s68", "s69", "s70", "s71", "s72",
"s73", "s74", "s75", "s76", "s77", "s78", "s79", "s80", "s81"))