Hi all!
I am new to RNA-sequencing analysis and only started a few months ago. I have conducted an intergenerational mouse study to explore the effect of the father's diet on offspring gene expression. Given the nature of the experiment, I wouldn't expect a lot of gene expression differences, but I am hoping for some, especially since I observed some phenotypic differences.

I analyzed my data with three pipelines: Cufflinks (cuffdiff), RSEM + DESeq2, and HTSeq-count + DESeq2. I was advised to run all three as a way of validating my findings. Our plan is to look at the overlap between the results of all three analyses, then follow up on the shared genes with IPA to identify significantly enriched biological pathways. I have 3 diet groups that are being compared, and I am analyzing males and females separately as well as together, so I have a total of 18 different analyses.

The first issue is that cuffdiff reported more than 100 DE genes (q-value < 0.05), while the RSEM and HTSeq-count analyses reported far fewer, between 0 and 80 DE genes at adjusted p-value < 0.05. Is it normal for the results to vary that much between methods? I was under the impression that cuffdiff is more conservative than the other methods, but that is not the case here.

What we thought we could do is relax the cutoffs for the RSEM and HTSeq-count results until they give about the same number of significant genes as cuffdiff, and then look at the overlap. Is this a good approach? The problem is that I'm not sure which cutoff to choose: we have 6 analyses just for RSEM, for example, and a cutoff chosen based on one analysis produces a disproportionate increase in DE genes in another. Would it be better to just use another commonly used cutoff, like adjusted p-value < 0.1?

Sorry for all these questions! I am basically trying to make sense of 3 different types of analyses, and wondering whether it is even a good idea to use all 3. Any advice would be much appreciated! Thank you so much.
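To make the overlap idea concrete, here is a minimal sketch of what I have in mind for a single comparison. The gene IDs and adjusted p-values are made up purely for illustration, and the function names are my own, not part of any of the tools. It applies the same fixed cutoff to all three methods and keeps only the genes called significant by every one of them.

```python
# Hypothetical adjusted p-values (q-values) per gene for ONE comparison.
# Gene IDs and numbers are invented for illustration only.
ADJ_P = {
    "cuffdiff":     {"GeneA": 0.001, "GeneB": 0.03, "GeneC": 0.20, "GeneD": 0.04},
    "rsem_deseq2":  {"GeneA": 0.004, "GeneB": 0.08, "GeneC": 0.50, "GeneD": 0.02},
    "htseq_deseq2": {"GeneA": 0.010, "GeneB": 0.06, "GeneC": 0.70, "GeneD": 0.30},
}


def significant_genes(adj_p, cutoff):
    """Return the set of genes whose adjusted p-value is below the cutoff.

    Genes with a missing value (None) are skipped.
    """
    return {gene for gene, p in adj_p.items() if p is not None and p < cutoff}


def overlap_across_methods(results, cutoff):
    """Return genes that pass the same cutoff in every method."""
    per_method = [significant_genes(adj_p, cutoff) for adj_p in results.values()]
    return set.intersection(*per_method)


if __name__ == "__main__":
    # Compare the two fixed cutoffs mentioned above instead of tuning
    # a separate cutoff per method.
    for cutoff in (0.05, 0.1):
        shared = overlap_across_methods(ADJ_P, cutoff)
        print(f"cutoff {cutoff}: {sorted(shared)}")
```

With a fixed cutoff like this, the number of genes per method can differ, but every analysis is held to the same error rate, which is what I am unsure about giving up if I tune the cutoffs to match cuffdiff.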
Best,
Julia