Hi all, I'm using DESeq to identify differentially expressed genes in some human cancer samples. I'm using eXpress to estimate isoform abundance against the Gencode v14 transcriptome, then summing the eXpress effective counts over all transcripts of a gene to obtain gene-level counts. These are my input to DESeq (after rounding to integers). I've noticed that the fit from the estimateDispersions function is poor for my data (see attached). I'm looking at several different designs (i.e. different groups of samples based on different biological questions) and see similarly poor fits in each.
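To make the aggregation step concrete, here is a minimal Python sketch of summing per-transcript effective counts into gene-level counts and rounding once at the end. It is illustrative only: the function name, the transcript-to-gene mapping, and the toy values are hypothetical and are not taken from the actual pipeline or from eXpress itself.

```python
from collections import defaultdict


def gene_level_counts(transcript_rows, transcript_to_gene):
    """Sum per-transcript effective counts per gene, then round to integers.

    transcript_rows: iterable of (transcript_id, effective_count) pairs.
    transcript_to_gene: dict mapping transcript_id -> gene_id (hypothetical
    input; in practice it would be derived from the annotation).
    """
    totals = defaultdict(float)
    for transcript_id, eff_count in transcript_rows:
        gene_id = transcript_to_gene[transcript_id]
        totals[gene_id] += eff_count
    # Round only after summing, so per-transcript rounding errors
    # do not accumulate in the gene-level total.
    return {gene: int(round(total)) for gene, total in totals.items()}


# Toy values, made up for illustration.
rows = [("tx1", 10.4), ("tx2", 5.3), ("tx3", 0.2)]
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
counts = gene_level_counts(rows, tx2gene)
print(counts)
```

Rounding after summing, rather than per transcript, is the assumption made here; the resulting integers are what would be passed to DESeq as the count matrix.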
For instance, the first fit I've attached is for 56 samples from three cancer types. 25 of these samples have a particular mutation and the other 31 do not. In this case, I'm including the cancer type as a covariate and testing for differences between the mutants and non-mutants. The second fit I've attached is for 26 samples, 14 in one group and 12 in the other.
In both cases I'm using the "gene-est-only" parameter for estimateDispersions because, according to the DESeq manual, I have enough samples for this approach. I realize that the fit is therefore not affecting my downstream results. However, I'm wondering why the fit is so bad and whether this is indicative of any problems. I would use DESeq2, but it seems that the gene-est-only option is no longer available, and I'm wondering if there is a reason for that.
Thanks for any help in advance.