Hi all,
I've analysed two pairs of biological replicates with three programs: Cuffdiff, DESeq and edgeR. At the moment I'm just looking at gene-level expression.
One would generally expect that, in an ideal world, there would be no significant differential expression between replicate samples and that the predictions of most packages would overlap substantially. For DESeq and edgeR this is indeed the case. Yet Cuffdiff seems to stand out from the crowd.
Let's call the replicates A1, A2 and B1, B2. Below is the number of predictions on which two packages agree when each pair of replicates is compared (in an 'ideal' world, 100% would agree).
Cufflinks vs DESeq:
A1-A2: 61% (4262/6931) predictions agree
B1-B2: 54% (3712/6931) predictions agree
DESeq vs edgeR:
A1-A2 88% (6360/7220) predictions agree
B1-B2: 83% (5983/7220) predictions agree
Cufflinks vs edgeR:
A1-A2: 59% (5921/9987) predictions agree
B1-B2: 48% (4753/9987) predictions agree
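For anyone who wants to check or reproduce figures like these, here is a minimal sketch of one way an agreement percentage could be computed. It assumes each package's output has been reduced to a mapping from gene ID to a significant/not-significant call, and that "agreement" means both packages make the same call on a gene they both report. The function names and the toy gene IDs are hypothetical and not taken from any of the packages.

```python
def agreement(calls_a, calls_b):
    """Count agreeing calls between two {gene_id: is_significant} dicts.

    Only genes reported by both packages are compared (an assumption;
    other definitions of agreement are possible).
    Returns (n_agree, n_shared).
    """
    shared = calls_a.keys() & calls_b.keys()
    n_agree = sum(1 for gene in shared if calls_a[gene] == calls_b[gene])
    return n_agree, len(shared)


def percent(n_agree, n_total):
    """Express an agreement count as a percentage of the total."""
    return 100.0 * n_agree / n_total


# Toy data, purely illustrative: True means "called differentially expressed".
cuffdiff_calls = {"g1": True, "g2": False, "g3": False, "g4": True}
deseq_calls = {"g1": False, "g2": False, "g3": False, "g5": True}

n_agree, n_shared = agreement(cuffdiff_calls, deseq_calls)
print(f"{n_agree}/{n_shared} shared genes agree ({percent(n_agree, n_shared):.1f}%)")

# The same helper can be applied to the counts quoted in this post.
print(f"DESeq vs edgeR, A1-A2: {percent(6360, 7220):.1f}%")
```

Restricting the comparison to shared genes matters here because the denominators differ between package pairs (6931, 7220, 9987), which suggests each pair is being compared over a different gene set.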
Reads were obtained from a single-end Illumina 76bp run and mapped with TopHat. DESeq and edgeR require raw count values, so these were extracted from the TopHat SAM file with HTSeq and a well-annotated GFF3 file from the Broad Institute. Cufflinks was run on each of the biological replicates, followed by cuffcompare and cuffdiff on the resulting GTF files.
Has anyone seen this sort of discrepancy before and (more to the point) am I doing something criminally wrong when performing the Cuffdiff analysis?
Any help would be much appreciated!