Hi,
I have been running Cufflinks v1.1.0 on mRNA-Seq data from some old (2008) Illumina runs with 36 bp reads. The only options I specified were -I 5000 and -b <refSeqFasta>; no reference GFF was supplied. In the resulting transcripts.gtf, several thousand transcripts have unusually high FPKM values, often in the tens to hundreds of thousands (e.g. FPKM=83456.4, 5571.5, 1017907.8).
Some previous posts suggested short read length and the reference FASTA as possible culprits, but removing the -b option does not help. The input format is not the problem either, since SAM input gives results similar to BAM. I also tried the newer v1.3.0, and it gives similar values. I'm not sure whether short transcripts are being consistently inflated.
Strangely, the older v0.9.3 gives reasonable FPKM values (455.4 for the transcript that previously had 83456.4). I'm inclined to trust these, since they are close to (though not identical to) the values I calculated manually.
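For reference, a manual check of this kind can be sketched as follows, assuming the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function name and the example numbers are hypothetical, and this is not a claim about how any Cufflinks version computes its values internally.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments.

    All three arguments are hypothetical inputs that would come from your own
    counts, e.g. reads overlapping a transcript and total reads in the SAM/BAM.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Illustrative numbers only: 100 fragments on a 1000 bp transcript,
# out of 1,000,000 mapped fragments in total.
print(fpkm(100, 1000, 1_000_000))

# The same count divided by a much shorter length gives a much larger value,
# which is why the length used in the denominator matters for short transcripts.
print(fpkm(100, 50, 1_000_000))
```

Comparing such a value with the FPKM attribute in transcripts.gtf for a handful of transcripts of different lengths is one way to see whether the discrepancy depends on transcript length.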
Why are the newer versions of Cufflinks inflating FPKM values by orders of magnitude? Has anyone found a solution to this problem? Can I still use the newer versions without getting this FPKM inflation?
Thanks