#1
Senior Member
Location: USA Join Date: Sep 2012
Posts: 130
I am doing my first RNA-seq analysis for Drosophila melanogaster (I normally deal with human or mouse data). It turns out that many fly genes have exactly the same coordinates as other genes. In other words, the same location has multiple genes assigned to it.
If multiple genes have exactly the same coordinates, they should have the same FPKM values. However, running Cuffdiff with such a GTF does not yield the same values for all of these genes. Is there a way to force Cuffdiff to assign the same values to all overlapping genes? I could not find any arguments that would do that. Is there a proper way of dealing with such situations? Do I need to optimize the GTF file? I use the one from iGenomes, which is endorsed by Cufflinks, so it seems like it should be fine.
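Before comparing FPKM values, it may help to confirm which genes really do share identical coordinates in the GTF. The following is only a minimal sketch: it assumes a standard 9-column, tab-separated GTF with `exon` records carrying a `gene_id` attribute, and the function names `gene_spans` and `shared_loci` are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the gene_id attribute in the 9th GTF column.
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def gene_spans(gtf_lines):
    """Return {gene_id: (chrom, strand, start, end)} built from exon records."""
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = GENE_ID.search(fields[8])
        if match is None:
            continue
        gene = match.group(1)
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        if gene in spans:
            # Extend the span to cover every exon seen for this gene.
            _, _, old_start, old_end = spans[gene]
            start, end = min(start, old_start), max(end, old_end)
        spans[gene] = (chrom, strand, start, end)
    return spans


def shared_loci(gtf_lines):
    """Return {locus: [gene_ids]} for loci that are shared by two or more genes."""
    by_locus = defaultdict(list)
    for gene, locus in gene_spans(gtf_lines).items():
        by_locus[locus].append(gene)
    return {locus: sorted(genes) for locus, genes in by_locus.items() if len(genes) > 1}
```

With a real annotation this could be called as `shared_loci(open("genes.gtf"))`, where `genes.gtf` is a placeholder path. Note that two genes with the same overall span can still have different exon structures, so identical spans do not by themselves imply identical expression estimates.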
#2
Member
Location: Oxford, UK Join Date: Nov 2011
Posts: 17
Hi,
The FPKM value for a gene is the sum of the FPKM values of that gene's transcripts, so I guess the differing values come from differences in the transcripts and in the reads/fragments covering them. To test this, I would look at the transcript-level FPKM values for each of the genes that share a locus.
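The check suggested above can be sketched in a few lines of Python. This is an illustration only: it assumes a tab-separated isoform-level table with a `gene_id` column and one FPKM column per row, and the column name `FPKM` and the function name `gene_fpkm_from_isoforms` are assumptions (a Cuffdiff tracking file may name its FPKM columns per sample).

```python
import csv
import io
from collections import defaultdict


def gene_fpkm_from_isoforms(handle, fpkm_column="FPKM"):
    """Sum transcript-level FPKM values per gene_id from a tab-separated table."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle, delimiter="\t"):
        totals[row["gene_id"]] += float(row[fpkm_column])
    return dict(totals)


# Toy table: genes A and B share a locus but have different transcripts.
example = io.StringIO(
    "tracking_id\tgene_id\tlocus\tFPKM\n"
    "A.1\tA\tchr2L:100-400\t1.5\n"
    "A.2\tA\tchr2L:100-400\t2.5\n"
    "B.1\tB\tchr2L:100-400\t10.0\n"
)
print(gene_fpkm_from_isoforms(example))
```

Comparing these per-gene sums with the reported gene-level values shows whether the difference between genes at the same locus comes from their individual transcripts.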
#3
Senior Member
Location: USA Join Date: Sep 2012
Posts: 130
Quote:
#4
Member
Location: UK Join Date: Jun 2011
Posts: 61
I stopped using cufflinks/cuffdiff 3 months ago as the latest version was producing implausible results. I would recommend using tophat2 + htseq-count + edgeR (or DESeq). I based my workflow on this nice tutorial: http://www-huber.embl.de/pub/pdf/nprot.2013.099.pdf
#5
Member
Location: Berlin Join Date: Oct 2010
Posts: 71
I agree with feralBiologist, but would switch to featureCounts instead of HTSeq-count for performance reasons (it can be run multithreaded and does not require a re-sorted SAM file).
#6
Member
Location: UK Join Date: Jun 2011
Posts: 61
Quote:
EDIT: I realised that featureCounts is written by the authors of edgeR, so it should be straightforward to substitute it for HTSeq-count. Thanks again to rboettcher.
Last edited by feralBiologist; 11-08-2013 at 01:56 AM.
#7
Member
Location: Berlin Join Date: Oct 2010
Posts: 71
Quote:
EDIT: another nice feature is that featureCounts (fC) outputs gene length, so computing RPKM is straightforward.
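As a rough sketch of that computation: RPKM is the read count scaled by gene length in kilobases and by library size in millions, i.e. `1e9 * count / (length_bp * library_size)`. The function name `rpkm` and the choice to default the library size to the sum of the supplied counts are assumptions for illustration; the counts and lengths would come from the count and Length columns of the featureCounts output.

```python
def rpkm(counts, lengths, library_size=None):
    """Compute RPKM per gene.

    counts: {gene_id: read count}
    lengths: {gene_id: gene length in bp}
    library_size: total reads used for scaling; defaults to the sum of counts
                  (an assumption, other definitions of library size exist).
    """
    if library_size is None:
        library_size = sum(counts.values())
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return {
        gene: 1e9 * n / (lengths[gene] * library_size)
        for gene, n in counts.items()
    }


# Toy example with made-up numbers.
counts = {"A": 500, "B": 1500}
lengths = {"A": 1000, "B": 3000}
print(rpkm(counts, lengths, library_size=4_000_000))
```

For differential expression itself, edgeR and DESeq work on the raw counts, so a length-normalised value like this is mainly useful for reporting or for comparing genes within a sample.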
#8
Member
Location: Oxford, UK Join Date: Nov 2011
Posts: 17
I am going to try these new methods out, thanks.