SEQanswers

Old 01-17-2014, 02:16 AM   #1
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Correlation FPKM and RPKM

Hi everyone!
I'm trying to evaluate gene expression differences in breast cancer cells before and after treatment, so I'm working on RNA-seq data (single-end reads).

I tried to correlate FPKM (Cuffdiff output) with RPKM (counts from htseq-count, then the classic "Mortazavi et al." calculation).
From reading the Cufflinks website, some papers and other forums, it seems that these values have to be the same for single-end read data!
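
For reference, the by-hand calculation mentioned above can be sketched as follows. This is only a minimal illustration of the RPKM formula from Mortazavi et al. (reads per kilobase of feature per million mapped reads). The function name, gene names and numbers are made up, and using the sum of the counted reads as the library size is an assumption (the total number of mapped reads is another common choice).

```python
def rpkm(count, length_bp, total_reads):
    """RPKM = 10^9 * C / (N * L).

    count       -- reads assigned to the gene (C)
    length_bp   -- gene length in base pairs (L)
    total_reads -- library size used for normalization (N)
    """
    return 1e9 * count / (total_reads * length_bp)


# Hypothetical htseq-count style counts and gene lengths (made-up values).
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 2000, "geneB": 1000}

# Assumption: library size = sum of the counted reads.
total = sum(counts.values())

values = {gene: rpkm(counts[gene], lengths[gene], total) for gene in counts}
```

Both the count and the length that go into this formula depend on the tool that produced them, which matters when comparing against Cufflinks output.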

I also filtered out miRNAs and other genes shorter than 300 bp (which could give falsely high FPKM values).
I hope that someone can help me!

Thanks in advance
Old 01-17-2014, 02:48 AM   #2
rboettcher
Member
 
Location: Berlin

Join Date: Oct 2010
Posts: 71
Default

You did not include a question or any findings in your post, so what exactly do you want to know?
Old 01-17-2014, 03:00 AM   #3
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

For single-end reads, FPRKM==RPKM. Keep in mind that, because of how cufflinks works, the RPKM values it produces will almost never be the same as those you compute by hand. Firstly, htseq-count only counts uniquely mapped reads, whereas cufflinks will distribute fractional read counts over transcripts and genes. Also, you likely have a single pre-defined length for each gene, presumably computed by just summing the length of each exon in a "union gene model". I recall that cufflinks tries to determine the actual lengths and distributions of the transcripts and then uses those.
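
To make the "union gene model" length concrete, here is a small sketch of one way such a length could be computed: merge the overlapping exons of all transcripts of a gene and sum what remains. The function name and coordinates are hypothetical, and the 1-based inclusive coordinates are assumed to follow the GTF convention. This describes the by-hand length only, not how cufflinks derives its lengths.

```python
def union_exon_length(exons):
    """Number of bases covered by at least one exon.

    exons -- iterable of (start, end) pairs, 1-based and inclusive.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end + 1:
            # Gap before this exon: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent exon: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Made-up exons from two isoforms of one gene; the first two overlap.
exons = [(100, 200), (150, 300), (500, 600)]
gene_length = union_exon_length(exons)
```

A length obtained this way is fixed per gene, so it will differ from any length that is estimated per transcript from the data.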

BTW, unless you really want to discover new isoforms or genes, you might just directly use the counts from HTSeq-count in DESeq2 (or edgeR or limma).
Old 01-17-2014, 03:01 AM   #4
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Default

Sorry, you're right! My question is:
Is it possible to obtain equal FPKM and RPKM values? Everyone says "yes", but after searching the literature in depth, I didn't find a protocol for doing it or any correlation analysis.

The best match I've found is this:
http://www.cureffi.org/2013/09/12/co...ms-in-rna-seq/
but the author tried to correlate raw counts with FPKM (without much success...)

Thanks!
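
As a rough idea of what such a correlation analysis could look like, the sketch below compares two sets of expression values for the same genes on a log2 scale. It is not a protocol from the literature: the values are made up, the pseudo-count of 1 is an assumption chosen to keep zeros defined, and Pearson correlation is just one possible choice (a rank-based correlation would be another).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log2_plus_one(values):
    """log2(x + 1), so that zero expression maps to zero."""
    return [math.log2(v + 1.0) for v in values]


# Hypothetical values for the same four genes, in the same order.
cuffdiff_fpkm = [0.0, 1.0, 3.0, 7.0]
by_hand_rpkm = [0.0, 3.0, 7.0, 15.0]

r = pearson(log2_plus_one(cuffdiff_fpkm), log2_plus_one(by_hand_rpkm))
```

A high correlation only shows that the two measures rank genes similarly; it does not mean the individual values are equal.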
Old 01-17-2014, 03:08 AM   #5
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

By definition, an FPKM value computed with single-end reads is also the RPKM value (in fact, this is also true for paired-end reads if you only use reads where both ends map to the same feature). As I mentioned above, the reason that you're getting different values by hand than by cufflinks is that you're using vastly different methods to arrive at both counts and lengths.
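
A toy example of how the counting step alone can change the result: below, reads that align to more than one gene are either dropped (counting unique reads only) or split evenly between the candidate genes. The data and function names are made up, and the even split is only a stand-in for fractional assignment in general; it is not a description of the model cufflinks actually uses.

```python
from collections import defaultdict

# Each read with the genes it aligns to (made-up data).
alignments = {
    "read1": ["geneA"],
    "read2": ["geneA"],
    "read3": ["geneA", "geneB"],  # multi-mapped read
    "read4": ["geneB"],
}


def unique_only_counts(alignments):
    """Count only reads that align to exactly one gene."""
    counts = defaultdict(float)
    for genes in alignments.values():
        if len(genes) == 1:
            counts[genes[0]] += 1.0
    return dict(counts)


def even_split_counts(alignments):
    """Give each candidate gene an equal fraction of every read."""
    counts = defaultdict(float)
    for genes in alignments.values():
        for gene in genes:
            counts[gene] += 1.0 / len(genes)
    return dict(counts)


unique = unique_only_counts(alignments)
fractional = even_split_counts(alignments)
```

With different counts in the numerator and different lengths in the denominator, the resulting RPKM/FPKM values differ even though the formula is the same.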
Old 01-17-2014, 03:10 AM   #6
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Default

Quote:
Originally Posted by dpryan View Post
For single-end reads, FPRKM==RPKM. Keep in mind that, because of how cufflinks works, the RPKM values it produces will almost never be the same as those you compute by hand. Firstly, htseq-count only counts uniquely mapped reads, whereas cufflinks will distribute fractional read counts over transcripts and genes. Also, you likely have a single pre-defined length for each gene, presumably computed by just summing the length of each exon in a "union gene model". I recall that cufflinks tries to determine the actual lengths and distributions of the transcripts and then uses those.

BTW, unless you really want to discover new isoforms or genes, you might just directly use the counts from HTSeq-count in DESeq2 (or edgeR or limma).
Sorry for my ignorance, but what does FPRKM mean? Thanks for your answer!
The only reason I tried to compare FPKM and RPKM was to have a sanity check on the values!
I believe that I have to follow only one strand of analysis...
Old 01-17-2014, 03:11 AM   #7
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Default

Quote:
Originally Posted by dpryan View Post
By definition, an FPKM value computed with single-end reads is also the RPKM value (in fact, this is also true for paired-end reads if you only use reads where both ends map to the same feature). As I mentioned above, the reason that you're getting different values by hand than by cufflinks is that you're using vastly different methods to arrive at both counts and lengths.
Thank you very much for your explanation!
Old 01-17-2014, 03:14 AM   #8
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

FPRKM was just a typo; I meant FPKM.
Old 01-17-2014, 03:16 AM   #9
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Default

Quote:
Originally Posted by dpryan View Post
FPRKM was just a typo I meant FPKM
Ah ok! I was afraid I missed something important!
Old 01-17-2014, 04:52 AM   #10
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

I should also point you to figure 3D in this paper.
Old 01-17-2014, 05:11 AM   #11
wynstep
Member
 
Location: Rome

Join Date: Jan 2014
Posts: 11
Default

Quote:
Originally Posted by dpryan View Post
I should also point you to figure 3D in this paper.
I have requested the article from my university, as I'm very interested! Figure 3D (I can't see it in high resolution for now...) seems really similar to my correlation curve between FPKM and RPKM.

Thanks a lot!