10-07-2008, 09:46 AM   #5
vruotti
Member
 
Location: US

Join Date: Feb 2008
Posts: 13
RPKM concern?

Hi Chema,

We have used Wold's RPKM method and got very good results. Are you saying that if a particular gene has less coverage (coverage = number of mapped reads), it will affect how accurately the expression of the other genes is detected? I think this is the whole point of normalizing. The assumption is that overall expression should not differ too much between conditions. Also, can you please check your numbers for us? Are they really 333,333 or 333,000,000 for genes A, B and C in condition 1 after converting to RPKM?

Maybe I'm doing this wrong. The formula I see in their paper is:

RPKM = 10^9 x C / (N x L), which for genes of the same length is just C/N up to a constant factor

C = the number of mappable reads that fell onto the gene's exons
N = the total number of mappable reads in the experiment
L = the summed length of the gene's exons, in base pairs
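
To make the units concrete, here is a minimal Python sketch of the formula as written above. The function names and the sample numbers (500 reads, 2,000,000 total reads, a 2,500 bp gene) are made up for illustration and are not from the paper. It only shows that the 10^9 factor is the same thing as "reads per kilobase of exon per million mapped reads".

```python
def rpkm(c, n, l):
    """RPKM = 10^9 * C / (N * L), with C, N, L as defined above."""
    return 1e9 * c / (n * l)


def rpkm_stepwise(c, n, l):
    """Same quantity, computed in two steps to show where 10^9 comes from."""
    reads_per_million = c / (n / 1e6)      # scale by sequencing depth (10^6)
    return reads_per_million / (l / 1e3)   # scale by exon length in kb (10^3)


# Hypothetical values, for illustration only.
example_direct = rpkm(500, 2_000_000, 2_500)
example_stepwise = rpkm_stepwise(500, 2_000_000, 2_500)
print(example_direct, example_stepwise)
```

The two versions agree because 10^6 x 10^3 = 10^9. Note that L stays in the formula; it only drops out when comparing genes of equal length.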

So, let's plug in your numbers.
          Condition 1    Condition 2
Gene A    3*10^5         4.5*10^5
Gene B    3*10^5         4.5*10^5
Gene C    3*10^5         0
Total     9*10^5         9*10^5

Translating to RPKM: since the genes all have the same length, the values (here 10^9 x C/N, before dividing by that common length L) should be something like:
          Condition 1    Condition 2
Gene A    333,000,000    500,000,000
Gene B    333,000,000    500,000,000
Gene C    333,000,000    0
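
As a rough check of the arithmetic, here is a small Python sketch that applies RPKM = 10^9 x C / (N x L) to the read counts above. Two things in it are assumptions, not something stated in the thread: the common gene length (`ASSUMED_LENGTH_BP`, set to 1,000 bp purely for illustration) and the use of the per-condition totals from the table as N.

```python
# Read counts per gene: (condition 1, condition 2), from the table above.
counts = {
    "Gene A": (300_000, 450_000),
    "Gene B": (300_000, 450_000),
    "Gene C": (300_000, 0),
}

# ASSUMPTION: the thread only says the genes share a length; 1,000 bp is
# a placeholder chosen for illustration.
ASSUMED_LENGTH_BP = 1_000

# ASSUMPTION: N is the per-condition total of the three genes (9*10^5 each).
totals = [sum(per_cond[i] for per_cond in counts.values()) for i in (0, 1)]

table = {
    gene: tuple(
        1e9 * c / (n * ASSUMED_LENGTH_BP) for c, n in zip(per_cond, totals)
    )
    for gene, per_cond in counts.items()
}

for gene, (cond1, cond2) in table.items():
    print(f"{gene}  {cond1:>14,.0f}  {cond2:>14,.0f}")
```

Under these assumptions the condition 1 value comes out near 333,333. With `ASSUMED_LENGTH_BP = 1` it would be near 333,000,000 instead, so which of the two figures is right depends entirely on the gene length that is plugged in.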

Looking at these numbers, you could argue about whether genes A and B are really differentially expressed between the two conditions; the values are not that far apart. Sorry, did I miss your point? Can you explain your concern again?

Victor

Last edited by vruotti; 10-08-2008 at 07:38 AM.