I have several Illumina RNA-seq libraries that, I believe, represent the individual components of a sample from a previous dataset, whose authors were not able to dissect their sample quite as finely. If I want to compare my data to theirs, what's the philosophically correct way to combine the FPKMs? I can come up with arguments for either a weighted sum or a weighted average (I'm leaning towards the latter, since FPKM is analogous to a concentration).
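For concreteness, here is a minimal sketch of the weighted-average option in Python/NumPy. It assumes the FPKMs sit in a genes-by-samples array and that the weights are some per-sample estimate of relative contribution (for example ng of total RNA); the function name `combine_fpkm` and the toy numbers are invented for illustration and are not from my data.

```python
import numpy as np

def combine_fpkm(fpkm, weights):
    """Weighted average of per-sample FPKMs.

    fpkm    : array of shape (n_genes, n_samples)
    weights : per-sample amounts (any positive scale, e.g. ng of RNA)
    """
    fpkm = np.asarray(fpkm, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()          # normalize so the weights sum to 1
    return fpkm @ w          # one combined value per gene

# Toy example: 2 genes, 2 sub-samples, the first contributing 3x more RNA.
fpkm = np.array([[10.0, 30.0],
                 [0.0,  8.0]])
weights = [3.0, 1.0]
combined = combine_fpkm(fpkm, weights)
```

Normalizing the weights is what makes this an average rather than a sum: the combined value for a gene always lies between its lowest and highest per-sample FPKM, which is the behavior I'd expect from mixing concentrations.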
And having chosen one of those, what's the best way to determine the weights? I can make an estimate of the initial amounts of each sample in ng of total RNA based on a spike-in control, but the measurements on those are noisy enough that I'm a little skeptical.
One thing that troubles me is that the FPKM values for various ubiquitous genes (actins, at the moment, though I can certainly check others) vary across the samples, and the different genes vary in roughly the same way from sample to sample. So I'd like to correct based on a set of genes that I can argue are uniformly expressed, but I'm a little worried that's going to come off as sketchy. Thoughts?
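To show what I mean by that correction, here is a rough sketch, assuming a chosen set of reference genes with nonzero FPKM in every sample. Each sample gets a scale factor from the geometric mean of its reference-gene FPKMs, relative to the average across samples. The function names and the toy matrix are hypothetical, and this is only one of several ways such a rescaling could be done.

```python
import numpy as np

def reference_scale_factors(fpkm, ref_rows):
    """Per-sample scale factors from a set of reference genes.

    fpkm     : array of shape (n_genes, n_samples)
    ref_rows : row indices of genes assumed to be uniformly expressed
               (their FPKMs must be nonzero in every sample)
    """
    fpkm = np.asarray(fpkm, dtype=float)
    log_ref = np.log(fpkm[ref_rows, :])
    per_sample = log_ref.mean(axis=0)            # log geometric mean per sample
    return np.exp(per_sample - per_sample.mean())  # factors multiply to 1

def rescale_by_reference(fpkm, ref_rows):
    """Divide each sample's column by its reference-gene scale factor."""
    fpkm = np.asarray(fpkm, dtype=float)
    return fpkm / reference_scale_factors(fpkm, ref_rows)

# Toy example: rows 0 and 1 are reference genes, both 2x higher in sample 2.
fpkm = np.array([[100.0, 200.0],
                 [50.0,  100.0],
                 [5.0,   40.0]])
factors = reference_scale_factors(fpkm, [0, 1])
scaled = rescale_by_reference(fpkm, [0, 1])
```

After rescaling, the reference genes come out equal across the two samples, and the remaining difference in the third gene is whatever is left once the shared shift is removed. Whether that shared shift is technical or biological is exactly the part I'm unsure how to justify.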