

 04-15-2014, 10:41 PM #1 wespiser Junior Member   Location: New England Join Date: Apr 2014 Posts: 5 Cytosol/Nucleus ratio from RNA-seq Hello, I have RNA-seq data from nucleus and cytosol fractions for a couple of different cell types. I am interested in determining the ratio of cytosol to nucleus expression for each cell type. My plan is to first calculate a spike-in-normalized RPKM for each gene: compute RPKM (exons only), then divide by the average NIST 14 spike-in (ERCC) for each experiment. Then, for each gene in every cell type, calculate the cytosol/nucleus ratio as the sum of the spike-in-normalized replicates in the cytosol over the sum of the spike-in-normalized replicates in the nucleus. Is this a valid measurement of cytosolic vs. nuclear expression? Thank you, Adam
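For readers following along, the plan in the post above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as described, not a vetted pipeline: the function names are hypothetical, the numbers are made up, and "average spike-in" is assumed here to mean the mean RPKM of the spike-ins in the same library.

```python
def rpkm(exon_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return exon_reads * 1e9 / (exon_length_bp * total_mapped_reads)


def spike_normalized(gene_rpkm, spike_rpkms):
    """Divide a gene's RPKM by the mean spike-in RPKM of the same library.

    Assumption: "average spike-in" means the arithmetic mean of spike-in RPKMs.
    """
    return gene_rpkm / (sum(spike_rpkms) / len(spike_rpkms))


def cytosol_nucleus_ratio(cytosol_reps, nucleus_reps):
    """Sum of normalized cytosol replicates over sum of normalized nucleus replicates."""
    return sum(cytosol_reps) / sum(nucleus_reps)


# Made-up values for one gene, two replicates per fraction.
cyt = [
    spike_normalized(rpkm(500, 2000, 20_000_000), [40.0, 60.0]),
    spike_normalized(rpkm(450, 2000, 18_000_000), [45.0, 55.0]),
]
nuc = [
    spike_normalized(rpkm(200, 2000, 10_000_000), [20.0, 20.0]),
    spike_normalized(rpkm(200, 2000, 10_000_000), [20.0, 20.0]),
]

ratio = cytosol_nucleus_ratio(cyt, nuc)
print(f"cytosol/nucleus ratio: {ratio:.3f}")
```

The ratio is only interpretable if the same amount of spike-in was added per unit of input (per cell equivalent, for instance) in both fractions; that is an assumption of the sketch, not something stated in the thread.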
 04-19-2014, 09:45 PM #2 wespiser Junior Member   Location: New England Join Date: Apr 2014 Posts: 5 Does anyone have experience with this? Thank you for your time!
 04-21-2014, 11:05 AM #3 jparsons Member   Location: SF Bay Area Join Date: Feb 2012 Posts: 62 In my experience with very similar calculations, something about RPKM normalization makes ERCC spike-ins behave abnormally when the samples have widely disparate RNA content, and I imagine nucleus vs. cytosol falls into that category. I can point you to some figures demonstrating this, if you'd like. Your calculation is, I believe, an accurate one to do if you're starting from unnormalized data (raw counts, for example). One similar calculation which I use to determine the relative RNA content of two cells follows: ρ=(/) * (/) It's a little unclear what you mean by "average NIST 14 spike in", but if you're referring to ERCC-00014, I wouldn't recommend normalizing against that particular spike-in, since it is present at a very low concentration in the commonly used pools and is therefore more affected by noise. In the above calculation, I actually use the *sum* of all 96 spike-ins for the mass and count calculations, although the results don't vary much if you choose a subset or even a single highly expressed control to normalize to. Last edited by jparsons; 04-21-2014 at 11:07 AM.
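As a rough illustration of the advice above (start from raw counts and normalize to the summed spike-ins rather than to a single low-abundance control), here is a minimal Python sketch. It is not the formula from the post, which did not survive in the thread. The helper names, the `ERCC-` prefix convention for spike-in IDs, and all counts are assumptions made for the example.

```python
def split_counts(counts, spike_prefix="ERCC-"):
    """Separate endogenous gene counts from the summed spike-in count.

    Assumption: spike-in features are identified by an "ERCC-" ID prefix.
    """
    genes = {k: v for k, v in counts.items() if not k.startswith(spike_prefix)}
    spike_total = sum(v for k, v in counts.items() if k.startswith(spike_prefix))
    return genes, spike_total


def per_spike_read(counts):
    """Express each gene's raw count relative to the total spike-in count."""
    genes, spike_total = split_counts(counts)
    if spike_total == 0:
        raise ValueError("no spike-in reads found; cannot normalize")
    return {gene: count / spike_total for gene, count in genes.items()}


# Made-up raw counts for one library per fraction.
cytosol = {"GENE_A": 300, "GENE_B": 50,
           "ERCC-00002": 900, "ERCC-00014": 1, "ERCC-00130": 1099}
nucleus = {"GENE_A": 100, "GENE_B": 200,
           "ERCC-00002": 450, "ERCC-00014": 0, "ERCC-00130": 550}

cyt_norm = per_spike_read(cytosol)
nuc_norm = per_spike_read(nucleus)

# Per-gene cytosol/nucleus ratio of spike-normalized counts.
ratios = {gene: cyt_norm[gene] / nuc_norm[gene] for gene in cyt_norm}
for gene, value in ratios.items():
    print(gene, round(value, 3))
```

Summing over all spike-ins means a control with only a handful of reads (ERCC-00014 in these made-up counts) contributes almost nothing to the scaling factor, which is the noise concern raised in the post. Because both fractions are compared gene by gene, transcript length cancels in the ratio, so no length correction is applied here.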

 Tags localization, nist14, rna-seq, rpkm, spike ins