Hello, I have been thinking about this for some time and decided to ask for opinions. It would be great to hear what people think.
Several papers compare expression quantifications for small RNAs and mRNAs, e.g., they correlate miRNA expression with mRNA expression over a set of samples to identify miRNA targets. I think the idea is fine, but one has to be careful with the type of quantifications being correlated: miRNA expression is usually computed from small RNA-seq experiments in terms of RPM, and mRNA expression from polyA+ RNA-seq experiments in terms of RPKM. When the RPMs and RPKMs are computed, the per-million-reads normalizations are done over different pools of sequences. For example, if the total small RNA level (e.g., tRNAs) goes up in a sample while the total number of miRNA molecules stays constant, the miRNA RPMs will go down, but this will not affect the mRNA RPKMs. In other words, RPM/RPKM is a relative measure, not a direct measure of the actual quantity of RNA in a cell, so the correlations will be affected. I hope it is clear what I am trying to explain.
If we assume that the total absolute amounts of small RNA and mRNA (i.e., the total number of molecules) don't change significantly between samples, I think it is OK to correlate the quantities, because in that case, if I'm not mistaken, RPM (RPKM) is proportional to the actual levels of miRNAs (mRNAs).
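To make the concern concrete, here is a toy calculation. It is only a sketch: the read counts and the names `miR-X`, `other_miRNAs`, and `tRNA_fragments` are made up for illustration, and it assumes that read counts are proportional to the number of molecules (no library-prep bias).

```python
def rpm(counts):
    """Reads per million: scale each count by the library's total read count."""
    total = sum(counts.values())
    return {name: 1e6 * c / total for name, c in counts.items()}

# Hypothetical small RNA-seq libraries. The miRNA counts are identical in
# both samples; only the tRNA-derived reads increase in sample B.
sample_a = {"miR-X": 1000, "other_miRNAs": 9000, "tRNA_fragments": 10000}
sample_b = {"miR-X": 1000, "other_miRNAs": 9000, "tRNA_fragments": 30000}

rpm_a = rpm(sample_a)["miR-X"]
rpm_b = rpm(sample_b)["miR-X"]

# Same absolute miR-X level, but the RPM halves because the library total doubled.
print(rpm_a, rpm_b)  # 50000.0 25000.0
```

The mRNA RPKMs come from a separate polyA+ library, so they would not see this shift at all, which is exactly the mismatch I am worried about.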
My question is: does it make sense to correlate these two quantities? Am I missing something? Can we assume that the total amount of small RNA stays constant from sample to sample? Thanks!