Hi all,
I am trying to identify the most highly expressed genes in some publicly available RNA-seq data, as well as in my own samples. In other words, I am comparing genes within the same sample, not looking for DE genes between samples. Because longer genes yield more fragments and therefore more mapped reads, I need to compute RPKM values for each gene to make a fair comparison. I know this can be done with the rpkm() function in edgeR, and I think I know how to extract gene lengths from a GTF file, but the lengths obviously differ between isoforms. So how do you compute a single gene length from a GTF file for genes with multiple isoforms?
Thanks in advance,
S
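
To make the question concrete, here is a rough sketch of one convention I have seen for this: take the union of all exons across a gene's isoforms, so overlapping exons are counted once, and use that as the gene length. This is only an illustration in plain Python, not edgeR's implementation. The function names `gene_lengths_from_gtf` and `rpkm` and the toy GTF records are made up for the example. It assumes every exon line carries a `gene_id` attribute and that a given gene_id sits on a single chromosome. Other conventions (longest isoform, mean isoform length) would give different numbers.

```python
from collections import defaultdict
import re


def gene_lengths_from_gtf(lines):
    """Return {gene_id: length in bp of the union of all its exons}.

    Assumption: `lines` are tab-separated GTF records and each gene_id
    occurs on one chromosome only.
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features contribute to the length
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match is None:
            continue
        # GTF coordinates are 1-based and inclusive.
        exons[match.group(1)].append((int(fields[3]), int(fields[4])))

    lengths = {}
    for gene, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent exon: extend the current block.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene] = total
    return lengths


def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)


# Toy records (hypothetical): gene g1 has two isoforms with overlapping exons.
toy_gtf = [
    'chr1\ttoy\texon\t100\t199\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\ttoy\texon\t300\t399\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\ttoy\texon\t150\t249\t.\t+\t.\tgene_id "g1"; transcript_id "t2";',
    'chr1\ttoy\texon\t300\t449\t.\t+\t.\tgene_id "g1"; transcript_id "t2";',
    'chr1\ttoy\texon\t1000\t1499\t.\t+\t.\tgene_id "g2"; transcript_id "t3";',
]

lengths = gene_lengths_from_gtf(toy_gtf)
print(lengths)
print(rpkm(600, lengths["g1"], 2_000_000))
```

In the toy data, the two isoforms of g1 are each 200 bp and 250 bp long, but the merged exon blocks 100-249 and 300-449 give a union length of 300 bp, which is the value that would be used for the RPKM calculation under this convention.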