I suspect this question has been asked already, but I could not find it, so please bear with me. I am trying to use TCGA gene expression data for my project and noticed that the per-sample sums of TPM values for the tumor samples are not the same, not even close. This seems odd to me, since by definition the TPM values of each sample should sum to 1,000,000 (or at least very close to it). I guess I must be missing something obvious here, but I could not figure out what. Any help would be very much appreciated.
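To make my expectation concrete, here is a minimal sketch of the TPM definition as I understand it. The counts and gene lengths are made-up toy numbers, not TCGA values, and the names (`tpm`, `lengths_kb`, `samples`) are only for illustration. When TPM is computed over all genes in a sample, the sum should come out at 1,000,000 no matter how different the raw counts are.

```python
def tpm(counts, lengths_kb):
    """Convert raw read counts for one sample to TPM (textbook definition)."""
    # Reads per kilobase: divide each gene's count by its length in kb.
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    # Per-sample scaling factor so that the values sum to one million.
    scale = sum(rpk) / 1_000_000
    return [r / scale for r in rpk]

# Hypothetical toy data: four genes, two samples (assumed values).
lengths_kb = [2.0, 1.0, 0.5, 4.0]
samples = {
    "sample_A": [100, 50, 10, 400],
    "sample_B": [30, 500, 25, 80],
}

tpm_by_sample = {name: tpm(counts, lengths_kb) for name, counts in samples.items()}
tpm_sums = {name: sum(values) for name, values in tpm_by_sample.items()}

for name, total in tpm_sums.items():
    print(name, round(total, 3))
```

This is the check I am effectively running on the TCGA matrix (summing each sample's column), except that there the sums differ a lot between samples.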
P.S. If I need to compare the expression of a gene across samples, do I need to normalize its value in each sample by that sample's sum of TPM?
Best regards,