#1
Junior Member
Location: USA Join Date: Jun 2010
Posts: 6
I've clustered expression profiles from 20 experiments for a large group of highly related genes. I have the raw read counts and normalized these data using [total read counts uniquely matching to a gene]/([total counts in experiment]*[length of transcript]). However, because these genes have a large amount of non-unique sequence, I don't think that this method is correct. I'd like to try normalizing the expression data based on [length of unique k-mers within transcript] rather than [length of transcript]. Is there an existing tool that can calculate this?
Thanks in advance for any help!
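For concreteness, the normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example numbers are hypothetical, and the 1e9 scaling (per kilobase, per million reads, as in RPKM) is an assumption, since the post does not state a scaling factor. Passing the number of uniquely mappable positions instead of the full transcript length gives the proposed variant.

```python
def length_normalized_expression(gene_reads, total_reads, effective_length):
    """RPKM-style value: reads per kilobase of effective length per million reads.

    effective_length can be the full transcript length or, for the proposed
    variant, the number of uniquely mappable positions in the transcript.
    """
    if total_reads <= 0 or effective_length <= 0:
        raise ValueError("total_reads and effective_length must be positive")
    return gene_reads * 1e9 / (total_reads * effective_length)


# Hypothetical numbers: 500 uniquely mapped reads, 10 million reads in the experiment.
full_length = length_normalized_expression(500, 10_000_000, 2000)  # whole transcript
unique_only = length_normalized_expression(500, 10_000_000, 800)   # unique positions only
```

With the same read count, the value computed from the shorter unique length is higher, which is the correction the question is after.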
#2
Member
Location: Retirement - Not working with bioinformatics anymore. Join Date: Apr 2010
Posts: 63
I'm not particularly impressed with the RPKM measure either, as it is still biased towards long transcripts (Oshlack and Wakefield, Biology Direct 2009). The method found in this paper (doi:10.1093/bioinformatics/btp692) seems to be a more intelligent way of addressing this issue, though I haven't yet tested it out. I'm not sure that [length of unique k-mers within transcript] will be any better than [length of transcript] at eliminating the bias you think is there.
If you're set on doing it, though, I think you'll have to roll your own script to determine the length of unique k-mers, which may get fairly computationally intensive depending on how you go about it.
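A minimal sketch of such a script in Python, under some stated assumptions: "unique" here means the k-mer occurs in only one transcript of the set being compared (repeats within the same transcript still count as unique to it), reverse complements are ignored, and all sequences fit in memory. The function name and the toy sequences are hypothetical.

```python
from collections import defaultdict


def unique_kmer_positions(transcripts, k):
    """Count, per transcript, the k-mer start positions whose k-mer occurs
    in no other transcript of the set.

    transcripts: dict mapping transcript ID -> sequence string.
    """
    # First pass: record which transcripts contain each k-mer.
    owners = defaultdict(set)
    for tid, seq in transcripts.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(tid)

    # Second pass: count positions whose k-mer belongs to exactly one transcript.
    counts = {}
    for tid, seq in transcripts.items():
        seq = seq.upper()
        counts[tid] = sum(
            1
            for i in range(len(seq) - k + 1)
            if len(owners[seq[i:i + k]]) == 1
        )
    return counts


# Toy example with two related sequences sharing their first six bases.
toy = {"A": "ACGTACGGA", "B": "ACGTACTTT"}
toy_counts = unique_kmer_positions(toy, k=4)
```

Memory grows with the number of distinct k-mers, so for a whole transcriptome a dedicated k-mer counter or an on-disk index would likely be needed; this version is only meant for a modest gene family.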
#3
Senior Member
Location: Southern France Join Date: Aug 2009
Posts: 269
Another thing: because of biases in read coverage at the ends of transcripts, it is common to disregard the initial and terminal exons and/or the UTRs.
Good luck, s.
Tags: expression, normalize, rna-seq