Okay, first post, so I hope I can explain this. I am interested in transcripts that encode ERCC5 isoforms. According to this (and other sources), there is the full-length isoform and a number of short isoforms.
I'm puzzled by a basic conceptual problem: if all you get out of RNA-seq data is short reads, how can you possibly assign a short read to one particular isoform when several isoforms overlap? For example, here there is the full-length isoform, and also short 5' and 3' isoforms.
Does this rely on polymorphisms within individuals?
Or is this simply based on the quantity (count) of reads in certain regions? That is, if you have 2x reads at the 5' end and 2x reads at the 3' end, but only 1x reads in the middle, do you fit this data to a model in which there is a short 5' isoform and a short 3' isoform in addition to the full-length one, which would explain the lower read count in the middle of the gene?
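To make that second idea concrete, here is a toy sketch of the kind of count-based reasoning I have in mind. Everything in it is my own assumption for illustration: the function name `estimate_isoforms`, the three-region split of the gene, and the premise that the middle region is covered only by the full-length isoform. I am not claiming this is how any actual RNA-seq tool works.

```python
def estimate_isoforms(cov_5p, cov_mid, cov_3p):
    """Toy model (hypothetical): split a gene into 5', middle and 3' regions.

    Assumptions, all made up for illustration:
      - the full-length isoform covers all three regions
      - the short 5' isoform covers only the 5' region
      - the short 3' isoform covers only the 3' region
    Inputs are relative read depths per region (e.g. 2x, 1x, 2x).
    """
    # Only the full-length isoform contributes reads to the middle region.
    full_length = cov_mid
    # Whatever depth is left over at each end is attributed to the short isoforms.
    short_5p = max(cov_5p - full_length, 0.0)
    short_3p = max(cov_3p - full_length, 0.0)
    return {
        "full_length": full_length,
        "short_5p": short_5p,
        "short_3p": short_3p,
    }


# The 2x / 1x / 2x example from the question.
print(estimate_isoforms(2.0, 1.0, 2.0))
```

With the 2x / 1x / 2x depths above, this toy model assigns one unit each to the full-length, short 5' and short 3' isoforms.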
Hope the question makes sense. Let me know if I can clarify.
Thanks! -Alan