Hello everyone,
Traditionally in sequencing, X coverage is determined as the product of the number of sequenced reads and their length, divided by the genome size.
For RNA-Seq experiments that use purified poly(A)+ RNA as the starting material for library preparation, wouldn't it be more accurate to use the product of the number of known genes (as in Ensembl) and the average transcript size for the given organism, instead of the whole genome size, as the denominator in the X coverage calculation?
After all, because we are sequencing an mRNA-enriched library, there will be regions of the genome that we do not cover at all. So maybe we could simply exclude them from the X coverage calculation?
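To make the comparison concrete, here is a rough sketch of the two calculations I have in mind. All the numbers are made-up placeholders (not real values for any organism), and the names `coverage`, `n_genes` and `mean_transcript_length` are only illustrative.

```python
def coverage(n_reads, read_length, target_size):
    """Fold (X) coverage: total sequenced bases divided by the target size in bases."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return n_reads * read_length / target_size


# Hypothetical placeholder values, chosen only to illustrate the arithmetic.
n_reads = 30_000_000                # number of sequenced reads
read_length = 100                   # read length in bases
genome_size = 3_000_000_000         # whole genome size in bases
n_genes = 20_000                    # number of known genes (e.g. taken from Ensembl)
mean_transcript_length = 2_000      # average transcript size in bases

# Traditional definition: denominator is the whole genome.
genome_cov = coverage(n_reads, read_length, genome_size)

# Proposed alternative: denominator is an estimated transcriptome size.
transcriptome_size = n_genes * mean_transcript_length
transcriptome_cov = coverage(n_reads, read_length, transcriptome_size)

print(f"Genome-based coverage:        {genome_cov:.1f}X")
print(f"Transcriptome-based coverage: {transcriptome_cov:.1f}X")
```

With these placeholder inputs the same run gives very different X values depending on the denominator, which is exactly why I am asking which one is appropriate. Note that either way this is only an average: it assumes reads are spread evenly over the target, which is presumably not true when genes are expressed at different levels.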
Thanks in advance for any help!