Does anyone know of studies, or reported data of any sort, on the stochastic behavior of algorithms that align RNA-Seq reads to a genome in order to identify exon-exon splice junctions, the sort of thing that TopHat does? By stochastic I mean: if one were to produce reads from two separate library preparations of the same source of RNA and run the alignment algorithm on each resulting set of reads, what degree of agreement between the reported sets of junctions would one expect? I'm seeking results based on empirical human genome data, not results based on simulated reads or on smaller genomes.
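One way to make "degree of agreement" concrete is the Jaccard index of the two junction sets, i.e. shared junctions divided by all junctions seen in either run. This is only a rough sketch under assumptions of my own: the function name `junction_agreement` is made up, and a junction is represented as a hypothetical `(chrom, donor, acceptor, strand)` tuple rather than any particular aligner's output format. Other measures would be equally defensible.

```python
from typing import Dict, Set, Tuple

# Hypothetical junction representation: (chromosome, donor pos, acceptor pos, strand).
Junction = Tuple[str, int, int, str]


def junction_agreement(run_a: Set[Junction], run_b: Set[Junction]) -> Dict[str, float]:
    """Summarize the overlap between the junction sets reported for two runs."""
    shared = run_a & run_b
    union = run_a | run_b
    # Assumed convention: two empty sets are treated as being in full agreement.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": len(shared),
        "only_a": len(run_a - run_b),
        "only_b": len(run_b - run_a),
        "jaccard": jaccard,
    }


if __name__ == "__main__":
    # Toy junctions for illustration only, not real data.
    prep_1 = {("chr1", 100, 200, "+"), ("chr1", 300, 400, "+"), ("chr2", 50, 80, "-")}
    prep_2 = {("chr1", 100, 200, "+"), ("chr2", 50, 80, "-"), ("chr3", 10, 20, "+")}
    print(junction_agreement(prep_1, prep_2))
```

In practice one would probably also want to stratify this by read support per junction, since low-coverage junctions are presumably where the two preparations disagree most.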
Thanks in advance for any pointers.
-francois