My PI and I have been wondering what exactly TopHat does with repeated sequences from a FASTQ file. Does it simply throw them away, or does it take each read and 'place' it at every location where it appears in the genome? We were going to use the CASAVA pipeline, which I know generates a file of repeated sequences that can later be indexed, but TopHat better serves our purposes.
Thanks for the help.