02-02-2017, 10:45 AM   #1
Cleaning up, merging de novo transcriptomes to create a quality reference


I have about 950 million reads from an RNA-Seq data set that covers many developmental time-points. Assembling all the reads together doesn't really work: past a certain depth, additional reads contribute sequencing errors at a higher rate than new k-mers (or so I have been advised). Including all of the reads and digitally normalizing to 20x results in a very fragmented, low-quality assembly.

If I assemble multiple time-points individually and then merge the transcriptomes, how would I select the best representative isoform from each assembly and jettison the rest to create a nice, clean final reference? What is a good method to filter the garbage out, and what is a good method to merge them, favoring more complete sequences?

To clarify merging - I'm thinking of selecting individual transcripts from multiple assemblies, not merging actual sequences together to increase length, although that would be a source of improvement.
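
To make the selection idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under assumptions: it presumes the transcripts from all assemblies have already been grouped into clusters by some external step (the `clusters` mapping and the transcript IDs below are made up), and it uses sequence length as a crude stand-in for "more complete", which may well not be the right criterion.

```python
def parse_fasta(text):
    """Yield (id, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def pick_representatives(seqs, cluster_of):
    """Keep the longest transcript of each cluster.

    seqs:       {transcript_id: sequence}, pooled from all assemblies
    cluster_of: {transcript_id: cluster_id}, hypothetical output of a
                separate clustering step (not computed here)
    Returns {cluster_id: (transcript_id, sequence)}.
    """
    best = {}
    for tid, seq in seqs.items():
        # Transcripts missing from the mapping are treated as singletons.
        cid = cluster_of.get(tid, tid)
        if cid not in best or len(seq) > len(best[cid][1]):
            best[cid] = (tid, seq)
    return best


# Toy input: two time-point assemblies pooled into one FASTA string.
fasta = ">t1_a\nATGAAA\n>t2_a\nATGAAACCCGGG\n>t1_b\nATGCCC\n"
clusters = {"t1_a": "gene1", "t2_a": "gene1", "t1_b": "gene2"}  # assumed
reps = pick_representatives(dict(parse_fasta(fasta)), clusters)
```

In the toy data, `t1_a` and `t2_a` fall in the same cluster, so only the longer `t2_a` is kept. The part I don't know how to do well is exactly what this sketch skips: producing the clusters and scoring completeness by something better than raw length.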
dacotahm