02-23-2012, 06:05 AM   #1
cerebralrust
Junior Member
 
Location: Sweden

Join Date: Jan 2012
Posts: 8
Too few reads mapping back to contigs

I assembled non-normalised 454 plant transcriptome data using Trinity after the following steps:

1) Preprocessing (removal of adaptors and vector contamination)
2) Removal of rRNA sequences
3) Removal of chloroplast and mitochondrial genes using BWA

From 370,929 reads, I got 21,486 contigs. When I mapped the reads back to the contigs using BWA, only 44,678 reads mapped, which suggests that only those reads were used in the assembly. What am I doing wrong here? I BLASTed a random selection of the contigs and found that they share over 90% similarity with proteins from related legumes (although many were hypothetical proteins).
However, only a small percentage of the contigs align to the transcript assemblies of related legumes when mapped using BWA.
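
To make the read-usage figure concrete, here is a minimal Python sketch of how the mapped-read count could be checked directly from the aligner's SAM output. It is not the exact procedure used above. It assumes single-end reads with unique names and relies only on the standard SAM FLAG column, where bit 0x4 marks an unmapped read; the function names `count_mapped_reads` and `mapping_rate` are made up for illustration.

```python
def count_mapped_reads(sam_lines):
    """Return (mapped, total) counts of distinct read names in SAM text lines.

    Assumes single-end reads with unique names. A read counts as mapped if
    at least one of its records has FLAG bit 0x4 (unmapped) unset.
    """
    seen = set()
    mapped = set()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        name, flag = fields[0], int(fields[1])
        seen.add(name)
        if not flag & 0x4:
            mapped.add(name)
    return len(mapped), len(seen)


def mapping_rate(mapped, total):
    """Fraction of reads that mapped; 0.0 when there are no reads."""
    return mapped / total if total else 0.0


# With the numbers from this post: 44,678 mapped out of 370,929 reads.
rate = mapping_rate(44678, 370929)
print(f"{rate:.1%} of reads map back to the contigs")
```

Counting distinct read names rather than SAM lines avoids counting a read twice when the aligner reports more than one record for it. With the figures above, about 12% of the reads map back.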

The Velvet assembly of the same data resulted in 15,323 contigs, with lower N50, N90, maximum contig length, etc.
The MIRA assembly resulted in more contigs and more reads being used, but lower N50, N90 and average contig length.
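
For reference, the N50 and N90 values being compared across assemblers can be computed from the contig lengths alone. The sketch below uses the common definition (the length of the contig at which the running total, taken from longest to shortest, first reaches the given fraction of all assembled bases). Individual assemblers may report these statistics slightly differently, and the function name `nx` is just an illustrative choice.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a list of contig lengths.

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running total first reaches `fraction` of
    the total assembled bases (fraction=0.5 gives N50, 0.9 gives N90).
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    # Guards against floating-point rounding when fraction is close to 1.
    return min(lengths)


# Toy example with made-up contig lengths (not data from this assembly).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("N50:", nx(toy_lengths, 0.5))
print("N90:", nx(toy_lengths, 0.9))
```

Running the same function over the contig lengths from each assembly gives figures that are directly comparable, independent of how each tool reports them.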
Why are only 44,678 reads being used? Any advice is greatly appreciated.