01-30-2015, 08:49 AM   #2
Brian Bushnell

The number of reads mapping to a gene reflects the abundance of the specific community members that carry it, while the number of contigs reflects the overall diversity of the community. If 99% of the organisms are one species of bacteria with gene X, then gene X might be the most abundant gene based on read mapping. But if 1000 other species make up the remaining 1% of the population, and none of them have gene X but all of them have gene Y, then you might get 1000 different versions of gene Y, each as its own contig.
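
To put rough numbers on that example, here is a toy calculation. It is only a sketch: the total read count, the species names, and the variant labels are made up, and it assumes reads are sampled in proportion to abundance, that every species has the same genome and gene size, and that every distinct gene variant has enough coverage to assemble into its own contig.

```python
# Toy model of the 99% / 1% community described above.
# All numbers and names here are illustrative assumptions, not real data.
TOTAL_READS = 1_000_000  # hypothetical number of reads hitting gene X or gene Y

# species name -> (fraction of the community, gene variant it carries)
community = {"dominant": (0.99, "X")}
for i in range(1000):
    # 1000 rare species share the remaining 1%, each with its own gene Y variant
    community[f"rare_{i}"] = (0.01 / 1000, f"Y_v{i}")

reads_per_gene = {}
variants_per_gene = {}
for fraction, variant in community.values():
    gene = variant.split("_")[0]  # "X" or "Y"
    # Reads scale with abundance (assumed proportional sampling)
    reads_per_gene[gene] = reads_per_gene.get(gene, 0.0) + fraction * TOTAL_READS
    # Each distinct variant is assumed to assemble into a separate contig
    variants_per_gene.setdefault(gene, set()).add(variant)

contig_counts = {gene: len(v) for gene, v in variants_per_gene.items()}

for gene in sorted(reads_per_gene):
    print(gene, round(reads_per_gene[gene]), "reads,", contig_counts[gene], "contig(s)")
```

Under those assumptions gene X collects about 99% of the reads but yields a single contig, while gene Y collects about 1% of the reads spread over 1000 contigs. In a real dataset many of the rare variants would have too little coverage to assemble, so the contig count for gene Y would be lower.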

Also, the dominant organism sometimes does not assemble very well because it may be present as lots of different strains, and the variation between them confuses the assembler.