01-31-2012, 08:16 AM   #3
Location: Pennsylvania

Join Date: Apr 2011
Posts: 27

Thanks for your response. Each cluster should be no longer than a single read. A cluster can contain, for example, 50 reads, but all of them start at position 1 (the "left side" of the aligned cluster). The reads in a cluster may differ in length, depending on the initial fragmentation.

To make it more difficult, our reads come from a pool of animals, so in addition to sequencing errors we also see SNPs. That is why we cannot use an assembly based on, let's say, 99% homology. The de novo algorithm then starts adding reads to our clusters that extend the cluster in length, mostly based on random inverted repeats in the genomic tags.
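
To make the constraint concrete, here is a minimal Python sketch of the behaviour we are after. It is only an illustration, not the algorithm of any existing assembler. The function names and the `max_mismatch_frac` and `min_overlap` parameters are hypothetical, and the greedy seed-based strategy is an assumption. Every read is compared to a cluster seed starting at position 1, with no offsets allowed, so a read can never extend a cluster to the left or right.

```python
def mismatches(a, b):
    """Count mismatches over the shared prefix; both reads are anchored at position 1."""
    return sum(x != y for x, y in zip(a, b))


def cluster_left_anchored(reads, max_mismatch_frac=0.05, min_overlap=20):
    """Greedy clustering of reads that all start at the same position.

    max_mismatch_frac and min_overlap are hypothetical tuning knobs: the first
    has to be loose enough to tolerate both sequencing errors and SNPs.
    """
    clusters = []  # each cluster is a list of reads; element 0 is the seed
    # Process the longest reads first so the seed is always the longest member
    # and no later read can make the cluster longer than its seed.
    for read in sorted(reads, key=len, reverse=True):
        for cluster in clusters:
            seed = cluster[0]
            overlap = min(len(seed), len(read))
            if (overlap >= min_overlap
                    and mismatches(seed, read) <= max_mismatch_frac * overlap):
                cluster.append(read)
                break
        else:
            # No existing seed matched from position 1: start a new cluster.
            clusters.append([read])
    return clusters


if __name__ == "__main__":
    demo = ["ACGTACGTAC", "ACGTACGT", "ACGAACGTAC", "TTTTGGGGCC", "TTTTGGGG"]
    for c in cluster_left_anchored(demo, max_mismatch_frac=0.2, min_overlap=5):
        print(c)
```

Because the comparison is never shifted, a read that only matches a seed at an offset or as a reverse complement starts its own cluster instead of lengthening an existing one. A single mismatch threshold does not distinguish SNPs from sequencing errors, so this sketch only shows the anchoring, not the variant handling.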
Retro