Hi,
I am trying to assemble ~10 million 454 ESTs and ~1 million Sanger ESTs. I tried Newbler, CAP3 and TGICL. All three output more or less identical sequences in the contigs and singlets files.
For example, the 454Isotigs.fna file from Newbler contains many sequences that are 100% identical but differ in length (i.e. one sequence fully contains another, shorter one). Isn't this not supposed to happen? Shouldn't they be assembled into one?
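To make clear what I mean by "one sequence contains another", here is a minimal sketch of the kind of check that shows it. The function names `read_fasta` and `find_contained` are made up for this post, it only looks for exact matches on the same strand (no reverse complement, no mismatches), and it compares all pairs, so it is only practical on a subset of the isotigs.

```python
from typing import Dict, List, Tuple


def read_fasta(path: str) -> Dict[str, str]:
    """Parse a FASTA file into {record_id: sequence} (hypothetical helper)."""
    chunks: Dict[str, List[str]] = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the ID.
                name = line[1:].split()[0]
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line.upper())
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def find_contained(seqs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (contained_id, container_id) pairs for exact substring matches.

    Assumption: same-strand, 100% identical containment only.
    Quadratic in the number of sequences, so use it on a sample.
    """
    # Longest first, so each sequence is only tested against longer (or equal) ones.
    ordered = sorted(seqs.items(), key=lambda kv: len(kv[1]), reverse=True)
    pairs: List[Tuple[str, str]] = []
    for i, (short_id, short_seq) in enumerate(ordered):
        for long_id, long_seq in ordered[:i]:
            if short_seq in long_seq:
                pairs.append((short_id, long_id))
                break  # one container is enough to flag the sequence
    return pairs


# Example call on the Newbler output file mentioned above:
# pairs = find_contained(read_fasta("454Isotigs.fna"))
```

Every pair this reports is a shorter sequence that sits entirely inside a longer one, which is exactly the case I expected the assembler to merge.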
I see the same thing with CAP3 and TGICL: more or less identical sequences show up in both the contigs and the singlets files.
Does anyone know why? Thanks ...