Hello,
I am new to this board and I apologize if this is old news, but I haven't been able to do a productive search for it.
My agency recently got a PacBio, which is located in a different state than I am, so getting all the information takes me a while.
Anyway, I submitted a Salmonella genomic DNA sample, which was the first bacterial sample that the lab ran on the machine. They ran the genome twice and assembled one run with HGAP2 and the other with HGAP3. Both assemblies gave a single large contig in the size range expected for a Salmonella genome (about 4.7 Mb) and three plasmids that match ones seen in a similar strain in NCBI. Average coverage on our runs was reported to me to be over 100X. Synteny with the NCBI genome was spot on.
The problem was that when the genomes were aligned, there were many places where a single base was missing, most often at the beginning of a short homopolymeric run of maybe 6 to 10 bases. This happened throughout the genome at a rate of about once every 4,000 to 6,000 bases. In a three-genome alignment (our two runs and the NCBI sequence), the missing bases in our runs were usually at different places.
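In case it helps anyone check their own alignments for the same pattern, here is a minimal Python sketch of how these sites could be tallied. It is only an illustration: it assumes you already have one pair of aligned sequences as equal-length strings with "-" marking gaps, and the function name and the `min_run` threshold are my own, not part of HGAP or any PacBio tool. At one missing base every 4,000 to 6,000 bases, a 4.7 Mb chromosome would be expected to show roughly 800 to 1,200 such sites.

```python
def single_base_gaps_in_homopolymers(ref_aln, qry_aln, min_run=6):
    """List isolated one-base gaps in qry_aln that fall inside a
    homopolymer run of at least min_run bases in ref_aln.

    Both inputs are rows of the same pairwise alignment ('-' = gap).
    Returns (alignment_column, base, run_length) tuples.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    n = len(ref_aln)
    hits = []
    for i in range(n):
        # Only columns where the query lacks a base the reference has.
        if qry_aln[i] != "-" or ref_aln[i] == "-":
            continue
        # Skip multi-base gaps; we only want single missing bases.
        if (i > 0 and qry_aln[i - 1] == "-") or (i + 1 < n and qry_aln[i + 1] == "-"):
            continue

        base = ref_aln[i].upper()
        # Extend left and right over identical reference bases.
        left = i
        while left > 0 and ref_aln[left - 1].upper() == base:
            left -= 1
        right = i
        while right + 1 < n and ref_aln[right + 1].upper() == base:
            right += 1

        run_length = right - left + 1
        if run_length >= min_run:
            hits.append((i, base, run_length))
    return hits


if __name__ == "__main__":
    # Toy example: a 7-base A run in the reference, one A missing in the query.
    ref = "ACGTAAAAAAAGCT"
    qry = "ACGT-AAAAAAGCT"
    print(single_base_gaps_in_homopolymers(ref, qry))
```

Running this on each of our two assemblies against the NCBI sequence would give a count of homopolymer-associated gaps per assembly and show whether the positions overlap between the runs.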
So, does anybody know what is going on? What do we need to fix?
Thanks,
Rick