I have a paired-end bacterial genome data set. With help from some of you, I was able to liberate the sequence data from the 454 .sff file.
What assembly engine would work best? I want a "second opinion" to compare against what gsAssembler gives.
This is a small bacterial genome: less than 1.2 million bases. The sequence coverage is a little difficult to measure, but I would estimate it at around 50X. The data also contain some contaminating eukaryotic DNA, though at very low coverage (maybe 0.01X at most).
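For a rough check on the coverage figure, total read bases divided by the expected genome size gives a first approximation. The sketch below assumes a plain FASTA file of reads; the function name `estimate_coverage` and the file name in the usage note are hypothetical. It counts every base in the file, so contaminant reads and any untrimmed linker sequence will inflate the number slightly.

```python
def estimate_coverage(fasta_lines, genome_size):
    """Return total read bases divided by genome_size (a rough coverage estimate)."""
    total_bases = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines; count sequence characters only.
        if line and not line.startswith(">"):
            total_bases += len(line)
    return total_bases / genome_size
```

Usage would look like `estimate_coverage(open("reads.fasta"), 1_200_000)`, where `reads.fasta` stands in for whatever file was extracted from the .sff.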
Running phrap directly on the .fasta and .qual files produced a very large number of contigs and scaffolds. When I reduced all the quality values to half their original values, a single large scaffold emerged, but there were still twice as many contigs as gsAssembler gives.
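The quality-halving step can be sketched roughly as follows. This assumes the usual .qual layout (FASTA-style `>` header lines followed by whitespace-separated integer scores) and uses integer division, so odd values round down; the function name `halve_qual_lines` is hypothetical, and this is not necessarily how the halving was done originally.

```python
import sys


def halve_qual_lines(lines):
    """Yield .qual lines with every quality score halved (rounded down)."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">") or not line.strip():
            # Header and blank lines pass through unchanged.
            yield line
        else:
            yield " ".join(str(int(q) // 2) for q in line.split())


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python halve_qual.py in.qual out.qual
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for out_line in halve_qual_lines(src):
            dst.write(out_line + "\n")
```

The matching .fasta file is left untouched, since only the scores change.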
Any advice?
--
Phillip