Hi all,
I'm a bit stuck on assembling a bacterial genome with PacBio and Illumina data.
We have around 15x PacBio CLR data (average read length of 9 kb) and 0.1x CCS data. In addition, we have some older Illumina paired-end 50 bp data (100x coverage) from the GAIIx.
My problems so far:
- The coverage of the CCS data is too low to use for error correction with Celera Assembler.
- The Illumina reads are too short for error correction, as a read length of 100 bp is suggested.
- Assemblers (like MIRA and Celera Assembler) require error-corrected PacBio data, but I can't correct my data for the reasons above.
Does anyone have any suggestions on how to assemble these datasets? Are there tools that do not require error-corrected PacBio data, or tools that can error-correct the PacBio data with the short Illumina reads?
Kind regards,
Boetsie