SEQanswers

Old 10-04-2012, 04:04 AM   #1
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
De novo assembly of PacBio with short Illumina data

Hi all,

I'm a bit stuck on assembling a bacterial genome with PacBio and Illumina data.

We have around 15x PacBio CLR data (average read length of 9kb) and 0.1x CCS data. In addition, we have some older Illumina paired-end 50bp data (100x coverage) from the GAIIx.

My problems so far:
- The coverage of the CCS data is too low to use for error correction with Celera Assembler.
- The Illumina reads are too short for error correction, as a read length of 100bp is suggested.
- Assemblers (like MIRA and Celera Assembler) require error-corrected PacBio data, but I can't correct my data for the reasons above.
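For reference, the coverage figures above follow the usual relation: fold coverage = total sequenced bases / genome size. A minimal Python sketch of that arithmetic is below. The genome size of 5 Mb is an assumption chosen only for illustration (the actual size is not stated here), and the function names are hypothetical helpers, not part of any assembler.

```python
import math


def fold_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Expected fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size


def reads_needed(target_cov: float, mean_read_len: float, genome_size: int) -> int:
    """Smallest number of reads that reaches the target fold coverage."""
    return math.ceil(target_cov * genome_size / mean_read_len)


# ASSUMPTION: 5 Mb is a placeholder bacterial genome size, not a value from this thread.
GENOME_SIZE = 5_000_000

# Reads required for each dataset at the coverages and read lengths quoted above.
clr_reads = reads_needed(15, 9_000, GENOME_SIZE)    # 15x CLR, 9kb average
illumina_reads = reads_needed(100, 50, GENOME_SIZE)  # 100x Illumina, 50bp

print(clr_reads, illumina_reads)
print(fold_coverage(illumina_reads, 50, GENOME_SIZE))
```

Plugging in the real genome size and read counts gives a quick check on whether a dataset reaches the coverage that a given correction step expects.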

Does anyone have any suggestions on how to assemble these datasets? Are there tools that do not require error corrected PacBio data, or are there tools that can error correct the PacBio data with the short Illumina dataset?

Kind regards,
Boetsie
Old 10-06-2012, 01:35 PM   #2
jbingham
Member
 
Location: Silicon Valley

Join Date: Jul 2011
Posts: 24

Here are 3 approaches you can try. All require installing SMRT Analysis.

First, with PacBio's Allora assembler, you can likely get a reasonable assembly without using the Illumina data. Just feed in the PacBio long reads and CCS reads and see what comes out. The coverage is a bit low, but you'll make some progress at least.

Alternatively, you can assemble the Illumina data separately and see whether you get any larger contigs out of it. If you do, use RS_AHA_Scaffolding to join the contigs with the PacBio long reads.

Another option, even though 50bp Illumina reads probably aren't long enough to map reliably for an assembly: use the RS_Allora_Assembly_EC protocol in PacBio's software, with the Illumina reads for error correction (EC).
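For the second approach, one rough way to judge whether the Illumina-only assembly produced "larger contigs" is to look at the contig N50. The Python sketch below is only an illustration: it assumes the contigs are available as FASTA lines, and the function names are hypothetical helpers, not part of SMRT Analysis or any assembler.

```python
from typing import Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths: List[int] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            lengths.append(0)           # start a new record
        elif lengths:
            lengths[-1] += len(line)    # extend the current record
    return lengths


def n50(contig_lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # no contigs


# Toy input; in practice, pass an open file handle for the contigs FASTA.
example = [">c1", "ACGTACGT", "ACGT", ">c2", "ACGT"]
print(fasta_lengths(example), n50(fasta_lengths(example)))
```

What counts as large enough for scaffolding depends on the genome and the tool, so treat the N50 as a rough guide rather than a threshold.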