SEQanswers

03-12-2014, 03:14 PM   #1
Geneious
Registered Vendor
 
Location: New Zealand

Join Date: Jul 2010
Posts: 22
Building a Circular de novo Assembler

Geneious developer Matt Kearse has written a blog post about how he built the circular de novo assembler for the R7 update. Hopefully some of you will find it an interesting read.

http://blog.geneious.com/blog/bid/37...novo-Assembler

Quote:
There are two approaches I could have taken to add circular contig support. The simple approach is at the end of the assembly process to circularize any contigs whose ends look sufficiently similar. The second more complex approach is to allow contigs to circularize during the assembly process and still allow similar sequences and contigs to merge into the circular contigs later. This approach is more robust and more likely to produce correct results. For example if we have two related species present in a data set, the ends of the temporary linear contigs may be sufficiently similar to merge into a large incorrect linear contig. But if we circularize during the assembly process, instead of merging they'll correctly circularize first.
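As a rough illustration of the first, simpler approach described in the quote (circularizing a finished contig whose ends look sufficiently similar), here is a minimal Python sketch. It is not the Geneious implementation: the function names `end_overlap` and `circularize`, the `min_overlap` threshold, and the use of exact matching are all assumptions made for illustration. A real assembler would need to tolerate read errors when comparing the ends.

```python
def end_overlap(contig: str, min_overlap: int = 20) -> int:
    """Return the length of the longest suffix of `contig` that exactly
    matches its prefix, or 0 if no overlap of at least `min_overlap` exists.

    Hypothetical helper: exact matching only, no error tolerance.
    """
    # Cap the overlap at half the contig so the two ends cannot overlap each other.
    max_len = len(contig) // 2
    for length in range(max_len, min_overlap - 1, -1):
        if contig[:length] == contig[-length:]:
            return length
    return 0


def circularize(contig: str, min_overlap: int = 20):
    """Return (sequence, is_circular).

    When the ends overlap, the duplicated end is trimmed so that each base
    of the circular sequence appears exactly once.
    """
    length = end_overlap(contig, min_overlap)
    if length == 0:
        return contig, False
    return contig[:-length], True


if __name__ == "__main__":
    genome = "ACGTTGCAAGGCTTAACCGGATCGATTACAGGCTATCGGA"
    # A linear contig assembled from a circular genome repeats its start at its end.
    contig = genome + genome[:8]
    print(circularize(contig, min_overlap=5))
```

The second approach in the quote would apply the same kind of test while contigs are still being merged rather than once at the end, which this sketch does not attempt to model.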
03-13-2014, 06:32 AM   #2
TiborNagy
Senior Member
 
Location: Budapest

Join Date: Mar 2010
Posts: 329

Looks interesting. But how does this algorithm handle more than one circular contig? (For example, a bacterial genome and its plasmids.)
03-13-2014, 01:31 PM   #3
Matt Kearse
Member
 
Location: New Zealand

Join Date: Mar 2014
Posts: 20

The algorithm can produce multiple circular contigs, because each contig may circularize independently.

As a quick confirmation I downloaded a random sample of 100 viral genomes, 24 of which are circular. I generated simulated data from them all and mixed it all together.
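For readers who want to try a similar experiment, simulating reads from a mix of linear and circular genomes could be sketched roughly as below. This is not the procedure used for the test described here: the function name `simulate_reads`, the read length, the read count, and the absence of any error model are assumptions for illustration. The one point it is meant to show is that reads from a circular genome can span the arbitrary start position of the reference sequence.

```python
import random


def simulate_reads(genome: str, circular: bool, read_len: int = 50,
                   n_reads: int = 100, seed: int = 0) -> list:
    """Sample error-free, fixed-length reads from one genome.

    Hypothetical helper: no sequencing errors, no reverse complements.
    """
    if read_len > len(genome):
        raise ValueError("read_len must not exceed the genome length")
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        if circular:
            # Any start position is valid; doubling the sequence lets a read
            # wrap around from the end of the genome back to its start.
            start = rng.randrange(len(genome))
            reads.append((genome + genome)[start:start + read_len])
        else:
            # A linear genome only yields reads that fit before its end.
            start = rng.randrange(len(genome) - read_len + 1)
            reads.append(genome[start:start + read_len])
    return reads


def mix_reads(read_sets, seed: int = 0) -> list:
    """Pool reads from several genomes and shuffle them into one data set."""
    pooled = [read for reads in read_sets for read in reads]
    random.Random(seed).shuffle(pooled)
    return pooled
```

Pooling the output of `simulate_reads` for each genome with `mix_reads` gives a single shuffled data set with no record of which genome each read came from.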

De novo assembly of this mixed data produced 106 contigs, 6 of which were tiny contigs consisting of reads with errors. The other 100 contigs matched the original genomes perfectly, apart from a 2 bp uncertainty due to read errors in 1 genome. 77 contigs were linear and 23 were circular, in keeping with the original genomes except that 1 of the 24 circular genomes failed to circularize due to insufficient coverage.
Tags
circular, denovo assembly, plasmid

All times are GMT -8.

