SEQanswers

Old 06-30-2015, 05:47 AM   #1
PatrickV
Junior Member
 
Location: France

Join Date: Jun 2015
Posts: 2
PacBio Amplicon reads assembly

I've got a question regarding the assembly of PacBio reads. We've created a library of approximately 5000 different amplicons between 1 and 3 kb and successfully ran these on a couple of flow cells. Next, we ran the RS_ReadsOfInsert protocol and demultiplexed the data with the corresponding barcodes.
The next step is to align/assemble these reads to each other to build contigs from multiple reads mapping to the same consensus. However, the majority of the tools that I come across (HGAP, Quiver) are designed for (de novo) genome assembly, which is not what we are aiming for. We "just" want to align/assemble the demultiplexed PacBio reads and build contigs of 1-3 kb.
What tool would be the best to perform this job?
Old 06-30-2015, 02:14 PM   #2
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 322

I'm not sure I understand. Are the 1-3 kb amplicons tiled, and are you trying to assemble a longer sequence? If not, the quality filter in the reads_of_insert protocol can give you 99.9% accurate amplicon reads. Or are you trying to cluster the amplicon sequences?
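
For reference, 99.9% accuracy corresponds to a Phred quality of about QV30. A minimal Python sketch of that conversion, plus a rough per-read estimate, is below. It assumes Phred+33 quality strings and estimates a read's accuracy as one minus the mean per-base error probability; the function names are made up for illustration, and this is not necessarily how the reads_of_insert protocol computes its predicted accuracy.

```python
import math


def phred_to_accuracy(qv):
    """Convert a Phred quality value to an accuracy between 0 and 1."""
    return 1.0 - 10 ** (-qv / 10.0)


def accuracy_to_phred(accuracy):
    """Convert an accuracy between 0 and 1 (exclusive) to a Phred quality value."""
    return -10.0 * math.log10(1.0 - accuracy)


def read_accuracy(quality_string, offset=33):
    """Estimate read accuracy from a FASTQ quality string.

    Assumption: ASCII offset 33 and accuracy = 1 - mean per-base error.
    """
    if not quality_string:
        raise ValueError("empty quality string")
    errors = [10 ** (-(ord(ch) - offset) / 10.0) for ch in quality_string]
    return 1.0 - sum(errors) / len(errors)
```

A read could then be kept only if `read_accuracy(qual) >= 0.999`, which mirrors the idea of a 99.9% filter without claiming to reproduce the protocol's own filter.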
Old 07-01-2015, 12:01 AM   #3
PatrickV
Junior Member
 
Location: France

Join Date: Jun 2015
Posts: 2

Quote:
Originally Posted by rhall
I'm not sure I understand. Are the 1-3 kb amplicons tiled, and are you trying to assemble a longer sequence? If not, the quality filter in the reads_of_insert protocol can give you 99.9% accurate amplicon reads. Or are you trying to cluster the amplicon sequences?
No, the amplicons are not tiled. After generating the reads, I would indeed like to cluster the reads from the same amplicon together.
Old 07-01-2015, 12:47 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I wrote a tool for clustering PacBio reads of insert. It does not generate a consensus, but it will output the single highest-quality read per cluster... or, you can generate a consensus from the clusters, if you have a good consensus-generation tool. For my application, the single best read was much better than the consensus, which tended to be chimeric.

Syntax:

dedupe.sh in=ros.fq csf=stats.txt outbest=best.fq qin=33 am=f ac=f fo c rnc=f mcs=2 k=27 mo=1400 cc pto nam=4 e=26 pattern=cluster_%.fq

I've found those specific settings to be extremely good for 16S sequences, which are ~1500 bp long. But if you have variable-size amplicons, you may need to first bin them by size and use different "mo" (min overlap) and "e" (max edit distance) settings for the individual bins.

Dedupe is part of the BBTools package.
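
The size-binning step mentioned above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the input is a plain 4-line-per-record FASTQ with no blank lines, the 500 bp bin width is an arbitrary choice, and the function names and the `bin_<start>.fq` output naming are hypothetical, not part of BBTools.

```python
from collections import defaultdict


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line, which we treat as the end)
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def bin_by_length(records, bin_width=500):
    """Group records by read length; keys are the lower bound of each bin."""
    bins = defaultdict(list)
    for header, seq, qual in records:
        lower = (len(seq) // bin_width) * bin_width
        bins[lower].append((header, seq, qual))
    return bins


def write_bins(bins, prefix="bin"):
    """Write one FASTQ file per length bin and return the file paths."""
    paths = []
    for lower in sorted(bins):
        path = f"{prefix}_{lower}.fq"
        with open(path, "w") as out:
            for header, seq, qual in bins[lower]:
                out.write(f"{header}\n{seq}\n+\n{qual}\n")
        paths.append(path)
    return paths
```

For example, `write_bins(bin_by_length(read_fastq(open("ros.fq"))))` would produce one file per 500 bp length range, and each file could then be passed to dedupe.sh with "mo" and "e" values chosen for that size range.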
Old 07-01-2015, 09:30 AM   #5
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 322

To generate a consensus, I would use something like Brian's clustering tool above (USEARCH and CD-HIT are other options), then generate a reference from the best cluster representatives and use it in a Quiver resequencing job. This approach works best if the diversity is limited and each cluster represents a single sequence rather than several closely related ones. Otherwise, as pointed out above, a representative single-molecule consensus (at ~QV30) may be more useful than a heterogeneous multi-molecule Quiver consensus.
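
As a rough sketch of the "generate a reference from the best cluster representatives" step, the FASTQ of best reads (best.fq in the dedupe.sh command above) could be converted to a FASTA reference in Python like this. It assumes a plain 4-line-per-record FASTQ; the function name, the 60-column wrapping, and the fallback `read_N` names are illustrative choices, and setting up the Quiver resequencing job itself is not shown.

```python
def fastq_to_fasta(fastq_handle, fasta_handle, line_width=60):
    """Write each FASTQ record as a FASTA entry; return the number written."""
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header.strip():  # end of file
            break
        seq = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line
        fastq_handle.readline()  # quality line, not needed for FASTA
        # Keep only the first whitespace-delimited token as the sequence name.
        fields = header.strip()[1:].split()
        name = fields[0] if fields else f"read_{count + 1}"
        fasta_handle.write(f">{name}\n")
        for start in range(0, len(seq), line_width):
            fasta_handle.write(seq[start:start + line_width] + "\n")
        count += 1
    return count
```

Typical use would be something like `fastq_to_fasta(open("best.fq"), open("reference.fasta", "w"))`, where `reference.fasta` is just a placeholder filename.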

Tags
amplicon, pacbio, sequencing
