  • PacBio Amplicon reads assembly

    I have a question about assembling PacBio reads. We created a library of approximately 5,000 different amplicons between 1 and 3 kb and successfully ran it on a couple of flow cells. We then ran the RS_ReadsOfInsert protocol and demultiplexed the data with the corresponding barcodes.
    The next step is to align/assemble these reads to each other, building contigs from the multiple reads that map to the same consensus. However, most of the tools I come across (HGAP, Quiver) are designed for (de novo) genome assembly, which is not what we are aiming for. We "just" want to align/assemble the demultiplexed PacBio reads and build contigs of 1-3 kb.
    Which tool would be best for this job?

  • #2
    I'm not sure I understand. Are the 1-3 kb amplicons tiled, and are you trying to assemble a longer sequence? If not, the quality filter in the reads_of_insert protocol can generate amplicon reads with 99.9% accuracy. Or are you trying to cluster the amplicon sequences?
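As a rough illustration of what a 99.9% accuracy cutoff means for reads that have already been exported to FASTQ, here is a minimal sketch that estimates accuracy from Phred scores. It assumes Phred+33 encoding and approximates accuracy as one minus the mean per-base error probability; the function names and the threshold default are hypothetical, and this is not necessarily how the reads_of_insert protocol computes its own predicted accuracy.

```python
def predicted_accuracy(quality: str, offset: int = 33) -> float:
    """Estimate read accuracy as 1 minus the mean per-base error probability.

    Assumes Phred-scaled qualities stored with the given ASCII offset.
    """
    if not quality:
        raise ValueError("empty quality string")
    # Phred score Q corresponds to an error probability of 10^(-Q/10).
    errors = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return 1.0 - sum(errors) / len(errors)


def passes_filter(quality: str, min_accuracy: float = 0.999) -> bool:
    """Return True if the estimated accuracy meets the (assumed) cutoff."""
    return predicted_accuracy(quality) >= min_accuracy
```

A read whose bases are all Q40 has an estimated accuracy of 0.9999 and passes, while a read whose bases are all Q20 (0.99) does not.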


    • #3
      Originally posted by rhall
      I'm not sure I understand. Are the 1-3 kb amplicons tiled, and are you trying to assemble a longer sequence? If not, the quality filter in the reads_of_insert protocol can generate amplicon reads with 99.9% accuracy. Or are you trying to cluster the amplicon sequences?
      No, the amplicons are not tiled. After generating the reads, I would indeed like to cluster reads from the same amplicon together.


      • #4
        I wrote a tool for clustering PacBio reads of insert. It does not generate a consensus, but it outputs the single highest-quality read per cluster. Alternatively, you can generate a consensus from the clusters if you have a good consensus-generation tool. For my application, the single best read was much better than the consensus, which tended to be chimeric.

        Syntax:

        dedupe.sh in=ros.fq csf=stats.txt outbest=best.fq qin=33 am=f ac=f fo c rnc=f mcs=2 k=27 mo=1400 cc pto nam=4 e=26 pattern=cluster_%.fq

        I've found those specific settings to work extremely well for 16S sequences, which are ~1500 bp long. If you have variable-size amplicons, you may need to bin them by size first and use different "mo" (min overlap) and "e" (max edit distance) settings for each bin.
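The binning step suggested above could be sketched roughly as follows. This is a minimal example assuming plain four-line FASTQ records; the function names, the 500 bp bin width, and the output file prefix are all hypothetical choices, not part of BBTools.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def bin_by_length(reads: Iterable[Read], bin_width: int = 500) -> Dict[int, List[Read]]:
    """Group reads by length; each key is the lower bound of its bin in bp."""
    bins: Dict[int, List[Read]] = defaultdict(list)
    for read in reads:
        lower = (len(read[1]) // bin_width) * bin_width
        bins[lower].append(read)
    return dict(bins)


def write_bins(bins: Dict[int, List[Read]], prefix: str = "bin") -> List[str]:
    """Write one FASTQ file per bin (e.g. bin_1000.fq) and return the paths."""
    paths = []
    for lower in sorted(bins):
        path = f"{prefix}_{lower}.fq"
        with open(path, "w") as fh:
            for header, seq, qual in bins[lower]:
                fh.write(f"{header}\n{seq}\n+\n{qual}\n")
        paths.append(path)
    return paths
```

Each resulting file can then be passed to dedupe.sh separately, with "mo" and "e" chosen to suit the read lengths in that bin.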

        Dedupe is part of the BBTools package.


        • #5
          To generate a consensus, I would use something like Brian's clustering tool above (USEARCH and CD-HIT are other options), then build a reference from the best cluster representatives and use it in a Quiver resequencing job. This approach works best when diversity is limited and each cluster represents a single sequence rather than a group of closely related sequences. In the latter case, as pointed out above, a representative single-molecule consensus (at ~QV30) may be more useful than a heterogeneous multi-molecule Quiver consensus.
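Building the reference from the cluster representatives mostly comes down to converting the best reads into a FASTA file. A minimal sketch, assuming the representatives have already been read into (name, sequence) pairs; the function name and the "cluster_rep_N" naming scheme are hypothetical, and importing the resulting file as a reference for the resequencing job is not covered here.

```python
from typing import Iterable, Tuple


def to_reference_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as FASTA text, one entry per representative.

    Entries are renamed cluster_rep_1, cluster_rep_2, ... with the original
    read name kept as the description, and sequences are wrapped at `width`.
    """
    lines = []
    for index, (name, seq) in enumerate(records, start=1):
        lines.append(f">cluster_rep_{index} {name}")
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "".join(line + "\n" for line in lines)
```

Writing the returned text to a file gives a multi-entry FASTA with one sequence per cluster.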
