  • criggs
    Junior Member
    • Dec 2013
    • 3

    how to merge de novo transcriptome assemblies

    Hi all,

    I am working on building de novo transcriptome assemblies using Trinity. In the future, I would like to merge assemblies, so that I do not have to start my analysis from scratch each time I do more sequencing. (I am sequencing multiple stages of the organism I work on, and cannot do all the sequencing at once). Is there a way to merge assemblies from Trinity? Or to add to an existing assembly in Trinity, creating multiple iterations? My goal is to be able to combine de novo assemblies, or add to an existing assembly, without losing the original contigs (and therefore downstream analysis).

    Thanks for any advice!
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    I suppose you could run the iterative assembly using the previous assembly as a 'reference genome', but I'm not sure how well this would work. As far as I know, Trinity is best run de novo each time, since you should discover new lowly expressed transcripts that way.


    • nareshvasani
      Member
      • Apr 2013
      • 57

      #3
      Merging two de novo assemblies generated using Trinity

      Hi Criggs!

      I am also trying to merge assemblies that were generated using Trinity.
      After merging them, I found two transcripts with the same ID in the merged assembly.
      Could you please tell me how you merged your assemblies?

      I would really appreciate your input.

      Thanks in advance.
      naresh
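A note on the duplicate IDs described above: one plausible cause, assuming the two FASTA files were simply concatenated, is that each Trinity run numbers its transcripts independently, so the same ID can appear in both assemblies. The following is a minimal Python sketch of one workaround, prefixing every header with a label for its source assembly. The function name and the labels `asm1` and `asm2` are hypothetical, and this is not a Trinity feature.

```python
def merge_with_prefixes(assemblies):
    """Concatenate FASTA texts, prefixing each sequence ID with its assembly label.

    `assemblies` maps a label (hypothetical, e.g. "asm1") to FASTA text.
    Raises ValueError if a header is empty or an ID is still duplicated.
    """
    seen = set()
    out = []
    for label, fasta_text in assemblies.items():
        for line in fasta_text.splitlines():
            if line.startswith(">"):
                # Header line: ">ID optional description"
                parts = line[1:].split(None, 1)
                if not parts:
                    raise ValueError("FASTA header without an ID")
                new_id = f"{label}_{parts[0]}"
                if new_id in seen:
                    raise ValueError(f"duplicate ID after prefixing: {new_id}")
                seen.add(new_id)
                description = f" {parts[1]}" if len(parts) > 1 else ""
                out.append(f">{new_id}{description}")
            elif line.strip():
                # Sequence line: copied through unchanged
                out.append(line)
    return "\n".join(out) + "\n"
```

Because the original ID is kept after the prefix, each merged transcript can still be traced back to the assembly it came from, which matters if downstream analysis refers to the original contigs.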







      • gringer
        David Eccles (gringer)
        • May 2011
        • 845

        #4
        Using the previous assembly as a reference probably wouldn't be a great idea, because as far as I know Trinity will drop any sequences that can't be produced from the reference genome.

        I've been using minimus2 (from AMOS) to merge transcriptome assemblies, combining the merged contigs with singletons, but it's difficult to determine how good that merged assembly is.

        What I'd really like is a more generic "take these long sequences and generate consensus contigs" program, which would help for PacBio / MinION sequencing as well.


        • Brian Bushnell
          Super Moderator
          • Jan 2014
          • 2709

          #5
          I made a tool related to this, called Dedupe, available with BBTools. Unlike minimus, it does not merge overlapping contigs together; therefore it cannot introduce misassemblies, but it also won't usually produce as small a combined assembly. In practice, we use it before or instead of minimus because it is much faster and more stable, able to handle very large assemblies that cause minimus to fail.

          Dedupe ensures that there is at most one copy of any input sequence, optionally allowing containments (substrings) to be removed, and a variable Hamming or edit distance to be specified. Usage:

          dedupe.sh in=assembly1.fa,assembly2.fa out=merged.fa

          That will absorb exact duplicates and containments. You can use the "hdist" and "edist" flags to allow mismatches, or get a complete list of flags by running the shell script with no arguments.
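To make "absorb exact duplicates and containments" concrete, here is a small Python sketch of the idea. It is only an illustration under simplifying assumptions, not Dedupe's implementation: it compares every sequence against every kept one (quadratic time), checks the forward strand only, and allows no mismatches. The function name is hypothetical.

```python
def absorb_duplicates_and_containments(seqs):
    """Drop sequences that equal, or are substrings of, a sequence already kept.

    `seqs` maps a sequence name to its sequence string.
    Illustration only: forward strand, exact matching, O(n^2) comparisons.
    """
    # Sort longest first so a sequence is only ever tested against
    # sequences at least as long as itself. The sort is stable, so among
    # identical sequences the first one in the input is the one kept.
    ordered = sorted(seqs.items(), key=lambda item: len(item[1]), reverse=True)
    kept = []
    for name, seq in ordered:
        s = seq.upper()
        # An exact duplicate is just a containment of equal length.
        if any(s in kept_seq for _, kept_seq in kept):
            continue
        kept.append((name, s))
    return dict(kept)
```

Note that nothing is joined: two contigs that merely overlap at their ends are both kept, which is why this kind of approach cannot create a chimeric contig but also leaves a larger combined assembly than an overlap-merging tool would.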


          • nareshvasani
            Member
            • Apr 2013
            • 57

            #6
            Hi Gringer,

            Thanks for your quick response.
            What I did was use the existing assembly as a reference. I extracted the reads that did not map to it and built a new, smaller assembly from those unmapped reads using Trinity.

            Finally, I merged the two assemblies for better coverage.


            Thanks,
            naresh
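The read-extraction step in this workflow can be sketched in Python as follows, assuming the reads were aligned to the existing assembly with an aligner that writes SAM output (the post does not say which aligner was used) and that the reads are single-end. In SAM, bit 0x4 of the FLAG field marks a read as unmapped. The function name is hypothetical.

```python
def unmapped_reads_to_fastq(sam_lines):
    """Yield FASTQ records for reads flagged as unmapped in SAM text lines.

    Assumes single-end reads; paired-end data would also need mate
    handling, which this sketch leaves out.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x4:  # 0x4 = read is unmapped
            yield f"@{qname}\n{seq}\n+\n{qual}\n"
```

The resulting FASTQ records are the input for the second, smaller Trinity run described above; the contigs from that run are then combined with the original assembly.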



            • nareshvasani
              Member
              • Apr 2013
              • 57

              #7
              Thanks Brian for your prompt reply.

              Naresh


