SEQanswers

Old 12-03-2013, 11:54 AM   #1
criggs
Junior Member
 
Location: Oregon

Join Date: Dec 2013
Posts: 3
How to merge de novo transcriptome assemblies

Hi all,

I am working on building de novo transcriptome assemblies using Trinity. In the future, I would like to merge assemblies, so that I do not have to start my analysis from scratch each time I do more sequencing. (I am sequencing multiple stages of the organism I work on, and cannot do all the sequencing at once). Is there a way to merge assemblies from Trinity? Or to add to an existing assembly in Trinity, creating multiple iterations? My goal is to be able to combine de novo assemblies, or add to an existing assembly, without losing the original contigs (and therefore downstream analysis).

Thanks for any advice!
Old 12-04-2013, 06:51 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

I suppose you could run the assembly iteratively, using the previous assembly as a 'reference genome', but I'm not sure how well this would work. As far as I know, Trinity is best run de novo each time, since that way you should discover new lowly expressed transcripts.
Old 09-08-2014, 08:35 AM   #3
nareshvasani
Member
 
Location: NC

Join Date: Apr 2013
Posts: 57
Merge 2 de novo assemblies generated using Trinity!

Hi Criggs!

I am also trying to merge assemblies that were generated using Trinity.
After merging them, I found that some transcript IDs appeared twice in the merged assembly.
Could you please tell me how you merged your assemblies?

I would really appreciate your input.

Thanks in advance.
naresh
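
A note on the repeated transcript IDs: separate Trinity runs may reuse the same header names, so concatenating two outputs can produce clashing IDs. Below is a minimal sketch, assuming plain FASTA input, that prefixes every header with a per-assembly label before merging. The function names `prefix_fasta_ids` and `merge_with_prefixes`, the labels, and the toy records are all hypothetical and not part of Trinity.

```python
def prefix_fasta_ids(lines, label):
    """Yield FASTA lines with `label` prepended to each sequence ID."""
    for line in lines:
        if line.startswith(">"):
            # Header line: insert the label right after the '>' marker.
            yield ">" + label + "_" + line[1:]
        else:
            # Sequence line: pass through unchanged.
            yield line


def merge_with_prefixes(assemblies):
    """Concatenate several assemblies, tagging each ID with its source label.

    `assemblies` maps a label (e.g. "stage1") to an iterable of FASTA lines.
    """
    merged = []
    for label, lines in assemblies.items():
        merged.extend(prefix_fasta_ids(lines, label))
    return merged


if __name__ == "__main__":
    # Toy records standing in for two Trinity output files (hypothetical data).
    first = [">comp0_c0_seq1 len=8\n", "ACGTACGT\n"]
    second = [">comp0_c0_seq1 len=6\n", "TTGACC\n"]
    for line in merge_with_prefixes({"stage1": first, "stage2": second}):
        print(line, end="")
```

With unique prefixes in place, the original contig names stay recoverable after any later merging or deduplication step.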






Old 09-08-2014, 03:21 PM   #4
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838

Using the previous assembly as a reference probably wouldn't be a great idea, because as far as I know Trinity will drop any sequences that can't be produced from the reference genome.

I've been using minimus2 (from AMOS) to merge transcriptome assemblies, combining the merged contigs with singletons, but it's difficult to determine how good that merged assembly is.

What I'd really like is a more generic "take these long sequences and generate consensus contigs" program, which would help for PacBio / MinION sequencing as well.
Old 09-08-2014, 03:48 PM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I made a tool related to this, called Dedupe, available with BBTools. Unlike minimus, it does not merge overlapping contigs together; therefore it cannot introduce misassemblies, but it also won't usually produce as small a combined assembly. In practice, we use it before or instead of minimus because it is much faster and more stable, and it can handle very large assemblies that cause minimus to fail.

Dedupe ensures that there is at most one copy of any input sequence, optionally allowing containments (substrings) to be removed, and a variable Hamming or edit distance to be specified. Usage:

dedupe.sh in=assembly1.fa,assembly2.fa out=merged.fa

That will absorb exact duplicates and containments. You can use the "hdist" and "edist" flags to allow mismatches, or get a complete list of flags by running the shell script with no arguments.
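
To make "absorbing exact duplicates and containments" concrete, here is a toy sketch of the idea in Python. It is not Dedupe's implementation: it assumes exact, forward-strand matching only (no reverse complements, no mismatches), runs in quadratic time, and the function name `absorb_duplicates_and_containments` is hypothetical.

```python
def absorb_duplicates_and_containments(records):
    """Keep one copy of each sequence and drop any sequence that is an
    exact substring of a longer kept sequence.

    `records` is a list of (name, sequence) tuples.
    Assumptions: forward strand only, exact matches only, toy-sized input.
    """
    kept = []
    # Process longest sequences first, so a containing sequence is always
    # seen before the shorter sequences it contains.
    for name, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        if any(seq in kept_seq for _, kept_seq in kept):
            continue  # exact duplicate or substring of a kept sequence
        kept.append((name, seq))
    return kept


if __name__ == "__main__":
    contigs = [
        ("a1", "ACGTACGT"),  # kept
        ("b1", "ACGTACGT"),  # exact duplicate of a1
        ("b2", "GTAC"),      # contained in a1
        ("b3", "TTTT"),      # unrelated, kept
    ]
    for name, seq in absorb_duplicates_and_containments(contigs):
        print(name, seq)
```

Because nothing is joined or extended, every surviving sequence is one of the original contigs, which is why this style of merging cannot create a misassembly.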
Old 09-09-2014, 06:06 AM   #6
nareshvasani
Member
 
Location: NC

Join Date: Apr 2013
Posts: 57

Hi Gringer,

Thanks for your quick response.
What I did was use the existing assembly as a reference. I extracted whichever reads did not map to that reference and built a new, small assembly from those unmapped reads using Trinity.

Finally, I merged the two assemblies for better coverage.


Thanks,
naresh
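
The read-extraction step in this workflow can be sketched as follows. This is only an illustration under stated assumptions: the new reads have already been aligned to the existing assembly by some aligner that writes SAM, the FASTQ file uses plain 4-line records, and read names match between the two files (no `/1` or `/2` suffix differences). The function names are hypothetical; the alignment and the Trinity run itself are not shown.

```python
def unmapped_read_names(sam_lines):
    """Return names of reads whose SAM FLAG has the 'unmapped' bit (0x4) set."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if int(fields[1]) & 0x4:  # FLAG is the second column
            names.add(fields[0])
    return names


def filter_fastq(fastq_lines, wanted):
    """Yield the 4-line FASTQ records whose read name is in `wanted`."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Read name = header text after '@', up to the first whitespace.
            name = record[0][1:].split()[0]
            if name in wanted:
                yield from record
            record = []


if __name__ == "__main__":
    # Toy alignment: r1 maps to a contig, r2 is unmapped (hypothetical data).
    sam = [
        "@HD\tVN:1.0\n",
        "r1\t0\tcontig1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII\n",
    ]
    fastq = ["@r1\n", "ACGT\n", "+\n", "IIII\n",
             "@r2\n", "GGCC\n", "+\n", "IIII\n"]
    for line in filter_fastq(fastq, unmapped_read_names(sam)):
        print(line, end="")
```

The filtered reads would then go into a fresh Trinity run, and the resulting small assembly can be combined with the original one.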


Old 09-09-2014, 06:06 AM   #7
nareshvasani
Member
 
Location: NC

Join Date: Apr 2013
Posts: 57

Thanks Brian for your prompt reply.

Naresh


All times are GMT -8.

