SEQanswers

Old 01-30-2011, 03:46 PM   #1
Niharika
Junior Member
 
Location: Melbourne

Join Date: Jan 2011
Posts: 2
de novo transcriptome assembly

Hi,

I am new to the field of NGS data analysis. I have just started working on de novo transcriptome assembly and came across many available assemblers, such as SOAPdenovo, Velvet and ABySS.

Which assembler is best for paired-end Illumina sequencing data (90 bp reads)?

How can I choose between K-mer lengths?

Your answers and help would be appreciated.
Old 01-30-2011, 06:15 PM   #2
rwenang
Member
 
Location: Singapore

Join Date: Jan 2009
Posts: 31

Try Trans-ABySS or Oases. They are more specialized for transcriptome assembly than the genome assemblers (SOAPdenovo, Velvet, ABySS).
Old 01-30-2011, 06:55 PM   #3
Niharika
Junior Member
 
Location: Melbourne

Join Date: Jan 2011
Posts: 2

Thank you rwenang.

Can anybody also tell me how to set k-mer lengths for de novo transcriptome assembly, and how N50 is calculated?
Old 01-31-2011, 09:43 AM   #4
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

Hello Niharika,

I have been doing something similar with paired-end Solexa data (2 x 75 nt). We are using Oases, which is part of the Velvet pipeline. This is what you need to do: (i) run an assembly with Velvet, keeping the read tracking option on, and (ii) run Oases on the Velvet result for transcriptome assembly. Both steps are explained in the Oases manual.
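The two steps above could be scripted roughly as follows. This is only a sketch: it builds the command lines without running them, the helper name `velvet_oases_commands`, the directory name and the insert length are made up for illustration, and the flags are as I remember them from the Velvet and Oases manuals (an interleaved paired-end FASTQ file is assumed), so check them against your installed versions.

```python
def velvet_oases_commands(out_dir, k, reads_fastq, insert_length):
    """Build (but do not run) the command lines for one Velvet + Oases run.

    Hypothetical helper: all argument values are supplied by the caller.
    """
    # The Velvet manual asks for an odd hash length; reject even values here.
    if k % 2 == 0:
        raise ValueError("k-mer length should be odd")
    return [
        # Hash the reads into k-mers (interleaved paired-end FASTQ assumed).
        ["velveth", out_dir, str(k), "-fastq", "-shortPaired", reads_fastq],
        # Step (i): Velvet assembly with read tracking switched on.
        ["velvetg", out_dir, "-read_trkg", "yes"],
        # Step (ii): Oases on the Velvet output directory.
        ["oases", out_dir, "-ins_length", str(insert_length)],
    ]


if __name__ == "__main__":
    for cmd in velvet_oases_commands("asm_k21", 21, "reads.fq", 200):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once you are happy with the parameters.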

For my data, I tried a few different k-mer lengths and settled on K=21, which gave the best N50. You also need to keep the available memory in mind, because it limits how many k-mer lengths you can experiment with. Oases uses a lot more RAM than Velvet, and Velvet itself needs a lot of memory.
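For reference, N50 can be computed from the contig lengths alone. The sketch below uses the usual definition (the length L such that contigs of length L or more hold at least half of the assembled bases) and then picks the k with the highest N50 from a set of assemblies. The function names and the toy lengths are illustrative only, and N50 is just the criterion mentioned here, not necessarily the best one for a transcriptome.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First contig at which the running total reaches half the bases.
        if running >= half:
            return length
    return 0


def best_k_by_n50(lengths_by_k):
    """Given {k: [contig lengths]}, return the k whose assembly has the highest N50."""
    return max(lengths_by_k, key=lambda k: n50(lengths_by_k[k]))


if __name__ == "__main__":
    # Toy contig lengths, not real data.
    assemblies = {21: [500, 400, 300], 25: [900, 100, 100], 31: [300, 300, 300]}
    for k, lengths in assemblies.items():
        print(k, n50(lengths))
    print("best k:", best_k_by_n50(assemblies))
```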

Good luck,
Manoj

P. S.

1. SOAPdenovo is for genome assembly. It cannot do transcriptomes, as far as I know.
2. ABySS is a parallel version of Velvet. So, Trans-ABySS is equivalent to Oases. However, I would recommend trying Velvet first, because the parallel installation of ABySS requires some more effort.

---------------------

http://homolog.us

Last edited by samanta; 01-31-2011 at 09:47 AM.
Old 02-02-2011, 12:03 PM   #5
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260

Quote:
Originally Posted by samanta View Post


2. ABySS is a parallel version of velvet. So, trans-ABySS is equivalent to OASES.
To whomever it may concern:

I am afraid you are obviously wrong here.

ABySS is not a parallel version of Velvet.




ABySS paper in Genome Research (2009)
http://genome.cshlp.org/content/19/6/1117

Velvet paper in Genome Research (2008)
http://genome.cshlp.org/content/18/5/821.long

Trans-ABySS paper in Nature Methods (2010)
http://www.nature.com/nmeth/journal/...meth.1517.html
Old 02-02-2011, 02:06 PM   #6
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

I should have said that ABySS implements a parallel version of the de Bruijn graph, whereas Velvet is a single-node de Bruijn assembler, but we are splitting hairs here.

Let's hear from the authors of the papers you quoted -


Velvet paper -

"We have developed a new set of algorithms, collectively called “Velvet,” to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words"



Abyss paper -

"The field of short read de novo assembly developed from pioneering work on de Bruijn graphs by Pevzner et al. (Pevzner and Tang 2001; Pevzner et al. 2001). The de Bruijn graph representation is prevalent in current short read assemblers, with Velvet (Zerbino and Birney 2008), ALLPATHS (Butler et al. 2008), and EULER-SR (Chaisson and Pevzner 2008) all following this approach."

"To assemble the very large data sets produced by sequencing individual human genomes, we have developed ABySS (Assembly By Short Sequencing). The primary innovation in ABySS is a distributed representation of a de Bruijn graph, which allows parallel computation of the assembly algorithm across a network of commodity computers." [emphasis mine]
__________________
http://homolog.us
Old 02-03-2011, 07:05 AM   #7
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260

Quote:
Originally Posted by samanta View Post
I should have said that ABySS implements a parallel version of the de Bruijn graph, whereas Velvet is a single-node de Bruijn assembler, but we are splitting hairs here.
I agree with you that these two programs implement a similar algorithmic approach to genome assembly using de Bruijn graphs.

But saying that "ABySS is a parallel version of Velvet" is false and undervalues the work done over the years by the numerous researchers in that very field.



The use of paired-end reads in Velvet is described in a PLoS ONE paper (2009).
http://www.plosone.org/article/info:...l.pone.0008407

For ABySS, I think the contigs are merged according to a threshold on the number of bridging pairs.
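To make the idea of a threshold on bridging pairs concrete, here is a small sketch. It is not ABySS code and does not claim to match its actual rules. It assumes each read pair has already been mapped to a contig per mate, counts the pairs that link two different contigs, and keeps only the links supported by at least `min_pairs` pairs. The names `supported_links`, `pair_hits` and `min_pairs` are made up for the example.

```python
from collections import Counter


def supported_links(pair_hits, min_pairs):
    """Return {(contig_a, contig_b): count} for links with enough bridging pairs.

    pair_hits: iterable of (contig_of_mate1, contig_of_mate2) tuples.
    """
    counts = Counter()
    for a, b in pair_hits:
        # Pairs with both mates in the same contig bridge nothing.
        if a != b:
            # Sort so that (c1, c2) and (c2, c1) count as the same link.
            counts[tuple(sorted((a, b)))] += 1
    return {link: n for link, n in counts.items() if n >= min_pairs}


if __name__ == "__main__":
    hits = [("c1", "c2"), ("c2", "c1"), ("c1", "c1"), ("c2", "c3"), ("c1", "c2")]
    print(supported_links(hits, min_pairs=3))
```

With the toy input above, only the c1/c2 link survives a threshold of 3, since c2/c3 is supported by a single pair.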

Quote:
Originally Posted by samanta View Post
Let's hear from the authors of papers you quoted -


Velvet paper -

"We have developed a new set of algorithms, collectively called “Velvet,” to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words"
Precisely !

The said manipulation of these graphs is what makes Velvet so popular !

Furthermore, you can read Dr. Zerbino's PhD thesis to fully grasp the concepts he created for manipulating de Bruijn graphs.

Genome assembly and comparison using de Bruijn graphs
http://www.ebi.ac.uk/training/ftp/Ph...el_Zerbino.pdf

The novelty, I think, is the use of long read markers and short read markers.
(Sections 2.3.4 & 2.3.5 of his thesis)

Quote:
Originally Posted by samanta View Post
Abyss paper -

"The field of short read de novo assembly developed from pioneering work on de Bruijn graphs by Pevzner et al. (Pevzner and Tang 2001; Pevzner et al. 2001). The de Bruijn graph representation is prevalent in current short read assemblers, with Velvet (Zerbino and Birney 2008), ALLPATHS (Butler et al. 2008), and EULER-SR (Chaisson and Pevzner 2008) all following this approach."
Same thing here. Professor Pavel Pevzner introduced the use of the de Bruijn graph in 2001. In the EULER papers, Eulerian paths are used to traverse the de Bruijn graph in order to obtain an assembly.
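As a toy illustration of that idea (not the EULER implementation), the sketch below builds a de Bruijn graph in which nodes are (k-1)-mers and each distinct k-mer is an edge, then walks an Eulerian path with Hierholzer's algorithm and spells out the sequence. It assumes error-free reads, ignores reverse complements, and assumes an Eulerian path exists. All function names are illustrative.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Return one (prefix, suffix) edge per distinct k-mer found in the reads."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return [(kmer[:-1], kmer[1:]) for kmer in sorted(kmers)]


def eulerian_path(edges):
    """Hierholzer's algorithm; assumes a non-empty edge list with an Eulerian path."""
    out = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for u, v in edges:
        out[u].append(v)
        balance[u] += 1
        balance[v] -= 1
    # Start at the node with one surplus outgoing edge, else anywhere (cycle).
    start = next((n for n, b in balance.items() if b == 1), edges[0][0])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if out[node]:
            stack.append(out[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Join the nodes of a path into the sequence they spell."""
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    edges = de_bruijn_edges(["ACGTT", "GTTCA"], k=3)
    print(spell(eulerian_path(edges)))
```

Real assemblers spend most of their effort on what this sketch leaves out: sequencing errors, repeats, and both strands.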


So this cited paragraph highlights the importance of the de Bruijn graph representation, not how this graph is processed to yield an assembly.

Quote:
Originally Posted by samanta View Post
"To assemble the very large data sets produced by sequencing individual human genomes, we have developed ABySS (Assembly By Short Sequencing). The primary innovation in ABySS is a distributed representation of a de Bruijn graph, which allows parallel computation of the assembly algorithm across a network of commodity computers." [emphasis mine]
I think the true innovation of this paper is not only the distributed de Bruijn graph, but also a working assembler that generates contigs for a human genome.
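To give a rough feel for what a distributed de Bruijn graph means, here is a sketch in which each k-mer is assigned to a worker by hashing it, with a k-mer and its reverse complement landing on the same worker. This is my own simplification, not ABySS code: the hash function (MD5 via `hashlib`), the worker count and the names `canonical`, `owner` and `partition` are all assumptions for the example.

```python
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def owner(kmer, n_workers):
    """Deterministically map a k-mer (either strand) to a worker index."""
    digest = hashlib.md5(canonical(kmer).encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


def partition(kmers, n_workers):
    """Split k-mers into one set of canonical k-mers per worker."""
    shards = [set() for _ in range(n_workers)]
    for kmer in kmers:
        shards[owner(kmer, n_workers)].add(canonical(kmer))
    return shards


if __name__ == "__main__":
    shards = partition(["ACG", "CGT", "GGA"], n_workers=4)
    print([sorted(shard) for shard in shards])
```

Because the owner of any k-mer can be computed from its sequence alone, a worker can work out where a neighbouring k-mer lives without a central lookup table.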

Cheers !

-seb
Old 02-03-2011, 08:35 AM   #8
samanta
Senior Member
 
Location: Seattle

Join Date: Feb 2010
Posts: 109

Thank you... I fully agree with what you said. I tend to get sloppy in my message board comments.
__________________
http://homolog.us
Old 02-07-2011, 05:29 AM   #9
moritzhess
Member
 
Location: freiburg

Join Date: Apr 2010
Posts: 25

SOAPdenovo has also been used for transcriptome assembly:

"De Novo Analysis of Transcriptome Dynamics in the Migratory Locust during the Development of Phase Traits"

I would also recommend a paper about Trans-ABySS. It explains what the trans- add-on does:
"De novo assembly and analysis of RNA-Seq Data"

In my experience, ABySS is far less demanding in terms of memory.

Tags
assemblers, assembly, assembly quality, denovo assembly
