VELVET or ABYSS for Transcriptome

geschickten

Member

Join Date: Jul 2009

Posts: 31
#1

VELVET or ABYSS for Transcriptome

08-03-2009, 10:46 AM

We are planning to use ABySS and Velvet for de novo assembly of transcriptome data. Could the group share their experience with either tool? How do the two compare, and which is the best tool available for assembling transcriptome data? Thank you...
yvan.wenger

Member

Join Date: Aug 2009

Posts: 30
#2

08-10-2009, 07:59 AM

De novo transcriptome assembly?

Hello,

I am wondering whether you received any reply to your question about the best tool for transcriptome assembly. I am about to evaluate the tools, but our draft genome seems to favor assemblies that produce short contigs, because the genome itself is fragmented into roughly 130,000 (genomic) contigs. As a consequence, the assembly that maps best to the genome is one with short contigs; longer assembled contigs would jump from one genomic contig to another, because the genomic contigs are quite short.

As N50 does not seem to be a good metric for transcriptomes, I was wondering what other measures or procedures could be used to rank the different assemblies. I also noted that both correct and wrong contigs can be found in all assemblies, and that they often differ between assemblies (for example, you can find a correct contig that is represented only in an otherwise rather "bad" assembly). Given this, I am wondering if anybody in this forum has seen data on alternative methods to obtain good contigs without a good genome. For instance, I just had another look at the ABySS paper (De novo Transcriptome Assembly with ABySS, Birol et al., Bioinformatics Advance Access published June 15 2009) and saw that they still assess their transcriptome assembly using the human genome. As an alternative, I am thinking of merging several assemblies, comparing the contigs that merge together (if any), perhaps keeping a contig only if it appears in at least two different assemblies, and so on... but all of this remains to be done.
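The "keep a contig only if it appears in at least two different assemblies" idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions: "appears in" is taken to mean exact containment on either strand, whereas real data would need an aligner that tolerates mismatches. The function names, assembly labels and toy sequences are all made up.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def supported_contigs(assemblies, min_support=2):
    """Keep contigs that occur in at least `min_support` assemblies.

    `assemblies` maps an assembly label to a list of contig strings.
    A contig counts as present in another assembly if it, or its reverse
    complement, is an exact substring of one of that assembly's contigs.
    """
    kept = []
    for name, contigs in assemblies.items():
        for contig in contigs:
            rc = reverse_complement(contig)
            support = 1  # the assembly the contig came from
            for other, other_contigs in assemblies.items():
                if other == name:
                    continue
                if any(contig in c or rc in c for c in other_contigs):
                    support += 1
            if support >= min_support:
                kept.append((name, contig))
    return kept


# Toy input: three hypothetical assemblies of the same reads.
toy = {
    "a1": ["ATGGCGTACG", "TTTTTTTTTT"],
    "a2": ["CCATGGCGTACGAA"],
    "a3": ["CGTACGCCAT"],  # reverse complement of a1's first contig
}
for name, contig in supported_contigs(toy):
    print(name, contig)
```

Note that containment is not symmetric: the longer a2 contig is dropped here because no other assembly contains it in full, so a real merge step would also have to decide which representative of an overlapping group to keep.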

Any thoughts on all that? Or otherwise, is there a forum dedicated to this topic?

Best,

Yvan

Original message:
We are planning to use ABySS and Velvet for de novo assembly of transcriptome data. Could the group share their experience with either tool? How do the two compare, and which is the best tool available for assembling transcriptome data? Thank you...
geschickten

Member

Join Date: Jul 2009

Posts: 31
#3

08-10-2009, 08:54 PM

Yvan,

Well, I haven't received any replies from the forum. I must admit I am new to the world of genomics, so I may not be able to comment on your observations.

I have not come across any forum dedicated to this topic.

Do let me know about your evaluation and if required we can even take this offline...
wjeck

Member

Join Date: Mar 2009

Posts: 39
#4

08-13-2009, 08:48 AM

all,

I believe that the short answer is: The proper tools are not publicly available yet. There is a wrong way to do this: assembling transcriptome data like it's genomic, and a right way: yet to be determined. I'm looking for pretty much the same thing and I can't seem to find it. The primary problem with assembling transcriptome data like it's genomic is that most transcriptome data sets have some genomic contamination, and they have alternative splicings. Both of these facts run counter to the assumptions of the genome assemblers, in which there is no alternative splicing (or at most two haplotype alternatives). If anyone is thinking about working on new assemblers for these new data sets PM me; I'm very interested in exploring the topic and maybe sitting down to write one.
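To make the alternative-splicing point concrete, here is a small Python sketch of how two isoforms of one gene create a branch in a k-mer (de Bruijn style) graph. A genome assembler would tend to read such a branch as a repeat or an error. The exon sequences, the value of k and the function name are invented for illustration and do not come from any real assembler.

```python
from collections import defaultdict


def branch_points(sequences, k):
    """Return k-mers that are followed by more than one distinct k-mer.

    Each sequence is walked one base at a time; consecutive k-mers
    (overlapping by k - 1 bases) are linked as node -> successor.
    """
    successors = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k):
            successors[seq[i:i + k]].add(seq[i + 1:i + k + 1])
    return {kmer: sorted(nxt) for kmer, nxt in successors.items() if len(nxt) > 1}


# Hypothetical exons; isoform 2 skips exon B.
exon_a, exon_b, exon_c = "ATGGCGTACG", "TTAGGCATCC", "GACTGATTAA"
isoform_1 = exon_a + exon_b + exon_c
isoform_2 = exon_a + exon_c

for kmer, nxt in branch_points([isoform_1, isoform_2], k=5).items():
    print(kmer, "->", nxt)
```

The last k-mer of exon A is followed by exon B in one transcript and by exon C in the other, so the graph forks at that point even though both paths are biologically correct.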

Cheers,
--Will
geschickten

Member

Join Date: Jul 2009

Posts: 31
#5

08-13-2009, 08:59 AM

Will,

I am willing to work on this and if you are okay then we can work together to design/develop an assembler for transcriptome data!

prahalad
NSTbioinformatics

Member

Join Date: Apr 2009

Posts: 24
#6

08-14-2009, 12:28 AM

I have used Velvet and ABySS to assemble genomic sequences from Illumina reads. However, Velvet runs very slowly and cannot process 36,507,944 reads x 36 nt plus 95,398,944 reads x 76 nt on a computer with 32 GB of memory; it stopped because of a memory problem. I don't know how to solve this.
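For scale, the size of this data set can be worked out directly from the two read sets quoted above. The short Python snippet below only does that arithmetic; the variable names are arbitrary and it makes no claim about how much memory Velvet needs per base.

```python
# (number of reads, read length in nt) for the two runs mentioned above
read_sets = [(36_507_944, 36), (95_398_944, 76)]

total_reads = sum(n for n, _ in read_sets)
total_bases = sum(n * length for n, length in read_sets)

print(f"{total_reads:,} reads, {total_bases:,} bases")
```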

According to the paper De novo Transcriptome Assembly with ABySS (Birol et al.), ABySS can assemble shotgun and paired-end runs together. I am wondering how this works. The ABySS manual only shows how to assemble shotgun runs and paired-end runs separately.

I would like to hear from others about this.
geschickten

Member

Join Date: Jul 2009

Posts: 31
#7

08-14-2009, 12:37 AM

NST, do you think you can share this paper, "De novo Transcriptome Assembly with ABySS, Birol et al."?

-p
jts

Member

Join Date: Feb 2009

Posts: 22
#8

08-14-2009, 01:09 AM

Hi NSTbioinformatics,

If you post the details of your problem on the abyss-users mailing list (http://www.bcgsc.ca/mailman/listinfo/abyss-users) Shaun Jackman or I can help you set up abyss for your data set. You will be able to assemble both single-end and paired-end reads in the same run but some care must be taken when choosing the assembly parameters.

Regards,
Jared Simpson
kmcarr

Senior Member

Join Date: May 2008

Posts: 1180
#9

08-14-2009, 05:08 AM

Here are the references for Abyss:

Simpson et al. ABySS: A parallel assembler for short read sequence data. Genome Res (2009) vol. 19 (6) pp. 1117-23

Birol et al. De novo Transcriptome Assembly with ABySS. Bioinformatics (2009) pp.
jnfass

Member

Join Date: Aug 2008

Posts: 88
#10

08-17-2009, 09:02 AM

Originally posted by NSTbioinformatics

I have used Velvet and ABySS to assemble genomic sequences from Illumina reads. However, Velvet runs very slowly and cannot process 36,507,944 reads x 36 nt plus 95,398,944 reads x 76 nt on a computer with 32 GB of memory; it stopped because of a memory problem. I don't know how to solve this.

According to the paper De novo Transcriptome Assembly with ABySS (Birol et al.), ABySS can assemble shotgun and paired-end runs together. I am wondering how this works. The ABySS manual only shows how to assemble shotgun runs and paired-end runs separately.

I would like to hear from others about this.

I've done velvet assemblies with > 100M reads (some paired-end) on a 512G machine ... yes, it does take a lot of memory ... but I'd be interested in hearing if ABySS is any better. My understanding is that these assemblers like to have the whole assembly graph in memory at once, and that's the roadblock to assembling in smaller RAM spaces (though, I've seen a few comments from people working on parallelizing one or the other program).

Before I had access to a large memory machine, I ran the single ended assembly first, then used those contigs as "long" reads to add to an assembly of the paired reads.

Velvet can definitely do single and paired reads together, and if you change a parameter before compiling, you can have an unlimited number of different paired read sets, each with different insert lengths.
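The two-stage workaround described above (assemble the single-end reads first, then feed the resulting contigs back in as "long" reads alongside the paired reads) might look roughly like the following. This Python sketch only builds and prints the command lines; it does not run Velvet. The file names, directory names, k value and insert length are placeholders, and the read-category flags should be checked against the manual of the Velvet version in use.

```python
import shlex


def two_stage_velvet_commands(single_fq, paired_fq, k=31, ins_length=200,
                              stage1_dir="asm_single", stage2_dir="asm_paired"):
    """Return velveth/velvetg command lines for a two-stage assembly.

    Stage 1 assembles the single-end reads alone. Stage 2 reuses the
    stage-1 contigs as long reads together with the paired-end reads.
    """
    stage1_contigs = f"{stage1_dir}/contigs.fa"  # velvetg writes contigs here
    return [
        ["velveth", stage1_dir, str(k), "-fastq", "-short", single_fq],
        ["velvetg", stage1_dir],
        ["velveth", stage2_dir, str(k),
         "-fasta", "-long", stage1_contigs,
         "-fastq", "-shortPaired", paired_fq],
        ["velvetg", stage2_dir, "-ins_length", str(ins_length)],
    ]


# Placeholder file names; print the commands instead of executing them.
for cmd in two_stage_velvet_commands("single.fq", "paired.fq"):
    print(shlex.join(cmd))
```

Additional paired libraries with different insert lengths would need further read categories, which as far as I know is the build-time setting mentioned above; the exact option name is best confirmed in the Velvet manual.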
Zigster

Jeremy Leipzig

Join Date: May 2009

Posts: 116
#11

08-20-2009, 04:03 PM

Originally posted by NSTbioinformatics

However, Velvet runs very slowly and cannot process 36,507,944 reads x 36 nt plus 95,398,944 reads x 76 nt on a computer with 32 GB of memory; it stopped because of a memory problem. I don't know how to solve this.

I have had better luck running Velvet at longer k-mers and, to a lesser extent, with higher coverage cutoffs. This seems counter-intuitive, given that there are 16x as many possible k-mers of length 31 as of length 29 (4^31 / 4^29 = 16), but velvetg is much more likely to hit the wall at the shorter k-mers.
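The 16x figure, and why it is only an upper bound, can be checked with a short Python sketch. The number of possible k-mers grows as 4^k, but a graph can only contain k-mers that actually occur in the reads, and a read of length L yields at most L - k + 1 of them. The helper names below are made up, and this is an illustration of k-mer counting in general, not a statement about Velvet's internals or its memory use.

```python
def kmer_space(k):
    """Number of possible DNA k-mers of length k."""
    return 4 ** k


def kmers_per_read(read_length, k):
    """Maximum number of k-mers a single read can contribute."""
    return max(read_length - k + 1, 0)


def count_distinct_kmers(reads, k):
    """Count distinct k-mers actually observed in a collection of reads."""
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return len(seen)


print(kmer_space(31) // kmer_space(29))                 # ratio of possible k-mers
print(kmers_per_read(72, 29), kmers_per_read(72, 31))   # per 72 bp read
print(count_distinct_kmers(["ACGTACGT"], 3))            # observed in a toy read
```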

I recently did a de novo transcriptome assembly of 100,425,440 72bp paired end reads totaling over 7,034,311,658 bp on a 256G machine but could not get below kmer 29 without crashing.

Fortunately velvet now accepts very large kmer lengths, so I would try those before giving up.

--
Jeremy Leipzig
Bioinformatics Programmer
beelu

Junior Member

Join Date: Mar 2008

Posts: 7
#12

10-05-2009, 05:59 AM

Hi jnfass and Zigster, how did you build your machines up to 512G/256G? How many CPUs do you have, and what's your RAM-to-core ratio? Thanks.

Beelu
Zigster

Jeremy Leipzig

Join Date: May 2009

Posts: 116
#13

10-05-2009, 06:08 AM

We use a Dell Poweredge something-or-other with 4 X7350 (16 cores total)

--
Jeremy Leipzig
Bioinformatics Programmer
geschickten

Member

Join Date: Jul 2009

Posts: 31
#14

10-05-2009, 10:02 AM

Originally posted by jnfass

I've done velvet assemblies with > 100M reads (some paired-end) on a 512G machine ... yes, it does take a lot of memory ... but I'd be interested in hearing if ABySS is any better. My understanding is that these assemblers like to have the whole assembly graph in memory at once, and that's the roadblock to assembling in smaller RAM spaces (though, I've seen a few comments from people working on parallelizing one or the other program).

Before I had access to a large memory machine, I ran the single ended assembly first, then used those contigs as "long" reads to add to an assembly of the paired reads.

Velvet can definitely do single and paired reads together, and if you change a parameter before compiling, you can have an unlimited number of different paired read sets, each with different insert lengths.

Hi jnfass,

Can you please share some information on who is working on parallelizing assemblers? Also, kindly point to some good open-source parallel assemblers if you know of any. Thank you.
geschickten

Member

Join Date: Jul 2009

Posts: 31
#15

10-05-2009, 10:06 AM

Originally posted by Zigster

I have had better luck running Velvet at longer k-mers and, to a lesser extent, with higher coverage cutoffs. This seems counter-intuitive, given that there are 16x as many possible k-mers of length 31 as of length 29 (4^31 / 4^29 = 16), but velvetg is much more likely to hit the wall at the shorter k-mers.

I recently did a de novo transcriptome assembly of 100,425,440 72bp paired end reads totaling over 7,034,311,658 bp on a 256G machine but could not get below kmer 29 without crashing.

Fortunately velvet now accepts very large kmer lengths, so I would try those before giving up.

Hi Zigster,

Can you please share the exact configuration of the machine that you used for this run? Also, what is your take on running this in the cloud, if somebody offered you that option? Would you go for it?