#1
Senior Member
Location: USA Join Date: Jan 2008
Posts: 482
Hi,
Does anyone have an idea of how much memory Velvet requires for a given input of short reads, and how that scales with more or longer reads? Also, are there other 'large dataset' de novo assembly tools for Illumina reads? SOAP says 100 GB of RAM for human-sized genomes. Are there other options, and what would their memory requirements be? Thanks for sharing.
#2
Senior Member
Location: USA Join Date: Jan 2008
Posts: 482
To answer part of it myself, this is a useful source on the Velvet mailing list:
http://listserver.ebi.ac.uk/pipermai...ne/000359.html
The gist is:
RAM required for velvetg = -109635 + 18977*ReadSize + 86326*GenomeSize + 233353*NumReads - 51092*K
This gives the answer in kb.
ReadSize is in bases.
GenomeSize is in millions of bases (Mb).
NumReads is in millions.
K is the k-mer hash value used in velveth.
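As a minimal sketch, the formula quoted above can be wrapped in a small Python function. The function name, argument names and the example inputs are assumptions made for illustration only; the coefficients are the ones given in the post.

```python
def estimate_velvetg_ram_kb(read_size, genome_size_mb, num_reads_millions, k):
    """Estimated velvetg RAM in kb, using the coefficients quoted in the post.

    read_size: read length in bases
    genome_size_mb: genome size in millions of bases
    num_reads_millions: number of reads in millions
    k: k-mer hash value passed to velveth
    """
    return (-109635
            + 18977 * read_size
            + 86326 * genome_size_mb
            + 233353 * num_reads_millions
            - 51092 * k)


# Hypothetical inputs, chosen only to show the arithmetic.
ram_kb = estimate_velvetg_ram_kb(read_size=36, genome_size_mb=5,
                                 num_reads_millions=10, k=31)
print(f"{ram_kb} kb, roughly {ram_kb / 1024**2:.2f} GB")
```

Because the number of reads has the largest coefficient, the estimate is most sensitive to how many reads go in.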
#3
Senior Member
Location: The University of Melbourne, AUSTRALIA Join Date: Apr 2008
Posts: 275
The formula above, derived by Simon Gladman, comes with a caveat: it only applies to Velvet compiled with the default MAXKMERLENGTH=31. If you compiled with a larger value, 63 for example, memory usage will increase.
#4
Senior Member
Location: SEA Join Date: Nov 2009
Posts: 203
So what happens when the machine doesn't have enough RAM?
Does it give an error, or just proceed very, very slowly? Would having a large enough swap partition help?
#5
(Jeremy Leipzig)
Location: Philadelphia, PA Join Date: May 2009
Posts: 116
It will segfault, but sometimes it will lock up a machine so badly that you will have to physically pull the plug.
I suggest using ulimit. For example, I have a 256 GB machine and run ulimit -v 240000000 (the value is in kilobytes) before every run.
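The same idea can be sketched from a Python wrapper script, assuming a Linux machine. This is not part of Velvet: `run_capped` and `ulimit_kb_to_bytes` are hypothetical helper names, and the sketch relies only on the standard-library `resource` and `subprocess` modules. Unlike `ulimit -v`, it lowers only the soft limit of the child process.

```python
import resource
import subprocess
import sys


def ulimit_kb_to_bytes(limit_kb):
    """bash's `ulimit -v` counts in units of 1024 bytes; setrlimit expects bytes."""
    return limit_kb * 1024


def run_capped(cmd, limit_kb):
    """Run cmd with its virtual address space capped, similar to `ulimit -v`."""
    limit_bytes = ulimit_kb_to_bytes(limit_kb)

    def apply_cap():
        # Runs in the child just before exec, so the parent shell is unaffected.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = limit_bytes
        else:
            soft = min(limit_bytes, hard)  # never ask for more than the hard limit
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=apply_cap, check=False)


if __name__ == "__main__":
    # Harmless demonstration: the child only prints the soft limit it inherited.
    report = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
    run_capped([sys.executable, "-c", report], limit_kb=8 * 1024 * 1024)
```

For an actual run, the command list would name velvetg and an assembly directory instead, for example `run_capped(["velvetg", "assembly_dir"], 240000000)`, where `assembly_dir` is a placeholder. A process that hits the cap fails its allocation instead of dragging the whole machine into swap.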
#6
Senior Member
Location: Worcester, MA Join Date: Oct 2009
Posts: 133
We've been needing approximately 30 GB of RAM for Velvet assembly, with a minimum of 24 GB, depending on the k-mer length specified. This is with single-end 36 bp Illumina data.
Last edited by jgibbons1; 01-06-2010 at 08:52 AM.
#7
Member
Location: Udine (Italy) Join Date: Jan 2009
Posts: 50
To assemble a lane of paired reads of length 75, we used 120 GB with a k-mer size of 47.
Obviously the amount of data decreases with a smaller k-mer, but a shorter k-mer implies a higher possibility of mistakes. In my opinion, as read length increases, tools like Velvet will become too memory-hungry to be practical. With a read length of 150, an approach like PCAP, ARACHNE and EDENA, which build an overlap graph rather than a de Bruijn graph, is the only feasible option.
#8
Senior Member
Location: USA Join Date: Jan 2008
Posts: 482
Is it the human genome you are working on?
One approach is to map the reads to a reference and assemble only the unmapped reads, though this can yield a pretty fragmented assembly that is hard to use in the end. Do you usually do things like remove contaminants or low-quality reads, or take only the unique set of reads? These can certainly reduce the run time, but last I looked, using a redundant set of reads gave a slightly different assembly than a non-redundant one.
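Taking only the unique set of reads could look something like the sketch below. It assumes reads are already parsed into (name, sequence, quality) tuples and treats two reads as duplicates only when their sequences match exactly; reverse complements and read pairing are ignored. `unique_reads` is a hypothetical helper, not a Velvet option.

```python
def unique_reads(records):
    """Yield the first record seen for each distinct read sequence.

    records: iterable of (name, sequence, quality) tuples.
    """
    seen = set()
    for name, seq, qual in records:
        if seq not in seen:
            seen.add(seq)
            yield name, seq, qual


# Tiny made-up example: read r2 repeats the sequence of r1 and is dropped.
reads = [
    ("r1", "ACGTACGT", "IIIIIIII"),
    ("r2", "ACGTACGT", "IIIIIIII"),
    ("r3", "TTGCAAGC", "IIIIIIII"),
]
kept = list(unique_reads(reads))
print([name for name, _, _ in kept])
```

Note that the `seen` set holds every distinct sequence in memory, so for very large read sets a sort-based or hash-based approach on disk may be the better fit.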
#9
Senior Member
Location: Worcester, MA Join Date: Oct 2009
Posts: 133
Sorry...I replied to the wrong thread.
Last edited by jgibbons1; 01-06-2010 at 03:05 PM. Reason: replied to the wrong thread
#10
Member
Location: india Join Date: Feb 2011
Posts: 16
I'm using Velvet for assembly and velvetg is consuming around 90% of my memory. Is there any way I can control this, say by threading or some other step?
#11
Senior Member
Location: Worcester, MA Join Date: Oct 2009
Posts: 133
I've found that one of the best ways to reduce the memory requirements is to quality filter your read set before assembly. Low-quality reads directly impact memory, because sequencing errors add spurious k-mers to the graph. Trimmomatic and Quake are both very good for quality filtering.
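To make the idea concrete, here is a deliberately simple filter that drops reads whose mean base quality falls below a threshold. It is not what Trimmomatic or Quake actually do; it assumes Phred+33 quality encoding and reads already parsed into (name, sequence, quality) tuples, and the function names and the threshold of 20 are illustrative choices.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def quality_filter(records, min_mean_q=20):
    """Yield only records whose mean base quality is at least min_mean_q.

    records: iterable of (name, sequence, quality) tuples.
    """
    for name, seq, qual in records:
        if qual and mean_phred(qual) >= min_mean_q:
            yield name, seq, qual


# Made-up reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
reads = [
    ("good", "ACGT", "IIII"),
    ("bad", "ACGT", "!!!!"),
]
passed = [name for name, _, _ in quality_filter(reads)]
print(passed)
```

Dedicated tools also trim low-quality ends and, in the case of Quake, correct errors rather than discard whole reads, which usually preserves more coverage than a whole-read filter like this one.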
#12
Member
Location: NY Join Date: Mar 2012
Posts: 35
Hi all,
I have tried to use Velvet to assemble my RNA-seq data. My machine has about 40 GB of RAM. The read length is about 101 for my dataset. There are about 60 million reads in each file of the paired-end data, about 120 million in total. However, when I try to assemble it, once it is past the ghost-threading step and begins threading through reads, it has already occupied about 65% of RAM, so I have to stop it. Can anyone give me some suggestions on how to reduce the memory usage? I have set the k-mer to 75 for my dataset.
Jingjing
#13
Senior Member
Location: Berlin Join Date: Jul 2011
Posts: 156
1. Get a machine with more RAM
2. Use shorter k-mers
3. Try to reduce complexity in your reads by using Quake or something similar
4. Subsample your reads
5. Use a different assembler
Velvet is known to be memory-hungry, therefore 1 is the best choice. However, if this isn't an option, you should at least try 2 (75 sounds very, very high) or 3, with 4 as the last resort, unless you want to try a completely different assembler. CLC is very memory efficient but commercial...
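Option 4 can be sketched as follows for paired-end data. The key point is that both mates of a pair are kept or dropped together, so the two files stay in sync. `subsample_pairs`, the fixed seed and the 25% fraction are assumptions for illustration; the pairs are assumed to be already read into memory or streamed as (read1, read2) tuples.

```python
import random


def subsample_pairs(pairs, fraction, seed=0):
    """Keep each (read1, read2) pair with probability `fraction`.

    A fixed seed makes the subsample reproducible between runs.
    """
    rng = random.Random(seed)
    for pair in pairs:
        if rng.random() < fraction:
            yield pair


# Made-up pair identifiers standing in for real FASTQ records.
pairs = [(f"read{i}/1", f"read{i}/2") for i in range(1000)]
subset = list(subsample_pairs(pairs, fraction=0.25, seed=42))
print(len(subset), "of", len(pairs), "pairs kept")
```

The number of pairs kept is only approximately `fraction` times the input, since each pair is sampled independently; that is usually fine for reducing assembly memory.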
Tags: assemble, de novo, memory, soap, velvet