Dear Velvet users,
I am trying to do a hybrid assembly with ABI SOLiD and Roche 454 reads, and I am running into an out-of-memory / swap error.
We have a workstation with two 8-core Xeons, 48 GB RAM, and 24 GB swap, running the latest 64-bit Ubuntu Linux.
Read statistics:
SOLiD: mate pairs, ~37,000,000 reads, 2x50 nt
454: fragments, ~160,000 reads, ~400 nt
Predicted genome size: 7 Mb
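For scale, here is a rough velvetg RAM estimate using the empirical regression formula that has circulated on this list (a ballpark only; the constants come from that post, and counting both tags as ~74 M total reads is my own assumption):

    # Rough velvetg RAM estimate (empirical regression from the
    # velvet-users list; treat the result as a ballpark, not a guarantee).
    def velvetg_ram_gb(read_size_nt, genome_size_mb, reads_millions, k):
        kb = (-109635
              + 18977 * read_size_nt
              + 86326 * genome_size_mb
              + 233353 * reads_millions
              - 51092 * k)
        return kb / 1024.0 / 1024.0  # formula yields kB; convert to GB

    # ~37 M mate pairs = ~74 M reads of 50 nt, 7 Mb genome, k = 21:
    print("%.1f GB" % velvetg_ram_gb(50, 7, 74, 21))  # ~16.8 GB

By that estimate the job should fit comfortably in 48 GB, which makes the blow-up all the more puzzling.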
Steps:
1a. ./saet_mp solidreads_F3.csfasta solidreads_F3_QV.qual 7000000 -numcores 16 -globalrounds 2 -qvupdate -qvhigh -nosampling OK!!!
1b. ./saet_mp solidreads_R3.csfasta solidreads_R3_QV.qual 7000000 -numcores 16 -globalrounds 2 -qvupdate -qvhigh -nosampling OK!!!
1c. ./encodeFasta.py -l -n -a 454reads.fna > samplebacteria.de (colorizer from the Corona Lite package; see the double-encoding sketch after this list) OK!!!
2. ./solid_denovo_preprocessor_v1.2.pl --run_type mates --output preproced_dir_name --f3 solidreads_F3.csfasta --r3 solidreads_R3.csfasta OK!!!
3. ./velveth_de hashed/ 21 -shortPaired doubleEncoded_input.de -long samplebacteria.de OK!!!
4. ./velvetg_de hashed/ -exp_cov 20 -ins_length 2600 -min_contig_lgth 200 -cov_cutoff 2
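For reference, the double encoding in steps 1c and 2 just remaps color calls to pseudo-bases so that Velvet can treat color space like base space. A minimal sketch of the idea (my own illustration, not the actual encodeFasta.py code; it assumes the primer base and first color are dropped, since that first color is defined relative to the primer, not the read):

    # Double-encode a csfasta color read: map colors 0,1,2,3 -> A,C,G,T.
    DOUBLE = {"0": "A", "1": "C", "2": "G", "3": "T"}

    def double_encode(color_read):
        colors = color_read[2:]  # drop primer base + primer-dependent color
        return "".join(DOUBLE.get(c, "N") for c in colors)

    print(double_encode("T32002233"))  # hypothetical read -> "GAAGGTT"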
At step 4 I ran out of memory/swap and the process was automatically killed by the OS.
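One thing worth noting: back-of-the-envelope, the k-mer coverage implied by the read numbers above is far higher than the -exp_cov 20 I passed to velvetg (assuming all ~74 M reads map to the 7 Mb genome):

    # Expected k-mer coverage: C_k = C * (L - k + 1) / L, where C is
    # nucleotide coverage, L the read length, k the hash length.
    num_reads = 74_000_000    # ~37 M mate pairs, F3 + R3 tags (assumed)
    read_len = 50
    genome_size = 7_000_000
    k = 21

    c_nt = num_reads * read_len / genome_size      # ~529x nucleotide coverage
    c_k = c_nt * (read_len - k + 1) / read_len     # ~317x k-mer coverage
    print("nucleotide ~%.0fx, k-mer ~%.0fx" % (c_nt, c_k))

If those assumptions hold, -exp_cov 20 is more than an order of magnitude off, though I do not know whether that alone explains the memory use.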
The tail of the velvetg log (earlier lines omitted):
[3502.183999] 25532000 nodes visited
[3502.252314] Concatenation...
[3504.923849] Renumbering nodes
[3504.923874] Initial node count 13837915
[3505.706345] Removed 384358 null nodes
[3505.706369] Concatenation over!
[3505.706372] Clipping short tips off graph, drastic
[3515.809932] Concatenation...
[3560.510149] Renumbering nodes
[3560.510170] Initial node count 13453557
[3560.980761] Removed 3542018 null nodes
[3560.980778] Concatenation over!
[3560.980781] 9911539 nodes left
[3561.090558] Writing into graph file hashed//Graph2...
[6433.379904] Removing contigs with coverage < -2.000000...
[6521.066093] Concatenation...
[6573.221114] Renumbering nodes
[6573.221139] Initial node count 9911539
[6573.252071] Removed 0 null nodes
[6573.252101] Concatenation over!
[6573.691763] Concatenation...
[6575.770354] Renumbering nodes
[6575.770376] Initial node count 9911539
[6575.789992] Removed 0 null nodes
[6575.790011] Concatenation over!
[6575.790476] Clipping short tips off graph, drastic
[6576.074753] Concatenation...
[6578.174284] Renumbering nodes
[6578.174312] Initial node count 9911539
[6578.193785] Removed 0 null nodes
[6578.193809] Concatenation over!
[6578.193811] 9911539 nodes left
[6578.193971] Read coherency...
[6578.891643] Identifying unique nodes
[6579.250084] Done, 8196 unique nodes counted
[6579.250108] Trimming read tips
[6598.101153] Renumbering nodes
[6598.101177] Initial node count 9911539
[6605.287075] Removed 1511 null nodes
[6605.500090] Renumbering nodes
[6605.500105] Initial node count 9910028
[6605.520381] Removed 0 null nodes
[6605.520400] Confronted to 5 multiple hits and 15329 null over 16845
[6605.520403] Read coherency over!
[6610.547042] Starting pebble resolution...
[6610.910790] Preparing to correct graph with cutoff 0.200000
[6635.017686] Computing read to node mapping array sizes
Killed
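The "Killed" line above is presumably the kernel OOM killer at work; a minimal way to confirm after the fact (my own sketch, assuming Linux and permission to read the kernel ring buffer):

    # Scan the kernel ring buffer for OOM-killer messages.
    import subprocess

    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    for line in log.splitlines():
        if "oom" in line.lower() or "out of memory" in line.lower():
            print(line)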
Intermediate file sizes:
55 GB Graph2
2.7 GB Sequences
2.8 GB Roadmaps
533 MB PreGraph
What could be wrong? I have already done a successful hybrid assembly of this same dataset with the CLC Genomics Workbench, which produced ~200 long contigs with a total length of ~7 Mb.
Thank you for any ideas,
Blaize