**seb567** · 06-05-2012, 07:02 AM

Originally posted by Anelda

Hi there,

Do you have any news on the colourspace issue? We ran RAY today for the first time and were very impressed, except that we mostly deal with SOLiD data and would eventually need the contigs in base space :-)

Thanks!

Anelda

We have not worked on color space recently.

Ray can assemble color space reads, but will generate a double-encoded color space assembly.
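
For readers unfamiliar with the term, here is a minimal sketch of what translating a double-encoded sequence back to base space involves. It assumes the common convention that double encoding writes the colors 0, 1, 2, 3 as A, C, G, T, and that the base preceding the first color is known; neither assumption comes from this thread, and the function name is hypothetical. This is not Ray code.

```python
# Sketch only: decode a double-encoded color-space string into bases.
# Assumption: colors 0,1,2,3 are written as A,C,G,T (double encoding).
# Assumption: the base that precedes the first color is known.
DOUBLE_ENCODING = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def colors_to_bases(double_encoded: str, first_base: str) -> str:
    """Return the bases that follow first_base, one per color."""
    code = BASES.index(first_base)
    decoded = []
    for symbol in double_encoded:
        color = DOUBLE_ENCODING[symbol]
        # With A=0, C=1, G=2, T=3, a color is the XOR of two adjacent bases,
        # so the next base is the previous base XOR the color.
        code ^= color
        decoded.append(BASES[code])
    return "".join(decoded)


print(colors_to_bases("ACGT", "T"))
```

Because each base depends on the one before it, a single wrong color changes every base that follows, which is one reason this conversion is usually left to the end of the analysis.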

**seb567** · 06-05-2012, 07:04 AM

This was fixed on 2012-04-27.

A variable shadowing caused a compilation error on OS X using openmpi… · sebhtml/ray@53495ad

https://github.com/sebhtml/ray/commit/53495ad591bf0c9ac665595d5a1feef979767411

…cxx. This was reported by Adam Caldwell. This is fixed now.

v2.0.0-rc8 is quite stable too.

Originally posted by steph

Hi everyone,

I encountered a problem when trying to build the latest stable version of Ray (1.7) with the latest version of GCC (v4.7.0).

The problem occurred at the make step.

With GCC v4.7.0, I got the following errors:

Code:

code/communication/MessageProcessor.cpp: In member function 'void MessageProcessor::call_RAY_MPI_TAG_ASK_VERTEX_PATH(Message*)':
code/communication/MessageProcessor.cpp:1685:7: error: redeclaration of 'int i'
code/communication/MessageProcessor.cpp:1675:10: error: 'int i' previously declared here
make: *** [code/communication/MessageProcessor.o] Error

However, when I used GCC v4.1.2 (which was also installed on this machine) instead, the installation finished correctly.

**seb567** · 06-23-2012, 05:54 PM

Ray 2.0.0 released

Hello,

Ray 2.0.0 codenamed "Dark Astrocyte of Knowledge" is available for download.
This version ships with RayPlatform 1.0.3 codenamed "Gray Pylon of Wisdom".

Not much has changed since v2.0.0-rc8.

Ray 2.0.0 can do de novo assembly of metagenomes and also taxonomic profiling
with k-mers.

To get Ray v2.0.0:

Ray -- Parallel genome assemblies for parallel DNA sequencing

http://denovoassembler.sf.net

Also, there is a new section on the website for
frequently asked questions.

Changes in Ray between v2.0.0-rc8 and v2.0.0

commit 6adeef3d814dc2acbc32444ec3ed5a49a709e98c
Author: Sébastien Boisvert <[email protected]>
Date: Fri Jun 22 20:58:37 2012 -0400

This is Ray v2.0.0.

commit 2243df732615cb2419e81c57a233cb1ffd214583
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 21 20:27:15 2012 -0400

Floating numbers must not be stored with the integer type 'int'.

commit 4b4815772354ea402b0ce5a9500d66eed37be8d7
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 21 16:23:35 2012 -0400

This solves a division by 0.

commit d96047c2040bed3395ae36a13505608b617b3346
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 21 15:21:24 2012 -0400

This change set improves the fidelity of Ray when computing peak
coverage for short seeds (let's say that a short seed has a length
lower than 512 vertices).

This closes an old ticket.

maybe use another way for computePeak in seeds · Issue #47 · sebhtml/ray

https://github.com/sebhtml/ray/issues/47

but first, audit the XML file for alarming stuff.

commit 22d4ec0f29d31a659fe5c4791039dd2497264039
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 21 15:00:04 2012 -0400

The multiplicator used for spawning read helpers was changed from 2.0 to 1.5.
This should remove any assemblies and should not affect contiguity.

commit 048ba764c0424bb74f416276ef7f16cf463cc0fd
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 21 14:12:24 2012 -0400

The peak finder should detects simulated data as well.

commit 7102b16f1f282a0003f9982b42a3a744d9bffe59
Author: Sébastien Boisvert <[email protected]>
Date: Wed Jun 20 14:48:20 2012 -0400

A compilation warning was removed for an integer comparison.

commit dcc738984c61a478d7da699f19dfae4e2d1ec1f2
Author: Sébastien Boisvert <[email protected]>
Date: Wed Jun 20 14:32:23 2012 -0400

Routing strategies were updated.

commit 91052b31051fb58561231c0cd2f3bab720184b56
Author: Sébastien Boisvert <[email protected]>
Date: Wed Jun 20 11:12:34 2012 -0400

Patch information was updated.

commit 8c950405d1695c2ca691537c1eba341155c9731a
Author: Sébastien Boisvert <[email protected]>
Date: Wed Jun 20 09:53:36 2012 -0400

The release procedure was updated.

commit 95f458950e9856c93b2af120845899975b74851e
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 14 14:33:57 2012 -0400

A new option is available to disable read recycling.

commit 459f26f59076ffabe603f0d8b7163a69e2f51837
Author: Sébastien Boisvert <[email protected]>
Date: Thu Jun 14 14:10:11 2012 -0400

The options for using checkpointing features requires a
directory.

Changes in RayPlatform between v1.0.2 and v1.0.3:

commit 09517b6862d04743f64abc181de21b7d8c8b5dbd
Author: Sébastien Boisvert <[email protected]>
Date: Fri Jun 22 20:59:58 2012 -0400

This is the release of RayPlatform v1.0.3 codenamed
"Gray Pylon of Wisdom".

commit 86ddad8ee7b9cdbb6142561f38fd75e05e4622f2
Author: Sébastien Boisvert <[email protected]>
Date: Tue Jun 5 13:51:08 2012 -0400

Ray crashed sometimes when the number of processor cores was less or equal to 3.
This change fixes this. Ray can run of 1 processor core up to 4096 processor cores
at the moment with routing. Without routing, the maximum number of cores is larger.

Reported-by: krobinson#seqanswers.com
Reported-by: severin#seqanswers.com

Sébastien Boisvert
Granularity specialist/PhD student
Université Laval

**VidJa** · 06-26-2012, 06:56 AM

I'm impressed with Ray. However
what is a recommended method to find out the optimal k-mer size? just trial and error?
Typical datasets: --> 100bp paired end illumina (5-10mln pairs), bacterial genomes
Does anyone recommend a method to merge assemblies of different kmer size runs?

**seb567** · 06-26-2012, 07:38 AM

Originally posted by VidJa

I'm impressed with Ray. However
what is a recommended method to find out the optimal k-mer size? just trial and error?
Typical datasets: --> 100bp paired end illumina (5-10mln pairs), bacterial genomes

For Illumina(R) HiSeq(R) data, I usually just set the k-mer length to 31.

Originally posted by VidJa

Does anyone recommend a method to merge assemblies of different kmer size runs?

There is the Zorro assembler, and the Minimus assembler based on the AMOS framework.

At that point, however, I think you may want to inspect your assembly with Hawkeye or Tablet and eventually finish it.

**snowbear24** · 07-03-2012, 03:52 PM

I have a question about multiple compilations of Ray. For instance, if I want to try multiple k-mer sizes by compiling with options such as:
PREFIX=Ray-Large-k-mers MAXKMERLENGTH=64
with values other than 64, can I keep multiple renamed Ray executables in the same directory and run them by calling a specific Ray_64 vs. Ray_127? I'm wondering if the underlying mechanisms (maybe TARGETS and PREFIX?) bind to a specific exe or can these different compilations co-exist peacefully in a single directory?

**seb567** · 07-04-2012, 05:00 AM

It is fine.

But you should do

Code:

make clean
make PREFIX=Ray-Large-k-mers-64 MAXKMERLENGTH=64 
make install
mpiexec -n 1 Ray-Large-k-mers-64/Ray -version

and

Code:

make clean
make PREFIX=Ray-Large-k-mers-128 MAXKMERLENGTH=128 
make install
mpiexec -n 1 Ray-Large-k-mers-128/Ray -version
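
The two recipes above differ only in the value of MAXKMERLENGTH and the PREFIX suffix. As a rough sketch, the same sequence can be generated for any list of values. The function name and the `prefix_stem` parameter are hypothetical; the make variables and the `-version` check are the ones shown above. The script only prints the commands and does not run them.

```python
# Sketch only: print one build recipe per MAXKMERLENGTH value,
# each installed under its own PREFIX so the builds do not overwrite each other.
def build_commands(max_kmer_lengths, prefix_stem="Ray-Large-k-mers"):
    commands = []
    for k in max_kmer_lengths:
        prefix = f"{prefix_stem}-{k}"
        commands += [
            "make clean",  # discard objects built for another MAXKMERLENGTH
            f"make PREFIX={prefix} MAXKMERLENGTH={k}",
            "make install",
            f"mpiexec -n 1 {prefix}/Ray -version",
        ]
    return commands


for line in build_commands([64, 128]):
    print(line)
```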

**snowbear24** · 07-16-2012, 11:36 AM

Running Ray multiple times with only one network test?

I'm wondering if it's possible to re-run Ray on an identical cluster node and re-use the network test results. I.e. can I skip the network test for multiple runs in a row after the first run with a network test?

**seb567** · 07-16-2012, 11:40 AM

Originally posted by snowbear24

I'm wondering if it's possible to re-run Ray on an identical cluster node and re-use the network test results. I.e. can I skip the network test for multiple runs in a row after the first run with a network test?

No, it is not possible to skip network testing. However, this step usually takes only
a few seconds.

**snowbear24** · 07-16-2012, 11:44 AM

Originally posted by seb567

No, it is not possible to skip network testing. However, this step usually takes only
a few seconds.

Thanks for the input. The network test took 4 minutes, 35 seconds, which is why I asked.

**seb567** · 07-16-2012, 11:49 AM

Originally posted by snowbear24

Thanks for the input. The network test took 4 minutes, 35 seconds, which is why I asked.

Is that too long?

**VidJa** · 07-25-2012, 02:05 AM

Ray 2.0.0 fails sometimes after several days of execution with messages like:

...
Rank 15 reached 9900 vertices from seed 0, flow 1
Speed RAY_SLAVE_MODE_EXTENSION 247 units/second
Rank 15: assembler memory usage: 686056 KiB
Rank 2 reached 10400 vertices from seed 0, flow 1
Speed RAY_SLAVE_MODE_EXTENSION 265 units/second
Rank 2: assembler memory usage: 686948 KiB
Speed RAY_SLAVE_MODE_EXTENSION 305 units/second
Rank 29: assembler memory usage: 680960 KiB
rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
exit status of rank 29: killed by signal 11

hardware: 35 core virtual machine with 230 GB memory, Ubuntu 10.04, mpich2
commandline:
mpiexec -np 30 Ray -k 41 -p Paired-end/T_R1_val_1.fastq Paired-end/T_R2_val_2.fastq -s Paired-end/T_R1_unpaired_1.fastq Paired-end/T_R2_unpaired_2.fastq -o Ray_trimmed_PE_S_k41

Typical memory usage about 180 GB at the time of the crash.
Input about 13GB of paired end and single end quality clipped (q25) Illumina reads (100bp)

Is this a hardware issue (maybe faulty memory banks), or something with Ray or the input material? I've seen this behaviour with other runs as well since I switched to 2.0.0, but never when using Ray 1.7.

**seb567** · 07-25-2012, 03:55 AM

Hello,

First, I don't see how it can use 180 GB for 13 GB of data.
From your log, it says '680960 KiB' for core 29.
And your command indicates that you are using 30 cores.

30 * 680 MB = 20400 MB or about 20 GB
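
The same estimate can be worked out from the exact figures in the log. This is a small sketch that assumes every one of the 30 ranks uses about as much memory as rank 29 reported (680960 KiB); the other ranks in the log are within about 1% of that. The function name is hypothetical.

```python
# Sketch only: total assembler memory if all ranks use the same amount.
KIB_PER_GIB = 1024 ** 2


def estimate_total_gib(ranks: int, kib_per_rank: int) -> float:
    return ranks * kib_per_rank / KIB_PER_GIB


# 30 ranks (from "mpiexec -np 30") at 680960 KiB each (reported by rank 29).
total = estimate_total_gib(ranks=30, kib_per_rank=680960)
print(f"{total:.1f} GiB")  # prints "19.5 GiB"
```

That is roughly a tenth of the 180 GB observed on the machine.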

Second, you also need to add -s in front of Paired-end/T_R2_unpaired_2.fastq.
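
A sketch of how the argument list looks once every single-end file has its own -s. The helper name is hypothetical; the flags, file names, and values are the ones from the command line in question.

```python
# Sketch only: build the mpiexec argument list with one -s per single-end file.
def ray_command(ranks, k, paired, singles, output):
    argv = ["mpiexec", "-np", str(ranks), "Ray", "-k", str(k)]
    for left, right in paired:
        argv += ["-p", left, right]  # -p takes the two files of a pair
    for path in singles:
        argv += ["-s", path]  # each single-end file gets its own -s
    argv += ["-o", output]
    return argv


argv = ray_command(
    ranks=30,
    k=41,
    paired=[("Paired-end/T_R1_val_1.fastq", "Paired-end/T_R2_val_2.fastq")],
    singles=[
        "Paired-end/T_R1_unpaired_1.fastq",
        "Paired-end/T_R2_unpaired_2.fastq",
    ],
    output="Ray_trimmed_PE_S_k41",
)
print(" ".join(argv))
```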

Do you have a log with more details? The only error lines are:

rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
exit status of rank 29: killed by signal 11

And rank 29 reported this before dying:

Rank 29: assembler memory usage: 680960 KiB

If you compile with DEBUG=y ASSERT=y, you may get more information out of this, depending on your system.
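
On Linux, signal 11 is SIGSEGV (a segmentation fault), meaning the process made an invalid memory access. A small sketch for pulling the affected ranks out of a long log follows; it assumes the abort lines have exactly the format quoted above, and the function name is hypothetical.

```python
# Sketch only: find "killed by signal N" lines and name the signal.
import re
import signal

ABORT_LINE = re.compile(r"exit status of rank (\d+): killed by signal (\d+)")


def killed_ranks(log_text):
    results = []
    for match in ABORT_LINE.finditer(log_text):
        rank, signum = int(match.group(1)), int(match.group(2))
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"  # number not defined on this platform
        results.append((rank, signum, name))
    return results


log = """rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
exit status of rank 29: killed by signal 11
"""
print(killed_ranks(log))
```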

**VidJa** · 09-04-2012, 05:13 AM

Thanks, it worked out and we got a high-quality assembly. Indeed, the total memory usage was not from Ray but from another process.
After switching to a non-virtual machine the behaviour stopped, so maybe it was the VM configuration.

**seb567** · 09-04-2012, 05:28 AM

Originally posted by VidJa

Thanks, it worked out and we got a high-quality assembly. Indeed, the total memory usage was not from Ray but from another process.
After switching to a non-virtual machine the behaviour stopped, so maybe it was the VM configuration.

Cool.

What was the faulty process?
