  • Originally posted by Anelda View Post
    Hi there,

    Do you have any news on the colourspace issue? We ran RAY today for the first time and were very impressed, except that we mostly deal with SOLiD data and would need the contigs in base space eventually :-)

    Thanks!

    Anelda
    We have not worked on color space recently.

    Ray can assemble color space reads, but will generate a double-encoded color space assembly.

    Comment


    • This was fixed on 2012-04-27.



      v2.0.0-rc8 is quite stable too.

      Originally posted by steph View Post
      Hi everyone,

      I encountered a problem when trying to build the latest stable version of Ray (1.7) with the latest version of GCC (v4.7.0).

      The problem occurred at the make step.

      With GCC v4.7.0, I got the following errors:

      Code:
      code/communication/MessageProcessor.cpp: In member function 'void MessageProcessor::call_RAY_MPI_TAG_ASK_VERTEX_PATH(Message*)':
      code/communication/MessageProcessor.cpp:1685:7: error: redeclaration of 'int i'
      code/communication/MessageProcessor.cpp:1675:10: error: 'int i' previously declared here
      make: *** [code/communication/MessageProcessor.o] Error
      However, when I used GCC v4.1.2 (which was also installed on this machine) instead, the installation finished correctly.

      Comment


      • Ray 2.0.0 released

        Hello,

        Ray 2.0.0 codenamed "Dark Astrocyte of Knowledge" is available for download.
        This version ships with RayPlatform 1.0.3 codenamed "Gray Pylon of Wisdom".

        Not much has changed since v2.0.0-rc8.

        Ray 2.0.0 can do de novo assembly of metagenomes and also taxonomic profiling
        with k-mers.

        To get Ray v2.0.0:




        Also, there is a new section on the website for
        frequently asked questions.



        Changes in Ray between v2.0.0-rc8 and v2.0.0


        commit 6adeef3d814dc2acbc32444ec3ed5a49a709e98c
        Author: Sébastien Boisvert <[email protected]>
        Date: Fri Jun 22 20:58:37 2012 -0400

        This is Ray v2.0.0.

        commit 2243df732615cb2419e81c57a233cb1ffd214583
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 21 20:27:15 2012 -0400

        Floating numbers must not be stored with the integer type 'int'.

        commit 4b4815772354ea402b0ce5a9500d66eed37be8d7
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 21 16:23:35 2012 -0400

        This solves a division by 0.

        commit d96047c2040bed3395ae36a13505608b617b3346
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 21 15:21:24 2012 -0400

        This change set improves the fidelity of Ray when computing peak
        coverage for short seeds (let's say that a short seed has a length
        lower than 512 vertices).

        This closes an old ticket.



        commit 22d4ec0f29d31a659fe5c4791039dd2497264039
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 21 15:00:04 2012 -0400

        The multiplicator used for spawning read helpers was changed from 2.0 to 1.5.
        This should remove any assemblies and should not affect contiguity.

        commit 048ba764c0424bb74f416276ef7f16cf463cc0fd
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 21 14:12:24 2012 -0400

        The peak finder should detects simulated data as well.

        commit 7102b16f1f282a0003f9982b42a3a744d9bffe59
        Author: Sébastien Boisvert <[email protected]>
        Date: Wed Jun 20 14:48:20 2012 -0400

        A compilation warning was removed for an integer comparison.

        commit dcc738984c61a478d7da699f19dfae4e2d1ec1f2
        Author: Sébastien Boisvert <[email protected]>
        Date: Wed Jun 20 14:32:23 2012 -0400

        Routing strategies were updated.

        commit 91052b31051fb58561231c0cd2f3bab720184b56
        Author: Sébastien Boisvert <[email protected]>
        Date: Wed Jun 20 11:12:34 2012 -0400

        Patch information was updated.

        commit 8c950405d1695c2ca691537c1eba341155c9731a
        Author: Sébastien Boisvert <[email protected]>
        Date: Wed Jun 20 09:53:36 2012 -0400

        The release procedure was updated.

        commit 95f458950e9856c93b2af120845899975b74851e
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 14 14:33:57 2012 -0400

        A new option is available to disable read recycling.

        commit 459f26f59076ffabe603f0d8b7163a69e2f51837
        Author: Sébastien Boisvert <[email protected]>
        Date: Thu Jun 14 14:10:11 2012 -0400

        The options for using checkpointing features requires a
        directory.





        Changes in RayPlatform between v1.0.2 and v1.0.3:


        commit 09517b6862d04743f64abc181de21b7d8c8b5dbd
        Author: Sébastien Boisvert <[email protected]>
        Date: Fri Jun 22 20:59:58 2012 -0400

        This is the release of RayPlatform v1.0.3 codenamed
        "Gray Pylon of Wisdom".

        commit 86ddad8ee7b9cdbb6142561f38fd75e05e4622f2
        Author: Sébastien Boisvert <[email protected]>
        Date: Tue Jun 5 13:51:08 2012 -0400

        Ray crashed sometimes when the number of processor cores was less or equal to 3.
        This change fixes this. Ray can run of 1 processor core up to 4096 processor cores
        at the moment with routing. Without routing, the maximum number of cores is larger.

        Reported-by: krobinson#seqanswers.com
        Reported-by: severin#seqanswers.com


        Sébastien Boisvert
        Granularity specialist/PhD student
        Université Laval

        Comment


        • I'm impressed with Ray. However,
          what is the recommended method for finding the optimal k-mer size? Just trial and error?
          Typical datasets: 100 bp paired-end Illumina reads (5-10 million pairs), bacterial genomes
          Can anyone recommend a method for merging assemblies from runs with different k-mer sizes?

          Comment


          • Originally posted by VidJa View Post
            I'm impressed with Ray. However,
            what is the recommended method for finding the optimal k-mer size? Just trial and error?
            Typical datasets: 100 bp paired-end Illumina reads (5-10 million pairs), bacterial genomes
            For Illumina(R) HiSeq(R) data, I usually just set the k-mer length to 31.
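
If trial and error is the route taken, the candidate runs can at least be written out systematically. This is only a minimal sketch, assuming the same `mpiexec` and `Ray` options that appear elsewhere in this thread (`-np`, `-k`, `-p`, `-o`). The function name `kmer_sweep`, the `Ray_k<N>` output directory names, and the k values in the usage note are made up for illustration. The compiled MAXKMERLENGTH limits which values of k will work.

```shell
# Print one Ray command line per candidate k-mer length.
# Nothing is executed; the commands are only written to standard output.
# usage: kmer_sweep RANKS LEFT_READS RIGHT_READS K...
kmer_sweep() {
    ranks=$1; left=$2; right=$3
    shift 3
    for k in "$@"; do
        # Each run gets its own output directory so results are not mixed.
        echo "mpiexec -np ${ranks} Ray -k ${k} -p ${left} ${right} -o Ray_k${k}"
    done
}
```

For example, `kmer_sweep 30 reads_1.fastq reads_2.fastq 21 31 41` prints three command lines that can be reviewed and then submitted one after the other. The file names are placeholders.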

            Originally posted by VidJa View Post
            Can anyone recommend a method for merging assemblies from runs with different k-mer sizes?

            There is the Zorro assembler and the Minimus assembler, which is based on the AMOS framework.

            At that point, however, I think you may want to inspect your assembly with Hawkeye or Tablet and eventually finish it.

            Comment


            • I have a question about multiple compilations of Ray. For instance, if I want to try multiple k-mer sizes by compiling with options such as:
              PREFIX=Ray-Large-k-mers MAXKMERLENGTH=64
              with values other than 64, can I keep multiple renamed Ray executables in the same directory and run them by calling a specific Ray_64 vs. Ray_127? I'm wondering if the underlying mechanisms (maybe TARGETS and PREFIX?) bind to a specific exe or can these different compilations co-exist peacefully in a single directory?

              Comment


              • It is fine.

                But you should do

                Code:
                make clean
                make PREFIX=Ray-Large-k-mers-64 MAXKMERLENGTH=64 
                make install
                mpiexec -n 1 Ray-Large-k-mers-64/Ray -version
                and

                Code:
                make clean
                make PREFIX=Ray-Large-k-mers-128 MAXKMERLENGTH=128 
                make install
                mpiexec -n 1 Ray-Large-k-mers-128/Ray -version
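
The two build sequences above differ only in the MAXKMERLENGTH value, so they can be generated in a loop. The following is a sketch under assumptions, not a tested build script: `build_ray_variants` and the `DRY_RUN` variable are hypothetical names, and the function only prints the commands unless `DRY_RUN=0` is set. It must be run from the Ray source directory.

```shell
# Build one Ray installation per MAXKMERLENGTH value, each under its own PREFIX.
# By default (DRY_RUN unset or 1) the commands are only printed.
# usage: build_ray_variants MAXKMERLENGTH...
build_ray_variants() {
    for maxk in "$@"; do
        prefix="Ray-Large-k-mers-${maxk}"
        for step in "make clean" \
                    "make PREFIX=${prefix} MAXKMERLENGTH=${maxk}" \
                    "make install" \
                    "mpiexec -n 1 ${prefix}/Ray -version"; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "$step"          # dry run: show the command only
            else
                $step || return 1     # real run: stop at the first failure
            fi
        done
    done
}
```

Calling `build_ray_variants 64 128` prints the eight commands shown in the post. Because every build installs into a separate prefix directory, the executables do not overwrite each other.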
                Originally posted by snowbear24 View Post
                I have a question about multiple compilations of Ray. For instance, if I want to try multiple k-mer sizes by compiling with options such as:
                PREFIX=Ray-Large-k-mers MAXKMERLENGTH=64
                with values other than 64, can I keep multiple renamed Ray executables in the same directory and run them by calling a specific Ray_64 vs. Ray_127? I'm wondering if the underlying mechanisms (maybe TARGETS and PREFIX?) bind to a specific exe or can these different compilations co-exist peacefully in a single directory?

                Comment


                • Running Ray multiple times with only one network test?

                  I'm wondering if it's possible to re-run Ray on an identical cluster node and re-use the network test results. I.e. can I skip the network test for multiple runs in a row after the first run with a network test?

                  Comment


                  • Originally posted by snowbear24 View Post
                    I'm wondering if it's possible to re-run Ray on an identical cluster node and re-use the network test results. I.e. can I skip the network test for multiple runs in a row after the first run with a network test?
                    No, it is not possible to skip network testing. However, this step usually requires only
                    a few seconds.

                    Comment


                    • Originally posted by seb567 View Post
                      No, it is not possible to skip network testing. However, this step usually requires only
                      a few seconds.
                      Thanks for the input. The network test took 4 minutes, 35 seconds, that's why I inquired.

                      Comment


                      • Originally posted by snowbear24 View Post
                        Thanks for the input. The network test took 4 minutes, 35 seconds, that's why I inquired.
                        Is that too long?

                        Comment


                        • Ray 2.0.0 fails sometimes after several days of execution with messages like:

                          ...
                          Rank 15 reached 9900 vertices from seed 0, flow 1
                          Speed RAY_SLAVE_MODE_EXTENSION 247 units/second
                          Rank 15: assembler memory usage: 686056 KiB
                          Rank 2 reached 10400 vertices from seed 0, flow 1
                          Speed RAY_SLAVE_MODE_EXTENSION 265 units/second
                          Rank 2: assembler memory usage: 686948 KiB
                          Speed RAY_SLAVE_MODE_EXTENSION 305 units/second
                          Rank 29: assembler memory usage: 680960 KiB
                          rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
                          exit status of rank 29: killed by signal 11

                          hardware: 35 core virtual machine with 230 GB memory, Ubuntu 10.04, mpich2
                          commandline:
                          mpiexec -np 30 Ray -k 41 -p Paired-end/T_R1_val_1.fastq Paired-end/T_R2_val_2.fastq -s Paired-end/T_R1_unpaired_1.fastq Paired-end/T_R2_unpaired_2.fastq -o Ray_trimmed_PE_S_k41

                          Typical memory usage about 180 GB at the time of the crash.
                          Input about 13GB of paired end and single end quality clipped (q25) Illumina reads (100bp)

                          Is this a hardware issue (maybe faulty memory banks) or something with Ray or the input material? I've seen this behaviour with other runs as well since I switched to 2.0.0, but never when using Ray 1.7.

                          Comment


                          • Hello,

                            First, I don't see how it can use 180 GB for 13 GB of data.
                            From your log, it says '680960 KiB' for core 29.
                            And your command indicates that you are using 30 cores.

                            30 * 680 MB = 20400 MB or about 20 GB
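
The same kind of estimate can be taken from the log itself instead of by hand. This is a rough sketch that assumes the log lines look exactly like the `Rank N: assembler memory usage: X KiB` lines quoted in this thread; the function name `sum_rank_memory_kib` is made up for illustration. It only counts ranks that actually appear in the log.

```shell
# Sum the most recent "assembler memory usage" value reported by each rank.
# Reads the named log files, or standard input when no file is given.
sum_rank_memory_kib() {
    awk '/^[ \t]*Rank [0-9]+: assembler memory usage: [0-9]+ KiB/ {
             rank = $2
             sub(/:$/, "", rank)          # "29:" becomes "29"
             last[rank] = $(NF - 1)       # keep the most recent value per rank
         }
         END {
             total = 0; n = 0
             for (r in last) { total += last[r]; n++ }
             printf "%d ranks, %d KiB total\n", n, total
         }' "$@"
}
```

If all 30 ranks report values close to 680960 KiB, the total is about 20 GB, far below the 180 GB seen on the machine.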

                            Second, you also need to add -s in front of Paired-end/T_R2_unpaired_2.fastq.
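
Put differently, every single-end file gets its own `-s`. A small sketch of how the command line could be assembled that way, using only the options already present in the quoted command; the helper name `ray_command` is hypothetical, it prints the command rather than running it, and it assumes file names without spaces.

```shell
# Print a Ray command line in which each single-end file has its own -s.
# usage: ray_command RANKS K OUTDIR LEFT_READS RIGHT_READS [SINGLE_READS...]
ray_command() {
    ranks=$1; k=$2; outdir=$3; left=$4; right=$5
    shift 5
    cmd="mpiexec -np ${ranks} Ray -k ${k} -p ${left} ${right}"
    for single in "$@"; do
        cmd="${cmd} -s ${single}"     # one -s per single-end file
    done
    echo "${cmd} -o ${outdir}"
}
```

With the files from the quoted command, the printed line ends in `-s Paired-end/T_R1_unpaired_1.fastq -s Paired-end/T_R2_unpaired_2.fastq -o Ray_trimmed_PE_S_k41`.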

                            Do you have any log with more details because the only error lines are


                            rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
                            exit status of rank 29: killed by signal 11


                            And this is what rank 29 reported before dying:

                            Rank 29: assembler memory usage: 680960 KiB


                            If you compile with DEBUG=y ASSERT=y, you may get more information out of this, depending on your system.


                            Originally posted by VidJa View Post
                            Ray 2.0.0 fails sometimes after several days of execution with messages like:

                            ...
                            Rank 15 reached 9900 vertices from seed 0, flow 1
                            Speed RAY_SLAVE_MODE_EXTENSION 247 units/second
                            Rank 15: assembler memory usage: 686056 KiB
                            Rank 2 reached 10400 vertices from seed 0, flow 1
                            Speed RAY_SLAVE_MODE_EXTENSION 265 units/second
                            Rank 2: assembler memory usage: 686948 KiB
                            Speed RAY_SLAVE_MODE_EXTENSION 305 units/second
                            Rank 29: assembler memory usage: 680960 KiB
                            rank 29 in job 53 bamicsb_51206 caused collective abort of all ranks
                            exit status of rank 29: killed by signal 11

                            hardware: 35 core virtual machine with 230 GB memory, Ubuntu 10.04, mpich2
                            commandline:
                            mpiexec -np 30 Ray -k 41 -p Paired-end/T_R1_val_1.fastq Paired-end/T_R2_val_2.fastq -s Paired-end/T_R1_unpaired_1.fastq Paired-end/T_R2_unpaired_2.fastq -o Ray_trimmed_PE_S_k41

                            Typical memory usage about 180 GB at the time of the crash.
                            Input about 13GB of paired end and single end quality clipped (q25) Illumina reads (100bp)

                            Is this a hardware issue (maybe faulty memory banks) or something with Ray or the input material? I've seen this behaviour with other runs as well since I switched to 2.0.0, but never when using Ray 1.7.

                            Comment


                            • Thanks, it worked out and we got a high-quality assembly. Indeed, most of the memory was being used by another process, not by Ray.
                              After switching to a non-virtual machine the crashes stopped, so maybe it was the VM configuration.

                              Comment


                              • Originally posted by VidJa View Post
                                Thanks, it worked out and we got a high-quality assembly. Indeed, most of the memory was being used by another process, not by Ray.
                                After switching to a non-virtual machine the crashes stopped, so maybe it was the VM configuration.
                                Cool.

                                What was the faulty process?

                                Comment
