#21
Senior Member
Location: Québec, Canada
Join Date: Jul 2008
Posts: 260

Quote:
Quote:
Quote:
Code:
mkdir merged
for draft in $(ls drafts)
do
    cat drafts/$draft/*.fna > merged/$draft.fasta
done

Quote:
Code:
if test ! -d NCBI-Finished-Bacterial-Genomes
then
    echo "Creating $OutputDirectory/NCBI-Finished-Bacterial-Genomes, please wait."
    mkdir NCBI-Finished-Bacterial-Genomes
    cd NCBI-Finished-Bacterial-Genomes
    for i in $(ls ../uncompressed/all.fna)
    do
        name=$(echo $i|sed 's/_uid/ /g'|awk '{print $1}')
        cat ../uncompressed/all.fna/$i/*.fna > $name".fasta"
    done
    echo "Done."
    cd ..
fi
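Both loops above parse the output of ls, which breaks on directory names that contain spaces. The following is a minimal sketch of the same merge written with globs instead, assuming the layout of the first snippet (one subdirectory per draft, each holding .fna files). The function name merge_drafts is only illustrative and is not part of Ray.

```shell
#!/bin/sh
# Hypothetical helper: concatenate every .fna file of each draft
# directory into one FASTA file per draft.
merge_drafts() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for draft_path in "$in_dir"/*/
    do
        # Skip the unexpanded pattern when in_dir has no subdirectories.
        [ -d "$draft_path" ] || continue
        draft=$(basename "$draft_path")
        # Collect the .fna files; skip drafts that have none
        # instead of writing an empty output file.
        set -- "$draft_path"*.fna
        [ -e "$1" ] || continue
        cat "$@" > "$out_dir/$draft.fasta"
    done
}
```

Calling `merge_drafts drafts merged` should reproduce what the first snippet does, with one difference: drafts without any .fna file are skipped rather than producing an empty .fasta file.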

#22
Senior Member
Location: Québec, Canada
Join Date: Jul 2008
Posts: 260

Hello,
Ray v2.2.0 is now available worldwide. The delay between v2.1.0 and v2.2.0 was quite long. Ray v2.2.0 brings many bug fixes and some new features.

The tarball is available at: http://sourceforge.net/projects/deno...v2.2.0.tar.bz2

The most significant changes include:

* SequencesLoader: the Illumina export format is now supported
* add build option for MPI I/O
* avoid infinite loops during read recycling
* messages must not be passed by value
* Fixed a linking error caused by ordering
* FusionTaskCreator: don't lose genomic regions during merging
* new file GraphPartition.txt shows the distribution of objects
* readahead operations are used for reading gz files
* core: fixed a race condition occurring with -route-messages
* SeedingData: fix regression for seed checkpointing
* all the code of Ray was ported to this new GraphPath framework
  The GraphPath framework reduces the memory usage and avoids some misassembly errors by enforcing the de Bruijn graph property.
* Scaffolder: don't fetch reads from repeated objects
  This fixes running time issues on large genomes with repeats.
* SeedingData: implemented a staggered mean algorithm
* Mock: removed the limit on the number of input files
* Library: implemented checkpointing for paired reads
* removed all calls to fflush(stdout) and cout.flush()
* SeedExtender: reduce the verbosity of graph traversal
* reduced the amount of information in the standard output
* JoinerTaskCreator: reduced the default verbosity
* KmerAcademyBuilder: reduced the verbosity for graph construction
* implemented an adaptive Bloom filter
* store a path as a sequence instead of a vector of vertices for efficiency
* SequencesLoader: add support for short file names

All changes in Ray between v2.1.0 and v2.2.0

Charles Joly Beauparlant (1):
      Added an example plugin.

Sébastien Boisvert (160):
      Some work around the minirank model.
      Ported Ray plugins to the mini-ranks RayPlatform.
      Ray plugins were ported to the mini-ranks.
      Moved the destruction of allocators in RayPlatform.
      I ported Ray to some changes in some classes in RayPlatform.
      application_core: the application code was simplified
      Social networks were added to the release procedure
      Code names of old releases were added
      Fixed a linking error caused by ordering
      Fixed the scope of options in build system
      The build system was simplified
      AR and LD are not needed here
      Ray must abort if the output directory exists
      The RayCommand.txt file was fixed for mini-ranks
      Added the name of each rank (or mini-rank) in network test
      The subgraph must be built regardless if it will be used
      Merge branch 'minirank-model' of git://github.com/sebhtml/ray.git
      core: CONFIG_* variables are private
      core: The option -mini-rank-per-rank was added
      ship: removed 6 files in shipped products
      core: don't return parameters by value
      Mock: new plugin called that does nothing
      SequencesLoader: a regression for .bz2 file support was fixed
      messages must not be passed by value
      Ordered all headers
      Updated copyrights
      Documentation: there is only one repository for research tools
      reverted a wrong hunk from commit 7c361f1530d084c6f99
      FusionTaskCreator: don't lose genomic regions during merging
      SeedExtender: properly format extension file name
      Scaffolder: only put one new line after scaffold sequence
      KmerAcademyBuilder: use vertexRank() to find who owns an object
      new file GraphPartition.txt shows the distribution of objects
      the line that shows the process identifier was moved
      CoverageGatherer: kmers.txt should have 1 header only
      recursive make was improved
      readahead operations are used for reading gz files
      SequencesLoader: added the rank number when loading files
      core: the partitioner needs the correct rank number
      core: fixed a race condition occurring with -route-messages
      SeedExtender: display the number of traversed nucleotide symbols
      Seeds: new runtime metrics for seeding algorithms
      new header for SeedLengthDistribution.txt
      new header for any paired read file LibraryN.txt
      SequencesLoader: added a few assertions for read partitions
      new header for CoverageDistribution.txt
      Merge branch 'master' of github.com:sebhtml/ray
      Documentation: added the polytope with 4225 vertices
      SeedingData: fix regression for seed checkpointing
      added documentation for using the torus
      Documentation: added arguments for a 5D torus with 1024 vertices
      Documentation: fixed permissions
      removed the output file called MessagePassingInterface.txt
      renamed the AssemblySeed to GraphPath so it can be reused
      all the code of Ray was ported to this new GraphPath framework
      Documentation: fixed the degree of the polytope
      Scaffolder: don't fetch reads from repeated objects
      SeedExtender: added documentation in the code for repeated vertices
      fixed a couple of compilation warnings
      SeedingData: implemented a staggered mean algorithm
      Scaffolder: replaced getMode() by the new GraphPath framework
      Mock: removed the limit on the number of input files
      remove the limitation regarding the maximum number of files
      moved message handlers from MessageProcessor to SequencesLoader
      Scaffolder: fixed 2 compilation warnings
      Library: implemented checkpointing for paired reads
      SeedingData: reduced amount of printed information
      removed all calls to fflush(stdout) and cout.flush()
      SeedExtender: reduce the verbosity of graph traversal
      reduced the amount of information in the standard output
      JoinerTaskCreator: reduced the default verbosity
      KmerAcademyBuilder: reduced the verbosity for graph construction
      SequencesLoader: reduced verbosity
      VerticesExtractor: reduced verbosity
      reduced verbosity
      reduced verbosity
      SequencesLoader: the Illumina export format is now supported
      added a loader interface for file formats
      SequencesLoader: all supported formats use the interface
      SequencesLoader: implemented a product factory
      Mock: updated documentation for new export format
      Mock: output a single file for library data
      implemented an adaptive Bloom filter
      improved the interface of path objects
      add debug symbols by default
      store a path as a sequence instead of a vector of vertices for efficiency
      Mock: the path storage using blocks is not ready
      SeedingData: enforce de Bruijn graph property for path storage
      SeedingData: use the GraphPath storage code to compute seeds
      SeedingData: refactor code so that m_content is abstracted
      SeedingData: use 2-bit encoding for paths
      SeedingData: plugin options are parsed by plugins
      use constants for symbols
      SeedingData: correctly detect dead ends
      add more information for coding style
      MachineHelper: registerPlugin and resolveSymbols must be last
      SeedingData: tips can not be seeds
      SequencesLoader: add support for short file names
      SeedingData: tips are not valid seeds
      move some handlers in the Scaffolder plugin
      Scaffolder: implement the handler for packed chunks
      fix a race condition during directory probing
      reduce verbosity of components
      add documentation for building on IBM Blue Gene/Q
      add code name for upcoming release
      SequencesLoader: fix regression (added in ca979832) for line widths
      add plugin PathEvaluator to evaluate paths
      PathEvaluator: write ContigPaths checkpoints in parallel
      reserve storage capacity for sequence file
      perform parallel I/O operations
      fix a bug when disabling scaffolding
      use MPI I/O to write Contigs.fasta
      use a file view for each MPI rank
      add build option for MPI I/O
      avoid parallel I/O without MPI I/O
      avoid infinite loops during read recycling
      update polytope documentation
      add comments for old class
      add a new plugin to process spurious seeds
      port some plugins to the simplified RayPlatform API
      iterate on seeds to filter them
      register seed paths in the distributed graph
      hide hash values for Bloom filter
      push the workflow in a helper class
      fetch ancestors of seed heads
      seed lengths must be collected after analysis
      write seed statistics after analysis
      write seed checkpoints after the quality control analysis
      write seed files after analysis (-write-seeds)
      skip seed quality analysis if checkpoints exist
      add steps for better dead end detection
      hide mini-ranks in help if they are disabled
      correct a bunch of bugs for adapters in Ray
      reuse code paths to obtain sequence information
      eliminate seeds that have a dead-end on the left
      discard seeds with dead-ends on the right
      increase the maximum depth for searches
      add a class to fetch the attributes of a DNA sequence
      create a class to fetch annotations in a portable way
      fetch nearby paths to detect bubbles
      fix a bug during the registration of seeds
      remove any seed that is a weak part of a bubble
      add 4 methods that will be implemented later
      fix a regression that prevented the closing of a file
      add new reference in the output
      disable the seed filter when using short kmers
      add a maximum coverage depth for dead end search
      adapt the allowed depth in function of the data
      add design blueprints for the new plugin
      SpuriousSeedAnnihilator: disable debug messages by default
      TaxonomyViewer: rename the plugin to TaxonomyViewer
      remove plugin_ from all plugin directory names
      add new line for publications
      application_core: fix buggy message routing
      SeedExtender: don't traverse path if it's consumed already
      SeedingData: fix a bug for the phix system test
      update the CMakeList.txt
      use git to store version names
      Disable the filtering code during the computation of seeds
      This is Ray v2.2.0

All changes in RayPlatform between v1.1.0 and v1.1.1

Sébastien Boisvert (56):
      initial work on miniranks with VirtualMachine and Minirank
      I added some design documentation for mini-ranks.
      spinlocks are more suitable for this job
      added design documentation for mini-ranks.
      First implementation of mini-ranks in RayPlatform
      The core must provide the mini-rank number.
      Documentation: added description of macros.
      Fixed some bugs in the mini-ranks model.
      Moved the destruction of allocators in the core.
      Mini-rank source and mini-rank destination are required.
      The desctructor of the middleware must be called.
      A mini-rank must tell the rank that it has messages to send.
      The class MessageQueue does the job of receiving messages.
      Non-blocking queues will be used for the communication.
      The non-blocking message queue for mini-ranks is ready.
      MPI_Recv must be called to get the mini-rank numbers.
      This is the branch for RayPlatform v7.0.0.
      core: The old behavior (no mini-ranks) now works as expected
      core: RayPlatform is responsible for creating mini-ranks
      The old adapter API documentation was removed
      Message reception is now interleaved with send operations.
      More buffers are needed for mini-ranks
      communication: don't register already registered buffers
      The build system is less verbose
      New API call to get the number of mini-ranks per rank
      Added a method to get the MessagesHandler object
      Merge branch 'minirank-model' of github.com:sebhtml/RayPlatform into minirank-model
      Merge branch 'minirank-model' of git://github.com/sebhtml/RayPlatform.git
      handlers: new option to cache operation codes
      communication: messages must be passed with a pointer
      Ordered headers in all files
      Updated copyrights
      The short name was updated in headers
      The website was updated in every file
      a retry is necessary when a message is pushed into a full ring
      Documentation: updated RayPlatform mini-ranks blueprints
      communication: moved writeFiles() in a second method
      communication: removed a few debugging instructions
      Documentation: added gate blueprints
      Documentation: improved design for non-linear scheduling
      routing: renamed the hypercube to polytope
      Documentation: added Torus description
      a radix of 2 produces a hypercube
      use the Q and ASSERT build arguments in RayPlatform
      routing: implemented a new communication graph: the torus
      Merge branch 'master' of github.com:sebhtml/RayPlatform
      core: use specific code to get memory usage on Blue Gene/Q
      the next release will likely be 1.2.0 and not 7.0.0
      add option to provide public access to a master mode
      add the core in each plugin
      add two macros to configure handlers
      fixed directives to compile mini-ranks
      core: fix buggy message routing
      improve the patch for message routing with a configuration
      core: fix a regression for registered handle names
      This is RayPlatform v1.1.0.

#23
Junior Member
Location: Saskatchewan
Join Date: Feb 2009
Posts: 6

Hello,
I have a general question about the way Ray Meta works. When the taxonomy and gene ontology profiles are produced, I assume they cover only the assembled contigs? Or would they also include results for reads that were not assembled into contigs? We have fairly low read coverage for our samples, so I anticipate that a large portion of the reads will not be assembled into contigs. As such, is there a way to get the taxonomy and gene ontology profiles of the entire data set (i.e., contigs and any reads that were not assembled into contigs)?

Thanks in advance for your response, and compliments to your group on the paper describing Ray Meta. The thorough documentation and supplementary material are very much appreciated!

#24
Senior Member
Location: Québec, Canada
Join Date: Jul 2008
Posts: 260

Quote:
So you should be OK with your low-coverage data, I think.

#25
Member
Location: NY
Join Date: Mar 2012
Posts: 35

Dear all,
I am trying to run Ray on a test data set. However, when I run it according to the manual, I get this error:

[jingjing@tll-bioinfo02 Ray-v2.2.0]$ mpiexec -n 1 ray-build/Ray -o test -p PE1.fa PE2.fa -k 31
ssh: Could not resolve hostname tll-bioinfo02: Name or service not known

It seems something is wrong with the MPI mode. Can anyone give me some suggestions? Thanks!

#26
Senior Member
Location: Québec, Canada
Join Date: Jul 2008
Posts: 260

Quote:
Maybe tll-bioinfo02 is listed in a hostfile and your MPI installation is using it. Try:

mpiexec -n 1 -host localhost ray-build/Ray -o test -p PE1.fasta PE2.fasta -k 31
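Before changing the MPI configuration, it can help to check whether the host name from the error message resolves at all on the machine. This is a minimal sketch, assuming a Linux system where getent is installed; the function name check_host is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: report whether a host name resolves on this machine.
check_host() {
    if getent hosts "$1" > /dev/null
    then
        echo "$1 resolves"
    else
        echo "$1 does not resolve"
        return 1
    fi
}
```

For example, `check_host tll-bioinfo02` could be run on the node in question. If the name does not resolve, that is consistent with the ssh error quoted in this thread, and the -host localhost command above is a reasonable next thing to try.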

#27
Senior Member
Location: Québec, Canada
Join Date: Jul 2008
Posts: 260

Hello,
I am proud to announce the immediate availability of Ray 2.3.0 (434 KB).

Most significant change:

- add -detect-sequence-files to detect supported files

With this option, you just need to put your sequence data files in one directory and run "mpiexec -n 99 Ray -detect-sequence-files directory". This option will match paired files and everything.

What's new:

- new option "-run-surveyor" to compare several samples (see Documentation/)
- support long reads in -amos option (reported by Bastian Hornung @ wur.nl)
- Scaffolder: fix a bug in the formatting of scaffolds (Rob Egan @ Lawrence Berkeley Laboratory)
- ElapsedTime.txt is now in tabular format (suggested by James Vincent @ The University of Vermont)
- add new sequence file extensions such as .fq.gz (see the manual)
- fix an integer overflow for the Bloom filter (thanks to Chien-Chi Lo)
- remove the symbolic loop in RayPlatform (reported by Nick Holway)
- add the ability to send SIGUSR1 to Ray processes to debug them
- use the polytope by default with option -route-messages (instead of the de Bruijn graph)

Download link: http://master.dl.sourceforge.net/pro...-2.3.0.tar.bz2

Mirrors:
https://github.com/sebhtml/Ray-Relea...-2.3.0.tar.bz2
https://bitbucket.org/sebhtml/ray-re...-2.3.0.tar.bz2

Thanks!

Sébastien

-----
Paper for Ray Meta for metagenomics: http://genomebiology.com/2012/13/12/R122
Ticket for the release: https://github.com/sebhtml/ray/issues/194
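As an illustration of the -detect-sequence-files usage described in this announcement, here is a minimal sketch of a wrapper that refuses to launch when the input directory is missing or empty. The wrapper name run_ray_detect and the DRY_RUN variable are assumptions made for this example and are not part of Ray; only the mpiexec command line itself comes from the announcement.

```shell
#!/bin/sh
# Hypothetical wrapper: check the input directory, then print (DRY_RUN=1)
# or run the command line given in the announcement.
run_ray_detect() {
    ranks=$1
    dir=$2
    # Stop early if the directory is missing or contains no entries.
    if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir")" ]
    then
        echo "no sequence files in $dir" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]
    then
        echo "mpiexec -n $ranks Ray -detect-sequence-files $dir"
    else
        mpiexec -n "$ranks" Ray -detect-sequence-files "$dir"
    fi
}
```

Running `DRY_RUN=1 run_ray_detect 99 directory` only prints the command, which makes it easy to inspect before submitting it to a cluster. The wrapper does not check file extensions; which extensions Ray accepts is documented in the manual.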

#28
Junior Member
Location: Nebraska
Join Date: Nov 2012
Posts: 2

Hello,
When I look through some outputs generated from the AMOS file following assembly, many of the contigs are assigned 0 reads (I used the default bank2contig after seeing that many contigs were not showing up in the generated SAM file). Obviously, this does not make much sense, but I was wondering if anyone else has come across this? I was trying to avoid mapping by using the AMOS file, and now I just want to confirm that the contigs I am getting are 'real', I suppose. At first I thought this might be due to read recycling, but reads still show up under multiple contigs. Does anyone have other ideas about what is causing this issue or how to correct it during assembly?

Chris

#29
Member
Location: Italy
Join Date: May 2013
Posts: 50

Hi, Everyone
I am doing metagenomic shotgun assembly with ABySS and Ray. I got results, but I want to know how many reads were used and how many were unused. I tried to find out but failed. Can anyone guide me? Any help will be appreciated...
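One assembler-independent way to estimate this is to map the reads back to the contigs with any aligner that writes SAM, then count mapped and unmapped records. This is a minimal sketch, assuming a SAM file with one record per read (no secondary or supplementary alignments); the function name count_sam_reads is illustrative, and nothing here is specific to ABySS or Ray.

```shell
#!/bin/sh
# Hypothetical helper: count mapped and unmapped records in a SAM file.
# Bit 0x4 of the FLAG column (field 2) marks an unmapped read.
count_sam_reads() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }  # bit 0x4 is set
        { mapped++ }
        END { printf "mapped\t%d\nunmapped\t%d\n", mapped, unmapped }
    ' "$1"
}
```

Usage would be `count_sam_reads reads_vs_contigs.sam`, where the file name is a placeholder. Mapped reads are only a proxy for reads used by the assembler, since a read can align to a contig without having contributed to it.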

#30
Member
Location: Italy
Join Date: May 2013
Posts: 50

Hi, All
Please help me, guys...