SEQanswers

Old 01-08-2013, 05:57 AM   #1
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260

Ray Meta: scalable de novo metagenome assembly and profiling
Genome Biology 2012, 13:R122 doi:10.1186/gb-2012-13-12-r122

Voluminous parallel sequencing datasets, especially metagenomic experiments, require distributed computing for de novo assembly and taxonomic profiling. Ray Meta is a massively distributed metagenome assembler that is coupled with Ray Communities, which profiles microbiomes based on uniquely-colored k-mers. It can accurately assemble and profile a three billion read metagenomic experiment representing 1,000 bacterial genomes of uneven proportions in 15 hours with 1,024 processor cores, using only 1.5 GB per core. The software will facilitate the processing of large and complex datasets, and will help in generating biological insights on specific environments. Ray Meta is open source and available at http://denovoassembler.sf.net.
Old 02-20-2013, 01:06 PM   #2
severin
Genome Informatics Facility
 
Location: Iowa @isugif

Join Date: Sep 2009
Posts: 105
Default Ray Meta

How do I include genomes other than the bacteria that are found in the NCBI-taxonomy directory that your script generates? I could drop the fasta file into a folder however...

Is there an easy way to include the taxonomy information about the genomes I add? You added Human in the paper, but if I wanted to include multiple species whose taxonomy is known, do I have to do this manually or is there a tool that can help me achieve this?

Also, I am interested in not just obtaining the abundances but also assigning the scaffolds to particular species or other level in the taxonomy. Does Ray output the scaffold to taxon information somewhere?

One last question.
If I have an assembly from say Trinity can I run the assembly through Ray-Meta and have it return abundances based on the transcripts themselves? How dependent is the algorithm on the assembly having been done beforehand? Can I feed Ray-Meta a kmer graph?


Thanks and really excited to use this tool.

Last edited by severin; 02-20-2013 at 01:08 PM.
Old 02-21-2013, 09:41 AM   #3
seb567

Hi,

Quote:
Originally Posted by severin View Post
How do I include genomes other than the bacteria that are found in the NCBI-taxonomy directory that your script generates?

Genome-to-Taxon.tsv has 2 columns (tab-separated): GenBankIdentifier taxonIdentifier.

Both are integers.

So you need to append entries to this file.

See https://github.com/sebhtml/ray/blob/...n/Taxonomy.txt
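For anyone who wants to script the appending step, here is a minimal Python sketch. It relies only on what is stated above (two tab-separated integer columns); the function name `append_genome_taxon` and the identifiers in the usage line are made up for illustration and are not part of Ray.

```python
from pathlib import Path

def append_genome_taxon(tsv_path, pairs):
    """Append (GenBank identifier, taxon identifier) rows to a 2-column TSV file."""
    path = Path(tsv_path)
    # Build all rows first: int() raises ValueError on a non-integer value,
    # so a bad pair leaves the file untouched.
    rows = [f"{int(genbank_id)}\t{int(taxon_id)}\n" for genbank_id, taxon_id in pairs]
    existing = path.read_bytes() if path.exists() else b""
    with path.open("ab") as handle:
        # Start on a fresh line if the existing file lacks a trailing newline.
        if existing and not existing.endswith(b"\n"):
            handle.write(b"\n")
        handle.write("".join(rows).encode("ascii"))
    return len(rows)

# Hypothetical usage (placeholder identifiers):
# append_genome_taxon("NCBI-taxonomy/Genome-to-Taxon.tsv", [(123456, 9606)])
```

Working on a copy of Genome-to-Taxon.tsv is probably wise until the result has been checked.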

Quote:
Originally Posted by severin View Post
I could drop the fasta file into a folder however...
Indeed, sequences deposited in directories that you provide to Ray with the -search option
will be picked up by Ray Communities plugins.

Quote:
Originally Posted by severin View Post

Is there an easy way to include the taxonomy information about the genomes I add?
No, you need to add one line for each relationship you desire.

Quote:
Originally Posted by severin View Post
You added Human in the paper, but if I wanted to include multiple species whose taxonomy is known, do I have to do this manually or is there a tool that can help me achieve this?
Well, because what people want to add in this system can come from various sources (not
just NCBI), it's hard to devise a tool that will be usable and portable for all these sources.

So I guess your best bet is to write a small tool that does it for you so that you
don't have to do it manually.
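Such a small tool might look like the Python sketch below. It assumes that the added FASTA files use the classic NCBI header style `>gi|<number>|...` and that you already know the taxon identifier that goes with each file; both are assumptions about your data, and the function names are hypothetical rather than anything Ray provides.

```python
import re

# Assumed header style: ">gi|<integer>|..." (classic NCBI FASTA headers).
GI_HEADER = re.compile(r"^>gi\|(\d+)\|")

def genbank_ids_from_fasta(lines):
    """Yield the integer identifier from each '>gi|NNN|...' header line."""
    for line in lines:
        match = GI_HEADER.match(line)
        if match:
            yield int(match.group(1))

def genome_to_taxon_rows(fasta_lines, taxon_id):
    """Build 'GenBankIdentifier<TAB>taxonIdentifier' rows for one FASTA file."""
    taxon = int(taxon_id)
    return [f"{gi}\t{taxon}" for gi in genbank_ids_from_fasta(fasta_lines)]

# Hypothetical usage: one known taxon identifier per added FASTA file.
# with open("my_genome.fasta") as handle:
#     rows = genome_to_taxon_rows(handle, 9606)
```

Headers that do not match the assumed pattern are silently skipped, so it is worth comparing the number of rows with the number of sequences in the file.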

If you think that this should be a service provided by Ray, you can file a ticket at

https://github.com/sebhtml/ray/issues/new

Quote:
Originally Posted by severin View Post

Also, I am interested in not just obtaining the abundances but also assigning the scaffolds to particular species or other level in the taxonomy. Does Ray output the scaffold to taxon information somewhere?
The system will identify contigs for you on the basis of the sequences provided with the -search
options.

Files:

Code:
RayMicrobiomeAnalysis/
    BiologicalAbundances/
        _DenovoAssembly/
            Contigs.tsv
            *.CoverageData.xml

        _Coloring/
        _Frequencies/

        NCBI-bacteria-directory/
            ContigIdentifications.tsv
            _Files.tsv
            SequenceAbundances.xml

        NCBI-viruses-directory/
            ContigIdentifications.tsv
            _Files.tsv
            SequenceAbundances.xml
See https://github.com/sebhtml/ray/blob/...Abundances.txt
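To collect the contig identification files from a finished run, something like the following Python sketch could be used. It assumes the layout listed above, with one subdirectory per -search directory directly under BiologicalAbundances/; the function name is hypothetical, and the column layout of ContigIdentifications.tsv is not described in this thread, so the sketch only locates the files.

```python
from pathlib import Path

def find_contig_identifications(output_dir):
    """Map each search-directory name to its ContigIdentifications.tsv path."""
    # Assumed layout: <output_dir>/BiologicalAbundances/<search dir>/ContigIdentifications.tsv
    root = Path(output_dir) / "BiologicalAbundances"
    matches = sorted(root.glob("*/ContigIdentifications.tsv"))
    return {path.parent.name: path for path in matches}

# Hypothetical usage:
# for name, path in find_contig_identifications("RayMicrobiomeAnalysis").items():
#     print(name, path)
```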

Quote:
Originally Posted by severin View Post


One last question.
If I have an assembly from say Trinity can I run the assembly through Ray-Meta and have it return abundances based on the transcripts themselves?
A sizable number of people at my institution want this feature too: having Ray build
the de Bruijn graph from sequences assembled with other tools, so that they can benefit
from other capabilities like Ray Communities.

The Ray C++ API for messages actually supports this, but the plugins that build the de Bruijn graph
(namely plugin_SequencesLoader, plugin_KmerAcademyBuilder and plugin_VerticesExtractor) are
working only on reads at the moment.

Quote:
Originally Posted by severin View Post

How dependent is the algorithm on the assembly having been done beforehand?
It's independent. The quantification algorithms work on a colored de Bruijn graph
and do not really use assembled paths for these computations (aside from what's in
the files for contig identification, obviously).

Quote:
Originally Posted by severin View Post

Can I feed Ray-Meta a kmer graph?
No, this is not possible at the moment.
But that's something that could be implemented as Ray (and ABySS too)
supports the Ray Cloud Browser kmer graph format.

The file format is like this:

map.csv (ASCII) (called kmers.txt in Ray)

The file is tab-separated, any line starting with a '#' is a comment.


A line looks like this.

GCGGTTATGCTTGCGTCCACCGTAAGTTCGGATTCAGACTTAATCAAAGGTTTTAACAAAGCGCTGGCAACCCCACGGCGGGGGTATTCAG;47;T;G

See https://github.com/sebhtml/Ray-Cloud...Map-format.txt
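As a rough illustration, the example line above can be parsed with a few lines of Python. This sketch assumes that the four semicolon-separated fields are the k-mer sequence, its coverage depth, the first nucleotides of parent k-mers and the last nucleotides of child k-mers, with multiple nucleotides separated by spaces. That reading of the fields is an assumption here (the linked format document is the authority), and `KmerRecord` and `parse_kmer_line` are hypothetical names.

```python
from typing import List, NamedTuple, Optional

class KmerRecord(NamedTuple):
    sequence: str
    coverage: int
    parents: List[str]   # assumed: first nucleotide of each parent k-mer
    children: List[str]  # assumed: last nucleotide of each child k-mer

def parse_kmer_line(line: str) -> Optional[KmerRecord]:
    """Parse one 'sequence;coverage;parents;children' line; skip comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    sequence, coverage, parents, children = line.split(";")
    return KmerRecord(sequence, int(coverage), parents.split(), children.split())

# Hypothetical usage with a shortened k-mer:
# parse_kmer_line("ACGT;47;T;G")
```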


If you did not know about Ray Cloud Browser, it allows end users to interactively skim processed genomics data with energy.

Demo: http://browser.cloud.boisvert.info/c...location=13000

All you need to get started is a kmer graph and fasta sequences (with Ray: kmers.txt and Contigs.fasta).

Regarding kmer graphs (you mentioned that in your question):

Quote:
Originally Posted by severin View Post

Thanks and really excited to use this tool.
We are also very excited to have end users adopting our highly scalable methods for genomics.
Old 02-21-2013, 10:54 AM   #4
severin
Default estimates of composition

Thanks for the quick reply. As I am working with these features more I am curious about the following.

What does ray do with contigs and scaffolds it cannot assign to a taxon?

Are they included in the composition analysis?
Old 02-21-2013, 11:23 AM   #5
seb567

Quote:
Originally Posted by severin View Post
Thanks for the quick reply. As I am working with these features more I am curious about the following.

What does ray do with contigs and scaffolds it cannot assign to a taxon?

Are they included in the composition analysis?
The composition analysis is performed on the colored de Bruijn graph, not on contigs.


See our Genome Biology paper
Old 02-26-2013, 08:35 AM   #6
severin
Default Nice tool

Sebastien,

This really is a nice tool. Sorry to bombard you with so many questions, but I would like to know the limitations of the tools I am using. In some of my runs, not all the contigs are assigned to a species. Wouldn't this lead to a misrepresentation of what is present in the sample?

How hard would it be to also output the relationship between contig and taxonomic level (order, family, genus, etc.)?

e.g. contig-001 Micrococcineae

In other cases every contig is assigned. How do we then determine the quality of a match to a bacterium or virus, if those are the genomes we are searching, when the contig actually belongs to a eukaryote? That is, a possible misassignment due to the limited number of genomes in the search.

Finally, how does kmer length affect the ability to assign a contig to a species or taxonomic group? Have you looked at this?

Thanks for all your help on this.

Regards,

Andrew
Old 02-26-2013, 05:48 PM   #7
seb567

Quote:
Originally Posted by severin View Post
Sebastien,

This really is a nice tool. Sorry to bombard you with so many questions, but I would like to know the limitations of the tools I am using.

In some of my runs, not all the contigs are assigned to a species. Wouldn't this lead to a misrepresentation of what is present in the sample?
Do you mean that the percentage of unknown life forms is underrepresented ?

Quote:
Originally Posted by severin View Post

How hard would it be to also output the relationship between contig and taxonomic level (order, family, genus, etc.)?
It's just a matter of adding the code in the right place.

Quote:
Originally Posted by severin View Post

e.g. contig-001 Micrococcineae

In other cases every contig is assigned. How do we then determine the quality of a match to a bacterium or virus, if those are the genomes we are searching, when the contig actually belongs to a eukaryote? That is, a possible misassignment due to the limited number of genomes in the search.
If you search for a virus, and a given mammal genome contains all the sequences
of the virus and this mammal genome is not provided to Ray Communities, then yes, Ray
will tell you that it's from a virus.

If you provide Ray Communities with the virus genome and the mammal genome, then the
software will look for those kmers that are not in common, if any.
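The idea of looking for k-mers that are not shared between references can be illustrated with a toy Python sketch. This is not Ray's implementation: it ignores reverse complements and coverage, and the sequences and function names are invented for the example.

```python
def kmers(sequence, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def distinguishing_kmers(references, k):
    """Map each reference name to the k-mers found in no other reference."""
    sets = {name: kmers(seq, k) for name, seq in references.items()}
    unique = {}
    for name, own in sets.items():
        # Union of the k-mer sets of every other reference.
        others = set().union(*(s for other, s in sets.items() if other != name))
        unique[name] = own - others
    return unique

# Toy case: the "mammal" sequence contains the whole "virus" sequence,
# so the virus has no k-mer of its own while the mammal keeps a few.
toy = {"virus": "ACGTAC", "mammal": "TTACGTACGG"}
result = distinguishing_kmers(toy, 4)
```

In this toy case `result["virus"]` is empty, which mirrors the situation described above: once the containing genome is also provided, the virus has nothing left that is specific to it.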

Quote:
Originally Posted by severin View Post

Finally, how does kmer length affect the ability to assign a contig to a species or taxonomic group?
Longer kmers are more specific.

Allowing mismatches would allow sensitive kmer search with large kmers. Mismatches
are not implemented at the moment.
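A toy Python sketch of why longer kmers are more specific, and why exact matching loses sensitivity as k grows: a single substitution destroys every k-mer that overlaps it, and with larger k each k-mer covers more positions. The sequences and function names are invented for illustration; this is not Ray code.

```python
def kmer_set(sequence, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def shared_fraction(query, reference, k):
    """Fraction of the query's distinct k-mers that also occur in the reference."""
    query_kmers = kmer_set(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmer_set(reference, k)) / len(query_kmers)

# Two 12-nt sequences that differ by one substitution in the middle.
query = "ACGTTGCAAGCT"
reference = "ACGTTGGAAGCT"
short_k = shared_fraction(query, reference, 3)  # 7 of 10 k-mers still match exactly
long_k = shared_fraction(query, reference, 6)   # only 1 of 7 k-mers still matches
```

With exact matching, the single mismatch removes 3 of the 10 3-mers but 6 of the 7 6-mers, which is the sensitivity loss that a mismatch-tolerant search would address.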

Quote:
Originally Posted by severin View Post
Have you looked at this?
Not a lot, honestly.

Quote:
Originally Posted by severin View Post

Thanks for all your help on this.

Regards,

Andrew
Old 03-13-2013, 10:49 AM   #8
severin
Default lots of searching

Hi again.

I was wondering if there is a way to restart a search if the run is terminated prematurely.

I am running Ray meta with all genomes from ncbi. I have a sample that contains multiple eukaryotic and microbial transcriptomes of unknown origin.
I have 256 cores on this and it takes about 3 hours to assemble the genome but it takes more than 21 hours to load the genomes I want to search. I get the impression that checkpoints do not include the ray meta analysis. is it possible that this could be included in the checkpoints?


Andrew
Old 03-13-2013, 10:54 AM   #9
seb567

Quote:
Originally Posted by severin View Post
Hi again.

I was wondering if there is a way to restart a search if the run is terminated prematurely.

I am running Ray meta with all genomes from ncbi. I have a sample that contains multiple eukaryotic and microbial transcriptomes of unknown origin.
I have 256 cores on this and it takes about 3 hours to assemble the genome but it takes more than 21 hours to load the genomes I want to search. I get the impression that checkpoints do not include the ray meta analysis. is it possible that this could be included in the checkpoints?


Andrew
What is your command ?
Old 03-13-2013, 11:01 AM   #10
severin
Default command

Quote:
Originally Posted by seb567 View Post
What is your command ?
mpirun -np 256 Ray-v2.1.0/Ray -k 41 -read-write-checkpoints checkpoints -one-color-per-file -search ./6b/ftp.ncbi.nih.gov/genomes/EURKARYOTES/ -search ./6b/ftp.ncbi.nih.gov/genomes/Viruses -search ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/Bacteria ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/Bacteria_DRAFT -search ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/HUMAN_MICROBIOM/Bacteria -search ./6b/ftp.ncbi.nih.gov/genomes/Fungi -with-taxonomy ./4/NCBI-taxonomy/Genome-to-Taxon.tsv ./4/NCBI-taxonomy/TreeOfLife-Edges.tsv ./4/NCBI-taxonomy/Taxon-Names.tsv -i ./TrimmedFiles/Combined.data.Trmatic.sorted.keep.pe.fasta -s ./TrimmedFiles/Combined.data.Trmatic.sorted.keep.se.fasta
Old 03-14-2013, 06:44 AM   #11
seb567

Quote:
Originally Posted by severin View Post
mpirun -np 256 Ray-v2.1.0/Ray -k 41 -read-write-checkpoints checkpoints -one-color-per-file -search ./6b/ftp.ncbi.nih.gov/genomes/EURKARYOTES/ -search ./6b/ftp.ncbi.nih.gov/genomes/Viruses -search ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/Bacteria ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/Bacteria_DRAFT -search ./6b/GIF_2c/ftp.ncbi.nih.gov/genomes/HUMAN_MICROBIOM/Bacteria -search ./6b/ftp.ncbi.nih.gov/genomes/Fungi -with-taxonomy ./4/NCBI-taxonomy/Genome-to-Taxon.tsv ./4/NCBI-taxonomy/TreeOfLife-Edges.tsv ./4/NCBI-taxonomy/Taxon-Names.tsv -i ./TrimmedFiles/Combined.data.Trmatic.sorted.keep.pe.fasta -s ./TrimmedFiles/Combined.data.Trmatic.sorted.keep.se.fasta
Is the standard output file still being updated ?

Also, the -read-write-checkpoints option does not do anything after the scaffolding.
Old 03-14-2013, 07:15 AM   #12
seb567

Quote:
Originally Posted by severin View Post
Hi again.

I was wondering if there is a way to restart a search if the run is terminated prematurely.

I am running Ray meta with all genomes from ncbi. I have a sample that contains multiple eukaryotic and microbial transcriptomes of unknown origin.
I have 256 cores on this and it takes about 3 hours to assemble the genome but it takes more than 21 hours to load the genomes I want to search. I get the impression that checkpoints do not include the ray meta analysis. is it possible that this could be included in the checkpoints?


Andrew
Hi,

I checked the logs, this was fixed on 2012-09-27.

The change is already available to all users with the development version of Ray.

The last stable version of Ray is v2.1.0, which was released on 2012-10-30.

Which version are you using ?
Old 03-14-2013, 07:39 AM   #13
severin

Quote:
Originally Posted by seb567 View Post
Hi,

I checked the logs, this was fixed on 2012-09-27.

The change is already available to all users with the development version of Ray.

The last stable version of Ray is v2.1.0, which was released on 2012-10-30.

Which version are you using ?
I am using Ray v2.1.0. Where do I download the developers version?

Ray --version
Ray version 2.1.0
License for Ray: GNU General Public License version 3
RayPlatform version: 1.1.0
License for RayPlatform: GNU Lesser General Public License version 3

MAXKMERLENGTH: 99
KMER_U64_ARRAY_SIZE: 4
Maximum coverage depth stored by CoverageDepth: 4294967295
MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
FORCE_PACKING = n
ASSERT = n
HAVE_LIBZ = n
HAVE_LIBBZ2 = n
CONFIG_PROFILER_COLLECT = n
CONFIG_CLOCK_GETTIME = n
__linux__ = y
_MSC_VER = n
__GNUC__ = y
RAY_32_BITS = n
RAY_64_BITS = y
MPI standard version: MPI 2.1
MPI library: Open-MPI 1.6.1
Compiler: GNU gcc/g++ Intel(R) C++ g++ 4.4 mode
Old 03-14-2013, 08:16 AM   #14
seb567

Quote:
Originally Posted by severin View Post
I am using Ray v2.1.0. Where do I download the developers version?

To get the development version:

Code:
git clone git://github.com/sebhtml/ray.git
git clone git://github.com/sebhtml/RayPlatform.git
cd ray
make
./Ray -version
Old 03-14-2013, 09:28 AM   #15
severin
Default read-write checkpoints

Quote:
Originally Posted by seb567 View Post
To get the development version:

Code:
git clone git://github.com/sebhtml/ray.git
git clone git://github.com/sebhtml/RayPlatform.git
cd ray
make
./Ray -version

So when you say it is fixed in the developers version does that mean the read-write checkpoints will go beyond the scaffolding process?

Thanks
Old 03-14-2013, 10:35 AM   #16
seb567

Quote:
Originally Posted by severin View Post
So when you say it is fixed in the developers version does that mean the read-write checkpoints will go beyond the scaffolding process?

Thanks
The -read-write-checkpoints option does not do anything after scaffolding in the development version either.

However, that's a feature that could be added.
Old 03-14-2013, 11:43 AM   #17
severin
Default Install Error

Quote:
Originally Posted by seb567 View Post
To get the development version:

Code:
git clone git://github.com/sebhtml/ray.git
git clone git://github.com/sebhtml/RayPlatform.git
cd ray
make
./Ray -version
When I follow your suggestion with the development version, I get the following errors:

icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_KmerAcademyBuilder/Kmer.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_Library/LibraryPeakFinder.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_Library/LibraryWorker.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_Library/Library.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_MachineHelper/MachineHelper.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_MessageProcessor/MessageProcessor.o
icpc: command line warning #10159: invalid argument for option '-std'
CXX code/plugin_Mock/Parameters.o
icpc: command line warning #10159: invalid argument for option '-std'
code/plugin_Mock/Parameters.cpp(2129): warning #68: integer conversion resulted in a change of sign
uint64_t value=-1;
^

If I run the make file without the -std=c++98 the ray program crashes during the step that follows Selection of optimal read markers

[node195:41872] [10] /lib64/libc.so.6(__libc_start_main+0xfd) [0x33bb21ec5d]
[node195:41872] [11] Ray() [0x469429]
[node195:41872] *** End of error message ***
[node193:49049] 8 more processes have sent help message help-odls-default.txt / odls-default:could-not-kill

==> BATCH_OUTPUT.ray4 <==
[-9] ------> AAAAAAAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTAC
[-8] ------> AAAAAAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACG
[-7] ------> AAAAAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGA
[-6] ------> AAAAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGAC
[-5] ------> AAAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACC
[-4] ------> AAAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACCT
[-3] ------> AAATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACCTC
[-2] ------> AATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACCTCA
[-1] ------> ATGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACCTCAA
[0] ------> TGTGCCTTCGTTTCAAGTTCTATTCATTCTACGACCTCAAC


I see someone else had the same error but I didn't see a resolution for it
http://www.mail-archive.com/denovoas.../msg00317.html
Old 03-15-2013, 08:59 AM   #18
seb567

Quote:
Originally Posted by severin View Post
When I follow your suggestion with the development version, I get the following errors:

Someone else also had the problem on 1 sample out of 15 samples during the coloring of the graph (endless processing with v2.1.0 on some samples):


I will fix this. Maybe for the v2.2.0 release, but it will probably appear in the v2.2.1 release later.

Last edited by seb567; 03-15-2013 at 08:59 AM. Reason: added use case
Old 03-27-2013, 09:17 AM   #19
seb567

Hi,

I did a test with the Intel compiler and everything went fine.

Code:
icpc: command line warning #10159: invalid argument for option '-std'
This warning changes nothing for the Ray executable, it's just a warning saying that -std=c++98 is not an option of the Intel compiler.



Old 04-02-2013, 08:43 AM   #20
suzumar
Junior Member
 
Location: France

Join Date: Sep 2012
Posts: 2
Default

Hi Sebastian, and thanks for developing Ray. I am working on a sponge metagenome (Ion Torrent) and I am trying to set up Ray for taxonomy and communities.

I am trying to set up the files for the latest version of Greengenes (2012_08). I have parsed the information in the fasta file into the same format as 2011_01, and I am trying to manually run the script

Paper-Replication-2012 / Build-Input-Files-for-GreenGenes-Taxonomy / main.sh

and I have one question regarding fasta files for Ray Taxonomy and Communities.

I have noticed that, for the NCBI taxonomy, the script Paper-Replication-2012 / Build-Input-Files-for-NCBI-Taxonomy / CreateRayInputStructures.sh

creates a single fasta file for each genome. My question is whether those reference fasta files are just a concatenation of all the .fna files associated with any given genome (so that there are multiple IDs and accessions associated with a given "genome"). This becomes an issue for draft genomes (lots of scaffolds) or eukaryotic chromosomes, which I will have to "manually merge".

Actually, after I double-checked the CreateRayInputStructures.sh script, it seems to be the case, but would you please confirm it?
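For the "manual merge" case, a plain concatenation of the .fna files of one genome could be done along the lines of the Python sketch below. It does not confirm what CreateRayInputStructures.sh actually does; it only shows one way to build a single multi-FASTA file per genome, assuming one directory of .fna files per genome. The function name and paths are hypothetical.

```python
from pathlib import Path

def merge_fna_files(genome_dir, output_path):
    """Concatenate every .fna file in genome_dir into one multi-FASTA file."""
    # Sorted so the order of sequences in the merged file is reproducible.
    sources = sorted(Path(genome_dir).glob("*.fna"))
    with open(output_path, "w", encoding="ascii") as out:
        for source in sources:
            text = source.read_text(encoding="ascii")
            if not text:
                continue
            # Guarantee that the next header starts on its own line.
            out.write(text if text.endswith("\n") else text + "\n")
    return len(sources)

# Hypothetical usage; write the output outside genome_dir (or with another
# extension) so that it is not picked up as an input on a later run:
# merge_fna_files("Bacteria_DRAFT/Some_genome", "merged/Some_genome.fasta")
```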

Marcelino