SEQanswers

Old 04-29-2013, 11:29 AM   #241
yifangt
Member
 
Location: Canada

Join Date: Feb 2011
Posts: 61
Default RE: Ray2.2.0

More questions about the insert size:
1) I checked LibraryStatistics.txt and found that the reported insert size (303 bp) is quite different from what my lab tech provided (~1 kb).
2) Does this mean I have to run Ray once to get the insert size, then provide Ray with these parameters and run it again?
I want to be sure about each point. Ray is really fast and it helps very much. Thanks a lot!
YT

Last edited by yifangt; 04-29-2013 at 11:36 AM.
Old 05-17-2013, 08:54 AM   #242
ercfrtz
Member
 
Location: Iowa

Join Date: Aug 2010
Posts: 23
Default

I am having issues installing Ray. When I run make prefix=ray-build as the installation file says, I get the following error:

Code:
make prefix=ray-build
make[1]: Entering directory `/home/eric.fritz/Ray-v2.2.0/RayPlatform'
  CXX RayPlatform/memory/ReusableMemoryStore.o
make[1]: execvp: mpicxx: Not a directory
make[1]: *** [memory/ReusableMemoryStore.o] Error 127
make[1]: Leaving directory `/home/eric.fritz/Ray-v2.2.0/RayPlatform'
make: *** [RayPlatform/libRayPlatform.a] Error 2
I do have all the requirements on the system. Thanks.
Old 05-23-2013, 12:51 PM   #243
akshaya.ramesh
Member
 
Location: Boston, MA

Join Date: Apr 2013
Posts: 22
Default

Quote:
Originally Posted by seb567 View Post
I fixed this build script.

https://github.com/sebhtml/ray/commi...e15e5ce6146d45
Thank you for fixing the build script. I have been able to modify the Makefile by changing MAXKMERLENGTH, and it is working all right. However, every time I run Ray v2.2.0, the following error comes up (for any k-mer length; I have tried values from 21 to 91, and my reads are 101 bp long):

Rank 0: Assembler panic: no k-mers found in reads.
Rank 0: Perhaps reads are shorter than the k-mer length (change -k).

I am sure I am missing something very basic, but I am a newbie, so any comments would be greatly appreciated.

Thanks,
Akshaya
Old 07-29-2013, 06:35 PM   #244
bwawrik
Junior Member
 
Location: Oklahoma

Join Date: Jul 2013
Posts: 2
Default problem with RAY assembly

Hi,
I've been using Ray very successfully on our cluster, but ran into a problem with my last data set. It's a relatively modest partial MiSeq run of about 750k paired reads that originate from a pure bacterial strain. Originally, I ran the data and got a decent assembly, but I discovered that there was some degree of read-through and that the adapters were not trimmed on the 3' end for all reads (not sure why Illumina BaseSpace does not check for that by default). I used Trimmomatic 0.3 to remove the adapters and tried re-assembling with Ray, but every time I do, it crashes. Basically, it goes into some sort of endless loop that ends up using 80 GB of swap, at which point the cluster bounces my job. I sliced my data into three parts and assembled them separately or in combinations, and the data assembles fine that way (so the data files are not corrupted). Only when I combine all three parts do I get the crash. I added the stderr below. The stdout is over 84 GB, so I only added the end of the file.

Any thoughts on what the problem might be?

Thanks in advance!

stderr:

Ray:18680 terminated with signal 11 at PC=2aebbf8198d7 SP=7fff758b6340. Backtrace:
/usr/mpi/intel/openmpi-1.4.3-qlc/lib64/libopen-pal.so.0(opal_memory_ptmalloc2_int_malloc+0xfd7)[0x2aebbf8198d7]
/usr/mpi/intel/openmpi-1.4.3-qlc/lib64/libopen-pal.so.0(+0x49a05)[0x2aebbf817a05]
/usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x2aebc014809d]
/home/bstamps/Ray/Ray(_ZNSt6vectorI4KmerSaIS0_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS0_S2_EERKS0_+0x130)[0x473bc0]
/home/bstamps/Ray/Ray(_ZN12SeedExtender28markCurrentVertexAsAssembledEP4KmerP13RingAllocatorPiP12StaticVectoriiP13ExtensionDataPbS9_S4_S9_PSt6vectorIS0_SaIS0_EEP7ChooserP10BubbleDataiP20OpenAssemblerChooseriPSA_I12AssemblySeedSaISK_EE+0x1371)[0x4e96e1]
/home/bstamps/Ray/Ray(_ZN12SeedExtender29call_RAY_SLAVE_MODE_EXTENSIONEv+0x5f7)[0x4f16f7]
/home/bstamps/Ray/Ray(_ZN11ComputeCore10runVanillaEv+0x9b)[0x534f7b]
/home/bstamps/Ray/Ray(_ZN11ComputeCore3runEv+0x6e)[0x53638e]
/home/bstamps/Ray/Ray(_ZN7Machine5startEv+0x1697)[0x4487a7]
/home/bstamps/Ray/Ray(main+0x3a)[0x443bba]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2aebc0a67cdd]
/home/bstamps/Ray/Ray[0x443a89]


end of stdout:

Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536865000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 9921 units/second
Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536866000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 12596 units/second
Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536867000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 12235 units/second
Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536868000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 13596 units/second
Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536869000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 10610 units/second
Rank 43: assembler memory usage: 24483328 KiB
Rank 43 reached 536870000 vertices from seed 0, flow 2
Speed RAY_SLAVE_MODE_EXTENSION 12145 units/second
Rank 43: assembler memory usage: 24483328 KiB
Job /lsf/7.0/linux2.6-glibc2.3-x86_64/bin/openmpi_wrapper /home/bstamps/Ray/Ray -k 31 -p /scratch/bwawrik/Colin/output_forward_paired2.fastq /scratch/bwawrik/Colin/output_reverse_paired2.fastq -o /scratch/bwawrik/Colin/ray_adapt_rem -minimum-contig-length 1000
Old 07-30-2013, 10:24 AM   #245
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by yifangt View Post
More questions about the insert size:
1) I checked LibraryStatistics.txt and found that the reported insert size (303 bp) is quite different from what my lab tech provided (~1 kb).
2) Does this mean I have to run Ray once to get the insert size, then provide Ray with these parameters and run it again?
I want to be sure about each point. Ray is really fast and it helps very much. Thanks a lot!
YT
Hi !

You can use the data in LibraryData.xml to check the actual insert-size distribution of your libraries, and then maybe show this information to your laboratory technician. The 303 bp reported by Ray is the average for the peak detected in the data.

For mate-pair libraries, you will sometimes also see a low-frequency peak on the right (in your case at 1 kb).
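
As a rough illustration of what looking for peaks in an insert-size distribution means, here is a minimal Python sketch. It assumes the distribution has already been loaded into a dictionary that maps insert size (bp) to the number of read pairs. The counts are invented, and parsing LibraryData.xml is left out because its layout is not shown in this thread. The names find_peaks, window and min_count are hypothetical and are not part of Ray.

```python
def find_peaks(hist, window=50, min_count=1):
    """Return insert sizes whose count is strictly larger than every
    other count within +/- window bp (a simple local-maximum test)."""
    peaks = []
    for size, count in sorted(hist.items()):
        if count < min_count:
            continue
        neighbours = [c for s, c in hist.items()
                      if s != size and abs(s - size) <= window]
        if all(count > c for c in neighbours):
            peaks.append(size)
    return peaks


# Invented example: a dominant peak near 300 bp and a small one near 1 kb.
example_hist = {280: 40, 290: 120, 300: 300, 310: 150, 320: 30,
                980: 5, 1000: 12, 1020: 4}

print(find_peaks(example_hist))
```

With these made-up counts the sketch reports one peak near 300 bp and a second, much smaller one near 1 kb, which is the kind of pattern described above.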

Thanks for the good words about Ray. It is appreciated !
Old 07-30-2013, 10:25 AM   #246
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by ercfrtz View Post
I am having issues installing Ray. When I run make prefix=ray-build as the installation file says, I get the following error:

Code:
make prefix=ray-build
make[1]: Entering directory `/home/eric.fritz/Ray-v2.2.0/RayPlatform'
  CXX RayPlatform/memory/ReusableMemoryStore.o
make[1]: execvp: mpicxx: Not a directory
make[1]: *** [memory/ReusableMemoryStore.o] Error 127
make[1]: Leaving directory `/home/eric.fritz/Ray-v2.2.0/RayPlatform'
make: *** [RayPlatform/libRayPlatform.a] Error 2
I do have all the requirements on the system. Thanks.
Can you provide the output of this command:

type mpicxx
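
One possible reading of "execvp: mpicxx: Not a directory" (an assumption, not something confirmed in this thread) is that mpicxx is not reachable through PATH and that one of the PATH entries is a regular file instead of a directory. The Python sketch below only inspects PATH; the function name diagnose_path is hypothetical.

```python
import os
import shutil


def diagnose_path(command, path=None):
    """Report where `command` resolves and which PATH entries look suspicious."""
    if path is None:
        path = os.environ.get("PATH", "")
    entries = [entry for entry in path.split(os.pathsep) if entry]
    return {
        # Full path of the executable, or None if it cannot be found.
        "found": shutil.which(command, path=path),
        # Entries that exist but are files, not directories.
        "not_directories": [e for e in entries
                            if os.path.exists(e) and not os.path.isdir(e)],
        # Entries that do not exist at all.
        "missing": [e for e in entries if not os.path.exists(e)],
    }


print(diagnose_path("mpicxx"))
```

If "found" is None, the MPI compiler wrapper is not on PATH in the shell that runs make, whatever else is installed on the system.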
Old 07-30-2013, 10:26 AM   #247
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by akshaya.ramesh View Post
Thank you for fixing the build script. I have been able to modify the Makefile by changing MAXKMERLENGTH, and it is working all right. However, every time I run Ray v2.2.0, the following error comes up (for any k-mer length; I have tried values from 21 to 91, and my reads are 101 bp long):

Rank 0: Assembler panic: no k-mers found in reads.
Rank 0: Perhaps reads are shorter than the k-mer length (change -k).

I am sure I am missing something very basic, but I am a newbie, so any comments would be greatly appreciated.

Thanks,
Akshaya
How many reads do you have? This information is available in the file

RayOutput/NumberOfSequences.txt


This error is usually caused by using k-mers that are longer than the reads, or by assembling 0 reads.
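
To check both conditions directly on the input, a minimal Python sketch could count the reads and the k-mers they can yield. It assumes plain, uncompressed FASTQ with exactly four lines per record. The function name kmer_stats is hypothetical, and the count is simply length - k + 1 per read, which may differ from how Ray itself counts k-mers.

```python
def kmer_stats(fastq_lines, k):
    """Return (reads, reads_long_enough, total_kmers) for 4-line FASTQ records."""
    reads = 0
    long_enough = 0
    total_kmers = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 != 1:          # the sequence is the 2nd line of each record
            continue
        sequence = line.strip()
        reads += 1
        if len(sequence) >= k:
            long_enough += 1
            total_kmers += len(sequence) - k + 1
    return reads, long_enough, total_kmers


# Invented example: one 101 bp read and one 4 bp read.
example = ["@r1", "A" * 101, "+", "I" * 101,
           "@r2", "ACGT", "+", "IIII"]

print(kmer_stats(example, 31))
```

For a real file, the lines can come from open("reads.fastq"), where reads.fastq is a placeholder name. A result with 0 reads, or with no read long enough for the chosen k, matches the two causes listed above.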
Old 07-30-2013, 10:29 AM   #248
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by bwawrik View Post
Hi,
I've been using Ray very successfully on our cluster, but ran into a problem with my last data set. It's a relatively modest partial MiSeq run of about 750k paired reads that originate from a pure bacterial strain. Originally, I ran the data and got a decent assembly, but I discovered that there was some degree of read-through and that the adapters were not trimmed on the 3' end for all reads (not sure why Illumina BaseSpace does not check for that by default). I used Trimmomatic 0.3 to remove the adapters and tried re-assembling with Ray, but every time I do, it crashes. Basically, it goes into some sort of endless loop that ends up using 80 GB of swap, at which point the cluster bounces my job. I sliced my data into three parts and assembled them separately or in combinations, and the data assembles fine that way (so the data files are not corrupted). Only when I combine all three parts do I get the crash. I added the stderr below. The stdout is over 84 GB, so I only added the end of the file.

Any thoughts on what the problem might be?

Thanks in advance!

It says "Rank 43 reached 536870000 vertices from seed 0, flow 2"

That is a lot of vertices !

Which version of Ray are you using ?

I believe that this issue was fixed in Ray 2.2.0. The ticket was https://github.com/sebhtml/ray/issues/161


You can obtain the version of Ray with this command:


mpiexec -n 1 Ray -version
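
Because the stdout of such a run can be very large, a small Python sketch like the one below could pull out the highest vertex count per rank and seed from the "reached ... vertices" lines quoted above. The regular expression is based only on the log lines shown in this thread. The function name max_vertices_per_seed and the threshold are hypothetical; 10 million is used on the assumption that a single bacterial genome is only a few million bases long.

```python
import re

# Matches lines such as: "Rank 43 reached 536870000 vertices from seed 0, flow 2"
PATTERN = re.compile(r"Rank (\d+) reached (\d+) vertices from seed (\d+)")


def max_vertices_per_seed(lines):
    """Map (rank, seed) to the largest vertex count seen in the log lines."""
    best = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            rank, vertices, seed = (int(value) for value in match.groups())
            key = (rank, seed)
            best[key] = max(best.get(key, 0), vertices)
    return best


log_tail = [
    "Rank 43: assembler memory usage: 24483328 KiB",
    "Rank 43 reached 536869000 vertices from seed 0, flow 2",
    "Speed RAY_SLAVE_MODE_EXTENSION 10610 units/second",
    "Rank 43 reached 536870000 vertices from seed 0, flow 2",
]

THRESHOLD = 10_000_000  # hypothetical cut-off for a runaway extension
for (rank, seed), vertices in max_vertices_per_seed(log_tail).items():
    if vertices > THRESHOLD:
        print(f"Rank {rank}, seed {seed}: {vertices} vertices looks like a runaway extension")
```

Passing an open file handle instead of a list reads the log line by line, so the whole file never has to fit in memory.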
Old 07-30-2013, 08:45 PM   #249
bwawrik
Junior Member
 
Location: Oklahoma

Join Date: Jul 2013
Posts: 2
Default

Quote:
Originally Posted by seb567 View Post
It says "Rank 43 reached 536870000 vertices from seed 0, flow 2"

That is a lot of vertices !

Which version of Ray are you using ?

I believe that this issue was fixed in Ray 2.2.0. The ticket was https://github.com/sebhtml/ray/issues/161
You can obtain the version of Ray with this command:
mpiexec -n 1 Ray -version
We were running v2.1.0. After installing 2.3.0, the data ran great, thank you. I also think your link was right on. Somehow it created an endless loop, and we think it ended up reaching the swap limit on one of our nodes, which bounced the job.

Regardless, it works great now! Thanks for the help!

B
Old 08-11-2013, 10:25 AM   #250
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44
Default

Hi everyone!

I am having trouble installing Ray.
After running this line:
./scripts/Build-Link-Time-Optimization.sh

I get this error message:
/bin/ld: skipping incompatible /usr/lib64/libbz2.so when searching for -lbz2
/bin/ld: cannot find -lbz2
collect2: error: ld returned 1 exit status
strip: 'Ray': No such file

What could be the issue with -lbz2?

When I look for this library, I get:
# locate libbz2
/opt/google/chrome/libbz2.so.1.0
/usr/lib/libbz2.so.1
/usr/lib/libbz2.so.1.0.6
/usr/lib64/libbz2.so
/usr/lib64/libbz2.so.1
/usr/lib64/libbz2.so.1.0.6

Please help!

Last edited by OTU; 08-11-2013 at 10:35 AM.
Old 08-11-2013, 10:35 AM   #251
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by OTU View Post
Hi everyone!

I am having trouble installing Ray.
After running this line:
./scripts/Build-Link-Time-Optimization.sh

I get this error message:
/bin/ld: skipping incompatible /usr/lib64/libbz2.so when searching for -lbz2
/bin/ld: cannot find -lbz2
collect2: error: ld returned 1 exit status
strip: 'Ray': No such file

What could be the issue with -lbz2?

Please help!
Hello,

I have never seen the message "skipping incompatible /usr/lib64/libbz2.so when searching for -lbz2" before.
It is likely related to your toolchain, not to Ray.

Can you do:

file /usr/lib64/libbz2.so

Also:


Which compiler are you using ?

What is your operating system ?
Old 08-11-2013, 10:41 AM   #252
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44
Default

Ok,

I got:
# file /usr/lib64/libbz2.so
/usr/lib64/libbz2.so: symbolic link to `libbz2.so.1'

I am using MPI, as the instructions say to do.
Operating system: Fedora 19
Old 08-11-2013, 11:09 AM   #253
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by OTU View Post
Ok,

I got:
# file /usr/lib64/libbz2.so
/usr/lib64/libbz2.so: symbolic link to `libbz2.so.1'

I am using MPI, as the instructions say to do.
Operating system: Fedora 19
Did you try to build Ray with the Makefile, using the "make" command?


Which version of Ray are you building?


In Fedora 19, there is a precompiled package, Ray-openmpi.x86_64, that provides Ray.

I am using Fedora 18 (x86_64) myself, and I have never seen this cryptic error message from the linker ld.
Old 08-11-2013, 11:21 AM   #254
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44
Default

Yes, I was trying to use the make command.
I am installing Ray-v2.2.0.

I tried to install what you suggested:
[root@localhost Ray-v2.2.0]# rpm -Uvh '/home/annet/Downloads/epel-release-6-8.noarch.rpm'
warning: /home/annet/Downloads/epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
error: Failed dependencies:
fedora-release conflicts with epel-release-6-8.noarch
Old 08-11-2013, 11:24 AM   #255
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by OTU View Post
Yes, I was trying to use the make command.
I am installing Ray-v2.2.0.

I tried to install what you suggested:
[root@localhost Ray-v2.2.0]# rpm -Uvh '/home/annet/Downloads/epel-release-6-8.noarch.rpm'
warning: /home/annet/Downloads/epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
error: Failed dependencies:
fedora-release conflicts with epel-release-6-8.noarch
You can simply type this to install in Fedora 19:

sudo yum install Ray-openmpi




For your buddy libbz2.so, can you type this command:

file -L /usr/lib64/libbz2.so


I suppose this is an issue with the architecture of the library.
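
GNU ld normally prints "skipping incompatible" when a library it finds was built for a different architecture than the objects being linked, for example a 64-bit library during a 32-bit link. As a complement to the file command, the minimal Python sketch below reads the ELF header of a library and reports whether it is 32-bit or 64-bit. The function name elf_class is hypothetical, and the two paths are simply the ones from the locate output earlier in the thread.

```python
import os


def elf_class(path):
    """Return '32-bit' or '64-bit' for an ELF file, or None if it is not ELF."""
    with open(path, "rb") as handle:      # open() follows symbolic links
        header = handle.read(5)
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 of the ELF identification is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")


for candidate in ("/usr/lib64/libbz2.so", "/usr/lib/libbz2.so.1"):
    if os.path.exists(candidate):
        print(candidate, elf_class(candidate))
```

If the library reports 64-bit while the compiler is producing 32-bit objects (or the reverse), that mismatch would explain the linker message.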
Old 08-11-2013, 11:27 AM   #256
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44
Default

Installing through yum worked! Thank you!
And for libbz2:
[root@localhost Ray-v2.2.0]# file -L /usr/lib64/libbz2.so
/usr/lib64/libbz2.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x7705fc6fbd1d6aa0ffde9b4f54189a5b637e3ee0, stripped

Is this ok?
Old 08-11-2013, 11:35 AM   #257
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by OTU View Post
Installing through yum worked! Thank you!
And for libbz2:
[root@localhost Ray-v2.2.0]# file -L /usr/lib64/libbz2.so
/usr/lib64/libbz2.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x7705fc6fbd1d6aa0ffde9b4f54189a5b637e3ee0, stripped

Is this ok?
The package you installed with yum is 2.1.0-6.fc19 (not 2.2.0).


So if you also want 2.2.0, can you try this (with gz support but without bz2 support):


make clean
make HAVE_LIBZ=y HAVE_LIBBZ2=n
Old 08-11-2013, 11:54 AM   #258
OTU
Member
 
Location: Utah

Join Date: May 2013
Posts: 44
Default

Em....

And after this, what should I do? Does this mean the end of installation of Ray-v2.2.0?
I am sorry, I am just confused with all this stuff...
Old 08-12-2013, 09:46 AM   #259
seb567
Senior Member
 
Location: Québec, Canada

Join Date: Jul 2008
Posts: 260
Default

Quote:
Originally Posted by OTU View Post
Em....

And after this, what should I do? Does this mean the end of installation of Ray-v2.2.0?
I am sorry, I am just confused with all this stuff...
No. The command above is to build Ray with gz support but no bz2 support.


Did you try:

make clean
make HAVE_LIBZ=y HAVE_LIBBZ2=n
Old 11-20-2013, 02:48 PM   #260
Brian E
Junior Member
 
Location: DC metro area

Join Date: Oct 2010
Posts: 5
Default

So, to add to an already massive thread: it looks like Ray is going to be the assembler we use for our metagenome work, and I was wondering whether it is possible to obtain the location where each read ends up in the assembly. We're using multiple biological replicates, sequenced separately (multiplexed), and we know that there is some variation in the abundance of the organisms present in the community. What I would like is the contribution of each individual sample to the local coverage of the whole assembly. I think this could help with pulling out individual genomes.
I know I could get at this by mapping with e.g. Bowtie, but if it is possible to keep track of where these reads end up, it would save a significant amount of time on our computer cluster allocation. Is this possible?

Tags
assembler, genome, illumina, mix
