SEQanswers

Old 07-29-2015, 10:33 PM   #1
Haumich
Junior Member
 
Location: Israel

Join Date: Jul 2015
Posts: 3
ABySS never completes a run

Hello Everyone,

I am currently trying to assemble mammalian paired-end DNA-seq files from Illumina using ABySS. The problem is that ABySS gets stuck during the assembly and never finishes (or at least it had not after seven days).
The data consists of two 80 GB FASTQ files that I fed to ABySS 1.5.2. After days, the terminal still shows
0: Reading `/data/reads_1.fq'...
1: Reading `/data/reads_2.fq'...
It always stays at this point if I try to read multiple files. I can see that no data is actually being read into RAM.
When I merge those files into a single one, the program reads data into RAM, makes it to "Finding adjacent k-mer..." and remains there.
I can see that the processes are still running, but no files are written to disk. I am working on a single node with 32 processors and 720 GB RAM. The server uses SLURM and Open MPI version 1.8.5.
I have read that the MPI eager limit is sometimes too small, so I set it to a higher value using a given formula, but that didn't solve the problem.
The command I use:
Code:
sbatch --job-name=abyss --partition=hive --nodes=1 --ntasks-per-node=32 --wrap "
abyss-pe k=51 n=10 name=test np=32 in='/data/reads_1.fq /data/reads_2.fq'
"
If anyone has had a similar problem and would like to share some thoughts, any help is appreciated.

EDIT
Thanks pmiguel for the remark.
With the "v=-vv" option, I got additional output. It seems that ABySS really does take that long to load the reads. In the beginning, it takes about 35 seconds to read 100,000 reads. That increases to about 1 minute and so on. I have the feeling it's the server that is so slow.
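
The rate quoted above allows a rough projection of the total loading time. What follows is only a minimal sketch: it assumes 4-line FASTQ records and a constant rate, whereas the rate reported here gets worse over time, so the result is at best a lower bound. The function names `count_reads` and `projected_load_seconds` are hypothetical helpers and not part of ABySS.

```shell
# Hypothetical helpers, not part of ABySS.

# Count reads in a FASTQ file, assuming exactly 4 lines per record.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Project the loading time in seconds for $1 reads
# at $2 seconds per 100,000 reads (constant-rate assumption).
projected_load_seconds() {
    awk -v reads="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", reads / 100000 * secs }'
}

# Example (commented out; path and rate taken from the post):
# projected_load_seconds "$(count_reads /data/reads_1.fq)" 35
```

Counting the lines of an 80 GB file takes a while by itself, so it may be worth running the count once and reusing the number.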

Last edited by Haumich; 08-03-2015 at 06:16 AM. Reason: Additional information
Old 07-31-2015, 11:27 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

I suggest adding "v=-vv" (verbose mode) to your abyss-pe parameters to get a better idea of what is hanging it up. As long as your stdout and stderr are being captured to a log file, you will be able to see what the program is up to.
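
One way to capture both streams is plain shell redirection. This is a minimal sketch; `run_logged` is a hypothetical wrapper name, and the log file name in the example is an assumption. The abyss-pe arguments are the ones from the first post, with "v=-vv" added.

```shell
# Hypothetical wrapper: run a command and send stdout and stderr
# to the same log file. The command's exit status is passed through.
run_logged() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Example (commented out; arguments taken from the first post):
# run_logged abyss.log abyss-pe k=51 n=10 name=test np=32 v=-vv \
#     in='/data/reads_1.fq /data/reads_2.fq'
```

While the job runs, `tail -f` on the log file shows the verbose progress messages as they arrive.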

Also, maybe you should update ABySS? The most recent release is 1.9.0
--
Phillip
Old 08-10-2015, 10:19 AM   #3
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Quote:
Originally Posted by Haumich View Post
With the "v=-vv" option, I got additional output. It seems that ABySS really does take that long to load the reads. In the beginning, it takes about 35 seconds to read 100,000 reads. That increases to about 1 minute and so on. I have the feeling it's the server that is so slow.
I think, but do not know, that it is ABySS that is slow. The more reads it has to work with, the slower it goes, and the slowdown is much worse than linear. It could be a hashing issue (in which case normalized or sub-sampled reads might help) or a memory space issue. Phillip and I are currently working on a mammalian genome and might get some insights into ABySS's performance along the way.
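
For the sub-sampling idea, a rough sketch of how read pairs could be thinned out while keeping the two files in sync. It assumes 4-line FASTQ records, mates in the same order in both files, and no tab characters inside any line. `subsample_pairs` is a hypothetical helper, not an ABySS tool, and the fraction and seed are arbitrary choices.

```shell
# Hypothetical helper: keep roughly FRACTION of the read pairs.
# Usage: subsample_pairs in1.fq in2.fq out1.fq out2.fq FRACTION [SEED]
subsample_pairs() {
    in1=$1; in2=$2; out1=$3; out2=$4; fraction=$5; seed=${6:-42}
    # Start from empty outputs so they exist even if no pair is kept.
    : > "$out1"
    : > "$out2"
    # paste joins line N of both files with a tab, so one decision per
    # record (made on its header line) applies to both mates.
    paste "$in1" "$in2" |
    awk -F '\t' -v frac="$fraction" -v seed="$seed" \
        -v o1="$out1" -v o2="$out2" '
        BEGIN       { srand(seed) }
        NR % 4 == 1 { keep = (rand() < frac) }
        keep        { print $1 > o1; print $2 > o2 }
    '
}

# Example (commented out; input paths from the first post, outputs assumed):
# subsample_pairs /data/reads_1.fq /data/reads_2.fq sub_1.fq sub_2.fq 0.1
```

The number of pairs kept is only approximately the requested fraction, because each pair is kept or dropped independently.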

BTW: The ABySS forum is now hosted on Biostars.

https://www.biostars.org/t/abyss/

In particular you could look at:

https://www.biostars.org/p/150328/
Old 08-10-2015, 10:51 PM   #4
Haumich
Junior Member
 
Location: Israel

Join Date: Jul 2015
Posts: 3

Hello,

Thanks, everyone, for your suggestions. It seems that the problem was Open MPI. At least, once I switched to MPICH 3.1.3, the speed of the assembly increased greatly and the program finished in about a day.
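
To check which MPI implementation a launcher belongs to, its version text can be inspected. A small sketch, under the assumption that Open MPI's launcher identifies itself with the string "Open MPI" and MPICH's launcher with "HYDRA" or "MPICH"; `mpi_flavor` is a hypothetical helper name.

```shell
# Hypothetical helper: classify MPI launcher version text.
# The matched strings are assumptions about what the launchers print.
mpi_flavor() {
    case "$1" in
        *"Open MPI"*)    echo openmpi ;;
        *HYDRA*|*MPICH*) echo mpich ;;
        *)               echo unknown ;;
    esac
}

# Example (commented out; needs an MPI launcher on the PATH):
# mpi_flavor "$(mpirun --version 2>&1)"
```

Note that this only identifies the launcher found on the PATH; the ABySS binaries also have to be built against the same MPI.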
Tags
abyss-pe, dnaseq
