SEQanswers

Old 11-17-2011, 06:18 PM   #1
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31
bwa mt branch extreme memory usage?

We attempted to align some Illumina reads with bwa using the multi-thread branch. We submitted the job to a cluster and it landed on a node with ~100 GB RAM and 48 cores. RAM usage exceeded the system resources and crashed the node.

We trimmed the data with dynamic trim but had not yet filtered out the shortened reads. Could this be responsible for the extreme RAM usage, or is there a known issue with the mt branch?

(Attempting to run this with the single-threaded sampe/samse took so long that I suspect the memory issue might be present there as well.) I'm having a hard time diagnosing the cause of the problem.
Old 11-18-2011, 01:18 PM   #2
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

You may want to ping the bwa folks (bio-bwa-help@lists.sourceforge.net) or PM lh3 (the author).
Old 11-19-2011, 10:43 AM   #3
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

I am also contacting them there, just wondering if anybody else has tried the mt branch with the same problems.
Old 11-19-2011, 11:20 AM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

1) Examine the fastq input. Make sure the read names are correct. Make sure there are 4 lines per read. Make sure somebody didn't accidentally redirect stderr into the stdout at some step and mess up the input to the next step.

2) Check your genome build files (what you "indexed") for bwa to align reads against. Are you using the same version of BWA for indexing and alignment? There is a very new BWA where the indexing must be redone.

3) Where does it fail? At the "aln" step or the "samse/sampe" step of BWA?
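
The check in 1) can be sketched in a few lines of Python. This is only a rough illustration, assuming plain 4-line FASTQ records (no wrapped sequence lines), which is typical for Illumina output; `check_fastq` and `max_problems` are hypothetical names made up for this example and are not part of bwa.

```python
import io


def check_fastq(handle, max_problems=20):
    """Scan a 4-line-per-read FASTQ stream and return (line_number, message) problems.

    Assumes unwrapped records: name, sequence, '+' separator, qualities.
    Reads the stream line by line, so memory use stays small on large files.
    """
    problems = []
    record = []
    line_no = 0
    for line_no, line in enumerate(handle, start=1):
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue
        name, seq, plus, qual = record
        start = line_no - 3  # line number of the read name
        if not name.startswith("@"):
            problems.append((start, "read name does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((start + 2, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((start + 1, "sequence and quality lengths differ"))
        record = []
        if len(problems) >= max_problems:
            return problems  # stop early on badly broken input
    if record:
        # Leftover lines mean the total line count is not a multiple of 4.
        problems.append((line_no, "file ends with an incomplete record"))
    return problems


# Small in-memory demo: the second "record" is a stray line, such as a
# stderr message that was redirected into the FASTQ by mistake.
sample = io.StringIO("@read1\nACGT\n+\nIIII\n[warning] something went wrong\n")
for line_number, message in check_fastq(sample):
    print(line_number, message)
```

For a real file, open it with `open(path)` (or `gzip.open(path, "rt")` for compressed input) and pass the handle in. A clean result does not prove the reads are good, only that the basic record layout holds.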
Old 11-20-2011, 12:17 AM   #5
Kotoro
Member
 
Location: Farmington CT

Join Date: May 2011
Posts: 31

The aln step completes just fine. It's the sampe step that appears to have a memory leak. The program was using more than 100 GB of RAM, which simply shouldn't be happening.

Remember, this isn't the vanilla bwa available on the SourceForge page; I went to the git repository and tested the multi-threaded code branch.

I am also receiving some assistance from the bio-bwa-help mailing list.


M. Gooch
All times are GMT -8.