SEQanswers

Old 05-05-2014, 07:55 PM   #1
acoada
Junior Member
 
Location: China

Join Date: Dec 2013
Posts: 8
What if samtools sort failed while merging from temp files?

I was sorting a huge BAM file using
Code:
samtools sort -@ 5 -m 4G  <in.bam> <out.prefix>
While merging from temp files
Code:
 [bam_sort_core] merging from 630 files...
it was killed for lack of memory.

So I checked the temp files and found that I could not simply use samtools merge to combine them, because they were not in order.

What should I do if I don't want to start over again?
Old 05-05-2014, 09:08 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

FYI, the "-@ 5" and "-m 4G" specify 4GB per thread for 5 threads, so 20GB total. You need to reduce the number of threads or the memory limit if you were trying to stay under 4GB.
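
The arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper names (`parse_size`, `total_sort_memory`, `per_thread_for_budget`) are made up for this example, and it assumes the K/M/G suffixes are binary multiples. The result is a rough planning figure. Actual usage can run higher, as the 25GB reported later in this thread shows.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '4G', '768M' or '1024' into bytes.

    Assumption: K/M/G are treated as binary multiples (1024-based).
    """
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def total_sort_memory(threads: int, per_thread: str) -> int:
    """Approximate total bytes requested: per-thread limit times thread count."""
    return threads * parse_size(per_thread)


def per_thread_for_budget(threads: int, budget: str) -> int:
    """Largest per-thread limit (in bytes) that keeps the total within budget."""
    return parse_size(budget) // threads


if __name__ == "__main__":
    gib = 1024**3
    # "-@ 5 -m 4G" asks for roughly 5 * 4G in total.
    print(total_sort_memory(5, "4G") / gib)
    # To stay near 4G overall with 5 threads, each thread gets about this many MiB.
    print(per_thread_for_budget(5, "4G") // 1024**2)
```

So to stay near 4GB overall with 5 threads, the per-thread value passed to "-m" would need to be somewhere around 800M rather than 4G.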
Old 05-05-2014, 09:13 PM   #3
acoada
Yep, the total memory should stay under 20 GB, yet it reached 25 GB while merging, so it was killed.

I just want to know whether there is any way to tell samtools to redo the merge (just the merge), because the temp files are already sorted in some way (fast sort? I'm not sure) and I don't want to waste time redoing the sort.

Thank you.
Old 05-05-2014, 09:19 PM   #4
Brian Bushnell

No, it probably died sorting the temp files. The split and merge parts are both low-memory; only the sort is high-memory.
Old 05-05-2014, 09:24 PM   #5
acoada
samtools has told me that
Code:
  [bam_sort_core] merging from 630 files...
so I am sure that all the temp files had been sorted and that samtools was killed while merging.

The point is: is there any way to redo only the merge? The sorting was really time-consuming.
Old 05-05-2014, 09:32 PM   #6
Brian Bushnell

I imagine that the merge and sort are done simultaneously. Anyway - if you want results you can trust, of course you can't use temp files from a program that crashed with a nondeterministic out-of-memory error; they could be in any condition.

Note - I could be wrong; I can't remember the order of samtools' sort messages since I always ignore them. But there's no reason for it to need 25GB RAM while merging unless it is sorting at the same time, or writes are being internally buffered; and in either case, the temp files could be corrupt.

Last edited by Brian Bushnell; 05-05-2014 at 09:37 PM.
Old 05-05-2014, 10:04 PM   #7
acoada
You're right. Samtools does sort while merging the temp files, because there is no overall order between the temp files (each temp file contains reads from all the chromosomes).
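
To make that concrete, here is a toy sketch in Python of merging sorted runs. It is not samtools' actual implementation, only the general k-way merge idea, and the three runs and read names are invented for illustration. Each run is sorted internally but spans every chromosome, so the runs cannot simply be concatenated; the merge has to interleave them record by record.

```python
import heapq

# Hypothetical sorted runs standing in for temp files.
# Each record is (chromosome_index, position, read_name).
run_a = [(0, 100, "r1"), (0, 900, "r4"), (1, 50, "r7")]
run_b = [(0, 300, "r2"), (1, 10, "r5"), (1, 700, "r8")]
run_c = [(0, 500, "r3"), (1, 20, "r6"), (2, 5, "r9")]


def merge_runs(*runs):
    """Lazily interleave already-sorted runs into one sorted stream."""
    return heapq.merge(*runs)


def concatenate_runs(*runs):
    """Naive alternative: append the runs one after another."""
    combined = []
    for run in runs:
        combined.extend(run)
    return combined


merged = list(merge_runs(run_a, run_b, run_c))
naive = concatenate_runs(run_a, run_b, run_c)

if __name__ == "__main__":
    print([name for _, _, name in merged])
    print("concatenation is sorted:", naive == sorted(naive))
```

The merge only holds the current head of each run at a time, which is why a merge step on its own would normally be expected to be light on memory.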