Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post |
sorting bam files | frymor | Bioinformatics | 23 | 02-10-2016 06:46 PM |
repeat sequences/large files in galaxy | Giles | Bioinformatics | 2 | 06-27-2011 12:08 PM |
sorting sam file | crh | Bioinformatics | 2 | 06-16-2011 07:45 AM |
blast e-value sorting | NicoBxl | Bioinformatics | 10 | 03-09-2011 08:31 AM |
"R Killed" when working with large BAM files | mixter | Bioinformatics | 2 | 07-05-2010 12:47 AM |
#1 |
Member
Location: Italy | Join Date: Sep 2010
Posts: 55
|
Hi there,
I used SOAP2 to align 45 GB of Illumina reads to an indexed reference. Next I need to run SOAPsnp to build the consensus sequence and call SNPs, but SOAPsnp requires the SOAP2 output to be sorted. The problem: the file for chromosome 1 alone is 153 GB! How can I sort such a huge file? I was going to try the Unix sort command, but will it use an amount of memory comparable to the size of the file? (I don't know whether I will have access to a machine with that much free RAM.) Thanks for your help. |
#2 |
Member
Location: Iowa City, IA | Join Date: Jul 2010
Posts: 95
|
I have used the Unix sort command on very large files. It keeps only a small buffer in memory and spills intermediate sorted runs to disk. You can increase the buffer size with -S to improve performance. Test this for yourself by sorting a 1 GB file and watching the memory use.
|
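A minimal sketch of the external-merge behavior described above, using GNU coreutils sort. The file names and sizes here are purely illustrative; the point is that -S bounds the in-memory buffer regardless of input size.

```shell
# make a small unsorted sample file (stand-in for a huge alignment file)
printf '3\n1\n2\n' > sample.txt

# -S caps the in-memory buffer (here 1M); anything larger is split into
# sorted runs on disk and merged, so peak RAM stays near the -S value.
# -T picks the directory for the temporary run files -- point it at a
# disk with enough free space when sorting a really big file.
sort -n -S 1M -T . sample.txt > sample.sorted.txt

cat sample.sorted.txt
rm sample.txt sample.sorted.txt
```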
#3 |
Member
Location: Italy | Join Date: Sep 2010
Posts: 55
|
Great news!
I'm gonna go for it. Thanks! |
#4 |
Senior Member
Location: The University of Melbourne, AUSTRALIA Join Date: Apr 2008
Posts: 275
|
Quote:
As already mentioned, you can use the -S option to increase buffer sizes. You can use it like "-S 80%" to use 80% of available RAM, or "-S 2G" for 2 GB RAM. For more information, read the texinfo page by typing "pinfo sort" or "info sort". |
|
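Putting the advice from this thread together for the original question, here is a hedged sketch of sorting alignment rows by chromosome and position with GNU sort. The column numbers are an assumption for illustration; check which columns hold the chromosome and coordinate in your own soap2 output and adjust the -k keys accordingly.

```shell
# tiny illustrative input: read id, chromosome, position (tab-separated)
printf 'r1\tchr2\t50\nr2\tchr1\t300\nr3\tchr1\t10\n' > hits.txt

# -k2,2 sorts on the chromosome column, -k3,3n numerically on position.
# -S 50% uses half of physical RAM as the buffer; -T puts temp files in
# the current directory (choose a disk with plenty of free space).
sort -k2,2 -k3,3n -S 50% -T . hits.txt > hits.sorted.txt

cat hits.sorted.txt
rm hits.txt hits.sorted.txt
```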