Hi all, first post. Great site!
Thought I'd share a new problem... I'm just starting with Picard tools (version 1.56) to estimate redundancy, and have predictably been wrestling with memory issues...
But not with the heap, as I'd expected. Instead (and I didn't notice this at first), I'm running out of PermGen space. One of my BAM files is really large, but it happens even on much smaller BAM files containing single ends of mate pairs.
I increased PermGen to 1g (-XX:PermSize=1g -XX:MaxPermSize=1g), and it still died, though after 2 hrs of CPU time rather than 10 minutes as before. I've now increased it to 4g and we'll see how it goes.
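One way to check that the -XX:PermSize / -XX:MaxPermSize settings actually took effect is to ask the JVM for its memory pools through the standard java.lang.management API. The following is only a minimal sketch: the class name MemoryPoolReport is made up for illustration, and the pool names it prints depend on the JVM and garbage collector in use. On a Java 6 JVM like the one in the log, the permanent generation typically shows up as a pool with "Perm Gen" in its name; Java 8 and later have no such pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Picard: lists the memory pools of the running JVM.
public class MemoryPoolReport {

    /** One line per memory pool: name, type, used bytes, max bytes (-1 if undefined). */
    static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(String.format("%-30s %-16s used=%d max=%d",
                    pool.getName(), pool.getType(), usage.getUsed(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Run it with the same flags as the Picard job, for example `java -XX:PermSize=1g -XX:MaxPermSize=1g MemoryPoolReport`, and compare the reported max of the permanent generation pool with what was requested.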
Does the permanent generation getting this full point to a memory leak within Picard tools? This seems way beyond what the JVM expects, and I've rarely seen PermGen space problems mentioned, and never for Picard tools.
Cheers,
Doug
[Mon Nov 21 19:11:40 CET 2011] net.sf.picard.sam.MarkDuplicates INPUT=map.CLCh001.lib300.bam_sorted.bam OUTPUT=map.CLCh001.lib300.bam_sorted.bam.PicardDups.bam METRICS_FILE=map.CLCh001.lib300.bam_sorted.bam.MarkDuplicates REMOVE_DUPLICATES=true ASSUME_SORTED=true MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=80000 TMP_DIR=[tmp] MAX_RECORDS_IN_RAM=10000000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 CREATE_INDEX=false CREATE_MD5_FILE=false
[Mon Nov 21 19:11:40 CET 2011] Executing as douglas.scofield@xxxxxxx on Linux 2.6.32-131.17.1.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.6.0_20-b20
INFO 2011-11-21 19:11:40 MarkDuplicates Start of doWork freeMemory: 132124215176; totalMemory: 132857659392; maxMemory: 132857659392
INFO 2011-11-21 19:11:40 MarkDuplicates Reading input file and constructing read end information.
INFO 2011-11-21 19:11:40 MarkDuplicates Will retain up to 527212934 data points before spilling to disk.
[Mon Nov 21 21:44:56 CET 2011] net.sf.picard.sam.MarkDuplicates done. Elapsed time: 153.26 minutes.
Runtime.totalMemory()=132857659392
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
at java.lang.String.intern(Native Method)
at net.sf.samtools.SAMSequenceRecord.<init>(SAMSequenceRecord.java:83)
at net.sf.samtools.SAMTextHeaderCodec.parseSQLine(SAMTextHeaderCodec.java:205)
at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:96)
at net.sf.samtools.BAMFileReader.readHeader(BAMFileReader.java:391)
at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:144)
at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:114)
at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:514)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:167)
at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:122)
at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:267)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:117)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:175)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:101)
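The top frames of the trace show String.intern being called from the SAMSequenceRecord constructor while the @SQ lines of the BAM header are parsed. On a Java 6 JVM interned strings are stored in PermGen, so a header with a very large number of reference sequences could plausibly fill it without any leak being involved. That is a guess from the trace, not something confirmed against the Picard source. A rough way to see how many strings are at stake is to count the @SQ lines and the total length of their SN values. The sketch below assumes the header text arrives on standard input; the class name SqHeaderStats is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper, not part of Picard or samtools.
public class SqHeaderStats {

    /** Returns {number of @SQ lines that carry an SN tag, total characters in those SN values}. */
    static long[] count(Reader headerText) throws IOException {
        BufferedReader in = new BufferedReader(headerText);
        long sequences = 0;
        long nameChars = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.startsWith("@SQ\t")) {
                continue; // only reference sequence lines matter here
            }
            for (String field : line.split("\t")) {
                if (field.startsWith("SN:")) {
                    sequences++;
                    nameChars += field.length() - 3; // length of the name without the "SN:" prefix
                    break;
                }
            }
        }
        return new long[] { sequences, nameChars };
    }

    public static void main(String[] args) throws IOException {
        long[] stats = count(new InputStreamReader(System.in));
        System.out.println("@SQ lines: " + stats[0] + ", total SN characters: " + stats[1]);
    }
}
```

For example: `samtools view -H map.CLCh001.lib300.bam_sorted.bam | java SqHeaderStats`. If the count runs into the millions, the header itself is a likely suspect for the PermGen usage.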