#1 |
Junior Member
Location: Houston Join Date: Oct 2014
Posts: 3
I am developing a pipeline for calling SNPs in RNAseq data (based on Piskol R, Ramaswami G, Li JB. Reliable identification of genomic variants from RNA-seq data. Am J Hum Genet. 2013 Oct 3;93(4):641-51.).
Now that I am scaling up to my real dataset, I would like my pipeline to run faster. The pipeline uses Picard Tools v1.84 (ReorderSam, MarkDuplicates, and BuildBamIndex) and GenomeAnalysisTK v2.3-9-ge5ebf34 (RealignerTargetCreator, IndelRealigner, TableRecalibration, and UnifiedGenotyper). I am working on a computer with 8 cores and 64 GB of memory.
Is there a way to run Picard Tools and GATK on all 8 cores? I've searched for general Java command-line options and for specific Picard Tools and GATK options, to no avail.
Thanks,
Josh
Last edited by JoshT; 10-16-2014 at 10:15 AM. Reason: formatting
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,480
Many of the GATK tools can use multiple cores (see the -nt option). Picard, I think, is mostly single-threaded. Of course, you can also just run Picard on individual samples in parallel (likewise with GATK).
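To illustrate the per-sample approach, here is a minimal Python sketch that starts one single-threaded MarkDuplicates process per sample and keeps at most 8 running at once. The jar path, the sample file naming, and the 7 GB heap per job are assumptions for illustration, not values from this thread; the `INPUT=`, `OUTPUT=`, and `METRICS_FILE=` arguments follow the Picard 1.x command-line style.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical location of the Picard 1.x jar; adjust to your install.
PICARD_JAR = "/path/to/picard-tools/MarkDuplicates.jar"


def markdup_cmd(sample, heap_gb=7):
    """Build one single-threaded MarkDuplicates command for a sample.

    Assumes the input is named <sample>.bam (a made-up convention).
    """
    return [
        "java", f"-Xmx{heap_gb}G", "-jar", PICARD_JAR,
        f"INPUT={sample}.bam",
        f"OUTPUT={sample}.dedup.bam",
        f"METRICS_FILE={sample}.metrics.txt",
    ]


def run_all(samples, jobs=8, runner=subprocess.run):
    """Run one command per sample, with at most `jobs` running at a time.

    Results come back in the same order as `samples`.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # The threads only wait on child processes, so Python itself
        # is not the bottleneck here.
        return list(pool.map(lambda s: runner(markdup_cmd(s)), samples))
```

Calling `run_all(["sampleA", "sampleB"])` would launch both jobs concurrently and return one `CompletedProcess` per sample. The same pattern should work for any single-threaded step, as long as the combined heap sizes fit in RAM.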
#3 |
Member
Location: Taiwan Join Date: Feb 2011
Posts: 19
Sambamba (http://lomereiter.github.io/sambamba/) provides some useful multi-threaded utilities for viewing, sorting, marking duplicates, etc. We have taken advantage of this to speed up our own processing.
#5 |
Junior Member
Location: Houston Join Date: Oct 2014
Posts: 3
adamyao, thanks for the link. I'll check out that program.
#6 |
Member
Location: Sri Lanka Join Date: Mar 2014
Posts: 19
Just a reminder: you can use java -Xmx[your memory]G to raise the maximum Java heap size, so the tools can make fuller use of your RAM. In your case you could use
java -Xmx60G -jar /picard-tools/ ... (leave 4 GB for background processes)
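If several JVMs run at the same time, the same budget has to be split between them. A small sketch of that arithmetic follows; the function names are made up for illustration, and the JVM also uses some memory outside the heap, so treat the result as an upper bound rather than a guarantee.

```python
def heap_per_job_gb(total_gb, reserve_gb, jobs):
    """Whole-GB heap size for each of `jobs` concurrent JVMs."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    usable = total_gb - reserve_gb
    if usable < jobs:
        raise ValueError("not enough memory for 1 GB per job")
    # Integer division rounds down, so the total never exceeds `usable`.
    return usable // jobs


def xmx_flag(total_gb, reserve_gb, jobs):
    """Format the result as a java -Xmx option."""
    return f"-Xmx{heap_per_job_gb(total_gb, reserve_gb, jobs)}G"


print(xmx_flag(64, 4, 1))  # one job on the 64 GB machine: -Xmx60G
print(xmx_flag(64, 4, 8))  # eight jobs at once: -Xmx7G
```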
#7 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,092
Intel has presented some work they have done on NGS/GATK, but I am not sure whether the optimized code is available: http://bioinformatics.gatech.edu/sit...Processing.pdf
#8 |
Member
Location: Germany Join Date: Feb 2010
Posts: 32
In GATK, many tools accept the -nt or -nct options for using multiple threads. However, my advice is to run every step on a benchmarking dataset to find where you get the most data processed per unit of time, and perhaps to analyze multiple datasets at once rather than waste time by using more cores than are beneficial for each step.
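A rough Python sketch of such a benchmark: time one step at several thread counts, then pick the smallest count that is within a tolerance of the fastest run. The jar path, file names, and the 10% tolerance are assumptions for illustration; `-T`, `-R`, `-I`, `-o`, and `-nt` are the GATK 2.x-style arguments.

```python
import subprocess
import time

# Hypothetical jar path and benchmark file names; adjust to your setup.
GATK_JAR = "/path/to/GenomeAnalysisTK.jar"


def genotyper_cmd(threads):
    """Build a UnifiedGenotyper command using `threads` data threads."""
    return [
        "java", "-jar", GATK_JAR, "-T", "UnifiedGenotyper",
        "-R", "ref.fa", "-I", "benchmark.bam",
        "-o", f"bench_nt{threads}.vcf",
        "-nt", str(threads),
    ]


def benchmark(build_cmd, thread_counts, runner=subprocess.run):
    """Return {thread_count: wall-clock seconds} for each setting."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        runner(build_cmd(n))
        timings[n] = time.perf_counter() - start
    return timings


def smallest_good_setting(timings, tolerance=0.10):
    """Smallest thread count whose time is within `tolerance` of the best."""
    fastest = min(timings.values())
    return min(n for n, t in timings.items() if t <= fastest * (1 + tolerance))
```

For example, `smallest_good_setting(benchmark(genotyper_cmd, [1, 2, 4, 8]))` would report the point beyond which extra threads stop paying off for this step; the remaining cores can then go to other samples.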
Tags |
gatk, java, multi-core, multicore, picard |