Hi,
I am running exome analysis using BWA, samtools, Picard, and GATK. When I reached the GATK CountCovariates tool, I received an error: "Bad input: Could not find any usable data in the input BAM file(s)."
The BAM file I used as input for CountCovariates was generated by samtools from BWA SAM files. I merged my BAM files using Picard, ran AddOrReplaceReadGroups, sorted and indexed with samtools, ran MarkDuplicates to create a dedup.bam file, indexed that dedup.bam with samtools, ran RealignerTargetCreator, and ran IndelRealigner to create a realigned.bam, which I then used as input for CountCovariates.
Any idea what is going on?
Script:
java -Xmx5g -jar /Users/Cable/Bioinformatics/Applications/GenomeAnalysisTK-1.6-11-g3b2fab9/GenomeAnalysisTK.jar -R /Users/Cable/Bioinformatics/GATKbundle1.5/ucsc.hg19.fasta -knownSites /Users/Cable/Bioinformatics/GATKbundle1.5/hg19/dbsnp_135.hg19.vcf -I /Users/Cable/Bioinformatics/exomepipelinefiles/LD-04400.realignedretry.bam -T CountCovariates -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile /Users/Cable/Bioinformatics/exomepipelinefiles/LD-04400.recal_data.csv
INFO 20:19:33,071 HelpFormatter - ---------------------------------------------------------------------------------
INFO 20:19:33,073 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.6-11-g3b2fab9, Compiled 2012/06/05 21:00:10
INFO 20:19:33,074 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 20:19:33,074 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 20:19:33,074 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 20:19:33,075 HelpFormatter - Program Args: -R /Users/Cable/Bioinformatics/GATKbundle1.5/ucsc.hg19.fasta -knownSites /Users/Cable/Bioinformatics/GATKbundle1.5/hg19/dbsnp_135.hg19.vcf -I /Users/Cable/Bioinformatics/exomepipelinefiles/LD-04400.realignedretry.bam -T CountCovariates -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile /Users/Cable/Bioinformatics/exomepipelinefiles/LD-04400.recal_data.csv
INFO 20:19:33,075 HelpFormatter - Date/Time: 2012/07/11 20:19:33
INFO 20:19:33,075 HelpFormatter - ---------------------------------------------------------------------------------
INFO 20:19:33,075 HelpFormatter - ---------------------------------------------------------------------------------
INFO 20:19:33,088 RodBindingArgumentTypeDescriptor - Dynamically determined type of /Users/Cable/Bioinformatics/GATKbundle1.5/hg19/dbsnp_135.hg19.vcf to be VCF
INFO 20:19:33,114 GenomeAnalysisEngine - Strictness is SILENT
INFO 20:19:33,296 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 20:19:33,341 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
INFO 20:19:33,358 RMDTrackBuilder - Loading Tribble index from disk for file /Users/Cable/Bioinformatics/GATKbundle1.5/hg19/dbsnp_135.hg19.vcf
INFO 20:19:34,908 CountCovariatesWalker - The covariates being used here:
INFO 20:19:34,908 CountCovariatesWalker - ReadGroupCovariate
INFO 20:19:34,908 CountCovariatesWalker - QualityScoreCovariate
INFO 20:19:34,908 CountCovariatesWalker - CycleCovariate
INFO 20:19:34,909 CountCovariatesWalker - DinucCovariate
INFO 20:19:35,219 CountCovariatesWalker - Writing raw recalibration data...
INFO 20:19:36,952 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 1.6-11-g3b2fab9):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Bad input: Could not find any usable data in the input BAM file(s).
My sam file is as follows:
@HD VN:1.0 GO:none SO:coordinate
@SQ SN:chrM LN:16571
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@SQ SN:chr1_gl000191_random LN:106433
@SQ SN:chr1_gl000192_random LN:547496
@SQ SN:chr4_ctg9_hap1 LN:590426
@SQ SN:chr4_gl000193_random LN:189789
@SQ SN:chr4_gl000194_random LN:191469
@SQ SN:chr6_apd_hap1 LN:4622290
@SQ SN:chr6_cox_hap2 LN:4795371
@SQ SN:chr6_dbb_hap3 LN:4610396
@SQ SN:chr6_mann_hap4 LN:4683263
@SQ SN:chr6_mcf_hap5 LN:4833398
@SQ SN:chr6_qbl_hap6 LN:4611984
@SQ SN:chr6_ssto_hap7 LN:4928567
@SQ SN:chr7_gl000195_random LN:182896
@SQ SN:chr8_gl000196_random LN:38914
@SQ SN:chr8_gl000197_random LN:37175
@SQ SN:chr9_gl000198_random LN:90085
@SQ SN:chr9_gl000199_random LN:169874
@SQ SN:chr9_gl000200_random LN:187035
@SQ SN:chr9_gl000201_random LN:36148
@SQ SN:chr11_gl000202_random LN:40103
@SQ SN:chr17_ctg5_hap1 LN:1680828
@SQ SN:chr17_gl000203_random LN:37498
@SQ SN:chr17_gl000204_random LN:81310
@SQ SN:chr17_gl000205_random LN:174588
@SQ SN:chr17_gl000206_random LN:41001
@SQ SN:chr18_gl000207_random LN:4262
@SQ SN:chr19_gl000208_random LN:92689
@SQ SN:chr19_gl000209_random LN:159169
@SQ SN:chr21_gl000210_random LN:27682
@SQ SN:chrUn_gl000211 LN:166566
@SQ SN:chrUn_gl000212 LN:186858
@SQ SN:chrUn_gl000213 LN:164239
@SQ SN:chrUn_gl000214 LN:137718
@SQ SN:chrUn_gl000215 LN:172545
@SQ SN:chrUn_gl000216 LN:172294
@SQ SN:chrUn_gl000217 LN:172149
@SQ SN:chrUn_gl000218 LN:161147
@SQ SN:chrUn_gl000219 LN:179198
@SQ SN:chrUn_gl000220 LN:161802
@SQ SN:chrUn_gl000221 LN:155397
@SQ SN:chrUn_gl000222 LN:186861
@SQ SN:chrUn_gl000223 LN:180455
@SQ SN:chrUn_gl000224 LN:179693
@SQ SN:chrUn_gl000225 LN:211173
@SQ SN:chrUn_gl000226 LN:15008
@SQ SN:chrUn_gl000227 LN:128374
@SQ SN:chrUn_gl000228 LN:129120
@SQ SN:chrUn_gl000229 LN:19913
@SQ SN:chrUn_gl000230 LN:43691
@SQ SN:chrUn_gl000231 LN:27386
@SQ SN:chrUn_gl000232 LN:40652
@SQ SN:chrUn_gl000233 LN:45941
@SQ SN:chrUn_gl000234 LN:40531
@SQ SN:chrUn_gl000235 LN:34474
@SQ SN:chrUn_gl000236 LN:41934
@SQ SN:chrUn_gl000237 LN:45867
@SQ SN:chrUn_gl000238 LN:39939
@SQ SN:chrUn_gl000239 LN:33824
@SQ SN:chrUn_gl000240 LN:41933
@SQ SN:chrUn_gl000241 LN:42152
@SQ SN:chrUn_gl000242 LN:43523
@SQ SN:chrUn_gl000243 LN:43341
@SQ SN:chrUn_gl000244 LN:39929
@SQ SN:chrUn_gl000245 LN:36651
@SQ SN:chrUn_gl000246 LN:38154
@SQ SN:chrUn_gl000247 LN:36422
@SQ SN:chrUn_gl000248 LN:39786
@SQ SN:chrUn_gl000249 LN:38502
@RG ID:vanishing PL:illumina PU:matter LB:white SM:VWM_04400
@PG ID:bwa PN:bwa VN:0.6.2-r126
@PG ID:GATK IndelRealigner VN:1.6-11-g3b2fab9 CL:knownAlleles=[(RodBinding name=knownAlleles source=/Users/Cable/Bioinformatics/GATKbundle1.5/hg19/1000G_phase1.indels.hg19.vcf)] targetIntervals=/Users/Cable/Bioinformatics/GATKbundle1.5/output.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
HWI-H173:16109L7ACXX:5:1101:10000:100567 77 * 0 0 * * 0 0 TAGTTTCTTTTTCATTCCTGCTCCCTGCCTTAACTCCTCCTCCCACTGCCCCTGATCCCABCCDDDFEFHGHHIJJJJIECIJIJIFHIIJIIJIIJJJJJJIIJJJAHH8BFGHIIIJI RG:Z:vanishing
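As a sanity check on the read shown above, here is a minimal sketch that decodes its FLAG value (77) using the standard SAM flag bits. The function name decode_sam_flag and the label strings are only illustrative and are not part of samtools, Picard, or GATK.

```python
# Standard SAM FLAG bits (bit value, meaning), per the SAM format specification.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "read_unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "read_reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary_alignment"),
    (0x200, "fails_quality_checks"),
    (0x400, "duplicate"),
    (0x800, "supplementary_alignment"),
]


def decode_sam_flag(flag):
    """Return the names of all flag bits set in a SAM FLAG integer."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    # FLAG of the read shown in the SAM excerpt above.
    print(decode_sam_flag(77))
```

For 77 (64 + 8 + 4 + 1) this lists paired, read unmapped, mate unmapped, and first in pair, so the record shown is an unmapped read, which matches its * reference name and 0 position.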
I'm kind of stuck here and don't really know why I am getting this error. I can't find this error in the forum. Any help would be much appreciated.
Thanks,
Nathan