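I am new to BAM files and the GATK tools. I want to call variants from a BAM file (BAM to VCF) by running HaplotypeCaller with the command below, but GATK stops with an error saying that a read is missing the read group (RG) tag.
Is there a way to add the missing RG tag?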
Code:
java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf
But I got this error:
Code:
INFO 19:11:19,792 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:11:19,798 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 19:11:19,798 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 19:11:19,799 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 19:11:19,807 HelpFormatter - Program Args: -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf
INFO 19:11:19,820 HelpFormatter - Executing as zwang10@zwang10-K55N on Linux 3.13.0-74-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_91-b02.
INFO 19:11:19,821 HelpFormatter - Date/Time: 2016/01/03 19:11:19
INFO 19:11:19,822 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:11:19,823 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:11:20,220 GenomeAnalysisEngine - Strictness is SILENT
INFO 19:11:20,537 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 19:11:20,554 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 19:11:20,783 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.23
INFO 19:11:20,887 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 19:11:21,120 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 19:11:22,436 GenomeAnalysisEngine - Done preparing for traversal
INFO 19:11:22,437 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 19:11:22,439 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 19:11:22,440 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 19:11:22,441 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output
INFO 19:11:22,562 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
WARN 19:11:22,563 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
INFO 19:11:22,565 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 19:11:22,930 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
INFO 19:11:27,999 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@51762faf is malformed. Please see http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-input-files-for-sequence-read-data-bam-cramfor more information. Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag, which is required by the GATK. Please see http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
##### ERROR ------------------------------------------------------------------------------------------
zwang10@zwang10-K55N:/media/zwang10/Elements/UK10K$ java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf^C
zwang10@zwang10-K55N:/media/zwang10/Elements/UK10K$ java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf > error
INFO 19:14:00,776 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:14:00,783 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 19:14:00,784 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 19:14:00,785 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 19:14:00,793 HelpFormatter - Program Args: -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf
INFO 19:14:00,806 HelpFormatter - Executing as zwang10@zwang10-K55N on Linux 3.13.0-74-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_91-b02.
INFO 19:14:00,807 HelpFormatter - Date/Time: 2016/01/03 19:14:00
INFO 19:14:00,808 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:14:00,808 HelpFormatter - --------------------------------------------------------------------------------
INFO 19:14:01,199 GenomeAnalysisEngine - Strictness is SILENT
INFO 19:14:01,500 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 19:14:01,517 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 19:14:01,668 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.15
INFO 19:14:01,739 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 19:14:01,982 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 19:14:03,265 GenomeAnalysisEngine - Done preparing for traversal
INFO 19:14:03,266 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 19:14:03,268 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 19:14:03,269 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 19:14:03,270 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output
INFO 19:14:03,390 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
WARN 19:14:03,391 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples.
INFO 19:14:03,393 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 19:14:03,675 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
INFO 19:14:08,680 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@5c0bb1d5 is malformed. Please see http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-input-files-for-sequence-read-data-bam-cramfor more information. Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag, which is required by the GATK. Please see http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
##### ERROR ------------------------------------------------------------------------------------------
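The error message points to a read-group replacement utility. One common way to do this is Picard's AddOrReplaceReadGroups, which writes a new BAM in which every read carries the same RG tag. The following is a minimal sketch, not a confirmed fix for this dataset: the jar location (`picard.jar`), the output name (`with_rg.sorted.bam`), and every `RG*` value are placeholders I made up, and `RGPL=illumina` is only a guess about the platform. It prints the command by default and runs it only when `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: attach one read group to every read with Picard AddOrReplaceReadGroups.
# PICARD_JAR, the output file name, and all RG* values are placeholders
# (assumptions), not values taken from the post.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# Dry run by default: print the command instead of executing it.
# Set RUN=1 in the environment to actually execute it.
if [ "${RUN:-0}" = "1" ]; then runner=""; else runner="echo"; fi

add_read_groups() {
    # $1 = input BAM without RG tags, $2 = output BAM with RG tags
    $runner java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
        I="$1" O="$2" \
        RGID=rg1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=sample1 \
        CREATE_INDEX=true
}

add_read_groups _EGAR00001038931_36843.pe.raw.sorted.bam with_rg.sorted.bam
```

If this works as assumed, the new file (here `with_rg.sorted.bam`) would be passed to HaplotypeCaller with `-I` in place of the original BAM. `CREATE_INDEX=true` expects coordinate-sorted input, which the `.sorted.bam` name suggests but does not guarantee. Running `samtools view -H with_rg.sorted.bam | grep '^@RG'` should show whether the header now contains a read group line.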