  • missing the read group (RG) tag, which is required by the GATK.

    I am new to GATK and BAM files. I want to convert a BAM file into a VCF file by running
    Code:
    java -jar /media/zwang10/Elements/UK10K/GenomeAnalysisTK-3.5/GenomeAnalysisTK.jar -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf
    But I got
    Code:
    INFO  19:14:00,776 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  19:14:00,783 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56 
    INFO  19:14:00,784 HelpFormatter - Copyright (c) 2010 The Broad Institute 
    INFO  19:14:00,785 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
    INFO  19:14:00,793 HelpFormatter - Program Args: -R /media/zwang10/Elements/UK10K/human_g1k_v37.fasta -T HaplotypeCaller -I _EGAR00001038931_36843.pe.raw.sorted.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o raw_variants.vcf 
    INFO  19:14:00,806 HelpFormatter - Executing as zwang10@zwang10-K55N on Linux 3.13.0-74-generic amd64; OpenJDK 64-Bit Server VM 1.7.0_91-b02. 
    INFO  19:14:00,807 HelpFormatter - Date/Time: 2016/01/03 19:14:00 
    INFO  19:14:00,808 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  19:14:00,808 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  19:14:01,199 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  19:14:01,500 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500 
    INFO  19:14:01,517 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
    INFO  19:14:01,668 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.15 
    INFO  19:14:01,739 HCMappingQualityFilter - Filtering out reads with MAPQ < 20 
    INFO  19:14:01,982 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files 
    INFO  19:14:03,265 GenomeAnalysisEngine - Done preparing for traversal 
    INFO  19:14:03,266 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
    INFO  19:14:03,268 ProgressMeter -                 |      processed |    time |         per 1M |           |   total | remaining 
    INFO  19:14:03,269 ProgressMeter -        Location | active regions | elapsed | active regions | completed | runtime |   runtime 
    INFO  19:14:03,270 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output 
    INFO  19:14:03,390 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
    WARN  19:14:03,391 InbreedingCoeff - Annotation will not be calculated. InbreedingCoeff requires at least 10 unrelated samples. 
    INFO  19:14:03,393 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values. 
    INFO  19:14:03,675 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units 
    INFO  19:14:08,680 GATKRunReport - Uploaded run statistics report to AWS S3 
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4): 
    ##### ERROR
    ##### ERROR This means that one or more arguments or inputs in your command are incorrect.
    ##### ERROR The error message below tells you what is the problem.
    ##### ERROR
    ##### ERROR If the problem is an invalid argument, please check the online documentation guide
    ##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ##### ERROR
    ##### ERROR Visit our website and forum for extensive documentation and answers to 
    ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ##### ERROR
    ##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@5c0bb1d5 is malformed. Please see http://gatkforums.broadinstitute.org/discussion/1317/collected-faqs-about-input-files-for-sequence-read-data-bam-cramfor more information. Error details: Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT is missing the read group (RG) tag, which is required by the GATK. Please see http://gatkforums.broadinstitute.org/discussion/59/companion-utilities-replacereadgroups to fix this problem
    ##### ERROR ------------------------------------------------------------------------------------------
    Why is this BAM file missing the RG tag?
    Can someone tell me how to add a read group tag to
    Code:
    Read FCC03A6ABXX:3:2107:11142:198335#TAGCTTAT

  • #2
    Similar question covered in this thread: http://seqanswers.com/forums/showthread.php?t=65293



    • #3
      This is addressed in the GATK documentation:



      Feel free to ask any other GATK-related questions in the GATK forum; we're there to help.
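
For readers landing on this thread, here is a minimal sketch of one common fix, assuming Picard is installed. Read groups are normally assigned at alignment time (for example with `bwa mem -R`), so a BAM aligned without them carries no RG tags, and Picard's `AddOrReplaceReadGroups` can add one read group to every read afterwards. The jar path, the output file name, and all `RG*` values below are placeholders, not taken from the original data. The `RGID`/`RGPU` value simply assumes that the first two fields of the read name (`FCC03A6ABXX:3`) are flowcell and lane. The script only prints the command (a dry run) so the values can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: build a Picard AddOrReplaceReadGroups command for the BAM in this thread.
# All read-group values are placeholders (assumptions); replace them with real metadata.
set -euo pipefail

PICARD_JAR="${PICARD_JAR:-picard.jar}"              # hypothetical path to the Picard jar
IN_BAM="_EGAR00001038931_36843.pe.raw.sorted.bam"   # input BAM from the original post
OUT_BAM="${IN_BAM%.bam}.rg.bam"                     # hypothetical output name

build_rg_cmd() {
  # Print the full command on one line, one space between arguments.
  printf '%s ' java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
    "I=$IN_BAM" "O=$OUT_BAM" \
    RGID=FCC03A6ABXX.3 RGLB=lib1 RGPL=illumina RGPU=FCC03A6ABXX.3 RGSM=sample1 \
    SORT_ORDER=coordinate CREATE_INDEX=true
  printf '\n'
}

# Dry run: show the command without executing it.
build_rg_cmd
```

Once the values look right, copy the printed command and run it, then pass the new BAM to HaplotypeCaller with `-I`. Running `samtools view -H` on the output and looking for an `@RG` line is a quick way to confirm that the header now defines the read group.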
