  • Missing header information in BAM causes GATK UnifiedGenotyper to fail

    Hi, I have mitochondrial sequence data that I aligned with BWA in two ways: once against a chrM-only reference, and once against the full HG19 reference.

    The header of the chrM-only BAM file looks like
    @SQ SN:chrM LN:16569

    When I run GATK UnifiedGenotyper on it, I get this error message:
    ##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/home3/guoy1/CaiMitochondria/analysis_chrM/SNPcall/../out.bam} is malformed: Read HWUSI-EAS614_0001:5:1:1051:19990#0 is missing the read group, which is required by the GATK

    But when I use the BAM file aligned against HG19, the header looks like
    @SQ SN:chr1 LN:249250621
    @SQ SN:chr2 LN:243199373
    @SQ SN:chr3 LN:198022430
    @SQ SN:chr4 LN:191154276
    @SQ SN:chr5 LN:180915260
    @SQ SN:chr6 LN:171115067
    @SQ SN:chr7 LN:159138663
    @SQ SN:chr8 LN:146364022
    @SQ SN:chr9 LN:141213431
    @SQ SN:chr10 LN:135534747
    @SQ SN:chr11 LN:135006516
    @SQ SN:chr12 LN:133851895
    @SQ SN:chr13 LN:115169878
    @SQ SN:chr14 LN:107349540
    @SQ SN:chr15 LN:102531392
    @SQ SN:chr16 LN:90354753
    @SQ SN:chr17 LN:81195210
    @SQ SN:chr18 LN:78077248
    @SQ SN:chr19 LN:59128983
    @SQ SN:chr20 LN:63025520
    @SQ SN:chr21 LN:48129895
    @SQ SN:chr22 LN:51304566
    @SQ SN:chrX LN:155270560
    @SQ SN:chrY LN:59373566
    @SQ SN:chrM LN:16571

    In that case UnifiedGenotyper runs without error. Neither BAM file has read group information in its header, so why would UnifiedGenotyper complain only about the chrM-only one?

    If anyone knows anything about this, please let me know.

    Thanks
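
The error above is about an individual read lacking a read group, so it can help to look at the reads as well as the header. Below is a minimal sketch, assuming the BAM has first been converted to SAM text (header included) by some external tool. The function name `summarize_read_groups` and the example records are hypothetical and only illustrate the check; the alignment fields in the example are made up.

```python
def summarize_read_groups(sam_text):
    """Return (header_rg_ids, reads_missing_rg) for SAM-formatted text.

    header_rg_ids: IDs declared on @RG header lines.
    reads_missing_rg: names of reads that carry no RG:Z: tag.
    """
    header_rg_ids = []
    reads_missing_rg = []
    for line in sam_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: collect the ID of every declared read group.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        header_rg_ids.append(field[3:])
            continue
        # Alignment line: optional TAG:TYPE:VALUE fields start at column 12.
        optional_tags = fields[11:]
        if not any(tag.startswith("RG:Z:") for tag in optional_tags):
            reads_missing_rg.append(fields[0])
    return header_rg_ids, reads_missing_rg


# Hypothetical example: one header line and one read without an RG tag.
example_sam = (
    "@SQ\tSN:chrM\tLN:16569\n"
    "HWUSI-EAS614_0001:5:1:1051:19990#0\t0\tchrM\t1\t37\t4M\t*\t0\t0\tACGT\tIIII\n"
)
rg_ids, missing = summarize_read_groups(example_sam)
print("read groups in header:", rg_ids)
print("reads without RG tag:", missing)
```

If both lists come back the same way for the chrM-only file and the HG19 file, the difference in GATK's behaviour would have to come from somewhere other than the read group tags themselves.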

  • #2
    Is this related to this: "BWA patch to generate read group"


    • #3
      My advice is to use the pre-produced genomes in the gatk_resources.tgz file you can obtain from their site. In my experience the program is too finicky if you try using anything else. Also, you do need RG flags for it to work properly regardless. The link lshen contributed explains how to add them downstream if you didn't add them when you did the alignment. Good luck!
      Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
      Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
      Projects: U87MG whole genome sequence [Website] [Paper]
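
To make "adding RG flags downstream" concrete, here is a minimal sketch of what that changes in the file: one `@RG` line in the header and an `RG:Z:` tag on every read. It works on SAM text only, and the function name `add_read_group` plus the ID, sample, and platform values are hypothetical placeholders. In practice a dedicated tool (for example Picard's AddOrReplaceReadGroups) would do this directly on the BAM; this is only an illustration, not the method described at the link.

```python
def add_read_group(sam_text, rg_id, sample, platform="ILLUMINA"):
    """Return SAM text with an @RG header line and RG:Z: tags on all reads.

    Reads that already carry an RG:Z: tag are left unchanged.
    """
    rg_line = "\t".join(["@RG", "ID:" + rg_id, "SM:" + sample, "PL:" + platform])
    header_lines = []
    read_lines = []
    for line in sam_text.splitlines():
        if not line:
            continue
        if line.startswith("@"):
            header_lines.append(line)
            continue
        fields = line.split("\t")
        # Optional tags start at column 12; add RG only if none is present.
        if not any(tag.startswith("RG:Z:") for tag in fields[11:]):
            fields.append("RG:Z:" + rg_id)
        read_lines.append("\t".join(fields))
    # Declare the read group in the header unless that ID is already there.
    already_declared = any(
        h.startswith("@RG\t") and ("ID:" + rg_id) in h.split("\t")
        for h in header_lines
    )
    if not already_declared:
        header_lines.append(rg_line)
    return "\n".join(header_lines + read_lines) + "\n"


# Hypothetical example input: a chrM-only header and one untagged read.
untagged = (
    "@SQ\tSN:chrM\tLN:16569\n"
    "read1\t0\tchrM\t1\t37\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(add_read_group(untagged, rg_id="grp1", sample="sample1"), end="")
```

The ID on the header line and the value of the per-read tag have to match, which is why the sketch takes a single `rg_id` and uses it in both places.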


      • #4
        Originally posted by lshen:
        Is this related to this: "BWA patch to generate read group"

        http://www.broadinstitute.org/gsa/wi...ate_read_group
        Hello-

        I've never used patch before. It told me 9 of 9 hunks (and then in subsequent attempts 7 of 7 hunks) failed. When I tried recompiling I got the following errors:

        make[1]: Entering directory `/home/arup/software/bwa-0.5.8c'
        make[1]: Nothing to be done for `lib'.
        make[1]: Leaving directory `/home/arup/software/bwa-0.5.8c'
        make[1]: Entering directory `/home/arup/software/bwa-0.5.8c/bwt_gen'
        make[1]: Nothing to be done for `lib'.
        make[1]: Leaving directory `/home/arup/software/bwa-0.5.8c/bwt_gen'
        gcc -c -g -Wall -O2 -m64 -DHAVE_PTHREAD bntseq.c -o bntseq.o
        In file included from bntseq.c:32:
        bntseq.h:87: error: conflicting types for ‘read_group_t’
        bntseq.h:74: error: previous declaration of ‘read_group_t’ was here
        bntseq.h:100: error: conflicting types for ‘read_group_t’
        bntseq.h:87: error: previous declaration of ‘read_group_t’ was here
        make: *** [bntseq.o] Error 1

        Any suggestions?
