  • GATK and SAM format

    I want to call SNPs/indels using the GATK tools.

    However, GATK throws an error because my input BAM file is malformed: some of the @HD, @SQ, @RG and @PG header lines are missing.

    I understand what each component means from the SAM format specification (v1.4-r985) PDF from samtools,

    but I don't know exactly what to enter for each field, or which fields are critical for GATK, because there are so many optional tags.

    Does anyone know? Is there a website or tool that fills them in automatically?

  • #2
    Hi!

    Which aligner produced the BAM files?

    I have never tried this, but samtools has a reheader command that lets you replace (or add?) the header.
    You would have to prepare a file containing the header lines, based on the information specific to your alignment (SAM/BAM) file.
    You would have to include @HD (including the sort order of your alignments), @SQ (information on the reference sequences) and @PG (information on the program used for alignment).
    For the @RG lines, I think you could use the AddOrReplaceReadGroups command from Picard.
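As a rough illustration of the header file described above, here is a minimal Python sketch that assembles the four header line types as tab-separated text. The function name `build_sam_header`, the reference names and lengths, and all read group values are hypothetical placeholders, not values from anyone's real data. The choice of tags (VN and SO on @HD, SN and LN on @SQ, ID on @RG and @PG) follows the SAM specification; which @RG tags GATK insists on beyond ID is an assumption here (SM and PL are commonly needed), so check the GATK documentation for your version.

```python
def build_sam_header(references, read_group, sort_order="coordinate",
                     program_id="bwa", version="1.4"):
    """Return SAM header text with @HD, @SQ, @RG and @PG lines.

    references: list of (name, length) tuples, in the same order as the
                reference FASTA used for alignment.
    read_group: dict of @RG tags; "ID" is mandatory in the SAM spec.
    """
    allowed_orders = {"unknown", "unsorted", "queryname", "coordinate"}
    if sort_order not in allowed_orders:
        raise ValueError(f"invalid sort order: {sort_order}")
    if "ID" not in read_group:
        raise ValueError("read group needs an ID tag")

    lines = [f"@HD\tVN:{version}\tSO:{sort_order}"]

    # One @SQ line per reference sequence (chromosome, contig, ...).
    for name, length in references:
        lines.append(f"@SQ\tSN:{name}\tLN:{length}")

    # Put ID first, then the remaining tags in the order given.
    rg_tags = [f"ID:{read_group['ID']}"]
    rg_tags += [f"{tag}:{value}" for tag, value in read_group.items()
                if tag != "ID"]
    lines.append("@RG\t" + "\t".join(rg_tags))

    lines.append(f"@PG\tID:{program_id}\tPN:{program_id}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
references = [("chr1", 1000), ("chr2", 800)]
read_group = {"ID": "run1", "SM": "sample1", "PL": "ILLUMINA",
              "LB": "lib1", "PU": "unit1"}
header = build_sam_header(references, read_group)
print(header, end="")
```

If the output were saved as `header.sam` (a placeholder name), it could then be applied with something like `samtools reheader header.sam input.bam > output.bam`. Note that adding an @RG line to the header does not tag the individual reads with that read group, which is why Picard's AddOrReplaceReadGroups is the safer route for read groups.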

    maybe that helps...
    cheers!
    Last edited by sdvie; 02-10-2012, 02:26 AM.

    • #3
      I use both bwa and novoalign. My data are Illumina HiSeq paired-end reads.

      Thank you for your reply, sdvie.

      If you don't mind, may I ask for an example SAM file that you have used without errors?

      I only need the header section, not the alignments.

      If you would rather not post it publicly, please send me an email at [email protected] ...

      Thank you

      • #4
        I sent you an example of a header. You can see that there are multiple @SQ lines, because the reference FASTA and its index contain one entry per chromosome (or other sequence). It is crucial that these lines match the reference sequence you use in all steps (alignment, GATK, etc.). For that reason the aligner itself usually adds these lines; I have never done this manually. bwa certainly adds the @HD, @SQ and @PG headers; I have not worked with novoalign. As I said before, the read groups can be added either via a bwa argument during alignment (check the bwa documentation) or with Picard afterwards.
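To make the point about matching @SQ lines concrete, here is a small Python sketch that compares the @SQ lines of a header with a FASTA index (`.fai`, as written by `samtools faidx`), whose first two tab-separated columns are the sequence name and length. The function names and the toy header and index text are hypothetical; this only checks names, lengths and order, and is not a substitute for GATK's own validation.

```python
def parse_sq_lines(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM header."""
    refs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        refs.append((fields["SN"], int(fields["LN"])))
    return refs


def parse_fai(fai_text):
    """Return [(name, length), ...] from FASTA index (.fai) text."""
    refs = []
    for line in fai_text.splitlines():
        if line.strip():
            cols = line.split("\t")
            refs.append((cols[0], int(cols[1])))
    return refs


def header_matches_reference(header_text, fai_text):
    """True if names, lengths and order agree between header and index."""
    return parse_sq_lines(header_text) == parse_fai(fai_text)


# Toy data for illustration only.
header_text = ("@HD\tVN:1.4\tSO:coordinate\n"
               "@SQ\tSN:chr1\tLN:1000\n"
               "@SQ\tSN:chr2\tLN:800\n")
fai_text = "chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\n"
bad_header_text = header_text.replace("LN:800", "LN:799")

print(header_matches_reference(header_text, fai_text))
print(header_matches_reference(bad_header_text, fai_text))
```

With real files, the header text could come from `samtools view -H your.bam` and the index text from the `.fai` file next to the reference FASTA.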
