  • Help recovering data from BAM w/ bad header

    I have a bam file that I need to work with to reproduce an issue for some troubleshooting. This bam file was left by a student in a collaborator's lab and we pretty much only have this bam file (no upstream or downstream derivatives of the file). Samtools view gives this error message:

    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    [main_samview] fail to read the header from "7_2.bam".

    Picard's ValidateSamFile produces only the message:

    ERROR: Read groups is empty
    SAMFormatException on record 01

    along with STDERR that looks like the java code is erroring out. Bamtools stats gives this error:

    bamtools stats ERROR: could not open input BAM file(s)... Aborting.


    The file size is about right for the mapping that was done, and the person who created it was somehow able to use it. It is a rather old copy that has been passed around a bit, but if it had been truncated during a transfer I'd have expected different errors (I think...).

    Is there some way I can dump just the data from a bam file without dealing with the bad header? Or is there some way I can read in a generic 'dummy' header over the bad one to save the data?


    Thanks,
    John

  • #2
    How's your C programming? I suspect you're going to have to code something with htslib to figure out (A) where the problem is and (B) get around it as best as possible.
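
Before reaching for htslib, a scripting-language check may narrow down step (A). The following is a minimal Python sketch, not a definitive tool: `diagnose_bam` is a hypothetical helper name, and it assumes the BGZF `BC` subfield is the first gzip extra subfield, which is the usual layout but not guaranteed. It only reports whether the file is gzip-compressed, whether it looks like BGZF, and whether the decompressed stream begins with the BAM magic bytes.

```python
import gzip
import zlib


def diagnose_bam(path):
    """Return 'ok' or a short description of the first problem found."""
    with open(path, "rb") as fh:
        head = fh.read(18)  # fixed gzip header (12 bytes) + first extra subfield
    if len(head) < 18:
        return "too short to hold a BGZF block header"
    if head[:2] != b"\x1f\x8b":
        # A plain-text SAM file, for example, would fail here.
        return "not gzip/BGZF compressed (possibly SAM text or another format)"
    has_extra = bool(head[3] & 4)  # FLG.FEXTRA bit
    if not has_extra or head[12:14] != b"BC":
        # Assumption: the 'BC' subfield comes first in the extra field.
        return "gzip but not BGZF (no 'BC' extra subfield found)"
    try:
        # BGZF is a series of concatenated gzip members, so gzip can read it.
        with gzip.open(path, "rb") as gz:
            magic = gz.read(4)
    except (OSError, EOFError, zlib.error) as exc:
        return f"decompression failed: {exc}"
    if magic != b"BAM\x01":
        return f"decompresses, but first bytes are {magic!r} rather than the BAM magic"
    return "ok"


if __name__ == "__main__":
    import sys

    print(diagnose_bam(sys.argv[1]))
```

Run as `python diagnose_bam.py 7_2.bam` (the script name is arbitrary). If the result is "not gzip", the file may simply be mislabeled; if it decompresses but the magic bytes differ, the problem is in the BAM header itself rather than in the compression layer.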


    • #3
      I've no experience with C coding; I mainly stick to scripting languages (Perl & Python). But I'll take a look at the htslib methods and see if I can muddle through.

      Thanks,
      John


      • #5
        I'll try the GATK ValidateSamFile and see what it says. GATK tools usually do give helpful error messages and I haven't tried that one yet.

        Thanks,
        John
