  • oiiio
    Senior Member
    • Jan 2011
    • 105

    Newbie Question: [bam_header_read] EOF marker is absent.

    What does this error mean with respect to the completion of my samtools command?

    [bam_header_read] EOF marker is absent.

    Does it mean that the command reached the end of the file and completed satisfactorily, but simply found no specific line indicating the end of the file?
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    The cryptic error from samtools "EOF marker is absent" is referring to the absence of a special empty BGZF block of 28 bytes, which samtools looks for at the end of the data to indicate the BAM file is complete.

    If you see that error, either:

    (a) Your file is somehow truncated or incomplete (a real error)
    (b) Your file is from a tool not writing this EOF marker (perhaps a very old samtools?)

    Where did your BAM file come from?

    Comment

    • oiiio
      Senior Member
      • Jan 2011
      • 105

      #3
      My BAM file was actually made with BWA and the most recent version of SAM. I am concerned because, although I received the error, the files are the right size. I'll probably just redo them. Thanks for the clarification, though.

      Comment

      • brdido
        Member
        • Apr 2011
        • 17

        #4
        oiiio, please post the command you are trying to execute.

        This message also appears if you run samtools on a SAM file instead of a BAM file.
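
        One way to rule out the SAM-instead-of-BAM case is to look at the file's first bytes. The following is a minimal Python sketch, assuming only that a BAM file is BGZF (gzip-compatible) data whose decompressed stream begins with the magic bytes "BAM\1". The function name looks_like_bam is made up for illustration and is not part of samtools.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file decompresses as gzip and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            # A BAM stream starts with the four bytes 'B', 'A', 'M', 0x01.
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Plain-text SAM (or anything else that is not gzip data) ends up here.
        return False
```

        A plain-text SAM file returns False here, so a False result on a file named .bam suggests the conversion step was skipped or failed. This only inspects the start of the file and says nothing about truncation at the end.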

        Comment

        • oiiio
          Senior Member
          • Jan 2011
          • 105

          #5
          The command lines are very simple... samtools sort 1.bam 1.sorted ... etc
          Also I don't think some of them would work if the file was still a SAM. Thanks though.

          Does anyone know of/practice a fast way to check a ton of BAMs for the presence of the EOF marker?

          Comment

          • maubp
            Peter (Biopython etc)
            • Jul 2009
            • 1544

            #6
            Probably this would work:

            Code:
            tail -c 28 problem.bam | hexdump -C
            You're looking for the following in hex as the final 28 bytes,

            Code:
            0x1f 0x8b 0x08 0x04 0x00 0x00 0x00 0x00
            0x00 0xff 0x06 0x00 0x42 0x43 0x02 0x00
            0x1b 0x00 0x03 0x00 0x00 0x00 0x00 0x00
            0x00 0x00 0x00 0x00
            Or in octal if you prefer that, "\037\213\010\4\0\0\0\0\0\377\6\0\102\103\2\0\033\0\3\0\0\0\0\0\0\0\0\0" as used in function bgzf_check_EOF in samtools file bgzf.c
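
            For checking a large number of BAM files at once, the same comparison can be scripted. This is a minimal Python sketch, assuming the 28-byte marker listed above is the only trailer you want to accept. The names BGZF_EOF, has_bgzf_eof and main are illustrative, not samtools functions.

```python
import os
import sys

# The 28-byte empty BGZF block listed above (the BAM EOF marker).
BGZF_EOF = bytes([
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
    0x00, 0xff, 0x06, 0x00, 0x42, 0x43, 0x02, 0x00,
    0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
])


def has_bgzf_eof(path):
    """Return True if the file ends with the 28-byte EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Seek relative to the end so only the last 28 bytes are read.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read(len(BGZF_EOF)) == BGZF_EOF


def main(paths):
    """Print one OK/MISSING line per file given on the command line."""
    for path in paths:
        status = "OK" if has_bgzf_eof(path) else "MISSING"
        print(status + "\t" + path)


if __name__ == "__main__":
    main(sys.argv[1:])
```

            Saved under a name of your choice (say check_eof.py, a hypothetical filename), it could be run as python check_eof.py *.bam. Because it reads only the last 28 bytes of each file, it stays fast even on very large BAMs.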

            Comment

            • oiiio
              Senior Member
              • Jan 2011
              • 105

              #7
              Awesome, thanks

              Comment

              • xied75
                Senior Member
                • Feb 2012
                • 129

                #8
                What about this?

                What if the end is 31 bytes:

                1F 8B 08 04 00 00 00 00 00 FF 06 00 42 43 02 00 1E 00 01 00 00 FF FF 00 00 00 00 00 00 00 00

                And by the way, if you use Windows, HxD is really good for opening a BAM file however large it is.

                Best,

                dong

                Comment

                • maubp
                  Peter (Biopython etc)
                  • Jul 2009
                  • 1544

                  #9
                  You're seeing a different empty BGZF block, a known bug in samtools output for uncompressed BAM. See https://github.com/lh3/samtools/pull/7 and associated mailing list thread http://sourceforge.net/mailarchive/m...sg_id=28413844

                  Edit: Recap post with current patch http://sourceforge.net/mailarchive/m...sg_id=28843382
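
                  To see why the 31-byte ending is still an empty block, note that both byte strings are complete gzip members that decompress to nothing. They differ only in how the empty deflate payload is encoded: 03 00 in the 28-byte marker, and a zero-length stored block 01 00 00 FF FF in the 31-byte one. Below is a minimal Python sketch, assuming the two byte strings quoted in this thread. The names EOF_28, EOF_31, classify_tail and is_empty_gzip_member are illustrative only.

```python
import gzip

# Standard 28-byte empty BGZF block (the EOF marker samtools looks for).
EOF_28 = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)
# 31-byte variant quoted in this thread (empty payload as a stored block).
EOF_31 = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1e 00 01 00 00 ff ff 00 00 00 00 00 00 00 00"
)


def is_empty_gzip_member(block):
    """Return True if the bytes are a gzip member that decompresses to nothing."""
    return gzip.decompress(block) == b""


def classify_tail(tail):
    """Name the empty block (if any) that the given trailing bytes end with."""
    if tail.endswith(EOF_28):
        return "standard 28-byte EOF marker"
    if tail.endswith(EOF_31):
        return "31-byte empty block (uncompressed BAM output)"
    return "no known empty block"
```

                  Passing the last few dozen bytes of a file to classify_tail separates the false positive (the 31-byte ending) from a file that has no empty block at all, which is the case more likely to indicate real truncation.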
                  Last edited by maubp; 02-25-2012, 11:07 AM. Reason: Adding another URL

                  Comment

                  • xied75
                    Senior Member
                    • Feb 2012
                    • 129

                    #10
                    Thanks Peter, you are my hero.

                    Comment

                    • ehlin
                      Member
                      • Jan 2012
                      • 12

                      #11
                      Originally posted by maubp
                      You're seeing a different empty BGZF block, a known bug in samtools output for uncompressed BAM. See https://github.com/lh3/samtools/pull/7 and associated mailing list thread http://sourceforge.net/mailarchive/m...sg_id=28413844
                      Hi, sorry to bother you, but I found your code and was wondering how to implement it. I'm pretty new to Unix and bioinformatics in general and I was wondering if you could refer me to a guide on how to set this up or give me a general step-by-step thing. Thanks a lot!

                      Comment

                      • maubp
                        Peter (Biopython etc)
                        • Jul 2009
                        • 1544

                        #12
                        I meant it for information only really (and as a reminder to the samtools team).

                        The easy answer is to be aware that this EOF warning can be a false positive.

                        If you are interested, you'll need to learn a bit about patch files. The Unix command diff creates a list of differences, also called a patch. The Unix patch command takes these files as inputs and applies the changes to your copy of the original files. The idea is you could download the samtools source code, apply this patch (make the correction for the bug), then compile and install the fixed samtools.

                        Comment

                        • ehlin
                          Member
                          • Jan 2012
                          • 12

                          #13
                          Thank you very much! I will look into that.

                          -Edwin

                          Comment

                          • Charitra
                            Member
                            • Feb 2013
                            • 57

                            #14
                            I got the same error: [bam_header_read] EOF marker is absent.
                            [bam_header_read] invalid BAM binary header (this is not a BAM file).
                            File ./merged_asm/tmp/mergeSam_filepsu0Hv doesn't appear to be a valid BAM file, trying SAM...
                            [11:16:29] Loading reference annotation.
                            [11:16:55] Inspecting reads and determining fragment length distribution.
                            Processed 39384 loci.

                            As you can see, it falls back to trying SAM, loads the reference annotation, and then the process continues.
                            My questions are:
                            1. What happens if I continue like this (trying SAM)? Is it OK, even without trying tail problem.bam | hexdump -C?

                            2. It skips a large bundle, as shown below:
                            [11:16:56] Assembling transcripts and estimating abundances.
                            6:126102153-130463972 Warning: Skipping large bundle.
                            Processed 39383 loci.
                            Is it okay to proceed like this, or how can I include the large bundle?





                            Comment

                            • lwebs
                              Junior Member
                              • Mar 2017
                              • 7

                              #15
                              Hi all,

                              This is a really old thread, but I have come across the same issue and I'm not sure how to fix it with the patch.

                              I am using samtools to convert a .sam file mapped using bowtie2 to a .bam file.
                              The .sam file looks complete, but when I use the command below, something strange happens during the conversion. I'm trying to incorporate this into the anvi'o pipeline, and I am using the anvi-init-bam command to sort. Any ideas?

                              $samtools view -F 4 -bS -u ecosphere_merged_MAPPING/Past_Sample_01.sam > ecosphere_merged_MAPPING/Past_Sample_01-RAW.bam
                              [samopen] SAM header is present: 196761 sequences.

                              $anvi-init-bam ecosphere_merged_MAPPING/Past_Sample_01-RAW.bam -o ecosphere_merged_MAPPING/Past_Sample_01.bam

                              [28 May 17 12:34:55 SORT] Sorting BAM File... May take a while depending on the size.
                              [W::bam_hdr_read] EOF marker is absent. The input is probably truncated.
                              [E::bgzf_read] bgzf_read_block error -1 after 0 of 4 bytes
                              Traceback (most recent call last):
                                File "/usr/local/bin/anvi-init-bam", line 75, in <module>
                                  output_file_path = args.output_file,))
                                File "/usr/local/bin/anvi-init-bam", line 48, in init_bam_file
                                  pysam.sort("-o", output_file_path, input_file_path)
                                File "/usr/local/lib/python3.5/dist-packages/pysam/utils.py", line 75, in __call__
                                  stderr))
                              pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[bam_sort_core] truncated file. Aborting.\n'

                              Comment
