SEQanswers

Old 11-08-2011, 01:41 PM   #1
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Default Newbie Question: [bam_header_read] EOF marker is absent.

What does this error mean with respect to the completion of my samtools command?

[bam_header_read] EOF marker is absent.

Does it mean that the command reached the end of the file and completed satisfactorily, but simply found no specific line indicating the end of the file?
Old 11-08-2011, 01:52 PM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

The cryptic error from samtools "EOF marker is absent" is referring to the absence of a special empty BGZF block of 28 bytes, which samtools looks for at the end of the data to indicate the BAM file is complete.

If you see that error, either:

(a) Your file is somehow truncated or incomplete (a real error)
(b) Your file is from a tool not writing this EOF marker (perhaps a very old samtools?)

Where did your BAM file come from?
Old 11-09-2011, 08:16 AM   #3
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Default

My BAM file was actually made with BWA and the most recent version of SAM. I am concerned because, although I received the error, the files are the right size. I'll probably just redo them. Thanks for the clarification, though.
Old 11-09-2011, 11:10 AM   #4
brdido
Member
 
Location: Sao Paulo, Brazil

Join Date: Apr 2011
Posts: 17
Default

oiiio, please post the command you are trying to execute.

This message also appears if you try to run samtools on a SAM file instead of a BAM file.
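
One way to rule this out before running samtools is to look at the start of the file: a BAM file is BGZF (gzip-compatible) compressed and its decompressed data begins with the magic bytes BAM\1, whereas a SAM file is plain text. A minimal Python sketch of that check follows; the function name looks_like_bam is made up for illustration and is not part of samtools or any library.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file decompresses as gzip/BGZF and starts with b'BAM\\x01'."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Not gzip data at all (e.g. a plain-text SAM file), or a stream cut short.
        return False
```

This only inspects the first few bytes, so it says nothing about whether the rest of the file is complete.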
Old 11-10-2011, 03:48 PM   #5
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Default

The command lines are very simple... samtools sort 1.bam 1.sorted ... etc
Also I don't think some of them would work if the file was still a SAM. Thanks though.

Does anyone know of/practice a fast way to check a ton of BAMs for the presence of the EOF marker?
Old 11-10-2011, 04:06 PM   #6
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

Probably this would work:

Code:
tail -c 28 problem.bam | hexdump -C
You're looking for the following in hex as the final 28 bytes,

Code:
0x1f 0x8b 0x08 0x04 0x00 0x00 0x00 0x00
0x00 0xff 0x06 0x00 0x42 0x43 0x02 0x00
0x1b 0x00 0x03 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00
Or in octal if you prefer that, "\037\213\010\4\0\0\0\0\0\377\6\0\102\103\2\0\033\0\3\0\0\0\0\0\0\0\0\0" as used in function bgzf_check_EOF in samtools file bgzf.c
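
For checking a whole batch of BAM files, the same comparison can be scripted. The following is a rough Python sketch, assuming only that a complete file ends with exactly the 28 bytes listed above; the names BGZF_EOF and has_eof_marker are my own and are not taken from samtools.

```python
import sys

# The 28-byte empty BGZF block listed above, written as hex.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 "
    "00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 "
    "00 00 00 00"
)


def has_eof_marker(path):
    """Return True if the file ends with the 28-byte BGZF EOF block."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        size = handle.tell()
        if size < len(BGZF_EOF):
            return False
        handle.seek(size - len(BGZF_EOF))
        return handle.read() == BGZF_EOF


if __name__ == "__main__":
    # Report one line per file named on the command line.
    for bam_path in sys.argv[1:]:
        status = "ok" if has_eof_marker(bam_path) else "EOF MARKER MISSING"
        print(f"{bam_path}\t{status}")
```

Saved under a name of your choosing (say check_eof.py), it could be run as python check_eof.py *.bam. It reads only the last 28 bytes of each file, so it should stay fast on large BAMs.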
Old 11-11-2011, 07:51 AM   #7
oiiio
Senior Member
 
Location: USA

Join Date: Jan 2011
Posts: 105
Default

Awesome, thanks
Old 02-16-2012, 05:00 AM   #8
xied75
Senior Member
 
Location: Oxford

Join Date: Feb 2012
Posts: 129
Default What about this?

What if the end is 31 bytes:

1F 8B 08 04 00 00 00 00 00 FF 06 00 42 43 02 00 1E 00 01 00 00 FF FF 00 00 00 00 00 00 00 00

And by the way, if you use Windows, HxD is really good for opening a BAM file, however large it is.

Best,

dong
Old 02-16-2012, 05:21 AM   #9
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

Quote:
Originally Posted by xied75 View Post
What if the end is 31 bytes:

1F 8B 08 04 00 00 00 00 00 FF 06 00 42 43 02 00 1E 00 01 00 00 FF FF 00 00 00 00 00 00 00 00

And by the way, if you use Windows, HxD is really good for opening a BAM file, however large it is.

Best,

dong
You're seeing a different empty BGZF block, a known bug in samtools output for uncompressed BAM. See https://github.com/lh3/samtools/pull/7 and associated mailing list thread http://sourceforge.net/mailarchive/m...sg_id=28413844

Edit: Recap post with current patch http://sourceforge.net/mailarchive/m...sg_id=28843382
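
To see why both endings count as "an empty BGZF block", you can decompress them: each is a complete gzip member that holds zero bytes of data, and they differ only in how the empty deflate stream is encoded. A small Python sketch, assuming the BGZF layout in which the BC subfield is the first extra field and BSIZE stores the block length minus one; the function and constant names are illustrative only.

```python
import gzip
import zlib

# The usual 28-byte EOF block and the 31-byte variant quoted above.
BGZF_EOF_28 = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)
BGZF_EMPTY_31 = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1e 00 01 00 00 ff ff 00 00 00 00 00 00 00 00"
)


def is_empty_bgzf_block(block):
    """Return True if `block` is one complete BGZF block holding no data."""
    # gzip magic, deflate method, FEXTRA flag set
    if len(block) < 18 or block[:4] != b"\x1f\x8b\x08\x04":
        return False
    # Bytes 12-13: 'BC' subfield id (assumed to be the first extra subfield).
    if block[12:14] != b"BC":
        return False
    # Bytes 16-17: BSIZE, the total block size minus one, little endian.
    bsize = int.from_bytes(block[16:18], "little")
    if bsize + 1 != len(block):
        return False
    try:
        return gzip.decompress(block) == b""
    except (OSError, EOFError, zlib.error):
        return False
```

Calling is_empty_bgzf_block on either constant should return True, while a check that compares against the 28-byte string alone accepts only the first.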

Last edited by maubp; 02-25-2012 at 11:07 AM. Reason: Adding another URL
Old 02-16-2012, 05:29 AM   #10
xied75
Senior Member
 
Location: Oxford

Join Date: Feb 2012
Posts: 129
Default

Thanks Peter, you are my hero.
Old 02-24-2012, 01:12 PM   #11
ehlin
Member
 
Location: NYC

Join Date: Jan 2012
Posts: 12
Default

Quote:
Originally Posted by maubp View Post
You're seeing a different empty BGZF block, a known bug in samtools output for uncompressed BAM. See https://github.com/lh3/samtools/pull/7 and associated mailing list thread http://sourceforge.net/mailarchive/m...sg_id=28413844
Hi, sorry to bother you, but I found your code and was wondering how to implement it. I'm pretty new to Unix and bioinformatics in general and I was wondering if you could refer me to a guide on how to set this up or give me a general step-by-step thing. Thanks a lot!
Old 02-24-2012, 01:19 PM   #12
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543
Default

I meant it for information only really (and as a reminder to the samtools team).

The easy answer is to be aware that this EOF warning can be a false positive.

If you are interested, you'll need to learn a bit about patch files. The Unix command diff creates a list of differences, also called a patch. The Unix patch command takes these files as inputs and applies the changes to your copy of the original files. The idea is you could download the samtools source code, apply this patch (make the correction for the bug), then compile and install the fixed samtools.
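
If you have never seen a patch file, the toy example below produces one in the same unified format that diff -u writes. It uses Python's difflib on two made-up versions of a two-line file; the file names and contents are invented for illustration and have nothing to do with the actual samtools patch.

```python
import difflib

# Two versions of the same tiny file, as lists of lines.
original = ["greeting = 'helo'\n", "print(greeting)\n"]
patched = ["greeting = 'hello'\n", "print(greeting)\n"]

# unified_diff yields the patch line by line: a '---'/'+++' header,
# then '-' lines to remove and '+' lines to add.
patch_lines = list(
    difflib.unified_diff(
        original,
        patched,
        fromfile="a/example.py",
        tofile="b/example.py",
    )
)

print("".join(patch_lines))
```

A real patch is the same kind of text saved to a file; the patch command reads it and rewrites your copy of the original files accordingly.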
Old 02-25-2012, 11:01 AM   #13
ehlin
Member
 
Location: NYC

Join Date: Jan 2012
Posts: 12
Default

Thank you very much! I will look into that.

-Edwin
Old 04-24-2013, 07:30 PM   #14
Charitra
Member
 
Location: Seoul, Korea

Join Date: Feb 2013
Posts: 57
Unhappy

I got the same error:
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_filepsu0Hv doesn't appear to be a valid BAM file, trying SAM...
[11:16:29] Loading reference annotation.
[11:16:55] Inspecting reads and determining fragment length distribution.
Processed 39384 loci.

As you can see, it falls back to SAM ("trying SAM..."), loads the reference annotation, and then the process continues.
My questions are:
1. Is it OK to let it continue like this (trying SAM), without first checking the file with tail problem.bam | hexdump -C?

2. It skips a large bundle, as shown below:
[11:16:56] Assembling transcripts and estimating abundances.
6:126102153-130463972 Warning: Skipping large bundle.
Processed 39383 loci.
Is it okay to proceed like this, or how can I include the large bundle?




Quote:
Originally Posted by maubp View Post
I meant it for information only really (and as a reminder to the samtools team).

The easy answer is to be aware that this EOF warning can be a false positive.

If you are interested, you'll need to learn a bit about patch files. The Unix command diff creates a list of differences, also called a patch. The Unix patch command takes these files as inputs and applies the changes to your copy of the original files. The idea is you could download the samtools source code, apply this patch (make the correction for the bug), then compile and install the fixed samtools.
Old 05-28-2017, 06:34 AM   #15
lwebs
Junior Member
 
Location: MA

Join Date: Mar 2017
Posts: 7
Default

Hi all,

This is a really old thread, but I have come across the same issue and I'm not sure how to fix it with the patch.

I am using samtools to convert a .sam file mapped using bowtie2 to a .bam file.
The .sam file looks like it's all there, but when I use the command below, something strange happens during the conversion. I'm trying to incorporate this info into the anvi'o pipeline, and I am using the anvi-init-bam command to sort. Any ideas?

$samtools view -F 4 -bS -u ecosphere_merged_MAPPING/Past_Sample_01.sam > ecosphere_merged_MAPPING/Past_Sample_01-RAW.bam
[samopen] SAM header is present: 196761 sequences.

$anvi-init-bam ecosphere_merged_MAPPING/Past_Sample_01-RAW.bam -o ecosphere_merged_MAPPING/Past_Sample_01.bam

[28 May 17 12:34:55 SORT] Sorting BAM File... May take a while depending on the size.
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated.
[E::bgzf_read] bgzf_read_block error -1 after 0 of 4 bytes
Traceback (most recent call last):
  File "/usr/local/bin/anvi-init-bam", line 75, in <module>
    output_file_path = args.output_file,))
  File "/usr/local/bin/anvi-init-bam", line 48, in init_bam_file
    pysam.sort("-o", output_file_path, input_file_path)
  File "/usr/local/lib/python3.5/dist-packages/pysam/utils.py", line 75, in __call__
    stderr))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[bam_sort_core] truncated file. Aborting.\n'
All times are GMT -8.