SEQanswers > Bioinformatics

09-01-2016, 10:16 AM   #1
Senior Member
Location: US

Join Date: Aug 2011
Posts: 106
Potentially corrupted BAM files

Hi all,
I have a question about detecting possible corruption in BAM files. We had a serious hard drive failure across our network. While we have been able to restore the drives and retrieve our data, we have noticed problems with many files. For example, in some text files, some content has been changed to non-ASCII data, or lines have been deleted. We do not yet know the full extent of the damage but are trying to figure it out. One potential issue is that we have many thousands of mapped BAM files. We do not know whether any of them have been corrupted, and we would like to find out. One approach (I think) would be to convert them back to SAM files; if the conversion fails, the file is corrupted in some way. But I was wondering if anyone knew of another method for checking the files.

I should note that most of this is backed up on a tape storage system that we could go back to, but restoring our whole system from it would take an exceedingly long time. So we would like to find the 'bad' files and replace only those. Also, I am not a software engineer, so hopefully this all makes sense.
lre1234
09-01-2016, 12:42 PM   #2
Richard Finney
Senior Member
Location: bethesda

Join Date: Feb 2009
Posts: 700

zcat the BAM file and check for errors. For example:

rfinney@pigdog:~$ file test.bam
test.bam: gzip compressed data, extra field

rfinney@pigdog:~$ samtools index test.bam

rfinney@pigdog:~$ cat test.bam | tr " " "x" > test2.bam # corrupt the file test2.bam

rfinney@pigdog:~$ samtools index test2.bam
[E::bam_hdr_read] invalid BAM binary header
samtools index: "test2.bam" is corrupted or unsorted

rfinney@pigdog:~$ zcat test.bam > /dev/null

rfinney@pigdog:~$ zcat test2.bam > /dev/null
gzip: test2.bam: invalid compressed data--crc error
gzip: test2.bam: invalid compressed data--length error
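
With thousands of files, the same zcat check could be run in a loop. The following is only a rough sketch, not a tested production script: check_bams is a made-up helper name, the directory argument is a placeholder, and it uses gzip -dc reading from stdin, which should behave like the zcat call above while avoiding any dependence on the file suffix.

```shell
# Sketch: print the path of every *.bam file under a directory that fails
# to decompress cleanly. check_bams is a hypothetical helper name; it is
# not part of samtools or gzip.
check_bams() {
    find "$1" -type f -name '*.bam' -exec sh -c '
        for bam in "$@"; do
            # Same idea as: zcat file.bam > /dev/null
            # A non-zero exit status means gzip reported an error.
            gzip -dc < "$bam" > /dev/null 2>&1 || printf "%s\n" "$bam"
        done
    ' sh {} +
}

# Example (the path is a placeholder):
#   check_bams /path/to/bams > bad_bams.txt
```

Every file is decompressed in full, so this will take a while on a large collection, but the resulting list can then be used to pull only those files back from tape. samtools also has a quickcheck subcommand that is much faster, but it only looks at the header and the end-of-file block, so it would likely miss damage in the middle of a file.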

Last edited by Richard Finney; 09-01-2016 at 12:46 PM.