Hi all, I have come across a strange problem. I am extracting a single chromosome as a SAM file from a 1000 Genomes BAM using the following command, e.g.:
samtools view -h ftp://ftp-trace.ncbi.nih.gov/1000gen...e.20101123.bam 11 > test.sam
(Please note it's for the entire chromosome 11.)
Now I want to convert test.sam into a BAM file, so I use the following command:
samtools view -bS test.sam -o test1.bam
The problem is in test1.bam: when I open it with gedit, the first few characters are
BAM\A3>\00\00@HD
and the very end of the file is again full of junk:
T\00\00\00\00\00\001\00=C\DB\00\00\002\00\8D\ED~\00\00\003\00\95\CD\00\00\004\00d\C8d\00\00\005\00<\8C\C8
\00\00\006\00;3
\00\00\007\00gC| \00\00\008\00vV\B9\00\00\009\00\F7\BEj\00\00\0010\00\9B\00\00\0011\004 \00\00\0012\00\F7j\FA\00\00\0013\00VZ\DD\00\00\0014\00$f\00\00\0015\00@\81\00\00\0016\00A\B4b\00\00\0017\00\CA\F0\D6\00\00\0018\00@]\A7\00\00\0019\00\97<\86\00\00\0020\00p\B1\C1\00\00\0021\00gg\DE\00\00\0022\00v\D8\00\00\00X\00\A0=A \00\00\00Y\00\FE\F7\89\00\00\00MT\00\B9@\00\00\00\00\00GL000207.1\00\A6\00\00\00\00\00GL000226.1\00\A0:\00\00\00\00\00GL000229.1\00\C9M\00\00\00\00\00GL000231.1\00\FAj\00\00\00\00\00GL000210.1\00"l\00\00\00\00\00GL000239.1\00 \84\00\00\00\00\00GL000235.1\00\AA\86\00\00\00\00\00GL000201.1\004\8D\00\00\00\00\00GL000247.1\00F\8E\00\00\00\00\00GL000245.1\00+\8F\00\00\00\00\00GL000197.1\007\91\00\00\00\00\00GL000203.1\00z\92\00\00\00\00\00GL000246.1\00
\95\00\00\00\00\00GL000249.1\00f\96\00\00\00\00\00GL000196.1\00\98\00\00\00\00\00GL000248.1\00j\9B\00\00\00\00\00GL000244.1\00\F9\9B\00\00\00\00\00GL000238.1\00\9C\00\00\00\00\00GL000202.1\00\A7\9C\00\00\00\00\00GL000234.1\00S\9E\00\00\00\00\00GL000232.1\00̞\00\00\00\00\00GL000206.1\00)\A0\00\00\00\00\00GL000240.1\00ͣ\00\00\00\00\00GL000236.1\00Σ\00\00\00\00\00GL000241.1\00\A8\A4\00\00\00\00\00GL000243.1\00M\A9\00\00\00\00\00GL000242.1\00\AA\00\00\00\00\00GL000230.1\00\AB\AA\00\00\00\00\00GL000237.1\00+\B3\00\00\00\00\00GL000233.1\00u\B3\00\00\00\00\00GL000204.1\00\9E=\00\00\00\00GL000198.1\00\E5_\00\00\00\00GL000208.1\00j\00\00\00\00GL000191.1\00\C1\9F\00\00\00\00GL000227.1\00v\F5\00\00\00\00GL000228.1\00`\F8\00\00\00\00GL000214.1\00\F6\00\00\00\00GL000221.1\00_\00\00\00\00GL000209.1\00\C1m\00\00\00\00GL000218.1\00{u\00\00\00\00GL000220.1\00
x\00\00\00\00GL000213.1\00\8F\81\00\00\00\00GL000211.1\00\A6\8A\00\00\00\00GL000199.1\00\92\97\00\00\00\00GL000217.1\00u\A0\00\00\00\00GL000216.1\00\A1\00\00\00\00GL000215.1\00\A2\00\00\00\00GL000205.1\00\FC\A9\00\00\00\00GL000219.1\00\FE\BB\00\00\00\00GL000224.1\00\ED\BD\00\00\00\00GL000223.1\00\E7\C0\00\00\00\00GL000195.1\00p\CA\00\00\00\00GL000212.1\00\EA\D9\00\00\00\00GL000222.1\00\ED\D9\00\00\00\00GL000200.1\00\9B\DA\00\00\00\00GL000193.1\00]\E5\00\00\00\00GL000194.1\00\ED\EB\00\00\00\00GL000225.1\00\E58\00\00\00\00GL000192.1\00\A8Z\00
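For reference, here is a minimal sketch of how the file could be checked without a text editor, assuming Python 3 is available. It relies on the fact that a BAM file is BGZF (gzip-compatible) compressed and that its decompressed stream starts with the four bytes "BAM\1". The function name looks_like_bam is hypothetical, just for illustration, and it only checks the magic bytes, not the rest of the file.

```python
import gzip

# Per the SAM/BAM specification, the decompressed stream of a BAM file
# begins with these four magic bytes.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if `path` is gzip/BGZF-compressed and its decompressed
    payload starts with the BAM magic bytes. (Hypothetical helper.)"""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError):
        # Not gzip-compressed, unreadable, or truncated.
        return False
```

Calling looks_like_bam("test1.bam") would then tell whether the header is what a binary BAM is expected to contain, in which case unreadable characters in gedit would not by themselves indicate a broken file.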
Can anyone please tell me why this may be occurring?
Thanks in advance.
Ashwin