SEQanswers

Old 11-03-2009, 09:44 AM   #1
axiom7
Member
 
Location: Southwest

Join Date: Aug 2009
Posts: 14
Samtools "is recognized as '*'" "truncated file" error

Hi,

I posted this last week to the samtools thread, but did not receive a reply, so I'm taking another stab at it:

samtools-0.1.6_x86_64-linux; precompiled version downloaded today

$ bowtie --version
bowtie version 0.11.3
64-bit
Built on myserver
Fri Oct 23 13:27:05 MDT 2009
Compiler: gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)
Options: -O3
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}

$ samtools faidx hs_ref_chr10.fa
$ cat hs_ref_chr10.fa.fai
gi|89161187|ref|NC_000010.9|NC_000010 135374737 105 70 71

I created SAM files with bowtie in two ways (the behavior described below is the same for both methods):
1. using -S in bowtie command
2. using samtools bowtie2sam.pl on a bowtie --refout map file

sam file from method 2 (chromosome 10 only):

$ cat ref00000.map.sam
@0-0-3-9833 0 gi|89161187|ref|NC_000010.9|NC_000010 62380535 0 15M * 0 0 GCAAAGGNNATCATT IIIIIIIIIIIIIII NM:i:1 X1:i:5 MD:Z:3A3N0N6
@0-0-3-9833 16 gi|89161187|ref|NC_000010.9|NC_000010 62382480 0 15M * 0 0 GGGCTANNGCTCATC IIIIIIIIIIIIIII NM:i:1 X1:i:5 MD:Z:7N0N6
@0-0-6-12817 0 gi|89161187|ref|NC_000010.9|NC_000010 6095909 0 15M * 0 0 TACCACCNNGCCCTT IIIIIIIIIIIIIII NM:i:1 X1:i:265 MD:Z:1A5N0N6
@0-0-6-12817 16 gi|89161187|ref|NC_000010.9|NC_000010 6097174 0 15M * 0 0 GCATCANNCTCCCGA IIIIIIIIIIIIIII NM:i:1 X1:i:265 MD:Z:7N0N6


$ samtools view -bt ~/work/hs_ref_chr/hs_ref_chr10.fa.fai -o out.bam ref00000.map.sam
[sam_header_read2] 1 sequences loaded.
[sam_read1] reference '16 gi|89161187|ref|NC_000010.9|NC_000010 6097174 0 15M * 0 0 GCATCANNCTCCCGA IIIIIIIIIIIIIII NM:i:1 X1:i:265MD:Z:7N0N6

' is recognized as '*'.
[main_samview] truncated file.

out.bam is created, but I cannot do anything further with it.
Old 04-15-2012, 02:37 PM   #2
mdjones66
Junior Member
 
Location: Cambridge, MA

Join Date: Jul 2008
Posts: 3

Yeah, this is driving me nuts too. It seems Google only knows that the question has been asked, but it doesn't know the answer.
Old 11-26-2014, 12:51 AM   #3
thh32
Member
 
Location: UK

Join Date: Feb 2014
Posts: 60

Same issue here. Did you ever find a solution?
Old 11-26-2014, 02:53 AM   #4
A.N.Other
Member
 
Location: London, UK

Join Date: Feb 2012
Posts: 25

Pretty sure the OP has got past this by now, but for thh32, it looks to me like it's an issue caused by a strange read ID that contains the '@' symbol at the start.

'@' at the start of a line indicates a header line in SAM, so samtools is interpreting your actual reads as part of the header, and then falls over because they aren't in the right format.

The header should look something like ...

Code:
@HD	VN:1.0	SO:unsorted
@SQ	SN:chr1	LN:195471971
@SQ	SN:chr1_GL456210_random	LN:169725
...etc...
... for example, and the reads should then NOT start with an @ ...

Code:
HWI-ST539:109:D14VPACXX:1:1303:19984:9383	65	chr4	147868767	60	99M	=	147868771	0	GTTGGTCAGTAGTACTCGGTTACGCAATTTCCGGATGTAAAGTCTCTAATGGCAGTGGATAGGTGGGGCTAGAGACTCCGGCAACTTTGACCTTTTCAC	??CCC@44((3@8+88BCB@?;8>>6;='/86?EEE@HEGGHC@D@)A>GIGGFGHEGF@GHGDIGCHFF9G>BFIIIHCGF<@GEHFA<HDDDDD@@@	AS:i:495	NM:i:0	XI:f:1	X0:i:1	X1:i:0	XE:i:29	XR:i:99MD:Z:99
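
One way to check a SAM file for this is sketched below in Python. It assumes tab-separated SAM text and treats a line as a genuine header only if it starts with one of the record types @HD, @SQ, @RG, @PG or @CO followed by a tab. The function names are made up for illustration and are not part of samtools. Any other line beginning with '@' is reported as a suspect read line, and the second helper shows how the leading '@' could be dropped from the read name.

```python
import re

# Genuine SAM header lines: '@', a two-letter record type, then a TAB.
HEADER_RE = re.compile(r"^@(HD|SQ|RG|PG|CO)\t")


def find_suspect_read_lines(lines):
    """Return (line_number, line) pairs that start with '@' but are not headers."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("@") and not HEADER_RE.match(line):
            suspects.append((number, line))
    return suspects


def strip_leading_at(line):
    """Remove leading '@' characters from the read name of a suspect line.

    Genuine header lines and ordinary alignment lines are returned unchanged.
    """
    if line.startswith("@") and not HEADER_RE.match(line):
        return line.lstrip("@")
    return line
```

Running every line of the SAM file through strip_leading_at before handing it to samtools view would be one possible workaround, though renaming the reads before alignment is probably cleaner.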
What do the read IDs look like in the fastq file you're aligning?
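
If the reads do come from a FASTQ file, a quick check along these lines would show whether any read ID itself begins with '@'. This is only a sketch: it assumes plain four-line FASTQ records (header, sequence, '+', qualities) with no wrapped sequences, and the function name is hypothetical.

```python
def read_ids_starting_with_at(fastq_lines):
    """Yield read IDs that still begin with '@' after removing the FASTQ record marker.

    Assumes strict four-line FASTQ records: header, sequence, '+', qualities.
    """
    for index, line in enumerate(fastq_lines):
        if index % 4 != 0:
            continue  # only the first line of each record carries the read ID
        header = line.rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"line {index + 1} is not a FASTQ header: {header!r}")
        # The read ID is the first whitespace-delimited token after the marker.
        fields = header[1:].split()
        read_id = fields[0] if fields else ""
        if read_id.startswith("@"):
            yield read_id
```

Passing an open file handle works, since iterating over it yields one line at a time. Only header lines are inspected, so quality strings that happen to begin with '@' are not flagged.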
Tags
samtools bowtie

All times are GMT -8.