SEQanswers

05-05-2012, 11:45 AM   #1
GeneJockey
Hyphens in CIGAR column of sam file?

I'm trying to convert a BLAT .psl file to a BAM file.

I've managed to convert the .psl to a SAM file using the psl2sam.pl script, but converting the SAM to BAM fails. I'm running the following:

samtools view -bS -t chicken.ref_List -o mtcontigs_test.bam mtcontigs_test.sam

and getting ...

[sam_header_read2] 34 sequences loaded.
Parse error at line 25: invalid CIGAR character
Aborted (core dumped)

So, I check line 25 and my CIGAR column has this:

47H78M38I-921M1025H

Where are these hyphens coming from? They're in quite a few of my entries. I can't find any reference as to what they might be so I can't figure out if I just need to delete them or replace them with something else.

Anyone seen something like this?
05-08-2012, 09:06 AM   #2
GeneJockey

I still haven't found a reason for the hyphens. No ideas?
05-08-2012, 09:21 AM   #3
Rocketknight

As far as I know, '-' isn't a valid character in a CIGAR string under the SAM spec. That suggests either a bug in psl2sam or something unusual in the input psl files that's making the script misbehave.

Can you paste the entire line of the SAM file that contains this CIGAR string? That should make it clear whether there's missing information in the CIGAR string, or if it's just a correct CIGAR string with an unwanted '-' in the middle.
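To pin down exactly where a CIGAR string stops being valid, here is a minimal Python sketch. It assumes the CIGAR grammar from the SAM spec ("*", or one or more length/operator pairs using the operators MIDNSHP=X). The function name first_invalid_offset is made up for illustration and is not part of samtools or psl2sam.

```python
import re

# One CIGAR element per the SAM spec: an unsigned length followed by
# one of the operators M, I, D, N, S, H, P, = or X.
OP_RE = re.compile(r"[0-9]+[MIDNSHP=X]")


def first_invalid_offset(cigar):
    """Return None if the CIGAR is valid, else the 0-based offset
    at which parsing fails (hypothetical helper, for illustration)."""
    if cigar == "*":  # "*" means the CIGAR is unavailable
        return None
    pos = 0
    while pos < len(cigar):
        m = OP_RE.match(cigar, pos)
        if m is None:
            return pos
        pos = m.end()
    # An empty string is not a valid CIGAR either.
    return None if pos > 0 else 0


for cigar in ("50H54M1141H", "47H78M38I-921M1025H"):
    print(cigar, first_invalid_offset(cigar))
```

For the string from the first post, parsing fails at offset 9, which is the hyphen right after 38I. Whether that hyphen is a minus sign on the following length is only a guess; the check itself just shows that the string does not fit the grammar.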
05-08-2012, 09:34 AM   #4
GeneJockey

Here are a few lines from one of the SAM files with hyphens ...

contig00103 16 chr1 154791541 0 51H97M349252D24M1073H * 0 0 * * AS:i:0
contig00103 16 chr1 167090504 0 50H54M1141H * 0 0 * * AS:i:10
contig00103 16 chr1 18839958 0 46H94M2I36M1067H * 0 0 * * AS:i:0
contig00103 16 chr1 62707216 0 49H48M1148H * 0 0 * * AS:i:12
contig00103 16 chr1 130804050 0 10H118M18I48M1051H * 0 0 * * AS:i:0
contig00103 16 chrZ 28303141 0 47H78M38I-921M1025H * 0 0 * * AS:i:0
contig00103 16 chr9 18318424 0 52H61M2I-964M1073H * 0 0 * * AS:i:15


The last two lines have hyphens.
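Since the hyphens show up in quite a few entries, a short script can list every affected record instead of checking by eye. This is a rough Python sketch, assuming the CIGAR is the sixth whitespace-separated field and using the CIGAR grammar from the SAM spec. The function name bad_cigar_records is hypothetical, and the sample data is just the lines pasted above.

```python
import re

# CIGAR grammar per the SAM spec: "*" or one or more <length><op> pairs.
CIGAR_RE = re.compile(r"\*|(?:[0-9]+[MIDNSHP=X])+")


def bad_cigar_records(lines):
    """Return (line number, QNAME, CIGAR) for each alignment line whose
    CIGAR field does not match the grammar (illustrative helper)."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        # Real SAM is tab-delimited; splitting on any whitespace also
        # copes with the space-separated lines pasted in this thread.
        fields = line.split()
        if len(fields) < 6:
            continue
        if CIGAR_RE.fullmatch(fields[5]) is None:
            bad.append((lineno, fields[0], fields[5]))
    return bad


SAMPLE = [
    "contig00103 16 chr1 154791541 0 51H97M349252D24M1073H * 0 0 * * AS:i:0",
    "contig00103 16 chr1 167090504 0 50H54M1141H * 0 0 * * AS:i:10",
    "contig00103 16 chr1 18839958 0 46H94M2I36M1067H * 0 0 * * AS:i:0",
    "contig00103 16 chr1 62707216 0 49H48M1148H * 0 0 * * AS:i:12",
    "contig00103 16 chr1 130804050 0 10H118M18I48M1051H * 0 0 * * AS:i:0",
    "contig00103 16 chrZ 28303141 0 47H78M38I-921M1025H * 0 0 * * AS:i:0",
    "contig00103 16 chr9 18318424 0 52H61M2I-964M1073H * 0 0 * * AS:i:15",
]

for lineno, qname, cigar in bad_cigar_records(SAMPLE):
    print(lineno, qname, cigar)
```

On the sample it reports records 6 and 7, the two with hyphens. To run it on the real file, pass an open file handle instead of SAMPLE, since the function accepts any iterable of lines.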