SEQanswers

Old 04-12-2011, 10:22 AM   #1
foxyg
Member
 
Location: US

Join Date: May 2010
Posts: 54
Default BWA generating incorrect CIGAR string?

I aligned a single-end sample against the HG18 reference using the latest BWA. When I then tried to convert the SAM file to a BAM file using samtools, I got the following error:
Parse error at line 119: sequence and quality are inconsistent

Line 119 looks like this:
HWI-EAS266_0011:1:1:6:1607#0 16 12 2662146 37 1S35M * 0 0 GGGAACAAATGTGGGGAGGCAGAGGCAGGTCCCTGA $ $$""####$""$#$"###

I searched around and saw people discussing this, but found no real solution.

Does anyone have any idea?
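
As far as I can tell, samtools raises this error when the SEQ and QUAL fields of a record differ in length, so one way to narrow the problem down is to scan the SAM file for such records before converting. Below is a minimal sketch, assuming tab-separated SAM text. The function names `cigar_query_length` and `check_record` and the sample record are hypothetical and not part of BWA or samtools; the sample reuses the flag, CIGAR and read sequence from the post with a made-up 20-character quality string.

```python
import re

# CIGAR operations that consume bases of the read (query), per the SAM spec.
QUERY_OPS = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases implied by a CIGAR string, or None for '*'."""
    if cigar == "*":
        return None
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in QUERY_OPS)


def check_record(line):
    """Return a list of length inconsistencies found in one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    cigar, seq, qual = fields[5], fields[9], fields[10]
    problems = []
    if qual != "*" and len(seq) != len(qual):
        problems.append("SEQ has %d bases but QUAL has %d" % (len(seq), len(qual)))
    qlen = cigar_query_length(cigar)
    if qlen is not None and seq != "*" and qlen != len(seq):
        problems.append("CIGAR implies %d bases but SEQ has %d" % (qlen, len(seq)))
    return problems


# Hypothetical record: made-up read name and a made-up 20-character QUAL.
seq = "GGGAACAAATGTGGGGAGGCAGAGGCAGGTCCCTGA"
record = "\t".join(["read1", "16", "12", "2662146", "37", "1S35M",
                    "*", "0", "0", seq, "#" * 20])
for problem in check_record(record):
    print(problem)
```

Note that 1S35M accounts for 36 read bases, which matches the 36-base sequence in the post, so the CIGAR itself does not look wrong there; the length check on the quality string is the one worth running first.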
Old 04-19-2011, 08:03 PM   #2
flipwell
Member
 
Location: Queensland

Join Date: Feb 2011
Posts: 14
Default

I have had this error a couple of times as well. I found that if I reran sampe/samse and tried the conversion again, it worked fine.
Old 04-30-2011, 06:06 AM   #3
nntao
Junior Member
 
Location: Mid-west

Join Date: Jan 2010
Posts: 4
Default CIGAR field only contains *|\d+M

Hi,

I noticed that the CIGAR strings in my bwa mapping output file (paired-end Illumina reads against a reference sequence file) are either * or of the form "\d+M", such as "35M", when I use -s (which disables Smith-Waterman for the unmapped mate) for better speed. I thought -s only affected the unmapped mate. Is it true that only "\d+M" is reported when the "-s" option is used with "bwa sampe"? With that option, does it report only matches that cover the whole read length and ignore partial matches?


Thanks!

Bob
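
One way to check this on a given output file is to tally how many CIGAR fields are `*`, a single `\d+M`, or anything else. A rough sketch, assuming SAM records as tab-separated text lines; `tally_cigars` and the sample records are made up for illustration and say nothing about what bwa actually emits with -s:

```python
import re
from collections import Counter

SINGLE_M_RE = re.compile(r"\d+M")


def tally_cigars(lines):
    """Count CIGAR fields by kind: '*', a single '<n>M', or anything else."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        cigar = line.rstrip("\n").split("\t")[5]
        if cigar == "*":
            counts["unavailable"] += 1
        elif SINGLE_M_RE.fullmatch(cigar):
            counts["single M"] += 1
        else:
            counts["other"] += 1
    return counts


# Made-up records with only the fields needed for the tally filled in.
sample = [
    "@SQ\tSN:ref\tLN:1000",
    "r1\t0\tref\t1\t37\t35M\t*\t0\t0\t*\t*",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r3\t0\tref\t5\t37\t1S34M\t*\t0\t0\t*\t*",
]
print(dict(tally_cigars(sample)))
```

Running the same tally on output produced with and without -s would show whether the "other" category really disappears.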

Last edited by nntao; 04-30-2011 at 08:20 AM. Reason: More testing answered partially own question
Old 09-13-2011, 06:29 AM   #4
xchen5
Junior Member
 
Location: georgia

Join Date: Mar 2010
Posts: 3
Default

I have something to share:
Look at the following records, generated by BWA and then samtools from paired-end reads. The five reads are identical, so why are they mapped to different locations, and why is the CIGAR "*"? (Ignore the "N"s; the reference sequence includes a region identical to the read's sequence.)



HWI-ST565_0121:4:2207:1671:63901#ATCACG 181 segment1 19 0 * = 19 0 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACTGATAGCCAGACAGCCATCAAAAGGATTCGTTTGGAGGAATCAAAATAAAATCACTAAAAATGA BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB`bbcccccddb_`eeeeegbgggihiihghffiihgfhiiihhiihhfghhgcbhfhfiiiihhhg
HWI-ST565_0121:4:1108:5261:43887#ATCACG 117 segment1 21 0 * = 21 0 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACTGATAGCCAGACAGCCATCAAAAGGATTCGTTTGGAGGAATCAAAATAAAATCACTAAAAATGA BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBcdccccccdddddeeeeeggggghdhiiiiiiiihiihiiihihiiiihiiihgfbihiiifgde^
HWI-ST565_0121:4:2106:9301:25723#ATCACG 181 segment1 22 0 * = 22 0 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACTGATAGCCAGACAGCCATCAAAAGGATTCGTTTGGAGGAATCAAAATAAAATCACTAAAAATGA BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBcdbcccccdbbdbeeeeegggggiiihiiiihhghiiihhiiiiiiiiiiihhhihiiiiifggdX
HWI-ST565_0121:4:1103:2424:11895#ATCACG 181 segment1 24 0 * = 24 0 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACTGATAGCCAGACAGCCATCAAAAGGATTCGTTTGGAGGAATCAAAATAAAATCACTAAAAATGA BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBcdccbb^bbbb__ebaaeggfeggeiiihhhhiiiggihfgcgihiihhehihfebhhiiihggb^
HWI-ST565_0121:4:2106:3549:50867#ATCACG 117 segment1 25 0 * = 25 0 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACTGATAGCCAGACAGCCATCAAAAGGATTCGTTTGGAGGAATCAAAATAAAATCACTAAAAATGA BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB_cb^ZZZbbb]_Za_a]bbgdd^__bcfdghhhffhhhhfccgfcbhfffg`fcaShgagdffbbP
Old 09-13-2011, 09:29 AM   #5
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Quote:
Originally Posted by xchen5 View Post
I have something to share:
Look at the following records, generated by BWA and then samtools from paired-end reads. The five reads are identical, so why are they mapped to different locations, and why is the CIGAR "*"? (Ignore the "N"s; the reference sequence includes a region identical to the read's sequence.)
All five reads have the 4 flag set (181 = 128+32+16+4+1, 117 = 64+32+16+4+1). They are really unmapped, no matter what the rest of the line looks like. The SAM spec calls for unmapped reads to be given the mapping position of their partner, so that the two reads will sort together.
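
The flag arithmetic above can be checked with a few lines of code. This is a small sketch that decodes a SAM FLAG value into the bits defined by the SAM specification; the helper name `decode_flag` and the short bit descriptions are my own wording, not an official API.

```python
# Bit values of the SAM FLAG field, with informal descriptions of each bit.
FLAG_BITS = [
    (1, "read paired"),
    (2, "read mapped in proper pair"),
    (4, "read unmapped"),
    (8, "mate unmapped"),
    (16, "read reverse strand"),
    (32, "mate reverse strand"),
    (64, "first in pair"),
    (128, "second in pair"),
    (256, "not primary alignment"),
    (512, "read fails quality checks"),
    (1024, "PCR or optical duplicate"),
]


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


for flag in (181, 117):
    print(flag, decode_flag(flag))
```

For both 181 and 117 the "read unmapped" bit (4) is set while the "mate unmapped" bit (8) is not, which fits the explanation that the position shown is borrowed from the mapped mate.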
Old 09-15-2011, 06:52 AM   #6
Brajbio
Member
 
Location: India

Join Date: Jun 2010
Posts: 20
Default

Hi, I am using the solid2fastq.pl script from bwa-0.5.9. I have two files, SolF3.csfasta and SolF3_QV.qual, which I want to convert to FASTQ. I ran the following command:

perl solid2fastq.pl Sol SolTest

I get the file SolTest.single.fastq.gz, but it contains no reads after I unzip it, even though my input files contain a good, equivalent number of reads. Can you explain the reason, if you have any idea?


Strangely, the same command works fine with another set of files.

Last edited by Brajbio; 09-15-2011 at 07:09 AM.
Old 09-16-2011, 12:22 PM   #7
xchen5
Junior Member
 
Location: georgia

Join Date: Mar 2010
Posts: 3
Default

Quote:
Originally Posted by swbarnes2 View Post
All five reads have the 4 flag set (181 = 128+32+16+4+1, 117 = 64+32+16+4+1). They are really unmapped, no matter what the rest of the line looks like. The SAM spec calls for unmapped reads to be given the mapping position of their partner, so that the two reads will sort together.
Thanks, swbarnes2.

But my other question is this: those identical reads (if the "N"s are removed) have an identical region in the reference, so why do they end up as unmapped reads?

Thanks in advance for any useful hints.
All times are GMT -8.