SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
NGS whole genome sequence versus sequence capture for quality control houkto Bioinformatics 0 02-02-2012 05:16 AM
Anyone knows sequence quality score 0-99? holywoool Bioinformatics 4 11-02-2010 08:35 PM
Quality scores and sequence alignment warrenemmett General 0 09-01-2010 06:35 AM
samtools view error: CIGAR and sequence length are inconsistent (tophat/bowtie) glacierbird Bioinformatics 2 06-29-2010 02:58 AM
Sequence Alignment Quality Control nilshomer Bioinformatics 4 02-06-2010 08:24 AM

Reply
 
Thread Tools
Old 02-02-2012, 01:27 PM   #1
myronpeto
Member
 
Location: Portland, OR

Join Date: Sep 2011
Posts: 10
Default Sequence and quality are inconsistent

I hope this hasn't been addressed in a previous thread. I'm getting the following error message when I try to import into a .bam file (using samtools import).

[samopen] SAM header is present: 84 sequences.
Parse error at line 90: sequence and quality are inconsistent

I'm using bwa aln to find coordinates and bwa sampe to generate alignments. When I look at that line I find the following record:

HWI-ST609:107:C0DBJACXX:6:1101:2173:1948 83 2 112534540 0 66S35M = 112534381 -194 TTCACACCAGGGAGGAATTCCGAGTTCTTAAAGAGCCCCCGCTTAGGGTGGTTCTGCAGCCGCTCCTGATGGCTTCGGGAGCTGAAAAACTCTAACACNAA !
! RG:Z:UACC812BTR_tumor1 XC:i:35 XT:A:R NM:i:1 SM:i:0 AM:i:0 X0:i:2 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:32C2 XA:Z:2,-112534540,101M,1;


I can go back to the fastq file and find the corresponding record:

@HWI-ST609:107:C0DBJACXX:6:1101:2173:1948 1:Y:0:GCCAAT
TTNGTGTTAGAGTTTTTCAGCTCCCGAAGCCATCAGGAGCGGCTGCAGAACCACCCTAAGCGGGGGCTCTTTAAGAACTCGGAATTCCTCCCTGGTGTGAA
+
;;#2:5(@66:;;=<?8@)2<>>?@?39=??9>??3?>??;??2=?7>?999=9??#############################################

Here are the aln, sampe and import commands I'm using:

bwa-0.5.10/bwa aln -I -q 15 -t 12 human_g1k_v37.fasta XXXX_tumor1_1.fq.gz > XXXX_tumor1_read1.sai

bwa-0.5.10/bwa aln -I -q 15 -t 12 human_g1k_v37.fasta XXXX_tumor1_2.fq.gz > XXXX_tumor1_read2.sai

bwa-0.5.10/bwa sampe -P -r '@RG\tID:XXXX_tumor1\tLB:XXXX_tumor\tSM:XXXX_tumor' -f XXXX_tumor1.sam human_g1k_v37.fasta XXXX_tumor1_read1.sai XXXX_tumor1_read2.sai XXXX_tumor1_1.fq.gz XXXX_tumor1_2.fq.gz

samtools-0.1.18/samtools import /human_g1k_v37.fasta.fai XXXX_tumor1.sam XXXX_tumor1_unsorted.bam

I've tried using other versions of bwa and samtools. It seems the record is getting mixed up in the alignment process. Can anybody shed light on this?

Myron Peto
myronpeto is offline   Reply With Quote
Old 02-02-2012, 02:11 PM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

The trivial answer...I think import is deprecated. So try SAMtools view; that command does a number of things, including converting sams to bams.

Also, .sam files are really big. Where possible, you should pipe the output of whatever progam is making your .sam files (like bwa sampe, in your case) straight into samtools view.

So your command line should look like

Quote:
bwa sampe ref.fa r1.sai r2.sai 1.fq 1.fq | samtools view -bS - > out.bam
the -bS means "input is sam, output is bam".
swbarnes2 is offline   Reply With Quote
Old 02-02-2012, 03:12 PM   #3
myronpeto
Member
 
Location: Portland, OR

Join Date: Sep 2011
Posts: 10
Default Good to know...

Thanks for the response. I tried using samtools view on the data piped directly from bwq sampe and also on the sam file I generated. I still get the sequence and quality inconsistent error. From the record in the .sam file it seems the sequence and quality strings are of different length and the problem is upstream of the conversion from .sam to .bam

Regardless, it's nice to know that view is the preferred option now.

Myron
myronpeto is offline   Reply With Quote
Old 02-02-2012, 03:49 PM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 699
Default

Check to make sure index is compatible with 0.5.10 version of bwa. Indexing changed with 0.6.0.

The previous aln or sampe command(s) is/are messing up.

Your symptom is often the result of piping stderr to same same stdout in a previous process. Try adding " 2> stderr.txt" to explicitly force the stderr output to a file you can later examine on every command.

Using a batch feeder that automagically adds a "2>&1" will pump stderr into stdout and you'll get that "mismatch read/quality thing " ... if your lucky.

You can do a "strings filename" command on your sai file to see if there's anything bad there. ("bad"as in stderr messages, any other string should look like junk).
Richard Finney is offline   Reply With Quote
Old 02-02-2012, 04:40 PM   #5
myronpeto
Member
 
Location: Portland, OR

Join Date: Sep 2011
Posts: 10
Default

Again, thanks for the response. I tried using 0.6.0 and 0.6.1 versions of bwa but got a segmentation fault with both of them. Maybe it would be useful to try to troubleshoot that path.

As for the stderr and stdout, I left out the full version of the command. Here it is:

bwa-0.5.10/bwa sampe -P -r '@RG\tID:XXXX_tumor1\tLB:XXXX_tumor\tSM:XXXX_tumor' -f XXXX_tumor1.sam human_g1k_v37.fasta XXXX_tumor1_read1.sai XXXX_tumor1_read2.sai XXXX_tumor1_1.fq.gz XXXX_tumor1_2.fq.gz 1>>XXXX_output.log 2>&1

However, I've also tried running the aln and sampe commands without redirecting the outputs and I've gotten the same error every time.

I tried strings on the .sai fill and did get a bunch of gibberish. I only went down a few pages and didn't grep the whole file for "error" or anything. I then piped it to "grep error" and came up empty.

Myron
myronpeto is offline   Reply With Quote
Old 02-02-2012, 05:08 PM   #6
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 699
Default

Well. The -r parameter might be the problem. Bad output is RG:Z:UACC812BTR_tumor1 which looks suspiciously like '@RG\tID:XXXX_tumor1\tLB:XXXX_tumor\tSM:XXXX_tumor'

Try running it without the -r and maybe -P parameters.
It could be you don't have enough memory for -P.

You can reheader the sam/bam output to put the RG in.
Richard Finney is offline   Reply With Quote
Old 02-02-2012, 10:32 PM   #7
myronpeto
Member
 
Location: Portland, OR

Join Date: Sep 2011
Posts: 10
Default

Good suggestions they were, but, alas, I still get that same error message. I tried without the -r and them without either the -r or -P option but still get a misformed sam file.

Any other ideas are most welcome.

Myron
myronpeto is offline   Reply With Quote
Old 02-03-2012, 04:17 AM   #8
dnusol
Senior Member
 
Location: Spain

Join Date: Jul 2009
Posts: 133
Default

were these files generated with CASAVA 1.8+?

if so, try removing -I from command. This flag is for 1.5+ and now CASAVA can get qualities upto 41

HTH

dave


#Edit: sorry, I meant 1.3+ and not 1.5+

Last edited by dnusol; 02-03-2012 at 04:57 AM.
dnusol is offline   Reply With Quote
Old 02-06-2012, 10:05 AM   #9
myronpeto
Member
 
Location: Portland, OR

Join Date: Sep 2011
Posts: 10
Default

Hallelujah!!

That last recommendation did the trick - the import (or view) from a sam to bam file is now proceeding. Knock on wood, hopefully it will finish without error.

Thanks much to all who offered suggestions, I offer my sincere gratitude.

Myron
myronpeto is offline   Reply With Quote
Old 10-29-2013, 03:06 PM   #10
Elsie
Member
 
Location: Australia

Join Date: Mar 2011
Posts: 85
Default

For anyone who comes across this post in future with the same error message, one tip is to check the size of your .sai files. I had this error, one sai file was significantly smaller, I checked and one of the *.fastq.gz files had not quite transferred across correctly, hence the error.
Elsie is offline   Reply With Quote
Reply

Tags
bwa, import, inconsistent, quality, samtools

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 06:24 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2018, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO