SEQanswers

Old 03-05-2010, 12:50 AM   #1
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 199
MAPQ must should be 0 for unmapped read

Hi, I am not sure what to make of this.
The SAM file was produced by bwa.
Is this a bug in the bwa output?

java -Xmx2g -jar /home/corona/bin/source/picard-tools-1.14/ViewSam.jar INPUT=sorted.sam ALIGNMENT_STATUS=Aligned > sortedaligned.sam


Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MAPQ must should be 0 for unmapped read.; File sorted.sam; Line 8910023
Line: ./S2:747_1696_219 4 chr18 90771994 25 48M * 0 0 AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGTTTGGGGCTT ]]]]]]PLQ]]XW[]QA6H]]VI+3]M9FSIFG@QQ:!)]0+OJRL:7 XT:A:U CM:i:1 XN:i:10 X0:i:1 X1:i:0 XM:i:4 XO:i:0 XG:i:0 MD:Z:40C7
at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:176)
at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:385)
at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:232)
at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:196)
at net.sf.picard.sam.ViewSam.doWork(ViewSam.java:68)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:143)
at net.sf.picard.sam.ViewSam.main(ViewSam.java:58)
Old 03-09-2010, 05:38 PM   #2
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21
Default

I am getting a similar error with Picard ValidateSamFile.jar after running bwa 0.5.6...


$ java -Xmx4g -jar picard-tools-1.14/ValidateSamFile.jar INPUT=IC201N.sam

[Tue Mar 09 11:35:57 EST 2010] net.sf.picard.sam.ValidateSamFile INPUT=IC201N.sam MODE=VERBOSE MAX_OUTPUT=100 IGNORE_WARNINGS=false TMP_DIR=/var/folders/zK/zKWfvkbvHui1XRNLQFGiFrpceyw/-Tmp-/mard VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
ERROR: Read groups is empty
ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 3064036, Read name HWUSI-EAS715_100113:3:96:383:1101#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, MAPQ must should be 0 for unmapped read.
ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, CIGAR should have zero elements for unmapped read.
[Tue Mar 09 11:36:26 EST 2010] net.sf.picard.sam.ValidateSamFile done.
Runtime.totalMemory()=84475904


Turns out that the first read from above maps off the end of the reference chromosome...

Ref# CAGAGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGCCACCAGGAAAACACGGCCGCGGGATC <- end of chromosome
Read CGGGGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGGCACCAGGAAAACACGGCCGCGGGATCCCA

So is this a bug in bwa?
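
The "maps off the end of the chromosome" condition can be checked mechanically. Below is a rough Python sketch that computes how far an alignment runs past the end of its reference, given POS, CIGAR and the reference length (the LN value of the matching @SQ header line). The function names and the coordinates in the example are hypothetical; only the 71-base read overhanging by 3 bases mirrors the alignment shown above.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string (0 for '*')."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def overhang(pos, cigar, ref_length):
    """Bases by which the alignment extends past the reference end (0 if none)."""
    end = pos + reference_span(cigar) - 1  # 1-based, inclusive end coordinate
    return max(0, end - ref_length)


if __name__ == "__main__":
    # Hypothetical numbers: a 71M read starting 68 bases before the end
    # of a reference of length 1000.
    print(overhang(933, "71M", 1000))
```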
Old 03-09-2010, 05:49 PM   #3
nilshomer
Nils Homer
 
nilshomer's Avatar
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by mard View Post
I am getting a similar error with Picard ValidateSamFile.jar after running bwa 0.5.6...
[...]
So is this a bug in bwa?
It's complicated (best Top Gun quote). See the samtools mailing lists (help and devel) regarding these two issues. Feel free to voice your concerns on those lists as the solution to the above is ongoing.
Old 03-09-2010, 06:52 PM   #4
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323
Default

Indeed,

http://sourceforge.net/mailarchive/f...samtools-devel


Quote:
Goose: [Extending his middle finger] You know, the finger!
Charlie: Yes, I know the finger, Goose.
Goose: Sorry. I hate when it does that.
Charlie: [to Maverick] So you're the one.
Maverick: Yes, ma'am.
__________________
-drd
Old 03-09-2010, 09:21 PM   #5
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21
Default

Thanks for the info.

So it looks like, for the moment, the solution is to have Picard ignore these errors by adding either IGNORE={INVALID_MAPPING_QUALITY,INVALID_CIGAR} (for ValidateSamFile.jar) or VALIDATION_STRINGENCY=SILENT (for ViewSam.jar, SortSam.jar and MarkDuplicates.jar).

Would that be correct?
Old 03-09-2010, 11:40 PM   #6
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 199
Default

Yes it would seem so.
I am afraid it will cause viewing problems in IGV downstream, though I can't confirm that.
Old 11-23-2010, 01:35 AM   #7
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
Default

Sorry for the bump, but I wonder what the consequences are when we start using VALIDATION_STRINGENCY=SILENT. We'd like Picard to ignore

Code:
ERROR: Record 29078883, Read name HWUSI-EAS536_0001:4:51:19663:20378#0, CIGAR should have zero elements for unmapped read.
ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, MAPQ should be 0 for unmapped read.
ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, CIGAR should have zero elements for unmapped read.
which, in the dataset we're currently analysing, are the only 3 errors.

But... won't VALIDATION_STRINGENCY=SILENT ignore other, more serious issues as well (in the next dataset)?

Also, when we have Picard ignore these errors, what about later steps? We are currently using FixMates (even on single end data) to get rid of the error.

For the record, we too are using GATK with BWA.
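
One alternative to silencing validation altogether, which nobody in this thread has tested, would be to rewrite only the offending records so that they pass the two checks. The following Python sketch resets MAPQ to 0 and CIGAR to `*` on any record whose FLAG has the unmapped bit (0x4) set, and leaves everything else alone. `normalize_unmapped` is a made-up name, and the sketch does not touch optional tags such as MD, which may therefore become stale on the rewritten records.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def normalize_unmapped(sam_line):
    """Return the SAM line with MAPQ=0 and CIGAR='*' if the read is flagged unmapped.

    Header lines (starting with '@') and mapped records are returned unchanged.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    if len(fields) >= 11 and int(fields[1]) & UNMAPPED_FLAG:
        fields[4] = "0"  # MAPQ
        fields[5] = "*"  # CIGAR
    return "\t".join(fields)


if __name__ == "__main__":
    import sys

    # Filter usage: python normalize_unmapped.py < in.sam > out.sam
    for raw in sys.stdin:
        print(normalize_unmapped(raw))
```

Because only records carrying the unmapped flag are modified, any other problem in the file would still be reported by a later STRICT or LENIENT validation run.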

Quote:
Originally Posted by KevinLam
I am afraid it will cause viewing problems in IGV downstream, though I can't confirm that.
Did anyone confirm this?

Last edited by Bruins; 11-23-2010 at 04:49 AM.
Old 11-23-2010, 04:55 AM   #8
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323
Default

I would suggest not using SILENT at all. If you have to, be more specific about the errors you want to ignore. The validation is very granular.
__________________
-drd
Old 11-23-2010, 05:28 AM   #9
Bruins
Member
 
Location: Groningen

Join Date: Feb 2010
Posts: 78
Default

Quote:
Originally Posted by drio View Post
I would suggest not using SILENT at all. If you have to, [...]
Well... the problem is that after reading this thread and some googling, I get the feeling that SILENT (or IGNORE=) is the only solution, other than running FixMates.

That takes me back to my first question: what would be the consequence of using SILENT? Drio, would you like to comment on that some more?

I'm planning to run some tests, I'll report back later.
Old 11-23-2010, 07:25 AM   #10
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323
Default

You can set VALIDATION_STRINGENCY=LENIENT, which tells Picard to show any error it sees but to continue with the processing. After that, you can inspect the validation output and decide whether to bail out or continue executing your pipeline.
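
The "inspect the output and decide" step can be scripted. Here is a small Python sketch that tallies `ERROR:` lines by message and reports whether everything seen falls into a tolerated set. It assumes the line format shown in the ValidateSamFile output earlier in this thread ("ERROR: Record N, Read name X, message"); other Picard versions may format messages differently, and the `TOLERATED` set and function names are just illustrative.

```python
import re
from collections import Counter

# Matches lines such as:
# ERROR: Record 115632, Read name HWUSI-...#0, MAPQ must should be 0 for unmapped read.
ERROR_LINE = re.compile(r"^ERROR: Record (\d+), Read name (\S+), (.+)$")

# Messages we are willing to live with (both spellings seen in this thread).
TOLERATED = {
    "MAPQ should be 0 for unmapped read.",
    "MAPQ must should be 0 for unmapped read.",
    "CIGAR should have zero elements for unmapped read.",
}


def tally(report_lines):
    """Count ERROR lines by message text; non-ERROR lines are ignored."""
    counts = Counter()
    for line in report_lines:
        line = line.strip()
        if not line.startswith("ERROR:"):
            continue
        match = ERROR_LINE.match(line)
        # Errors without a record/read prefix are counted under their full text.
        message = match.group(3) if match else line[len("ERROR:"):].strip()
        counts[message] += 1
    return counts


def only_tolerated(counts):
    """True if every message seen is in the TOLERATED set."""
    return all(message in TOLERATED for message in counts)
```

A pipeline wrapper could then stop when `only_tolerated(tally(open("validation.txt")))` is false, where `validation.txt` is a hypothetical file holding the captured validation output.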
__________________
-drd
Old 12-14-2011, 10:39 AM   #11
petriedish
Junior Member
 
Location: Colorado

Join Date: Nov 2010
Posts: 3
I know it's way later but...

I noticed recently that Picard has CleanSam.jar whose only purpose at this point is to clip mappings that extend beyond the reference. That may be helpful in removing these errors.

Last edited by petriedish; 12-14-2011 at 10:39 AM. Reason: Typo
Old 03-21-2012, 04:17 PM   #12
odoyle81
Member
 
Location: United States

Join Date: Aug 2011
Posts: 31
Default

CleanSam didn't seem to work. It just ignored the same reads but didn't remove them, at least for me.
Setting validation to LENIENT seemed to work, though!
Old 05-11-2012, 03:03 AM   #13
oliviera
Member
 
Location: germany

Join Date: Apr 2010
Posts: 31
Default

I am getting the same problem with MarkDuplicates.jar.
With LENIENT it also runs through.

Did anyone look at the original question: is it a bug in bwa?
I am trying to reproduce the pipeline published by Bowen et al. 2012 in Genetics, and they do not mention this problem with Picard.
Old 08-07-2012, 07:09 PM   #14
mxr1895
Junior Member
 
Location: new zealand

Join Date: Feb 2012
Posts: 6
Default

Bowtie-generated alignments do not have this issue.
Old 01-30-2013, 03:27 PM   #15
Naarkhoo
Member
 
Location: Unja

Join Date: Jan 2013
Posts: 11
Default

What is the conclusion of this thread? I am facing the same problem.
Old 01-30-2013, 03:43 PM   #16
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Use VALIDATION_STRINGENCY=LENIANT

That way, it will report all the errors it sees to STDERR, but it will finish the job anyway.
Old 02-01-2013, 10:40 AM   #17
Naarkhoo
Member
 
Location: Unja

Join Date: Jan 2013
Posts: 11
Default

Quote:
Originally Posted by swbarnes2 View Post
Use VALIDATION_STRINGENCY=LENIANT

That way, it will report all the errors it sees to STDERR, but it will finish the job anyway.
That's true, it doesn't complain, but the size of the output file is 0!
Is this something the sequencing vendor should fix at their end?
Old 02-03-2013, 02:28 PM   #18
Naarkhoo
Member
 
Location: Unja

Join Date: Jan 2013
Posts: 11
Default

Any hints? I have also put this question to Picard, but haven't heard from them either.
I would appreciate any hint.
Old 02-03-2013, 04:08 PM   #19
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838
Default

Does spelling LENIENT correctly help?
Old 02-03-2013, 04:42 PM   #20
Naarkhoo
Member
 
Location: Unja

Join Date: Jan 2013
Posts: 11
Default

Quote:
Originally Posted by gringer View Post
does spelling LENIENT correctly help?
The spelling in my command line is already correct.
Here I just quoted what "swbarnes2" suggested.