Hi everyone,
I am running MarkDuplicates.jar on my paired-end, BWA-mapped BAM file, and it reports the same validation error for a very large number of records (about 0.5M so far).
An example:
Ignoring SAM validation error: ERROR: Record 68, Read name MachineName:1:2:15520:114537#0, CIGAR should have zero elements for unmapped read.
Has anyone experienced this before?
When I pull the two reads of this pair out of the BAM file, the records look like this:
(Read1)
MachineName:1:2:15520:114537#0 73 chr10 50281 0 69M31S = 50281 0 CTGTGCAATAACTGTGTACAAAAGCCCCAAAGCTTAAATTGTGCAGTTGAGCGCATGTTCTGTTGTTCAGCATTTATGTTGGTTTATAGTGGAAAAGATT
?5<3;2><<62@3A<<<7>@@=B7BCC=BB:,<+:9/)<+0;*'+-'271B@BB2BC@CC=B0B<>BA################################
XC:i:69 XT:A:R NM:i:3 SM:i:0 AM:i:0 X0:i:2 X1:i:0 XM:i:3 XO:i:0 XG:i:0 MD:Z:8A24A3T31
(Read2)
MachineName:1:2:15520:114537#0 133 chr10 50281 0 88M12S = 50281 0 TTCATTGTTTGGCATAACAGTACTTCAGATTTGAATCATCTAATAACATTGTCATCATAGCATATTCTCCTGGAAGTAACACACAATAACTACTTCAAAA
E/EBEDDFDFEFFF=?@CBA.=>.<.9.::EE=EE33?<7D>BCB-<5<>.:37:@<B/<.8986<:9;->A@A@BADBB@BB@AEE############# XC:i:88
Oddly enough, both records are reported at the same position, although the sequences are completely different. I BLATed both sequences and got the following for read one and read two, respectively:
SCORE  QSTART  QEND  QSIZE  IDENTITY  CHRO  STRAND  TSTART  TEND   SPAN
88     1       100   100    94.0%     10    +       50281   50380  100
86     1       100   100    93.0%     10    -       50520   50619  100
So read two really maps to a different position on chromosome 10, on the opposite strand, at a distance from read one (50281 to 50619, about 340 bp end to end) that is roughly the expected insert size.
Could BWA have mapped the pair incorrectly because the % identity is low?
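It may help to decode the FLAG column (73 and 133) of the two records above. The following is a minimal sketch in plain Python, using the bit definitions from the SAM specification; the names decode_flag and unmapped_with_cigar are hypothetical helpers written for this post, not part of Picard or BWA.

```python
# SAM FLAG bits and their meanings, as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "read_unmapped",
    0x8: "mate_unmapped",
    0x10: "read_reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def unmapped_with_cigar(flag, cigar):
    """True if the record is flagged unmapped (0x4) but still carries a CIGAR string."""
    return bool(flag & 0x4) and cigar != "*"


# FLAG and CIGAR values copied from the two records in this post.
print(decode_flag(73))    # ['paired', 'mate_unmapped', 'first_in_pair']
print(decode_flag(133))   # ['paired', 'read_unmapped', 'second_in_pair']
print(unmapped_with_cigar(73, "69M31S"))    # False
print(unmapped_with_cigar(133, "88M12S"))   # True
```

So read two carries the unmapped bit (0x4) while still having the CIGAR 88M12S, which is the combination the validation error complains about. It would also account for the shared position, since the SAM specification recommends giving an unmapped mate the RNAME and POS of its mapped mate. Why BWA flagged the read as unmapped in the first place is not something this sketch can answer.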