Hello All,
Maybe I'm missing something trivial here...
I have a BAM file produced by bwa that contains a few reads (15 out of 21 million, I think) where the SAM flag is 20 but the read is mapped with a high MAPQ. SAM flag 20 (4 + 16) should mean "read unmapped and read reverse strand" (which doesn't make sense...).
I found this out while running Picard's DownsampleSam.
How do we explain that a read has flag 20 while it otherwise looks normally mapped?
Here's an example record, and further down the Picard log:
Code:
CRIRUN_738:7:60:15550:13583#GATCAGA 20 chr15 100338908 25 36M * 0 0 TAGGNTTCTAACCCTAACCCTAACCCTAACCCTAAC ########?B>DGGGGEGGGFGCGGGGG@GGGFGEG XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:4G2A28
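To double-check my reading of flag 20, here is a minimal sketch that decodes a SAM flag into its bits. The bit values are the ones listed in the SAM specification; the names `SAM_FLAG_BITS` and `decode_flag` are just made up for this illustration.

```python
# Flag bit values as listed in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in sorted(SAM_FLAG_BITS.items()) if flag & bit]


print(decode_flag(20))  # ['read unmapped', 'read reverse strand']
```

So 20 = 4 + 16 really is "unmapped" plus "reverse strand", which is what seems at odds with the MAPQ of 25 and the 36M CIGAR in the record above.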
Code:
[Thu Jun 21 10:54:03 BST 2012] net.sf.picard.sam.DownsampleSam INPUT=bam_clean/el001_6.clean.bam OUTPUT=bam_clean_downsample/el001_6.clean.8M.bam RANDOM_SEED=1234 PROBABILITY=0.3784436 VALIDATION_STRINGENCY=LENIENT VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu Jun 21 10:54:03 BST 2012] Executing as berald01@crinode2 on Linux 2.6.18-274.3.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_21-b06; Picard version: 1.59(1062)
Ignoring SAM validation error: ERROR: Record 4860916, Read name CRIRUN_738:7:60:15550:13583#GATCAGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 6852526, Read name CRIRUN_738:7:66:12714:8917#GATCAGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 6852527, Read name CRIRUN_738:6:65:7137:9838#ACTTGAA, MAPQ should be 0 for unmapped read.
INFO 2012-06-21 10:54:58 DownsampleSam Read 10000000 reads, kept 3786210
INFO 2012-06-21 10:55:54 DownsampleSam Read 20000000 reads, kept 7572764
Ignoring SAM validation error: ERROR: Record 20637887, Read name CRIRUN_738:7:70:15841:5492#GATCAGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21376847, Read name CRIRUN_738:7:29:11846:7539#GGCTACA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21376848, Read name CRIRUN_738:7:50:13587:20204#GATCAGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385154, Read name CRIRUN_738:6:41:2357:8884#ATCACGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385155, Read name CRIRUN_738:6:117:19385:1870#ATCACGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385156, Read name CRIRUN_745:3:91:16809:17501#TTAGGCA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385157, Read name CRIRUN_738:6:2:15188:2232#ATCACGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385158, Read name CRIRUN_745:2:57:17289:14668#GGCTACA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385159, Read name CRIRUN_745:3:9:10303:11004#ATCACGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385160, Read name CRIRUN_745:3:50:12189:4898#ATCACGA, MAPQ should be 0 for unmapped read.
Ignoring SAM validation error: ERROR: Record 21385161, Read name CRIRUN_745:3:73:14991:2335#TTAGGCA, MAPQ should be 0 for unmapped read.
INFO 2012-06-21 10:56:02 DownsampleSam Finished! Kept 8117302 out of 21438988 reads.
[Thu Jun 21 10:56:02 BST 2012] net.sf.picard.sam.DownsampleSam done. Elapsed time: 1.98 minutes.
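In case it helps anyone reproduce this, below is a rough sketch of how the offending records could be pulled out of SAM text: it reports reads whose flag has the unmapped bit (0x4) set but whose MAPQ is not 0, which is the condition Picard complains about in the log. The function name `unmapped_with_mapq` is hypothetical, and I am assuming plain tab-separated SAM lines as input (for instance what `samtools view` prints); the example record is the one from this post.

```python
def unmapped_with_mapq(sam_lines):
    """Yield (read name, flag, MAPQ) for records flagged unmapped (0x4) with MAPQ != 0.

    Hypothetical helper; expects tab-separated SAM text lines.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x4 and mapq != 0:
            yield fields[0], flag, mapq


# The example record from this post, rebuilt with tab separators (mandatory fields only).
example = "\t".join([
    "CRIRUN_738:7:60:15550:13583#GATCAGA", "20", "chr15", "100338908", "25",
    "36M", "*", "0", "0",
    "TAGGNTTCTAACCCTAACCCTAACCCTAACCCTAAC",
    "########?B>DGGGGEGGGFGCGGGGG@GGGFGEG",
])

print(list(unmapped_with_mapq([example])))
# [('CRIRUN_738:7:60:15550:13583#GATCAGA', 20, 25)]
```

This only lists the suspicious records; it says nothing about why bwa produced them.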
Thanks!
Dario