  • Inconsistent SAM flag

    Hello All,

    Maybe I'm missing something trivial here...

    I have a BAM file produced by bwa that contains a few reads (15 out of 21 million, I think) where the SAM flag is 20 but the read is mapped with a high MAPQ. SAM flag 20 (4 + 16) should mean "read unmapped" and "read reverse strand", which doesn't make sense for a read that has a position and a MAPQ.

    I found this out while running Picard DownsampleSam.
    Here's an example:
    Code:
    CRIRUN_738:7:60:15550:13583#GATCAGA	20	chr15	100338908	25	36M	*	0	0	TAGGNTTCTAACCCTAACCCTAACCCTAACCCTAAC	########?B>DGGGGEGGGFGCGGGGG@GGGFGEG	XT:A:U	NM:i:2	X0:i:1	X1:i:0	XM:i:2	XO:i:0	XG:i:0	MD:Z:4G2A28
    And this is the Picard log:
    Code:
    [Thu Jun 21 10:54:03 BST 2012] net.sf.picard.sam.DownsampleSam INPUT=bam_clean/el001_6.clean.bam OUTPUT=bam_clean_downsample/el001_6.clean.8M.bam RANDOM_SEED=1234 PROBABILITY=0.3784436 VALIDATION_STRINGENCY=LENIENT    VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    [Thu Jun 21 10:54:03 BST 2012] Executing as berald01@crinode2 on Linux 2.6.18-274.3.1.el5 amd64; Java HotSpot(TM) 64-Bit Server VM 1.6.0_21-b06; Picard version: 1.59(1062)
    Ignoring SAM validation error: ERROR: Record 4860916, Read name CRIRUN_738:7:60:15550:13583#GATCAGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 6852526, Read name CRIRUN_738:7:66:12714:8917#GATCAGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 6852527, Read name CRIRUN_738:6:65:7137:9838#ACTTGAA, MAPQ should be 0 for unmapped read.
    INFO    2012-06-21 10:54:58     DownsampleSam   Read 10000000 reads, kept 3786210
    INFO    2012-06-21 10:55:54     DownsampleSam   Read 20000000 reads, kept 7572764
    Ignoring SAM validation error: ERROR: Record 20637887, Read name CRIRUN_738:7:70:15841:5492#GATCAGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21376847, Read name CRIRUN_738:7:29:11846:7539#GGCTACA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21376848, Read name CRIRUN_738:7:50:13587:20204#GATCAGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385154, Read name CRIRUN_738:6:41:2357:8884#ATCACGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385155, Read name CRIRUN_738:6:117:19385:1870#ATCACGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385156, Read name CRIRUN_745:3:91:16809:17501#TTAGGCA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385157, Read name CRIRUN_738:6:2:15188:2232#ATCACGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385158, Read name CRIRUN_745:2:57:17289:14668#GGCTACA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385159, Read name CRIRUN_745:3:9:10303:11004#ATCACGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385160, Read name CRIRUN_745:3:50:12189:4898#ATCACGA, MAPQ should be 0 for unmapped read.
    Ignoring SAM validation error: ERROR: Record 21385161, Read name CRIRUN_745:3:73:14991:2335#TTAGGCA, MAPQ should be 0 for unmapped read.
    INFO    2012-06-21 10:56:02     DownsampleSam   Finished! Kept 8117302 out of 21438988 reads.
    [Thu Jun 21 10:56:02 BST 2012] net.sf.picard.sam.DownsampleSam done. Elapsed time: 1.98 minutes.
    How can a read have flag 20 when it otherwise looks normally mapped?

    Thanks!

    Dario
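
For readers who want to check the flag arithmetic in the post above, here is a minimal Python sketch that splits a SAM flag into its set bits. The bit values follow the FLAG table in the SAM specification; the function name `decode_sam_flag` and the short descriptions are illustrative choices, not part of bwa or Picard.

```python
# Bit values follow the FLAG table in the SAM specification.
# The short descriptions are paraphrased for readability.
SAM_FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_sam_flag(flag):
    """Return the descriptions of every bit set in a SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


print(decode_sam_flag(20))  # ['read unmapped', 'read reverse strand']
```

Flag 20 decodes to exactly the two bits named in the question, so the flag itself is being read correctly; the open question is why bwa set the unmapped bit on a record that carries a position and MAPQ.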

  • #2
    I have no explanation. However, if you want to do some troubleshooting: what happens if you take just these reads, plus a couple of others as a control, and run them through BWA again? Do they come out with the same problem?


    • #3
      Is it possible that your read is hanging off the end of one chromosome and onto the next?
      bwa sets the 0x4 (unmapped) flag in those cases.
      Last edited by swbarnes2; 06-21-2012, 08:47 AM.
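
One way to test the suggestion in #3 is to compare the end coordinate of each flagged read with the length of its reference sequence (the `LN` value on the matching `@SQ` header line). The sketch below is a rough check under stated assumptions: the helper names are hypothetical, and the chromosome length in the example call is a made-up placeholder, since the thread does not say which assembly was used.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING_OPS = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment's CIGAR string."""
    return sum(
        int(length)
        for length, op in CIGAR_OP.findall(cigar)
        if op in REF_CONSUMING_OPS
    )


def overhangs_reference(pos, cigar, ref_length):
    """True if a 1-based alignment starting at pos runs past ref_length."""
    end = pos + reference_span(cigar) - 1
    return end > ref_length


# POS and CIGAR come from the example record in the first post.
# The length 100338920 is a HYPOTHETICAL placeholder; use the LN value
# from the @SQ line for chr15 in your own BAM header instead.
print(overhangs_reference(100338908, "36M", 100338920))
```

The example read would end at position 100338943, so it overhangs any chr15 shorter than that. If the flagged reads all sit within a read length of the end of their reference sequence, that would be consistent with the explanation given in #3.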
