  • Confused about the strand FLAG of Bismark paired-end seq results

    Can anyone help me understand the strand FLAG in the Bismark results for paired-end data?

    Previously, I worked with single-end data in Bismark, and that worked perfectly for me.
    Recently I started working on some paired-end data, and I am confused about the strand FLAG in the Bismark results.

    My paired-end data are directional. In the results:
    All "OT" have FLAG number "67"
    All "CTOT" have FLAG number "131"
    All "OB" have FLAG number "115"
    All "CTOB" have FLAG number "179"


    Correct me if I am wrong:
    When the FLAG is "67" or "131", does it mean the read is on the "+" strand,
    i.e. the strand used as the genomic sequence for mouse?

    When the FLAG is "115" or "179", does it mean the read is on the "-" strand?


    Am I right?

  • #2
    In reply to Jerry_Zhao:
    This is indeed a very confusing issue. I will paste the commented code from the paired-end SAM section in the hope that it answers your questions:

    ### As the FLAG values do not consider that there might be 4 different bisulfite strands of DNA, we are trying to make FLAG tags which take the strand identity into account

    # strands OT and CTOT will be treated as aligning to the top strand (both sequences are scored as aligning to the top strand)
    # strands OB and CTOB will be treated as aligning to the bottom strand (both sequences are scored as reverse complemented sequences)

    Code:
    ### This is a description of the bitwise FLAG field which needs to be set for the SAM file taken from: "The SAM Format Specification (v1.4-r985), September 7, 2011"
      ## FLAG: bitwise FLAG. Each bit is explained in the following table:
      ## Bit    Description                                                Comment                                Value
      ## 0x1    template having multiple segments in sequencing            0: single-end 1: paired end            value: 2^^0 (  1)
      ## 0x2    each segment properly aligned according to the aligner     true only for paired-end alignments    value: 2^^1 (  2)
      ## 0x4    segment unmapped                                           ---                                           ---
      ## 0x8    next segment in the template unmapped                      ---                                           ---
      ## 0x10   SEQ being reverse complemented                             - strand alignment                     value: 2^^4 ( 16)
      ## 0x20   SEQ of the next segment in the template being reversed     + strand alignment                     value: 2^^5 ( 32)
      ## 0x40   the first segment in the template                          read 1                                 value: 2^^6 ( 64)
      ## 0x80   the last segment in the template                           read 2                                 value: 2^^7 (128)
      ## 0x100  secondary alignment                                        ---                                           ---
      ## 0x200  not passing quality controls                               ---                                           ---
      ## 0x400  PCR or optical duplicate                                   ---                                           ---
    
    
      if ($index == 0){       # OT
        $flag_1 = 67;                                                      # Read 1 is on the + strand  (1+2+64) (Read 2 is technically reverse-complemented, but we do not score it)
        $flag_2 = 131;                                                     # Read 2 is on - strand but informative for the OT        (1+2+128)
      }
      elsif ($index == 1){    # CTOB
        $flag_1 = 115;                                                     # Read 1 is on the + strand, we score for OB  (1+2+16+32+64)
        $flag_2 = 179;                                                     # Read 2 is on the - strand  (1+2+16+32+128)
      }
      elsif ($index == 2){    # CTOT
        $flag_1 = 67;                                                      # Read 1 is on the - strand (CTOT) strand, but we score it for OT (1+2+64)
        $flag_2 = 131;                                                     # Read 2 is on the + strand, score it for OT (1+2+128)
      }
      elsif ($index == 3){    # OB
        $flag_1 = 115;                                                     # Read 1 is on the - strand, we score for OB  (1+2+16+32+64)
        $flag_2 = 179;                                                     # Read 2 is on the + strand  (1+2+16+32+128)
      }
    Last edited by fkrueger; 09-14-2012, 11:36 AM.
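
    The strand convention in the snippet above can be restated as a small lookup. This is only a sketch in Python, not part of Bismark: the function name `treated_strand` is made up for illustration, and the flag values are the ones shown in this snippet only, so it is an assumption that they apply to any other Bismark version.

    ```python
    # Flag values exactly as assigned in the quoted snippet (assumption:
    # other Bismark versions may assign different values).
    TOP_STRAND_FLAGS = {67, 131}      # OT and CTOT pairs, scored as top strand
    BOTTOM_STRAND_FLAGS = {115, 179}  # OB and CTOB pairs, scored as bottom strand


    def treated_strand(flag):
        """Return '+' or '-' for the strand the snippet treats a read as aligning to.

        Hypothetical helper for illustration; raises ValueError for any flag
        the quoted snippet does not produce.
        """
        if flag in TOP_STRAND_FLAGS:
            return "+"
        if flag in BOTTOM_STRAND_FLAGS:
            return "-"
        raise ValueError(f"flag {flag} is not one of the values in the quoted snippet")


    for flag in (67, 131, 115, 179):
        print(flag, treated_strand(flag))
    ```

    Note that both reads of a pair get the same strand here, which matches the comments above: the flag records which bisulfite strand the pair is scored for, not the literal orientation of each individual read.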


    • #3
      67 = 64+2+1

      That means the read came from the first fastq file, the mate is not reverse-complemented, the read is not reverse-complemented, the mate is mapped, the read is mapped, the reads are properly paired, and the read is paired.

      So yes, 67 means the read and its mate are flagged as mapping in the forward direction, and the read came from the first fastq file. 131 means the same, except that the read came from the second fastq file.

      115 = 64+32+16+2+1

      So that's: first fastq file, mate reversed, read reversed, properly paired, paired.

      The same goes for 179, except that those reads came from the second fastq file.
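
      If it helps, the same bit arithmetic can be done mechanically. Below is a minimal sketch in Python that splits a FLAG value into its set bits, using the bit values from the SAM specification table quoted earlier in the thread. The short names such as `proper_pair` and the function name `decode_flag` are made up for illustration.

      ```python
      # Bit values from the SAM specification FLAG table; the short names
      # are illustrative labels, not official identifiers.
      SAM_FLAG_BITS = [
          (0x1, "paired"),
          (0x2, "proper_pair"),
          (0x4, "unmapped"),
          (0x8, "mate_unmapped"),
          (0x10, "reverse"),
          (0x20, "mate_reverse"),
          (0x40, "read1"),
          (0x80, "read2"),
          (0x100, "secondary"),
          (0x200, "qc_fail"),
          (0x400, "duplicate"),
      ]


      def decode_flag(flag):
          """Return the names of all bits set in a SAM FLAG value, lowest bit first."""
          return [name for bit, name in SAM_FLAG_BITS if flag & bit]


      for flag in (67, 131, 115, 179):
          print(flag, decode_flag(flag))
      ```

      For 67 this lists paired, proper_pair and read1; for 115 it additionally lists reverse and mate_reverse, which is the 16+32 part of the sum above.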
