  • sam flag is confusing

    I mapped reads to a transcriptome that I assembled from those same reads.
    There is this record in the SAM file:
    HWI-ST397_0000:5:1101:19602:2158#ATCTCG 20 UN57620 131 25 95M *
    0 0 GTTGATCTGAGAGTAGCAGATGCCCTCAATCTTCACATCCTTGGGCACTTTGGCGCCCATGTCA
    GTGTCAGAGGGCTGGATGCTCGTGAAGGTGC BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB XT:A:U NM:i:5 X0:i:1 X1:i:0
    XM:i:5 XO:i:0 XG:i:0 MD:Z:11C3C11G63C1T1


    I think the flag 20 = 16 + 4, where 16 means reverse strand and 4 means unmapped, but I found that the read can be mapped starting at position 131 on UN57620.

    The mapped sequences are shown here, and the read is not reversed:
    >UN057620
    CCCGCTATAGCCATGATCCTAGCTTGAAAATCTCGTTGCG
    GCAAAATTTATGCTTTGTCGGCGGATAAAGGAGGGTAAGT
    GTATGTTTTGTTTACAAAGAGGAATCAAATCAGTGTAGGA
    CGACAGTCTAGTTGATCTGAGCGTACCAGATGCCCTCGAT #UN057620
    GTTGATCTGAGAGTAGCAGATGCCCTCAAT #read
    CTTCACATCCTTGGGCACTTTGGCGCCCATGTCAGTGTCA #UN057620
    GAGGGCTGGATGCTCGTGAA#read
    CTTCACATCCTTGGGCACTTTGGCGCCCATGTCAGTGTCA #UN057620
    GAGGGCTGGATGCTCGTGAAGGTGC#read

  • #2
    Agreed, the SAM flag is very confusing and I am thoroughly confused by it. The SAM file documentation doesn't help.

    Can someone explain it in plain English?

    What is the flag for reads that map uniquely? Can I sort these out with awk/sed/grep?
    --------------
    Ethan


    • #3
      The flags as such don't say whether a read mapped uniquely. bwa adds the XT tag. If it says XT:A:U, the read mapped uniquely. If it says XT:A:R, then that end was repetitive, but that doesn't mean that the mate didn't map uniquely.
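      To pull out the uniquely mapped records without awk/sed/grep, a minimal Python sketch along these lines might work. It assumes tab-separated SAM lines from bwa that carry the XT tag; the function names are made up for illustration.

```python
def is_unique_bwa_hit(sam_line):
    """Return True if a SAM record carries bwa's XT:A:U tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at column 12 (index 11) of a SAM record.
    return "XT:A:U" in fields[11:]


def unique_records(lines):
    """Yield non-header SAM lines whose XT tag marks a unique hit."""
    for line in lines:
        if line.startswith("@"):
            # Header lines (@HD, @SQ, ...) carry no alignment.
            continue
        if is_unique_bwa_hit(line):
            yield line
```

      Passing an open SAM file handle to unique_records() should yield only the XT:A:U records. Aligners other than bwa may not write this tag at all, so an empty result does not necessarily mean nothing mapped uniquely.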

      The simplest explanation for that flag is that you are doing single-end reads with bwa, and this read hangs off the end of one reference sequence and onto another. bwa concatenates the reference sequences together, and when a read crosses two sequences like that, bwa will still map it but set the 4 (unmapped) bit in the flag.
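      One way to check whether a record hangs off the end of its reference is to add up the reference bases its CIGAR covers and compare the end position with the reference length. This is only a sketch: the function names are hypothetical, and ref_length has to come from your own assembly (for example the LN field of the @SQ header line), since the thread does not state it.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """Return the 1-based position of the last reference base covered."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def overhangs(pos, cigar, ref_length):
    """True if the alignment runs past the end of its reference sequence."""
    return reference_end(pos, cigar) > ref_length


# The record from the first post: POS 131, CIGAR 95M.
print(reference_end(131, "95M"))  # 225
```

      So the read in the first post would need UN57620 to be at least 225 bases long to fit. If the contig is shorter than that, the read overhangs and the 4 bit would be expected.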


      • #4
        See: http://picard.sourceforge.net/explain-flags.html


        • #5
          Nilshomer, thanks a lot, this app is very useful, but I still don't understand the logic behind the flag system...

          I have some alignments with flag = 0. Can someone explain this?


          • #6
            Originally posted by yina
            Nilshomer, thanks a lot, this app is very useful, but I still don't understand the logic behind the flag system...

            I have some alignments with flag = 0. Can someone explain this?

            Single-end reads generally yield only three flag values: 0, 4 and 16.

            0 means the read aligned in the forward direction. 16 means it aligned in the reverse direction. 4 means it didn't align.

            The rest of the flags are mostly for paired-end data.
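            The flag is a sum of power-of-two bits, so any value can be broken down by testing each bit in turn. Here is a small sketch of that idea; the bit values follow the SAM specification, but the short descriptions and the function name are my own wording.

```python
# (bit value, meaning) pairs from the SAM specification's FLAG field.
FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read on reverse strand"),
    (0x20, "mate on reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def explain_flag(flag):
    """Return the meanings of all bits set in a SAM flag."""
    return [meaning for bit, meaning in FLAG_BITS if flag & bit]


print(explain_flag(20))  # ['read unmapped', 'read on reverse strand']
print(explain_flag(16))  # ['read on reverse strand']
print(explain_flag(0))   # []
```

            A flag of 0 simply means no bit is set: the read is unpaired, mapped, and on the forward strand. The 20 from the first post is 4 + 16.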


            • #7
              Thanks swbarnes2, I found a good explanation of the way the bitwise flag is constructed. Here is the link:

              Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc


              Hope it helps ETHANol and others
