  • How can I dissect .sam files in a text editor?

    Hello everyone!

    I'm new to bioinformatics.
    I have some questions about reading (eye-balling) a .sam file.

    For example:

    @SQ SN:HPV11REF LN:7931
    @SQ SN:HPV16REF LN:7846
    @SQ SN:HPV18REF LN:7857
    @SQ SN:HPV31REF LN:7906
    @SQ SN:HPV33REF LN:7909
    @SQ SN:HPV35REF LN:7879
    @SQ SN:HPV39REF LN:7833
    @SQ SN:HPV45REF LN:7858
    @SQ SN:HPV51REF LN:7808
    @SQ SN:HPV52REF LN:7942
    @SQ SN:HPV56REF LN:7845
    @SQ SN:HPV58REF LN:7824
    @SQ SN:HPV59REF LN:7896
    @SQ SN:HPV6REF LN:7996
    @SQ SN:HPV1REF LN:7816
    @SQ SN:HPV2REF LN:7860
    @SQ SN:HPV3REF LN:7820
    @SQ SN:HPV4REF LN:7353
    @SQ SN:HPV5REF LN:7746
    @SQ SN:HPV7REF LN:8027
    @SQ SN:HPV8REF LN:7654
    @SQ SN:HPV9REF LN:7434
    @SQ SN:HPV10REF LN:7919
    @SQ SN:HPV34REF LN:7723
    @SQ SN:HPV40REF LN:7909
    @SQ SN:HPV42REF LN:7917
    @SQ SN:HPV43REF LN:7975
    @SQ SN:HPV44REF LN:7833
    @SQ SN:HPV53REF LN:7859
    @SQ SN:HPV54REF LN:7759
    @SQ SN:HPV61REF LN:7989
    @SQ SN:HPV68REF LN:7822
    @SQ SN:HPV69REF LN:7700
    @SQ SN:HPV70REF LN:7905
    @SQ SN:HPV72REF LN:7989
    @SQ SN:HPV73REF LN:7700
    @SQ SN:HPV80REF LN:7427


    {BWA instruction}


    MSQ-M1307R:269:000000000-D24BN:1:1101:15163:1383 (QNAME)
    99 (FLAG)
    HPV56REF (RNAME)
    6262 (Position of the leftmost base)
    60 (Mapping quality, Phred)
    151M (CIGAR)
    = (Mate Reference sequence NaMe (`=' if same as RNAME) )
    6268 (1-based Mate POSition)
    157 ( inferred Template LENgth (insert size))

    ACATTGTACAATCCACCTGTAAATATCCTGACTATTTAAAAATGTCTGCAGATGCCTATGGTGATTCTATGTGGTTTTACTTACGCAGGGAACAATTATTTGCCAGACATTATTTTAATAGGGCTGGTAAAGTTGGGGAAACAATACCTGC

    BCCCCFFFFFFFGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHGHGHHHHHHGGGGGGGHGHHHHHHHHHHHHGHGHHHHHHHHHHHHHHHHHGGFHGHHHHHHGGGGHHHHHHHHHHH

    NM:i:0 (OPTional fields in the format “TAG:VTYPE:VALUE”)

    MD:Z:151

    AS:i:151

    XS:i:0


    In this first read of the SAM file, I inserted a line break at each tab character so that each field is easier to read.
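The same tab-by-tab split can be done programmatically. Below is a minimal Python sketch, assuming a standard tab-separated alignment line with the 11 mandatory SAM columns followed by optional TAG:TYPE:VALUE fields. The function name `parse_sam_line` and the shortened example record (read name `read1`, with `*` in place of the sequence and quality strings) are illustrative assumptions, not part of the original file.

```python
# Names of the 11 mandatory SAM columns, in the order they appear on a line.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
# Columns that hold integers.
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line):
    """Split one alignment line on tabs and label each column (hypothetical helper)."""
    columns = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, columns[:11]))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    # Optional fields look like TAG:TYPE:VALUE, e.g. NM:i:0 or MD:Z:151.
    tags = {}
    for item in columns[11:]:
        tag, vtype, value = item.split(":", 2)
        tags[tag] = int(value) if vtype == "i" else value
    record["TAGS"] = tags
    return record


# Shortened stand-in for the read shown above: SEQ and QUAL are replaced by "*".
example = "\t".join(["read1", "99", "HPV56REF", "6262", "60", "151M",
                     "=", "6268", "157", "*", "*",
                     "NM:i:0", "MD:Z:151", "AS:i:151", "XS:i:0"])
rec = parse_sam_line(example)
print(rec["RNAME"], rec["POS"], rec["RNEXT"], rec["TAGS"])
```

Header lines (those starting with `@`, such as the `@SQ` lines) have a different layout and would need to be skipped before calling a function like this.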

    Now, my question is about the following (copied) line (that you can find above):
    = (Mate Reference sequence NaMe (`=' if same as RNAME) )

    Does this mean that if the field were not '=' but, say, 'gene X', then 'gene X' would be contiguous to 'HPV56REF' (the RNAME)?

    Thank you so much for your precious help!!

    Jacques T

  • #2
    Have you checked out the SAM format specification?



    • #3
      Thanks GenoMax!!!

      No, I hadn't looked at that .pdf.

      Still, tell me if I am wrong:
      In "Ref. name of the mate/next read", does "next read" mean a read that spans 2 genes whenever that field (RNEXT) is not "="?



      • #4
        The mate reference sequence name (RNEXT) will not be '=' if the mate maps to a different contig or chromosome (one of the other sequences listed with @SQ at the start of the SAM file).

        Occasionally you get read pairs where the 2 reads of the pair map to different chromosomes.
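As a rough illustration of that rule, here is a small Python sketch that resolves the mate's reference name from the RNAME and RNEXT columns. The function names are hypothetical, and the handling of `*` (mate reference not available) is taken from the SAM specification rather than from this thread. The `HPV16REF` case is an invented example, not a record from the file.

```python
def mate_reference(rname, rnext):
    """Return the reference the mate maps to, or None if it is not available."""
    if rnext == "=":
        # "=" is shorthand for "same reference as this read".
        return rname
    if rnext == "*":
        # "*" means the information is unavailable.
        return None
    return rnext


def mates_on_different_references(rname, rnext):
    """True only when the mate is known to map to another @SQ sequence."""
    mate = mate_reference(rname, rnext)
    return mate is not None and mate != rname


# The read from the question: RNEXT is "=", so both mates are on HPV56REF.
print(mate_reference("HPV56REF", "="))
# Invented case: the mate maps to a different reference from the @SQ list.
print(mates_on_different_references("HPV56REF", "HPV16REF"))
```

A non-'=' value says nothing about the two references being adjacent; it only names the sequence on which the mate was aligned.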



        • #5
          OK. It's clearer now. Thanks Mastal

          Just in case I didn't understand, I have a dumb question: each read and its mate read are from the same sequence, except that one is forward and the other is reverse. Right?



          • #6
            Yes, each read and its mate are from the same fragment, starting from different ends of the fragment.
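The orientation of a read and its mate is recorded in the FLAG column (99 for the read in the first post). The following Python sketch decodes a FLAG value into its bits, using the bit meanings from the SAM specification. The function name `decode_flag` and the short labels are illustrative assumptions.

```python
# FLAG bits as defined in the SAM specification; the labels are shorthand.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "read_unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "read_reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [label for bit, label in FLAG_BITS if flag & bit]


# 99 = 1 + 2 + 32 + 64
print(decode_flag(99))
# -> ['paired', 'proper_pair', 'mate_reverse', 'first_in_pair']
```

So FLAG 99 describes a forward-strand first read whose mate is on the reverse strand. In a typical properly paired alignment the mate would carry FLAG 147 (1 + 2 + 16 + 128), although that record is not shown in this thread.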
