  • How can I dissect .sam files in a text editor?

    Hello everyone!

    I'm new to bioinformatics.
    I have some questions about reading (eye-balling) a .sam file.

    For example:

    @SQ SN:HPV11REF LN:7931
    @SQ SN:HPV16REF LN:7846
    @SQ SN:HPV18REF LN:7857
    @SQ SN:HPV31REF LN:7906
    @SQ SN:HPV33REF LN:7909
    @SQ SN:HPV35REF LN:7879
    @SQ SN:HPV39REF LN:7833
    @SQ SN:HPV45REF LN:7858
    @SQ SN:HPV51REF LN:7808
    @SQ SN:HPV52REF LN:7942
    @SQ SN:HPV56REF LN:7845
    @SQ SN:HPV58REF LN:7824
    @SQ SN:HPV59REF LN:7896
    @SQ SN:HPV6REF LN:7996
    @SQ SN:HPV1REF LN:7816
    @SQ SN:HPV2REF LN:7860
    @SQ SN:HPV3REF LN:7820
    @SQ SN:HPV4REF LN:7353
    @SQ SN:HPV5REF LN:7746
    @SQ SN:HPV7REF LN:8027
    @SQ SN:HPV8REF LN:7654
    @SQ SN:HPV9REF LN:7434
    @SQ SN:HPV10REF LN:7919
    @SQ SN:HPV34REF LN:7723
    @SQ SN:HPV40REF LN:7909
    @SQ SN:HPV42REF LN:7917
    @SQ SN:HPV43REF LN:7975
    @SQ SN:HPV44REF LN:7833
    @SQ SN:HPV53REF LN:7859
    @SQ SN:HPV54REF LN:7759
    @SQ SN:HPV61REF LN:7989
    @SQ SN:HPV68REF LN:7822
    @SQ SN:HPV69REF LN:7700
    @SQ SN:HPV70REF LN:7905
    @SQ SN:HPV72REF LN:7989
    @SQ SN:HPV73REF LN:7700
    @SQ SN:HPV80REF LN:7427


    {BWA instruction}


    MSQ-M1307R:269:000000000-D24BN:1:1101:15163:1383 (QNAME)
    99 (FLAG)
    HPV56REF (RNAME)
    6262 (Position of the leftmost base)
    60 (Mapping quality, Phred)
    151M (CIGAR)
    = (Mate Reference sequence NaMe (`=' if same as RNAME) )
    6268 (1-based Mate POSition)
    157 ( inferred Template LENgth (insert size))

    ACATTGTACAATCCACCTGTAAATATCCTGACTATTTAAAAATGTCTGCAGATGCCTATGGTGATTCTATGTGGTTTTACTTACGCAGGGAACAATTATTTGCCAGACATTATTTTAATAGGGCTGGTAAAGTTGGGGAAACAATACCTGC

    BCCCCFFFFFFFGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHGHGHHHHHHGGGGGGGHGHHHHHHHHHHHHGHGHHHHHHHHHHHHHHHHHGGFHGHHHHHHGGGGHHHHHHHHHHH

    NM:i:0 (OPTional fields in the format "TAG:VTYPE:VALUE")

    MD:Z:151

    AS:i:151

    XS:i:0


    For this first read in the .sam file, I replaced every tab with a line break so that each field is easier to tell apart.

    Now, my question is about the following (copied) line (that you can find above):
    = (Mate Reference sequence NaMe (`=' if same as RNAME) )

    Does this mean that if the field were not '=' but 'gene X', then 'gene X' would be contiguous to 'HPV56REF' (RNAME)?

    Thank you so much for your precious help!!

    Jacques T

  • #2
    Have you checked out the SAM format specification?



    • #3
      Thanks GenoMax!!!

      No, I hadn't looked at that .pdf.

      Still, tell me if I am wrong:
      In "Ref. name of the mate/next read", does "next read" mean a read spanning 2 genes when that field is not "="?



      • #4
        Mate Reference Sequence will not be '=' if the mate maps to a different contig or chromosome (the sequences listed with @SQ at the start of the sam file).

        Occasionally you get read pairs where the 2 reads of the pair map to different chromosomes.
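To make the rule above concrete, here is a minimal Python sketch that splits one alignment line on tabs and resolves the mate reference field. The names `SamCore`, `parse_core_fields`, `mate_reference` and `mate_on_other_reference` are made up for illustration and do not come from any library. The first example reuses the nine mandatory fields from the read posted above, with `*` standing in for the sequence and quality strings. The second record, with `HPV16REF` as the mate reference, is hypothetical and only shows what a pair split across two references would look like.

```python
from typing import NamedTuple, Optional


class SamCore(NamedTuple):
    """The first nine mandatory SAM columns (hypothetical helper type)."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int


def parse_core_fields(line: str) -> SamCore:
    """Split a tab-separated alignment line and keep columns 1 to 9."""
    f = line.rstrip("\n").split("\t")
    return SamCore(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                   f[5], f[6], int(f[7]), int(f[8]))


def mate_reference(rec: SamCore) -> Optional[str]:
    """Resolve RNEXT: '=' means same as RNAME, '*' means not available."""
    if rec.rnext == "=":
        return rec.rname
    if rec.rnext == "*":
        return None
    return rec.rnext


def mate_on_other_reference(rec: SamCore) -> bool:
    """True when the mate maps to a different @SQ sequence than this read."""
    mate = mate_reference(rec)
    return mate is not None and mate != rec.rname


# Mandatory fields copied from the read in post #1; SEQ and QUAL replaced by '*'.
example_line = "\t".join([
    "MSQ-M1307R:269:000000000-D24BN:1:1101:15163:1383",
    "99", "HPV56REF", "6262", "60", "151M", "=", "6268", "157", "*", "*",
])

# Hypothetical record: same read name, but the mate sits on another reference.
split_pair_line = "\t".join([
    "hypothetical_read", "65", "HPV56REF", "6262", "60", "151M",
    "HPV16REF", "100", "0", "*", "*",
])

same = parse_core_fields(example_line)
split = parse_core_fields(split_pair_line)
print(mate_reference(same), mate_on_other_reference(same))
print(mate_reference(split), mate_on_other_reference(split))
```

A mate reference other than '=' only says where the other read of the pair aligned. It says nothing about the two reference sequences being adjacent to each other.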



        • #5
          OK. It's clearer now. Thanks Mastal

          Just in case I didn't understand, I have a dumb question: each read and its mate read are from the same sequence, except that one is forward and the other is reverse. Right?



          • #6
            Yes, each read and its mate are from the same fragment, starting from different ends of the fragment.
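The FLAG and TLEN values of the read in post #1 show the same thing, and a short Python sketch can decode them. The bit meanings follow the SAM specification; the names `FLAG_BITS`, `decode_flag` and `template_length` are invented here for illustration. The length calculation assumes that the mate also covers 151 reference bases (for example a 151M CIGAR), which the thread does not show, so treat that part as an assumption.

```python
# SAM FLAG bits and short labels (labels are my own wording).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list:
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def template_length(left_pos: int, right_pos: int, right_ref_span: int) -> int:
    """Distance from the leftmost mapped base of the left read to the
    rightmost mapped base of the right read (1-based, inclusive)."""
    right_end = right_pos + right_ref_span - 1
    return right_end - left_pos + 1


# FLAG 99 from the read in post #1.
print(decode_flag(99))

# POS 6262, mate POS 6268; the mate's span of 151 bases is an assumption.
print(template_length(6262, 6268, 151))
```

With FLAG 99 the read is the first of its pair and aligns to the forward strand (the reverse bit is not set), while its mate aligns to the reverse strand. Under the 151-base assumption the computed length matches the 157 reported in the TLEN column.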
