
  • samtools pileup format

    The document regarding the pileup says
    "Upper case letters are mismatches on the forward strand whereas lower case letters are mismatches on the reverse strand."

    I have a line from a pileup file, generated without the -c option, that reads:

    chrM 9 A 5 GGGgg CCC*>



    This means I have 5x depth at position 9 of chrM and the reference base is A, but the reads show three uppercase G and two lowercase g. I assume the uppercase G means a forward-strand read shows G instead of A. But I am not sure how to interpret the lowercase g: does it mean I read a C on the reverse strand or a G on the reverse strand?



    It is very important I get this right.



    Thanks
    Last edited by foxyg; 09-28-2010, 06:03 PM.

  • #2
    Those reads mapped to the reverse strand, where you observe a C instead of the T that pairs with the reference A. The pileup shows the reverse complement of that C in lower case (g), because pileup bases are always reported relative to the forward strand (reference: A).
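
To make the reply above concrete, here is a minimal Python sketch that tallies the read-bases column of a pileup line by forward-strand allele and read strand. The function name `count_pileup_bases` is hypothetical (it is not part of samtools), and the sketch assumes a well-formed read-bases string that uses the usual pileup symbols: `.` and `,` for matches, upper/lower case letters for mismatches, `^` plus one mapping-quality character for a read start, `$` for a read end, and `+N`/`-N` followed by N bases for indels.

```python
import re
from collections import Counter


def count_pileup_bases(ref_base, read_bases):
    """Tally (forward-strand allele, read strand) pairs from a pileup read-bases string.

    Hypothetical helper for illustration; assumes a well-formed string.
    """
    ref = ref_base.upper()
    counts = Counter()
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            # Read start marker: the next character encodes mapping quality.
            i += 2
            continue
        if ch == "$":
            # Read end marker carries no base call.
            i += 1
            continue
        if ch in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            length_digits = re.match(r"\d+", read_bases[i + 1:]).group()
            i += 1 + len(length_digits) + int(length_digits)
            continue
        if ch == ".":
            counts[(ref, "+")] += 1        # match, read on forward strand
        elif ch == ",":
            counts[(ref, "-")] += 1        # match, read on reverse strand
        elif ch in "ACGTN":
            counts[(ch, "+")] += 1         # mismatch, read on forward strand
        elif ch in "acgtn":
            counts[(ch.upper(), "-")] += 1  # mismatch, read on reverse strand
        # Other symbols ('*', '<', '>') are skipped in this sketch.
        i += 1
    return counts


# The line from the question: the fifth column holds the read bases,
# the sixth column holds the base qualities.
chrom, pos, ref, depth, bases, quals = "chrM 9 A 5 GGGgg CCC*>".split()
tally = count_pileup_bases(ref, bases)
```

For the line in the question, `tally` holds three G calls from forward-strand reads and two G calls from reverse-strand reads. All five reads support a G at this position on the forward strand; the case of the letter only records which strand each read aligned to.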

    • #3
      If you use the -c option with pileup and then use samtools.pl to generate a consensus FASTA file, many bases in the FASTA file are in lower case. Do those lowercase bases indicate repeats or something else?

      • #4
        After looking at the code, it appears that the lower-case bases are positions where there are gaps in the mapping; e.g., no reads mapped to those positions.
