  • Confusion about samtools mpileup output

    When I run samtools with the following parameters:

    samtools mpileup -C50 -Q25 -q1 -l %s -f %s %s

    I get the following output:

    21 34124586 G 38 ,,,,AAA,AAAAa..,,,aaa.AaA,,A,,aaaaaaaA

    From samtools manual:

    "At this column, a dot stands for a match to the reference base on the forward strand, a comma for a match on the reverse strand, a ’>’ or ’<’ for a reference skip, ‘ACGTN’ for a mismatch on the forward strand and ‘acgtn’ for a mismatch on the reverse strand."

    How can I see an A on the forward strand and also an A on the reverse strand?

    From this result, position chr21:34124586 appears to be a heterozygous A/G SNP. But if the gene is expressed from the forward strand, shouldn't we see A or G only on the forward strand? Why can reads map to the reverse strand, when only the forward-strand DNA is transcribed into mRNA and the reverse strand is not?

    I am a bit confused here. Could anyone give me some hints?

    Thank you very much in advance.

  • #2
    Originally posted by fabrice
    How can I see an A on the forward strand and also an A on the reverse strand?

    Lowercase doesn't mean the reverse complement of the reference base. 'A' means that you have reads aligned in the forward direction that have an A there. 'a' means that you have reads aligned in the reverse direction that have an A there. If the reads with an A at that locus were all in the same direction, that would be a sign that it's not a real SNP, but some kind of misalignment.

    Unless you specifically did a strand-specific library prep, and you probably didn't, you should expect to see reads in both directions, even though your mRNA is only in the one direction.

    So your SNP looks fine, and real, and normal.
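
    To check the strand balance yourself, here is a rough sketch that tallies one pileup base column by base and read direction. The function name `count_pileup_bases` is made up for illustration and is not part of samtools; it assumes the column follows the format quoted from the manual, and it skips read-start markers (`^` plus a mapping-quality character), read-end markers (`$`), indels (`+n`/`-n` followed by n bases), deletion placeholders (`*`) and reference skips.

```python
from collections import Counter


def count_pileup_bases(bases, ref):
    """Tally read bases by strand from one mpileup base column (illustrative helper)."""
    counts = Counter()
    ref = ref.upper()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' marks a read start; the next character encodes mapping quality.
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
            continue
        if ch == ".":
            counts[(ref, "forward")] += 1
        elif ch == ",":
            counts[(ref, "reverse")] += 1
        elif ch in "ACGTN":
            counts[(ch, "forward")] += 1
        elif ch in "acgtn":
            counts[(ch.upper(), "reverse")] += 1
        # '$', '*', '>' and '<' carry no base call here and are skipped.
        i += 1
    return counts


if __name__ == "__main__":
    # The column from the original post (reference base G, depth 38).
    column = ",,,,AAA,AAAAa..,,,aaa.AaA,,A,,aaaaaaaA"
    for (base, strand), n in sorted(count_pileup_bases(column, "G").items()):
        print(base, strand, n)
```

    On the column from the first post, this gives 11 forward and 12 reverse reads with A, and 3 forward and 12 reverse reads matching the reference G, so the A allele is supported from both directions.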

  • #3
    swbarnes2,

    Thank you for the clarification. Yes, I did not do a strand-specific library prep. But I still have trouble understanding why I should expect to see reads in both directions.
    The mRNA is only in one direction, so when we sequence it we should also see only one direction.
