  • output samtools.pl

    Dear users,
    I used the BFAST program to generate a .sam file (from microRNA reads sequenced on ABI SOLiD).
    I then ran SAMtools to analyse SNPs and mutations, and ended up with two output files:
    a pileup file and a .snps file (produced by samtools.pl varFilter).

    This is an example of my .snps file:

    g hsa-let-7a-3 61 * */+A 73 73 40 81 * +A 79 2 0 0 0
    g hsa-let-7a-3 63 * */-C 133 133 45 79 * -C 76 1 2 3 1
    g hsa-let-7a-3 64 * */-T 165 165 44 78 * -T 77 1 0 0 0
    g hsa-let-7a-3 68 * */-T 385 385 40 53 * -T 52 1 0 0 0
    g hsa-let-7a-3 70 * */+A 78 78 35 43 * +A 42 1 0 0 0
    D hsa-let-7b 22 * */+A 895 895 22 1717 * +A 1652 36 29 7 37
    D hsa-let-7b 24 * */+TC 14761 14761 23 1728 * +TC 1573 51 104 7 4
    D hsa-let-7b 25 * +TA/+CTA 17955 37444 23 1730 +TA +CTA 32 17 1681 35 18
    D hsa-let-7b 26 * +A/+TA 21497 46739 23 1732 +A +TA 393 162 1177 40 29
    D hsa-let-7b 27 * +A/* 31077 31077 23 1733 +A * 6 1651 76 4 5
    D hsa-let-7b 28 T Y 228 228 23 1736 ,,.+1C,,$C.C...+4ATTC.+4ATTC*.+4ATTCC,.+1C.,...+4ATTCC..+1CCC.+1CC.+4ATTC
    .+1C.+4ATTC.+1C..+4ATTC.+4ATTC.+4ATTCC.+1C*.+4ATTCC.+1CC.+4ATTCC.+1C...+1CCC.+1C.+4ATTCC..+1CC.+1C.+1C.+1C.+1CC.+1C.+4ATTCC.+1C.+1C...+4ATTC.+1C.C..+1C.C
    ..+1C..+1C..+1C.....+4ATTCC.+4ATTCC..+4ATTC..C.+4ATTC.C.+1C.+4ATTCC.+1C.+4ATTCC...+4ATTC.C.+4ATTC.+1C......C.+4ATTC..+1C.+4ATTC.+1C.C.+4ATTC.+1CCCC.+1C.+
    1C.+4ATTC.+1C.+4ATTC.+1C.+4ATTC.+4ATTC.+4ATTC.+1C.+1CC.+4ATTC.+1CCC.+1CC.+1C.+1CC.C.+1CC.+1C.+1C.+4ATTC.+4ATTC..+1C.+1C.+1CCC.+4ATTCC.+1CC.+1C...C.+1C.+1
    CC.+4ATTC..+1C.+1C.+1C.+1C*.+4ATTC..+4ATTC.+1C.+1C.+1C.+1C.+1CC.CC.+4ATTC.+1C.+1CC..+1C..+1C.+1C..+1C.+4ATTC.+1C.+1C.+1C...+1C...+1C.A.+4ATTC.+4ATTC.-1C.
    +3CGC.+2TC.+1C.+1CC.+1C.+4ATTC..+4ATTC.+4ATTCC..+1C.+1C.C.CC.+4ATTCC.+4ATTC.+1C.+1C.+4ATTC.+1CC.C.+1C.+4ATTC.+4ATTC.+4ATTC.+1CC.+4ATTCC.+4ATTC.+4ATTC.+1C
    .+1CCC.+1C.+1C.+1C.+1C.+1CC.+4ATTC.+1CCC.+1CCCC.+1C.+1C.+4ATTC.+1CC.+1C..+4ATTC.CC.+1C.+4ATTC.+1C.CC.+1C.C.+1C.+1C.+1CC.+1CC.+1C.+1C.+1C.+4ATTC.+4ATTC.+1
    C.+4ATTCC.+1C.+1C.+1C.+1C.+1CC.+5CATTC.+4ATTC.+4ATTCCC.+1C.+1C.+1CC..C.+4ATTC.+1C..+1C..+1C.+1CC.+1C*.+1CCC.+4ATTC.+4ATTC..+1C.+1C..+1C..C.+4ATTC.+4ATTC.
    +4ATTC.C.+4ATTC.+1CC..+1C.+1CC.+4ATTC.+1C.+1C.CCC.+1C.+1C.+4ATTC.+4ATTC.+1C.+1CC..-1CC.+1C.+1C.+1C..+1C.+1C..+1C*...+1CC.+1C...+4ATTC.+1C.+1CC..+4ATTC..+
    1C...+1CC.+1C.+1C..C.+1C.-2CACC..C..+4ATTCA.+1C..+1C.+4ATTC..+1C.+1C..+1CC.+4ATTC.+4ATTC.+1C.+1C..+6GTATTCC.C.+4ATTC.+1C..+1C.+4ATTC.+4ATTC.+4ATTCC.


    But I don't know how to read this file. What does each column correspond to?
    How or where can I find the number of reads (the count of miRNA reads in the alignment)?

    Thanks a lot!
    Bye
    M.Elena

  • #2
    In reply to the post above by m_elena_bioinfo:
    I assume BFAST ran fine and that your problem is understanding the samtools pileup output. The samtools website has two pages that explain the output format.
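
    For the column question, here is a minimal Python sketch of how one line of the .snps file might be split into named fields. The field names are my own labels, based on the older samtools pileup consensus format as I understand it, so treat them as assumptions and check them against the samtools documentation. `parse_filtered_line` is a hypothetical helper, not part of samtools. The leading letter is the extra column added by varFilter, so every pileup column is shifted by one.

```python
def parse_filtered_line(line):
    """Split one varFilter output line (with a leading filter letter) into fields.

    Assumed layout: filter letter, reference name, position, reference base,
    consensus/genotype, consensus quality, SNP quality, RMS mapping quality,
    number of reads covering the position, then type-specific columns.
    """
    parts = line.split()
    record = {
        "filter": parts[0],
        "ref_name": parts[1],
        "pos": int(parts[2]),
        "ref_base": parts[3],
        "consensus": parts[4],
        "consensus_qual": int(parts[5]),
        "snp_qual": int(parts[6]),
        "rms_map_qual": int(parts[7]),
        "depth": int(parts[8]),  # reads covering this position
    }
    if record["ref_base"] == "*":
        # Indel line: two alleles, then read counts for allele 1, allele 2,
        # and reads carrying some other indel (assumed meaning).
        record["allele1"] = parts[9]
        record["allele2"] = parts[10]
        record["n_allele1"] = int(parts[11])
        record["n_allele2"] = int(parts[12])
        record["n_other"] = int(parts[13])
    else:
        # SNP line: the next column is the read-base string.
        record["read_bases"] = parts[9] if len(parts) > 9 else ""
    return record


example = "D hsa-let-7b 22 * */+A 895 895 22 1717 * +A 1652 36 29 7 37"
rec = parse_filtered_line(example)
print(rec["ref_name"], rec["pos"], rec["depth"])
```

    Under that reading, the number of reads covering a position is the ninth whitespace-separated field of each line. In the example line it is 1717, which equals 1652 + 36 + 29, the three read counts that follow the two alleles. Note that this is per-position coverage, which is not necessarily the same as the total number of reads aligned to a miRNA.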




    • #3
      I guess you are using "samtools.pl varFilter -p", in which case the output lists the variants that were filtered out. The first letter of each line gives the reason:

      # d low depth
      # D high depth
      # W too many SNPs in a window (SNP only)
      # G close to a high-quality indel (SNP only)
      # Q low RMS mapping quality (SNP only)
      # g close to another indel with higher quality (indel only)

      Note that if a SNP is filtered out due to depth, it will not be tested with "G".
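
      To see at a glance why variants were filtered, the first letter of each line can be tallied. This is a minimal Python sketch: the code-to-reason mapping is copied from the list above, while `tally_reasons` and the sample lines (shortened from the first post) are only for illustration.

```python
from collections import Counter

# Filter codes as listed above.
REASONS = {
    "d": "low depth",
    "D": "high depth",
    "W": "too many SNPs in a window (SNP only)",
    "G": "close to a high-quality indel (SNP only)",
    "Q": "low RMS mapping quality (SNP only)",
    "g": "close to another indel with higher quality (indel only)",
}


def tally_reasons(lines):
    """Count how many lines carry each filter reason (first column)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts[REASONS.get(fields[0], "unknown code")] += 1
    return counts


sample = [
    "g hsa-let-7a-3 61 * */+A 73 73 40 81 * +A 79 2 0 0 0",
    "g hsa-let-7a-3 63 * */-C 133 133 45 79 * -C 76 1 2 3 1",
    "D hsa-let-7b 22 * */+A 895 895 22 1717 * +A 1652 36 29 7 37",
]
for reason, n in tally_reasons(sample).items():
    print(n, reason)
```

      For a whole file, the same function could be called on an open file handle, since it only needs an iterable of lines.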
