  • loss of bases using mpileup

    I have noticed a problem with mpileup (samtools) that results in a large loss of reads at certain base positions, with some bases being entirely absent. Briefly, I created a pileup file from a merged bam file comprising 13 separate bams (combined using the merge option in samtools), then created an mpileup from the same 13 bams and compared the results. The pileup contained 133M unique bases, while the mpileup contained only 111M unique bases, a reduction of 16%. Looking in more detail at the output, I noticed that not only were many bases completely absent, but several others had massively reduced coverage. Moreover, in many cases the only remaining reads were those containing the alternate (non-reference) allele; all other possibilities were absent.

    I was using samtools 0.1.9 in each case. All parameters were left at their defaults for the pileup, and only -B -Q 0 were used for the mpileup. Before the pileups were created, the bam files had been filtered to remove base and mapping qualities below 20, and all ambiguously mapped reads had been removed.

    Any ideas on what might be causing this discrepancy?
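For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of how the two outputs could be diffed position by position. It assumes the plain tab-separated layout (chromosome, position, reference base, then depth/bases/qualities, repeated once per input bam in the mpileup case); the function names `read_depths` and `compare` are hypothetical helpers and not part of samtools.

```python
def read_depths(handle, n_samples=1):
    """Map (chrom, pos) -> depth summed over all sample columns.

    Assumes tab-separated pileup/mpileup text: chrom, pos, ref, then
    depth/bases/quals repeated n_samples times.
    """
    depths = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 + 3 * n_samples:
            continue  # skip blank or truncated lines
        chrom, pos = fields[0], int(fields[1])
        # Depth sits in columns 4, 7, 10, ... (1-based), one per sample.
        depths[(chrom, pos)] = sum(int(fields[3 + 3 * i]) for i in range(n_samples))
    return depths


def compare(pileup, mpileup):
    """Return positions missing from mpileup and positions with lower depth."""
    missing = sorted(set(pileup) - set(mpileup))
    reduced = {
        key: (pileup[key], mpileup[key])
        for key in pileup
        if key in mpileup and mpileup[key] < pileup[key]
    }
    return missing, reduced
```

With real files, something like `read_depths(open("merged.pileup"))` and `read_depths(open("all.mpileup"), n_samples=13)` would feed `compare` (both file names are placeholders). Holding 100M+ positions in a dictionary is memory hungry, so for whole-genome data a per-chromosome pass may be more practical.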

  • #2
    I think you can try adding the -A option (http://seqanswers.com/forums/showthread.php?t=21233)
    Last edited by Jane M; 07-01-2012, 11:42 PM.
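To gauge whether -A is likely to matter, one could count how many reads belong to anomalous pairs. The sketch below assumes that "anomalous" means a read flagged as paired (0x1) but not as a proper pair (0x2), which is my understanding of what mpileup skips by default; it reads SAM text such as the output of `samtools view`. The helper names `is_anomalous` and `count_anomalous` are hypothetical.

```python
PAIRED = 0x1       # SAM flag bit: read is paired in sequencing
PROPER_PAIR = 0x2  # SAM flag bit: read is mapped in a proper pair


def is_anomalous(flag):
    """True if the read is paired but not flagged as a proper pair."""
    return bool(flag & PAIRED) and not (flag & PROPER_PAIR)


def count_anomalous(sam_lines):
    """Return (total_reads, anomalous_reads) for an iterable of SAM lines."""
    total = anomalous = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        total += 1
        if is_anomalous(int(fields[1])):
            anomalous += 1
    return total, anomalous
```

If a large share of the reads at the affected positions turn out to be anomalous, that would be consistent with the default filter explaining the missing coverage.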
