  • Samtools mpileup. Which reads are excluded during SNP calling?

    Hi All,

    I have two questions about Samtools mpileup. Hope somebody can help me!

    First, I found that 'mpileup' filters out some reads/nucleotides before SNP calling (compared to the 'pileup' command). How can I find out which reads are actually filtered?

    Second, in the 'pileup' output, we can directly see the nucleotides and corresponding base quality scores mapped to each position. Does mpileup provide similar output, or does it only generate the VCF file?

    Many thanks!

  • #2
    As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

    As for what's being filtered out, however, I don't know. One of the default options is

    -Q INT skip bases with baseQ/BAQ smaller than INT [13]

    This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)

    I wish I could be of more help.
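
To make the -Q threshold concrete, here is a minimal sketch of what "skip bases with baseQ smaller than INT" amounts to for a single pileup position. It assumes the quality string uses Phred+33 ASCII encoding; the function name `filter_by_baseq` is hypothetical and is not part of samtools, and the sketch does not model BAQ.

```python
def filter_by_baseq(bases, quals, min_baseq=13, offset=33):
    """Keep only (base, phred) pairs whose quality is at least min_baseq.

    bases: one character per read at this position
    quals: matching quality characters (assumed Phred+33)
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    kept = []
    for base, qchar in zip(bases, quals):
        phred = ord(qchar) - offset  # decode the ASCII quality character
        if phred >= min_baseq:       # bases below the threshold are skipped
            kept.append((base, phred))
    return kept


if __name__ == "__main__":
    # 'I' -> 40, '.' -> 13, '#' -> 2, '5' -> 20
    print(filter_by_baseq("ACGT", "I.#5"))
```

With the default threshold of 13, only the base whose quality character decodes to 2 is dropped in this toy input.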

    • #3
      Originally posted by dagarfield
      As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

      As for what's being filtered out, however, I don't know. One of the default options is

      -Q INT skip bases with baseQ/BAQ smaller than INT [13]

      This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)

      I wish I could be of more help.

      Thank you! So we cannot get the specific nucleotides and corresponding base qualities from samtools anymore, right?

      • #4
        It is not obvious to me where that information is in the VCF file, if it is in there at all. However, you might be able to get something in the file generated by mpileup.

        Rather than generating a BCF-formatted file with mpileup (as outlined on the man page for mpileup), have you tried running mpileup without the -u and -g options? The output looks a whole lot like the output from the old pileup.

        --DG
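
If the text output is what you are after, a rough sketch of pulling the per-read bases and base qualities out of one pileup-style line might look like the following. It assumes the usual six tab-separated columns (chromosome, position, reference base, depth, read bases, base qualities) and Phred+33 qualities, and it only handles the common read-base symbols. The function name `parse_pileup_line` is hypothetical.

```python
import re


def parse_pileup_line(line):
    """Return (chrom, pos, ref, [(base, phred), ...]) for one pileup line."""
    chrom, pos, ref, _depth, bases, quals = line.rstrip("\n").split("\t")[:6]
    calls = []
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            i += 2  # read start marker; the next character is a mapping quality
        elif c == "$":
            i += 1  # read end marker, carries no base
        elif c in "+-":
            # indel: sign, length digits, then that many inserted/deleted bases
            digits = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
        else:
            # '.' and ',' mean "matches the reference" on forward/reverse strand
            calls.append(ref.upper() if c in ".," else c.upper())
            i += 1
    if len(calls) != len(quals):
        raise ValueError("number of base calls does not match quality string")
    phreds = [ord(q) - 33 for q in quals]  # assumed Phred+33 encoding
    return chrom, int(pos), ref, list(zip(calls, phreds))


if __name__ == "__main__":
    example = "chr1\t100\tA\t4\t^I.+2AG,Tc$\tI5#I"
    print(parse_pileup_line(example))
```

Strand information is discarded here by upper-casing everything; keep the original character instead if you need it.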

        • #5
          I've tried to run simply

          Code:
          samtools mpileup -f ref.fasta -b bam > out
          but I get "Segmentation Fault" almost immediately.
          @dagarfield
          Any ideas on what is going on? I thought I'd drop you a question here because you said you have run it without the -u and -g options.

          Thanks,
          Gareth

          • #6
            I just ran mine with the following syntax

            Code:
            samtools mpileup -f myGenome.fasta myBam.bam > myoutput.txt
            Where myGenome.fasta is a fasta file on which I have run the command

            Code:
            samtools faidx myGenome.fasta
            In the same directory in which myGenome.fasta lives.

            This looks pretty much like what you did except for the -b option. I think you can (and maybe should) leave that out when you are running just a single BAM file. For the -b option, you'd specify not a BAM file but rather a file that contains a list of the BAM files you want to analyze.

            How'd that work?
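
To illustrate the difference between passing a BAM directly and using -b, here is a small sketch that builds the argument list either way. It only constructs the command and writes the list file; it does not run samtools. The helper name `build_mpileup_cmd` and the default list file name `bam_list.txt` are made up for this example.

```python
from pathlib import Path


def build_mpileup_cmd(ref_fasta, bam_paths, list_file="bam_list.txt"):
    """Build a samtools mpileup argument list for one or several BAM files."""
    bam_paths = list(bam_paths)
    if not bam_paths:
        raise ValueError("at least one BAM file is required")
    if len(bam_paths) == 1:
        # single BAM: pass it as a positional argument, no -b needed
        return ["samtools", "mpileup", "-f", ref_fasta, bam_paths[0]]
    # several BAMs: -b expects a text file with one BAM path per line
    Path(list_file).write_text("\n".join(bam_paths) + "\n")
    return ["samtools", "mpileup", "-f", ref_fasta, "-b", list_file]


if __name__ == "__main__":
    print(build_mpileup_cmd("myGenome.fasta", ["myBam.bam"]))
```

The returned list can be handed to `subprocess.run` with stdout redirected to a file if you want to launch it from Python.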

            • #7
              You are right that the -b parameter takes a list of BAM files rather than a BAM file itself. I think it's time for some coffee.

              Thanks!
