
  • Samtools mpileup. Which reads are excluded during SNP calling?

    Hi All,

    I have two questions about Samtools mpileup. Hope somebody can help me!

    First, I found 'mpileup' filters some reads/nts out before SNP calling (compared to command 'pileup'). How can I know which reads are actually filtered?

    Second, in 'pileup' output, we can directly see the nucleotides and corresponding base quality scores mapped to one position. Does mpileup provide similar output? Or mpileup just generates the VCF file?

    Many thanks!

  • #2
    As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

    As for what's being filtered out, however, I don't know. One of the default options is

    -Q INT skip bases with baseQ/BAQ smaller than INT [13]

    This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)

    I wish I could be of more help.
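To make the -Q threshold concrete, here is a minimal sketch of how a quality cutoff applies to the quality string of a pileup line. It assumes the qualities are Phred+33 encoded (ASCII code minus 33), as in the text pileup output. The function name `count_low_quality` and the sample string are made up for illustration, and the sketch only looks at plain base qualities; it does not reproduce the BAQ adjustment that mpileup can also apply.

```shell
# Hypothetical helper: count how many base qualities in a Phred+33
# quality string fall below a threshold (default 13, the -Q default
# quoted above). This does not model BAQ.
count_low_quality() {
  quals=$1
  min_q=${2:-13}
  # od prints each byte as a decimal number; tr puts one number per line.
  printf '%s' "$quals" | od -An -v -tu1 | tr -s ' ' '\n' | awk -v min="$min_q" '
    NF { if ($1 - 33 < min) low++; total++ }
    END { printf "%d of %d bases below Q%d\n", low, total, min }'
}

# Made-up quality string: I=Q40, +=Q10, 5=Q20, #=Q2
count_low_quality 'II+5#I'
# prints: 2 of 6 bases below Q13
```

With the default cutoff, the two bases at Q10 and Q2 would be the ones skipped in this toy example.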


    • #3
      Originally posted by dagarfield
      As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

      As for what's being filtered out, however, I don't know. One of the default options is

      -Q INT skip bases with baseQ/BAQ smaller than INT [13]

      This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)

      I wish I could be of more help.
      Thank you! So we cannot get the specific nucleotides and corresponding base qualities from samtools anymore, right?


      • #4
        It is not obvious to me where that information is in the VCF file, if it is in there at all. However, you might be able to get something in the file generated by mpileup.

        Rather than generating a BCF-formatted file with mpileup (as outlined on the man page for mpileup), have you tried running mpileup without the -u and -g options? The output looks a whole lot like the output from the old pileup command.

        --DG
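As a rough illustration of the pileup-like text output, the sketch below splits one line into its six tab-separated columns: chromosome, 1-based position, reference base, depth, read bases, and base qualities. The line itself and the function name `describe_pileup_line` are invented for the example; real output comes from running mpileup without -u and -g on your own data.

```shell
# Hypothetical helper: label the six columns of one text pileup line.
describe_pileup_line() {
  printf '%s\n' "$1" | awk -F '\t' '{
    printf "%s:%s ref=%s depth=%s bases=%s quals=%s\n", $1, $2, $3, $4, $5, $6
  }'
}

# Made-up line: 4 reads cover chr1 position 100, reference base A.
# In the read-bases column, "." and "," are matches on the forward and
# reverse strand; "G" and "g" are mismatches on forward and reverse.
describe_pileup_line "$(printf 'chr1\t100\tA\t4\t.,Gg\tII5+')"
# prints: chr1:100 ref=A depth=4 bases=.,Gg quals=II5+
```

The fifth and sixth columns are where the per-read nucleotides and their base qualities appear.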


        • #5
          I've tried to run simply

          Code:
          samtools mpileup -f ref.fasta -b bam > out
          but I get "Segmentation Fault" almost immediately.
          @dagarfield
          Any ideas on what is going on? I thought I'd drop you a question here because you said you have run it without the -u and -g options.

          Thanks,
          Gareth


          • #6
            I just ran mine with the following syntax

            Code:
            samtools mpileup -f myGenome.fasta myBam.bam > myoutput.txt
            Where myGenome.fasta is a fasta file on which I have run the command

            Code:
            samtools faidx myGenome.fasta
            In the same directory in which myGenome.fasta lives.

            This looks pretty much like what you did except for the -b option. I think you can (and maybe should) leave that out when you are running just a single BAM file. For the -b option, you'd specify not a BAM file but rather a file that contains a list of the BAM files you want to analyze.

            How'd that work?
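For the multi-BAM case, a sketch of what the -b option expects might look like this: a plain text file with one BAM path per line, passed to -b in place of the BAM files themselves. The file names (`bam_list.txt`, `sample1.bam`, `sample2.bam`) and the helper `write_bam_list` are placeholders, and the samtools call is guarded so the snippet does nothing harmful if samtools or the inputs are missing.

```shell
# Hypothetical helper: write the given BAM paths to a list file, one per line.
write_bam_list() {
  list_file=$1
  shift
  printf '%s\n' "$@" > "$list_file"
}

# Placeholder names; replace with your own BAM files.
write_bam_list bam_list.txt sample1.bam sample2.bam

# Only attempt the real call if samtools and the inputs are present.
if command -v samtools >/dev/null 2>&1 \
   && [ -f myGenome.fasta ] && [ -f sample1.bam ] && [ -f sample2.bam ]; then
  samtools mpileup -f myGenome.fasta -b bam_list.txt > myoutput.txt
fi
```

Passing a BAM file directly to -b, as in post #5, makes samtools try to read the BAM as if it were this list.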


            • #7
              You are right about the -b parameter taking a list of BAM files rather than a BAM file itself. I think it's time for some coffee.

              Thanks!
