  • kentk
    Member
    • Dec 2011
    • 17

    If pileup is deprecated, how do I get a consensus?

    I wanted to get the consensus sequence from a BAM file and came across this:
    Download SAM tools for free. SAM (Sequence Alignment/Map) is a flexible generic format for storing nucleotide sequence alignment. SAMtools provide efficient utilities on manipulating alignments in the SAM format.

    When I typed the command, it would not run because pileup has apparently been removed. samtools recommends using mpileup instead, but there's no -c switch for consensus. Is there any way to do this using mpileup, or should I just get an older version of samtools?
  • jfk
    Junior Member
    • Jul 2011
    • 5

    #2
    Originally posted by kentk:
    I wanted to get the consensus sequence from a BAM file and came across this:
    Download SAM tools for free. SAM (Sequence Alignment/Map) is a flexible generic format for storing nucleotide sequence alignment. SAMtools provide efficient utilities on manipulating alignments in the SAM format.

    When I typed the command, it would not run because pileup has apparently been removed. samtools recommends using mpileup instead, but there's no -c switch for consensus. Is there any way to do this using mpileup, or should I just get an older version of samtools?

    I ran into this problem too, as the local cluster installation of samtools is v0.1.18. The work I am repeating and checking requires pileup and the ability to generate a consensus sequence.

    Locally I have a desktop running BioLinux, and I installed v0.1.17 of samtools on that. So I can do my mapping/alignment on the cluster and then generate a consensus sequence locally.

    Feel free to PM me.

    In the end I just installed a local


    • chadn737
      Senior Member
      • Jan 2009
      • 392

      #3
      Did you read the samtools page about calling SNPs and indels with mpileup?

      Code:
      Since r865, it is possible to generate the consensus sequence with
      
          samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq


      • jfk
        Junior Member
        • Jul 2011
        • 5

        #4
        I am interested in generating a consensus sequence. Can mpileup do this? I'm not sure from the documentation that it is possible.

        mpileup is fine for variant calling and SNP detection, but it would have been useful if pileup's -c flag had been preserved in later releases of samtools.


        • lh3
          Senior Member
          • Feb 2008
          • 686

          #5
          Hasn't chadn737 already told you the answer?


          • jflowers
            Member
            • Oct 2011
            • 42

            #6
            I think chadn737 may be asking if it is possible to use the mpileup | bcftools view | vcfutils.pl pipeline to get the consensus for MULTIPLE genomes. So far, in my reading, I only see examples of consensus sequences being generated when there is one input BAM.

            Is it possible to get the consensus for multiple input BAMs?


            • chadn737
              Senior Member
              • Jan 2009
              • 392

              #7
              Originally posted by jflowers:
              I think chadn737 may be asking if it is possible to use the mpileup | bcftools view | vcfutils.pl pipeline to get the consensus for MULTIPLE genomes. So far, in my reading, I only see examples of consensus sequences being generated when there is one input BAM.

              Is it possible to get the consensus for multiple input BAMs?

              I'm not asking anything other than whether or not the original posters bothered to read the instructions. I am pointing out to those asking the question that the samtools mpileup instructions describe how to get a consensus sequence. Pileup only worked on one BAM file; mpileup allows multiple. I would think the pipeline described there would work for multiple BAM inputs, given that mpileup is designed to do exactly that.
              Last edited by chadn737; 02-28-2012, 08:26 AM.


              • jflowers
                Member
                • Oct 2011
                • 42

                #8
                Just wanted to rephrase my previous question. If, say, multiple different samples (e.g., humans) were sequenced and we wanted a separate consensus for each human, how would we generate multiple consensus sequences (one for each genome) using mpileup | bcftools view | vcfutils.pl?


                • chadn737
                  Senior Member
                  • Jan 2009
                  • 392

                  #9
                  Originally posted by jflowers:
                  Just wanted to rephrase my previous question. If, say, multiple different samples (e.g., humans) were sequenced and we wanted a separate consensus for each human, how would we generate multiple consensus sequences (one for each genome) using mpileup | bcftools view | vcfutils.pl?

                  mpileup works as well on one input BAM as it does on multiple input BAMs. That's how I typically use it: on a single BAM.

                  So just run the pipeline for each genome/BAM individually and you should get exactly what you want.
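
A minimal sketch of the "one run per BAM" approach, assuming samtools 0.1.x, bcftools, and vcfutils.pl are on the PATH. It builds the same pipeline quoted from the samtools page in post #3, once per input BAM. The helper names consensus_command and run_all and the .cns.fq output naming are hypothetical choices for illustration, not part of samtools.

```python
import shlex
import subprocess
from pathlib import Path


def consensus_command(ref_fasta, bam_path, out_dir="."):
    """Build the shell pipeline from post #3 for a single BAM file.

    The output name (<bam stem>.cns.fq) is an assumption made for this sketch.
    """
    bam = Path(bam_path)
    out_fq = Path(out_dir) / (bam.stem + ".cns.fq")
    return (
        f"samtools mpileup -uf {shlex.quote(str(ref_fasta))} {shlex.quote(str(bam))}"
        " | bcftools view -cg -"
        f" | vcfutils.pl vcf2fq > {shlex.quote(str(out_fq))}"
    )


def run_all(ref_fasta, bam_paths, out_dir="."):
    """Run the pipeline once per BAM, producing one consensus FASTQ each."""
    for bam in bam_paths:
        cmd = consensus_command(ref_fasta, bam, out_dir)
        # shell=True is needed for the pipes; note that check=True only sees
        # the exit status of the last command in the pipeline.
        subprocess.run(cmd, shell=True, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, as a dry run.
    for bam in ["sample1.bam", "sample2.bam"]:
        print(consensus_command("ref.fa", bam))
```

Each BAM is genotyped on its own here, so the calls do not borrow information across samples; that is the trade-off against a joint multi-BAM mpileup run.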


                  • jflowers
                    Member
                    • Oct 2011
                    • 42

                    #10
                    Yes, mpileup works just fine on individual genomes. However, for my application, I would like to use mpileup to call SNPs for many genomes simultaneously and then obtain a consensus sequence for EACH genome.

                    For example, I would like to run something like mpileup [options] sample1.bam sample2.bam ... sampleN.bam | bcftools view | vcfutils.pl and obtain N consensus sequences (where N is the number of input BAMs representing different sequenced individuals).

                    I have tried this, and the desired output doesn't seem to be implemented in this workflow.

                    I would appreciate any suggestions.


                    • swbarnes2
                      Senior Member
                      • May 2008
                      • 910

                      #11
                      Originally posted by jflowers:
                      Yes, mpileup works just fine on individual genomes. However, for my application, I would like to use mpileup to call SNPs for many genomes simultaneously and then obtain a consensus sequence for EACH genome.

                      For example, I would like to run something like mpileup [options] sample1.bam sample2.bam ... sampleN.bam | bcftools view | vcfutils.pl and obtain N consensus sequences (where N is the number of input BAMs representing different sequenced individuals).

                      I have tried this, and the desired output doesn't seem to be implemented in this workflow.

                      I would appreciate any suggestions.

                      So you run mpileup with all the samples (though I'm not sure that doing so helps much) and generate a VCF with a column for every sample. Why can't you just make individual VCFs from that to put through vcfutils?
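
A minimal sketch of splitting a multi-sample VCF into one VCF per sample, assuming a tab-delimited VCF whose sample columns start at column 10 and whose sample names are unique. The function name split_vcf_by_sample is hypothetical. It only slices columns: the site-level INFO column is copied unchanged into every per-sample file.

```python
def split_vcf_by_sample(lines):
    """Split multi-sample VCF lines into {sample_name: [vcf lines]}.

    Each output keeps the '##' meta lines, the first nine fixed columns,
    and the single genotype column belonging to that sample.
    """
    meta = []          # '##' header lines, shared by every output
    samples = []       # sample names, in column order
    per_sample = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta.append(line)
        elif line.startswith("#CHROM"):
            fields = line.split("\t")
            samples = fields[9:]
            for name in samples:
                per_sample[name] = meta + ["\t".join(fields[:9] + [name])]
        elif line:
            fields = line.split("\t")
            for i, name in enumerate(samples):
                record = fields[:9] + [fields[9 + i]]
                per_sample[name].append("\t".join(record))
    return per_sample


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1.bam\ts2.bam",
        "chr1\t5\t.\tA\tG\t50\t.\tAF1=0.5;FQ=30\tGT:PL:GQ\t0/1:17,0,81:20\t0/0:0,30,99:40",
    ]
    for name, vcf_lines in split_vcf_by_sample(demo).items():
        print(name)
        print("\n".join(vcf_lines))
```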


                      • kentk
                        Member
                        • Dec 2011
                        • 17

                        #12
                        Sorry chad, I didn't see that mpileup documentation. I read the wiki page FAQ instead, which seems dated. Anyway, I tried it and it worked. Thanks!


                        • jflowers
                          Member
                          • Oct 2011
                          • 42

                          #13
                          swbarnes2,
                          Thanks for the suggestion. That just might work.


                          • jflowers
                            Member
                            • Oct 2011
                            • 42

                            #14
                            Hi swbarnes2,

                            Just wanted to follow up on the issue of generating multiple consensus sequences from an mpileup run with multiple BAMs, in case someone happens upon this thread with the same question.

                            I looked more carefully into your suggestion of splitting the output VCF into many VCFs (one for each genotype) and running each through vcfutils.pl.

                            Originally posted by swbarnes2:
                            So you run mpileup with all the samples (though I'm not sure that doing so helps much) and generate a VCF with a column for every sample. Why can't you just make individual VCFs from that to put through vcfutils?

                            A good idea, but it won't work. Looking at the vcfutils.pl vcf2fq routine, this script doesn't use any information from VCF column 10 (the column with genotypes, phred likelihoods, and genotype qualities; e.g., 0/1:17,0,81:20).

                            All vcf2fq does is use the FQ and AF1 tags in the INFO column to call genotypes, and bcftools sets these once per site for the whole set of samples, not for each sample.

                            It looks like creating multiple consensus sequences from multiple BAMs would require a utility that is not currently available.
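
A rough sketch of what such a utility might do, assuming a single reference sequence, SNP records only, and a GT entry in the FORMAT column. Unlike vcf2fq, it reads each sample's own genotype from column 10 onward rather than the site-level FQ and AF1 tags. The names call_base and consensus_from_gt are hypothetical, and the sketch ignores genotype quality, depth, and indels, so it illustrates the idea rather than replacing vcf2fq.

```python
# IUPAC ambiguity codes for heterozygous two-base genotypes.
IUPAC = {
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
}


def call_base(ref, alt, gt):
    """Turn one sample's GT string (e.g. '0/1') into a consensus base."""
    alleles = [ref] + alt.split(",")
    indices = gt.replace("|", "/").split("/")
    if "." in indices:
        return "N"  # missing genotype
    bases = {alleles[int(i)].upper() for i in indices}
    if len(bases) == 1:
        return bases.pop()  # homozygous call
    return IUPAC.get(frozenset(bases), "N")


def consensus_from_gt(ref_seq, vcf_lines, sample_index=0):
    """Apply one sample's SNP genotypes to a single reference sequence.

    Positions without a VCF record keep the reference base (an assumption
    of this sketch).
    """
    seq = list(ref_seq)
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        pos, ref, alt = int(f[1]), f[3], f[4]
        # Skip indels and other multi-base alleles.
        if len(ref) != 1 or any(len(a) != 1 for a in alt.split(",")):
            continue
        fmt_keys = f[8].split(":")
        sample_vals = f[9 + sample_index].split(":")
        gt = sample_vals[fmt_keys.index("GT")]
        seq[pos - 1] = call_base(ref, alt, gt)  # VCF positions are 1-based
    return "".join(seq)
```

Calling consensus_from_gt once per sample_index on the same multi-sample VCF would give one sequence per input BAM.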

