Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Samtools mpileup on multiple samples

    Hi,
    I have multiple samples from which I would like to call SNP. I've been trying either to call them separately or all at once

    samtools mpileup -uf hg19.fa sample1.bam | bcftools view -bvcg - > out.bcf
    or
    samtools mpileup -uf hg19.fa sample1.bam sample2.bam | bcftools view -bvcg - > out.bcf

    In the first case, I can get coverage (DP) at each position where a SNP is encountered, but only for the sample that has the SNP. That is, I would like to get the coverage at that position in all samples, but I could not.

    In the second case, for SNP I know are present in both samples, I found only 1 value for DP, which I assume is the combined coverage of both samples (see below).
    Is there any way I can get DP and DP4 for all samples even if the SNP is only in one?

    chr1 15211 . T G 51 . DP=16;AF1=1;CI95=0.8333,1;DP4=3,0,10,3;MQ=30;FQ=-37.5;PV4=1,1.3e-05,1,1 GT:PL:GQ 1/1:46,9,0:21 1/1:23,9,0:21

    Second question, I've been having a hard time trying to annotate the SNP calls in the vcf file (produced by bcftools view) with its rsID. I tried vcftools

    >bgzip UCSC_SNP_v132
    >zcat input.vcf.gz | vcftools_0.1.5/bin/fill-rsIDs -r UCSC_SNP_v132.gz | tabix-0.2.5/bgzip -c > out


    but kept getting error below which suggests to me that I'm not using the right format for the input dbSNP. I used the BED format downloaded from UCSC. What am I doing wrong here?

    tabix ./UCSC_SNP_v132.gz chr1 2>&1 |: No such file or directory at vcftools_0.1.5/bin/fill-rsIDs line 20
    main::error('tabix ./UCSC_SNP_v132.gz chr1 2>&1 |: No such file or directory') called at vcftools_0.1.5/bin/fill-rsIDs line 93
    main::fill_rsids('HASH(0x169602a0)', './UCSC_SNP_v132.gz') called at vcftools_0.1.5/bin/fill-rsIDs line 11

    Thanks!!

  • #2
    Regarding your first question: The -D option keeps per sample read depth with mpileup.

    Comment


    • #3
      Thanks, DerSeb!

      Regarding my 2nd question, I realized that error was because tabix (called inside fill-rsIDs) was not on my path.

      Comment


      • #4
        how to annotate VCF files created by bcftools view?

        I have 1 big vcf files which compares 12 samples by using samtools mpileup. And I would like annotate them? my reference genome is Hg19. How can annotate VCf files ?

        Could you write commands and set up and where to download vcftools?

        Thanks in advance.

        Are you using only vcftools? What about ANNOVAR?

        Comment


        • #5
          No, here's no way to run multiple sample .bams together and get the DP4 of each sample. I Which is a pity, since that would be useful. What you get instead are other stats, which can give you an idea of how good the SNP is in each sample.

          Comment


          • #6
            For annotating the VCF file you can use the Ensembl Variant Effect Predictor. It is very helpful and can also produce SIFT, PolyPhen and Condel consensus.


            I would recommend using the Perl script with a pre-built cache. http://ensembl.org/info/docs/variati...ep_script.html

            Here is the command that I use to annotate on a Mac:
            perl variant_effect_predictor.pl --species homo_sapiens -i SNP.vcf --format vcf -o SNP_effects.txt --sift b --polyphen b --condel b --gene --hgnc --canonical --regulatory --coding_only --no_intergenic --cache --compress gzcat --check_existing
            Variant Effect Predictor can also be used to check the SNPs against the 1000 Genomes and HapMap projects but I was not able to get it running. It was always throwing an error.

            --Dinesh
            Last edited by DineshCyanam; 12-28-2011, 01:26 PM.

            Comment


            • #7
              Thank you. Sorry for my late. I will try them.

              Comment

              Latest Articles

              Collapse

              • seqadmin
                Strategies for Sequencing Challenging Samples
                by seqadmin


                Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                03-22-2024, 06:39 AM
              • seqadmin
                Techniques and Challenges in Conservation Genomics
                by seqadmin



                The field of conservation genomics centers on applying genomics technologies in support of conservation efforts and the preservation of biodiversity. This article features interviews with two researchers who showcase their innovative work and highlight the current state and future of conservation genomics.

                Avian Conservation
                Matthew DeSaix, a recent doctoral graduate from Kristen Ruegg’s lab at The University of Colorado, shared that most of his research...
                03-08-2024, 10:41 AM

              ad_right_rmr

              Collapse

              News

              Collapse

              Topics Statistics Last Post
              Started by seqadmin, 03-27-2024, 06:37 PM
              0 responses
              13 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 03-27-2024, 06:07 PM
              0 responses
              12 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 03-22-2024, 10:03 AM
              0 responses
              53 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 03-21-2024, 07:32 AM
              0 responses
              69 views
              0 likes
              Last Post seqadmin  
              Working...
              X