  • angerusso
    Member
    • Oct 2011
    • 47

    How to set a filter for read frequency, and HapMap exome sample results:

    Hey All,

    So far I have used only three filters in my whole-exome pipeline (aligning to hg19) on a HapMap sample. I tried it on the NA19240 HapMap sample from the paper below (Table 3), which reports ~196 variants (SNPs and INDELs).


    However, with the filters below I get ~15000 SNPs (counting only NON_SYNONYMOUS_CODING alterations) and ~500 INDELs. If you add the INDELs, the total is going to be even higher. What am I doing wrong?

    My filters are:

    1) vcfutils.pl varFilter -D1000
    2) snpEff -minQ 20 -minCoverage 30

    Could the authors have used different filters, such as variant frequency? If so, how do I set these up? Also, what are the default settings for the minimum number of reads and the variant frequency in bwa and samtools?

    Below is my pipeline:

    * bwa aln hg19.fa S375_R1.fastq > S375_1.sai
    * bwa aln hg19.fa S375_R2.fastq > S375_2.sai
    * bwa sampe hg19.fa S375_1.sai S375_2.sai S375_R1.fastq S375_R2.fastq > S375_NoIndex_L007.sam
    * samtools view -bS S375_NoIndex_L007.sam > S375_NoIndex_L007.bam
    * samtools sort S375_NoIndex_L007.bam S375_NoIndex_L007.sorted
    * Marked duplicates using Picard (output: S375_NoIndex_L007.marked.bam)
    * samtools index S375_NoIndex_L007.marked.bam
    * samtools mpileup -uf hg19.fa S375_NoIndex_L007.marked.bam | bcftools view -bvcg - > S375_NoIndex_L007.raw.bcf
    * bcftools view S375_NoIndex_L007.raw.bcf | vcfutils.pl varFilter -D1000 > S375_NoIndex_L007_var_d200.flt.vcf
    Last edited by angerusso; 02-28-2012, 06:09 PM.
  • aituka
    Member
    • Mar 2012
    • 13

    #2
    Just wondering which exome sample you are trying to process. I checked the HapMap data available on SRA, but it seems that all the samples are single-end.
    I could not find NA19240. Can you provide the link? Thanks.
    Link to the study I found in SRA: http://sra.dnanexus.com/dispatch_man..._name=download

    • Dameon
      Member
      • Dec 2011
      • 14

      #3
      From what I can tell, you are filtering only on a maximum depth of 1000. Try using a minimum depth of 5 and allowing SNP calls from anomalous read pairs (mpileup -A). I also found that turning BAQ off (mpileup -B) helps with sensitivity, as BAQ can be a bit aggressive in filtering out SNPs around indels.

      Try:
      samtools mpileup -ugBAf hg19.fa S375_NoIndex_L007.marked.bam | bcftools view -bvcg - > S375_NoIndex_L007.raw.bcf
      bcftools view S375_NoIndex_L007.raw.bcf | vcfutils.pl varFilter -d 5 > S375_NoIndex_L007_var_d200.flt.vcf
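
On the question of filtering by variant frequency, one possible approach is to post-filter the VCF on the DP4 tag that samtools/bcftools write into the INFO column (counts of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases). The following is only a rough sketch: the function name filter_by_dp4, the thresholds and the output file name are made up for illustration and are not part of samtools or vcfutils.pl.

```shell
# Hypothetical helper (not part of samtools/bcftools/vcfutils.pl):
# keep VCF records whose DP4 counts give at least MIN_DP high-quality bases
# and an alternate-allele fraction of at least MIN_AF.
# Usage: filter_by_dp4 MIN_DP MIN_AF < in.vcf > out.vcf
filter_by_dp4() {
    min_dp="$1"
    min_af="$2"
    awk -F '\t' -v min_dp="$min_dp" -v min_af="$min_af" '
        /^#/ { print; next }                  # pass header lines through unchanged
        {
            # Find the DP4=ref_fwd,ref_rev,alt_fwd,alt_rev entry in the INFO column
            n = split($8, info, ";")
            dp4 = ""
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^DP4=/) dp4 = substr(info[i], 5)
            if (dp4 == "") next               # no DP4 tag: drop the record

            split(dp4, c, ",")
            alt = c[3] + c[4]
            total = c[1] + c[2] + alt

            # Keep the record only if both thresholds are met
            if (total > 0 && total >= min_dp && alt / total >= min_af) print
        }
    '
}
```

For example, `filter_by_dp4 5 0.2 < S375_NoIndex_L007_var_d200.flt.vcf > S375_filtered.vcf` would keep sites with at least 5 high-quality bases of which at least 20% support the alternate allele. Both cutoffs are arbitrary starting points and would need tuning against the published call set.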
