ankit4035 01-09-2019 09:55 PM

SNP calling using samtools
Hi all,

I have paired-end RNA-seq data from Illumina, and I am using BWA-MEM for mapping and samtools for SNP calling. I would like your opinion on the following.

1. What are the criteria for calling a variant? Specifically, at a particular position, how many reads must carry the variation for it to be called as a variant?
I want to call a variant at a position only if it is supported by more than 20% of the reads at that position.

What parameter can I use to achieve this?

Any help is appreciated!!

SNPsaurus 01-10-2019 06:32 AM

You can filter your VCF with vcftools to keep the variants that meet your minor allele threshold.

From the docs:
--non-ref-af <float>
--max-non-ref-af <float>
--non-ref-ac <integer>
--max-non-ref-ac <integer>

--non-ref-af-any <float>
--max-non-ref-af-any <float>
--non-ref-ac-any <integer>
--max-non-ref-ac-any <integer>

Include only sites with all Non-Reference (ALT) Allele Frequencies (af) or Counts (ac) within the range specified, and including the specified value. The default options require all alleles to meet the specified criteria, whereas the options appended with "any" require only one allele to meet the criteria. The Allele frequency is defined as the number of times an allele appears over all individuals at that site, divided by the total number of non-missing alleles at that site.
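To make the quoted definition concrete, here is a minimal Python sketch of how the non-reference allele frequency could be computed from the genotype (GT) strings at one site, and how the default filter differs from the "any" variant. The function names are hypothetical and this is not vcftools' own code. Treating a site with no called genotypes as failing the filter is an assumption. Note that, as in the definition above, the frequency is counted over genotype alleles across individuals, not over reads.

```python
import re
from typing import List, Optional


def non_ref_allele_frequencies(genotypes: List[str], n_alt: int) -> Optional[List[float]]:
    """Return the frequency of each ALT allele at one site.

    genotypes: GT strings such as "0/1", "1|1" or "./." (one per individual).
    n_alt: number of ALT alleles listed for the site.
    Returns None if every allele at the site is missing.
    """
    counts = [0] * (n_alt + 1)  # index 0 = REF, 1..n_alt = ALT alleles
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":  # missing alleles are excluded from the total
                continue
            counts[int(allele)] += 1
    total = sum(counts)
    if total == 0:
        return None
    return [c / total for c in counts[1:]]


def passes_min_non_ref_af(genotypes: List[str], n_alt: int,
                          min_af: float, any_allele: bool = False) -> bool:
    """Mimic the idea behind --non-ref-af (all ALT alleles must pass)
    and --non-ref-af-any (at least one ALT allele must pass)."""
    freqs = non_ref_allele_frequencies(genotypes, n_alt)
    if freqs is None:
        return False  # assumption: sites with no called genotypes are dropped
    check = any if any_allele else all
    return check(f >= min_af for f in freqs)


if __name__ == "__main__":
    site = ["0/1", "0/0", "./.", "1/1"]
    print(non_ref_allele_frequencies(site, n_alt=1))
    print(passes_min_non_ref_af(site, n_alt=1, min_af=0.2))
```

In the example site, three of the six non-missing alleles are ALT, so the site would pass a 0.2 threshold.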

ankit4035 01-10-2019 07:07 PM

Thanks for the help.

I was thinking that I could restrict this during SNP calling rather than after the VCF file was generated. How dumb of me.

