Does anyone have suggestions for filtering mpileup results? We obtained SNPs/indels using the command lines from the mpileup website:
samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf (1)
bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf (2)
For pileup, the suggested filter is "awk '($3=="*"&&$6>=50)||($3!="*"&&$6>=20)' sample1.flt.txt > sample1.final.txt". Are similar rules needed to filter mpileup results?
For example, is the 6th column (named QUAL) in the VCF file from (2) the same as the 6th column in the pileup file, and is it appropriate to apply "$6>=20" to the VCF file too?
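To make the question concrete, here is a rough sketch of what the equivalent filter might look like on the VCF. It assumes that indel records carry the INDEL flag in the INFO column (column 8), as in samtools/bcftools output, and it simply reuses the pileup cutoffs (20 for SNPs, 50 for indels) as default values; whether those cutoffs make sense for VCF QUAL is exactly what I am unsure about. The function name filter_vcf_by_qual and the output name var.final.vcf are made up for illustration.

```shell
# Hypothetical helper (not part of samtools/bcftools): keep header lines,
# SNPs with QUAL >= snp_min, and indels with QUAL >= indel_min.
# The cutoffs are assumptions copied from the pileup awk rule.
filter_vcf_by_qual() {
  awk -F'\t' -v snp_min="${1:-20}" -v indel_min="${2:-50}" '
    /^#/ { print; next }                      # pass VCF header lines through
    $8 ~ /(^|;)INDEL(;|$)/ {                  # indel record: INFO has INDEL flag
      if ($6 + 0 >= indel_min) print
      next
    }
    $6 + 0 >= snp_min { print }               # everything else is treated as a SNP
  '
}

# Usage (reads VCF on stdin, writes filtered VCF to stdout):
# filter_vcf_by_qual 20 50 < var.flt.vcf > var.final.vcf
```

A missing QUAL value (".") evaluates to 0 in this sketch, so such records would be dropped.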
Any suggestion is appreciated.