Hi all,
I've read through the samtools manual several times, but I'm still not clear on how exactly samtools & bcftools decide to call a SNP. I've tried multiple combinations of arguments with mpileup (-B, -C, -q, etc.) & bcftools, but I still run into the problem below. I even ran bcftools view on the bcf file without the varFilter step, but the problem persists.
I have 2 samples, an original & an "evolved" cell line. Across numerous runs, I found that many SNPs are called only in the "evolved" cell line but not in the original, making them look like "novel" SNPs. However, when I view them in IGV, I can see the SNP in the original cell line as well, and there don't seem to be significant differences in mapping quality or base quality at the SNP position between the 2 samples.
It's not important to me if reads below a certain mapping quality don't get counted; the trouble is that the behavior seems inconsistent. In one sample, SNPs on reads with a mapping quality of 0 don't get counted, but in the other sample they do, which makes it difficult to identify the truly novel SNPs. Is there any way to force the SNP counting to be more consistent?
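To make concrete what I mean by "consistent", here is a minimal Python sketch of the behavior I'm hoping for: the same mapping-quality and base-quality cutoffs applied to both samples before any allele is counted. This is only an illustration, not how samtools or bcftools actually call SNPs. The function names, the (base, mapq, baseq) tuples, the cutoff values, and the toy reads are all made up for the example.

```python
from collections import Counter

def count_alleles(reads, min_mapq=1, min_baseq=13):
    """Count bases at one position, using only reads that pass both cutoffs.

    `reads` is a list of (base, mapq, baseq) tuples (hypothetical layout).
    """
    counts = Counter()
    for base, mapq, baseq in reads:
        if mapq >= min_mapq and baseq >= min_baseq:
            counts[base] += 1
    return counts

def is_novel(original_reads, evolved_reads, alt,
             min_mapq=1, min_baseq=13, min_alt=2):
    """True if `alt` is supported in the evolved sample but absent from the
    original, with identical cutoffs applied to both samples."""
    orig = count_alleles(original_reads, min_mapq, min_baseq)
    evo = count_alleles(evolved_reads, min_mapq, min_baseq)
    return evo[alt] >= min_alt and orig[alt] == 0

# Toy data: the original sample carries G only on MAPQ-0 reads.
original = [("A", 60, 30), ("G", 0, 30), ("G", 0, 30), ("A", 60, 30)]
evolved = [("A", 60, 30), ("G", 0, 30), ("G", 0, 30),
           ("G", 60, 30), ("G", 60, 30)]

print(is_novel(original, evolved, "G"))              # MAPQ-0 reads ignored in both
print(is_novel(original, evolved, "G", min_mapq=0))  # MAPQ-0 reads counted in both
```

Either cutoff would be fine with me, as long as the same rule is applied to both samples.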
-Ann