Hi guys,
This is my first thread on this forum. I hope I'm posting this question in the correct category!
I have called variants with samtools mpileup, using:
MAPQ=20 # min map quality
BASEQ=20 # min base quality
and then filtered them by:
MINCOV=5 # min coverage
MAXCOV=30 # maximum coverage
Now I need to calculate the SNP density within the sequenced region. For that, I generated a depth file (samtools depth) and counted its positions:
awk '{nucleotides++} END {print nucleotides}' depthfile.txt
# to get the real length of the sequenced region
It should be as simple as dividing the number of SNPs by that length. However, doing that would give me a wrong SNPs/kb value, because I filtered the SNPs by quality and coverage but did not filter the positions in the depth file.
So, am I right that I should filter the depth file by coverage too? By quality as well? Or not?
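To make concrete what I mean by filtering the depth file by coverage, here is a rough sketch. It assumes a single-sample depth file with the usual three columns (chromosome, position, depth); the function names `callable_length` and `snps_per_kb` are just names I made up for illustration, and I am not sure this is the right approach.

```shell
MINCOV=5    # min coverage, same threshold as the SNP filter
MAXCOV=30   # max coverage, same threshold as the SNP filter

# Count positions whose depth (assumed to be column 3) lies within
# [MINCOV, MAXCOV]. Prints 0 if no position passes.
callable_length() {
    awk -v min="$MINCOV" -v max="$MAXCOV" \
        '$3 >= min && $3 <= max {n++} END {print n+0}' "$1"
}

# SNPs per kb, given a SNP count ($1) and a length in bp ($2).
# Prints NA when the length is zero.
snps_per_kb() {
    awk -v s="$1" -v l="$2" \
        'BEGIN { if (l > 0) printf "%.3f\n", s * 1000 / l; else print "NA" }'
}
```

I would then run something like `snps_per_kb "$NSNP" "$(callable_length depthfile.txt)"`, where `NSNP` is a placeholder for the number of SNPs left after filtering. For quality, I suppose the depth file itself would have to be generated with matching thresholds (samtools depth has `-q` for base quality and `-Q` for mapping quality), but that is exactly the part I am unsure about.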
Thanks a lot, guys.