I have some pooled sequencing data: 200 individuals split into 10 pools, so each pool contains 20 individuals.
When I check the coverage with samtools depth, most positions have a depth as high as 7000, which means more than 30x per person, but a fraction of positions has less than 10x.
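To make these numbers reproducible, here is a minimal sketch of how the per-person depth and the low-coverage fraction could be summarised from `samtools depth` output. It assumes the usual three tab-separated columns (chromosome, position, depth) for a single merged BAM, and that the reported depth is summed over all 200 individuals. The function name `summarize_depth`, the file name `depth.txt`, and the 10x cutoff on total depth are only illustrative choices, not part of any tool.

```python
def summarize_depth(lines, n_individuals=200, low_threshold=10):
    """Summarise samtools depth output given as an iterable of text lines.

    Assumed format per line: chrom <TAB> pos <TAB> depth (single BAM).
    n_individuals and low_threshold are assumptions for this data set.
    """
    depths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth = line.split("\t")[:3]
        depths.append(int(depth))

    if not depths:
        raise ValueError("no depth records found")

    depths.sort()
    n = len(depths)
    # Median is less sensitive than the mean to a few extreme peaks.
    if n % 2:
        median = depths[n // 2]
    else:
        median = (depths[n // 2 - 1] + depths[n // 2]) / 2

    low = sum(1 for d in depths if d < low_threshold)
    return {
        "positions": n,
        "median_depth": median,
        "median_per_individual": median / n_individuals,
        "fraction_low": low / n,
    }


if __name__ == "__main__":
    # Tiny inline sample; with real data, pass an open file handle instead,
    # e.g. `with open("depth.txt") as fh: summarize_depth(fh)`.
    sample = ["chr1\t1\t7000", "chr1\t2\t7000", "chr1\t3\t5", "chr1\t4\t7000"]
    print(summarize_depth(sample))
```

Note that `samtools depth` omits positions with zero coverage unless it is run with `-a`, so the low-coverage fraction may be underestimated without that flag.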
Does anyone have experience dealing with unbalanced coverage? Should I use only the positions with 30x coverage for variant calling, or include all of them? If I include all of them, what is the best way to balance the coverage? I have also read some papers that mention over-amplification; how do I check whether our data have this issue?
Attached is my plot of coverage against position for two different regions. The two blue vertical lines mark the sequenced region. Any thoughts are welcome!
Thanks a lot!