Hi everyone,
My question is a basic one, but very important to understanding how sequence processing actually works.
What determines (i.e., how does a variant-calling program know) whether a specific locus in a sequenced DNA fragment should be called diploid (e.g., A/T or C/G) or haploid (e.g., C/C, T/T)?
Is it based on the reads at that locus? For example, if at least 25% of the reads at that locus are C and the remainder are T, the position is called C/T, whereas if fewer than 25% of the reads are C and more than 75% are T, the position is called T/T (the 1 or 2 Cs being discarded as sequencing errors).
Or is it based on whether the human reference is diploid or haploid at that locus, or on a combination of the two? Please give specific examples to illustrate how the reads look for a diploid vs. haploid call.
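To make the read-based option concrete, here is a minimal Python sketch of the threshold rule I have in mind. The function name `naive_genotype_call` and the `min_fraction` parameter are names I made up for illustration, and the 25% cutoff is only my guess; I am not claiming any real variant caller works this way.

```python
from collections import Counter


def naive_genotype_call(read_bases, min_fraction=0.25):
    """Guess a genotype at one locus from the bases seen in the overlapping reads.

    read_bases   -- string or list of bases, one per read covering the locus
    min_fraction -- hypothetical cutoff: an allele is kept only if at least
                    this fraction of the reads support it
    """
    counts = Counter(read_bases)
    depth = sum(counts.values())
    if depth == 0:
        return None  # no reads cover this locus, so no call

    # Keep alleles that reach the cutoff, most frequent first;
    # anything below the cutoff is treated as sequencing error.
    kept = [base for base, n in counts.most_common() if n / depth >= min_fraction]
    if not kept:
        return None  # no allele is well enough supported

    kept = sorted(kept[:2])  # at most two alleles per call
    if len(kept) == 1:
        return f"{kept[0]}/{kept[0]}"  # one allele survives, e.g. T/T
    return f"{kept[0]}/{kept[1]}"      # two alleles survive, e.g. C/T


print(naive_genotype_call("TTTTTTTTCC"))  # 8 T, 2 C (20% C)
print(naive_genotype_call("TTTTTTCCCC"))  # 6 T, 4 C (40% C)
```

Under this rule, 10 reads with 8 T and 2 C would be called T/T, while 6 T and 4 C would be called C/T. Is this roughly what happens, or is the real logic different?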
Thanks