I remember that earlier versions of the reference genome (at least hg18) lacked some genes because closely located, repeated 'genes' were collapsed into a single prototype sequence.
I am wondering to what extent this is still true in the hg19/B37 builds, and how many genes or regions might be affected by this artifactual simplification. Please correct me if this is wrong, as I am not 100% certain.
This question aims to inform, and perhaps warn, people performing NGS that their reference might not match the true genome of the cell types they analyze.
Thanks in advance to immunologists or other specialists familiar with the status of the MHC regions and similar hypervariable loci (Ig genes, TCR...) for their insights.
Cheers,