Hi All,
I have a quick question. I'm doing some exome sequencing, and after all of the mapping and annotation steps, I'm starting on the downstream analysis. One thing I am looking at is which SNPs are annotated (in dbSNP, ExAC, TCGA, and other databases) versus which are not in those databases. I am finding that of ~25,000 variants called from the exome data, only ~150 are not present (i.e., unannotated/novel). This seems like a very small number. I'm not sure how many I should expect, but a few years ago when doing some exome seq, about 3-5% of all SNPs were unannotated. Out of curiosity, what are other people getting these days? Is it possible that, with the vast number of samples that have been sequenced, we are reaching the point where almost all variants have been identified, with the exception of the few very rare or private ones?
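To put the comparison in numbers, here is a minimal Python sketch using the approximate counts above. The helper name `novel_fraction` is hypothetical (just for illustration, not from any library), and the counts are the rough estimates quoted in this post.

```python
def novel_fraction(novel, total):
    """Return the share of called variants absent from the annotation databases."""
    if total <= 0:
        raise ValueError("total must be positive")
    return novel / total


total_variants = 25_000  # approximate number of variants called from the exome
novel_variants = 150     # approximate number not found in any database

rate = novel_fraction(novel_variants, total_variants)
print(f"Novel rate now: {rate:.1%}")

# Number of novel variants the older 3-5% rate would imply for the same call set
low = round(total_variants * 0.03)
high = round(total_variants * 0.05)
print(f"Novel variants expected at 3-5%: {low} to {high}")
```

So the current rate is well under 1%, several-fold lower than the 3-5% seen a few years ago.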
This is more of a curiosity thing than anything else.