Dear All,
We are a small research group working on NGS data analysis and epigenomics. Within epigenomics, our focus is CpG island detection, and we are currently investigating methods to detect CpG islands automatically. We have the following questions and would appreciate any feedback:
1. What is the ground truth for CpG islands? The datasets we have looked at seem to provide only the locations detected by their own software (for example, EMBOSS by EBI). Clearly, these cannot serve as ground truth when we are developing newer methods. Could anyone shed light on this and suggest a good dataset with an accompanying ground truth?
2. In an automatic detection scenario, how harmful are false positives in CpG island detection?
Thank you in advance for any help you can provide.