Hi everyone,
We have whole-genome data from a number of tumour-normal pairs and are looking at the score distributions for somatic changes. When viewed in IGV, some of the highest-scoring calls look like errors caused by read pile-up in repetitive regions. We are therefore beginning to think about masking repetitive parts of the genome using RepeatMasker or something similar.
Does anyone have any tips on how to go about this? From a brief look, it seems I could end up masking up to 50% of the genome using the least stringent criteria. Does anyone have experience that could get me started? Which programs and repeat libraries are best to use?
Thank you for your help,
Amy.