Hi all,
I have read several threads here and on other bioinformatics forums about masking genome regions. Opinions range from "don't mask at all" to "always mask".
What do you think?
I'm producing my own annotation for a genome that has quite a number of repeats. My reads vary widely in length and come from many samples, so I would like to avoid as much of the background noise (which is already high) as possible.
I have three questions:
1. Would you recommend masking with RepeatMasker?
2. If yes, should I leave repeats unmasked and mask only TEs and other elements?
3. If so, is BLAT still the gold-standard tool for aligning reads to a genome with a mask file as an option?
I usually use TopHat or STAR, but I have never worked with masked regions, and apparently these aligners don't handle them.