Hi,
I have a question that is still theoretical for now; maybe someone can point me to where to start looking.
I have a yeast genome sequenced with Illumina paired-end reads, and I know it contains a repeat whose typical sequence is 6 kb long. I want to identify the different copies of the repeat (they differ somewhat from each other), but I am not sure where they are located in the genome (I have a reference, but the locations may be different in my genome).
I can easily find the border pairs (one mate inside the repeat, one outside), but what about the middle part of the repeat? If I de novo assemble only the reads where at least one mate of the pair maps to the repeat, will I get different contigs for the different repeat copies? How do people approach this problem?
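To make the selection step concrete, here is a rough sketch of what I have in mind. It assumes the reads were first aligned against the 6 kb repeat sequence alone and written out as SAM text; the function name and the reference name "repeat" are placeholders I made up, not part of any tool.

```python
def pairs_touching_repeat(sam_lines, repeat_name="repeat"):
    """Return the names of read pairs with at least one mate aligned to the repeat.

    sam_lines: iterable of SAM text lines (header lines start with '@').
    repeat_name: hypothetical name of the repeat sequence used as reference.
    """
    keep = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        # Bit 0x4 of FLAG means this read itself is unmapped. An unmapped mate
        # can still carry its partner's RNAME, so the flag must be checked too.
        if not (flag & 0x4) and rname == repeat_name:
            keep.add(qname)
    return keep
```

Both mates of every name in the returned set would then go into the de novo assembly, so the border pairs are kept together with the pairs that lie entirely inside the repeat.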
Thanks!