As part of a de novo assembly project, I'd like to try to identify repeat elements - everything from single gene duplications (difficult) to transposons (less difficult). The data are Illumina paired-end 101 bp reads at ~50X coverage. My (admittedly unsophisticated) approach is to assemble contigs (I'll try both de Bruijn and overlap assemblers), then flag those whose read depth is more than 2X the average.
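To make the depth-flagging idea concrete, here is a minimal sketch, assuming per-contig lengths and mean read depths have already been computed (e.g., by aligning the reads back to the contigs). The function name, the input layout, and the choice of a length-weighted average are illustrative assumptions, not part of any existing tool.

```python
def flag_high_depth_contigs(contigs, fold=2.0):
    """Return names of contigs whose mean depth exceeds `fold` x the average.

    contigs: dict mapping contig name -> (length_bp, mean_depth).
    The average is weighted by contig length, so it approximates the
    per-base depth across the whole assembly (an assumption of this sketch).
    """
    total_bases = sum(length for length, _ in contigs.values())
    if total_bases == 0:
        return []
    average_depth = (
        sum(length * depth for length, depth in contigs.values()) / total_bases
    )
    # Keep only contigs above the fold threshold; sort for stable output.
    return sorted(
        name
        for name, (_, depth) in contigs.items()
        if depth > fold * average_depth
    )


# Made-up numbers, for illustration only.
example = {
    "contig_1": (10000, 50.0),
    "contig_2": (8000, 48.0),
    "contig_3": (1500, 160.0),
    "contig_4": (500, 300.0),
}
flagged = flag_high_depth_contigs(example)
print(flagged)
```

One caveat with this sketch: collapsed repeats themselves pull the average upward, so a median over unique-looking contigs might be a more robust baseline.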
Two questions:
1) are there any tools designed for this application?
2) any suggestions for alternative strategies (e.g., candidate identification by sequence conservation, branch counting of de Bruijn graphs, etc.)?
Thanks,
Harold