I want to build my own repeat library for RepeatMasker.
I plan to base it on the Plant Repeat Databases at Michigan State University (http://plantrepeats.plantbiology.msu.edu/downloads.html). How do I get from the downloaded files to a library that RepeatMasker can use?
The files are in FASTA format, e.g.:
>ARSiRGRR00500001 gi|29468390|gb|AF497419.1| Arabidopsis suecica clone JS6-3924 5S ribosomal RNA gene, partial sequence
GACAACTTGGCAGGGATACCTTTTCGGAAAGCCCAAAGAGAGCCCTCGGACGAAAGAAGC
AGGACAATGAACTTTTCCATTGACTTTTTGTCGACTCCAAATTTTGACCTTTAAGTACTT
TTTCGGGCATTTTCGTGATTTGTGCTATATTACGGACCCAAAATTACTCTTTCAAGCATT
GTTTTCGAATATTTTTCGTGCATCAAAGCTCGTTAAGACTAGATGGGGTATCCCTACATG
GCGGGTAGGACCCACGGCGAACGGTTCATTCAAGACTTAAAAAAAGAATATATACGATTG
CATTGCATATACTAACGGCTGCGATCATACCAGCACTAATGCACCGGATCCCATCAGAAA
TCCGCAGTTAAGCGTGCTTGGGCGAGAGTAGTAACTAGGATGGGTGACCTCCCGGGAAGT
CCTCGTGTTGCAGCCCTCTTTTTTTTTTTTT
>ARSiRGRR00500002 gi|29468389|gb|AF497418.1| Arabidopsis suecica clone JS6-3925 5S ribosomal RNA gene, partial sequence
GCCAAACTTGGCATGTGATACCTTTTCGGAAAGCCCAAAGACAGCCCTCCGACGAAATAA
GCAGGACAATGGAATTTTCCATTGACTTTTTGTCGACCCCAAATTTTGACCTTTAAGTAC
TTTTTCGGGCATTTTCGTGATTTGGGCTATATTACGGACCCAAAATTACTTGTTCAAGCA
TTGTTTTCGAATTTTTTCATGCATCAAAGCTCGTTAAGACTAGATGGGGTATCCCTACAT
AGCGGGTGGGACCCACGGCGAATGGTTCATCAAGTCTTCAAAAAAGAATATATACGATTG
CATTGCATATACTAACGGATGCGATCATACCAGCACTAATGCACCGGATCCCATCAGAAC
TCCGCAGTTAAGCGTGCTTGGGCGAGAGTAGTACTAGGATGGGTGACCTCCCGGGAAGTC
CTCGTGTTGCATCCCTCTTTTATGTTTAACCTTTT
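
As far as I can tell from the RepeatMasker documentation, a custom library is a plain FASTA file passed with `-lib`, and the recommended header form is `>name#class/subclass`. Below is a minimal sketch of the header rewrite I have in mind, not a tested pipeline. The keyword-to-class table, the `classify`, `rewrite_header` and `convert` helpers, and the file names are all hypothetical; the class would really need to be derived from whatever annotation the MSU files carry.

```python
import io

# Hypothetical keyword-to-class map (an assumption for illustration only);
# extend it to cover the repeat types present in the downloaded files.
KEYWORD_TO_CLASS = {
    "ribosomal RNA": "rRNA",
    "satellite": "Satellite",
}


def classify(description):
    """Guess a repeat class from the free-text FASTA description."""
    lowered = description.lower()
    for keyword, repeat_class in KEYWORD_TO_CLASS.items():
        if keyword.lower() in lowered:
            return repeat_class
    return "Unknown"


def rewrite_header(line):
    """Turn '>ID description' into '>ID#class description'."""
    seq_id, _, description = line[1:].strip().partition(" ")
    return f">{seq_id}#{classify(description)} {description}".rstrip()


def convert(in_handle, out_handle):
    """Copy a FASTA stream, rewriting header lines and leaving sequence lines as-is."""
    for line in in_handle:
        if line.startswith(">"):
            out_handle.write(rewrite_header(line) + "\n")
        else:
            out_handle.write(line)


if __name__ == "__main__":
    # Tiny in-memory demo using a shortened record from the example above.
    demo = io.StringIO(
        ">ARSiRGRR00500001 gi|29468390|gb|AF497419.1| Arabidopsis suecica "
        "clone JS6-3924 5S ribosomal RNA gene, partial sequence\n"
        "GACAACTTGGCAGGGATACC\n"
    )
    out = io.StringIO()
    convert(demo, out)
    print(out.getvalue(), end="")
```

The converted file (say `custom_lib.fa`, a made-up name) would then be used as `RepeatMasker -lib custom_lib.fa genome.fa`. Is this the right approach, or is there more to it?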