Hello all,
I have a mystery sequence that I was hoping y'all could help me with.
I've assembled a transcriptome from both Illumina and 454 datasets of the same plant sample. My approach to functional annotation has been BLAST searches against nr and RefSeq, plus HMMER searches against InterPro.
After performing expression analysis, I found that my 3rd most highly expressed contig, along with a couple of other very highly expressed ones, was unannotated.
This 1200 bp sequence is found nearly identical in both my Illumina and 454 assemblies. It has no hits against nt, nr, RefSeq, CDD or InterPro. It also seems to appear as the linking part of a chimeric 2600 bp assembly (kinase - unannotated mystery seq - transferase).
Could anyone suggest ideas? I'm at a loss. Thanks very much!!
>c1
gttttatattagaattgacaaaaacataataataaaaaggttgtgtaactaaagatggcacttattgaaacaacattggctccgaatattactaaccataaccacaaccacggcgtttaagtggtgaacacagtataggtagaaacaaaatccataacatagcactaggaaaccctagaaaaacagggagacagagatgatgatcctctcctctccaaaagcattaccatccacactccagtcaccatcggtgctcaaaattattttactttctttcwsttaattgcctttgttcatcctcaactctctctttcctcccttttcttgcaattcaaattagtattccaatggccactgctgaggttgtatctgcagcgactgcattgcaagagaaagacacaactcatgaggaattgaacaagagcccagttgttgatgagaccaaggaagagaagccaacagaagaagtggtgacaccaccacccacatcagaagaggtcaaggaagacaaggctgatgcttcaatagaggaaccagtagccactacagatcaagcagaagccactgctgaagaagagaaagcagaggaggcacaagttgaggaggtaaaggaaacaaaggattcagttgaagaggagaaagcagtggttgaggagactaaagaagaggaatcaaaagaagataaggttagtactcctgaaccagtagcacctgaagagaagacccacgaaactacacctactactactaaagatgttagtgagagtactgttgaagcagaagagaaagttgttcaatcagagataccagttgaggaagccaaagcaacagaagcagaagagaaagttgttgccgcagagactactccagtagttgagaaggctgaggagtagatttgatctgtgttccccaggattctgattgtctggccagctagtagtgctggatgtttgtgtagtgggtgttttatatagaagcttgcatgttaagagttgatgagttttaatgtagtaaaagaatctatgtaggttatgaggcttggaagttagtatattaacttggtattttcttgggcccaccaagtaccattctgccatgtggcaaaggtccaatgccccaaaaatatgtttcaaattcccaaagcttttgtttggtggagggacactctctctattgcagtgtggagggttgtgtactggttggtctcgtgtactgtggatvdwcatsrsagttgcagcwacatyyraagccaaagagtaatcacagattttaaaagtaagtgttgggccctg
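One check that does not depend on database hits is a six-frame ORF scan, to see whether the contig has coding potential at all. The following is only a minimal sketch using the Python standard library. It assumes the standard genetic code, reports any codon containing an IUPAC ambiguity code (such as the w, s, r and y positions in c1) as X, and treats an ORF as running from the first M to the next stop or to the end of the frame. The function names are illustrative and not from any particular package.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Complement table covering the IUPAC ambiguity codes as well as A/C/G/T.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, in upper case."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons not in the table become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(seq):
    """Return (protein, strand, frame) for the longest M-initiated ORF."""
    best = ("", "+", 0)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            # Each stop-delimited segment may contain one candidate ORF.
            for segment in translate(s[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best[0]):
                    best = (segment[start:], strand, frame)
    return best
```

Calling `longest_orf()` on the c1 sequence returns the longest candidate peptide with its strand and frame. That peptide could then be used for a protein-level search, which can sometimes pick up distant similarity that a nucleotide search misses.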