Hi All
I have downloaded Daphnia pulex amino acid sequences from wFleabase.
I found this amino acid sequence:
>hxNCBI_GNO_20744
MAKETNLSGKTTAVSAGSHRPDAPNEDSPALEERKTWAKKAEFLLAVIGFAVDLGNVWRFPYICYKNGRGAFLIPYVVMMVFGVLPLFYM*LALGQFHRSGCLTPWKRI*PALKGVPFARHAICIIDFYMGMYCNTYSTYQKFGNAREKLIFLYLGGGHLQMDVWTDAASQEFFSLGPGFGTLLALSSYNKFHNNCFYDALLTSSINLATSLLAGFVIFAVLAYMAEIRNVSIDQLGLEGPPGLVFVVYPEAIATMAGSTFWSMIFFFLLITLGLDSTFGGLEAMITGLCDEYPVLLGRRRELFVGILLVFIYLCALPTTTYSGMYFVDLLNVFGPGII*
As you can see, this amino acid sequence contains multiple internal stop codons (the * characters). I am trying to find InterPro domains in this sequence, but InterProScan rejects it because of the special characters in the middle of the sequence. Is there any way to use this sequence to search for protein domains? For example, can I take only the part from the beginning of the sequence up to the first stop (the M*) and use that as my input for the domain search? Please let me know what you all think of this strategy.
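To make the strategy concrete, here is a rough Python sketch of what I have in mind. It only splits the sequence at each * and writes every piece as its own FASTA record, so it also covers the "take everything up to the first stop" case (that is simply the first record). The function name `split_at_stops`, the `min_length` cutoff, and the header format are my own assumptions, not anything required by InterProScan, and the demo sequence is a toy string rather than real data. Whether the pieces after a stop are biologically meaningful is a separate question.

```python
def split_at_stops(seq_id, sequence, min_length=1):
    """Split a protein sequence at '*' and return (header, segment) pairs.

    Headers carry 1-based start/end coordinates in the original sequence,
    so any domain hit can be mapped back. The header format is hypothetical.
    """
    records = []
    start = 0  # 0-based offset of the current segment in the original sequence
    for segment in sequence.split("*"):
        # Skip empty pieces (e.g. after a trailing '*') and very short ones.
        if segment and len(segment) >= min_length:
            first, last = start + 1, start + len(segment)
            records.append((f"{seq_id}_{first}_{last}", segment))
        start += len(segment) + 1  # move past this segment and its '*'
    return records


def to_fasta(records):
    """Format (header, segment) pairs as FASTA text."""
    return "".join(f">{header}\n{segment}\n" for header, segment in records)


if __name__ == "__main__":
    # Toy sequence for illustration only, not real data.
    toy = "MAKETNLS*LALGQ*PALK*"
    print(to_fasta(split_at_stops("toy_seq", toy, min_length=5)), end="")
```

The resulting FASTA no longer contains any * characters, so each fragment can be submitted on its own. Raising `min_length` drops fragments that are too short to contain a domain.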