I have recently finished my transcriptome assembly and am currently in the process of annotating ~50k transcripts. For the annotation, I thought of using two approaches:
1) BLAST the transcripts against a closely related, well-annotated species (in this case Arabidopsis), pull out the GO terms of the best hit for each transcript, and assign those GO terms to my transcripts.
2) BLAST the transcripts against the plant RefSeq database and then use the Blast2GO (b2go) pipeline to annotate them, using default parameters in the mapping and GO annotation steps.
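To make the first approach concrete, here is a minimal sketch of the best-hit GO transfer. It assumes the BLAST results are in the default 12-column tabular format (`-outfmt 6`, bitscore in the last column) and that a subject-to-GO mapping has already been built from the Arabidopsis annotation. The function names, transcript IDs, and ID-to-GO pairs are made up for illustration and are not real annotations.

```python
import csv
import io


def best_hits(blast_tab):
    """Return {query: (subject, bitscore)}, keeping the highest-bitscore hit per query.

    Assumes default BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def transfer_go(best, subject_to_go):
    """Assign each query the GO terms of its best hit (empty list if the hit has none)."""
    return {q: sorted(subject_to_go.get(s, set())) for q, (s, _) in best.items()}


# Toy input: two hits for t1, one hit for t2 (all values illustrative).
demo_blast = io.StringIO(
    "t1\tAT1G01010.1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "t1\tAT1G01020.1\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t120\n"
    "t2\tAT2G00000.1\t85.0\t100\t15\t0\t1\t100\t1\t100\t1e-40\t150\n"
)
# Hypothetical subject-to-GO mapping; AT2G00000.1 deliberately has no GO terms.
demo_subject_to_go = {
    "AT1G01010.1": {"GO:0006355", "GO:0003700"},
    "AT1G01020.1": {"GO:0016020"},
}

demo_best = best_hits(demo_blast)
demo_go = transfer_go(demo_best, demo_subject_to_go)
```

Note that a transcript can have a best hit and still end up with no GO terms, which matters when counting "annotated" transcripts later.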
Now I am wondering how to compare the two annotations and decide which is better. These are the criteria I can think of:
a) How many transcripts are annotated in the final annotation.
b) How many transcripts have GO terms associated with the "BP" (biological process) category.
c) The evidence code distribution, for hits and for sequences, of the GO terms assigned to those transcripts.
What do you think of these criteria, and what else should I be thinking about before deciding which annotation to select in the end?
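As a rough sketch of how criteria a) to c) could be computed for each annotation, assume every annotation has been flattened into (transcript, GO ID, aspect, evidence code) records, with aspects coded P/F/C as in GAF files. The record layout, the `summarize` function, and the toy data are assumptions for illustration, not output from either pipeline.

```python
from collections import Counter


def summarize(annotations, total_transcripts):
    """Summarize one annotation set.

    annotations: iterable of (transcript, go_id, aspect, evidence_code) tuples,
                 where aspect is "P" (biological process), "F", or "C".
    total_transcripts: number of transcripts in the assembly.
    """
    annotations = list(annotations)
    annotated = {t for t, _, _, _ in annotations}
    with_bp = {t for t, _, aspect, _ in annotations if aspect == "P"}
    evidence = Counter(ev for _, _, _, ev in annotations)
    return {
        "annotated": len(annotated),                                 # criterion a)
        "fraction_annotated": len(annotated) / total_transcripts,
        "with_bp": len(with_bp),                                     # criterion b)
        "evidence_codes": dict(evidence),                            # criterion c)
    }


# Toy annotation set for an assembly of 4 transcripts (values illustrative).
demo_annotations = [
    ("t1", "GO:0006355", "P", "IEA"),
    ("t1", "GO:0003700", "F", "ISS"),
    ("t2", "GO:0005634", "C", "IEA"),
]
demo_summary = summarize(demo_annotations, total_transcripts=4)
```

Running `summarize` once per approach would give two dictionaries that can be compared side by side.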