Dear All,
We have developed a pipeline to annotate coding and long non-coding RNAs in transcriptome datasets. The pipeline is Unix-based and requires a multi-FASTA file of transcripts (nucleotides) as input. The final output is a tab-delimited table which can be filtered further based on the annotations of each transcript. Currently Annocript searches each query sequence against UniProt, Swiss-Prot, the Conserved Domain Database and Rfam. It then associates UniProt IDs with GO terms and Enzyme IDs. Finally, it estimates the longest ORF size and the coding potential to give a binary classification of whether a sequence is a potential long non-coding RNA. You can find it at
Below is the publication
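For readers who want a feel for the longest-ORF step described above, here is a minimal Python sketch. It is not the pipeline's own code: the ORF definition used here (ATG to the first in-frame stop codon, both strands scanned, stop codon counted in the length) and the function names are assumptions made for illustration, and the pipeline may define and measure ORFs differently.

```python
# Illustrative sketch only; the ORF definition below is an assumption,
# not taken from the pipeline itself.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_length(seq, both_strands=True):
    """Length (nt) of the longest ATG..stop ORF, stop codon included.

    ORFs that run off the end of the sequence without a stop codon
    are ignored in this sketch.
    """
    seq = seq.upper().replace("U", "T")
    strands = [seq, reverse_complement(seq)] if both_strands else [seq]
    best = 0
    for strand in strands:
        for frame in range(3):
            start = None  # position of the current open ATG, if any
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best
```

A transcript with a short longest ORF would then be one signal, alongside the coding-potential score and the absence of protein hits, that it may be a long non-coding RNA; any length cutoff applied to this value would be a user choice rather than something stated in this post.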