Hi, all
I recently downloaded a GTF file from Ensembl and found that it contains both coding and non-coding transcript annotations. I only want to work with the coding annotations, so can anyone tell me how to filter the GTF file?
The problem is that the GTF contains many transcript types, and I want to know which of them are coding and which are non-coding. Here is the list of types:
Code:
antisense IG_C_pseudogene IG_J_pseudogene IG_pseudogene IG_V_pseudogene lincRNA miRNA misc_RNA Mt_rRNA Mt_tRNA nonsense_mediated_decay non_stop_decay polymorphic_pseudogene processed_pseudogene processed_transcript protein_coding pseudogene retained_intron rRNA sense_intronic sense_overlapping snoRNA snRNA transcribed_unprocessed_pseudogene unprocessed_pseudogene
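To make the question concrete, here is a minimal Python sketch of the kind of filtering I have in mind. It rests on assumptions: that the type is stored either in a `transcript_biotype` attribute (newer Ensembl GTFs) or in column 2 (older Ensembl GTFs), and that only `protein_coding` counts as coding. Whether any of the other types above should also be kept is exactly the part I am unsure about. The names `CODING_BIOTYPES`, `line_biotype`, `filter_coding` and `filter_file` are made up for this example.

```python
import re

# Assumption: only protein_coding is treated as coding here; which of the
# other types belong in this set is the open question of this post.
CODING_BIOTYPES = {"protein_coding"}

# Matches e.g.  transcript_biotype "protein_coding";  in the attribute column.
_BIOTYPE_ATTR = re.compile(r'transcript_biotype "([^"]+)"')


def line_biotype(line):
    """Return the transcript type of one GTF line, or None for malformed lines."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return None
    match = _BIOTYPE_ATTR.search(fields[8])
    if match:
        return match.group(1)
    # Assumption: in older Ensembl GTFs the type is stored in column 2.
    return fields[1]


def filter_coding(lines, keep=CODING_BIOTYPES):
    """Yield header lines plus the annotation lines whose type is in `keep`."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep comment/header lines unchanged
        elif line_biotype(line) in keep:
            yield line


def filter_file(in_path, out_path, keep=CODING_BIOTYPES):
    """Write the filtered copy of the GTF at in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_coding(src, keep))
```

It would be called as `filter_file("input.gtf", "coding_only.gtf")`, where both file names are placeholders. Lines that carry no `transcript_biotype` attribute and have a plain source name in column 2 (for example gene-level lines in newer files) are dropped by this sketch.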
I hope to receive a useful reply from you.
Thanks
wisense