Hi All,
I'm wondering if anyone can shed light on how to obtain the latest annotations for a given organism from NCBI, and more specifically how to get all of the current transcript variants that are listed as RefSeq entries.
I'm working on honeybees and I've grabbed the latest GFF flat-file annotations from ftp://ftp.ncbi.nih.gov/genomes/Apis_mellifera/GFF/, but they don't contain all of the current RefSeq transcripts.
For example, the gene cort (http://www.ncbi.nlm.nih.gov/gene/726912) has three transcript variants listed as RefSeq entries: XM_006557348.1, XM_001122629.3 and XM_006557349.1. In the GFF annotations, however, the only transcript ID is XM_001122629.2.
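A comparison like the one above can be sketched in Python. This is only a rough sketch: it assumes the GFF rows carry a `Dbxref` attribute of the form `GeneID:...,Genbank:...` and possibly a `transcript_id` attribute, which may not hold for every release. The function name `transcripts_for_gene` and the demo rows (including their sequence name and coordinates) are made up for illustration; only the accessions come from the example above.

```python
import re

# Transcript-style accessions only (XM_, XR_, NM_, NR_), so protein IDs are skipped.
TRANSCRIPT_XREF = re.compile(r"Genbank:([XN][MR]_\d+\.\d+)")


def transcripts_for_gene(gff_lines, gene_id):
    """Collect transcript accessions from GFF rows that reference GeneID:<gene_id>.

    Assumes column 9 holds ';'-separated key=value pairs and that Dbxref is a
    ','-separated list such as 'GeneID:726912,Genbank:XM_001122629.2'.
    """
    found = set()
    for line in gff_lines:
        if line.startswith("#"):
            continue  # header or directive line
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # blank or malformed row
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        xrefs = attrs.get("Dbxref", "").split(",")
        if f"GeneID:{gene_id}" not in xrefs:
            continue
        for xref in xrefs:
            match = TRANSCRIPT_XREF.fullmatch(xref)
            if match:
                found.add(match.group(1))
        if attrs.get("transcript_id"):
            found.add(attrs["transcript_id"])
    return found


# Hypothetical demo rows: sequence name and coordinates are invented.
DEMO_GFF = [
    "##gff-version 3",
    "chrDemo\tRefSeq\tgene\t100\t900\t.\t+\t.\tID=gene0;Dbxref=GeneID:726912",
    "chrDemo\tRefSeq\tmRNA\t100\t900\t.\t+\t.\t"
    "ID=rna0;Parent=gene0;Dbxref=GeneID:726912,Genbank:XM_001122629.2",
]

# Accessions listed on the gene record in the example above.
expected = {"XM_006557348.1", "XM_001122629.3", "XM_006557349.1"}
found = transcripts_for_gene(DEMO_GFF, "726912")
missing = sorted(expected - found)
print("in GFF:", sorted(found))
print("on gene record but not in GFF:", missing)
```

Run against the real GFF file (for example by passing an open file handle as `gff_lines`), this would list which accessions from the gene record are absent from the annotations.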
Is there any way to build a current set of annotations from the data NCBI uses to populate transcripts for gene records?
Thanks