Thread | Thread Starter | Forum | Replies | Last Post |
Problem with NCBI gene location annotations | pparg | Bioinformatics | 2 | 04-26-2016 07:07 AM |
How to download gene annotation from NCBI? | jgarbe | Bioinformatics | 9 | 01-14-2015 11:26 AM |
How to download gene annotation for influenza from NCBI? | taro.ishibash | Bioinformatics | 0 | 11-08-2012 03:21 PM |
retrieving NCBI hg19 gene length | narges | Bioinformatics | 0 | 10-11-2012 05:55 AM |
gene location on UCSC vs NCBI | nguyendofx | Bioinformatics | 2 | 01-28-2012 03:39 PM |
#1
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Hi everyone
I have a question about gene identifiers. I know that NCBI provides several gene identifiers, e.g. Entrez gene symbol, Entrez gene ID, UniGene, official gene symbol and so on, but I don't know how to use them properly.

Entrez gene symbol (e.g. POU5F1)
UniGene (Hs.249184)

First question: I don't know what NCBI means by "Entrez gene symbol". I think "Entrez gene symbol" is not an official name, because I could not find any document provided by NCBI that contains the term.

Second question: I want a file that lists the "Entrez gene symbol" together with other identifiers, e.g. RefSeq ID (NM_****). With such a file, I could convert most identifiers to an Entrez gene symbol.

I hope you can answer these two questions.
Best regards,
#2
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
Get this file: ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

This command will do it:

    wget ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

NCBI used to promote the term "Entrez" in terms like "Entrez Gene ID", but they are apparently no longer emphasizing this term. "Entrez" apparently referred to the software system used to access NCBI information. "Gene id" or "GeneID" is the accession (?) number used by NCBI; it is in column 2 of the file "gene_info" (mentioned earlier). The official symbol is in the "Symbol_from_nomenclature_authority" field, and the official full name is in the "Full_name_from_nomenclature_authority" field.

Example for the human TP53 gene:

    grep -P "\tTP53\t" gene_info | grep -P "^9606\t" | cut -f1-13

    9606 7157 TP53 - BCC7|LFS1|P53|TRP53 MIM:191170|HGNC:HGNC:11998|Ensembl:ENSG00000141510|HPRD:01859|Vega:OTTHUMG00000162125 17 17p13.1 tumor protein p53 protein-coding TP53 tumor protein p53

The NCBI GeneID is 7157 and the official (HUGO) symbol is TP53: https://www.ncbi.nlm.nih.gov/gene/?term=7157

"NM_" identifiers, or "RNA_nucleotide_accession.version", are in the file "gene2accession", available from the same place:

    wget -nc ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz

Last edited by Richard Finney; 11-01-2016 at 07:53 AM.
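To connect this to the second question (RefSeq ID to gene symbol), the two files can be joined on GeneID. The following is only a rough sketch, not an NCBI-provided tool. The function name `refseq_to_symbol` is made up for illustration, and the column positions are assumptions that should be checked against the header line of each file: tax_id, GeneID and Symbol in columns 1 to 3 of gene_info, and "RNA_nucleotide_accession.version" in column 4 of gene2accession. Both files are assumed to be already decompressed.

```shell
# Sketch: list RefSeq mRNA accessions (NM_...) with their GeneID and symbol
# for one taxon. Column positions below are assumptions; verify them
# against the header line of each file before relying on the output.
#   gene_info:      1 = tax_id, 2 = GeneID, 3 = Symbol
#   gene2accession: 1 = tax_id, 2 = GeneID, 4 = RNA_nucleotide_accession.version
refseq_to_symbol() {
    # $1 = tax_id (e.g. 9606), $2 = gene_info file, $3 = gene2accession file
    awk -F '\t' -v OFS='\t' -v tax="$1" '
        # First file (gene_info): remember GeneID -> Symbol for this taxon.
        NR == FNR { if ($1 == tax) symbol[$2] = $3; next }
        # Second file (gene2accession): keep NM_ rows whose GeneID is known
        # and print accession, GeneID, Symbol.
        $1 == tax && $4 ~ /^NM_/ && ($2 in symbol) { print $4, $2, symbol[$2] }
    ' "$2" "$3" | sort -u   # the same accession can appear on several rows
}
```

After `gunzip gene_info.gz gene2accession.gz`, it could be called as `refseq_to_symbol 9606 gene_info gene2accession > refseq2symbol_human.tsv`. The whole gene_info table is read first so that the much larger gene2accession file only needs a single pass.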
#3
Member
Location: Japan Join Date: Mar 2015
Posts: 23
I understand the Entrez ID now and have checked gene_info. If possible, I hope you can answer the following question: is the Entrez gene symbol the official gene symbol provided by HGNC and MGI? That is my last question.
Best regards,
#4
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
Even though there is a separate committee for mouse, it appears that gene name assignment for many vertebrates is moving under a new committee, the VGNC (Vertebrate Gene Nomenclature Committee).
#5
Member
Location: Japan Join Date: Mar 2015
Posts: 23
Understood. I'll look up more information based on the above.