SEQanswers

Old 10-31-2016, 05:46 PM   #1
yu_chem
Member
 
Location: Japan

Join Date: Mar 2015
Posts: 22
Gene identifier provided by NCBI

Hi everyone,

I have a question about gene identifiers.
I know that NCBI provides several gene identifiers, e.g. Entrez gene symbol, Entrez Gene ID, UniGene, official gene symbol and so on, but I don't know how to use them properly.

Entrez gene symbol (e.g. POU5F1)
Unigene (Hs.249184)

First question:
I don't know what NCBI itself calls the "Entrez gene symbol".
I think "Entrez gene symbol" is not an official name, because I could not find any NCBI document that uses the term "Entrez gene symbol".

Second question:
I want a file that lists the "Entrez gene symbol" together with other identifiers, e.g. RefSeq ID (NM_****).
With such a file, I could convert most identifiers to an Entrez gene symbol by going through another identifier.

I hope you can answer these two questions.
Best regards,
Old 11-01-2016, 06:34 AM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Get this file: ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

This command will do it: wget ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

NCBI used to promote the term "Entrez" in terms like "Entrez Gene ID" ... but they are apparently no longer emphasizing this term. "Entrez" apparently referred to the software system used to access NCBI information.

"Gene id" or "GeneID" is the accession(?) number used by NCBI in column 2 of the file "gene_info" (mentioned earlier).

The official name is in the "Full_name_from_nomenclature_authority" field.

Example for the human TP53 gene (run on the decompressed file, e.g. after gunzip gene_info.gz) ...

grep -P "\tTP53\t" gene_info | grep -P "^9606\t" | cut -f1-13
9606 7157 TP53 - BCC7|LFS1|P53|TRP53 MIM:191170|HGNC:HGNC:11998|Ensembl:ENSG00000141510|HPRD:01859|Vega:OTTHUMG00000162125 17 17p13.1 tumor protein p53 protein-coding TP53 tumor protein p53

NCBI GeneID is 7157 and the official (HUGO) name is TP53: https://www.ncbi.nlm.nih.gov/gene/?term=7157
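
The same lookup can also be done by matching whole columns instead of grepping the line. This is only a minimal sketch: lookup_symbol is a made-up helper name, the column positions (tax_id in column 1, GeneID in column 2, Symbol in column 3, full name in column 12) are read off the TP53 output above, and the small sample file stands in for the real decompressed gene_info.

```shell
#!/bin/sh
# Sketch: look up a gene by exact tax_id (column 1) and Symbol (column 3).
# lookup_symbol is a hypothetical helper name, not an NCBI tool.
# $1 = tax_id, $2 = symbol, $3 = tab-separated gene_info-style file
lookup_symbol() {
    awk -F '\t' -v tax="$1" -v sym="$2" \
        '$1 == tax && $3 == sym { print $2 "\t" $3 "\t" $12 }' "$3"
}

# Tiny stand-in for the decompressed gene_info file.
# Row 1 is shortened from the TP53 output above.
# Row 2 is a dummy row (tax_id 96060) added only for this illustration.
sample=$(mktemp)
printf '9606\t7157\tTP53\t-\tBCC7|LFS1|P53|TRP53\tMIM:191170\t17\t17p13.1\ttumor protein p53\tprotein-coding\tTP53\ttumor protein p53\n' > "$sample"
printf '96060\t1\tTP53\t-\t-\t-\t-\t-\tdummy row\tunknown\t-\t-\n' >> "$sample"

# Prints GeneID, Symbol and full name for the human row only.
lookup_symbol 9606 TP53 "$sample"
rm -f "$sample"
```

Comparing column 1 as a whole field means the dummy row with tax_id 96060 is skipped, which a plain prefix match on "^9606" would not guarantee.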

"NM_" identifiers or "RNA_nucleotide_accession.version" are in the file "gene2accession", available from the same place:
wget -nc ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz
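
To get from an "NM_" accession to a gene symbol, the two files can be joined on GeneID. The following is a rough sketch rather than a tested pipeline: accession_to_symbol is a made-up helper name, the sample rows are simplified stand-ins with only the needed columns filled in, the accession NM_000546.5 is assumed here as an example TP53 transcript, and the RNA accession is assumed to sit in column 4 of gene2accession.

```shell
#!/bin/sh
# Sketch: map RefSeq mRNA accessions (NM_...) to gene symbols via GeneID.
# accession_to_symbol is a hypothetical helper name, not an NCBI tool.
# $1 = gene_info-style file, $2 = gene2accession-style file, $3 = tax_id
accession_to_symbol() {
    awk -F '\t' -v tax="$3" '
        # First file: remember Symbol (column 3) for each GeneID (column 2).
        NR == FNR { if ($1 == tax) symbol[$2] = $3; next }
        # Second file: print accession, GeneID and the stored symbol.
        $1 == tax && $4 ~ /^NM_/ && ($2 in symbol) {
            print $4 "\t" $2 "\t" symbol[$2]
        }
    ' "$1" "$2"
}

# Simplified stand-ins for the two decompressed files (illustration only).
info=$(mktemp)
acc=$(mktemp)
printf '9606\t7157\tTP53\n' > "$info"
printf '9606\t7157\tREVIEWED\tNM_000546.5\n' > "$acc"
printf '9606\t7157\t-\t-\n' >> "$acc"   # row without an RNA accession, skipped

accession_to_symbol "$info" "$acc" 9606
rm -f "$info" "$acc"
```

On the real files the same function would be called with the decompressed gene_info and gene2accession. Filtering on tax_id first keeps the lookup table small.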

Last edited by Richard Finney; 11-01-2016 at 06:53 AM.
Old 11-03-2016, 08:42 AM   #3
yu_chem
Member
 
Location: Japan

Join Date: Mar 2015
Posts: 22

Thank you for your answer.
I now understand the Entrez ID, and I have checked gene_info.

If possible, I hope you can answer one more question.
Is the Entrez gene symbol the official gene symbol provided by HGNC and MGI?

That is all of my questions.
Best regards,
Old 11-03-2016, 08:53 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Quote:
Originally Posted by yu_chem
If possible, I hope you can answer one more question.
Is the Entrez gene symbol the official gene symbol provided by HGNC and MGI?

It should be for human genes because of this.

Even though there is a separate committee for mouse, it appears that gene name assignment for many vertebrates is moving under a new committee, VGNC.
Old 11-05-2016, 07:33 AM   #5
yu_chem
Member
 
Location: Japan

Join Date: Mar 2015
Posts: 22

Thank you for your answer.
I understand. I'll look up more information based on the above.