SEQanswers

yu_chem 10-31-2016 06:46 PM

Gene identifier provided by NCBI
 
Hi everyone,

I have a question about gene identifiers.
I know that NCBI provides several gene identifiers, e.g. Entrez gene symbol, Entrez gene ID, UniGene, official gene symbol and so on, but I don't know how to use them properly.

Entrez gene symbol (e.g. POU5F1)
UniGene (e.g. Hs.249184)

First question:
I don't know what NCBI itself calls the "Entrez gene symbol".
I think that "Entrez gene symbol" is not an official term, because I could not find any NCBI document that contains "Entrez gene symbol".

Second question:
I want a file that lists the "Entrez gene symbol" together with other identifiers, e.g. RefSeq IDs (NM_****).
With such files, I could convert most identifiers to an Entrez gene symbol by going through another identifier.

I hope you can answer these two questions.
Best regards,

Richard Finney 11-01-2016 07:34 AM

Get this file: ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

This command will do it: wget ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz

NCBI used to promote the term "Entrez" in terms like "Entrez Gene ID", but they are apparently no longer emphasizing it. "Entrez" apparently referred to the software system used to access NCBI information.

"Gene id" or "GeneID" is the accession(?) number used by NCBI in column 2 of the file "gene_info" (mentioned earlier).

The official symbol is in the "Symbol_from_nomenclature_authority" field and the official full name is in the "Full_name_from_nomenclature_authority" field.

Example for the human TP53 gene (tax_id 9606), reading the compressed file directly:

gunzip -c gene_info.gz | grep -P "^9606\t[0-9]+\tTP53\t" | cut -f1-13
9606 7157 TP53 - BCC7|LFS1|P53|TRP53 MIM:191170|HGNC:HGNC:11998|Ensembl:ENSG00000141510|HPRD:01859|Vega:OTTHUMG00000162125 17 17p13.1 tumor protein p53 protein-coding TP53 tumor protein p53

The NCBI GeneID is 7157 and the official (HUGO) symbol is TP53: https://www.ncbi.nlm.nih.gov/gene/?term=7157
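
As a rough sketch of how the fields in the TP53 row line up, the row can be rebuilt with tabs and matched on exact column values rather than with a pattern. This assumes the gene_info layout in which column 1 is tax_id, column 2 is GeneID, column 3 is Symbol, column 11 is Symbol_from_nomenclature_authority and column 12 is Full_name_from_nomenclature_authority; the variable names `row` and `summary` are only illustrative.

```shell
# One gene_info-style row, rebuilt with tabs from the TP53 output shown above.
row=$(printf '9606\t7157\tTP53\t-\tBCC7|LFS1|P53|TRP53\tMIM:191170|HGNC:HGNC:11998|Ensembl:ENSG00000141510|HPRD:01859|Vega:OTTHUMG00000162125\t17\t17p13.1\ttumor protein p53\tprotein-coding\tTP53\ttumor protein p53')

# Exact field comparison: tax_id (column 1) must be 9606 and GeneID (column 2) must be 7157.
# Assumed columns: 3 = Symbol, 11 = official symbol, 12 = official full name.
summary=$(printf '%s\n' "$row" | awk -F'\t' '$1 == 9606 && $2 == 7157 {
    print "GeneID=" $2 "; Symbol=" $3 "; official_symbol=" $11 "; official_full_name=" $12
}')
printf '%s\n' "$summary"
```

Matching on whole fields avoids accidental hits on other tax_ids that merely begin with 9606, or on synonyms that contain the symbol as a substring.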

"NM_" identifiers (the "RNA_nucleotide_accession.version" column) are in the file "gene2accession", available from the same place:
wget -nc ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz
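
To convert RefSeq "NM_" accessions to gene symbols, the two files can be joined on GeneID. The following is a minimal sketch, not a tested pipeline: it runs on two tiny hand-made sample files rather than the real downloads, and it assumes that in gene2accession column 2 is GeneID and column 4 is RNA_nucleotide_accession.version. The sample rows, the file names ending in `.sample`, and the variable `map` are illustrative only.

```shell
# Tiny tab-separated stand-ins for the real files (illustrative rows, first columns only).
# gene_info columns used:      1 = tax_id, 2 = GeneID, 3 = Symbol
# gene2accession columns used: 1 = tax_id, 2 = GeneID, 4 = RNA_nucleotide_accession.version (assumed)
tmp=$(mktemp -d)
printf '9606\t7157\tTP53\n9606\t5460\tPOU5F1\n' > "$tmp/gene_info.sample"
printf '9606\t7157\tREVIEWED\tNM_000546.6\n9606\t5460\tREVIEWED\tNM_002701.6\n' > "$tmp/gene2accession.sample"

# Pass 1 (gene_info): remember GeneID -> Symbol for human rows.
# Pass 2 (gene2accession): for each human NM_ accession, print accession, GeneID, Symbol.
map=$(awk -F'\t' '
    NR == FNR { if ($1 == 9606) symbol[$2] = $3; next }
    $1 == 9606 && $4 ~ /^NM_/ && ($2 in symbol) {
        sub(/\.[0-9]+$/, "", $4)          # drop the version suffix, e.g. ".6"
        print $4 "\t" $2 "\t" symbol[$2]
    }
' "$tmp/gene_info.sample" "$tmp/gene2accession.sample" | sort -u)

printf '%s\n' "$map"
rm -r "$tmp"
```

With the real downloads, decompress both files first (for example with gunzip) and pass them to awk in the same order. The `$1 == 9606` test also skips the header line, and `sort -u` collapses accessions that appear on more than one row.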

yu_chem 11-03-2016 09:42 AM

Thank you for the answer.
I now understand the Entrez ID, and I have checked gene_info.

If possible, I hope you can answer one more question.
Is the Entrez gene symbol the official gene symbol provided by HGNC and MGI?

That is all of my questions.
Best regards,

GenoMax 11-03-2016 09:53 AM

Quote:

Originally Posted by yu_chem (Post 200665)

If possible, I hope you can answer one more question.
Is the Entrez gene symbol the official gene symbol provided by HGNC and MGI?

Best regards,

It should be for human genes because of this.

Even though there is a separate committee for mouse, it appears that gene name assignment for many vertebrates is moving under a new committee, VGNC.

yu_chem 11-05-2016 08:33 AM

Quote:

Originally Posted by GenoMax (Post 200666)
It should be for human genes because of this.

Even though there is a separate committee for mouse, it appears that gene name assignment for many vertebrates is moving under a new committee, VGNC.

Thank you for the answer.
I understand now. I'll look up more information based on the above.


All times are GMT -8.