
**GenoMax** · 12-17-2015, 04:07 AM

Generally (at least at NCBI) the ".n" refers to version number. Larger numbers indicate a newer version.

You can email UCSC genome support ([email protected]) to get confirmation, since I couldn't find an explicit link that confirms the above.

**sheen_yh** · 12-17-2015, 03:17 PM

Thank you GenoMax!
I just received a formal reply from UCSC after taking your advice to email them:
#######################################
For an ID like uc011mwn.1, the .1 represents the revision number of the transcript. When a new version of UCSC Genes is released, a transcript like uc011mwn.1 could possibly remain the same, it could become uc011mwn.2, it could receive a new transcript ID entirely or it could disappear altogether from the new version of UCSC Genes.

On hg38, the current version of UCSC Genes is version 9. The current version of UCSC Genes is always contained in the table knownGene. When version 8 was replaced by version 9, the old version 8 tables were renamed knownGeneOld8 and kgXrefOld8. The differences in transcripts between version 8 and version 9 are tracked with the table kg8ToKg9. This same schema is repeated every time UCSC Genes is updated, so on hg19, you will find several knownGeneOld# tables, several kgXrefOld# tables and several kg#ToKg# tables.

You could possibly ignore the revision number and still get a match, but that will only work if a transcript retained the same transcript ID with a new revision number (e.g., uc011mwn.1 to uc011mwn.2). For transcript IDs that changed or disappeared entirely, this will not work. Note the following:

mysql> select oldId,newId from kg8ToKg9 limit 5;
+------------+------------+
| oldId      | newId      |
+------------+------------+
| uc001aaa.3 |            |
| uc001aab.3 |            |
| uc010nxq.1 |            |
| uc001aae.4 |            |
| uc009vit.3 | uc031tla.1 |
+------------+------------+
5 rows in set (0.00 sec)

Note that the first 4 IDs in the list disappeared entirely from version 8 to version 9 and the last changed from uc009vit.3 to uc031tla.1.
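
To illustrate the "ignore the revision number" idea described above, here is a minimal Python sketch. The function names `split_kg_id` and `match_ignoring_revision` are made up for this example and are not part of any UCSC tool. It assumes IDs always look like `uc011mwn.1`, with a single dot before the revision.

```python
def split_kg_id(kg_id):
    """Split 'uc011mwn.1' into ('uc011mwn', 1).

    The revision is None when the ID has no '.n' suffix.
    Assumes at most one dot in the ID (hypothetical helper).
    """
    base, sep, rev = kg_id.partition(".")
    return base, (int(rev) if sep else None)


def match_ignoring_revision(query_id, known_ids):
    """Return every ID in known_ids that shares query_id's base name."""
    base, _ = split_kg_id(query_id)
    return [k for k in known_ids if split_kg_id(k)[0] == base]


if __name__ == "__main__":
    current_ids = ["uc011mwn.2", "uc031tla.1"]
    # Same base, new revision: a match is found.
    print(match_ignoring_revision("uc011mwn.1", current_ids))
    # ID was replaced by a different one: nothing matches.
    print(match_ignoring_revision("uc009vit.3", current_ids))
```

As the reply says, this only helps when the base ID survived. An ID such as uc009vit.3, which became uc031tla.1, will come back empty and has to be traced some other way.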

For IDs you are having problems mapping, you can try querying kgXrefOld# or you can track the ID changes from version to version through the kg#ToKg# tables. You can download the tables in their entirety from http://hgdownload.cse.ucsc.edu/golde...hg38/database/ or http://hgdownload.cse.ucsc.edu/golde...hg19/database/. You can also query the tables on our public MySQL server: https://genome.ucsc.edu/goldenPath/help/mysql.html
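
For anyone who wants to follow an ID across several releases, tracking it through the kg#ToKg# tables could look roughly like the Python sketch below. It assumes each table has already been loaded into a dict of oldId to newId (only those two columns are used, as in the query above), and that an empty newId means the transcript disappeared. `trace_id` is a hypothetical helper, and the sample dict holds just the five rows shown in the reply.

```python
# The five kg8ToKg9 rows quoted above, as oldId -> newId.
# An empty string means no new ID was assigned.
KG8_TO_KG9 = {
    "uc001aaa.3": "",
    "uc001aab.3": "",
    "uc010nxq.1": "",
    "uc001aae.4": "",
    "uc009vit.3": "uc031tla.1",
}


def trace_id(kg_id, mappings):
    """Follow kg_id through a list of oldId -> newId dicts, oldest first.

    Returns the ID in the newest version, or None if the ID is missing
    from a table or maps to an empty newId (assumed to mean it was dropped).
    """
    current = kg_id
    for mapping in mappings:
        current = mapping.get(current)
        if not current:
            return None
    return current


if __name__ == "__main__":
    print(trace_id("uc009vit.3", [KG8_TO_KG9]))  # renamed in version 9
    print(trace_id("uc001aaa.3", [KG8_TO_KG9]))  # dropped in version 9
```

On hg19, where there are several kg#ToKg# tables, the same function would take the dicts in release order, one per table.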

UCSC kgID string format
