  • What kind of gene symbol do annotation files (refFlat and GTF) contain?

    Hi everyone,

    I would like to ask what kind of gene symbol the refFlat and GTF files contain.
    I downloaded the human genome reference (hg19) and its associated files from iGenomes, and I'm going to use the refFlat.txt and genes.gtf files. However, I don't know what kind of gene symbol they contain (e.g. POU5F1). I searched the documentation associated with these files, but I couldn't find anything that explains the symbols.

    Is it the Entrez Gene symbol?
    If possible, please point me to a document that answers this.

    Please help me.
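
One way to see which symbols the two files actually carry is to pull the gene-name fields out and compare them against a known gene. The following is a minimal sketch, assuming the usual layouts: a GTF line keeps `key "value";` pairs in its ninth tab-separated column (with a `gene_name` key, if the file provides one), and a refFlat line has the gene name in its first column. The helper names and the two sample lines are hypothetical, with made-up coordinates, and are not copied from the iGenomes files.

```python
import re

# Matches GTF attribute pairs of the form: key "value"
GTF_ATTR = re.compile(r'(\S+) "([^"]*)"')

def gtf_gene_name(line):
    """Return the gene_name attribute of one GTF line, or None if absent."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return None  # not a complete 9-column GTF record
    attrs = dict(GTF_ATTR.findall(fields[8]))
    return attrs.get("gene_name")

def refflat_gene_name(line):
    """Return the first column (assumed to be the gene name) of a refFlat line."""
    return line.split("\t", 1)[0]

# Illustrative sample lines only; coordinates are placeholders.
gtf_line = ('chr6\tunknown\texon\t100\t200\t.\t-\t.\t'
            'gene_id "POU5F1"; gene_name "POU5F1"; transcript_id "NM_002701";')
refflat_line = "POU5F1\tNM_002701\tchr6\t-\t100\t200"

print(gtf_gene_name(gtf_line))
print(refflat_gene_name(refflat_line))
```

Running the same two helpers over the real refFlat.txt and genes.gtf and collecting the distinct names would show whether the two files use the same symbols.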

  • #2
    I think iGenomes gives you the option of downloading the genomes and annotations from different sources. The iGenomes download page shows that, for the human annotation, you can download from Ensembl, NCBI or UCSC. Since you specifically said that you downloaded hg19, I'm assuming that you downloaded the annotation from UCSC and not one of the other two sources, but you should check that and let us know.

    That being said, the gene symbol 'POU5F1' is not the gene nomenclature used by UCSC. It looks like the approved symbol provided by the HUGO Gene Nomenclature Committee (HGNC), and you can find the page for that gene here: http://www.genenames.org/cgi-bin/gen...t?hgnc_id=9221. I've also heard this type of gene symbol referred to as the 'external gene name' (biomaRt) or just the 'common gene name'.

    You should be able to find all of these gene symbols at HGNC or GeneCards (for the human genes), but it is also easy to translate them into Entrez gene symbols (usually a number code, for example POU5F1 = entrez# 5460), Ensembl symbols (ENSG00000204531) or UCSC symbols (uc003nsv.4).

    • #3
      Thank you for your answer. Is 5460 the Entrez Gene ID and not the symbol?
      Thank you for your answer. I understand that POU5F1 is provided by HGNC (I checked this page: https://www.ncbi.nlm.nih.gov/gene/5460). I'll check the refFlat and GTF files tomorrow.
      You told me that 5460 is the Entrez gene symbol. However, on the two pages below, Pou5f1 is used as the Entrez gene symbol.

      If you know why Pou5f1 is used as the Entrez gene symbol, please tell me. I think that 5460 is the Entrez Gene ID and that a different name exists as the Entrez gene symbol.

      I'm going to use the wPGSA web site, so I need to convert the gene names (contained in the Cufflinks result file) to Entrez gene symbols.

      wPGSA is a method to estimate the relative activities of transcriptional regulators from given transcriptome data.

      • #4
        You're right, 5460 is the Entrez Gene ID, while Pou5f1 is the corresponding gene symbol used by Entrez Gene for this gene. From the wPGSA web page, it looks like they want the gene symbol.
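
The ID versus symbol distinction is easier to keep straight when each kind of identifier gets its own field. Below is a minimal sketch using only the POU5F1 values quoted earlier in this thread; the `GeneRecord` class, its field names, and `build_index` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneRecord:
    symbol: str      # gene symbol, e.g. the HGNC-approved one
    entrez_id: int   # numeric Entrez Gene ID (an ID, not a symbol)
    ensembl_id: str  # Ensembl gene ID
    ucsc_id: str     # UCSC identifier as quoted in the thread

# Values taken from the posts above.
POU5F1 = GeneRecord("POU5F1", 5460, "ENSG00000204531", "uc003nsv.4")

def build_index(records):
    """Map every identifier (as a string) to its record."""
    index = {}
    for rec in records:
        for key in (rec.symbol, str(rec.entrez_id), rec.ensembl_id, rec.ucsc_id):
            index[key] = rec
    return index

index = build_index([POU5F1])
print(index["5460"].symbol)
print(index["ENSG00000204531"].entrez_id)
```

A real conversion would fill the table from a downloaded mapping file rather than a hand-written record.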

        • #5
          Thank you for your answer. Aren't all Entrez Gene symbols the same as the HGNC symbols?
          Thank you for your answer. Aren't all Entrez Gene symbols the same as the HGNC symbols?
          Although I can tell what kind of ID something is (e.g. NM_***, ENSG***), I don't know how to tell what kind of symbol it is (e.g. Entrez Gene symbol, HGNC symbol).

          Whenever I have to identify and convert symbols, I end up racking my brains.
          Please give me some advice.
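
The difference described above, that IDs can be recognised by their shape while symbols cannot, can be sketched with a few regular expressions. This is an illustrative sketch only: the patterns are assumptions based on the examples in this thread (NM_***, ENSG***, uc003nsv.4, 5460), not an exhaustive specification, and `classify` is a hypothetical helper.

```python
import re

# Ordered list of (label, pattern); the first match wins.
PATTERNS = [
    ("RefSeq mRNA accession", re.compile(r"^NM_\d+(\.\d+)?$")),
    ("Ensembl gene ID", re.compile(r"^ENSG\d{11}(\.\d+)?$")),
    ("UCSC ID", re.compile(r"^uc\d{3}[a-z]{3}\.\d+$")),
    ("numeric ID (possibly an Entrez Gene ID)", re.compile(r"^\d+$")),
]

def classify(identifier):
    """Guess the kind of identifier from its format alone."""
    for label, pattern in PATTERNS:
        if pattern.match(identifier):
            return label
    # A bare symbol such as POU5F1 carries no marker of its source,
    # so the string alone cannot say whether it came from HGNC or Entrez Gene.
    return "symbol or unknown"

for name in ["NM_002701", "ENSG00000204531", "uc003nsv.4", "5460", "POU5F1"]:
    print(name, "->", classify(name))
```

Because a symbol's format does not reveal its origin, the source of a symbol has to come from the documentation of the file or tool that produced it.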
