  • What kind of Gene symbol do annotation files (refFlat and GTF) contain ?

    Hi everyone,

    I would like to ask which kind of gene symbol the refFlat and GTF files contain.
    I downloaded the human reference genome (hg19) and its associated files from iGenomes, and I'm going to use the refFlat.txt and genes.gtf files. However, I don't know which kind of gene symbol they contain (e.g. POU5F1). I searched the documentation that comes with these files, but I couldn't find anything that explains the symbols.

    Are they Entrez Gene symbols?
    If possible, please point me to documentation in your answer.

    Please help me.

  • #2
    I think iGenomes gives you the option of downloading the genomes and annotations from different sources. The iGenomes download page shows that, for the human annotation, you can download from Ensembl, NCBI or UCSC. Since you specifically said that you downloaded hg19, I'm assuming that you downloaded the annotation from UCSC and not from one of the other two sources, but you should check that and let us know.

    That being said, the gene symbol 'POU5F1' is not the gene nomenclature used by UCSC. It looks like the approved symbol provided by the HUGO Gene Nomenclature Committee (HGNC), and you can find the page for that gene here: http://www.genenames.org/cgi-bin/gen...t?hgnc_id=9221. I've also heard this type of gene symbol referred to as the 'external gene name' (biomaRt) or just the 'common gene name'.

    You should be able to find all of these gene symbols at HGNC or GeneCards (for human genes), but it is also easy to translate them into Entrez gene symbols (usually a number code, for example POU5F1 = Entrez# 5460), Ensembl symbols (ENSG00000204531) or UCSC symbols (uc003nsv.4).
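    One way to check which naming scheme a genes.gtf file actually uses is to read the attribute column directly. The following is a minimal Python sketch, assuming the standard nine-column, tab-separated GTF layout with attributes written as key "value"; pairs. The example line is hypothetical (made-up coordinates and transcript ID), and which attributes a particular genes.gtf carries is exactly what the check is meant to reveal.

```python
import re

# Matches one GTF attribute of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_attributes(line):
    """Return the attribute column (9th field) of one GTF line as a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF fields")
    return dict(ATTR_RE.findall(fields[8]))


# Hypothetical example line; the coordinates and transcript ID are made up.
example = (
    'chr6\tunknown\texon\t100\t200\t.\t-\t.\t'
    'gene_id "POU5F1"; gene_name "POU5F1"; transcript_id "NM_000000";'
)

attrs = parse_gtf_attributes(example)
print(attrs["gene_id"], attrs.get("gene_name"))
```

    Running the same function over the first few lines of the real file shows whether gene_id holds a symbol, an Ensembl ID or something else.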

    • #3
      Thank you for the answer. Is 5460 the Entrez Gene ID rather than the symbol?

      I understand now that POU5F1 is the symbol provided by HGNC (I checked this page: https://www.ncbi.nlm.nih.gov/gene/5460). I'll check the refFlat and GTF files tomorrow.
      You told me that 5460 is the Entrez Gene symbol. However, on the two pages below, Pou5f1 is used as the Entrez Gene symbol.

      If you know why Pou5f1 is used as the Entrez Gene symbol, please tell me. I think that 5460 is the Entrez Gene ID and that a separate name exists as the Entrez Gene symbol.

      I'm going to use the wPGSA web site, so I need to convert the gene names (contained in the Cufflinks result file) to Entrez Gene symbols.

      wPGSA is a method to estimate the relative activities of transcriptional regulators from given transcriptome data.

      • #4
        You're right, 5460 is the Entrez Gene ID, while Pou5f1 is the corresponding gene symbol used by Entrez Gene for this gene. From the wPGSA web page, it looks like they want the gene symbol.
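        For the conversion itself, a lookup table keyed on symbols is usually enough. Below is a minimal Python sketch, not a definitive recipe: the tab-separated table, its column names (GeneID, Symbol, Synonyms) and the synonym values are hypothetical placeholders, and only the POU5F1 / 5460 pair comes from this thread. Matching is case-insensitive here, which is an assumption that only makes sense within a single species, since human and mouse symbols often differ only in capitalization.

```python
import csv
import io

# Hypothetical mapping table (tab-separated). Only the POU5F1 / 5460 pair
# comes from this thread; column names and synonyms are placeholders.
TABLE = "GeneID\tSymbol\tSynonyms\n5460\tPOU5F1\tALIAS1|ALIAS2\n"


def build_lookup(handle):
    """Map every symbol and synonym (upper-cased) to (GeneID, symbol)."""
    lookup = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        entry = (row["GeneID"], row["Symbol"])
        synonyms = [s for s in row["Synonyms"].split("|") if s and s != "-"]
        for name in [row["Symbol"]] + synonyms:
            # setdefault keeps the first entry seen for a given name.
            lookup.setdefault(name.upper(), entry)
    return lookup


def convert(names, lookup):
    """Return (converted, unmatched) for a list of gene names."""
    converted, unmatched = {}, []
    for name in names:
        hit = lookup.get(name.upper())
        if hit is None:
            unmatched.append(name)
        else:
            converted[name] = hit
    return converted, unmatched


lookup = build_lookup(io.StringIO(TABLE))
converted, unmatched = convert(["POU5F1", "Pou5f1", "NOT_A_GENE"], lookup)
print(converted)
print(unmatched)
```

        Keeping the unmatched names in a separate list makes it easy to see which entries from the Cufflinks output still need to be resolved by hand.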

        • #5
          Thank you for the answer. Aren't all Entrez Gene symbols the same as the HGNC symbols?

          Although I can tell what kind of ID something is (e.g. NM_***, ENSG***), I don't know how to tell what kind of symbol it is (e.g. Entrez Gene symbol, HGNC symbol).

          Whenever I have to identify and convert symbols, I usually rack my brains.
          Please give me some advice.
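          The part that can be judged from the string alone can be sketched in Python as follows. The patterns are assumptions based on the examples quoted in this thread (ENSG00000204531, NM_***, uc003nsv.4, 5460), not an exhaustive specification. A bare symbol such as POU5F1 carries no marker of its source, so the function deliberately does not try to say whether it is an HGNC or an Entrez Gene symbol.

```python
import re

# Patterns inferred from the examples in this thread; they are assumptions,
# not a complete specification of each database's identifier format.
PATTERNS = [
    ("Ensembl gene ID", re.compile(r"^ENSG\d{11}(\.\d+)?$")),
    ("RefSeq mRNA accession", re.compile(r"^NM_\d+(\.\d+)?$")),
    ("UCSC ID", re.compile(r"^uc\d{3}[a-z]{3}\.\d+$")),
    ("Entrez Gene ID", re.compile(r"^\d+$")),
]

SYMBOL_LABEL = "gene symbol (source not determinable from the string alone)"


def classify(identifier):
    """Return a label for the identifier type, based on its format only."""
    for label, pattern in PATTERNS:
        if pattern.match(identifier):
            return label
    return SYMBOL_LABEL


for example in ["ENSG00000204531", "uc003nsv.4", "5460", "POU5F1"]:
    print(example, "->", classify(example))
```

          Anything that falls through to the last label has to be checked against the database it is supposed to come from.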
