  • Genomic coordinates to gene names

    Hi All,

    I'm trying to get a list of gene names (preferably HUGO names) for 90,000 genomic coordinates (BED file). I'm very confused by BioMart's API, and Ensembl's interface is taking hours. I've spent hours on UCSC and can't see any option to retrieve this information. Any help with another method to achieve this would be appreciated.

    L

  • #2
    I'm not sure if they are HUGO names, but it seems like the refFlat table in the UCSC Table Browser will get you there.

    Click "define regions", paste in your BED file, and get output.

    Edit: it looks like this is limited to 1,000 entries.


    • #3
      You should be able to get the entire refFlat file from UCSC's table browser. That file will include the RefSeq IDs, start and end positions of the gene, and the gene name.

      I think their gene name is the HUGO name.


      • #4
        BED Tools for comparing genomic intervals

        Hi,
        I recently completed a new suite of BED Tools for addressing such questions.

        They are available for 64-bit LINUX and Intel Macs at:


        Specifically, in the case of your question, you would download RefSeq (not sure if they are HUGO names) from the UCSC Table browser.

        Then run intersectBed -a <yourfile> -b refSeqFromUCSC.bed -wb

        The -wb option will write the entire RefSeq entry so that you can track the name associated with each overlap.

        If you have further questions, just shout. Nicely.
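For readers who want to see the idea behind the overlap step, here is a rough Python sketch. It is not how BED Tools is implemented; it is a naive per-chromosome scan that assumes tab-separated BED input with the gene name in the fourth column of the annotation file. The function names `parse_bed` and `annotate` are made up for illustration.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, extra_fields) from tab-separated BED lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2]), fields[3:]


def annotate(query_lines, gene_lines):
    """Return (chrom, start, end, [gene names]) for each query interval.

    Assumption: the annotation BED keeps the gene name in column 4.
    """
    genes = defaultdict(list)
    for chrom, start, end, extra in parse_bed(gene_lines):
        name = extra[0] if extra else "."
        genes[chrom].append((start, end, name))

    results = []
    for chrom, start, end, _ in parse_bed(query_lines):
        # BED intervals are half-open, so two intervals overlap when each
        # one starts before the other one ends.
        hits = sorted({
            name
            for g_start, g_end, name in genes.get(chrom, ())
            if start < g_end and g_start < end
        })
        results.append((chrom, start, end, hits))
    return results
```

This scans every gene on the same chromosome for each query, so it will be much slower than intersectBed on 90,000 coordinates; it is only meant to show what "overlap" means here.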


        • #5
          Thanks guys, the refFlat file is useful; I was not aware of it.

          Thanks ECO, but yes, it's limited to 1,000 coordinates. Not the best way for 90,000 coordinates.

          Quinlana, I downloaded BED Tools and ran it from the bin folder, but I got an error message:
          ./intersectBed -a mygenomiccoordinates.bed -b genome_ucsc.bed -wb
          ERROR:
          bash: ./intersectBed: Bad CPU type in executable

          L


          • #6
            OS Type?

            Hi Layla,
            Apologies for that. What OS and processor are you using? The Linux version should work on 64-bit Red Hat and Ubuntu. Regardless, I'll post the source later today so you can compile the programs on your system. Sorry for the trouble, I just finished testing all of these tools yesterday and they work on all of our systems. However, I haven't been diligent about trying them out for every Linux flavor.

            Best, Aaron
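If it helps when answering the OS and processor question, a small Python sketch like the one below can report what the machine says about itself. The function name `describe_system` is made up for illustration, and note that `python_bits` reflects how the Python interpreter was built, which is not necessarily the same as what the CPU supports.

```python
import platform
import struct


def describe_system():
    """Collect basic OS and architecture details to include in a bug report."""
    return {
        "os": platform.system(),          # e.g. "Darwin" or "Linux"
        "os_version": platform.release(),
        "machine": platform.machine(),    # architecture string reported by the OS
        # Pointer size of this Python build, in bits (an interpreter property).
        "python_bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    for key, value in describe_system().items():
        print(f"{key}: {value}")
```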


            • #7
              Hi Aaron,

              No worries, thank you for the help!

              My machine is a Mac running OS X version 10.4.11.
              Processor: 2.4 GHz Intel Core 2 Duo

              Cheers!
              L


              • #8
                Gotcha. I believe the Core Duo processors are 32-bit. Email me at aaronquinlan [at] gmail and I'll send you a pre-compiled version for your machine.


                • #9
                  Hi

                  I am still having problems using the refFlat file with BED Tools. I downloaded the refFlat.txt file for hg18, but this file is not in BED format. Is there a command-line tool that simply adds the gene symbol to my input file, which is in BED format ("chr", "start", "end")? If this question is redundant, please excuse me and point me to the right page, so I can follow the instructions step by step and annotate my BED file with gene symbols.

                  Thanks
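One possible way to get refFlat.txt into BED format is a short Python sketch like the one below. It assumes the standard UCSC refFlat column order (geneName, name, chrom, strand, txStart, txEnd, followed by the CDS and exon columns) and tab-separated input; the function name `refflat_to_bed` and the file names in the usage comment are made up for illustration.

```python
def refflat_to_bed(lines):
    """Convert refFlat rows to 6-column BED rows.

    Output columns: chrom, start, end, gene name, score (fixed "0"), strand.
    refFlat already uses 0-based, half-open coordinates like BED, so the
    transcript start and end are copied unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        gene_name = fields[0]
        chrom, strand = fields[2], fields[3]
        tx_start, tx_end = fields[4], fields[5]
        yield "\t".join([chrom, tx_start, tx_end, gene_name, "0", strand])


# Example usage (hypothetical file names):
# with open("refFlat.txt") as src, open("refFlat.bed", "w") as dst:
#     for row in refflat_to_bed(src):
#         dst.write(row + "\n")
```

The resulting file could then be passed as the -b file to the intersectBed command from post #4, so that the gene symbol in column 4 is written alongside each overlap.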
