#1
Member
Location: London Join Date: Sep 2008
Posts: 58
Hi All,
Trying to get a list of gene names (preferably HUGO names) for 90,000 genomic coordinates (BED file). Very confused by BioMart's API. Ensembl's interface is taking hours. Spent hours on UCSC and can't see any option to retrieve this information. Any help with another method to achieve this would be appreciated.
L

#2
--Site Admin--
Location: SF Bay Area, CA, USA Join Date: Oct 2007
Posts: 1,358
Not sure if they are HUGO names, but seems like the refFlat table in the Table Browser will get you there.
Click "define regions", paste in your BED file, and get output.
Edit: ...looks like it's limited to 1k entries...

#3
Senior Member
Location: Palo Alto Join Date: Apr 2009
Posts: 213
You should be able to get the entire refFlat file from UCSC's table browser. That file will include the RefSeq IDs, start and end positions of the gene, and the gene name.
I think their gene name is the HUGO name.

#4
Senior Member
Location: Charlottesville Join Date: Sep 2008
Posts: 119
Hi,
I recently completed a new suite of BED Tools for addressing such questions. They are available for 64-bit Linux and Intel Macs at: http://people.virginia.edu/~arq5x/bedtools.html
Specifically, in the case of your question, you would download RefSeq (not sure if they are HUGO names) from the UCSC Table Browser. Then run:
intersectBed -a <yourfile> -b refSeqFromUCSC.bed -wb
The -wb option will write the entire RefSeq entry so that you can track the name associated with each overlap. If you have further questions, just shout. Nicely.
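For readers who want to see the idea behind the overlap step, here is a minimal pure-Python sketch. It is not the BED Tools implementation and does not reproduce intersectBed's exact output format; it only reports, for each query interval, the names of the gene intervals it overlaps. The function names (`load_genes`, `annotate`) and the sample intervals are made up for illustration, and the sketch assumes tab-separated BED input with the gene name in the fourth column of the gene file.

```python
from collections import defaultdict


def load_genes(bed_lines):
    """Group gene intervals by chromosome as (start, end, name) tuples."""
    genes = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        genes[chrom].append((int(start), int(end), name))
    return genes


def annotate(query_lines, genes):
    """Yield (chrom, start, end, gene_names) for each query interval."""
    for line in query_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        start, end = int(start), int(end)
        # BED intervals are half-open, so two intervals overlap when
        # each one starts before the other one ends.
        hits = sorted({
            name
            for g_start, g_end, name in genes.get(chrom, [])
            if g_start < end and start < g_end
        })
        yield chrom, start, end, hits


# Made-up example data (hypothetical gene names and coordinates).
genes = load_genes([
    "chr1\t100\t500\tGENE_A\n",
    "chr1\t400\t900\tGENE_B\n",
    "chr2\t0\t50\tGENE_C\n",
])
result = list(annotate(["chr1\t450\t460\n", "chr1\t950\t1000\n"], genes))
for row in result:
    print(row)
```

This scans every gene on the chromosome for each query, which is simple but slow; for 90,000 intervals a dedicated tool such as intersectBed is likely the better choice.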

#5
Member
Location: London Join Date: Sep 2008
Posts: 58
Thanks guys, the refFlat file is useful; I was not aware of it.
Thanks ECO, but yes, it's limited to 1000 coordinates. Not the best way for 90,000 coordinates.
Quinlana, I downloaded BED Tools and ran it from the bin folder, but I got an error message:
./intersectBed -a mygenomiccoordinates.bed -b genome_ucsc.bed -wb
ERROR: bash: ./intersectBed: Bad CPU type in executable
L

#6
Senior Member
Location: Charlottesville Join Date: Sep 2008
Posts: 119
Hi Layla,
Apologies for that. What OS and processor are you using? The Linux version should work on 64-bit Red Hat and Ubuntu. Regardless, I'll post the source later today so you can compile the programs on your system.
Sorry for the trouble. I just finished testing all of these tools yesterday and they work on all of our systems. However, I haven't been diligent about trying them out on every Linux flavor.
Best,
Aaron

#7
Member
Location: London Join Date: Sep 2008
Posts: 58
Hi Aaron,
No worries, thank you for the help! My machine is a Mac:
OS X Version: 10.4.11
Processor: 2.4GHz Intel Core 2 Duo
Cheers!
L

#8
Senior Member
Location: Charlottesville Join Date: Sep 2008
Posts: 119
Gotcha. I believe the Core Duo processors are 32-bit. Email me at aaronquinlan [at] gmail and I'll send you a pre-compiled version for your machine.

#9
Junior Member
Location: baltimore Join Date: Apr 2014
Posts: 1
Hi
I am still having problems using the refFlat file with BED Tools. I downloaded the refFlat.txt file for hg18. First, this file is not in BED format. Is there a command line tool which just lets me add the gene symbol to my input file, which is in the format "chr", "start", "end" (so BED format)? If this question is redundant, please excuse me and point me to the right page, so I can follow some instructions stepwise and annotate my BED file with gene symbols. Thanks
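One possible way to handle the format mismatch is to rewrite refFlat.txt as a BED file first. The sketch below is a hedged illustration, not an official UCSC or BED Tools utility. It assumes the standard refFlat column order (geneName, name, chrom, strand, txStart, txEnd, then CDS and exon fields) and tab-separated input. The function name `refflat_to_bed` and the sample record are made up for illustration.

```python
def refflat_to_bed(refflat_lines):
    """Yield 6-column BED lines (chrom, start, end, name, score, strand)."""
    for line in refflat_lines:
        # Skip blank lines and a header line such as "#geneName ...".
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete refFlat record
        gene_name, _transcript, chrom, strand, tx_start, tx_end = fields[:6]
        # refFlat and BED both use 0-based, half-open coordinates,
        # so the transcript start and end can be copied unchanged.
        yield "\t".join([chrom, tx_start, tx_end, gene_name, "0", strand])


# Made-up refFlat record (hypothetical gene and transcript IDs).
sample = [
    "#geneName\tname\tchrom\tstrand\ttxStart\ttxEnd\n",
    "GENE_A\tNM_000000\tchr1\t+\t100\t500\t150\t450\t1\t100,\t500,\n",
]
bed_lines = list(refflat_to_bed(sample))
for bed_line in bed_lines:
    print(bed_line)
```

To use it on a real download, the same function can be fed an open file handle for refFlat.txt and its output written to a new file. That file could then serve as the -b input in the intersectBed command from post #4, with -wb carrying the gene symbol through to each overlap.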