SEQanswers

10-19-2010, 01:04 AM   #1
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308
From Affy probe sets/gene symbols to genomic coordinates?

I have a list of Affy probe sets and their corresponding gene symbols. Does anyone know how I can convert them to the genomic coordinates of the genes so I can intersect them with my ChIP-seq peaks?
10-19-2010, 07:05 AM   #2
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138

Quote:
Originally Posted by ETHANol
I have a list of Affy probe sets and their corresponding gene symbols. Does anyone know how I can convert them to the genomic coordinates of the genes so I can intersect them with my ChIP-seq peaks?
Affymetrix usually provides detailed probe set information, including the genomic position of each probe set. If you visit the Affymetrix website, search for your array, and open the corresponding technical documentation, you will find a download link for annotation files in rather convenient formats.

best regards
T.
10-20-2010, 07:55 AM   #3
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

Okay, I have the annotation file from Affymetrix (a CSV file), and it has the probe sets and some genomic coordinates. I have two problems:

1) How can I match the probe sets in the data from the microarray experiment with the probe sets/genomic coordinates in the annotation file?

2) The genomic coordinates are not for the whole gene. How do I get the whole gene? (I eventually want promoters.) I think I should be able to do this with Galaxy.
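
For question 1, one possible approach in R is to join the two tables on the probe set ID with `merge()`. This is only a minimal sketch on toy data: the column names (`probe_id`, `logFC`, `chrom`, `start`, `end`) and all values are made up for illustration, and the real Affymetrix CSV will use different headers (it would first need to be loaded, for example with `read.csv()`).

```r
# Toy stand-in for the microarray results (hypothetical columns and values)
expr <- data.frame(
  probe_id = c("202763_at", "209310_s_at", "207500_at"),
  logFC    = c(1.8, -0.6, 2.3),
  stringsAsFactors = FALSE
)

# Toy stand-in for the annotation table (hypothetical columns and values)
annot <- data.frame(
  probe_id = c("207500_at", "202763_at", "999999_at"),
  chrom    = c("11", "4", "X"),
  start    = c(1000, 5000, 9000),
  end      = c(2000, 7500, 9500),
  stringsAsFactors = FALSE
)

# Keep every row of the expression table; probes without an annotation get NA
merged <- merge(expr, annot, by = "probe_id", all.x = TRUE)
print(merged)
```

Using `all.x = TRUE` keeps probes that have no match in the annotation file, which makes it easy to see how many IDs failed to map.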
10-20-2010, 08:49 AM   #4
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138

Sorry, I did not understand that you were looking for the coordinates of genes.

There are certainly many possibilities, and Galaxy, which I have never used, might be an option.

If you are a bit familiar with R/Bioconductor, there is a package called 'biomaRt', which makes it rather easy to retrieve genomic coordinates for almost any feature, and you can query it directly with Affy probe identifiers.

Promoter coordinates are probably trickier because there is no robust annotation, but you could start with a simple approach: define the region 1-2 kb upstream of the TSS as the promoter (this depends very much on the organism) and map your ChIP-seq reads to those regions.
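
The simple promoter definition above could be sketched in base R roughly as follows. The gene table is toy data shaped like a biomaRt result (strand coded as 1/-1, as Ensembl does), and the 2000 bp window is an assumption to adjust. Note that the gene start/end only approximate the TSS when a gene has several transcripts, and that this sketch does not clip windows at chromosome ends.

```r
# Toy gene table in the shape biomaRt returns (values are made up);
# strand is 1 for forward and -1 for reverse, as in Ensembl
genes <- data.frame(
  gene   = c("geneA", "geneB"),
  chrom  = c("1", "1"),
  start  = c(10000, 50000),
  end    = c(12000, 53000),
  strand = c(1, -1),
  stringsAsFactors = FALSE
)

upstream <- 2000  # assumed promoter size in bp; adjust for your organism

# The TSS is the start on the forward strand and the end on the reverse strand
tss <- ifelse(genes$strand == 1, genes$start, genes$end)

# "Upstream" means lower coordinates on + and higher coordinates on -
genes$prom_start <- ifelse(genes$strand == 1, pmax(tss - upstream, 1), tss + 1)
genes$prom_end   <- ifelse(genes$strand == 1, tss - 1, tss + upstream)
print(genes)
```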
10-21-2010, 10:28 AM   #5
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

R, I was afraid it would come to that. I'm trying to get going with R, but it's difficult and will take some time. BiomaRt looks like the tool I need.

Can someone help me with this? The user guide has instructions for getting the genomic coordinates of the genes for Affy probes. I have a question about the following:

In this line specific probes are assigned to 'affyids':
affyids = c("202763_at", "209310_s_at", "207500_at")

and 'affyids' is the value used to query the database.

Okay, that's great, but I want to use a list of about 1000 Affy probes that I have as a text file ('mytextfile.txt') in the same directory that I started R in.

Is there a command line (or two...) that I can use to assign my list of Affy probes to 'affyids'?

Sorry if my R-speak no so good.... but I'm a molecular biologist and new to the computer universe.


Thanks a billion to anyone that can help.
10-24-2010, 06:53 AM   #6
mudshark
Senior Member
 
Location: Munich

Join Date: Jan 2009
Posts: 138

This is a guess (as I don't know exactly how your file is structured):

affyids <- read.delim("mytextfile.txt", header=FALSE, stringsAsFactors=FALSE)[,1]
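
If the file really is one probe ID per line, `readLines()` is another option that returns a character vector directly. The sketch below writes its own small temporary file so it can run as-is; the cleanup steps (trimming whitespace, dropping blank lines and duplicates) are an assumption about what a hand-made list might need, not something the file is known to require.

```r
# Write a small example file so the sketch runs on its own
# (probe IDs taken from the biomaRt example quoted earlier in the thread)
path <- tempfile(fileext = ".txt")
writeLines(c("202763_at", "209310_s_at ", "", "207500_at", "202763_at"), path)

# readLines() returns a character vector, one element per line
affyids <- readLines(path)

# Strip stray whitespace, drop empty lines, and remove duplicates
affyids <- trimws(affyids)
affyids <- unique(affyids[affyids != ""])
print(affyids)
```

For the real list, `path` would simply be "mytextfile.txt".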
10-25-2010, 12:43 AM   #7
dariober
Senior Member
 
Location: Cambridge, UK

Join Date: May 2010
Posts: 311

If you don't want to deal with R/Bioconductor, I guess you could use the web interface of Ensembl/Biomart (http://www.biomart.org/). It's quite intuitive to use.

Nevertheless, it would probably be useful to get to grips with R. See if this bit of code helps:

Code:
## Get the genome coordinates of the genes tagged by a set of 
## Affymetrix probe IDs

## Assuming your file of Affy probe IDs is a single column of 
## probe identifiers, one probe per line, no header. 
## E.g. something like this:

# Ssc.25128.1.S1_at
# Ssc.6614.1.S1_at
# Ssc.24115.1.A1_at
# Ssc.15874.1.S1_at
# Ssc.30896.2.S1_at


library(biomaRt)

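## Read the IDs as character strings and keep the first column as a vector
affyids<- read.table(file= 'mytextfile.txt', header=FALSE, stringsAsFactors=FALSE)
affyids<- as.vector(affyids[,1])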

mart<- useDataset(dataset= "sscrofa_gene_ensembl", ## Change dataset here
                  useMart("ensembl"))

probe2gene<- getBM(
             attributes= c('affy_porcine', 
                           'ensembl_gene_id', 
                           'chromosome_name', 
                           'strand', 
                           'start_position', 
                           'end_position'),
             filters= 'affy_porcine', ## Change as appropriate
             values= affyids, ## getBM() names this argument 'values'
             mart= mart)
Dario
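
Once gene or promoter coordinates are in a data frame, the intersection with ChIP-seq peaks asked about in the first post could be sketched in base R as below. All intervals are toy values and the column names are hypothetical. One thing to watch: Ensembl chromosome names have no "chr" prefix, so the peak file may need renaming before comparison. For large peak sets, `findOverlaps()` from the Bioconductor package GenomicRanges is likely the better tool; this loop is only meant to show the logic.

```r
# Toy promoter/gene intervals (made-up values; chromosome names without "chr")
regions <- data.frame(
  id    = c("geneA", "geneB", "geneC"),
  chrom = c("1", "1", "2"),
  start = c(8000, 53001, 100),
  end   = c(9999, 55000, 900),
  stringsAsFactors = FALSE
)

# Toy ChIP-seq peaks (made-up values)
peaks <- data.frame(
  chrom = c("1", "2", "1"),
  start = c(9500, 5000, 60000),
  end   = c(10200, 5400, 60300),
  stringsAsFactors = FALSE
)

# TRUE if a region shares at least one base with any peak on the same chromosome
has_peak <- function(chrom, start, end) {
  any(peaks$chrom == chrom & peaks$start <= end & peaks$end >= start)
}

regions$overlaps_peak <- mapply(has_peak, regions$chrom, regions$start, regions$end)
print(regions)
```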
10-25-2010, 02:13 AM   #8
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

Thanks guys. I really appreciate it!!!!!

I was so close. I tried read.table but never found read.delim.