**doc.ramses** · 05-04-2011, 01:39 PM

If I remember correctly, you can use accession numbers instead of gene names, separated by a |.
Getting exon positions from a list of gene names is possible in, e.g., Ensembl BioMart.

**Heisman** · 05-04-2011, 03:17 PM

Originally posted by doc.ramses

If I remember correctly, you can use accession numbers instead of gene names, separated by a |.
Getting exon positions from a list of gene names is possible in, e.g., Ensembl BioMart.

Getting accession numbers wouldn't be too bad, but would it select just the exons as opposed to the entire gene? I have a hard time believing there is no fairly straightforward way to do this. Thanks for the tip on Ensembl; I will look at that.

**doc.ramses** · 05-05-2011, 12:32 AM

Originally posted by Heisman

Getting accession numbers wouldn't be too bad, but would it select just the exons as opposed to the entire gene?

If you use the "exon finder", it will do exactly this. My advice is to ask an Agilent representative to do the design for you, as eArray is indeed not very handy.

**Heisman** · 05-05-2011, 05:58 AM

Originally posted by doc.ramses

If you use the "exon finder", it will do exactly this. My advice is to ask an Agilent representative to do the design for you, as eArray is indeed not very handy.

Ok, I think I have it figured out, but I'll definitely email them and see if they are willing to design it (we will be placing a big order so hopefully they'll be more amenable) as that would obviously be the easiest. Thanks a lot!

**doc.ramses** · 05-05-2011, 06:24 AM

They definitely will. They will also take a closer look at GC content etc. And if you're placing a big order, let them do the job to earn the money.

**adamdeluca** · 05-05-2011, 07:00 AM

Here is a general procedure you can follow if you want to try it yourself.

1. http://genome.ucsc.edu/cgi-bin/hgTables
2. group - "Genes and Gene Prediction Tracks", track - "UCSC Genes", table - knownGene
or use the refGene table if you prefer RefSeq genes
3. paste in your list of gene identifiers
4. output as a BED file
5. restrict to just coding exons
6. save the file

7. use bedtools to merge overlapping regions, pad as you see fit, etc.
8. load the track back into the UCSC Genome Browser to spot check the regions
9. convert into a format eArray likes
IIRC the format is chr1:100-1000
conversion program:

Code:

# BED starts are 0-based, so add 1 to get the 1-based chr:start-end form
awk '{print $1":"($2+1)"-"$3}' myRegions.bed > myRegions.txt

10. upload to Agilent
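
The coordinate shift in step 9 is easy to get wrong, so here is a small worked sketch. It assumes the BED file follows the usual convention (0-based start, end not included) and that eArray wants 1-based `chr:start-end` strings, as recalled above. `bed_to_earray` is a hypothetical helper name and the two intervals are made up.

```shell
# Hypothetical helper: convert BED lines on stdin to chr:start-end strings.
# BED starts are 0-based, so add 1; the BED end already equals the
# 1-based position of the last base.
bed_to_earray() {
    awk '
        # skip header lines that a custom-track style BED file may carry
        /^(track|browser|#)/ { next }
        NF >= 3 { print $1 ":" ($2 + 1) "-" $3 }'
}

# Made-up example intervals
printf 'chr1\t99\t1000\nchr2\t5000\t5200\n' | bed_to_earray
```

Run on a real file as `bed_to_earray < myRegions.bed > myRegions.txt`. It is worth confirming the exact format eArray expects before uploading.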

**Heisman** · 05-05-2011, 08:30 AM

adamdeluca, thank you for your post. I'm with you on steps 1-6. I've never used bedtools but I could probably figure it out if necessary. I'm curious as to why one would expect to have overlapping regions? Also, for loading it back into UCSC to spot check it, where exactly would I load it and what would I be checking for? Thanks a lot!

**adamdeluca** · 05-05-2011, 08:49 AM

Originally posted by Heisman

adamdeluca, thank you for your post. I'm with you on steps 1-6. I've never used bedtools but I could probably figure it out if necessary. I'm curious as to why one would expect to have overlapping regions? Also, for loading it back into UCSC to spot check it, where exactly would I load it and what would I be checking for? Thanks a lot!

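Exons will be duplicated for every different splice form of the gene. It has to do with the way UCSC stores the data: each transcript is its own record, so an exon shared by several transcripts is output once per transcript.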
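
To see the duplication in your own file, a quick sketch along these lines counts how often each interval appears. The sample lines are made up, with placeholder names standing in for transcript identifiers, and `count_duplicate_intervals` is a hypothetical helper. It assumes a tab-separated BED file.

```shell
# Hypothetical helper: list intervals (chrom, start, end) that occur more
# than once in a tab-separated BED stream, followed by their count.
count_duplicate_intervals() {
    cut -f1-3 | sort | uniq -c |
        awk '$1 > 1 { print $2 "\t" $3 "\t" $4 "\t" $1 }'
}

# Made-up example: two transcripts share the exon at chr1 1000-1200
printf 'chr1\t1000\t1200\ttxA_exon1\nchr1\t1000\t1200\ttxB_exon1\nchr1\t3000\t3100\ttxB_exon2\n' |
    count_duplicate_intervals
```

This only catches exact duplicates. Exons that merely overlap (alternative splice sites, for instance) are what the merge step handles.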

To run the bedtools merge:

Code:

mergeBed -i in.bed -d 60 > out.bed

This will combine any features that are <=60bp apart into a single feature.
You can also use slopBed to make the baits overlap a bit into the introns if that is desirable.
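
For anyone who has not used bedtools, the following pure-awk sketch imitates what merging with a distance threshold and padding do to the intervals. It is an illustration of the logic only, not a replacement for mergeBed or slopBed: `merge_within` and `pad_by` are hypothetical helpers, the input is assumed to be sorted by chromosome and start, and the padding here does not clip at chromosome ends.

```shell
# Illustration only: merge features whose gap is <= DISTANCE.
# Input must be BED sorted by chromosome, then start.
merge_within() {   # usage: merge_within DISTANCE < sorted.bed
    awk -v d="$1" 'BEGIN { OFS = "\t" }
        # start a new region when the chromosome changes or the gap exceeds d
        NR == 1 || $1 != chrom || $2 + 0 > cur_end + d {
            if (NR > 1) print chrom, cur_start, cur_end
            chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0
            next
        }
        # otherwise extend the current region if this feature reaches further
        $3 + 0 > cur_end { cur_end = $3 + 0 }
        END { if (NR > 0) print chrom, cur_start, cur_end }'
}

# Illustration only: extend each feature by BASES on both sides,
# never going below position 0.
pad_by() {   # usage: pad_by BASES < in.bed
    awk -v p="$1" 'BEGIN { OFS = "\t" }
        { s = $2 - p; if (s < 0) s = 0; print $1, s, $3 + p }'
}

# Made-up example: the first two features are 50 bp apart and get merged
printf 'chr1\t100\t200\nchr1\t250\t300\nchr1\t500\t600\n' | merge_within 60
```

If you pad, do it before merging, since padded features may newly overlap.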

To perform the sanity check, you want to add a custom track. From the main page, under the "genomes" tab, click the "add custom tracks" button. Just look at a few of the exons you are intending to target, and make sure the design region looks the way you are expecting. You will also want to make sure that all of the genes you really care about are included; they sometimes get missed due to difficulties parsing gene names.
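
The "are all my genes included" check can be partly automated. The sketch below rests on two assumptions: that the BED file you saved before merging has a name column (column 4) of the form `<identifier>_cds_...`, and that the identifiers in your list are of the same type as the ones in that column (for example RefSeq accessions with the refGene table). If you pasted gene symbols, you would first need a symbol-to-transcript mapping. `missing_ids`, the file names, and the accessions are all hypothetical.

```shell
# Hypothetical helper: print identifiers from a list that never appear in
# the name column (column 4) of a BED file.
# Assumption: names look like <identifier>_cds_<more fields>
missing_ids() {   # usage: missing_ids ids.txt regions.bed
    awk 'NR == FNR { wanted[$1] = 1; next }   # first file: the identifier list
         { id = $4; sub(/_cds_.*/, "", id); delete wanted[id] }
         END { for (id in wanted) print id }' "$1" "$2" | sort
}

# Made-up example: NM_000002 has no exon in the BED file
tmp="$(mktemp -d)"
printf 'NM_000001\nNM_000002\n' > "$tmp/ids.txt"
printf 'chr1\t99\t200\tNM_000001_cds_0_0_chr1_100_f\n' > "$tmp/regions.bed"
missing_ids "$tmp/ids.txt" "$tmp/regions.bed"
rm -r "$tmp"
```

Anything this prints is worth looking up by hand in the browser before finalizing the design.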

**Heisman** · 05-05-2011, 10:46 AM

Ok, excellent. Thanks a bunch!

**steven** · 05-05-2011, 11:10 PM

You can also use Galaxy to do step 7. There should be a "send results to galaxy" checkbox in the UCSC interface. Working with command-line tools is more powerful, though.

Go from list of genes to all exon coordinates?
