  • Heisman
    Senior Member
    • Dec 2010
    • 534

    Go from list of genes to all exon coordinates?

    Hey all,

    I want to use eArray to create a custom capture set of baits for a few hundred genes. I'm ignorant in non-wetlab stuff, and looking at the website it appears that I cannot just upload a list of genes; rather I have to upload a list of the exon coordinates within the genes that I would like to design baits for. What would be the easiest way for me to go from a list of genes to a list of these exon coordinates? Thanks a lot for any help.
  • doc.ramses
    Member
    • Jan 2011
    • 26

    #2
    If I remember correctly, you can use accession numbers instead of gene names, separated by a |.
    Getting exon positions from a list of gene names is possible, e.g., with Ensembl BioMart.


    • Heisman
      Senior Member
      • Dec 2010
      • 534

      #3
      Originally posted by doc.ramses View Post
      If I remember correctly, you can use accession numbers instead of gene names, separated by a |.
      Getting exon positions from a list of gene names is possible, e.g., with Ensembl BioMart.
      Getting accession numbers wouldn't be too bad but would it select for just the exons as opposed to the entire gene? I have a hard time believing there is no fairly easy/straightforward way to do this. Thanks for the tip on ensembl, I will look at that.


      • doc.ramses
        Member
        • Jan 2011
        • 26

        #4
        Originally posted by Heisman View Post
        Getting accession numbers wouldn't be too bad but would it select for just the exons as opposed to the entire gene?
        If you use the "exon finder" it will do exactly this. My advice is to ask an Agilent representative to do the design for you, as eArray is indeed not very user-friendly.


        • Heisman
          Senior Member
          • Dec 2010
          • 534

          #5
          Originally posted by doc.ramses View Post
          If you use the "exon finder" it will do exactly this. My advice is to ask an Agilent representative to do the design for you, as eArray is indeed not very user-friendly.
          Ok, I think I have it figured out, but I'll definitely email them and see if they are willing to design it (we will be placing a big order so hopefully they'll be more amenable) as that would obviously be the easiest. Thanks a lot!


          • doc.ramses
            Member
            • Jan 2011
            • 26

            #6
            They definitely will. They will also take a more detailed look at GC content, etc. And if you're placing a big order, let them do the work to earn the money.


            • adamdeluca
              Member
              • Jul 2010
              • 95

              #7
              Here is a general procedure you can follow if you want to try it yourself.

              1. http://genome.ucsc.edu/cgi-bin/hgTables
              2. group - "Gene and Gene Prediction Tracks", track - "UCSC genes", table - knownGene
              (or use the refGene table if you prefer RefSeq genes)
              3. paste in your list of gene identifiers
              4. output as a bed file
              5. restrict to just coding exons
              6. save the file

              7. use bedtools to merge overlapping regions, pad as you feel appropriate, etc.
              8. load the track back into the UCSC genome browser to spot check the regions
              9. convert into a format eArray likes
              IIRC: chr1:100-1000
              conversion program (the +1 turns the 0-based BED start into a 1-based start):
              Code:
              awk '{print $1":"$2+1"-"$3}' myRegions.bed > myRegions.txt
              10. upload to agilent
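
For anyone who would rather script step 9 than use awk, here is a minimal Python sketch of the same conversion. It assumes a plain BED file (0-based start, end not included) and the chr1:100-1000 style recalled above; the function name `bed_to_regions` is made up for this example, and the exact format eArray accepts should be checked against its upload page.

```python
def bed_to_regions(bed_lines):
    """Convert BED lines (0-based, half-open) to 1-based 'chrom:start-end' strings."""
    regions = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the header lines a genome browser export may contain.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Same arithmetic as the awk one-liner: only the start is shifted by one.
        regions.append(f"{chrom}:{start + 1}-{end}")
    return regions


if __name__ == "__main__":
    example = ["track name=myRegions", "chr1\t99\t1000", "chr2\t0\t50\texonA"]
    for region in bed_to_regions(example):
        print(region)
```

To run it on a real file, pass the open file handle, e.g. `bed_to_regions(open("myRegions.bed"))`, and write the returned strings one per line.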


              • Heisman
                Senior Member
                • Dec 2010
                • 534

                #8
                adamdeluca, thank you for your post. I'm with you on steps 1-6. I've never used bedtools but I could probably figure it out if necessary. I'm curious as to why one would expect to have overlapping regions? Also, for loading it back into UCSC to spot check it, where exactly would I load it and what would I be checking for? Thanks a lot!


                • adamdeluca
                  Member
                  • Jul 2010
                  • 95

                  #9
                  Originally posted by Heisman View Post
                  adamdeluca, thank you for your post. I'm with you on steps 1-6. I've never used bedtools but I could probably figure it out if necessary. I'm curious as to why one would expect to have overlapping regions? Also, for loading it back into UCSC to spot check it, where exactly would I load it and what would I be checking for? Thanks a lot!
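                  Exons will be duplicated for every different splice form of the gene. It has to do with the way UCSC stores data: each transcript is its own row with its own exon list, so an exon shared by several transcripts comes out once per transcript.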

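                  To run the bedtools merge (sort first, since mergeBed expects input sorted by chromosome and start):
                  Code:
                  sort -k1,1 -k2,2n in.bed | mergeBed -i stdin -d 60 > out.bed
                  This will combine any features that are <=60 bp apart into a single feature.
                  You can also use slopBed (which needs a chromosome sizes file passed with -g) to make the baits extend a bit into the introns if that is desirable.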
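
If it helps to see what the merge and padding actually do to the coordinates, here is a small Python sketch of the same idea. It is not bedtools itself: `merge_intervals` and `pad_intervals` are names invented for this example, the 60 bp gap rule follows my reading of the -d option, and the padding only clamps at position 0 rather than at chromosome ends the way slopBed does with its genome file.

```python
from collections import defaultdict


def merge_intervals(intervals, max_gap=60):
    """Merge (chrom, start, end) intervals that overlap or lie within max_gap bp."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        # Sorting by start lets us sweep left to right and extend one open interval.
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start - current_end <= max_gap:
                current_end = max(current_end, end)
            else:
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


def pad_intervals(intervals, pad=10):
    """Extend each interval by pad bp on both sides, never going below 0."""
    return [(chrom, max(0, start - pad), end + pad) for chrom, start, end in intervals]


if __name__ == "__main__":
    # Two overlapping copies of one exon, a nearby exon, and a distant one.
    exons = [
        ("chr1", 100, 200),
        ("chr1", 150, 250),
        ("chr1", 300, 400),
        ("chr1", 1000, 1100),
    ]
    for chrom, start, end in pad_intervals(merge_intervals(exons)):
        print(chrom, start, end, sep="\t")
```

In the example the first three exons collapse into one region because each gap is at most 60 bp, while the last exon stays separate. If you pad by more than a few bases, it may be worth merging again afterwards, since padding can make neighbouring regions touch.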

                  To perform the sanity check you want to add a custom track. From the main page, under the "genomes" tab, click the "add custom tracks" button. Just look at a few of the exons you are intending to target, and make sure the design region looks the way you are expecting. You will also want to make sure that all of the genes you really care about are included; they sometimes get missed due to difficulties parsing gene names.


                  • Heisman
                    Senior Member
                    • Dec 2010
                    • 534

                    #10
                    Ok, excellent. Thanks a bunch!


                    • steven
                      Senior Member
                      • Aug 2009
                      • 269

                      #11
                      You can also use Galaxy to do step 7. There should be a "send results to galaxy" checkbox in the UCSC interface. Working with command-line tools is more powerful, though.
