  • how to predict nucleotide mutation effects on a genome-wide scale?

    I'm familiar with (but have not used) ParseSNP (http://www.proweb.org/parsesnp/), a tool that calculates the effect (frame shift, truncation, etc.) of nucleotide mutations (indel, substitution) on genes, given the gene model. However, I'd like to do this on a genome-wide scale: load in a long list of mutations, and return protein effects if the mutation is within a gene. And I don't really think the people hosting ParseSNP would appreciate me bombing their servers by submitting entire chromosomes (of Arabidopsis thaliana)! Maybe it wouldn't be a problem ... I'll ask them.

    But, in the meantime, are there more convenient, fast tools, for which the source is available, that do this? Any suggestions or advice would be appreciated.

    Thanks,
    ~Joe
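
For a rough idea of what the per-mutation calculation involves, here is a minimal Python sketch. It is not ParseSNP's or any other tool's actual method, and it assumes the simplest possible case: the mutation has already been mapped to a 0-based offset in a spliced coding sequence on the coding strand. The function names and the example sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(cds: str, pos: int, alt: str) -> str:
    """Classify a single-base change at 0-based offset `pos` of a coding sequence."""
    frame = pos % 3
    codon_start = pos - frame
    ref_codon = cds[codon_start:codon_start + 3].upper()
    alt_codon = ref_codon[:frame] + alt.upper() + ref_codon[frame + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


def classify_indel(ref: str, alt: str) -> str:
    """Call an indel a frameshift unless its net length change is a multiple of 3."""
    # "-" is treated as an empty allele (an assumption about the input format).
    delta = len(alt.replace("-", "")) - len(ref.replace("-", ""))
    return "frameshift" if delta % 3 else "in_frame_indel"


# Hypothetical 12-base coding sequence: Met, Ala, Trp, stop.
example_cds = "ATGGCATGGTAA"
for offset, new_base in [(3, "C"), (5, "G"), (8, "A")]:
    print(offset, new_base, classify_substitution(example_cds, offset, new_base))
```

A genome-wide tool additionally has to look up which gene model (if any) overlaps each mutation, handle the reverse strand, and deal with introns and splice sites, none of which this sketch attempts.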

  • #2
    There are several packages in the Wiki which are in this space



    • #3
      I just downloaded one called Annovar. I'm working in mouse, but A. thaliana ought to be available too. I followed the manual, downloaded the mouse databases, converted my VCF file to its format using the included script, and ran it. I got output that looks like this:

      intronic Col15a1 4 47234472 47234472 A G 99
      intronic Col15a1 4 47244368 47244368 A G 99
      intronic Col15a1 4 47299820 47299820 - AGAAGAAGAAGAAG 99
      intergenic Tgfbr1(dist=6268),Alg2(dist=48641) 4 47434064 47434064 T C 99
      intergenic Sec61b(dist=88256),Nr4a3(dist=479757) 4 47584361 47584363 GCT - 99

      The next step is to automatically check intronic mutations to see if they would cause cryptic splice sites, but I don't know of a site that will do that in batches.
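
To pull out the intronic calls for that next step, output like the sample above could be filtered with a few lines of Python. This is only a sketch: the column meanings (region, gene, chromosome, start, end, ref, alt) are inferred from the sample lines rather than taken from the Annovar manual, the trailing 99 column is ignored, and the names `Variant` and `parse_line` are made up for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    region: str   # e.g. "intronic" or "intergenic"
    gene: str     # gene name, or neighbouring genes with distances
    chrom: str
    start: int
    end: int
    ref: str      # "-" marks an insertion
    alt: str      # "-" marks a deletion


def parse_line(line: str) -> Variant:
    """Parse one whitespace-separated output line (assumed column order)."""
    f = line.split()
    return Variant(f[0], f[1], f[2], int(f[3]), int(f[4]), f[5], f[6])


# The five lines quoted in the post above.
SAMPLE = """\
intronic Col15a1 4 47234472 47234472 A G 99
intronic Col15a1 4 47244368 47244368 A G 99
intronic Col15a1 4 47299820 47299820 - AGAAGAAGAAGAAG 99
intergenic Tgfbr1(dist=6268),Alg2(dist=48641) 4 47434064 47434064 T C 99
intergenic Sec61b(dist=88256),Nr4a3(dist=479757) 4 47584361 47584363 GCT - 99
"""

variants = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]
intronic = [v for v in variants if v.region == "intronic"]

for v in intronic:
    print(v.gene, v.chrom, v.start, v.ref, v.alt)
```

The resulting `intronic` list is what would be handed to a splice-site checker, if a batch-capable one turns up.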



      • #4
        Thanks!

        Thanks krobison, swbarnes,

        I checked out your search results link, krobison ...
        GAMES and MU2A seem to be limited to human (similarly for SNPnexus, which I just stumbled across), and MU2A is a big install job involving Tomcat ...
        VIP is for 454 reads (I've got Illumina).
        VariantClassifier from JCVI looks promising, but I think I'll try Annovar (thanks swbarnes!)

        Appreciate the tips,
        ~Joe



        • #5
          GAMES doesn't appear to be human-specific, but is oriented towards genomes in the UCSC Browser, which unfortunately doesn't include Arabidopsis.



          • #6
            The Ensembl system comes with a Variant Effect Predictor, which you might find useful.

            Ensembl Plants is a genome-centric portal for plant species of scientific interest



            • #7
              Thanks Laura! Ensembl's Variant Effect Predictor worked perfectly, and the online version is set up for Arabidopsis and about 8 other plants, so it didn't even require any hacking.

              We talked to Kai, who wrote (or is maintaining) Annovar, and he added a FAQ entry about using a genome that's not in UCSC's database (here), but we didn't fully explore how easy that would be, having found Ensembl's tool first.
              Thanks again, all ...
              ~Joe
