  • Which organism has the best GO annotations?

    I was wondering if anyone had any thoughts on this topic. Basically I want to use GOrilla to find enriched GO terms between two samples. GOrilla requires that you use genes from 1 of 8 model organisms. I am using a non-model organism (plant) so I was going to BLAST my genes against Protein_refseq from Arabidopsis thaliana...

    But then I started to wonder whether other organisms have better GO annotation. Clearly, for BLASTing purposes Arabidopsis (the only plant on the list) would be best, but is it also the best choice for getting GO terms?

    For example, if one of my genes BLASTs to human, mouse, and Arabidopsis RNA Pol II genes, would all of these hits give me the same GO information, or do particular organisms' genes have better GO annotation?

    Cheers

  • #2
    GO annotation files for various genomes are located here: http://www.geneontology.org/GO.downl...otations.shtml

    Yeast (S. cerevisiae) may have the best GO annotations.

    • #3
      If you are working with plants, then you want to stick with Arabidopsis. There are a lot of plant-specific GO terms, and if you use another organism, you will lose those.

      • #4
        How distantly related is your plant species to Arabidopsis? The Arabidopsis annotations are very good, but I personally don't like the approach of just taking the GO terms from another species unless you have good reasons. You may want to consider doing your searches (BLAST, HMMER, etc.) and then determining the GO mappings from your results.

        • #5
          Thanks for your responses.

          @SES: I am working with a fern, so it is not closely related to Arabidopsis, but obviously more so than to humans, mice, etc.

          My protocol was going to be as follows:
          1. BLAST sequences from my two samples against organismal ref_seqs (Arabidopsis?).
          2. Assign some cutoff (probably via BLAST E-value) and associate a gene symbol with each of my sequences.
          3. Use these gene symbol lists in GOrilla to look for enrichment of GO terms (sample 1vs2 and sample 2vs1).

          Is this different than what you are suggesting?
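
Steps 1 and 2 of the protocol above can be sketched roughly as follows. This is a minimal illustration, assuming the BLAST search was run with the default 12-column tabular output (`-outfmt 6`) and that keeping the highest-bitscore hit per query under an E-value cutoff is acceptable. The function name `best_hits`, the cutoff of 1e-5, and the sequence IDs are hypothetical.

```python
import csv
import io

# Column order of BLAST+ default tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: subject_id}, keeping the highest-bitscore hit per query.

    Hits with an E-value above `max_evalue` are ignored. The cutoff is an
    illustrative assumption, not a recommended value.
    """
    best = {}
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        if float(row["evalue"]) > max_evalue:
            continue
        bits = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or bits > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bits)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up example rows; the IDs are placeholders, not real accessions.
example = io.StringIO(
    "fern_tx1\tath_prot_A\t62.5\t200\t70\t2\t1\t200\t5\t204\t1e-40\t150\n"
    "fern_tx1\tath_prot_B\t40.0\t180\t100\t4\t1\t180\t9\t188\t1e-10\t60.5\n"
    "fern_tx2\tath_prot_C\t30.0\t90\t60\t3\t1\t90\t1\t90\t0.5\t25.0\n"
)
hits = best_hits(example)
print(hits)
```

The resulting subject IDs would still need to be converted to whatever identifiers the enrichment tool accepts. That conversion is not shown here.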

          @All

          As Chadn737 points out, if I don't use a plant for GO terms I will miss a lot of plant-specific things. Another complication is that while I want a global view (all GO terms), I also specifically need to know about genes involved in basal bodies and cilia. If Chlamydomonas had good GO annotation then that would be an obvious choice to use, but it doesn't seem to.

          Do you think that these two questions (changes in global GO terms and changes in ciliary/basal body related genes) are best approached separately or as part of the same analysis?

          • #6
            I have used exactly the approach you mentioned above when working with non-model species. I ran blastp against Arabidopsis and used the top hits to look for GO term enrichment. This was crucial because many of the genes we were interested in were plant hormone and signaling genes, which would have been completely absent from a non-plant.

            As for the second part, looking for genes involved in basal bodies and cilia, I would do this separately from the GO term enrichment. You may find that the two overlap in many ways, but with GO term enrichment alone you may miss some interesting genes.
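
For readers unfamiliar with what a GO term enrichment test computes, here is a minimal sketch of a one-sided hypergeometric over-representation test for a single GO term. It is a generic illustration and not a description of how GOrilla or any other specific tool works internally. The function name and the numbers in the example are made up.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, sample_annotated):
    """P(X >= sample_annotated) under sampling without replacement.

    population       -- number of genes in the background set
    annotated        -- background genes carrying the GO term
    sample           -- number of genes in the target list
    sample_annotated -- target genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(sample_annotated, upper + 1)
    )
    return tail / total


# Made-up numbers: 20 background genes, 5 with the term,
# 4 genes in the target list, 2 of them with the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
print(p_value)
```

In a real analysis this test is repeated for every GO term, so the p-values need a multiple-testing correction, which this sketch leaves out.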

            • #7
              Yes, I was suggesting something completely different. The nucleotide substitution rate in Arabidopsis is 3-fold higher than in other angiosperms, so using this one species to make inferences about functional evolution in a fern is not optimal. I was suggesting you do a search, HMMER against all of Pfam for example, and determine the GO term mapping from the Pfam ID of your (best) hit. That method would not be biased by the evolutionary rate, annotations, biology, etc. associated with using one species for comparison. If you had to pick one species for comparison, I would probably use grape, not Arabidopsis, for the reason I mentioned above.
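
The Pfam-to-GO lookup described above could be sketched like this. It assumes a mapping file in the "external2go" text layout (comment lines starting with `!`, data lines of the form `Pfam:<accession> <name> > GO:<term name> ; GO:<id>`); check the actual file you download before relying on this pattern. The accessions, term names, and transcript IDs below are placeholders, not real entries.

```python
import re
from collections import defaultdict

# Assumed line layout: "Pfam:PF12345 Name > GO:term name ; GO:1234567"
LINE = re.compile(r"^Pfam:(PF\d+)\s+\S+\s+>\s+GO:(.+?)\s+;\s+(GO:\d{7})$")


def parse_pfam2go(lines):
    """Return {pfam_accession: set of GO IDs}, skipping '!' comment lines."""
    mapping = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue
        match = LINE.match(line)
        if match:
            mapping[match.group(1)].add(match.group(3))
    return mapping


# Placeholder mapping lines (hypothetical accessions and terms).
example_lines = [
    "!version date: placeholder",
    "Pfam:PF99991 Example_dom > GO:hypothetical term one ; GO:9999991",
    "Pfam:PF99991 Example_dom > GO:hypothetical term two ; GO:9999992",
    "Pfam:PF99992 Other_dom > GO:hypothetical term three ; GO:9999993",
]
pfam2go = parse_pfam2go(example_lines)

# Hypothetical best Pfam hit per transcript, e.g. taken from a HMMER search.
best_pfam_hit = {"fern_tx1": "PF99991", "fern_tx2": "PF99993"}
transcript_go = {
    tx: sorted(pfam2go.get(pfam, set())) for tx, pfam in best_pfam_hit.items()
}
print(transcript_go)
```

Transcripts whose best hit has no entry in the mapping simply end up with an empty GO list, which is worth counting before running any enrichment.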

                • #9
                  @SES: HMMER doesn't support translated nucleotide searches (the way BLASTx does), though, right? I just have transcripts at this point. Given that, do you still think it would be best to use HMMER? E.g., identify potential ORFs and use those as search queries?

                  • #10
                    Yes, I think that is a good approach. For example, you can translate your sequences with sixpack (from EMBOSS) and then run InterProScan, which will run HMMER (in addition to 13 other programs) and the results will have your GO terms mapped to your matches. That takes care of the search and GO term mapping in one shot.
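
To illustrate the translation step, here is a minimal six-frame translation sketch in plain Python. It is not a reimplementation of sixpack: it uses the standard genetic code only, reports stop-to-stop stretches without requiring a start codon, and the `min_len` threshold and function names are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate from position 0; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(seq, min_len=30):
    """Return (strand, frame, peptide) for stop-free stretches >= min_len."""
    seq = seq.upper()
    reverse = seq.translate(COMPLEMENT)[::-1]
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse)):
        for offset in range(3):
            protein = translate(strand_seq[offset:])
            for peptide in protein.split("*"):
                if len(peptide) >= min_len:
                    orfs.append((strand, offset + 1, peptide))
    return orfs


# Tiny made-up transcript; min_len is lowered so the short peptide is kept.
orfs = six_frame_orfs("ATGGCCAAATAA", min_len=3)
print(orfs)
```

The peptides returned this way could then be written to a FASTA file and used as protein queries for InterProScan or HMMER.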
