
  • #16
    Hello Mr. Sarwar,
    Thanks for replying. I have already installed Blast2GO, but as far as I know, in Blast2GO you first have to BLAST the list of sequences in FASTA format before you can do the GO annotation. I have a list of genes that are expressed in a rice variety, and I want to functionally categorize those genes without BLAST. Is that possible?
    And thanks for the UniProt suggestion.



    • #17
      BLAST is actually what assigns the function to the gene. The idea is to transfer the function of an annotated gene to the query gene on the basis of sequence similarity.



      • #18
        Thank you, Mr Sarvar.
        The UniProt retrieval process asks for the genes as UniProt IDs, but my genes are in BGI (Beijing Genomics Institute) ID form. Can you suggest how to convert these BGI IDs to UniProt IDs?



        • #19
          How many genes do you have? If the number is small, BLAST them in UniProt (I don't think UniProt takes batches, so one by one). Otherwise, download UniRef90 and BLAST against it with fairly stringent criteria. Take the ID of the best hit for each protein.
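
A minimal sketch of the "take the ID of the best hit" step, assuming the BLAST run was saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where column 11 is the e-value and column 12 the bit score. The function name `best_hits`, the `max_evalue` cutoff, and the subject IDs in the example rows are illustrative placeholders, not anything from this thread.

```python
import csv
import io


def best_hits(tabular_lines, max_evalue=1e-10):
    """Return {query_id: (subject_id, bitscore)}, keeping the top-scoring hit per query."""
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or truncated lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit is weaker than the stringency cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Made-up rows; the subject IDs are placeholders, not real UniRef90 entries.
example = io.StringIO(
    "BGIOSGA002569\tUniRef90_AAA\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "BGIOSGA002569\tUniRef90_BBB\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-60\t250\n"
    "BGIOSGA002571\tUniRef90_CCC\t35.0\t90\t58\t2\t1\t90\t5\t94\t0.5\t30\n"
)
hits = best_hits(example)
print(hits)
```

Queries whose only hits fall above the cutoff simply do not appear in the result, so they are easy to list separately as "no confident match".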



          • #20
            I have a list of 24175 genes, but I don't have the sequences of those genes. The only thing I have is the BGI ID of each gene; I don't have any other ID such as UniProt or NCBI.
            For example:
            BGIOSGA002569
            BGIOSGA002571
            BGIOSGA002572
            BGIOSGA002567
            BGIOSGA002570
            BGIOSGA002566
            BGIOSGA002582
            BGIOSGA002581
            BGIOSGA002587
            BGIOSGA002579
            BGIOSGA002573
            BGIOSGA002564
            BGIOSGA002574
            These are a few examples. All of these genes are from one rice species only (Oryza sativa indica). I want to categorize these genes on the basis of their function.
            Please help.



            • #21
              You need to have the sequences. If these genes are from the BGI indica release, you can get them from several repositories. Try Gramene or Ensembl. Maybe this will be helpful to you: http://rice.genomics.org.cn/rice/index2.jsp
              I do not work on rice, so I have limited information.
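
Once a FASTA file for the indica release has been downloaded, pulling out only the sequences for a gene list could look roughly like this. It is a sketch under assumptions: the header layout with a `gene:` token is assumed here (check the actual file), the helper names are hypothetical, and the sequences in the example are made up.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif line and header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def gene_id_from_header(header):
    """Use a 'gene:<ID>' token if present (assumed layout), else the first word."""
    parts = header.split()
    for token in parts:
        if token.startswith("gene:"):
            return token[len("gene:"):]
    return parts[0] if parts else ""


def select_sequences(fasta_lines, wanted_ids):
    """Return {gene_id: sequence} for wanted genes, keeping the first record per gene."""
    wanted = set(wanted_ids)
    selected = {}
    for header, seq in parse_fasta(fasta_lines):
        gene_id = gene_id_from_header(header)
        if gene_id in wanted and gene_id not in selected:
            selected[gene_id] = seq
    return selected


# Made-up records for illustration only.
fasta = [
    ">BGIOSGA002569-PA pep gene:BGIOSGA002569 transcript:BGIOSGA002569-TA",
    "MAAAKK",
    "LLVV",
    ">BGIOSGA999999-PA pep gene:BGIOSGA999999",
    "MTTT",
]
picked = select_sequences(fasta, ["BGIOSGA002569", "BGIOSGA002571"])
print(picked)
```

For a real file, pass an open file handle as `fasta_lines`; any ID in the list that is missing from `picked` was not found in the download.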



              • #22
                Thank you Mr Sarvar



                • #23
                  Originally posted by DZhang
                  Hi,

                  I use both blast2go and nr blastx to annotate RNA-seq data. One thing I would like to share is that blastx, standalone or via blast2go, takes a huge amount of time if you use NCBI/EBI, due to load restrictions. So if you need quick results, you should definitely seek other resources for blast/interproscan. Installing the GO database locally can also dramatically improve the speed.

                  Best regards,
                  Douglas
                  www.contigexpress.com
                  3 years after this original thread... is anyone aware of a faster way to do the "blastx-to-nr" type of annotation? It still seems like quite a bottleneck. It certainly has been for us.



                  • #24
                    Hello everyone,
                    Thanks for your suggestions.
                    I was running Cufflinks on Oryza sativa indica with the GTF and genome downloaded from Ensembl. The problem is that my Cufflinks results contain approx. 17000 novel genes with CUFF.* IDs, but only 10000 genes that are present in the GTF file. The total number of genes in the GTF file is approx. 40000.
                    I am worried about those 17000 novel genes that are expressed in my sample. Can anyone suggest a solution to this problem?



                    • #25
                      fast blast2go

                      We use a program called MPI-BLAST and a huge university supercomputer. The computer has hundreds of nodes, and we downloaded the NCBI nr database. Blasting ~50,000 sequences usually takes less than 24 hours. Then we do blast2go locally, with a downloaded copy of the GO database. The blast2go GUI would freeze a lot when there were many sequences, so we now use the command line version, which annotates very fast... it usually finishes in 12 hours (we just run it overnight). The command line version is called b2g4pipe. A few years ago, this program was free. Blast2GO must have realized how vital it is for large datasets, so now you have to buy a license for it. I have the old version if you are interested. I don't think it's a legal breach, because the version we have came free when you downloaded the b2g software at that time.



                      • #26
                        Originally posted by kashif_nawaz
                        Hi..
                        I have a list of genes from cufflinks from one organism (Oryza sativa indica). All of the genes need to be functionally categorized. I want to do GO and similar analyses as in blast2go. How can I do it? Or can I use any other, faster technique than blast2go? Please advise.
                        If you are trying to annotate rice genes, you can speed up the process by blasting against rice proteins only, not the entire NCBI nr database. NCBI Taxonomy is one way to download all the protein sequences. Here is the link: http://www.ncbi.nlm.nih.gov/protein/?term=txid39946[Organism:noexp]
                        In the upper right-hand corner there will be a small link that says "Send to:". This link will allow you to download all the protein sequences. Then run standalone BLAST, and then run blast2go on the output. Just remember, when using standalone BLAST, you have to flag the command so that it outputs XML format instead of the default BLAST output.
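
For the "flag the command" part, here is a rough sketch of what the two standalone commands might look like, assuming the NCBI BLAST+ suite, where `-outfmt 5` selects XML output. The file names, the database name `rice_proteins`, the e-value cutoff, and the thread count are all placeholders. `blastx` is used because the queries in this thread are transcripts; protein queries would use `blastp` with the same flags.

```python
import shlex


def makeblastdb_cmd(protein_fasta, db_name):
    """Argument list that builds a protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", protein_fasta, "-dbtype", "prot", "-out", db_name]


def blastx_xml_cmd(query_fasta, db_name, out_xml, evalue="1e-5", threads=4):
    """Argument list for a blastx search that writes XML (-outfmt 5)."""
    return [
        "blastx",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", "5",          # 5 = XML instead of the default pairwise report
        "-evalue", evalue,
        "-num_threads", str(threads),
        "-out", out_xml,
    ]


# Placeholder file names; substitute the downloaded rice protein FASTA
# and your own transcript FASTA.
commands = [
    makeblastdb_cmd("rice_proteins.fasta", "rice_proteins"),
    blastx_xml_cmd("transcripts.fasta", "rice_proteins", "blast_results.xml"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

The script only prints the commands. To actually execute them from Python, each list can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is installed.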



                        • #27
                          Originally posted by blindtiger454
                          We use the command line version now, and it can annotate very fast... usually finishes in 12 hours (we just run it overnight.)
                          12 hours actually seems kind of slow for annotation, given pre-computed blast results. Should it not just be a simple table lookup to get the GO references? Or is it doing something more complicated?
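
To make the "simple table lookup" idea concrete, this is roughly what a plain lookup would be, given pre-computed best hits and a protein-to-GO table. It is only an illustration of the lookup itself and makes no claim about what Blast2GO does internally. The function name and the mapping below are made up for the example.

```python
def go_terms_for_queries(best_hit_by_query, go_by_protein):
    """Map each query to the sorted GO IDs recorded for its best BLAST hit."""
    return {
        query: sorted(go_by_protein.get(hit, ()))
        for query, hit in best_hit_by_query.items()
    }


# Hypothetical inputs: hit IDs and their GO assignments are invented.
best_hit_by_query = {"CUFF.1": "protA", "CUFF.2": "protB", "CUFF.3": "protC"}
go_by_protein = {
    "protA": {"GO:0005524", "GO:0016301"},
    "protB": {"GO:0003677"},
}
annotations = go_terms_for_queries(best_hit_by_query, go_by_protein)
print(annotations)
```

With dictionaries held in memory, a lookup like this over tens of thousands of queries is a matter of seconds, which is the comparison being drawn above.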



                          • #28
                            We are usually annotating over 50,000 transcripts. 12 hours is the maximum I've seen. Usually ~50,000 transcripts will run in under 4 hours with command line Blast2GO. We usually just run it overnight, and it has always finished by the time I get back to work.



                            • #29
                              For those without access to huge clusters, PAUDA from Uni. Tübingen looks like a nice solution. In my hands, though, it hasn't worked on databases as big as UniProt (the FASTA is currently something like 12 GB in size).

                              I like the idea of screening out non-plant contaminant contigs using a fast system like this.

                              For an integrated solution Pedant from Biomax is pretty good, but needs a lot of blast results.

                              Some others coming through include "Just annotate my protein sequence" and Trinotate (but I haven't tried these yet).
