
  • plant annotation pipeline for assembled RNAseq?

    Dear all,
    I don't work much with protein-coding genes, so I wanted to ask: I already have a large number of assembled RNA sequences (ESTs) from Illumina. Is there a pipeline for automatically annotating sequences by homology (for example, against Arabidopsis proteins with blastx), or should I write my own script?


    Thanks

    PS
    The new Illumina sequences are from a plant whose genome has not been sequenced.
    ------------
    SMART - bioinfo.uni-plovdiv.bg

  • #2
    Blast2GO is one solution.



    • #3
      Thanks it looks very interesting!
      ------------
      SMART - bioinfo.uni-plovdiv.bg



      • #4
        Hey vebaev! I am in the same situation these days: I have to analyse and annotate assembled RNA sequences. What strategy did you use in your case? Could you please give some details, or point to a paper or pipeline that is standard in the scientific community for annotating and characterising a transcriptome assembly? Of course, my organism is also a non-model plant.

        Thanks in advance



        • #5
          Hi,
          I used Blast2GO. It is quite a cute program...
          ------------
          SMART - bioinfo.uni-plovdiv.bg



          • #6
            Plant classification pipeline

            Hi,

            You might want to try our Mercator Plant classification pipeline. It will also create some kind of annotation.

            http://mapman.gabipd.org/web/guest/mercator.


            Cheers,
            bj



            • #7
              I suggest you search against the NCBI protein database first. We did a plant transcriptome assembly, and a small percentage (~5%) of our assembled and annotated transcripts were contaminants, such as bacterial, viral, and fungal sequences. They were probably metagenomic leftovers, but the only way to catch them is to use the NCBI NR database and look at the taxonomy of each best BLAST hit. If you use a plant-only database, some of the contaminant sequences will match a plant protein with some similarity and will be annotated as such. Then, when the RNA-Seq analysis is performed, these contaminant sequences can show significant differential expression if they have not been removed from the dataset; I have seen this happen.
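
The taxonomy check described above could be sketched roughly as follows. This is only an illustration: it assumes you have already looked up a lineage string for the best NR hit of each transcript (the lookup itself is not shown), and the function name `split_by_lineage`, the input layout, and the example contig names are hypothetical.

```python
from typing import Iterable, List, Tuple


def split_by_lineage(
    best_hits: Iterable[Tuple[str, str]],
    expected_clade: str = "Viridiplantae",
) -> Tuple[List[str], List[str]]:
    """Separate transcripts whose best hit lies in the expected clade.

    best_hits holds (transcript_id, lineage) pairs, where lineage is a
    semicolon-separated taxonomy string for the best hit (assumed input).
    """
    keep, flagged = [], []
    for transcript_id, lineage in best_hits:
        clades = [clade.strip() for clade in lineage.split(";")]
        if expected_clade in clades:
            keep.append(transcript_id)
        else:
            # Best hit is outside the plant clade: possible contaminant.
            flagged.append(transcript_id)
    return keep, flagged


# Made-up example data, not real results.
hits = [
    ("contig_1", "Eukaryota; Viridiplantae; Streptophyta"),
    ("contig_2", "Bacteria; Proteobacteria"),
    ("contig_3", "Eukaryota; Fungi; Ascomycota"),
]
plant, suspect = split_by_lineage(hits)
print(plant)    # ['contig_1']
print(suspect)  # ['contig_2', 'contig_3']
```

Flagged transcripts are only candidates; whether to drop them is a judgment call that depends on the quality of the hit.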



              • #8
                But I would again like to advise caution about the similarity parameters. We generally require an E-value < 1e-10, an alignment length of at least 100, and identity above 80%. If you chose the same or similar thresholds and still got 5% contaminants, we have to be careful.
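
A minimal sketch of applying these thresholds to BLAST tabular output, assuming the default 12-column `-outfmt 6` layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names and the example rows are made up for illustration, and reading "alignment length 100" as a minimum is an assumption.

```python
import csv
import io


def passes(row, max_evalue=1e-10, min_length=100, min_identity=80.0):
    """Check one outfmt 6 row against the three thresholds."""
    pident = float(row[2])    # column 3: percent identity
    length = int(row[3])      # column 4: alignment length
    evalue = float(row[10])   # column 11: E-value
    return evalue < max_evalue and length >= min_length and pident > min_identity


def filter_hits(handle, **thresholds):
    """Return the rows of a tab-separated BLAST table that pass."""
    reader = csv.reader(handle, delimiter="\t")
    return [row for row in reader if row and passes(row, **thresholds)]


# Made-up rows: only contig_1 satisfies all three thresholds.
example = (
    "contig_1\tprotA\t92.5\t150\t11\t0\t1\t450\t1\t150\t1e-50\t300\n"
    "contig_2\tprotB\t65.0\t200\t70\t0\t1\t600\t1\t200\t1e-20\t120\n"
    "contig_3\tprotC\t95.0\t40\t2\t0\t1\t120\t1\t40\t1e-12\t80\n"
)
kept = filter_hits(io.StringIO(example))
print([row[0] for row in kept])  # ['contig_1']
```

For a real file, pass an open file handle instead of the `io.StringIO` example.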



                • #9
                  I am in a similar situation, working with RNA-seq results from a non-model species. I am using Blast2GO for annotation, but it is time-consuming given the amount of RNA-seq data. Is there any alternative? Thanks, SH



                  • #10
                    Dear shengandy

                    If you really were only interested in speed, you could use simple BLAST (with a sprinkle of reciprocal best BLAST) against a related, well-annotated species. I would not really recommend it, though; spending more time to get something decent usually pays off.

                    And usually the annotation pipelines don't take that long if you have some decent infrastructure. (If not, maybe you could get in touch with some local groups?)

                    Cheers,
                    Bj
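
The "reciprocal best BLAST" idea mentioned in this post can be sketched as below: a transcript and a reference protein are paired only if each is the other's top-scoring hit. This is a rough illustration that assumes the hits from both search directions are already available as (query, subject, bitscore) tuples; the function names and example identifiers are hypothetical.

```python
def best_hits(rows):
    """Map each query to its highest-bitscore subject."""
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows):
    """Return (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_rows)   # transcripts vs. reference proteins
    reverse = best_hits(reverse_rows)   # reference proteins vs. transcripts
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


# Made-up example hits, not real results.
forward = [
    ("contig_1", "ref_A", 250.0),
    ("contig_1", "ref_B", 90.0),
    ("contig_2", "ref_A", 120.0),
]
reverse = [
    ("ref_A", "contig_1", 245.0),
    ("ref_B", "contig_1", 88.0),
]
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)  # [('contig_1', 'ref_A')]
```

Here contig_2 is dropped because ref_A's own best hit is contig_1, which is the kind of one-directional match the reciprocal step is meant to filter out.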



                    • #11
                      Dear Usad,

                      Thanks for your response. I do care about the quality; that is why I did not choose to BLAST against a particular genome. I have about 2 million sequences from a de novo assembly. It will take about 20 days if everything goes well and it doesn't run out of memory. Is this typical for annotation? Thanks -SH
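
One common way to shorten a run like this, if several machines or cores are available, is to split the assembly into chunks and search them in parallel. This is not something proposed in the thread, only a rough sketch of the splitting step; `read_fasta` and `chunk_records` are hypothetical helper names, and the chunk size is arbitrary.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records, chunk_size: int) -> Iterator[List[Tuple[str, str]]]:
    """Group records into lists of at most chunk_size entries."""
    chunk = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Made-up example sequences.
fasta = [">c1", "ATGC", "GGTA", ">c2", "TTAA", ">c3", "CCGG"]
chunks = list(chunk_records(read_fasta(fasta), chunk_size=2))
print(len(chunks))  # 2
```

Each chunk could then be written to its own file and searched as a separate job; the per-chunk results are simply concatenated afterwards.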



                      • #12
                        Hello!
                        I am trying to install the MAKER software, but it has a large number of dependencies. Can anyone help me with that, or suggest other tools that can identify transposons, SSRs, and other gene features?



                        • #13
                          Hi,

                          I use both Blast2GO and blastx against nr to annotate RNA-seq data. One thing I would like to share is that blastx, whether standalone or via Blast2GO, takes a huge amount of time if you run it on the NCBI/EBI servers, because of their load restrictions. So if you need quick results, you should definitely look for other resources for BLAST/InterProScan. Installing the GO database locally can also improve the speed dramatically.

                          Best regards,
                          Douglas



                          • #14
                            Hi,
                            I have a list of genes from Cufflinks for one organism (Oryza sativa indica). All of the genes need to be functionally categorised. I want to get GO terms and similar annotations with Blast2GO. How can I do it? Or is there a faster technique than Blast2GO? Please advise.



                            • #15
                              Your question is not clear to me. If you want to use Blast2GO, install it, give your genes as input, and then obtain the GO terms by running the BLAST step first, then mapping, then annotation. However, there is a quicker way to get GO terms if your genes are mapped to UniProt IDs. If you already have UniProt IDs, or can get them by mapping your genes to UniProt, go to the Retrieve tab of UniProt and upload the IDs; you will get all the GO terms, as well as the GO slim, for all the genes.
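
Once the table has been downloaded from UniProt, turning it into a gene-to-GO lookup is a small parsing step. The sketch below assumes a tab-separated export with an `Entry` column and a `Gene Ontology IDs` column whose values are separated by semicolons; check these column names against your own download, since they are an assumption here, as are the function name and the placeholder IDs.

```python
import csv
import io


def go_terms_by_entry(handle, id_column="Entry", go_column="Gene Ontology IDs"):
    """Build a dict mapping each entry ID to its list of GO term IDs."""
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        raw = row.get(go_column) or ""
        # Entries without GO annotation get an empty list.
        mapping[row[id_column]] = [term.strip() for term in raw.split(";") if term.strip()]
    return mapping


# Made-up table with placeholder IDs, not real UniProt records.
example = (
    "Entry\tGene Ontology IDs\n"
    "ID_A\tGO:0005634; GO:0003677\n"
    "ID_B\t\n"
)
go_map = go_terms_by_entry(io.StringIO(example))
print(go_map["ID_A"])  # ['GO:0005634', 'GO:0003677']
print(go_map["ID_B"])  # []
```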
