
  • De Novo assembly of a plant transcriptome

    Hello all,

    I work in bioinformatics in a lab, and our group has been given the task of assembling a plant transcriptome from 454 sequences.

    The genome has not been sequenced yet, so we are now defining our strategy for this project.

    The first problem is related to the assembly of the sequences.

    We used Newbler and MIRA for the assembly, and now we lack metrics to compare these two programs and decide which one is better and why.

    After the assembly, I think the next obvious step is to align the transcriptome to the Arabidopsis genome and maybe use that to decide which assembly is better and why.

    Another question would be about which aligner we should use for this.

    Does anyone have a suggestion or any experience to share about this project?

    Thanks for your help.
    Last edited by raonyguimaraes; 05-10-2011, 10:37 AM.

  • #2
    Regarding the comparison of assemblers, here is a paper that might help:
    Comparing de novo assemblers for 454 transcriptome data
    Transcriptome assemblies are smaller than genome assemblies and thus should be more computationally tractable, but are often harder because individual contigs can have highly variable read coverage. Comparing single assemblers, Newbler 2.5 performed best on our trial data set, but other assemblers w …

    I used Newbler 2.5 to assemble my data, and the result looks quite good.



    • #3
      Thanks a lot !

      I also found this website, http://www.plantagora.org/, which lists a lot of metrics I can use to evaluate my assemblies.

      Check it out:



      The following metrics were gathered as part of this project: total number of contigs, contig N50, total contig length, average contig length, largest contig length, contigs > 1kb, contigs > 5kb, number of scaffolds, total scaffold length, average scaffold length, largest scaffold length, and scaffold N50.
      Looks promising...
      Last edited by raonyguimaraes; 05-10-2011, 10:34 AM.
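
The contig-level metrics listed above (number of contigs, total and average length, largest contig, contigs > 1kb and > 5kb, N50) can be computed directly from an assembler's FASTA output. The following is a minimal sketch, not the Plantagora project's own code; the function names are made up for illustration, and N50 is taken here as the length of the contig at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total assembly length.

```python
def read_fasta_lengths(lines):
    """Yield the length of each sequence found in FASTA-formatted lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Contig length at which the longest-first running sum reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def contig_metrics(lengths):
    """Summarise a collection of contig lengths with the metrics named in the post."""
    lengths = list(lengths)
    total = sum(lengths)
    return {
        "num_contigs": len(lengths),
        "total_length": total,
        "average_length": total / len(lengths) if lengths else 0.0,
        "largest_contig": max(lengths, default=0),
        "n50": n50(lengths),
        "contigs_over_1kb": sum(1 for n in lengths if n > 1000),
        "contigs_over_5kb": sum(1 for n in lengths if n > 5000),
    }
```

To compare two assemblies, one could open each contig file (for example a hypothetical `newbler_contigs.fasta` and `mira_contigs.fasta`), pass the file handle to `read_fasta_lengths`, and print the two `contig_metrics` dictionaries side by side. For a transcriptome, larger values are not automatically better, since chimeric contigs also inflate lengths.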



      • #4
        Hi raon,
        I've been doing de novo plant transcriptome assembly as well (although with Illumina reads) and have had good luck with Trinity: http://trinityrnaseq.sourceforge.net/

        Also, PlantGDB has a lot of resources (such as a download of all known plant protein sequences).



        • #5
          MG1655, may I ask whether your data came from a normalized or a non-normalized cDNA library?
          Thanks!



          • #6
            Originally posted by Jenzo
            MG1655, may I ask whether your data came from a normalized or a non-normalized cDNA library?
            Thanks!
            Hi Jenzo, our cDNA was non-normalized because we also wanted to look at differential expression.



            • #7
              Aligning your de novo assembled transcriptome to a genomic reference can help determine which assembly is better. Presumably, the plant you are sequencing is closely related to Arabidopsis?

              Regardless, BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start) is a good tool for spliced alignment of assembled transcripts to a genome.
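
One way to turn such alignments into a number for comparing assemblies is to ask what fraction of each assembly's contigs align over most of their length. The sketch below assumes BLAT was run with its default PSL output, where column 1 is `matches`, column 3 is `repMatches`, column 10 is `qName` and column 11 is `qSize`. The function names and the 0.5 coverage cutoff are illustrative assumptions, not values recommended anywhere in this thread.

```python
def best_query_coverage(psl_lines):
    """Map each query (contig) name to its best aligned fraction in PSL rows."""
    best = {}
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header and anything that is not an alignment row.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        rep_matches = int(fields[2])
        q_name = fields[9]
        q_size = int(fields[10])
        coverage = (matches + rep_matches) / q_size if q_size else 0.0
        # Keep only the best-covering alignment for each contig.
        best[q_name] = max(best.get(q_name, 0.0), coverage)
    return best


def fraction_aligned(psl_lines, total_contigs, min_coverage=0.5):
    """Fraction of all contigs whose best alignment covers at least min_coverage."""
    best = best_query_coverage(psl_lines)
    hits = sum(1 for cov in best.values() if cov >= min_coverage)
    return hits / total_contigs if total_contigs else 0.0
```

Running `fraction_aligned` on the PSL file of each assembly, with `total_contigs` set to that assembly's contig count, gives one comparable figure per assembler. Against a genome from a different species the absolute values will likely be low, so only the relative difference between assemblies is informative.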

              Originally posted by raonyguimaraes
              Hello all,

              After the assembly, I think the next obvious step is to align the transcriptome to the Arabidopsis genome and maybe use that to decide which assembly is better and why.

              Another question would be about which aligner we should use for this.



              • #8
                de novo assembled transcriptome to a genomic reference

                I am also doing a de novo assembly of a plant transcriptome. I used ABySS and CLC Bio. One thing I did after assembling the transcriptome was to run a legacy BLAST search. Not all contigs will necessarily be plant: some could come from fungi, bacteria, or viruses. If you only BLAST against Arabidopsis, you might get a hit for a highly conserved gene whose true top hit is not a plant but a bacterium, so searching only against Arabidopsis does not let you screen out contamination.

                My pipeline is the following:

                blastn against the entire NCBI database (locally)
                |
                remove non-plant contigs - keep those whose top 10 hits were to plants and those that had no hit
                |
                blastx
                |
                with the final set of contigs, annotate using closest hit in blast search.
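
The contamination-filtering step of the pipeline above can be sketched as follows. This is an illustration under stated assumptions rather than the poster's actual script: it assumes the BLAST hits for each contig have already been reduced to an ordered list of kingdom labels (best hit first), that plants are labelled `Viridiplantae`, and that "top 10 hits to plants" means all of the top 10 hits are plants. How the labels are obtained from the BLAST output is left out.

```python
PLANT_KINGDOM = "Viridiplantae"  # assumed label for plant hits


def is_plant_or_unknown(hit_kingdoms, top_n=10):
    """True if every one of the top_n hits is a plant, or if there are no hits."""
    top = hit_kingdoms[:top_n]
    # all() over an empty list is True, which keeps contigs with no hit.
    return all(kingdom == PLANT_KINGDOM for kingdom in top)


def filter_contigs(contig_ids, hits_by_contig, top_n=10):
    """Return the contig IDs that pass the plant-or-no-hit rule, in input order."""
    kept = []
    for contig_id in contig_ids:
        hits = hits_by_contig.get(contig_id, [])
        if is_plant_or_unknown(hits, top_n):
            kept.append(contig_id)
    return kept
```

The contigs returned by `filter_contigs` would then go on to the blastx and annotation steps. A looser rule, such as requiring only the single best hit to be a plant, can be tried by setting `top_n=1`.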

