  • Nol
    Junior Member
    • Jul 2011
    • 3

    de novo assembly using Trinity versus Velvet-Oases

    Hi all,

    I have been working on a de novo assembly using Velvet and Oases and I am quite happy with the results. I have more de novo assemblies to work on and I keep looking at the new software that is coming out.
    The Broad Institute developed a novel method for de novo assembly and compared it with other methods in the following paper: http://www.nature.com/nbt/journal/va.../nbt.1883.html.
    I am disappointed that they did not compare their software with Velvet-Oases.

    Has somebody compared the two methods?

    Thanks in advance for your answers

    Nol
  • NeilMcCauley
    Junior Member
    • Aug 2011
    • 2

    #2
    I am also very interested in the comparison, or in any kind of experience with the two. Accuracy is important to me, and computational intensity or RAM requirements are not a problem since we have enough computational resources.

    • Thorondor
      Member
      • Feb 2011
      • 69

      #3
      I assembled a eukaryotic transcriptome with Oases and also with Trinity. As far as I checked, the results are highly similar and it is not easy to say which assembly is strictly better. Trinity produces more, shorter transcripts because it does not do scaffolding (e.g. inserting Ns into the sequences, as Oases does), and at least the version I used only supported a k-mer size of 25 (at the moment their website still states that this is the only possible value if you want to run Inchworm, Chrysalis and Butterfly).
      Trinity also predicts fewer splice variants, at least with the default edge-thr value. Some highly similar genes seem to be resolved by Oases with higher k-mers but not by Trinity, but Trinity also assembles some transcripts better. :-/

      So, as always, there is no clearly better assembly. A de novo assembly with Oases and Trinity of something for which a good reference is available would maybe yield some clues. ;-)
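
      To put rough numbers on the observations about transcript length and N-scaffolding, one can tally a few basic statistics per assembly. The following is only a minimal Python sketch, assuming each assembler's output is a plain FASTA file; the function names and the toy input are made up for illustration, and N50 is computed in the usual way (the length at which the running sum of sorted lengths reaches half the total).

```python
from io import StringIO


def read_fasta_lengths(handle):
    """Yield (length, n_count) for each record in a FASTA stream."""
    length = n_count = 0
    seen_header = False
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if seen_header:
                yield length, n_count
            seen_header = True
            length = n_count = 0
        elif line and seen_header:
            length += len(line)
            n_count += line.upper().count("N")
    if seen_header:
        yield length, n_count


def summarize(handle):
    """Return transcript count, total bases, N50, and number of records containing Ns."""
    records = list(read_fasta_lengths(handle))
    lengths = sorted((length for length, _ in records), reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "transcripts": len(lengths),
        "total_bp": total,
        "n50": n50,
        "with_N": sum(1 for _, n in records if n > 0),
    }


# Toy input standing in for a real transcripts FASTA file (hypothetical data).
toy = StringIO(">t1\nACGTACGTAC\n>t2\nACGTNNNNACGTACGTACGT\n>t3\nACGT\n")
print(summarize(toy))
```

      Running `summarize` on the Oases and the Trinity output side by side gives a first impression of the differences, but it says nothing about correctness; for that, a reference is still needed.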

      • NeilMcCauley
        Junior Member
        • Aug 2011
        • 2

        #4
        Thanks for your comprehensive answer! What kind of sequence data did you use (Illumina or 454)?
        I have raw 454 sequence data and I was wondering whether Oases can produce good assemblies with 454 data. According to this paper, Comparing de novo assemblers for 454 transcriptome data, short-read assemblers are not very suitable for 454 since they require high and even coverage depths. Indeed, I did a preliminary run with Oases and it gave me very short contigs.

        Now I'm trying Trinity.

        • Thorondor
          Member
          • Feb 2011
          • 69

          #5
          I am working with Illumina reads (PE 100bp).

          Well, the default minimum k-mer coverage is 1 for Trinity and 3 for Oases, so maybe results will be better with Trinity. But of course a lot depends on your expected coverage.
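
          How expected coverage relates to a minimum k-mer coverage can be made concrete with the relation given in the Velvet manual, Ck = C * (L - k + 1) / L, where C is the base coverage, L the read length and k the k-mer size. A minimal Python sketch under that assumption; the function names are hypothetical, and the cutoffs 1 and 3 are simply the defaults quoted in this post.

```python
def kmer_coverage(base_coverage, read_length, k):
    """k-mer coverage from base coverage: Ck = C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


def base_coverage_needed(min_kmer_coverage, read_length, k):
    """Invert the relation: base coverage required to reach a k-mer coverage cutoff."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return min_kmer_coverage * read_length / (read_length - k + 1)


# 100 bp reads and k = 25, as discussed in this thread.
for cutoff in (1, 3):
    needed = base_coverage_needed(cutoff, read_length=100, k=25)
    print(f"cutoff {cutoff}: base coverage of at least {needed:.2f}x")
```

          Shorter reads or larger k shrink the factor (L - k + 1) / L, so a weakly expressed transcript drops below a fixed cutoff sooner.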

          • hiddenrisk
            Junior Member
            • Sep 2011
            • 7

            #6
            Trinity questions...

            Since we are talking about Trinity here, does anyone know the answers to the following questions:

            1) What, explicitly, is a "full-length transcript"? According to the second sentence of their paper, it is a "...complete and contiguous mRNA sequence from the transcription start site to the transcription end...". However, I was wondering how they were able to demonstrate this. It seems to me entirely possible that, with a single k-mer size of 25, it might collapse predicted transcripts where there are repeat regions, so that although it returns a contiguous piece from start to stop codon, it might not really be a complete transcript.

            2) Are these not predicted transcripts? They don't refer to them as such, but at least for the whitefly stuff I didn't see any biological verification of the predicted transcript sequences....

            3) What organism(s) do the "all reference protein-coding sequences that are reconstructable to full length given the read set" (p 647, left column, 3rd sentence under the header "Sensitivity limit for full-length reconstruction") come from? It sounded to me, from the paper, that they used Schizosaccharomyces pombe as their reference organism... if this is true, and this is the Oracle set with which they determined the working parameters of their program, didn't they basically optimize their program to run best with Illumina reads from fission yeast?
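
            The concern in question 1 can be illustrated with a toy case: two unrelated sequences that share a stretch at least k long also share k-mers, and therefore nodes in a de Bruijn graph. This is only a minimal Python sketch with made-up sequences and hypothetical function names; it is not how Trinity or Oases are implemented, and real assemblers apply further heuristics before merging or splitting paths.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(a, b, k):
    """k-mers present in both sequences; each would be a shared graph node."""
    return kmers(a, k) & kmers(b, k)


# Toy data (made up): a 30 bp low-complexity repeat embedded in two
# otherwise unrelated transcripts.
REPEAT = "ACCACAAACCCACACCAAACACCCAACACA"
transcript_a = "G" * 20 + REPEAT + "G" * 20
transcript_b = "T" * 20 + REPEAT + "T" * 20

for k in (25, 31):
    n = len(shared_kmers(transcript_a, transcript_b, k))
    print(f"k={k}: {n} shared k-mers")
```

            In this toy case a k-mer longer than the shared stretch removes the overlap entirely, while k = 25 leaves the two sequences connected through the repeat.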

            • enkia
              Junior Member
              • Feb 2012
              • 7

              #7
              I thought I would revive this thread to see if anyone has any more recent input on the comparison between Velvet and Trinity.

              I am working with an RNA data set that presumably has a mixture of viruses present in it and am looking to assemble them. For one of the viruses, I have a reference genome to assemble against; the others are novel and will need to be assembled de novo.

              Another question is whether either of these programs is better able to handle contigs with very different copy numbers. Based on some preliminary dsRNA sequencing, one virus is present at about 25-fold higher levels than the other ones.

              • ians
                Member
                • Aug 2011
                • 53

                #8
                Since the main discussion, Oases-M was released. The authors did a direct comparison of Oases, Trinity, Trans-ABySS, and Cufflinks.

                • schalivendra
                  Junior Member
                  • Mar 2013
                  • 2

                  #9
                  Hi,

                  I am using Oases to assemble a eukaryote transcriptome from Illumina reads using different k-mer values. However, I am getting the same stats (abyss-fac output: N50, min, max, median, total number of contigs) for all the k-mers tested. This is the script I am using (xx stands for the k-mer value):
                  velveth transcripts_kxx xx -short -fastq Inputfile1.fastq Inputfile2.fastq Inputfile3.fastq Inputfile4.fastq

                  velvetg transcripts_kxx -read_trkg yes

                  oases transcripts_kxx

                  I would appreciate knowing whether there is anything wrong with the script.

                  thank you very much,
                  Subbaiah Chalivendra
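
                  For a k-mer sweep like this, each k needs its own output directory, and velveth takes the directory and the hash length as two separate arguments. The following is a minimal Python sketch that only builds and prints the three commands per k without running them; the function name, the directory prefix and the list of k values are assumptions for illustration, and rejecting even k is a choice made here because the Velvet manual asks for an odd hash length.

```python
import shlex


def sweep_commands(k_values, reads, prefix="transcripts_k"):
    """Return, for each k, the velveth, velvetg and oases commands as argv lists."""
    plan = []
    for k in k_values:
        if k % 2 == 0:
            raise ValueError(f"hash length should be odd, got {k}")
        out_dir = f"{prefix}{k}"  # one directory per k so results do not overwrite each other
        plan.append([
            ["velveth", out_dir, str(k), "-short", "-fastq", *reads],
            ["velvetg", out_dir, "-read_trkg", "yes"],
            ["oases", out_dir],
        ])
    return plan


# Dry run: print the commands instead of executing them.
for commands in sweep_commands([21, 25, 31], ["Inputfile1.fastq", "Inputfile2.fastq"]):
    for argv in commands:
        print(shlex.join(argv))
```

                  Oases writes its result to transcripts.fa inside each directory, so one thing worth checking when all k values give identical numbers is that the statistics are computed on each directory's own transcripts.fa rather than on the same file every time.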
