  • Velvet and Oases - choice of k value?

    Hi all,
    I am using Velvet and Oases to assemble a transcriptome (de novo - no genome available) from unpaired short reads. The k-mer value which yields the highest average length of the transcripts which constitute the final output from Oases is different from the k-mer value which yields the highest N50 value for the contig lengths. In that case, which k-mer value should I choose and why?

    Related, the kmer value yielding the highest average length of transcripts also yields more blast results than the kmer value yielding highest N50. For annotation purposes, it would seem more blast results = better choice. Is there a potential complication in using the kmer value yielding highest average length over the kmer value yielding highest N50?

    Lastly, if I want to combine assemblies for different k-mer values using Vmatch software, do I use the contigs output by Velvet or the transcripts output by Oases? Which would be more appropriate?

    Is there publicly available software to assemble the transcripts output by Oases for different k-mer values?

    As always, thanks so much for all of your help!
    Cheers,
    Mikey

  • #2
    We are working on a tool, GAM http://services.appliedgenomics.org/software/gam/, that does that.
    At the moment it is Sanger-based, but we are close to an NGS release. In the first version it will merge different assemblies (from different tools, or from the same tool with different parameters, e.g. k-mers) for the same set of reads.

    Best,
    Simone


    • #3
      Is there a potential complication in using the kmer value yielding highest average length over the kmer value yielding highest N50?
      No, it looks like you have found a metric that works for you. N50 is not the last word in assemblies.
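To compare the two metrics side by side for each k-mer value, here is a minimal sketch in plain Python. It assumes an ordinary FASTA file such as the transcripts file written by Oases; the function names are illustrative and not part of Velvet or Oases.

```python
def read_fasta_lengths(path):
    """Return the length of every sequence in a FASTA file, in file order."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths (0.0 for empty input)."""
    return sum(lengths) / len(lengths) if lengths else 0.0
```

Running both functions on the output of each k-mer value makes it easy to see where the two metrics disagree. Mean length is pulled down by many short sequences, whereas N50 is dominated by the longest ones.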

      Lastly, if I want to combine assemblies for different k-mer values using Vmatch software, do I use the contigs output by Velvet or the transcripts output by Oases? Which would be more appropriate?
      This approach is fairly innocuous (it just clusters sequences with 100% one-sided overlap), so you can run it on the transcripts.
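To illustrate what "100% one-sided overlap" means in practice, here is a naive sketch that drops any sequence fully contained in a longer one, on either strand. This is not Vmatch and does not reproduce its algorithm or options; it is a quadratic-time illustration, and the function names are made up for this example.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def drop_contained(seqs):
    """Return only the sequences that are not an exact substring of a kept one.

    seqs maps a name to a DNA string. Sequences are visited from longest to
    shortest, so a shorter sequence is always compared against the longer
    sequences already kept. Exact duplicates collapse to the first name in
    sorted order. Returned sequences are uppercased.
    """
    kept = []
    ordered = sorted(seqs.items(), key=lambda item: (-len(item[1]), item[0]))
    for name, seq in ordered:
        forward = seq.upper()
        reverse = reverse_complement(forward)
        if any(forward in other or reverse in other for _, other in kept):
            continue  # fully contained in a kept sequence
        kept.append((name, forward))
    return dict(kept)
```

Because only fully contained sequences are removed, no new sequence is created and nothing is joined, which is why the step is low-risk when applied to transcripts pooled from several k-mer values.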
      --
      Jeremy Leipzig
      Bioinformatics Programmer


      • #4
        I can't help with your first few questions yet, as I have only just started using Velvet/Oases. As for publicly available software to assemble the transcripts output for different k-mers, you can use CAP3. It is a very user-friendly tool that runs as a one-line command.
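Before passing several assemblies to a tool like CAP3, the per-k FASTA files usually need to be pooled into one file with unique sequence names, since different runs can reuse the same identifiers. Below is a minimal sketch of that preparation step; the labels and the function name are hypothetical and not part of CAP3 or Oases.

```python
def merge_fastas(inputs, out_path):
    """Concatenate FASTA files, prefixing every header with a label.

    inputs maps a label (for example "k21") to a FASTA path. Each header
    ">name" is rewritten as ">label_name" so names stay unique across files.
    Returns the number of sequences written.
    """
    count = 0
    with open(out_path, "w") as out:
        for label, path in inputs.items():
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        out.write(f">{label}_{line[1:].rstrip()}\n")
                        count += 1
                    elif line.strip():
                        out.write(line.rstrip() + "\n")
    return count
```

The pooled file can then be given to the assembler as its single input. CAP3 is normally run with the FASTA file as its first argument, but check its documentation for the exact options in your version.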



        • #5
          Oases comes with its own method for doing this. Why not use that?



          • #6
            Velvet works properly on my system, but after installing Oases I am not able to run it. Every time I try, I get this error:
            molbio@molbio-System-Product-Name[oases_0.2.8] oases --help [10:59AM]
            zsh: permission denied: oases
            molbio@molbio-System-Product-Name[oases_0.2.8] oases [11:18AM]
            zsh: permission denied: oases
            molbio@molbio-System-Product-Name[oases_0.2.8]
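The thread does not diagnose this error. As a starting point, here is a small sketch that checks a few common causes of "permission denied" when running a command: the name may refer to a directory rather than a compiled binary, or the file may lack its execute bit. That either cause applies here is an assumption, and the helper names are illustrative.

```python
import os
import stat


def diagnose_executable(path):
    """Classify a path as 'missing', 'directory', 'not executable' or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


def make_executable(path):
    """Add execute permission for user, group and others, keeping other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

A result of "directory" or "missing" would suggest that the program has not been built yet, or that the shell is resolving the name to something other than the compiled binary.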
