  • GS De Novo Assembler (Newbler) -large option for transcriptomes

    Hi all,

    Has anyone ever tried the -large option for de novo assembly of 454 transcriptome data?
    The reason I ask is that the -large flag (flag for large or complex genomes) has no documentation apart from that phrase I just wrote in the parentheses.
    I understand that this is an option for genome assemblies (mostly... only...?), but what is the influence of this flag if one uses it for transcriptomes?

    The question arose when I (out of curiosity) tried the -large flag for a transcriptome assembly (together with the -cdna flag, of course) and observed a significant difference in the size and the composition of the isotigs generated. Nothing significant in their number, but a significant difference in the lengths of the isotigs and in how they have been put together.

    Has anybody got to the bottom of how this flag works?

    Many thanks

  • #2
    Originally posted by cbouyio
    The question arose when I (out of curiosity) tried the -large flag for a transcriptome assembly (together with the -cdna flag, of course) and observed a significant difference in the size and the composition of the isotigs generated. Nothing significant in their number, but a significant difference in the lengths of the isotigs and in how they have been put together.
    I don't have an answer for you, but a question. Do you think your assembly was made better or worse by using the -large option?


    • #3
      -large is supposed to be used for large genome assemblies, which won't finish 'ever' without the -large option set. On occasion, I needed it for transcriptome assemblies, otherwise they would take way too long.

      Generally, one wants to avoid -large, as it shortcuts some steps and can thereby lead to worse results (shorter contigs or more reads marked as repeat, for instance).


      • #4
        Guys, thanks for the replies.

        @kmcarr I cannot give a straight answer to your question, because I cannot tell from the numbers alone which transcriptome assembly was "better". The number, the N50 and the length distribution of the *isotigs* were marginally "better" without the -large option; however, the -large option gave me better resolution for an individual multi-copy gene family that we are after. I need to wait for the PCR amplicons from the wet lab guys to corroborate that, but the indications so far are that for this particular family (which, BTW, contains several repeats) the -large option might give us better resolution.

        @flxlex Both with and without -large the assemblies run relatively fine (about a couple of hours each on a 4-core, 32 GB RAM machine), so finishing the assembly is not an issue for us. However, I take seriously your comment that -large "shortcuts some steps" and marks some reads as repeats, and I'll have a manual look at the .ace files for the protein family we are after. The contig numbers and length distributions, as I mentioned, are not significantly different. So, lacking any other formal way, I'll go with an empirical assessment here and manually check (together with some wet lab confirmation) which option gives us better resolution for the family we are after.

        Thanks again for your replies.
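
        For anyone who wants to repeat the comparison, here is a rough sketch of how the isotig count, N50 and length summary mentioned above could be computed from the two FASTA outputs. It is not part of Newbler; the function names and the file paths in the usage note are my own placeholders, and N50 is taken here as the length of the shortest sequence in the smallest set of longest sequences that covers at least half of the total bases.

```python
from statistics import median
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(lengths: List[int]) -> Dict[str, float]:
    """Collect the summary numbers compared in this thread."""
    if not lengths:
        return {"count": 0, "total": 0, "n50": 0, "median": 0, "longest": 0}
    return {
        "count": len(lengths),
        "total": sum(lengths),
        "n50": n50(lengths),
        "median": median(lengths),
        "longest": max(lengths),
    }
```

        To use it, open the isotig FASTA file from each run (for example `with open("default_run/isotigs.fna") as fh: print(summarize(read_fasta_lengths(fh)))`, where the path is a placeholder) and put the two summaries side by side. As noted above, such numbers alone may not tell you which assembly resolves a particular gene family better.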
