  • mscholz
    Member
    • May 2010
    • 13

    Minimus2/nucmer assembly

    Hello,

    I was wondering if anyone has enough experience with Minimus2 to tell me how it handles Ns by default. I am attempting to combine two de novo FASTA assemblies, where one or both contain long stretches of Ns as scaffold gaps. My concern is whether Minimus2 replaces Ns with sequence when a match extends from outside the N region into the N gap.

    Thoughts?
  • konrad98
    Member
    • Jan 2009
    • 17

    #2
    In my experience I have found that minimus2 converts all Ns to As.


    • mscholz
      Member
      • May 2010
      • 13

      #3
      Originally posted by konrad98
      In my experience I have found that minimus2 converts all Ns to As.
      That's...unfortunate for me.

      I had hoped that it would solve the gaps during the run. Does anyone know which program would be responsible for this conversion?


      • Adjuvant
        Member
        • Sep 2010
        • 13

        #4
        Minimus2 as provided doesn't seem to handle N's very well. I found that if I changed the program in TextEdit at line 41 from:
        Code:
        41: $(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
        to:
        Code:
        41: $(BINDIR)/make-consensus_poly -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
        N's that are overlapped by sequence in the query contigs are replaced with sequence, whereas non-overlapped N's and other ambiguity codes are retained. With make-consensus it seemed like the N's were just getting replaced with random bases. That was unsettling, let me tell you...
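For anyone who would rather script the line-41 edit than make it by hand, here is a minimal sketch in Python. It works on the text of the command line quoted above; the function name `use_consensus_poly` is made up for illustration and is not part of AMOS. Reading and writing your own copy of the minimus2 script is left out, since its location depends on your install.

```python
import re

# Matches the make-consensus invocation but not make-consensus_poly,
# so applying the patch twice leaves the text unchanged.
_CONSENSUS_CALL = re.compile(r"(\$\(BINDIR\)/make-consensus)(?![\w-])")


def use_consensus_poly(script_text: str) -> str:
    """Return script_text with make-consensus calls switched to make-consensus_poly.

    Hypothetical helper for illustration only; it edits text and does not
    run or validate the pipeline.
    """
    return _CONSENSUS_CALL.sub(r"\g<1>_poly", script_text)


# The command from line 41 as quoted in this post (line number omitted).
original = "$(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)"
patched = use_consensus_poly(original)
print(patched)
```

Keep a backup of the unmodified script so the two consensus programs can be compared on the same input.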


        • kbushley
          Member
          • Jan 2010
          • 22

          #5
          -w error

          Hi,

          Unnerving indeed! I'm trying to do this but am getting an error with the -w option (it isn't an option for make-consensus_poly)... did you just remove it, and does it seem to be working fine? As you point out, there seems to be very little documentation on the make-consensus_poly algorithm... do you have any idea what it actually does?


          best,

          Kathryn


          • Adjuvant
            Member
            • Sep 2010
            • 13

            #6
            You know, I've been running make-consensus_poly with the -w option and haven't been getting error messages, but going back and looking at the options listed under the -h option, I see that the -w option has disappeared from make-consensus_poly. I can't find any clear explanation of what "wiggle" actually is, and when I reverted back to make-consensus and tried modifying the wiggle value, I found no difference in the outputs. For my data it would appear that loss of the wiggle option doesn't have much impact on the results.

            It appears that make-consensus_poly is able to resolve ambiguity codes (like N's) whereas make-consensus cannot. Here's some example output from running the program to combine 87 contigs of a bacterial genome (produced by alignment to a reference genome) with 424 contigs (produced by de novo assembly of the same reads).

            Stats for the combined fasta file input into minimus2:
            Code:
            Number of Contigs=511, Total bp=12703167, Shortest=52, Longest=568347,
            Average length=24859.4, Average GC%=66.6%, Non-ACGT bases=170454,
            Longest Run of non-ACGT Bases=290, Total non-ACGT bases on contig ends=0,
            Longest Run of Ns=290, Total Ns on contig ends=0
            Stats for the contig output file using minimus2 running the "make-consensus" program:
            Code:
            Number of Contigs=50, Total bp=6427520, Shortest=1519, Longest=447578,
            Average length=128550.4, Average GC%=66.8%, Non-ACGT bases=0
            Stats for the contig output file using minimus2 running the "make-consensus_poly" program:
            Code:
            Number of Contigs=50, Total bp=6564830, Shortest=1519, Longest=458268,
            Average length=131296.6, Average GC%=66.8%, Non-ACGT bases=137659,
            Longest Run of non-ACGT Bases=243, Total non-ACGT bases on contig ends=0,
            Longest Run of Ns=243, Total Ns on contig ends=0
            The singletons files were identical between both runs.

            So it appears that the same number of contigs could be combined either way, but with make-consensus_poly the N's and other ambiguity codes were preserved or, in some cases, replaced with sequence (the total number of non-ACGT bases across singletons and contigs is less than the total in the input file), whereas make-consensus left no non-ACGT bases at all.
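If you want to run the same kind of before/after check on your own files, the following is a rough Python sketch of how stats like the ones above could be computed. It is not the tool used to produce the numbers in this post. The function names are illustrative, and computing GC% over A/C/G/T bases only is an assumption.

```python
import re
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def longest_run(seq, pattern):
    """Length of the longest regex match in seq, or 0 if there is none."""
    return max((len(m.group()) for m in re.finditer(pattern, seq)), default=0)


def fasta_stats(handle):
    """Summarise contig lengths, GC content, and ambiguity codes."""
    lengths = []
    gc = acgt = non_acgt = 0
    longest_non_acgt = longest_n = 0
    for _, seq in read_fasta(handle):
        lengths.append(len(seq))
        n_acgt = sum(seq.count(base) for base in "ACGT")
        acgt += n_acgt
        non_acgt += len(seq) - n_acgt
        gc += seq.count("G") + seq.count("C")
        longest_non_acgt = max(longest_non_acgt, longest_run(seq, r"[^ACGT]+"))
        longest_n = max(longest_n, longest_run(seq, r"N+"))
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "total_bp": total,
        "shortest": min(lengths, default=0),
        "longest": max(lengths, default=0),
        "average_length": total / len(lengths) if lengths else 0.0,
        # Assumption: GC% is taken over unambiguous (ACGT) bases only.
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
        "non_acgt": non_acgt,
        "longest_non_acgt_run": longest_non_acgt,
        "longest_n_run": longest_n,
    }


# Tiny made-up example; replace StringIO(...) with open("your_contigs.fasta").
example = StringIO(">c1\nACGTNNNNACGT\n>c2\nGGCCRYNN\n")
stats = fasta_stats(example)
print(stats)
```

Running it on the input file and on each minimus2 output makes it easy to see whether non-ACGT bases dropped to zero (as with make-consensus above) or were mostly retained.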

            Looking at the stats for the output with make-consensus_poly, I was able to halve the number of contigs and double my average contig length. The total number of bases is still about 1.87x the expected genome size, so there are still going to be some overlaps minimus wasn't able to put together. Otherwise it would be too easy, right?


            • mscholz
              Member
              • May 2010
              • 13

              #7
              Thanks all

              The alteration to make-consensus_poly was all that was needed.

              Now if I could just get rid of nucmer's pesky limitations on bases...


              • kbushley
                Member
                • Jan 2010
                • 22

                #8
                Thanks also, that works! Another question: I'm a little troubled by my results. I first tried nucmer on the two assemblies and got what looks like a nice alignment. When running minimus2, I get a set of output 'contigs' that are roughly the expected size of my genome, and also a set of singletons that are roughly the size of the genome. When I align these singletons back to the output contigs with nucmer, they also seem to align... I tried tweaking some of the nucmer parameters but that didn't help. Any thoughts on what could be causing this, or what to do with all the singletons?


                • 8052
                  Junior Member
                  • May 2010
                  • 2

                  #9
                  Seems the latest make-consensus bundled with AMOS 3.1.0 works well.
                  http://sourceforge.net/projects/amos/files/amos/3.1.0/


                  • mscholz
                    Member
                    • May 2010
                    • 13

                    #10
                    Originally posted by 8052
                    Seems the latest make-consensus bundled with AMOS 3.1.0 works well.
                    http://sourceforge.net/projects/amos/files/amos/3.1.0/
                    Does it work with Ns?

                    I'd love to stop using altered versions of other people's scripts....


                    • 8052
                      Junior Member
                      • May 2010
                      • 2

                      #11
                      Originally posted by mscholz
                      Does it work with Ns?

                      I'd love to stop using altered versions of other people's scripts....
                      They say so in the version history. A new pipeline, minimus2-blat, which uses blat instead of nucmer, is also available in this version.
