  • Minimus2/nucmer assembly

    Hello,

    I was wondering if anyone had enough experience with Minimus2 to tell me how it handles Ns by default. I am attempting to combine two de novo assemblies in FASTA format, where one or both contain long stretches of Ns as scaffold gaps. My concern is whether minimus2 replaces Ns with sequence when a match stretches from outside the N region into the N gap.

    Thoughts?

  • #2
    In my experience I have found that minimus2 converts all Ns to As.


    • #3
      Originally posted by konrad98:
      In my experience I have found that minimus2 converts all Ns to As.
      That's...unfortunate for me.

      I had hoped that it would solve the gaps during the run. Does anyone know which program would be responsible for this conversion?


      • #4
        Minimus2 as provided doesn't seem to handle N's very well. I found that if I changed the program in TextEdit at line 41 from:
        Code:
        41: $(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
        to:
        Code:
        41: $(BINDIR)/make-consensus_poly -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
        N's that are overlapped by sequence in the query contigs will be replaced with sequence, whereas non-overlapped N's and other ambiguity codes are retained. With make-consensus it seems like the N's were just getting replaced with random bases. That was unsettling, let me tell you...
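For anyone who would rather not hand-edit the script, the same one-line change can be sketched in Python. This is only an illustration of the edit described above, not part of AMOS: the function name `patch_minimus2` and the `drop_wiggle` switch are made up for this example. It assumes the consensus step looks like the line quoted above, and it is meant to be applied to a copy of the script text rather than the installed file.

```python
import re

# Matches the bare make-consensus call, but not make-consensus_poly.
CONSENSUS_CALL = re.compile(r"(\$\(BINDIR\)/make-consensus)(?![\w-])")
# Matches the wiggle argument as it appears in the line quoted above.
WIGGLE_ARG = re.compile(r"\s+-w\s+\$\(WIGGLE\)")


def patch_minimus2(script_text, drop_wiggle=False):
    """Return script_text with make-consensus swapped for make-consensus_poly.

    drop_wiggle is a hypothetical convenience switch: when True, the
    "-w $(WIGGLE)" argument is also removed from the patched line.
    """
    patched = []
    for line in script_text.splitlines(keepends=True):
        line = CONSENSUS_CALL.sub(r"\g<1>_poly", line)
        if drop_wiggle and "make-consensus_poly" in line:
            line = WIGGLE_ARG.sub("", line)
        patched.append(line)
    return "".join(patched)


# Demo on the line quoted in this post.
sample = "41: $(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)\n"
print(patch_minimus2(sample), end="")
print(patch_minimus2(sample, drop_wiggle=True), end="")
```

Running the function twice is harmless, because a line that already calls make-consensus_poly no longer matches the pattern.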


        • #5
          -w error

          Hi,

          Unnerving indeed! I'm trying to do this but I'm getting an error with the -w option (it is reported as not valid for make-consensus_poly). Did you just remove it, and does it seem to be working fine? As you point out, there seems to be very little documentation of what the make-consensus_poly algorithm does. Do you have any idea what it actually does?


          best,

          Kathryn


          • #6
            You know, I've been running make-consensus_poly with the -w option and haven't been getting error messages, but going back and looking at the options listed under the -h option, I see that the -w option has disappeared from make-consensus_poly. I can't find any clear explanation of what "wiggle" actually is, and when I reverted back to make-consensus and tried modifying the wiggle value, I found no difference in the outputs. For my data it would appear that loss of the wiggle option doesn't have much impact on the results.

            It appears that make-consensus_poly is able to resolve ambiguity codes (like N's) whereas make-consensus cannot. Here's some example output from a run combining 87 contigs of a bacterial genome, produced by alignment to a reference genome, with 424 contigs produced by de novo assembly of the same reads.

            Stats for the combined fasta file input into minimus2:
            Code:
            Number of Contigs=511, Total bp=12703167, Shortest=52, Longest=568347,
            Average length=24859.4, Average GC%=66.6%, Non-ACGT bases=170454,
            Longest Run of non-ACGT Bases=290, Total non-ACGT bases on contig ends=0,
            Longest Run of Ns=290, Total Ns on contig ends=0
            Stats for the contig output file using minimus2 running the "make-consensus" program:
            Code:
            Number of Contigs=50, Total bp=6427520, Shortest=1519, Longest=447578,
            Average length=128550.4, Average GC%=66.8%, Non-ACGT bases=0
            Stats for the contig output file using minimus2 running the "make-consensus_poly" program:
            Code:
            Number of Contigs=50, Total bp=6564830, Shortest=1519, Longest=458268,
            Average length=131296.6, Average GC%=66.8%, Non-ACGT bases=137659,
            Longest Run of non-ACGT Bases=243, Total non-ACGT bases on contig ends=0,
            Longest Run of Ns=243, Total Ns on contig ends=0
            The singletons files were identical between both runs.

            So it appears that the same number of contigs could be combined either way, but when make-consensus_poly was run instead of make-consensus, the N's and other ambiguity codes were preserved or, in some cases, replaced with sequence (the total number of non-ACGT bases across singletons and contigs is less than the total number in the input file).

            Looking at the stats for the output with make-consensus_poly, I was able to halve the number of contigs and double my average contig length. The total number of bases is still about 1.87x the expected genome size, so there are still going to be some overlaps minimus wasn't able to put together. Otherwise it would be too easy, right?
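The tool that produced the stats above isn't named in this post, so here is a minimal Python sketch that computes a similar subset (contig count, total bp, shortest, longest, average length, non-ACGT bases, longest run of Ns) for comparing files before and after a minimus2 run. The function names `read_fasta` and `assembly_stats` are made up for this example, and the definitions (for instance, counting every character other than A, C, G or T as non-ACGT, case-insensitively) are assumptions that may not match the original tool exactly.

```python
import re
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_stats(records):
    """Summarise contig lengths and ambiguous bases for (name, seq) pairs."""
    lengths = []
    non_acgt = 0
    longest_n_run = 0
    for _name, seq in records:
        seq = seq.upper()
        lengths.append(len(seq))
        # Anything that is not A, C, G or T counts as ambiguous here.
        non_acgt += sum(1 for base in seq if base not in "ACGT")
        for run in re.finditer(r"N+", seq):
            longest_n_run = max(longest_n_run, len(run.group()))
    if not lengths:
        raise ValueError("no sequences found")
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "shortest": min(lengths),
        "longest": max(lengths),
        "average_length": sum(lengths) / len(lengths),
        "non_acgt": non_acgt,
        "longest_n_run": longest_n_run,
    }


# Tiny in-memory example; for real data, pass open("your_file.fasta").
demo = StringIO(">contig1\nACGTNNNN\nACGT\n>contig2\nACGTRYACNN\n")
print(assembly_stats(read_fasta(demo)))
```

Running this on the minimus2 input, the output contigs and the singletons makes it easy to see whether ambiguous bases went missing between input and output.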


            • #7
              Thanks all

              The alteration to make-consensus_poly was all that was needed.

              Now if I could just get rid of nucmer's pesky limitations on bases...


              • #8
                Thanks also, that works! Another question. I'm a little troubled by my results: I first tried nucmer on the two assemblies and got what looks like a nice alignment. When running minimus2, I get a set of output 'contigs' that is roughly the expected size of my genome, and then also a set of singletons that is also roughly the size of the genome. When I align these singletons back to the output contigs with nucmer, they also seem to align. I tried tweaking some of the nucmer parameters but that didn't help. Any thoughts on what could be causing this, or what to do with all the singletons?


                • #9
                  Seems the latest make-consensus bundled with AMOS 3.1.0 works well.


                  • #10
                    Originally posted by 8052:
                    Seems the latest make-consensus bundled with AMOS 3.1.0 works well.
                    http://sourceforge.net/projects/amos/files/amos/3.1.0/
                    Does it work with Ns?

                    I'd love to stop using altered versions of other people's scripts....


                    • #11
                      Originally posted by mscholz:
                      Does it work with Ns?

                      I'd love to stop using altered versions of other people's scripts....
                      They say so in the version history. A new pipeline, minimus2-blat, which uses blat instead of nucmer, is also available in this version.
