SEQanswers

Old 08-11-2010, 12:07 PM   #1
mscholz
Member
 
Location: Los alamos

Join Date: May 2010
Posts: 13
Default Minimus2/nucmer assembly

Hello,

I was wondering if anyone had enough experience with Minimus2 to tell me how it handles Ns by default. I am attempting to combine two de novo assemblies in FASTA format, where one or both contain long stretches of Ns as scaffold gaps. My concern is whether Minimus2 replaces Ns with sequence when a match extends from outside the N region into the N gap.
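For anyone wanting to check this on their own data, one option is to record where the N gaps sit in each input scaffold before merging, so the same regions can be inspected in the output afterwards. The following is a minimal Python sketch, not part of AMOS; the function names `read_fasta` and `n_gaps` and the example records are made up for illustration.

```python
import re

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def n_gaps(lines, min_len=1):
    """Yield (header, start, end, length) for each run of Ns.

    Coordinates are 0-based with an exclusive end.
    """
    for name, seq in read_fasta(lines):
        for match in re.finditer(r"N+", seq):
            length = match.end() - match.start()
            if length >= min_len:
                yield name, match.start(), match.end(), length

# Hypothetical two-record input; a real run would pass an open file handle.
example = [">scaffold1", "ACGTNNNNNACGT", "NNAC", ">scaffold2", "ACGT"]
gaps = list(n_gaps(example))
for gap in gaps:
    print(gap)
```

Passing `open("assembly.fasta")` (a placeholder file name) instead of the example list gives the gap coordinates for a real assembly.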

Thoughts?
Old 08-11-2010, 02:34 PM   #2
konrad98
Member
 
Location: Exeter, UK

Join Date: Jan 2009
Posts: 17
Default

In my experience I have found that minimus2 converts all Ns to As.
Old 08-11-2010, 02:42 PM   #3
mscholz
Member
 
Location: Los alamos

Join Date: May 2010
Posts: 13
Default

Quote:
Originally Posted by konrad98
In my experience I have found that minimus2 converts all Ns to As.
That's...unfortunate for me.

I had hoped that it would solve the gaps during the run. Does anyone know which program would be responsible for this conversion?
Old 10-01-2010, 01:50 PM   #4
Adjuvant
Member
 
Location: Chicago, IL

Join Date: Sep 2010
Posts: 13
Default

Minimus2 as provided doesn't seem to handle N's very well. I found that if I changed the program in TextEdit at line 41 from:
Code:
41: $(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
to:
Code:
41: $(BINDIR)/make-consensus_poly -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)
N's that are overlapped by sequence in the query contigs will be replaced with sequence, whereas non-overlapped N's and other ambiguity codes are retained. With make-consensus it seems like the N's were just getting replaced with random bases. That was unsettling, let me tell you...
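If hand-editing the script feels error-prone, the same one-line change can be scripted. Below is a minimal Python sketch, assuming the script contains the `make-consensus` call exactly as quoted above; `use_consensus_poly` and `patch_file` are hypothetical helper names, not anything shipped with AMOS.

```python
import re
from pathlib import Path

# Match "make-consensus" only when it is not already followed by "_poly"
# (or by any other word character or hyphen).
_CONSENSUS = re.compile(r"make-consensus(?![\w-])")

def use_consensus_poly(script_text: str) -> str:
    """Return the script text with make-consensus swapped for make-consensus_poly."""
    return _CONSENSUS.sub("make-consensus_poly", script_text)

def patch_file(path: str) -> None:
    """Patch a script in place, keeping the original next to it as <name>.orig."""
    script = Path(path)
    text = script.read_text()
    script.with_name(script.name + ".orig").write_text(text)  # backup copy
    script.write_text(use_consensus_poly(text))

# Demonstration on the line quoted in the post above.
original = "41: $(BINDIR)/make-consensus -B -e $(CONSERR) -b $(BANK) -w $(WIGGLE)"
patched = use_consensus_poly(original)
print(patched)
```

The lookahead makes the substitution safe to run twice: a line that already calls `make-consensus_poly` is left unchanged. It is probably wisest to call `patch_file` on a copy of the minimus2 script rather than on the installed one.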
Old 01-01-2011, 06:14 PM   #5
kbushley
Member
 
Location: Oregon

Join Date: Jan 2010
Posts: 22
Default -w error

Hi,

Unnerving indeed! I'm trying to do this but I'm getting an error with the -w option (not for make-consensus_poly)...did you just remove that, and does it seem to be working fine? As you point out, there seems to be very little documentation on the make-consensus_poly algorithm...do you have any idea what it actually does?


best,

Kathryn
Old 01-03-2011, 08:19 AM   #6
Adjuvant
Member
 
Location: Chicago, IL

Join Date: Sep 2010
Posts: 13
Default

You know, I've been running make-consensus_poly with the -w option and haven't been getting error messages, but going back and looking at the options listed under the -h option, I see that the -w option has disappeared from make-consensus_poly. I can't find any clear explanation of what "wiggle" actually is, and when I reverted back to make-consensus and tried modifying the wiggle value, I found no difference in the outputs. For my data it would appear that loss of the wiggle option doesn't have much impact on the results.
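A quick way to confirm that two runs really produced the same output, independent of line wrapping or record order, is to compare the sequences themselves. This is a rough Python sketch rather than anything from AMOS; `read_fasta` and `same_sequences` are illustrative names, and the example records are invented.

```python
from collections import Counter

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def same_sequences(a_lines, b_lines):
    """True if both inputs hold the same multiset of sequences.

    Headers, record order, case and line wrapping are ignored.
    """
    a = Counter(seq for _, seq in read_fasta(a_lines))
    b = Counter(seq for _, seq in read_fasta(b_lines))
    return a == b

# Invented records standing in for two output files from different runs.
run_a = [">1", "ACGT", "AC", ">2", "GGNN"]
run_b = [">x", "GGNN", ">y", "ACGTAC"]
run_c = [">x", "GGNN", ">y", "ACGTAA"]
print(same_sequences(run_a, run_b), same_sequences(run_a, run_c))
```

Using a `Counter` instead of a set means duplicated contigs are counted, so two files only compare equal if each sequence appears the same number of times in both.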

It appears that make-consensus_poly is able to resolve ambiguity codes (like N's) whereas make-consensus cannot. Here's some example output when I run the program to combine 87 contigs of a bacterial genome (produced by alignment to a reference genome) with 424 contigs produced by de novo assembly of the same reads.

Stats for the combined fasta file input into minimus2:
Code:
Number of Contigs=511, Total bp=12703167, Shortest=52, Longest=568347,
Average length=24859.4, Average GC%=66.6%, Non-ACGT bases=170454,
Longest Run of non-ACGT Bases=290, Total non-ACGT bases on contig ends=0,
Longest Run of Ns=290, Total Ns on contig ends=0
Stats for the contig output file using minimus2 running the "make-consensus" program:
Code:
Number of Contigs=50, Total bp=6427520, Shortest=1519, Longest=447578,
Average length=128550.4, Average GC%=66.8%, Non-ACGT bases=0
Stats for the contig output file using minimus2 running the "make-consensus_poly" program:
Code:
Number of Contigs=50, Total bp=6564830, Shortest=1519, Longest=458268,
Average length=131296.6, Average GC%=66.8%, Non-ACGT bases=137659,
Longest Run of non-ACGT Bases=243, Total non-ACGT bases on contig ends=0,
Longest Run of Ns=243, Total Ns on contig ends=0
The singletons files were identical between both runs.

So it appears that the same number of contigs could be combined in both runs, but when make-consensus_poly was run instead of make-consensus, N's and other ambiguity codes were either preserved or, in some cases, replaced with sequence (the total number of non-ACGT bases across singletons and contigs is less than the total number in the input file).
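The stats above came from a separate tool that is not named in this thread. For anyone wanting comparable numbers for their own input and output files, a minimal Python sketch of the main fields follows; the function names and dictionary keys are illustrative assumptions, and it does not reproduce every field shown above (GC% and contig-end counts are omitted).

```python
import re

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def fasta_stats(lines):
    """Summarise contig count, lengths, non-ACGT bases and the longest run of Ns."""
    lengths = []
    non_acgt = 0
    longest_n_run = 0
    for _, seq in read_fasta(lines):
        lengths.append(len(seq))
        non_acgt += len(re.findall(r"[^ACGT]", seq))
        for run in re.findall(r"N+", seq):
            longest_n_run = max(longest_n_run, len(run))
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "shortest": min(lengths) if lengths else 0,
        "longest": max(lengths) if lengths else 0,
        "non_acgt": non_acgt,
        "longest_n_run": longest_n_run,
    }

# Invented two-record input; a real run would pass an open file handle.
example = [">a", "ACGTNNNA", "CGRN", ">b", "acgt"]
stats = fasta_stats(example)
print(stats)
```

Running it on the combined input and on each output (contigs plus singletons) gives the non-ACGT totals needed for the before/after comparison described above.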

Looking at the stats for the output with make-consensus_poly, I was able to halve the number of contigs and double my average contig length. The total number of bases is still about 1.87x the expected genome size, so there are still going to be some overlaps minimus wasn't able to put together. Otherwise it would be too easy, right?
Old 01-05-2011, 03:35 PM   #7
mscholz
Member
 
Location: Los alamos

Join Date: May 2010
Posts: 13
Default Thanks all

The alteration to make-consensus_poly was all that was needed.

Now if I could just get rid of nucmer's pesky limitations on bases...
Old 01-05-2011, 03:59 PM   #8
kbushley
Member
 
Location: Oregon

Join Date: Jan 2010
Posts: 22
Default

Thanks also, that works! Another question. I'm a little troubled by the results: I first tried nucmer on the two assemblies and got what looks like a nice alignment. When running minimus2, I get a set of output 'contigs' that are roughly the expected size of my genome, and then also a set of singletons that are also roughly the size of the genome. When I align these singletons back to the output contigs with nucmer, they also seem to align...I tried tweaking some of the nucmer parameters but that didn't work...Any thoughts on what could be causing this or what to do with all the singletons?
Old 08-16-2011, 06:31 PM   #9
8052
Junior Member
 
Location: Okinawa, Japan

Join Date: May 2010
Posts: 2
Default

Seems the latest make-consensus bundled with AMOS 3.1.0 works well.
http://sourceforge.net/projects/amos/files/amos/3.1.0/
Old 08-17-2011, 02:43 PM   #10
mscholz
Member
 
Location: Los alamos

Join Date: May 2010
Posts: 13
Default

Quote:
Originally Posted by 8052
Seems the latest make-consensus bundled with AMOS 3.1.0 works well.
http://sourceforge.net/projects/amos/files/amos/3.1.0/
Does it work with Ns?

I'd love to stop using altered versions of other people's scripts....
Old 08-17-2011, 03:02 PM   #11
8052
Junior Member
 
Location: Okinawa, Japan

Join Date: May 2010
Posts: 2
Default

Quote:
Originally Posted by mscholz
Does it work with Ns?

I'd love to stop using altered versions of other people's scripts....
They say so in the version history. A new pipeline, minimus2-blat, which uses blat instead of nucmer, is also available in this version.
Tags
amos, minimus2
