SEQanswers

Old 11-23-2010, 11:19 PM   #1
millermr
Guest
 

Posts: n/a
Question Merging Velvet Assemblies

Hi,

This is my first post, and I look forward to being part of the community.

I've been prepping, sequencing, and assembling pools of ~10 BAC clones using PE100 reads on an Illumina. The average clone size is ~150 kb, but they can range from 50-250 kb. During the preps, I pooled an equal weight of DNA for each BAC clone, so I expect different levels of coverage from each BAC. I know that Velvet produces optimal assemblies with a k-mer coverage of 20-30X, and k-mers of ~55 give me an average coverage of that level. However, large BACs will have <20X coverage and small BACs will have >30X with this k-mer. To deal with this, I've been running Velvet with a series of k-mers (31, 41, ..., 81), and my plan is to merge the contigs from the series of assemblies.
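The coverage imbalance described above can be put into rough numbers. This is a minimal sketch, assuming the relation from the Velvet manual, Ck = C * (L - k + 1) / L, and assuming that equal DNA mass per clone means each BAC receives about the same number of sequenced bases. The value of `BASES_PER_BAC` is hypothetical, chosen only so that a 150 kb clone lands near 25X at k = 55.

```python
def kmer_coverage(base_coverage, read_length, k):
    """k-mer coverage from base coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


def base_coverage_per_bac(bases_per_bac, bac_length):
    """Base coverage of one clone, given the bases sequenced from it."""
    return bases_per_bac / bac_length


READ_LENGTH = 100          # PE100 reads
BASES_PER_BAC = 8_150_000  # hypothetical share of sequenced bases per clone

# Print k-mer coverage for the k series 31, 41, ..., 81 at three clone sizes.
for bac_kb in (50, 150, 250):
    c = base_coverage_per_bac(BASES_PER_BAC, bac_kb * 1000)
    row = [round(kmer_coverage(c, READ_LENGTH, k), 1) for k in range(31, 82, 10)]
    print(f"{bac_kb} kb BAC:", row)
```

Under these assumptions no single k puts every clone in the 20-30X window, which is the motivation for assembling with a series of k-mers.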

Initially, I just used MUMmer to align the contigs produced from each assembly to the other assemblies, and I wrote a script to parse the MUMmer output and discard contigs that are nested within larger contigs. This works OK, but I'm looking for something more sophisticated. Does anybody have suggestions for the best software for doing this merging? What I want to do is quite simple, but I'm just not sure of the best software to use.
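The nested-contig filter described above might look roughly like the sketch below. It is not the script from this post. It assumes the alignments have already been parsed into records; the `Alignment` fields, the function name, and the thresholds are all hypothetical. A contig is treated as nested when a single alignment covers nearly all of it at high identity and the other contig is larger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    query: str       # contig that may be nested
    query_len: int
    q_start: int     # 1-based; start > end on the reverse strand
    q_end: int
    ref: str         # contig it aligns to
    ref_len: int
    identity: float  # percent identity of the alignment


def nested_contigs(alignments, min_cov=0.99, min_identity=99.0):
    """Return names of contigs contained in a larger contig."""
    nested = set()
    for a in alignments:
        if a.query == a.ref:
            continue  # ignore self-alignments
        covered = (abs(a.q_end - a.q_start) + 1) / a.query_len
        # Tie-break on name so that exact duplicates keep one copy.
        ref_is_larger = (a.ref_len, a.ref) > (a.query_len, a.query)
        if covered >= min_cov and a.identity >= min_identity and ref_is_larger:
            nested.add(a.query)
    return nested
```

Only single alignments are considered here, so a contig covered by several shorter alignments to the same target would be kept. Contigs that merely overlap at their ends are also kept, which is where a merger such as Minimus2 or CAP3 would come in.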

Thanks,
Mike
Old 11-24-2010, 01:23 AM   #2
siiner
Junior Member
 
Location: Shenzhen

Join Date: Mar 2010
Posts: 2
Default

I found some tools that can do this job, such as CAP3, Phrap, CA, and MAIA.
But I couldn't actually get any of them to work well.
I hope you can try them and share your results.
Old 12-17-2010, 12:38 PM   #3
kbushley
Member
 
Location: Oregon

Join Date: Jan 2010
Posts: 22
Default

Hello Mike,

I'm also trying to do this. I think CAP3 might be the best tool, but I am still exploring this. There is a guy in our department who's written a program using CAP3 to merge Velvet and ABySS assemblies. It might be of some use. Let me know if you've found any other solutions. You also are in the great state of Oregon...where are you located?
Old 12-17-2010, 02:51 PM   #4
millermr
Guest
 

Posts: n/a
Default

Hi kbushley,

I've been using Minimus2 and am somewhat satisfied. I haven't tried CAP3 yet. I'm at the University of Oregon. Go Ducks!!!

Best,
Mike
Old 12-17-2010, 03:44 PM   #5
kbushley
Member
 
Location: Oregon

Join Date: Jan 2010
Posts: 22
Default

Thanks, I was reading up on that one today. Would you be willing to share your script that parses MUMmer output? That sounds rather useful. Go Beaves.
Old 12-17-2010, 03:56 PM   #6
millermr
Guest
 

Posts: n/a
Default

Sure. Get me your email address and I'll send them.
Old 01-02-2011, 11:49 PM   #7
natstreet
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 83
Default

I'd also be really interested in giving the scripts a try, if possible, as this is something I've been looking for a good solution to. Can I send my email address to get a copy?
Old 01-03-2011, 11:15 AM   #8
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
Default

I also ran Trans-ABySS and Velvet (with different k-mers), created a FASTA file of all the assemblies I got (from the various Velvet runs and from Trans-ABySS), and ran CAP3 on it.
What does the script for CAP3 do?
Am I missing an important step?
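One step worth checking when pooling assemblies into a single FASTA file is that sequence names stay unique, since Velvet names its contigs NODE_1, NODE_2, ... in every run. A minimal sketch, assuming each assembly is already available as FASTA text; the function name and the `label|name` header scheme are hypothetical:

```python
def pool_fasta(assemblies):
    """Concatenate FASTA texts, prefixing each header with its assembly label.

    `assemblies` maps a label (e.g. "velvet_k31") to FASTA-formatted text.
    """
    out = []
    for label, text in assemblies.items():
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Keep only the first word of the header as the sequence name.
                parts = line[1:].split()
                name = parts[0] if parts else "unnamed"
                out.append(f">{label}|{name}")
            else:
                out.append(line)
    return "\n".join(out) + "\n"
```

The pooled text can then be written to one file and given to the merging tool of choice.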
Old 03-30-2011, 02:14 PM   #9
rongmancai
Junior Member
 
Location: Virginia

Join Date: Mar 2011
Posts: 1
Default merge contigs

Hi Mike,

I sent you an email about merging contigs using MUMmer. I am not sure you received it, as I have had no reply since I sent the message. I hope to hear from you. Thanks.

Rongman
Old 06-06-2011, 10:13 PM   #10
Seth
Junior Member
 
Location: KM

Join Date: Jul 2010
Posts: 4
Default

Hi Mike,

Have you tried Phrap? I've recently been assembling overlapping BACs, and I've tried CAP3, Minimus2, and Phrap to remove the redundancy among the merged contigs; Phrap works best.
But there is still redundancy in the final assembly. I'd like to try MUMmer next. Can you send me a copy of your script?

Thanks!
Seth
Old 06-06-2011, 10:21 PM   #11
natstreet
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 83
Default

Quote:
Originally Posted by Seth View Post
....
But there is still redundancy in the final assembly.....
Seth
Hi Seth

How do you assess redundancy and how do you determine when two contigs are redundant and should be merged rather than being too different to each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).

One thing I haven't yet tried is using PCAP as a replacement for CAP3. Has anyone tried it?

I would also be interested in a copy of the script if possible.
Old 06-07-2011, 01:43 AM   #12
Seth
Junior Member
 
Location: KM

Join Date: Jul 2010
Posts: 4
Default

Quote:
Originally Posted by natstreet View Post
Hi Seth

How do you assess redundancy and how do you determine when two contigs are redundant and should be merged rather than being too different to each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).
Hi,
I used the total base count of the final assembly to assess the redundancy. The maximum length of the target region can be estimated from the BAC insert length and the number of BACs. I'm not familiar with the algorithms used in those programs, but I think the main idea is to identify overlapping contigs and join them together.
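As a rough illustration of that check, here is a minimal sketch that compares the assembled bases with the upper bound implied by the number of BACs and their insert length. The function name and the example numbers are hypothetical.

```python
def redundancy_ratio(contig_lengths, n_bacs, mean_insert_length):
    """Total assembled bases divided by the maximum expected target length.

    The bound n_bacs * mean_insert_length assumes no overlap between BACs,
    so overlapping clones should give a ratio well below 1.
    """
    expected_max = n_bacs * mean_insert_length
    if expected_max <= 0:
        raise ValueError("n_bacs and mean_insert_length must be positive")
    return sum(contig_lengths) / expected_max


# Hypothetical merged assembly: 1.8 Mb of contigs from ten ~150 kb BACs.
ratio = redundancy_ratio([600_000, 900_000, 300_000], 10, 150_000)
print(f"assembled / expected maximum = {ratio:.2f}")
```

A ratio above 1 means the merged set must still contain redundant sequence. A ratio below 1 does not prove the set is non-redundant, because the bound ignores overlap between clones.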

Have you tried Hapsembler? It is designed for assembling highly heterozygous genomes, but it is also slow.
Old 06-07-2011, 07:19 AM   #13
natstreet
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 83
Default

Hi Seth

Thanks for the pointer to Hapsembler; I hadn't come across it before. I'll test it out asap.
Old 07-05-2011, 10:39 AM   #14
priya.s
Junior Member
 
Location: United States

Join Date: May 2011
Posts: 1
Default Hapsembler

Hi,

I am assembling the Hepatitis C virus hypervariable regions E1 and E2, which have lots of SNPs. I am using Hapsembler, but it is very slow. What has your experience with Hapsembler been?
Old 09-26-2011, 06:38 PM   #15
ZhigangLi
Member
 
Location: Beijing, China

Join Date: Nov 2010
Posts: 11
Default

I have a similar problem. I want to combine contigs/scaffolds assembled from different datasets, e.g. Sanger, 454, and Solexa. I wanted to combine them based on MUMmer alignments. However, it's so hard for me. The organism's genome is 40 Mb and the largest scaffold is 2 Mb. Can I use this software to finish my job?

Last edited by ZhigangLi; 09-26-2011 at 07:09 PM.
Old 09-27-2011, 08:04 AM   #16
rskr
Senior Member
 
Location: Santa Fe, NM

Join Date: Oct 2010
Posts: 250
Default

Quote:
Originally Posted by ZhigangLi View Post
I have a similar problem. I want to combine contigs/scaffolds assembled from different datasets, e.g. Sanger, 454, and Solexa. I wanted to combine them based on MUMmer alignments. However, it's so hard for me. The organism's genome is 40 Mb and the largest scaffold is 2 Mb. Can I use this software to finish my job?
CAP3 and MUMmer have scaling issues; besides, CAP3 is designed for ESTs anyway.
Old 10-03-2011, 04:12 AM   #17
anli
Junior Member
 
Location: Sweden

Join Date: Apr 2011
Posts: 3
Default

Hi.

I have also come across this issue. I have Illumina PE data from an archaeon, and I now have two datasets, where one was sequenced with a 100 bp read length and the other with 150 bp.

Doing an assembly on a merged dataset doesn't seem like a good approach, since you can't set multiple k-mer lengths in Velvet.
Old 10-03-2011, 02:53 PM   #18
boetsie
Senior Member
 
Location: NL, Leiden

Join Date: Feb 2010
Posts: 245
Default

What do you guys want to do exactly? Do you want to make a consensus of the assemblies, or do you want to extend one of the assemblies by other assemblies?

You should be aware that merging can join repeated regions if they lie at the boundaries of the contigs, and thus concatenate distant regions because of the repeat.

Anyway, what you can do is break the assemblies into smaller pieces and do a new de novo assembly. I have a Perl script which breaks all assemblies into user-defined k-mers and tries to do a new de novo assembly based on the user's 'coverage'. Say you have four assemblies with different k-mers, and you only want to extend a contig by a k-mer if it is supported by, e.g., three assemblies.
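The support criterion can be illustrated with a short sketch. This is not the Perl script mentioned above and it does not perform the reassembly; it only shows, under assumed names, how one might count the number of assemblies in which each k-mer occurs and keep those that reach a threshold. K-mers are stored in canonical form so that both strands count as the same k-mer.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmers_of(contigs, k):
    """Set of canonical k-mers in one assembly (k-mers with N etc. are skipped)."""
    seen = set()
    for seq in contigs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                seen.add(canonical(kmer))
    return seen


def supported_kmers(assemblies, k, min_support):
    """K-mers present in at least `min_support` of the given assemblies."""
    support = Counter()
    for contigs in assemblies:
        # Each assembly contributes at most one vote per k-mer.
        support.update(kmers_of(contigs, k))
    return {kmer for kmer, n in support.items() if n >= min_support}
```

With four assemblies and `min_support=3`, only k-mers found in three or more of them would be allowed to extend a contig in a later assembly step.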

If you would like to have it, please contact me at marten.boetzer@baseclear.com

Regards,
Boetsie
Old 01-07-2012, 12:10 AM   #19
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199
Default

Hi Seth,

I have a few questions regarding CAP3 and might need your advice.
I'm currently facing the following problem when trying to form a single set of non-redundant unigenes with CAP3.
I have a total of 8 *.fasta files right now (RNA-seq scaffold sequences from the same tissue, with the samples treated under different conditions before sequencing).
I would like to use CAP3 to assemble all the unigenes from the different samples into a single set of non-redundant unigenes.

What is the proper command to run CAP3 so that it forms a single set of non-redundant unigenes from my RNA-seq data?
All 8 scaffold sets are in FASTA format and were assembled by a third-party assembler from Illumina paired-end reads, 2x50 bp, insert size 200.

This is all the info I have right now.
Many thanks for any advice.
Old 02-26-2015, 12:15 PM   #20
milo0615
Member
 
Location: Walnut, California

Join Date: Dec 2012
Posts: 39
Default

Quote:
Originally Posted by kbushley View Post
Hello Mike,

I'm also trying to do this. I think CAP3 might be the best tool, but I am still exploring this. There is a guy in our department who's written a program using CAP3 to merge Velvet and ABySS assemblies. It might be of some use. Let me know if you've found any other solutions. You also are in the great state of Oregon...where are you located?
Hi kbushley,

I am also trying to merge six different k-mer assemblies from Abyss. Were you able to merge yours? Can you share the program the guy from your department wrote? Please let me know. I would really appreciate your help.

Thank you,

-Milo
All times are GMT -8.

