
  • Merging Velvet Assemblies

    Hi,

    This is my first post, and I look forward to being part of the community.

    I've been prepping, sequencing, and assembling pools of ~10 BAC clones using PE100 reads on an Illumina. The average clone size is ~150 kb, but they can range from 50-250 kb. During the preps, I pooled an equal weight of DNA for each BAC clone, so I expect different levels of coverage from each BAC. I know that Velvet produces optimal assemblies with a k-mer coverage of 20-30X, and k-mers of ~55 give me an average coverage of that level. However, large BACs will have <20X coverage and small BACs will have >30X with this k-mer. To deal with this, I've been running Velvet with a series of k-mers (31, 41, ..., 81), and my plan is to merge the contigs from the series of assemblies.
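    The coverage argument above can be made concrete with a small sketch. It uses the relation from the Velvet manual, Ck = C * (L - k + 1) / L, where C is nucleotide coverage, L is read length, and k is the k-mer size. The clone sizes and the total base count are hypothetical values chosen for illustration, and the equal-share model (equal DNA weight means equal sequenced bases per clone) is an assumption that ignores vector sequence and library bias.

```python
def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Convert nucleotide coverage C to k-mer coverage Ck = C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


def per_clone_base_coverage(total_bases: float, clone_sizes: dict) -> dict:
    """Assumption: every clone receives an equal share of the sequenced bases
    (equal DNA weight per clone), so coverage scales as 1 / clone size."""
    share = total_bases / len(clone_sizes)
    return {name: share / size for name, size in clone_sizes.items()}


# Hypothetical pool: three clones spanning the size range mentioned in the post.
clone_sizes = {"small": 50_000, "average": 150_000, "large": 250_000}
coverage = per_clone_base_coverage(total_bases=24_300_000, clone_sizes=clone_sizes)

# Tabulate k-mer coverage per clone for the k-mer series 31, 41, ..., 81 (PE100 reads).
for k in range(31, 82, 10):
    row = ", ".join(
        f"{name}={kmer_coverage(c, 100, k):.1f}X" for name, c in coverage.items()
    )
    print(f"k={k}: {row}")
```

    With these made-up numbers, the average clone sits in the 20-30X window near k=55 while the small and large clones fall outside it, which is the motivation for running a series of k-mers.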

    Initially, I just used Mummer to align the contigs produced from each assembly to the other assemblies, and I wrote a script to parse the Mummer output and discard contigs that are nested within larger contigs. This works OK, but I'm looking for something more sophisticated. Does anybody have suggestions for the best software for this merging? What I want to do is quite simple, but I'm just not sure of the best software to use.
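    The nested-contig filter described above might look roughly like the sketch below. This is not Mike's script, which is not shown in the thread. It assumes the alignments have already been parsed into records; the `Hit` fields, the thresholds, and the contig names are all hypothetical. It only catches contigs covered by a single alignment, so a contig whose match is split across several alignments would be kept.

```python
from typing import Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One pairwise alignment between two contigs (hypothetical record layout)."""
    query: str        # contig tested for containment
    ref: str          # contig it aligned to
    query_len: int    # full length of the query contig
    ref_len: int      # full length of the reference contig
    query_start: int  # 1-based alignment start on the query
    query_end: int    # 1-based alignment end on the query (may be < start)
    identity: float   # percent identity of the alignment


def nested_contigs(hits: Iterable[Hit],
                   min_identity: float = 99.0,
                   min_query_cov: float = 0.99) -> Set[str]:
    """Return names of contigs that are almost fully contained in another contig."""
    nested = set()
    for h in hits:
        if h.query == h.ref or h.identity < min_identity:
            continue
        aligned = abs(h.query_end - h.query_start) + 1
        if aligned / h.query_len < min_query_cov:
            continue
        # Drop the shorter contig; for equal lengths, keep the smaller name
        # so that exact duplicates do not remove each other.
        if h.query_len < h.ref_len or (h.query_len == h.ref_len and h.query > h.ref):
            nested.add(h.query)
    return nested


hits = [
    Hit("k31_c2", "k51_c1", 1000, 5000, 1, 1000, 99.8),  # fully contained
    Hit("k41_c7", "k51_c1", 2000, 5000, 1, 1200, 99.9),  # only 60% covered
    Hit("k51_c1", "k31_c2", 5000, 1000, 1, 1000, 99.8),  # the larger contig
    Hit("k61_c3", "k51_c1", 800, 5000, 800, 1, 95.0),    # identity too low
]
print(sorted(nested_contigs(hits)))
```

    Contig names need to be unique across assemblies (here prefixed with the k-mer size) for the returned set to be unambiguous.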

    Thanks,
    Mike

  • #2
    I found some tools that can do this job, such as CAP3, Phrap, CA, and MAIA.
    But I couldn't get any of them to work well.
    Hope you could try and show your results.



    • #3
      Hello Mike,

      I'm also trying to do this. I think CAP3 might be the best tool, but I'm still exploring this. There is a guy in our department who's written a program using CAP3 to merge Velvet and ABySS assemblies. It might be of some use. Let me know if you've found any other solutions. You are also in the great state of Oregon...where are you located?



      • #4
        Hi kbushley,

        I've been using Minimus2 and am somewhat satisfied. I haven't tried CAP3 yet. I'm at the University of Oregon. Go Ducks!!!

        Best,
        Mike



        • #5
          Thanks, I was reading up on that one today. Would you be willing to share your script that parses MUMmer output? That sounds rather useful. Go Beaves.



          • #6
            Sure. Get me your email address and I'll send them.



            • #7
              I'd also be really interested in giving the scripts a try, if possible, as this is something I've been looking for a good solution to. Can I send my email address to get a copy?



              • #8
                I also ran Trans-ABySS and Velvet (with different k-mers), created a FASTA file of all the assemblies I got (from the various Velvet runs and from Trans-ABySS), and ran CAP3 on that.
                What is the script for CAP3 doing?
                Am I missing an important step?
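                The pooling step described above (one FASTA file built from several assemblies) can be sketched as follows. This does not reproduce the CAP3 wrapper script mentioned earlier in the thread, whose contents are not shown. The function name, labels, and file names are hypothetical. The label prefix is added because Velvet contig headers follow the same `NODE_..._length_..._cov_...` pattern in every run, so names can collide once runs are pooled.

```python
import tempfile
from pathlib import Path


def pool_assemblies(assemblies: dict, out_path) -> int:
    """Concatenate FASTA files, prefixing each header with its assembly label.

    `assemblies` maps a label (e.g. "k31") to a FASTA path.
    Returns the number of records written.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for label, path in assemblies.items():
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        out.write(f">{label}_{line[1:]}\n")
                        n_records += 1
                    else:
                        out.write(line + "\n")
    return n_records


# Demo with two tiny made-up assemblies written to a temporary directory.
workdir = Path(tempfile.mkdtemp())
(workdir / "k31.fa").write_text(">NODE_1_length_10_cov_25.0\nACGTACGTAC\n")
(workdir / "k41.fa").write_text(">NODE_1_length_8_cov_22.5\nGGCCTTAA")  # no trailing newline
pooled = workdir / "pooled.fa"
n = pool_assemblies({"k31": workdir / "k31.fa", "k41": workdir / "k41.fa"}, pooled)
print(n, "records written")
print(pooled.read_text())
```

                Rewriting each line with an explicit newline keeps a file without a trailing newline from running into the next file's first header.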



                • #9
                  merge contigs

                  Hi Mike,

                  I sent you an email about merging contigs using Mummer. I'm not sure you received it, as I haven't had a reply since I sent the message. Hope to hear from you. Thanks.

                  Rongman



                  • #10
                    Hi Mike,

                    Have you tried Phrap? I've been assembling overlapping BACs recently, and I've tried CAP3, Minimus2, and Phrap to remove redundancy from the merged contigs; Phrap works best.
                    But there is still redundancy in the final assembly. I'd like to try Mummer next. Can you send me a copy of your script?

                    Thanks!
                    Seth



                    • #11
                      Originally posted by Seth
                      ....
                      But there is still redundancy in the final assembly.....
                      Seth
                      Hi Seth

                      How do you assess redundancy and how do you determine when two contigs are redundant and should be merged rather than being too different to each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).

                      One thing I haven't yet tried is using PCAP as a replacement for CAP3. Has anyone tried it?

                      I would also be interested in a copy of the script if possible.



                      • #12
                        Originally posted by natstreet
                        Hi Seth

                        How do you assess redundancy and how do you determine when two contigs are redundant and should be merged rather than being too different to each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).
                        Hi,
                        I used the total base count of the final assembly to assess redundancy. The maximum length of the target region can be estimated from the BAC insert length and the number of BACs. I'm not familiar with the algorithms used in those programs, but I think the main idea is to identify overlapping contigs and join them together.
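                        The base-count check described above can be written as a short sketch. The function name and all numbers are hypothetical. The product of BAC count and insert length is treated as an upper bound on the target region, since overlapping BACs share sequence, so a ratio above 1 suggests that redundant contigs remain.

```python
def redundancy_ratio(contig_lengths, n_bacs: int, mean_insert_len: int) -> float:
    """Total assembled bases divided by the maximum expected target length.

    The expected maximum is n_bacs * mean_insert_len, an upper bound because
    overlapping BACs cover some of the same sequence.
    """
    expected_max = n_bacs * mean_insert_len
    if expected_max <= 0:
        raise ValueError("n_bacs and mean_insert_len must be positive")
    return sum(contig_lengths) / expected_max


# Hypothetical merged assembly of two overlapping ~150 kb BACs.
contigs = [120_000, 95_000, 60_000, 45_000, 30_000]
ratio = redundancy_ratio(contigs, n_bacs=2, mean_insert_len=150_000)
print(f"assembled / expected maximum = {ratio:.2f}")
```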

                        Have you tried Hapsembler? It's designed for assembling highly heterozygous genomes, but it's also slow.



                        • #13
                          Hi Seth

                          Thanks for the pointer to hapsembler, I hadn't come across it before. I'll test it out asap.



                          • #14
                            Hapsembler

                            Hi,

                            I am assembling Hepatitis C virus hypervariable regions E1 and E2, which have lots of SNPs. I am using Hapsembler, but it is very slow. What has your experience with Hapsembler been?



                            • #15
                              I have a similar problem. I want to combine contigs/scaffolds assembled from different datasets, e.g. Sanger, 454, and Solexa. I wanted to combine them based on Mummer alignments. However, it's so hard for me. The genome is 40 Mb and the largest scaffold is 2 Mb. Can I use any of this software to finish my job?
                              Last edited by ZhigangLi; 09-26-2011, 06:09 PM.
                              github:
                              https://github.com/Bioinformatics-and-Genomics
