  • Bambus2 ... setup for large(ish) genomes

    All,

    We've been happily using SOAPdenovo to scaffold Ray contigs for a year or so, but it was recently suggested that Bambus may be a more flexible scaffolder. Certainly it's turning out to be a more troublesome one.

    Parsing our scaffolding mate-pair data into a link file after alignment with Novoalign was relatively straightforward. We then used toAmos to merge our link file and contig file into an afg file. This also seemed to go OK, although I'm not sure how Bambus knows about read position and orientation from just a link file. However, this is how the example data is presented.
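
A minimal sketch of the kind of link extraction described above, in Python. It assumes the alignments have already been reduced to one (contig, position, strand) hit per read; the names `Hit` and `contig_links` are hypothetical, and the output is a plain count of contig-to-contig links, not the link/mates file format that toAmos expects.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One read's alignment (hypothetical structure, not a Novoalign record)."""
    contig: str
    pos: int        # leftmost aligned position on the contig
    reverse: bool   # True if the read aligned to the reverse strand


def contig_links(pairs):
    """Count mate pairs whose two reads align to different contigs.

    pairs: iterable of (Hit, Hit) tuples, one tuple per mate pair.
    Returns a dict mapping (contig_a, strand_a, contig_b, strand_b) to the
    number of supporting pairs, with contig_a < contig_b alphabetically.
    """
    links = defaultdict(int)
    for a, b in pairs:
        if a.contig == b.contig:
            # Both mates fall inside one contig: no scaffolding information.
            continue
        if a.contig > b.contig:
            # Canonical order so that A-B and B-A pairs are counted together.
            a, b = b, a
        key = (
            a.contig, "-" if a.reverse else "+",
            b.contig, "-" if b.reverse else "+",
        )
        links[key] += 1
    return dict(links)


example_pairs = [
    (Hit("c1", 100, False), Hit("c2", 50, True)),
    (Hit("c2", 60, True), Hit("c1", 120, False)),
    (Hit("c1", 10, False), Hit("c1", 400, True)),  # same contig, ignored
]
example_links = contig_links(example_pairs)
```

Note that the per-read position and strand are used here only to build the summary; whether and how that information reaches Bambus through the link file is exactly the open question.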

    The final step before running the bambus2 pipeline, according to the cbcb page, is to run minimus on the afg file to create a bnk directory. This is a stumbling block for us: minimus is an assembler designed for small genomes, and our set of contigs is not small. Consequently, minimus ran for about a week trying to generate hash-overlaps without producing output or status messages, and we eventually had to kill it.

    We tried using bank-transact to create the bnk directory from the afg file directly, but clearly the afg file didn't have all that was needed, since this then failed at the clk stage with errors like 'no contig account found'.

    I'm guessing that minimus is needed to create a set of files in the bnk directory describing contig overlaps. However, (i) it is not a good tool for this, and (ii) if these contigs were unambiguously overlapping, the earlier contiging stage would have merged them. So essentially all we want to say is 'no contig overlaps at this stage'. But the bnk directory contents are not described anywhere that I can find (perhaps on sourceforge, but it's presenting errors).

    Could anyone with experience of getting the pipeline working when starting with mate-pair (MP) and FASTA data perhaps give a hint as to their pipeline? With the Python error in goBambus2 (described in another post) and problems getting the compile to work with Boost, this is turning out to be a bit more of a problem than we were expecting for a 'simple standalone scaffolder'.

    Many thanks
