Bambus2 ... setup for large(ish) genomes

plattsa

05-09-2012, 07:32 AM

All,

We've been happily using SOAPdenovo to scaffold Ray contigs for a year or so, but it was recently suggested that Bambus may be a more flexible scaffolder. So far it is certainly turning out to be a more troublesome one.

Parsing our scaffolding mate-pair data into a link file after alignment with novoalign was relatively straightforward. We then used toAmos to merge our link file and contig file into an afg file. This also seemed to go OK, although I'm not sure how Bambus knows about read position and orientation from just a link file. However, this is how the example data is presented.
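For anyone following along, the link-building step can be sketched roughly as below. This is only an illustration under stated assumptions: the `Hit` record, the `/1` and `/2` read-name suffixes, and the output tuple layout are all hypothetical and are not the link format that toAmos or Bambus expects. It simply shows the idea of keeping position and orientation for each mate and retaining only pairs whose mates land on different contigs.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One aligned mate (hypothetical record, not a novoalign output format)."""
    read: str    # read name ending in /1 or /2 (assumed naming convention)
    contig: str  # contig the mate aligned to
    pos: int     # 1-based alignment start on the contig
    strand: str  # "+" or "-"


Link = Tuple[str, str, int, str, str, int, str]


def pair_links(hits: Iterable[Hit]) -> List[Link]:
    """Group mates by pair name and keep pairs that span two contigs."""
    by_pair = defaultdict(dict)
    for hit in hits:
        base, _, mate = hit.read.rpartition("/")
        by_pair[base][mate] = hit

    links = []
    for base in sorted(by_pair):
        mates = by_pair[base]
        if "1" not in mates or "2" not in mates:
            continue  # only one mate aligned, so there is no link
        a, b = mates["1"], mates["2"]
        if a.contig == b.contig:
            continue  # both mates in one contig: no scaffolding information
        links.append((base, a.contig, a.pos, a.strand, b.contig, b.pos, b.strand))
    return links
```

Pairs where both mates fall in the same contig are dropped here because they say nothing about contig order. Whether they should be kept (for example to estimate insert size) is a separate design choice.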

The final step before running the bambus2 pipeline, according to the cbcb page, is to run minimus on the afg file to create a bnk directory. This is a stumbling block for us ... minimus is an assembler designed for small genomes and our set of contigs is not small. Consequently minimus ran for about a week trying to generate hash-overlaps without producing any output or status messages, and we eventually had to kill it.

We tried using bank-transact to create the bnk directory from the afg file directly, but clearly the afg file didn't have all that was needed, since this then failed at the clk stage with errors like 'no contig account found'.
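One way to check what the afg file actually contains before loading it is to count its top-level message types. The sketch below assumes the usual AMOS message layout, where a record opens with a line holding "{" plus a three-letter type code (CTG for contigs, for example) and closes with a line holding only "}". The function name is made up for illustration, and it would miscount if a multi-line field ever contained a line consisting of just "}".

```python
import re
from collections import Counter
from typing import Iterable

# A record opener is assumed to be "{" followed by a three-letter type code.
_OPEN = re.compile(r"^\{([A-Z]{3})\s*$")


def count_message_types(lines: Iterable[str]) -> Counter:
    """Count top-level message types; nested sub-messages are skipped."""
    counts = Counter()
    depth = 0
    for raw in lines:
        line = raw.rstrip("\n")
        match = _OPEN.match(line)
        if match:
            if depth == 0:
                counts[match.group(1)] += 1
            depth += 1
        elif line.strip() == "}" and depth > 0:
            depth -= 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_afg.py contigs.afg
    with open(sys.argv[1]) as handle:
        for code, n in sorted(count_message_types(handle).items()):
            print(code, n)
```

If the counts show no contig records, or far fewer than there are contigs in the fasta file, that would point at the toAmos step rather than at bank-transact.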

I'm guessing that minimus is needed to create a set of files in the bnk directory describing contig overlaps. However (i) it is not a good tool for this and (ii) if these contigs were unambiguously overlapping, the earlier contiging stage would have merged them. So essentially what we want to say is simply 'no contig overlaps at this stage'. But the contents of the bnk directory are not described anywhere that I can find (perhaps on SourceForge, but it's presenting errors).

Could anyone with experience of getting the pipeline working when starting from mate-pair (MP) and fasta data perhaps give a hint as to their pipeline? With the python error in goBambus2 (described in another post) and problems getting the compile to work with boost, this is turning out to be a bit more of a problem than we were expecting for a 'simple standalone scaffolder'.

Many thanks