Using Mosaik to assemble bacterial genome 454 sequencing

jpearl01

Member

Join Date: Dec 2010

Posts: 17
#1


12-13-2010, 02:25 PM

Hi all,

I decided to try the alignment package Mosaik to build a reference-guided assembly of a bacterial genome we are working on. We usually use Newbler for de novo assemblies (and have already done so here). We sequenced 12 strains of the same species using 454 Titanium (not paired-end) and, after assembly, closed two of the genomes on the bench with PCR. I'd like to reduce the number of contigs in the other strains by using the closed genomes as reference sequences. I'd also like to get the assemblies into SAM format, since Newbler doesn't support that as output yet.

Mosaik is the first tool I've looked at, but I've run into a problem. I build the reference archive from one of the closed genomes (a FASTA file containing a single contig, no quality information) with this command:
./MosaikBuild -fr B475.fasta -oa B475.dat

Then I build the read archive from the sequence fragments of one of our runs (leading sequences, i.e. MIDs etc., stripped):
./MosaikBuild -fr B476.fasta -st 454 -out B476.dat -fq B476.qual

Both of the above commands appear to work fine; however, the command:
./MosaikAligner -in B476.dat -out B475_B476_aligned.dat -ia B475.dat

produces this (end of output):
Alignment statistics (mates):
===================================
# failed hash: 1774 ( 35.9 %)
# filtered out: 3169 ( 64.1 %)
-----------------------------------
total: 4943
total aligned: 0 ( 0.0 %)

MosaikAligner CPU time: 39.200 s, wall time: 40.548 s

If I make the settings more forgiving, i.e. add the flags:
-hs 12 -mm 10

none of the sequences "failed hash", but they are all still filtered out. Am I doing something obviously wrong? The "Alignment statistics (mates)" title worries me, since these aren't mate-pair reads, just single-end reads. Any ideas?
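
For reference, the three steps described above (with the relaxed flags) can be collected into one script. This is only a sketch: the file names and Mosaik flags are copied exactly from this post, while the DRY_RUN variable and the run helper are illustrative additions and not part of Mosaik. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: file names and Mosaik flags are taken from the post above.
# DRY_RUN and run() are hypothetical helpers added for illustration.
# DRY_RUN defaults to 1, so the commands are printed rather than executed;
# set DRY_RUN=0 to run them (requires the Mosaik binaries in the current directory).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

REF=B475      # closed genome used as the reference
READS=B476    # 454 run to align against it

# 1. Build the reference archive from the closed genome.
run ./MosaikBuild -fr "$REF.fasta" -oa "$REF.dat"

# 2. Build the read archive from the 454 reads and their quality file.
run ./MosaikBuild -fr "$READS.fasta" -st 454 -out "$READS.dat" -fq "$READS.qual"

# 3. Align the reads to the reference with the relaxed -hs and -mm settings.
run ./MosaikAligner -in "$READS.dat" -out "${REF}_${READS}_aligned.dat" -ia "$REF.dat" -hs 12 -mm 10
```

Printing the commands first makes it easy to confirm that each archive name passed to MosaikAligner matches the one written by the corresponding MosaikBuild step.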

~josh
magofiura

Junior Member

Join Date: Jan 2012

Posts: 2
#2

03-27-2013, 05:04 AM

I have the same problem here, even when using Illumina paired-end reads. Does anyone know how to solve this issue?
krobison

Senior Member

Join Date: Nov 2007

Posts: 743
#3

03-27-2013, 06:29 AM

You might also look at various assemblers designed specifically for this, such as MIRA.