SEQanswers

12-13-2010, 01:25 PM   #1
jpearl01
Member
 
Location: Philadelphia PA

Join Date: Dec 2010
Posts: 17
Using Mosaik to assemble bacterial genome 454 sequencing

Hi all,

I decided to check out the alignment package Mosaik to create an assembly of a bacterial genome that we are working on. Usually we just use Newbler to create de novo assemblies (and in fact we already have). We've sequenced 12 strains of the same species using 454 Titanium (not paired-end). After assembly, we closed two of the genomes on the bench with PCR. I'd like to reduce the number of contigs in the other strains by using the closed genomes as reference sequences. I'd also like to get the assemblies into SAM format, since Newbler doesn't support that as output yet.

Mosaik is the first one I've been looking at, but I'm having an issue. I create the reference using one of the closed genomes (fasta file consisting of a single contig, no quality information) with this command:
./MosaikBuild -fr B475.fasta -oa B475.dat

Then I create the input file for the sequence fragments from one of our runs (leading sequences, i.e. MIDs etc., stripped):
./MosaikBuild -fr B476.fasta -st 454 -out B476.dat -fq B476.qual

Both of the above commands appear to work fine. However, running the command:
./MosaikAligner -in B476.dat -out B475_B476_aligned.dat -ia B475.dat

produces this problem (end of output):
Alignment statistics (mates):
===================================
# failed hash: 1774 ( 35.9 %)
# filtered out: 3169 ( 64.1 %)
-----------------------------------
total: 4943
total aligned: 0 ( 0.0 %)

MosaikAligner CPU time: 39.200 s, wall time: 40.548 s
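
As a cross-check of the summary above, here is a minimal Python sketch that parses the quoted lines and confirms that the two failure categories account for every read. The regular expression and the helper names (`parse_summary`, `percent`) are hypothetical and written only for this excerpt; they are not part of Mosaik, and the pattern assumes the output looks exactly like the lines quoted in the post.

```python
import re

# Summary text copied from the MosaikAligner output quoted in the post.
SUMMARY = """\
# failed hash: 1774 ( 35.9 %)
# filtered out: 3169 ( 64.1 %)
total: 4943
total aligned: 0 ( 0.0 %)
"""

# Matches lines such as "# failed hash: 1774 ( 35.9 %)" or "total: 4943".
# Assumption: label, colon, then an integer count, one entry per line.
LINE_RE = re.compile(r"^#?\s*([a-z ]+?):\s*(\d+)", re.MULTILINE)


def parse_summary(text):
    """Return a dict mapping each label in the summary to its read count."""
    return {label.strip(): int(count) for label, count in LINE_RE.findall(text)}


def percent(part, total):
    """Percentage of total, rounded to one decimal place as in the output."""
    return round(100.0 * part / total, 1) if total else 0.0


counts = parse_summary(SUMMARY)
unaligned = counts["failed hash"] + counts["filtered out"]

print(counts)
print("unaligned reads:", unaligned, "of", counts["total"])
print("failed hash %:", percent(counts["failed hash"], counts["total"]))
print("filtered out %:", percent(counts["filtered out"], counts["total"]))
```

The "failed hash" and "filtered out" counts sum to the reported total, which is consistent with the 0 aligned reads: every read fell into one of those two categories.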

If I change some of the settings to be more forgiving, i.e. add the flags:
-hs 12 -mm 10

then none of the sequences "failed hash", but they are still all filtered out. Am I doing something obviously wrong? The "Alignment statistics (mates)" title worries me, since these aren't mate-pair reads, just single-end reads. Ideas?

~josh
03-27-2013, 05:04 AM   #2
magofiura
Junior Member
 
Location: Siena (Italy)

Join Date: Jan 2012
Posts: 2

I have the same problem here, even though I'm using Illumina paired-end reads. Does anyone know how to solve this issue?
03-27-2013, 06:29 AM   #3
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747

You might also look at various assemblers designed specifically for this, such as MIRA.