**tweist** · 01-02-2009, 09:08 AM

Have you tried Mauve Genome Aligner? It's available at http://gel.ahabs.wisc.edu/mauve/.

**hlu** · 01-02-2009, 02:49 PM

BLAT is pretty good for comparing two sets of bacterial contigs.

BLAT is faster than BLAST, and by default it generates an Excel-compatible, tab-delimited table (PSL format). This is easy to view in Excel or to parse for follow-up review.

BLAT is free for academic use and can be downloaded from the web.
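For anyone who would rather parse that table with a script than open it in Excel, here is a minimal Python sketch. It assumes the default PSL output with its standard 21 tab-separated columns; the helper names `parse_psl` and `query_coverage` and the sample alignment line are made up for illustration.

```python
import io

# Column names of the standard 21-column PSL format.
PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
INT_FIELDS = ("matches", "misMatches", "qSize", "qStart", "qEnd",
              "tSize", "tStart", "tEnd")


def parse_psl(handle):
    """Yield one dict per alignment line, skipping header lines."""
    for line in handle:
        cols = line.rstrip("\n").split("\t")
        # Header lines either lack 21 columns or do not start with a number.
        if len(cols) != 21 or not cols[0].isdigit():
            continue
        rec = dict(zip(PSL_FIELDS, cols))
        for key in INT_FIELDS:
            rec[key] = int(rec[key])
        yield rec


def query_coverage(rec):
    """Fraction of the query (contig) spanned by this alignment."""
    return (rec["qEnd"] - rec["qStart"]) / rec["qSize"]


# Hypothetical single alignment line, for illustration only.
sample = "\t".join([
    "950", "10", "0", "0", "0", "0", "0", "0", "+",
    "contig00001", "1000", "20", "980",
    "ref", "5000000", "100020", "100980",
    "1", "960,", "20,", "100020,",
]) + "\n"

records = list(parse_psl(io.StringIO("psLayout version 3\n\n" + sample)))
for rec in records:
    print(rec["qName"], rec["tName"], rec["strand"], query_coverage(rec))
```

In real use you would pass an open file handle for the BLAT output instead of the `io.StringIO` sample.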

**bioinfosm** · 01-06-2009, 09:41 AM

Originally posted by azmicro

I'm probably asking a basic question, but I've searched for hours and can't seem to find a straight answer.

We have recently sequenced the entire genome (~5 Mb) of a Salmonella strain using a brand new 454 sequencer. Ours was one of the first sequences run. Since this is new, no one here really knows what to do with the data.

I ran the sff reads through gsAssembler (i.e. Newbler) and now have contigs. There are several strains of Salmonella that have been sequenced and fully annotated. Thus, I believe it would be easiest to compare the contigs to a reference strain to figure out what gaps need to be filled. I used GS Reference Mapper to do this, but the amount of data that comes out of Mapper is significantly less than what comes out of Assembler. Thus, I think Mapper might be chopping up the contigs to make them fit better.

Is there a program where I can use the contigs produced by Assembler (which are .ace files) and compare them to a reference sequence that I have in .fasta format? I have access to Consed, but can't seem to add a .fasta file into Consed to use as a reference.

Thanks for the help!

I am working on something very similar. How large are your contigs? And have you made any headway that you can share?

**biofqzhao** · 01-08-2009, 12:33 PM

Actually, you can use the FASTA file of your contigs instead of the .ace file. There are a number of programs that can map your contigs to the reference genome. OSLay is a pretty good one (http://www-ab.informatik.uni-tuebing...y/welcome.html). PGA4genomics can also be used to assemble your contigs following one or more reference genomes (http://nar.oxfordjournals.org/cgi/content/full/gkn168v1).
You can also use MUMmer to lay out the contigs.

**jnfass** · 01-09-2009, 12:05 PM

Originally posted by azmicro

Thus, I believe it would be easiest to compare the contigs to a reference strain to figure out what gaps need to be filled. I used GS Reference Mapper to do this, but the amount of data that comes out of Mapper is significantly less than what comes out of Assembler. Thus, I think Mapper might be chopping up the contigs to make them fit better.

Just to clarify: are you sure you compared the contigs, and not the original reads, to the reference strains?

But more to the point - if your strain is divergent enough from your reference strains, then it doesn't seem surprising to me that you'd get less coverage by mapping from one strain to another, than by assembling your new strain de novo ... i.e. your mapping is failing wherever there's enough divergence, whereas if you have good reads, your assembly will cover divergent regions as well as homologous regions.
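To put a number on "coverage" in that comparison, one option is to merge the aligned intervals on the reference and divide by the reference length. A minimal Python sketch, assuming you already have half-open `(start, end)` intervals from whatever aligner you used; the function name `covered_bases` and the example numbers are hypothetical:

```python
def covered_bases(intervals):
    """Total length covered by half-open (start, end) intervals, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Made-up alignment intervals on a made-up 1000 bp reference.
hits = [(0, 100), (50, 150), (300, 400)]
reference_length = 1000
fraction = covered_bases(hits) / reference_length
print(f"{fraction:.2%} of the reference is covered")
```

Running the same calculation on the mapped assembly and on the de novo contigs aligned back to the reference gives two figures that can be compared directly.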

**bioinfosm** · 01-09-2009, 02:25 PM

OSLay is brilliant for the purpose I wanted .. thanks much

**azmicro** · 01-09-2009, 04:45 PM

I figured out what was wrong! I used Mauve to compare the 454ContigsAll.fna file that came out of Assembler to a reference .fasta genome I downloaded from GenBank. Mauve provides a really nice visualization of where the contigs match up. Through Mauve I found contigs that did not match the reference sequence. When I BLASTed these contigs, I discovered they matched up to a Salmonella plasmid. For the sequencing I just did a genomic prep and didn't even think to separate out the plasmid DNA. Thus, Assembler's output included contigs that matched up to a plasmid whereas Mapper only included contigs that matched the reference sequence. Hence, the discrepancy between the amount of data output. This definitely makes my life easier!

And in response to jnfass: Mapper compares the reads to a reference sequence and assembles contigs based on that mapping. Mapper then gives you much longer and thus far fewer contigs than Assembler.
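The "which contigs did not match the reference" step can also be scripted once you have a list of contig names that aligned (from Mauve, BLAT, or anything else). A minimal Python sketch; the helper names and the contig IDs in the example are hypothetical, and it only reads FASTA header lines:

```python
import io


def fasta_ids(handle):
    """Return record IDs (first word after '>') from a FASTA stream, in file order."""
    ids = []
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.append(fields[0])
    return ids


def unaligned_contigs(all_ids, aligned_ids):
    """Contigs present in the assembly but absent from the set of aligned names."""
    aligned = set(aligned_ids)
    return [cid for cid in all_ids if cid not in aligned]


# Made-up three-contig assembly; in practice open the contig FASTA file instead.
fasta_text = ">contig00001 length=1200\nACGT\n>contig00002 length=800\nGGCC\n>contig00003 length=500\nTTAA\n"
all_ids = fasta_ids(io.StringIO(fasta_text))
leftovers = unaligned_contigs(all_ids, ["contig00001", "contig00003"])
print(leftovers)  # candidates to BLAST separately, e.g. against plasmid sequences
```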

**jnfass** · 01-09-2009, 05:19 PM

Glad you found your solution, azmicro ..
but I'd have to quibble that the number and length of contigs you'll get, and whether you get better (de novo) assemblies or (mapped) assemblies, will definitely depend on how divergent your reference and sequenced species are ... yours must be pretty close (being different strains, but not different species? maybe?)

**ssully** · 09-22-2010, 01:51 PM

I have a multi-chromosome reference sequence, and I want to map my 454-generated contigs (not reads) from a closely related species against it. The contigs are in one large multi-record FASTA file, the chromosomes are in one large GenBank (.gbk) file, i.e., a single file with 15 sets of features plus sequence, ordered 1 through 15. I've tried Mauve Contig Mover but while it did what looks like a great mapping job, and nicely displays the contig and chromosome boundary information (and annotations of the reference sequence) in the final alignment graphic, none of the output files I see allow me to easily map contigs on a per-chromosome basis (e.g., "this set of contigs maps to chromosome 12 in this order and orientation...."). The .tab file in the output gives ordered contig coordinates on a single giant pseudochromosome, which is all but useless to me without an indication of how these relate to the chromosome boundaries of the reference sequence. The output also includes a contig directory... which is empty....?

Ultimately what I'm aiming for are synteny maps of each chromosome in my reference genome. I realize Mauve was developed mainly on prokaryotic (single chromosome) genomes, but am I missing something here? Is there an easy way to do what I want with Mauve, that I'm not seeing, short of running each chromosome separately as a reference sequence? If not, should I be trying a different contig mapper?
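One possible workaround for the coordinate problem, sketched in Python: if the pseudochromosome coordinates correspond to the reference chromosomes joined end to end in file order with no spacer between them (an assumption, not something confirmed for Mauve's output), cumulative chromosome lengths are enough to translate a pseudochromosome position back to a chromosome and a local position. The function names and chromosome lengths below are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(chrom_lengths):
    """chrom_lengths: list of (name, length) in concatenation order."""
    names = [name for name, _ in chrom_lengths]
    # starts[i] is the number of bases before chromosome i; starts[-1] is the total.
    starts = [0] + list(accumulate(length for _, length in chrom_lengths))
    return names, starts


def locate(pos, names, starts):
    """Map a 1-based pseudochromosome position to (chromosome, 1-based local position)."""
    if pos < 1 or pos > starts[-1]:
        raise ValueError(f"position {pos} is outside the concatenated sequence")
    idx = bisect_right(starts, pos - 1) - 1
    return names[idx], pos - starts[idx]


# Made-up lengths for two chromosomes.
names, starts = build_offsets([("chr1", 1000), ("chr2", 500)])
print(locate(1000, names, starts))
print(locate(1001, names, starts))
```

Mapping both the start and the end of each contig this way also flags contigs that appear to straddle a chromosome boundary, since the two ends then come back on different chromosomes.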

**flxlex** · 10-06-2010, 06:14 AM

Since this is such an old thread (that I happened to be subscribed to), may I suggest starting a new one with your question...

How to align contigs?
