Old 01-31-2013, 09:40 AM   #1
Location: Germany

Join Date: Jun 2011
Posts: 54
Question reference-based contig assembly?

Hi all,

I am facing a task which should be really easy to tackle, but I am kind of stuck. Maybe one of you has an idea of how to solve it.

I have a set of contigs (ranging from 2 to 120 kb) coming from a single bacterial species. I would like to assemble these contigs but I know that they do not overlap, i.e. that some parts of the genome are not present in my contigs. I therefore want to map my contigs against a reference genome and then just call a consensus alternative genome from my contigs, where all the gaps are filled with Ns.

I already tried samtools together with vcfutils' vcf2fq but this just outputs the reference genome again.
Here's my command for calling the consensus:

samtools view -uS sorted.sam | samtools mpileup -uf genome.fasta - | bcftools view -cg - | vcf2fq > consensus.fq
So, does anyone have an idea which alternative tool I could use for this?

I also thought about writing my own script to create fake paired-end contigs from the information I get by mapping the contigs against the reference.
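
For anyone who wants to try the "own script" route, here is a minimal Python sketch of the idea described above: start from a sequence of Ns as long as the reference, then copy each mapped contig's aligned bases onto its reference coordinates. The function names (`place_contig`, `n_filled_consensus`) are hypothetical, and the sketch assumes a single reference sequence, one alignment per contig, and input already extracted from the SAM fields POS (1-based), CIGAR and SEQ. Insertions relative to the reference are dropped and deleted positions stay as N, so the output keeps reference coordinates; a real tool would handle these cases more carefully.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def place_contig(consensus, pos, cigar, seq):
    """Copy the aligned bases of one contig into `consensus` (a list of chars).

    `pos` is the 1-based leftmost reference position (SAM POS field).
    """
    ref_i = pos - 1  # 0-based index into the reference
    seq_i = 0        # index into the contig sequence
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":    # aligned bases: write contig bases at this position
            consensus[ref_i:ref_i + length] = seq[seq_i:seq_i + length]
            ref_i += length
            seq_i += length
        elif op in "IS":   # insertion or soft clip: skip contig bases only
            seq_i += length
        elif op in "DN":   # deletion or skip: advance on the reference only,
            ref_i += length  # leaving those positions as N
        # H and P consume neither the contig nor the reference

def n_filled_consensus(ref_length, alignments):
    """Build a reference-length sequence with N wherever no contig maps."""
    consensus = ["N"] * ref_length
    for pos, cigar, seq in alignments:
        place_contig(consensus, pos, cigar, seq)
    return "".join(consensus)

# Toy example: two short "contigs" on a 12 bp reference.
toy = [(2, "4M", "ACGT"), (9, "2M1I1M", "GGTC")]
print(n_filled_consensus(12, toy))  # NACGTNNNGGCN
```

If contigs overlap on the reference, the one processed last overwrites the earlier one in this sketch; picking a winner by mapping quality or length would be a reasonable refinement.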
lukas1848 is offline   Reply With Quote
Old 01-31-2013, 09:56 AM   #2
Location: US

Join Date: Sep 2012
Posts: 91

Have you tried Mauve Contig Mover?
winsettz is offline   Reply With Quote
Old 01-31-2013, 01:05 PM   #3
Senior Member
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,079

Find mauve here
GenoMax is offline   Reply With Quote
Old 02-01-2013, 11:49 AM   #4
Senior Member
Location: Boston area

Join Date: Nov 2007
Posts: 747

MIRA assembler has a map-to-reference based assembly mode; AMOS suite also contains a comparative assembly tool.

Also see "Scaffolding low quality genomes using orthologous protein sequences". It looks like GRASS might also work for you.

There are probably many more.
krobison is offline   Reply With Quote
Old 02-01-2013, 12:54 PM   #5
Location: US

Join Date: Sep 2012
Posts: 91

And Velvet's Columbus mode, amongst others.

I'm curious whether the original poster assembled the reads naively (that is, without a reference) and is now comparing the result against a reference. I did a short de novo test using Illumina's DH10B MiSeq reads with and without the reference some time ago...
winsettz is offline   Reply With Quote
Old 02-03-2013, 07:55 AM   #6
Location: florida

Join Date: Jan 2013
Posts: 67

Also, there is the PAGIT pipeline, published in Nature Protocols. This pipeline uses ABACAS to map contigs onto a reference, fills the gaps between them with Ns, and then uses the original reads to close these gaps. Does this method work?
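
To illustrate the ordering-and-padding step mentioned here, a rough Python sketch follows. It is not ABACAS's actual implementation, only the general idea: sort contigs by where they map on the reference and join them with runs of N sized from the reference distance between neighbours. The function name `scaffold_with_gaps`, the `min_gap` parameter and the 0-based half-open coordinates are assumptions made for this example.

```python
def scaffold_with_gaps(placements, min_gap=1):
    """Join contigs in reference order, padding the gaps between them with Ns.

    `placements` is a list of (ref_start, ref_end, sequence) tuples using
    0-based, half-open reference coordinates. Overlapping or touching
    contigs are still separated by `min_gap` Ns rather than merged.
    """
    ordered = sorted(placements, key=lambda p: p[0])
    parts = []
    prev_end = None
    for ref_start, ref_end, seq in ordered:
        if prev_end is not None:
            # Gap size estimated from the reference, never below min_gap.
            gap = max(ref_start - prev_end, min_gap)
            parts.append("N" * gap)
        parts.append(seq)
        prev_end = ref_end if prev_end is None else max(prev_end, ref_end)
    return "".join(parts)

# Toy example: two contigs given out of order, 6 bp apart on the reference.
print(scaffold_with_gaps([(10, 14, "GGGG"), (0, 4, "AAAA")]))
```

Unlike a reference-length consensus, this keeps each contig's own sequence intact (including any insertions) and only borrows gap sizes from the reference.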
yzzhang is offline   Reply With Quote

reference-based, samtools consensus
