**westerman** · 02-20-2012, 06:58 AM

I've never done such a project, but it sounds like an interesting one.

I would map each sample's reads against the reference and eliminate the reads that map. Then use Velvet (or another de novo assembler) to assemble the remaining per-sample reads, and use Glimmer (or another gene finder) to detect the genes.

Your idea of a 'super assembly' is a good one; however, you might get better results by first eliminating the reads that already map to the reference.
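The map-then-discard step described above can be sketched as follows. This is only a minimal illustration, assuming the mapper wrote its alignments in SAM format, where FLAG bit 0x4 marks an unmapped read. The function name `unmapped_reads_as_fastq` is hypothetical, and in practice a dedicated tool such as samtools would normally do this filtering.

```python
# Toy sketch: keep only the reads that did NOT map to the reference,
# so they can be passed on to a de novo assembler.
# Assumption: input lines are SAM records (tab-separated, 11+ fields).

UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads_as_fastq(sam_lines):
    """Yield one FASTQ record (as a string) per unmapped SAM record."""
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & UNMAPPED:
            yield f"@{name}\n{seq}\n+\n{qual}\n"


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.0\n",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",  # mapped
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tHHHH\n",         # unmapped
    ]
    for record in unmapped_reads_as_fastq(example):
        print(record, end="")
```

The resulting per-sample FASTQ would then be the input to Velvet or whichever assembler is used.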

**rghan** · 02-20-2012, 08:01 AM

Have you read the paper below? The paper and its supplementary material describe an interesting pipeline that might be useful to you.

http://www.nature.com/nature/journal/v477/n7365/full/nature10414.html

The link below points to the software pipeline they employed. We're still trying to get it to work properly in-house, but we have a much larger genome than you do.

http://mus.well.ox.ac.uk/19genomes/IMR-DENOM/

**Zam** · 02-20-2012, 12:13 PM

An alternative approach is to assemble a "graph" of all of your samples simultaneously, and then look either at the accumulation of new variants or at which contigs are shared by which strains. Alternatively, you could build an assembly of your first strain by standard means, compare it with your joint assembly of all strains, and pull out "novel" contigs that differ from your original assembly. All of these are supported by this software (disclosure: I am an author):
cortexassembler.sourceforge.net
You might take a look at this paper:

http://dx.doi.org/10.1038/ng.1028

which does something similar by assembling 164 human genomes and looking for novel sequence that differs from the human reference.
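To make the idea of pulling out "novel" contigs concrete, here is a toy sketch. It is not how the software mentioned above works internally; it simply assumes that a contig counts as novel when most of its k-mers are absent from the first strain's assembly. The names `canonical_kmers` and `novel_contigs`, the value of `k`, and the `min_novel_fraction` threshold are all hypothetical choices made for illustration.

```python
# Toy sketch: flag contigs from a joint assembly whose k-mers are mostly
# absent from a reference (first-strain) assembly.
# Assumption: contigs are given as {name: sequence} dictionaries.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers (strand-independent) in seq."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing N or other ambiguous characters.
        if set(kmer) <= set("ACGT"):
            kmers.add(min(kmer, revcomp(kmer)))
    return kmers


def novel_contigs(reference_contigs, joint_contigs, k=5, min_novel_fraction=0.5):
    """Map contig name -> fraction of its k-mers not seen in the reference."""
    ref_kmers = set()
    for seq in reference_contigs.values():
        ref_kmers |= canonical_kmers(seq, k)

    novel = {}
    for name, seq in joint_contigs.items():
        kmers = canonical_kmers(seq, k)
        if not kmers:
            continue  # contig shorter than k, nothing to compare
        fraction = len(kmers - ref_kmers) / len(kmers)
        if fraction >= min_novel_fraction:
            novel[name] = fraction
    return novel


if __name__ == "__main__":
    first_strain = {"ref1": "ACGTACGTTAGC"}
    joint = {"shared": "ACGTACGTTAGC", "new": "GGGGGTTTTTGGGGG"}
    print(novel_contigs(first_strain, joint))
```

With real data, `k` would be much larger than in this example, and the threshold would need tuning against sequencing error and repeats.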

**Zam** · 02-20-2012, 12:14 PM

Oops, signed off too quickly - hope that made sense - feel free to email me if not (zam AT well.ox.ac.uk)

**green tree** · 02-20-2012, 03:19 PM

Hi everyone, thanks for the responses! Zam, great link and interesting paper (I was actually just thinking about this in the human population).

Finding new regions of DNA in genome assemblies
