**stefanoberri** · 07-01-2011, 05:13 AM

Hi.

I think the most up-to-date simulator is wgsim:

https://github.com/lh3/wgsim

History
=======

Wgsim was modified from MAQ's read simulator by dropping dependencies to other
source codes in the MAQ package and incorporating patches from Colin Hercus
which allow to simulate INDELs longer than 1bp. Wgsim was originally released
in the SAMtools software package. I forked it out in 2011 as a standalone
project. A few improvements were also added in this course.

**jstjohn** · 07-01-2011, 08:58 PM

Software list

Hey,
I am currently reviewing software for this purpose, so I know of quite a few options. For most of these you can just google "[prog name] simulation genome" or something similar and you will find them in the top few hits. For the Illumina one, you need to write to them to ask for it, and as far as I know it is not official.

* wgsim -> PE only, uniform error
* dwgsim -> Position specific error. PE only
* metasim -> PE only, specialized for simulating from a population
* in-house Illumina C++ -> doesn't model mate-pair chimeras; uses sampling of Illumina error strings as the error for the output. Doesn't model base-specific error though: the error, if it occurs, is the same for each underlying base.
* in-house Illumina perl -> adds proper handling of mate-pair simulation, but uses the same base-level error strategy as the C++ version, which is the main reason we chose to write our own. Doesn't model PE contamination in MP libraries, but the developer notes it would be easy to generate PE reads separately and mix them into the output file. Although ours ended up being backwards, we still successfully modeled different error rates depending on the underlying base.
* PEMer -> no mate-pair chimeras
* reseqsim -> focuses on SV analysis, doesn't do MP modeling
* simnext -> flat error rate like wgsim
* mason -> doesn't model mate-pair chimeras
* flux-capacitor -> models RNA-seq reads
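
To make the difference between a flat ("uniform") error rate and a position-specific one concrete, here is a minimal Python sketch. It is not how wgsim, dwgsim, or any other tool in the list is implemented; the function names (`uniform_rates`, `linear_rates`, `add_errors`) and the linear ramp are assumptions made purely for illustration, and only substitution errors are modeled.

```python
import random

BASES = "ACGT"

def uniform_rates(length, rate):
    """Flat model: every position in the read has the same error rate."""
    return [rate] * length

def linear_rates(length, start, end):
    """Position-specific model (hypothetical): rate ramps linearly from start to end."""
    if length == 1:
        return [start]
    step = (end - start) / (length - 1)
    return [start + i * step for i in range(length)]

def add_errors(read, rates, rng):
    """Substitute the base at position i with probability rates[i]."""
    out = []
    for base, rate in zip(read, rates):
        if rng.random() < rate:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(1)
read = "ACGTACGTACGTACGTACGT"
flat = add_errors(read, uniform_rates(len(read), 0.02), rng)
ramped = add_errors(read, linear_rates(len(read), 0.001, 0.10), rng)
```

A base-specific model, as mentioned for the in-house simulators, would additionally make the rate depend on the value of `base` rather than only on its position.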

And of course there is the one I wrote which we used in the first Assemblathon:
SimSeq: https://github.com/jstjohn/SimSeq?locale=en

**gene coder** · 07-03-2011, 03:54 PM

Thanks everyone for your replies.

I want a sequencing error simulator that matches the Illumina data in the 1000 Genomes Project, which is where I am getting my data from. (Being Illumina-specific is not a hard requirement, but it helps a bit. The type of error should not depend on the read length.)

I need read-pairs. Read length should be specifiable by the user. The insert size should follow a random distribution - Normal or whatever - that can be specified. SimSeq seems to satisfy those criteria at the moment but I have not tried it yet.
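
As a rough illustration of those requirements (user-specified read length, normally distributed insert size), here is a minimal Python sketch. It does not reflect how SimSeq or any other simulator works internally; `simulate_pair` and its parameters are hypothetical names, "insert size" is assumed to mean the full fragment length including both reads, and no sequencing errors are added.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse-complement an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]

def simulate_pair(genome, read_len, insert_mean, insert_sd, rng):
    """Return (read1, read2, start, insert) for one error-free read pair.

    Assumption: the insert is the whole fragment, so read1 is its first
    read_len bases and read2 is the reverse complement of its last read_len.
    """
    if read_len > len(genome):
        raise ValueError("read length exceeds genome length")
    # Redraw until the fragment can hold a read and fits in the genome.
    while True:
        insert = int(round(rng.gauss(insert_mean, insert_sd)))
        if read_len <= insert <= len(genome):
            break
    start = rng.randint(0, len(genome) - insert)
    fragment = genome[start:start + insert]
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])
    return read1, read2, start, insert

rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(2000))
read1, read2, start, insert = simulate_pair(genome, 50, 300, 30, rng)
```

Swapping `rng.gauss` for another sampler would change the insert-size distribution without touching the rest of the function.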

I have my own tailored donor genome for a particular kind of mutation that needs sequencing errors.

**gene coder** · 07-07-2011, 03:37 AM

If I want to use dwgsim for simulating read-pairs, can anyone explain the flags for me (http://sourceforge.net/apps/mediawik...ome_Simulation)?

What do -e and -E mean technically? What are the error rates relative to?

I think that -r is the mutation rate per base pair. Can that be confirmed?

What does -R, the fraction of indels, mean? Fraction of what?

-X and -y are also confusing. What are those probabilities relative to?


Simulate Illumina read-pairs
