SEQanswers

Old 06-05-2010, 10:04 PM   #1
Zimbobo
Member
 
Location: US

Join Date: Mar 2010
Posts: 25
Default RNA-Seq simulation

Hello,

Does anyone know of any software that produces simulated RNA-Seq data? I am interested in questions like how many reads are needed for a good assembly with Velvet, for example, and which read errors produce which problems in the assembly. Thanks in advance for any pointers.
Old 06-06-2010, 05:11 AM   #2
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

You are probably better off downloading an RNA-Seq dataset from the Short Read Archive -- this is more likely to represent the real biases you will find in RNA-Seq data (such as strand, 3' vs. 5', etc.).
Old 06-07-2010, 07:24 AM   #3
brentp
Member
 
Location: salt lake city, UT

Join Date: Apr 2010
Posts: 72
Default

You could check out FluxSimulator.
Old 06-07-2010, 07:42 AM   #4
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

The FluxSimulator looks interesting, but someone needs to fix the pages -- there is a consistent typo which makes it impossible to read them aloud in polite company!
Old 06-21-2010, 07:30 AM   #5
vinay052003
Member
 
Location: Atlanta, US

Join Date: Jan 2010
Posts: 59
Default

These are no doubt good suggestions. One simple way would be to take mRNA sequences from a public database and chop them up randomly (in silico). Repeat this process a few times until you reach the desired coverage.
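
The chop-up approach described above can be sketched in a few lines of Python. This is only a minimal illustration, not a substitute for a real simulator: the function name `simulate_reads`, its parameters, and the toy transcript are all made up for the example. It assumes reads are sampled uniformly, error-free, and from one strand, and that every transcript gets the same coverage (no expression levels, no 3'/5' bias).

```python
import math
import random


def simulate_reads(transcripts, read_length=50, coverage=10.0, seed=0):
    """Sample error-free reads uniformly at random from each transcript.

    transcripts: dict mapping a transcript name to its sequence (hypothetical input).
    coverage: target mean coverage, i.e. total read bases / transcript length.
    Returns a list of (read_id, read_sequence) tuples.
    """
    rng = random.Random(seed)
    reads = []
    for name, seq in transcripts.items():
        if len(seq) < read_length:
            continue  # too short to yield a full-length read
        # Smallest number of reads whose bases add up to the target coverage
        n_reads = math.ceil(coverage * len(seq) / read_length)
        for i in range(n_reads):
            start = rng.randrange(len(seq) - read_length + 1)
            # The read ID records the true origin, which is handy for later checks
            reads.append((f"{name}:{start}:{i}", seq[start:start + read_length]))
    return reads


# Toy input: one 200-base "transcript"
transcripts = {"tx1": "ACGT" * 50}
reads = simulate_reads(transcripts, read_length=50, coverage=10.0, seed=1)
print(len(reads), "reads simulated")
```

Because each read ID carries the transcript name and start position, the output can be checked against wherever an assembler or aligner later places the read.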
Old 06-22-2010, 02:12 AM   #6
lexa
Member
 
Location: MPI

Join Date: Jun 2010
Posts: 17
Default

You can use dwgsim from the dnaa package. You can specify the number of reads, the read length (even for paired-end data), and a reference.

Maq also contains a simulation tool, which seems similar to dwgsim.
Old 06-25-2010, 02:29 AM   #7
micha
Junior Member
 
Location: Barcelona

Join Date: Jan 2009
Posts: 1
Default

Quote:
The FluxSimulator looks interesting, but someone needs to fix the pages -- there is a consistent typo which makes it impossible to read them aloud in polite company!
Keith,
Someone fixed the page; I think you can read it aloud now. The timestamp of the former HTML file read 00:58, which was probably not the best moment of that day. Thanks for bringing this typo to our attention!
Old 01-26-2011, 04:43 PM   #8
seqmagician
Junior Member
 
Location: USA

Join Date: Aug 2010
Posts: 3
Default Link to FluxSimulator paper.

Could anyone please point me to a link to the FluxSimulator paper? I did not find one on their web pages. Thanks.
Old 01-27-2011, 03:59 AM   #9
lexa
Member
 
Location: MPI

Join Date: Jun 2010
Posts: 17
Default

As far as I know, there is no paper about FluxSimulator.
Old 04-12-2011, 12:38 PM   #10
catbus
Member
 
Location: San Francisco

Join Date: Feb 2011
Posts: 21
Default USeq: RNA-Seq Simulator (requires *real* data as input, however)

There's also "RNA Seq Simulator," which is part of USeq. Note that this requires REAL RNA-seq data as input; it then simulates various types of factors that cause differential gene expression.

http://useq.sourceforge.net/cmdLnMen...NASeqSimulator
Old 12-06-2011, 05:07 AM   #11
greggrant
Member
 
Location: philadelphia

Join Date: Dec 2008
Posts: 28
Default

Quote:
Originally Posted by Zimbobo View Post
Hello,

Does anyone know of any software that produces simulated RNA-Seq data? I am interested in questions like how many reads are needed for a good assembly with Velvet, for example, and which read errors produce which problems in the assembly. Thanks in advance for any pointers.
Please try our simulator BEERS:

http://cbil.upenn.edu/BEERS/
Old 10-16-2012, 07:01 AM   #12
jingjinghao
Junior Member
 
Location: Tsinghua University

Join Date: Oct 2012
Posts: 3
Default

Quote:
Originally Posted by greggrant View Post
Please try our simulator BEERS:

http://cbil.upenn.edu/BEERS/
Hi, sir. I have tried BEERS. Thank you for your good simulator. It did me a big favor.
Some questions:
(1) BEERS generates reads like this: genes from combined gene models (RefSeq, AceView...) --> transcripts --> add polymorphisms --> reads --> add sequencing errors and position bias. Is that right?
(2) How does BEERS decide which, and how many, genes and transcripts are "expressed"?
(3) According to which distribution are reads generated from transcripts?

Thank you very much!

Last edited by jingjinghao; 10-16-2012 at 07:02 AM. Reason: misspelling
Old 08-06-2014, 12:33 AM   #13
Jegar
Junior Member
 
Location: Cambridge

Join Date: Aug 2014
Posts: 6
Default

There seems to be a broad range of RNA-Seq simulators. Has anyone done a comparison, or does anyone know of a paper that examines them empirically? I'd just be interested to know how their different features compare. Here is the list I have compiled so far (apologies for duplicates); some of these may be only for DNA-Seq simulation.

http://www.biomedcentral.com/1471-2164/13/74 - GemSim

http://bioinformatics.oxfordjournals...rmatics.btr708 - ART

https://github.com/jstjohn/SimSeq - SimSeq

https://popmodels.cancercontrol.canc...lux-simulator/ - FLUX Simulator

https://github.com/lh3/wgsim - wgsim in SAMtools

http://omictools.com/simulators2/ - Massive range of DNA-seq simulators

http://useq.sourceforge.net/cmdLnMen...NASeqSimulator - RNAseq simulator

http://cbil.upenn.edu/BEERS/ - BEERS


Importantly, is it better to use one of these simulators than to just download something from the Short Read Archive? In which contexts?
Old 08-06-2014, 12:39 AM   #14
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

I imagine such a paper would be difficult to get published (I can already see the reviewer comments of "not novel" and "too trivial", even though such a paper would end up being useful for the community).

Regarding when you might use a simulator vs. an actual experiment, the only benefit to a simulator is that you can know exactly where the reads should align and where their mismatches are. If you need to test the accuracy of an aligner, then that's something you need. Similarly, if you want to test methods for calling SNPs or finding RNA editing sites, then you need a dataset with known changes. Of course the error profiles of the resulting reads are never perfect, so you end up needing to use a real dataset too, just to compare raw alignment/call rates (you obviously can't know accuracy from that).
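
To make the point about known read origins concrete, here is a rough Python sketch of how aligner accuracy could be scored against simulated truth. Everything in it is an assumption for illustration: the function name `alignment_accuracy`, the dictionaries mapping read IDs to (chromosome, position), and the toy values. A real evaluation would parse these from the simulator's output and the aligner's SAM/BAM file, and would also need to handle spliced and multi-mapped reads.

```python
def alignment_accuracy(truth, aligned, tolerance=0):
    """Score alignments against the known origins of simulated reads.

    truth:   dict read_id -> (chrom, pos), the position each read was drawn from.
    aligned: dict read_id -> (chrom, pos), as reported by the aligner;
             reads the aligner failed to place are simply absent.
    Returns (fraction placed correctly, fraction aligned at all).
    """
    if not truth:
        raise ValueError("no simulated reads to score")
    correct = 0
    placed = 0
    for read_id, (chrom, pos) in truth.items():
        hit = aligned.get(read_id)
        if hit is None:
            continue  # unaligned read
        placed += 1
        # Correct if on the right chromosome and within `tolerance` bases
        if hit[0] == chrom and abs(hit[1] - pos) <= tolerance:
            correct += 1
    return correct / len(truth), placed / len(truth)


# Toy data: four simulated reads, three of which the aligner placed
truth = {"r1": ("chr1", 100), "r2": ("chr1", 200),
         "r3": ("chr2", 50), "r4": ("chr2", 75)}
aligned = {"r1": ("chr1", 100), "r2": ("chr1", 205), "r3": ("chr1", 50)}

accuracy, alignment_rate = alignment_accuracy(truth, aligned)
print(accuracy, alignment_rate)
```

With a real dataset only the second number (the raw alignment rate) is available, which is why the two kinds of data complement each other.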
Old 08-06-2014, 01:22 AM   #15
Jegar
Junior Member
 
Location: Cambridge

Join Date: Aug 2014
Posts: 6
Default

Thanks for your reply and helpful clarification.

It sounds like a comparison of simulators might be good for a blog post - shame that a helpful piece of work like that wouldn't get published (I have to agree it would not be considered novel, despite there being no existing published comparison paper).

I am exploring error signatures produced through biological processes in the sequencing workflow, and am attempting to reproduce the workflow in silico. From what I gather, Flux Simulator might have some love for me, but if not I'll get Python to do the heavy lifting.
Old 08-06-2014, 01:35 AM   #16
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

If you do happen to perform such a comparison, please do post a link to it here and/or on biostars, since I expect many people would find it interesting.
Old 08-06-2014, 04:12 AM   #17
mbblack
Senior Member
 
Location: Research Triangle Park, NC

Join Date: Aug 2009
Posts: 245
Default

Quote:
Originally Posted by Zimbobo View Post
Hello,

Does anyone know of any software that produces simulated RNA-Seq data? I am interested in questions like how many reads are needed for a good assembly with Velvet, for example, and which read errors produce which problems in the assembly. Thanks in advance for any pointers.
Why not download a real data set from NCBI and then randomly sample from that to derive pseudo-data sets of varying read depths? That way you would have a realistic baseline to compare to.
__________________
Michael Black, Ph.D.
ScitoVation LLC. RTP, N.C.
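
For anyone wanting to try the subsampling idea from this post, a minimal Python sketch follows. It assumes plain, unwrapped four-line FASTQ records; the helper names `read_fastq` and `subsample`, the in-memory toy data, and the chosen depths are all invented for the example. Reservoir sampling is used so that a large file can be sampled in a single pass without loading it all into memory.

```python
import io
import random


def read_fastq(handle):
    """Yield FASTQ records as 4-tuples (assumes unwrapped 4-line records)."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return  # end of file
        yield tuple(line.rstrip("\n") for line in record)


def subsample(records, n, seed=0):
    """Reservoir sampling: a uniform random sample of n records in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            # Keep the new record with probability n / (i + 1)
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir


# Toy stand-in for a downloaded data set: 100 identical-looking reads
fastq_text = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(100))
records = list(read_fastq(io.StringIO(fastq_text)))

# Pseudo-data sets at three read depths
subsets = {n: subsample(records, n, seed=42) for n in (10, 25, 50)}
for n, subset in subsets.items():
    print(n, len(subset))
```

For paired-end data, running `subsample` with the same seed and `n` on both mate files should pick the same record positions in each, as long as the two files list the reads in the same order.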
Old 08-06-2014, 05:30 AM   #18
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480
Default

Quote:
Originally Posted by mbblack View Post
Why not download a real data set from NCBI and then randomly sample from that to derive pseudo-data sets of varying read depths? That way you would have a realistic baseline to compare to.
Note that the person you're replying to posted that ~4 years ago...
All times are GMT -8.

