Old 01-31-2014, 07:00 AM   #1
Junior Member
Location: Virginia, USA

Join Date: Jan 2014
Posts: 8
RNA-Seq Experimental Design


My name is David Brohawn and I am new to RNA-Seq.

My advisor and I are interested in doing an RNA-Seq experiment to compare the transcriptomes of iPSC-derived neurons we generate from both ALS patients and controls. Ultimately we would like to identify molecular phenotypes based on transcriptome expression profiles for different instances of ALS (much as cancer researchers now identify underlying molecular phenotypes for different instances of a given cancer).

We are primarily interested in generating transcriptome profiles (involving both coding and non-coding RNA and novel transcripts), with a heavy interest in differential gene expression and less interest in mapping full transcript isoforms.

As I understand it, a greater number of short reads is best for assessing differential gene expression (SOLiD and Illumina look most amenable to this), while a smaller number of long reads is best for assessing isoforms (Roche and PacBio look most amenable to this).

I see the ENCODE project recommends “Experiments whose purpose is discovery of novel transcribed elements and strong quantification of known transcript isoforms… a minimum depth of 100-200 M 2 x 76 bp or longer reads is currently recommended.”

We plan to use Illumina TruSeq total RNA prep kits followed by sequencing on the Illumina HiSeq 2500. An Illumina rep quoted 187 million reads per lane as typical output for a 2 x 100 run. If this is true, I am thinking we would multiplex our 20 total samples (10 cases and 10 controls) and run 11 total lanes, which would average out to just over 100 million reads per sample.
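To make the arithmetic explicit, here is a rough sketch in Python. It assumes the rep's quoted 187 million reads per lane holds and that all 20 samples are pooled evenly across every lane; real yields and barcode balance will vary, and the function name is just for illustration.

```python
def reads_per_sample(reads_per_lane, lanes, samples):
    """Average reads per sample, assuming samples are pooled evenly across lanes."""
    return reads_per_lane * lanes / samples


# Figures from the plan above: quoted lane output, 11 lanes, 20 samples.
QUOTED_READS_PER_LANE = 187_000_000  # rep's quote, not a guaranteed yield

depth = reads_per_sample(QUOTED_READS_PER_LANE, lanes=11, samples=20)
print(f"{depth / 1e6:.2f} million reads per sample")  # 102.85 million
```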

We would then analyze the data with the Tuxedo suite of bioinformatics tools (we may substitute STAR for TopHat and Bowtie) and visualize our data using CummeRbund.

We are considering purchasing a Linux-based machine or a Mac with these specs for processing:

CPU – 2 quad-core processors
Clock speed – 3.2 GHz
RAM – 24 GB
HDD – 8 TB (RAID array of four 2 TB drives)

I have been told the number of reads per sample may be overkill given our goals, but I am really just following ENCODE's recommendations. Do you all have any suggestions based on what I have reported?

Thanks for taking the time to read and respond!

Dave Brohawn
dbroh11
Old 01-31-2014, 07:23 AM   #2
Senior Member
Location: London

Join Date: Jun 2009
Posts: 298

You could run all 20 of your samples across 2 lanes and get somewhere approaching 20 million reads per sample. This should be more than adequate for differential expression analysis.
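As a quick check on lane counts, the sketch below works backwards from a target depth to the number of whole lanes needed. It assumes the 187 million reads per lane quoted earlier in the thread and even pooling of all 20 samples; the helper name is hypothetical.

```python
def lanes_needed(target_reads_per_sample, samples, reads_per_lane):
    """Smallest whole number of lanes that reaches the target average depth."""
    total_reads = target_reads_per_sample * samples
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-total_reads // reads_per_lane)


READS_PER_LANE = 187_000_000  # quoted figure from the thread, an assumption
SAMPLES = 20

# Two lanes give 2 * 187 / 20 = 18.7 million reads per sample, just under 20 million.
print(lanes_needed(20_000_000, SAMPLES, READS_PER_LANE))   # 3 lanes for a strict 20 million
print(lanes_needed(100_000_000, SAMPLES, READS_PER_LANE))  # 11 lanes for 100 million
```

So 2 lanes lands slightly below 20 million per sample, while a strict 100 million target is what pushes the design to 11 lanes.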
TonyBrooks
Old 01-31-2014, 07:30 AM   #3
Senior Member
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,060

Cross-posted. Please use the other thread since this one is in "Introductions":
GenoMax
Old 01-31-2014, 07:32 AM   #4
Junior Member
Location: Virginia, USA

Join Date: Jan 2014
Posts: 8

Yup, I didn't know how to change forums when I first signed up. Thank you for your help!
dbroh11

rna-seq advice, rna-seq design, rna-seq recommendations, rna-seq suggestions