SEQanswers

Old 12-19-2011, 12:02 PM   #1
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 40
Experimental Design: Which Kind of Replicate to Use?

We are planning an F2 mapping project, and I'm unsure how I should replicate the parentals.

The parental strains we use are inbred model strains and hence supposedly genetically uniform. I have three choices for replication:
  1. Extract from 10 animals of each strain, pool 1 ug of RNA from each, and sequence this pooled RNA twice as technical replicates;
  2. Extract RNA from 2 animals of each parental strain, and sequence each animal once;
  3. Extract from 1 animal, and do a technical replicate.

Which one would make the most statistical sense in downstream analysis?

(BTW: Does that also mean it would be appropriate for me to do technical replicates for each of the F2 animals? Each of them would most certainly be considered a different treatment... That would cost us one extra flow cell.)
Old 12-19-2011, 12:57 PM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

It all depends on what downstream analysis you want to perform.

Without this, I can only give you the general advice that possibilities 1 and 3 are wrong for any purpose, and 2 may or may not be useful. Also, please explain to us why you think that technical replicates are important. In most cases, you do need biological replicates (see the multitude of earlier threads on the subject), while technical replicates offer little extra value, except for troubleshooting. Also, why would technical replicates require extra flow cells? Are you sure you are not confusing this with sequencing depth requirements, or forgetting about multiplexing?
Old 12-19-2011, 01:11 PM   #3
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 40

Thank you for your quick reply, Simon.

We are planning to strictly perform expression analysis using NGS. The F2 data are mainly for eQTL mapping, but the parental data may be used for any other kind of expression analysis, especially given that we're deriving some consomic and congenic lines from them.

As for the issue of technical replicates: Each F2 individual (n=40) would have a completely different genotype due to recombination, so finding a biological replicate for every animal is clearly impossible, unlike for their inbred parentals. If I understand you correctly, doing technical replicates for the F2 would not improve the statistical power in any way, so they are completely unnecessary?

And as for the "extra flow cell": I should have meant an extra lane.

Last edited by SamCurt; 12-23-2011 at 10:55 PM.
Old 12-23-2011, 11:07 PM   #4
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 40

By the way, Simon, there is now also a question about read depth vs. robustness.

You have mentioned on this forum that, in expression analyses, it's better to sacrifice depth for robustness through multiplexing. But how much depth should we sacrifice?

Our previous project was done on a GAII without multiplexing (one sample per lane), but we will switch to a HiSeq. Of course, since a HiSeq produces 5X as many reads as a GAII, a run of up to 5-plex on a single lane would still have GAII-level depth. But how about 9-plex (3 replicates for each strain), which would give only about 55% of GAII depth per sample? Would the decrease in depth be more than offset by the increase in degrees of freedom?
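The depth arithmetic in this post can be checked with a quick sketch. It assumes, as stated above, that a HiSeq lane yields about 5 times the reads of a GAII lane, and additionally that reads are split evenly across barcodes. The function name is made up for illustration.

```python
def relative_depth(plex, hiseq_vs_gaii=5.0):
    """Per-sample depth on one HiSeq lane, as a fraction of one full GAII lane.

    Assumes reads are divided evenly among the multiplexed samples.
    """
    if plex < 1:
        raise ValueError("plex must be at least 1")
    return hiseq_vs_gaii / plex


# Compare a few multiplexing levels against one-sample-per-lane on a GAII.
for plex in (1, 5, 9, 12):
    print(f"{plex:2d}-plex: {relative_depth(plex):.0%} of GAII depth per sample")
```

In practice barcodes rarely come out perfectly balanced, so the per-sample figures are best read as rough targets.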
Old 12-24-2011, 12:24 AM   #5
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Let's first discuss the simple case where it is clear how to replicate, e.g., treatment of cell cultures with some agent, where you can simply repeat the treatment on another sample. There, if you run twice the number of replicates at half the depth, you do not lose information, because your statistical power depends on the total number of reads per experimental condition, not on the number of reads per sample. DESeq, for example, keeps the replicates apart only to estimate the dispersion; for the actual test for differential expression, it sums up the counts from the replicates.

Hence, you should estimate how many reads in total you want per condition and then spread this read budget over as many replicate samples as practical. If you want to get the same power as in your previous experiment, keep the total number of reads per condition the same.
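A minimal sketch of this budgeting idea, with made-up numbers: fix a total read budget per condition, then divide it evenly over the replicates. The budget of 30 million reads and the function name are assumptions for illustration, not values from this thread.

```python
def reads_per_sample(total_reads_per_condition, n_replicates):
    """Split a fixed per-condition read budget evenly over replicates."""
    if n_replicates < 1:
        raise ValueError("need at least one replicate")
    return total_reads_per_condition // n_replicates


budget = 30_000_000  # hypothetical total reads per condition
for n in (1, 2, 3, 6):
    per_sample = reads_per_sample(budget, n)
    # The per-condition total stays the same; only the depth per sample shrinks.
    print(f"{n} replicate(s): {per_sample:,} reads each, {per_sample * n:,} in total")
```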
Old 12-24-2011, 12:40 AM   #6
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Now for your specific experiment, first regarding the parents: Even the expression of isogenic litter mates reared together typically differs considerably. If you either pool several mice or sample several mice individually, this variation will average out to a degree, so that your result is closer to the population mean than if you had only one mouse. However, for most potential downstream analyses, it will be important to know how far away you still are from the population mean, i.e., how much variation is left, and this you cannot see if you pool all samples. It may, however, make sense to make two or three pools, each with a different subset of the available mice, if multiplex tagging each mouse is not practical. The preferred way is, of course, always to pool only in silico, i.e., to get separate counts and sum them only later, in the analysis.

Of course, in the case of vertebrate animals, my statement about "as many replicate samples as practical" needs to be modified. We do not want to sacrifice more animals than strictly needed. Especially here, when looking for a good experimental design, it can help to take published data from a similar system and play with it to get a feel: re-analyse it after throwing out some samples or a fraction of the reads, and see how power changes as the number of reads or samples is reduced.
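One way to "play with" a published count table along these lines is sketched below. The toy table, the sample names, and both helper functions are hypothetical. Dropping reads is mimicked by binomial thinning, i.e., keeping each read independently with a fixed probability, which is one reasonable choice and not necessarily what was meant in this post.

```python
import random


def thin_counts(counts, keep_fraction, seed=0):
    """Binomially thin each count, mimicking a run with fewer reads.

    Every read is kept independently with probability keep_fraction.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [sum(rng.random() < keep_fraction for _ in range(c)) for c in counts]


def drop_samples(table, samples_to_keep):
    """Keep only the chosen sample columns of a {sample: counts} table."""
    return {name: table[name] for name in samples_to_keep}


# Hypothetical toy table: three replicate samples, four genes each.
table = {
    "rep1": [120, 15, 0, 940],
    "rep2": [101, 22, 3, 870],
    "rep3": [133, 9, 1, 1010],
}

# Throw out one sample, then keep roughly half of the remaining reads.
reduced = drop_samples(table, ["rep1", "rep2"])
halved = {
    name: thin_counts(counts, 0.5, seed=i)
    for i, (name, counts) in enumerate(reduced.items())
}
print(halved)
```

The reduced table can then be fed to whatever differential expression analysis is planned, to see which results survive at the lower depth or sample number.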

For the F2 mice: No, you do not need technical replicates. You just need to make sure you have enough sequencing depth per mouse so that any expression effects are not drowned in Poisson noise, and this depends on whether you want to look only at strongly expressed genes or also at moderately expressed ones. You need lots of mice, of course. 40 does sound good, but you may want to do some math before starting to make sure.
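To get a rough feel for the Poisson noise mentioned here: for a Poisson-distributed count with mean m, the standard deviation is sqrt(m), so the relative noise is 1/sqrt(m). The sketch below just tabulates this for a few illustrative expected counts; the function name and the chosen counts are assumptions, and biological variation comes on top of this.

```python
import math


def poisson_cv(expected_count):
    """Relative Poisson (shot) noise: standard deviation / mean = 1 / sqrt(mean)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return 1.0 / math.sqrt(expected_count)


# Weakly to strongly expressed genes, in expected reads per gene per sample.
for count in (10, 100, 1000, 10000):
    print(f"expected count {count:>5}: relative noise about {poisson_cv(count):.1%}")
```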
Old 12-24-2011, 11:35 AM   #7
SamCurt
Member
 
Location: Iowa

Join Date: May 2010
Posts: 40

Thank you for your detailed reply, Simon.

For Parentals: My understanding from microarray statistics was that if you have n individuals, it is better to divide them into smaller subsets and run a microarray on each subset pool separately, rather than pool all n individuals and do technical replicates. Your basic idea of "keep the total read count of each biological group constant" pretty much echoes this view.
Given our number of parentals (10 per strain), we're pretty much limited by the number of barcodes that can be run on a lane (12). The best would be 5 pools (i.e., what you wrote as subsets) per strain, but we will probably have to settle for 3 or 4 pools per strain, even though the number of animals per pool may then not be the same.

F2: I am relieved that technical replicates are not needed. Of course I will do some model testing for robustness, but we have tissue from far more than 40 animals archived, and it would not be too much of a hassle to extract a bit more RNA...