Hi there
I gather that most people aren't bothering with replication for quantitative RNA-Seq experiments, i.e. sequencing multiple biological replicate samples for each treatment under investigation. Of course it makes the experiment ridiculously expensive! But I think it's really important. A very patient statistician is helping me with the design of a digital gene expression profiling experiment (RNA-Seq - either SOLiD or Illumina, haven't decided yet). The design includes 2 treatments and a number of biological reps for each treatment, and the aim is to detect differentially expressed genes between the 2 treatments.
I'd like to do some power calculations to determine the minimum number of reps per treatment I can get away with, with and without sample multiplexing (i.e. running several replicate samples in the same lane). For these calculations, I need an estimate of the between-sample variability of the final data, which I could get from an existing data set that uses this design. I'm having trouble finding one...
Can anyone help, either by providing a data set which uses biological replication, or providing a between-lane standard deviation (from normalised data) from such an experiment, or simply by shedding light on the variability between reps which one might normally expect to see in Illumina or SOLiD RNA-Seq data? I know it depends on the biological variability between samples, but I figure any information is better than none.
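For context, here is a rough sketch of the kind of calculation the standard deviation would feed into. It is only an illustration under assumptions of mine: a two-sided, two-sample comparison of log2 normalised expression for a single gene, using the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2. The function name, the fold change and the default alpha (chosen small as a crude stand-in for multiple-testing correction) are placeholders, not values from any data set, and this is not necessarily the method the statistician will end up using.

```python
import math
from statistics import NormalDist


def reps_per_treatment(sd_log2, fold_change, alpha=0.001, power=0.8):
    """Approximate biological replicates needed per treatment.

    Assumes a two-sided, two-sample comparison of log2 expression for one
    gene with equal group sizes and a common between-sample standard
    deviation (normal approximation, so small n will be optimistic).
    """
    if sd_log2 <= 0:
        raise ValueError("sd_log2 must be positive")
    if fold_change <= 0 or fold_change == 1:
        raise ValueError("fold_change must be positive and not equal to 1")

    # Effect size on the log2 scale (a 2-fold change is a difference of 1).
    delta = abs(math.log2(fold_change))

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = std_normal.inv_cdf(power)           # desired power

    n = 2 * ((z_alpha + z_beta) * sd_log2 / delta) ** 2
    # At least 2 reps are needed to estimate any variance at all.
    return max(2, math.ceil(n))


if __name__ == "__main__":
    # Hypothetical between-sample SDs, just to show how strongly n depends on them.
    for sd in (0.25, 0.5, 1.0):
        print(sd, reps_per_treatment(sd, fold_change=2.0))
```

The required number of reps grows with the square of the standard deviation, which is why even a rough estimate from someone else's replicated data would help.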
Thanks
Anar