**joro** · 05-13-2010, 01:10 AM

Hi Balat,

DESeq supports testing without replicates. See the thread "edgeR with no replication (Common disp or poisson)" at

http://seqanswers.com/forums/showthread.php?t=4055

and the documentation at http://bioconductor.org/packages/rel...tml/DESeq.html

**Davis McC** · 05-26-2010, 07:35 PM

Hi Balat

The Bioconductor package edgeR also supports DE analysis without replication - see the discussion at the link posted by joro. Find out more about edgeR here - I recommend having a look at the User's Guide to get a feel for what the package does and how to use it. The section on Poisson analysis is most relevant if you want to analyse data without replication.

Best regards
Davis

**Simon Anders** · 05-26-2010, 10:55 PM

Hi,

DESeq, when given data without replicates, will switch to a conservative mode that overestimates the variance, as I described in the post that joro cited. edgeR can do the same, but you have to tell it what dispersion estimate to use. Be sure to read Davis's post in the same thread. We both stress that setting the dispersion to 0 (Poisson test) will never give reliable results.
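To illustrate why a dispersion of 0 is unreliable, here is a minimal sketch in Python (not DESeq or edgeR code). It assumes the usual negative binomial parameterisation, variance = mean + dispersion * mean^2, where dispersion 0 reduces to Poisson. The mean count and the dispersion value are hypothetical numbers chosen only for illustration.

```python
import math

def nb_variance(mean, dispersion):
    """Variance of a count under the assumed model: mean + dispersion * mean**2.

    With dispersion == 0 this is the Poisson variance (equal to the mean).
    """
    return mean + dispersion * mean ** 2

mean_count = 1000.0         # hypothetical expected read count for one gene
assumed_dispersion = 0.05   # hypothetical dispersion, for illustration only

poisson_sd = math.sqrt(nb_variance(mean_count, 0.0))
nb_sd = math.sqrt(nb_variance(mean_count, assumed_dispersion))
sd_ratio = nb_sd / poisson_sd

print(f"Poisson sd:           {poisson_sd:.1f}")
print(f"Negative binomial sd: {nb_sd:.1f}")
print(f"Ratio:                {sd_ratio:.1f}")
```

Under these assumptions the Poisson model understates the standard deviation of a well-expressed gene several-fold, so differences that are ordinary sample-to-sample variation would look significant.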

Why did you pool your data into bulk samples? I understand that sequencing each sample individually would have been too expensive, but you could have pooled the 10 stress samples into two pools of five each. Then you would have sequences for three pools, two of which would have been biological replicates, which is fully sufficient to get a good noise estimate. If you had used barcoded adapters, it might not even have cost more.

Simon

**Balat** · 05-26-2010, 11:27 PM

Hi Simon,
Thanks for the reply. Yes, I could have used two bulks of five each, but unfortunately I didn't. However, I have analysed my data using the sample clustering feature of DESeq. I have sequence data from two populations. I have denoted the control samples from each population as S0-P and S0-K, and similarly those from stress2 as S1-P and S1-K and those from stress1 as S2-P and S2-K (there was an error in my labelling). The heat map clearly separates the treatments. Moreover, expression in the two populations within a treatment looks very similar, as it would for biological replicates. In that case, can I treat the two populations as biological replicates? I have attached the heat map here.
Thank you very much.

Attached Files

S0PK vs S2PK vs S1PK.jpg (8.3 KB, 60 views)

**Balat** · 05-27-2010, 12:02 AM

Hi Davis,
I just had a look at your reply to Sergio. I have used edgeR, treating S0-P and S0-K, and S1-P and S1-K, as biological replicates. I got a common dispersion estimate of 0.06. Based on this result and the heat map figure, is it OK to treat my two populations (P and K) as biological replicates?

Thank you.

**Simon Anders** · 05-27-2010, 12:51 AM

Originally posted by Balat

In that case, can I treat the two populations as biological replicates?

If they are two independently grown populations, you don't just treat them as biological replicates; they are biological replicates. So go ahead and use them that way.

Simon

**Davis McC** · 05-27-2010, 03:57 PM

I concur with Simon - you either have biological replicates or you do not, based on the origin of your samples. It is not something that is determined in the analysis of the data.

I have used edgeR by treating S0-P and S0-K and S1-P and S1-K as biological replicates. I got a common dispersion estimate of 0.06.

By way of interpretation, the common dispersion estimate is the "squared coefficient of variation", which is a measure of the inter-library variability, distinct from the technical variability. Here the coefficient of variation is therefore approximately 0.24, which we would interpret as indicating that the true concentration of each gene (an unobservable quantity) varies up and down by 24% between libraries.
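The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration, not edgeR code: the function names are made up here, and the second function assumes the negative binomial model in which the total squared coefficient of variation of a count is 1/mean (technical, Poisson part) plus the dispersion (between-library part). The mean count of 100 is a hypothetical value.

```python
import math

def cv_from_dispersion(dispersion):
    """Between-library coefficient of variation: square root of the dispersion."""
    return math.sqrt(dispersion)

def total_cv(mean, dispersion):
    """Total CV of a count, assuming squared CV = 1/mean + dispersion."""
    return math.sqrt(1.0 / mean + dispersion)

common_dispersion = 0.06   # the estimate reported in this thread
cv = cv_from_dispersion(common_dispersion)
print(f"between-library CV: {cv:.3f}")

# Hypothetical gene with an expected count of 100 reads
print(f"total CV at mean 100: {total_cv(100.0, common_dispersion):.3f}")
```

For genes with large counts the 1/mean term becomes negligible, so the total variability approaches the between-library coefficient of variation.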

The assumption here is that the coefficient of variation is more or less constant across all genes. Now, Simon would rightly point out that this assumption does not hold for all RNA-seq datasets, but it does give you some idea of the variability you see between sample replicates in your data.

Cheers
Davis

**Balat** · 05-27-2010, 04:47 PM

Thanks Simon and Davis.
The populations in my study are separately grown populations of the same species under common garden conditions. I was expecting that the gene expression patterns would be very different between the populations. But the clustering analysis in DESeq clearly shows that the expression patterns within each treatment are similar between the two populations.
