**gringer** · 07-02-2016, 05:57 AM

DESeq2 doesn't need replicates, but they are strongly encouraged, and most bioinformaticians would advise the use of replicates if consulted prior to an experimental study. Without knowledge of the biological variation, it is difficult to establish whether or not an observed difference is statistically significant.

You may be able to treat different temperature brackets as biological replicates, but it would depend on the specifics of your investigation as well as what you are looking to get out of the experiment.

**SDPA_Pet** · 07-02-2016, 06:07 AM

Originally posted by gringer:

DESeq2 doesn't need replicates, but they are strongly encouraged, and most bioinformaticians would advise the use of replicates if consulted prior to an experimental study. Without knowledge of the biological variation, it is difficult to establish whether or not an observed difference is statistically significant.

You may be able to treat different temperature brackets as biological replicates, but it would depend on the specifics of your investigation as well as what you are looking to get out of the experiment.

Hi Gringer,

Are you sure that DESeq2 doesn't need replicates? I remember emailing the author a long time ago, and he said it needs replicates. At that time, it might have been DESeq (DESeq2 had not come out yet).

Personally, I don't remember reading any papers that didn't use replicates in their experiments. If you have read any papers without replicates in the analysis, I would appreciate it if you could share them with me. I just want to make sure.

My 8 sample sites are at 8 different temperatures, for example 80 C, 70 C ... 0 C. I would like to treat them independently. I really hate to group them arbitrarily, for example into two groups: 4 sites > 40 degrees C and 4 sites < 40 degrees C. I would like to treat them as 8 individual groups.

**gringer** · 07-02-2016, 12:29 PM

Originally posted by SDPA_Pet:

Are you sure that DESeq2 doesn't need replicates?

From the DESeq2 documentation, frequently asked questions, section 5.8:

Can I use DESeq2 to analyze a dataset without replicates?

If a DESeqDataSet is provided with an experimental design without replicates, a warning is printed, that the samples are treated as replicates for estimation of dispersion. This kind of analysis is only useful for exploring the data, but will not provide the kind of proper statistical inference on differences between groups. Without biological replicates, it is not possible to estimate the biological variability of each gene. More details can be found in the manual page for ?DESeq.
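To make the FAQ's warning concrete, here is a minimal Python sketch of what treating all samples as replicates does to a per-gene dispersion estimate. It uses a simple method-of-moments estimate for a negative binomial model (variance = mean + dispersion × mean²). That estimator is an assumption made for illustration and is not DESeq2's actual procedure; the function name and the counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion, assuming var = mu + alpha * mu**2.

    Illustrative only: this is NOT how DESeq2 estimates dispersion.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 denominator)
    # Clamp at zero: less variance than Poisson means no extra dispersion.
    return max((var - mu) / mu ** 2, 0.0)


# Hypothetical normalized counts for two genes across 8 unreplicated samples,
# with every sample treated as a replicate of the same condition.
stable_gene = [100, 104, 97, 101, 99, 103, 98, 102]
changing_gene = [20, 35, 60, 90, 140, 200, 310, 450]

stable_alpha = moment_dispersion(stable_gene)
changing_alpha = moment_dispersion(changing_gene)

print(f"stable gene dispersion:   {stable_alpha:.3f}")
print(f"changing gene dispersion: {changing_alpha:.3f}")
```

A gene that genuinely changes across conditions ends up with a large dispersion estimate, because its real differences are counted as noise. That is one way to see why such an analysis is only suitable for exploring the data.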

**SDPA_Pet** · 07-05-2016, 07:10 AM

Well, it says that without replicates it's only useful for exploring the data. I need some statistical results.

**GenoMax** · 07-05-2016, 07:25 AM

Originally posted by SDPA_Pet:

Well, it says that without replicates it's only useful for exploring the data. I need some statistical results.

Even though this thread is about microarray data, the pointers are universally applicable: https://www.biostars.org/p/14130/

**gringer** · 07-05-2016, 11:06 AM

Good discussion, thanks GenoMax.

From my point of view, you'll either be using programs written by people that say replicates are recommended, or you'll be using programs written by people who don't have a good understanding of the issues associated with replicate-free analysis.

But as mentioned in that biostars discussion, there's no reason why you can't use your results to generate a good hypothesis about your data, then proceed with follow-up studies to explore that hypothesis. Statistics can supplement other ideas and hypotheses, but should not be the sole determinant of whether or not a particular result is useful. The ASA statement on the p-value is a worthwhile read in that regard:

http://amstat.tandfonline.com/doi/abs/10.1080/00031305.2016.1154108

Researchers should bring many contextual factors into play to derive scientific inferences, including the design of a study, the quality of the measurements, the external evidence for the phenomenon under study, and the validity of assumptions that underlie the data analysis. Pragmatic considerations often require binary, “yes-no” decisions, but this does not mean that p-values alone can ensure that a decision is correct or incorrect. The widespread use of “statistical significance” (generally interpreted as p < 0.05) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process.

**vingomez** · 07-05-2016, 11:25 AM

Hi SDPA_Pet,

You could try the software STAMP (http://kiwi.cs.dal.ca/Software/STAMP); it supports tests for comparing pairs of samples or samples organized into two or more treatment groups.

It is very useful for projects (see NCBI for published studies) with a limited number of samples or no replicates (n=1).

From the author:

STAMP is a software package for analyzing taxonomic or metabolic profiles that promotes ‘best practices’ in choosing appropriate statistical techniques and reporting results. Statistical hypothesis tests for pairs of samples or groups of samples is supported along with a wide range of exploratory plots. STAMP encourages the use of effect sizes and confidence intervals in assessing biological importance. A user friendly, graphical interface permits easy exploration of statistical results and generation of publication quality plots for inferring the biological relevance of features in a metagenomic profile. STAMP is open source, extensible via a plugin framework, and available for all major platforms.
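As a rough illustration of what "effect sizes and confidence intervals" can mean when comparing a pair of unreplicated samples, the Python sketch below computes the difference between two proportions with an asymptotic (Wald) 95% interval. This is a generic textbook calculation chosen for illustration, not STAMP's implementation; the function name and the read counts are hypothetical.

```python
from math import sqrt


def proportion_difference_ci(count_a, total_a, count_b, total_b, z=1.96):
    """Difference in proportions (a - b) with a Wald confidence interval.

    Illustrative only: a generic formula, not taken from STAMP.
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    diff = p_a - p_b
    stderr = sqrt(p_a * (1 - p_a) / total_a + p_b * (1 - p_b) / total_b)
    return diff, diff - z * stderr, diff + z * stderr


# Hypothetical example: reads assigned to one taxon in two single samples.
diff, lower, upper = proportion_difference_ci(300, 10000, 150, 10000)
print(f"difference = {diff:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval reflects only the sampling (counting) variation within each sample. It says nothing about how much two sites under the same conditions would differ, which is the biological variation that replicates are meant to capture.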

**vingomez** · 07-05-2016, 11:51 AM

Originally posted by gringer:

From my point of view, you'll either be using programs written by people that say replicates are recommended, or you'll be using programs written by people who don't have a good understanding of the issues associated with replicate-free analysis.

Although I respect your opinion, sometimes programs are written for a specific purpose or kind of data, for example data that lack replicates. There are various quantitative techniques (see below, 'Replication, lies and lesser-known truths regarding experimental design in environmental microbiology') that, when used properly, will allow scientists (e.g., environmental microbiologists) to make strong statistical conclusions from experimental and comparative data (Lennon 2011).

Originally posted by gringer:

But as mentioned in that biostars discussion, there's no reason why you can't use your results to generate a good hypothesis about your data, then proceed with follow-up studies to explore that hypothesis. Statistics can supplement other ideas and hypotheses, but should not be the sole determinant of whether or not a particular result is useful. The ASA statement on the p-value is a worthwhile read in that regard:

http://amstat.tandfonline.com/doi/ab...5.2016.1154108

Following this discussion, there are two amazing opinion pieces published in Environmental Microbiology. The first, by James I. Prosser ("Replicate or lie", http://onlinelibrary.wiley.com/doi/1...0.02201.x/full), discussed the lack of replication in current studies. The second, a rebuttal by Jay T. Lennon (http://onlinelibrary.wiley.com/doi/1...445.x/abstract), stated that "although replication is an important component of experimental design, it is possible to do good science without replication".

**gringer** · 07-05-2016, 12:30 PM

Although I respect your opinion, sometimes programs are written for a specific purpose or kind of data, for example data that lack replicates.

Fair enough. I'd like to clarify that I don't think it's essential, just that it's a good idea if possible. The "replication" is also not necessarily an attempt at doing an identical experiment, but I think there should be some way to make a good guess at biological variation. Without that guess, I have difficulty in seeing how the importance of a particular result (if of marginal difference) can be established.

This replication, or estimation of biological variation, also doesn't need to involve further sampling. It could be something as simple as comparing with results from a public dataset for the same organism (but not necessarily the same experiment).

I notice that the second paper talks about a time-series dataset with single observations for each time point, which might be similar to the temperature situation that SDPA_Pet has:

The researchers overcame this hurdle using a Bayesian technique called dynamic linear modelling (DLM), which explicitly deals with the non-independence of time-series data (Pole et al., 1994).
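Dynamic linear modelling is beyond a short example, but a much simpler related idea can be sketched in Python: treat temperature as a continuous covariate and fit an ordinary least-squares line, so that the scatter around the trend provides an estimate of variation (with n - 2 degrees of freedom) even though no temperature is replicated. This is an illustrative assumption and not the method used in the paper; it presumes a roughly linear response and independent residuals. The function name, temperatures, and abundance values are all hypothetical.

```python
from math import sqrt


def slope_t_statistic(x, y):
    """Least-squares slope, its t-statistic, and residual degrees of freedom."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual scatter around the fitted line stands in for replicate variation.
    residual_ss = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    dof = n - 2
    stderr = sqrt(residual_ss / dof / sxx)
    return slope, slope / stderr, dof


# Hypothetical data: 8 sites, one sample each, log-scale abundance of one feature.
temperatures_c = [0, 10, 20, 30, 40, 50, 70, 80]
log_abundance = [2.1, 2.9, 3.8, 4.4, 5.6, 6.1, 8.2, 8.8]

slope, t_stat, dof = slope_t_statistic(temperatures_c, log_abundance)
print(f"slope = {slope:.4f} per degree C, t = {t_stat:.1f} on {dof} df")
```

The t-statistic can then be compared against a t distribution with the reported degrees of freedom. Whether the linearity and independence assumptions hold for a given dataset has to be judged case by case.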


Any software or R package can do this?
