  • Variance decomposition in DESeq2

    Hi,
    I'm trying to figure out the proportion of variance in the data that is explained by each predictor. Is there a way to do this in DESeq2?
    Thanks a lot,
    Golsheed

  • #2
    Generalized linear models use analysis of deviance instead of analysis of variance.

    http://en.wikipedia.org/wiki/Deviance_(statistics)

    The quantity -2 * the log likelihood for each gene is given in mcols(dds)$deviance.

    You can't get a deviance for each predictor directly, but you can compare the change in deviance between successively smaller models, where the smallest model is ~ 1. So if you remove one predictor at a time and re-run, you will see how much each term contributes.

    This is the same thing that anova(fit) does when fit is a glm object.
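
    For illustration, here is a minimal, untested sketch of that drop-one-term comparison. The design ~ batch + condition and the objects cts (count matrix) and coldata (sample table) are placeholders; substitute your own.

    Code:
    library("DESeq2")
    dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                                  design = ~ batch + condition)

    # full model
    dds <- DESeq(dds)
    dev.full <- mcols(dds)$deviance          # -2 * log-likelihood, full model

    # drop 'condition'
    dds.batch <- dds
    design(dds.batch) <- ~ batch
    dds.batch <- DESeq(dds.batch)
    dev.batch <- mcols(dds.batch)$deviance

    # intercept-only model
    dds.null <- dds
    design(dds.null) <- ~ 1
    dds.null <- DESeq(dds.null)
    dev.null <- mcols(dds.null)$deviance

    # per-gene drop in deviance attributable to each term
    head(dev.null - dev.batch)               # explained by 'batch'
    head(dev.batch - dev.full)               # explained by 'condition', given batch

    Note that the dispersions are re-estimated for each fit, so these differences give an approximate, not exact, decomposition.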



    • #3
      Thanks a bunch, that's a huge help.

      Golsheed



      • #4
        Is there any way in general to estimate the variance explained by each covariate? For example, can we, regardless of the method used (i.e., glm.nb in this case), compute a coefficient of determination between successively smaller models?
        $$R^2 = \widehat{\mathrm{Cor}}(\hat{y}, y)^2$$
        where $\widehat{\mathrm{Cor}}$, $\hat{y}$, and $y$ are the estimated correlation, the fitted values, and the observed values.

        Thanks,
        Golsheed


        • #5
          Thanks a lot for your help, Michael. Do you mind clarifying the expression that is used to calculate the deviance for each gene? I'm trying to calculate a "generalized" coefficient of determination for this GLM, and I find that (if I'm not mistaken) people use different definitions of deviance for that purpose and come up with different measures. Just to double-check, is this how the deviance is calculated for each gene in DESeq2:
          $$-2\,(\log P(y \mid M) - \log P(y \mid S)),$$
          where $M$ is the model being tested, $S$ is the saturated model in which every observation has its own parameter, and $y$ is the observed counts?

          Thanks again,
          Golsheed


          • #6
            The deviance we return is just -2 log P(y|M). You can get the saturated log likelihood using something like (warning: untested code):

            Code:
            # log-likelihood of the saturated model: each fitted mean is set to the observed
            # count; size = 1/dispersion in R's negative binomial parametrization
            rowSums(dnbinom(counts(dds), mu=counts(dds), size=1/dispersions(dds), log=TRUE))
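
            Combining this with the deviance column above, a deviance in the textbook sense (relative to the saturated model) could then be assembled as follows; this is again an untested sketch using the same objects:

            Code:
            ll.sat <- rowSums(dnbinom(counts(dds), mu=counts(dds),
                                      size=1/dispersions(dds), log=TRUE))
            ll.fit <- -0.5 * mcols(dds)$deviance   # the deviance column is -2 * log-likelihood
            dev.sat <- -2 * (ll.fit - ll.sat)      # -2 * (log P(y|M) - log P(y|S))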



            • #7
              Thanks a lot for the clarification on the deviance.
              Could you please explain the code? I guess you're summing the log likelihoods of the observations under the negative binomial distribution, with parameters estimated by DESeq2; is that right? I have a hard time understanding what each term of the code represents. Specifically, in the alternative parametrization of the negative binomial distribution, how are the counts related to mu?

              Thanks again!




              • #8
                Sorry for bugging you this much. If I'm using nbinomLRT to compare the full model in design(dds) with the reduced model, does the deviance return -2 log P(y|M) for the full model, or for whichever model is considered significant according to the LRT?

                Thanks a bunch!




                • #9
                  There is not a simple R^2-like statistic for generalized linear models but a set of proposed ones, and the differences in interpreting these are not trivial.

                  Here are some links which describe these pseudo-R^2 measures for the binomial GLM (logistic regression), but the point is also relevant for NB and Poisson GLMs:

                  One linked discussion asks about the pseudo-$R^2$ formula $$1-\frac{\text{Residual Deviance}}{\text{Null Deviance}}$$ from Extending the Linear Model with R, Julian J. Faraway (p. 59), and whether it is in common use.

                  The other concerns SPSS output for a logistic regression model, which reports two measures of model fit, Cox & Snell and Nagelkerke $R^2$, and asks which of these to prefer as a rule of thumb.

                  You can try these out yourself or ask someone locally to help implement the one you are interested in. Here are the corresponding terms:

                  Code:
                  y = counts(dds)                 # observed counts (genes x samples)
                  y.bar = rowMeans(counts(dds))   # per-gene mean of the observed counts
                  y.hat = assays(dds)[["mu"]]     # fitted means from the NB GLM
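
                  As one concrete example, here is a minimal, untested sketch of the squared-correlation pseudo-R^2 proposed above (per gene, between fitted and observed counts), using these terms:

                  Code:
                  y <- counts(dds)
                  y.hat <- assays(dds)[["mu"]]

                  # per-gene squared Pearson correlation between fitted and observed counts
                  # (genes with constant counts across samples return NA)
                  r2 <- sapply(seq_len(nrow(y)), function(i) cor(y.hat[i, ], y[i, ])^2)
                  summary(r2)

                  A deviance-based version such as 1 - ResidualDeviance/NullDeviance would additionally need an intercept-only fit (design ~ 1), as discussed earlier in the thread.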



                  • #10
                    Thanks, I appreciate your detailed response.
                    As for the deviance reported by DESeq2, you mentioned that it is just the log likelihood of the observations given the estimated parameters. However, when I run DESeq2 on my dataset, I'm getting odd results:

                    Code:
                    > head(mcols(dds)$deviance)
                    [1] 247.3268 1265.1006 1012.0499 792.3526 1722.3851 1112.4335

                    These values don't seem to represent log P(y|M), so I'm guessing either I'm doing something wrong or the definition of deviance differs a bit from what I have in mind (specifically, is it the ratio of the log likelihoods of the tested model against the null model?).

                    Thanks a lot for your help and patience.





                    • #11
                      Read over my response again (#6 above).



                      • #12
                        Oh, right! Sorry, I missed the minus sign!
                        If I'm using the nbinomLRT() function with a full and a reduced model, is the per-gene deviance -2 log P(y|M) with M being the full model, or is it reported for whichever model comes out significant in the LRT?
                        Thanks so much again.



                        • #13
                          The column mcols(dds)$deviance is -2 times the log likelihood of the full model.

                          The LRT statistic is given in results(dds)$stat, and this is -2 log(likelihood of the reduced model / likelihood of the full model).

                          See here for more details: http://en.wikipedia.org/wiki/Likelihood-ratio_test
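
                          For illustration, a minimal, untested sketch of how these quantities relate, assuming a hypothetical full design ~ batch + condition tested against the reduced design ~ batch:

                          Code:
                          dds <- DESeq(dds, test = "LRT", reduced = ~ batch)
                          res <- results(dds)

                          head(mcols(dds)$deviance)   # -2 * log-likelihood of the full model, per gene
                          head(res$stat)              # LRT statistic: (-2 logLik reduced) - (-2 logLik full)
                          head(res$pvalue)            # chi-squared p-value on the difference in model degrees of freedom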



                          • #14
                            Thanks a lot for the clarification.
