SEQanswers

Go Back   SEQanswers > Applications Forums > RNA Sequencing



Similar Threads
Thread Thread Starter Forum Replies Last Post
edgeR ECHo Bioinformatics 13 04-18-2013 01:01 PM
edgeR wangli RNA Sequencing 4 04-19-2012 06:58 AM
edgeR Puva Bioinformatics 2 05-19-2011 09:04 AM
Phase contrast scope recommendation k-gun12 General 0 03-19-2011 07:46 AM

Reply
 
Thread Tools
Old 02-01-2014, 09:20 AM   #1
samhokin
Member
 
Location: Santa Fe, NM

Join Date: Nov 2013
Posts: 20
Default coef vs contrast in edgeR glmLRT

I thought I understood what the contrast= option in glmLRT means as opposed to coef= (which just looks for DE wrt the index=1 sample, presumed to be the control), but I'm getting crazy results from using contrast= to get what I think should be the equivalent to coef=.

I've got a 4 times X 5 conditions design matrix, so 20 total "treatments." I want to get the DE for condition vs control for each of the four times. The first non-control condition is called AS2 and the control is called Col (for AS2-enhanced transgenic samples and "Columbia", a standard Arabidopsis wild type, if you're curious). So, for example, I'd like to get DE{Col(0)|AS2(0)}, DE{Col(30)|AS2(30)}, etc., where 0 and 30 minutes are two of the four times. But I'll focus on t=0 here, which seems to be where the problem lies. The "control" as edgeR is concerned is condition=Col and time=0 with index 1.

The design matrix is, in order:
(Intercept), ConditionAS2, ConditionKan, ConditionRev, ConditionSTM,
ConditionCol:Timet030, ConditionAS2:Timet030, ConditionKan:Timet030, ConditionRev:Timet030, ConditionSTM:Timet030,
ConditionCol:Timet060, etc.

Here's the two calls that I think should be equivalent:

(a) glmLRT(f, coef=2)

This returns a very reasonable dataset for DE{Col(0)|AS2(0)} which compares favorably to other DE analyses (edgeR and DESeq2) using only the appropriate samples. I don't doubt that it's fine. However,

(b) glmLRT(f, contrast=c(-1,1,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0))

returns complete garbage with logFC values in the stratosphere. To make matters more confusing, it does return seemingly correct data for non-t=0 contrasts, for example,

(c) glmLRT(f, contrast=c(0,0,0,0,0, -1,1,0,0,0, 0,0,0,0,0, 0,0,0,0,0))

seems to return a very reasonable dataset for DE{Col(30)|AS2(30)}.

What am I missing about the meaning of the first index in the array passed to contrast= in glmLRT??? Is there some reason why contrast= can only be used for contrasts between non-control samples? I don't get it.
__________________
Sam Hokin
Computational Scientist, Carnegie and NCGR
samhokin is offline   Reply With Quote
Old 02-03-2014, 02:00 PM   #2
Dario1984
Senior Member
 
Location: Sydney, Australia

Join Date: Jun 2011
Posts: 166
Default

It's because your model matrix is using the treatment-contrasts parametrisation. That's why your second constrast doesn't make sense. What you're testing there is if the second coefficient, which is the difference of the first treatment to control is different to the first coefficient, which is the control. Subtracting a difference from an absolute measurement isn't correct.

Code:
glmLRT(f, contrast=c(0,1,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0))
will give you the same answer as your first test.
Dario1984 is offline   Reply With Quote
Old 02-03-2014, 02:21 PM   #3
samhokin
Member
 
Location: Santa Fe, NM

Join Date: Nov 2013
Posts: 20
Default

Thanks, Dario! That tells me that my other "contrasts" aren't what I want, either. I don't want the difference between [the difference between Col30 and Col0] and [the difference between AS230 and Col0]. That is actually not meaningful, either, but I guess that's what "contrast" means - variation of DE, so a sort of meta-DE.

I've just rerun the analysis using only the samples of interest in the "simple" formulation of edgeR, so four times for the four times. Simple enough. It's really like four separate experiments done at four separate times.
__________________
Sam Hokin
Computational Scientist, Carnegie and NCGR
samhokin is offline   Reply With Quote
Old 02-04-2014, 08:16 AM   #4
samhokin
Member
 
Location: Santa Fe, NM

Join Date: Nov 2013
Posts: 20
Default

Wait a sec. I'm looking at the edgeR User's Guide, section 3.3.1 "defining each treatment as a group", and I'm pretty much sure it says to do what I originally did. It says:

> lrt <- glmLRT(fit, contrast=c(-1,0,1,0,0,0))

would find the Drug.2vs0 contrast.

That seems exactly what I was doing, and getting nonsense answers for logFC. I don't see how what I was doing differs from the example in 3.3.1.

Does anyone else find the edgeR documentation to be confusing and inconsistent with application???
__________________
Sam Hokin
Computational Scientist, Carnegie and NCGR
samhokin is offline   Reply With Quote
Old 02-04-2014, 02:00 PM   #5
Dario1984
Senior Member
 
Location: Sydney, Australia

Join Date: Jun 2011
Posts: 166
Default

You never explained what design matrix you used for glmFit. It can't be the same one for both types of contrasts. The fact that you have an intercept immediately suggests it is wrong. The right formula would be ~ 0 + factor(paste(condition, time, sep = '.')) The 0 removes the intercept.
Dario1984 is offline   Reply With Quote
Old 02-04-2014, 04:13 PM   #6
samhokin
Member
 
Location: Santa Fe, NM

Join Date: Nov 2013
Posts: 20
Default

Ahhhh, thank you. That is really not very clear from the edgeR manual, or maybe I'm just too new to this approach to understand that. Seems subtle to me, but I am now enlightened!
__________________
Sam Hokin
Computational Scientist, Carnegie and NCGR
samhokin is offline   Reply With Quote
Old 04-04-2014, 09:59 AM   #7
moredd
Junior Member
 
Location: Brazil

Join Date: May 2013
Posts: 8
Default

Hei guys,

But the section 3.3.1 "defining each treatment as a group" doesn't refers to unpaired samples only? I'm not sure about that
In your case samhokin (so in mine) seems like you have paired samples, I mean, samples collected from the same patient at different time points. Am I wrong?!?
moredd is offline   Reply With Quote
Old 04-05-2014, 05:47 AM   #8
samhokin
Member
 
Location: Santa Fe, NM

Join Date: Nov 2013
Posts: 20
Default

Mine are actually independent bio reps of a lab-grown plant, with different duration of application of a hormone ('duration' is a better word than 'time' in my case). So it's quite different from human studies - my sample groups are the same genome, with reps for added stats.
__________________
Sam Hokin
Computational Scientist, Carnegie and NCGR
samhokin is offline   Reply With Quote
Old 01-14-2015, 10:23 AM   #9
noha osman
Junior Member
 
Location: log angeles

Join Date: Aug 2014
Posts: 9
Default

I have data frame called subtest as following:
RNA.LATER.MEN2B_S5 RNA.LATER.ROSA_S4 RNA.MEN2B.1_S2 RNA.MEN2B.2_S3 RNA.ROSA_S1
1 13707 13866 12193 12671 10178
2 0 0 0 0 1
3 7165 5002 1256 1341 2087
6 8537 16679 9042 9620 19168
10 19438 25234 15563 16419 16582
16 3 3 11 3 5
I would like to analysis the MEN samples to ROSA samples I did script like that .
> group=c("LMS5","LRS4","MS2","MS3","RS1")
> y=DGEList(counts=data.matrix(subtest),group=group ,genes=genes)
> indices=which(rowSums(cpm(y)>1)<3)
> y=y[-indices,]
> y=calcNormFactors(y)
> design=model.matrix(~0+group,data=subtest)
> cont.matrix= makeContrasts(groupLMS5-groupMS2-groupMS3-(groupLRS4-groupRS1),levels=design)
> fit <- glmFit(y, design,dispersion=y$trended.dispersion)
> lrt.all <- glmLRT(fit, contrast=cont.matrix)
> topTags(lrt.all)
summary(dexp<-decideTestsDGE(lrt.all,p=0.05,adjust="BH"))
-1 1
0 36
1 14248
I got one gene as down regulated and 14248 as upregulated so I think I did something wrong.please Iam a new user in that package and I want to get a appropriate number of up and down regulated genes to downstream analysis
noha osman is offline   Reply With Quote
Reply

Tags
differential analysis, differential expression, edger, rna-seq

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 10:35 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO