SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
Cuffdiff time series bsuac6 Bioinformatics 7 11-15-2013 01:48 PM
How to analyse time-series data with edgeR ca85 RNA Sequencing 0 10-06-2011 06:04 AM
BaySeq vs GLM in EdgeR and DESeq Hilary April Smith Bioinformatics 0 08-03-2011 11:14 AM
time series analysis of count data? greigite Bioinformatics 0 06-14-2011 06:11 PM
cuffdiff: time-series problem mrfox Bioinformatics 1 12-09-2010 11:08 PM

Reply
 
Thread Tools
Old 12-16-2011, 06:54 AM   #1
bsuac6
Junior Member
 
Location: UK

Join Date: Sep 2010
Posts: 9
Default EdgeR GLM linear regression of time series

Hi All,

I am using RNA-seq data. I have 3 time points in my data 6 months (young), 12 months (middle) and 28 months (old). I want to do a differential expression analysis across the 3 time points. I have used EdgeR GLM to do this though the design isnt allowing me to look at linear changes with increasing age whilst taking into account the expression at middle age. So after looking on lots of forums I have come up with a work round (see below). Though not being experienced in R or statistics I am not sure it is doing what I want it to. Could someone please look it over and see if I am actually getting linear changes with increasing age whilst taking in account the expression at middle age?
Many thanks

> library(edgeR)
> library(limma)
> targets<-read.delim (file="coding_targets.txt")
> targets$age<-C(targets$age,poly,1)
> attr(targets$age,"contrasts")
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> targets$sample<-factor(targets$sample)
> targets
X files age sample
1 p1 p1_coding_counts.txt a-young 1
2 p2 p2_coding_counts.txt a-young 2
3 p3 p3_coding_counts.txt a-young 3
4 p7 p7_coding_counts.txt b-middle 7
5 p8 p8_coding_counts.txt b-middle 8
6 p9 p9_coding_counts.txt b-middle 9
7 p4 p4_coding_counts.txt c-old 4
8 p5 p5_coding_counts.txt c-old 5
9 p6 p6_coding_counts.txt c-old 6
> d<-readDGE(targets, skip=1, comment.char='#')
> colnames(d)<-row.names(targets)
> d<- calcNormFactors(d)
> d
An object of class "DGEList"
$samples
X files age sample lib.size norm.factors
1 p1 p1_coding_counts.txt a-young 1 3445622 0.9655724
2 p2 p2_coding_counts.txt a-young 2 2696547 0.9902573
3 p3 p3_coding_counts.txt a-young 3 3308099 0.9787044
4 p7 p7_coding_counts.txt b-middle 7 2503479 1.0584660
5 p8 p8_coding_counts.txt b-middle 8 2639127 1.0477893
6 p9 p9_coding_counts.txt b-middle 9 2696547 0.9902573
7 p4 p4_coding_counts.txt c-old 4 3037440 0.9778553
8 p5 p5_coding_counts.txt c-old 5 2647915 1.0144109
9 p6 p6_coding_counts.txt c-old 6 2475370 0.9809077

$counts
1 2 3 4 5 6 7 8 9
ENSRNOG0000000xxxx 1287 1285 1041 788 968 1285 1092 1009 960
ENSRNOG0000000xxxx 0 0 0 0 1 0 0 1 0
ENSRNOG0000000xxxx 0 0 0 3 0 0 1 0 0
ENSRNOG0000000xxxx 38 405 50 18 105 405 372 42 282
ENSRNOG0000000xxxx 0 0 0 0 0 0 0 0 0
22932 more rows ...
> d<-d[rowSums(d$counts)>9,]
> design<- model.matrix(~ age, data = targets)
> design
(Intercept) age.L
1 1 -7.071068e-01
2 1 -7.071068e-01
3 1 -7.071068e-01
4 1 -7.850462e-17
5 1 -7.850462e-17
6 1 -7.850462e-17
7 1 7.071068e-01
8 1 7.071068e-01
9 1 7.071068e-01
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$age
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> d<-estimateGLMCommonDisp(d, design)
> glmfit<- glmFit(d, design, dispersion=d$common.dispersion)
> results<- glmLRT(d, glmfit, coef=c(2))
> topTags(results)
Coefficient: age.L
logConc logFC LR P.Value FDR
ENSRNOG000000xxxxx -11.271529 -1.9336775 55.45465 9.564007e-14 1.453825e-09
ENSRNOG000000xxxxx -10.359443 1.2661223 49.11802 2.410161e-12 1.831843e-08
ENSRNOG000000xxxxx -11.317494 -1.5480925 47.35359 5.926938e-12 3.003180e-08
bsuac6 is offline   Reply With Quote
Old 05-07-2012, 01:48 AM   #2
Gordon Smyth
Member
 
Location: Melbourne, Australia

Join Date: Apr 2011
Posts: 91
Default

Dear bsuac6,

edgeR easily allows you to estimate a linear trend with age. Just put age as a column in the design matrix.

If you're still having trouble with it, could you post your question to the Bioconductor mailing list? See the section "how to get help" in the edgeR User's Guide.

Best wishes
Gordon
Gordon Smyth is offline   Reply With Quote
Old 03-28-2013, 01:38 PM   #3
David Robinson
Junior Member
 
Location: Princeton, NJ

Join Date: Mar 2013
Posts: 1
Default

Gordon Smyth: Given the link function for negative binomial regression, wouldn't this estimate an exponential trend rather than a linear one? (After all, wouldn't a linear trend be impossible since it could lead to a negative expected value?)
David Robinson is offline   Reply With Quote
Old 03-28-2013, 03:50 PM   #4
Gordon Smyth
Member
 
Location: Melbourne, Australia

Join Date: Apr 2011
Posts: 91
Default

Yes. It is a linear trend on the scale of the model formula and linear predictor, and this corresponds to an exponential trend on the scale of the predicted counts.
Gordon Smyth is offline   Reply With Quote
Reply

Tags
edger, glm, r-statistics, time series

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 02:53 AM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2020, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO