Hi All,
I am using RNA-seq data. I have 3 time points in my data 6 months (young), 12 months (middle) and 28 months (old). I want to do a differential expression analysis across the 3 time points. I have used EdgeR GLM to do this though the design isnt allowing me to look at linear changes with increasing age whilst taking into account the expression at middle age. So after looking on lots of forums I have come up with a work round (see below). Though not being experienced in R or statistics I am not sure it is doing what I want it to. Could someone please look it over and see if I am actually getting linear changes with increasing age whilst taking in account the expression at middle age?
Many thanks
> library(edgeR)
> library(limma)
> targets<-read.delim (file="coding_targets.txt")
> targets$age<-C(targets$age,poly,1)
> attr(targets$age,"contrasts")
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> targets$sample<-factor(targets$sample)
> targets
X files age sample
1 p1 p1_coding_counts.txt a-young 1
2 p2 p2_coding_counts.txt a-young 2
3 p3 p3_coding_counts.txt a-young 3
4 p7 p7_coding_counts.txt b-middle 7
5 p8 p8_coding_counts.txt b-middle 8
6 p9 p9_coding_counts.txt b-middle 9
7 p4 p4_coding_counts.txt c-old 4
8 p5 p5_coding_counts.txt c-old 5
9 p6 p6_coding_counts.txt c-old 6
> d<-readDGE(targets, skip=1, comment.char='#')
> colnames(d)<-row.names(targets)
> d<- calcNormFactors(d)
> d
An object of class "DGEList"
$samples
X files age sample lib.size norm.factors
1 p1 p1_coding_counts.txt a-young 1 3445622 0.9655724
2 p2 p2_coding_counts.txt a-young 2 2696547 0.9902573
3 p3 p3_coding_counts.txt a-young 3 3308099 0.9787044
4 p7 p7_coding_counts.txt b-middle 7 2503479 1.0584660
5 p8 p8_coding_counts.txt b-middle 8 2639127 1.0477893
6 p9 p9_coding_counts.txt b-middle 9 2696547 0.9902573
7 p4 p4_coding_counts.txt c-old 4 3037440 0.9778553
8 p5 p5_coding_counts.txt c-old 5 2647915 1.0144109
9 p6 p6_coding_counts.txt c-old 6 2475370 0.9809077
$counts
1 2 3 4 5 6 7 8 9
ENSRNOG0000000xxxx 1287 1285 1041 788 968 1285 1092 1009 960
ENSRNOG0000000xxxx 0 0 0 0 1 0 0 1 0
ENSRNOG0000000xxxx 0 0 0 3 0 0 1 0 0
ENSRNOG0000000xxxx 38 405 50 18 105 405 372 42 282
ENSRNOG0000000xxxx 0 0 0 0 0 0 0 0 0
22932 more rows ...
> d<-d[rowSums(d$counts)>9,]
> design<- model.matrix(~ age, data = targets)
> design
(Intercept) age.L
1 1 -7.071068e-01
2 1 -7.071068e-01
3 1 -7.071068e-01
4 1 -7.850462e-17
5 1 -7.850462e-17
6 1 -7.850462e-17
7 1 7.071068e-01
8 1 7.071068e-01
9 1 7.071068e-01
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$age
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> d<-estimateGLMCommonDisp(d, design)
> glmfit<- glmFit(d, design, dispersion=d$common.dispersion)
> results<- glmLRT(d, glmfit, coef=c(2))
> topTags(results)
Coefficient: age.L
logConc logFC LR P.Value FDR
ENSRNOG000000xxxxx -11.271529 -1.9336775 55.45465 9.564007e-14 1.453825e-09
ENSRNOG000000xxxxx -10.359443 1.2661223 49.11802 2.410161e-12 1.831843e-08
ENSRNOG000000xxxxx -11.317494 -1.5480925 47.35359 5.926938e-12 3.003180e-08
I am using RNA-seq data. I have 3 time points in my data 6 months (young), 12 months (middle) and 28 months (old). I want to do a differential expression analysis across the 3 time points. I have used EdgeR GLM to do this though the design isnt allowing me to look at linear changes with increasing age whilst taking into account the expression at middle age. So after looking on lots of forums I have come up with a work round (see below). Though not being experienced in R or statistics I am not sure it is doing what I want it to. Could someone please look it over and see if I am actually getting linear changes with increasing age whilst taking in account the expression at middle age?
Many thanks
> library(edgeR)
> library(limma)
> targets<-read.delim (file="coding_targets.txt")
> targets$age<-C(targets$age,poly,1)
> attr(targets$age,"contrasts")
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> targets$sample<-factor(targets$sample)
> targets
X files age sample
1 p1 p1_coding_counts.txt a-young 1
2 p2 p2_coding_counts.txt a-young 2
3 p3 p3_coding_counts.txt a-young 3
4 p7 p7_coding_counts.txt b-middle 7
5 p8 p8_coding_counts.txt b-middle 8
6 p9 p9_coding_counts.txt b-middle 9
7 p4 p4_coding_counts.txt c-old 4
8 p5 p5_coding_counts.txt c-old 5
9 p6 p6_coding_counts.txt c-old 6
> d<-readDGE(targets, skip=1, comment.char='#')
> colnames(d)<-row.names(targets)
> d<- calcNormFactors(d)
> d
An object of class "DGEList"
$samples
X files age sample lib.size norm.factors
1 p1 p1_coding_counts.txt a-young 1 3445622 0.9655724
2 p2 p2_coding_counts.txt a-young 2 2696547 0.9902573
3 p3 p3_coding_counts.txt a-young 3 3308099 0.9787044
4 p7 p7_coding_counts.txt b-middle 7 2503479 1.0584660
5 p8 p8_coding_counts.txt b-middle 8 2639127 1.0477893
6 p9 p9_coding_counts.txt b-middle 9 2696547 0.9902573
7 p4 p4_coding_counts.txt c-old 4 3037440 0.9778553
8 p5 p5_coding_counts.txt c-old 5 2647915 1.0144109
9 p6 p6_coding_counts.txt c-old 6 2475370 0.9809077
$counts
1 2 3 4 5 6 7 8 9
ENSRNOG0000000xxxx 1287 1285 1041 788 968 1285 1092 1009 960
ENSRNOG0000000xxxx 0 0 0 0 1 0 0 1 0
ENSRNOG0000000xxxx 0 0 0 3 0 0 1 0 0
ENSRNOG0000000xxxx 38 405 50 18 105 405 372 42 282
ENSRNOG0000000xxxx 0 0 0 0 0 0 0 0 0
22932 more rows ...
> d<-d[rowSums(d$counts)>9,]
> design<- model.matrix(~ age, data = targets)
> design
(Intercept) age.L
1 1 -7.071068e-01
2 1 -7.071068e-01
3 1 -7.071068e-01
4 1 -7.850462e-17
5 1 -7.850462e-17
6 1 -7.850462e-17
7 1 7.071068e-01
8 1 7.071068e-01
9 1 7.071068e-01
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$age
.L
a-young -7.071068e-01
b-middle -7.850462e-17
c-old 7.071068e-01
> d<-estimateGLMCommonDisp(d, design)
> glmfit<- glmFit(d, design, dispersion=d$common.dispersion)
> results<- glmLRT(d, glmfit, coef=c(2))
> topTags(results)
Coefficient: age.L
logConc logFC LR P.Value FDR
ENSRNOG000000xxxxx -11.271529 -1.9336775 55.45465 9.564007e-14 1.453825e-09
ENSRNOG000000xxxxx -10.359443 1.2661223 49.11802 2.410161e-12 1.831843e-08
ENSRNOG000000xxxxx -11.317494 -1.5480925 47.35359 5.926938e-12 3.003180e-08
Comment