SEQanswers

Old 10-18-2013, 09:57 AM   #1
sindrle
DESeq2: Multi-factor designs

With great help from dpryan I have come a long way, but now I'm stuck again. I need to be able to test for differences both within and between samples. I have two groups at two time points; the data are paired within each group, but not between the groups:

library(DESeq2)
# pattern is a regular expression, not a shell glob
sampleFiles <- list.files(path="/Volumes/timemachine/HTseq_DEseq2", pattern="\\.txt$")
status <- factor(c(rep("Healthy",26), rep("Diabetic",22)))
# 13 Healthy at time 1, 13 Healthy at time 2, 11 Diabetic at time 1, 11 Diabetic at time 2
timepoints <- factor(rep(c(1,2,1,2), times=c(13,13,11,11)))
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, status=status, timepoints=timepoints)
directory <- "/Volumes/timemachine/HTseq_DEseq2/"
des <- formula(~timepoints+status)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design=des)
ddsHTSeq

These commands all work, but they are incorrect because they do not take the paired data into account.
So I looked here: page 49; http://www.bioconductor.org/packages...usersguide.pdf

So I tried this instead:

Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
design <- model.matrix(~0+Treat);
colnames(design) <- levels(Treat);

> design
Diabetic1 Diabetic2 Healthy1 Healthy2
1 0 0 1 0
2 0 0 1 0
3 0 0 1 0
4 0 0 1 0
5 0 0 1 0
6 0 0 1 0
7 0 0 1 0
8 0 0 1 0
9 0 0 1 0
10 0 0 1 0
11 0 0 1 0
12 0 0 1 0
13 0 0 1 0
14 0 0 0 1
15 0 0 0 1
16 0 0 0 1
17 0 0 0 1
18 0 0 0 1
19 0 0 0 1
20 0 0 0 1
21 0 0 0 1
22 0 0 0 1
23 0 0 0 1
24 0 0 0 1
25 0 0 0 1
26 0 0 0 1
27 1 0 0 0
28 1 0 0 0
29 1 0 0 0
30 1 0 0 0
31 1 0 0 0
32 1 0 0 0
33 1 0 0 0
34 1 0 0 0
35 1 0 0 0
36 1 0 0 0
37 1 0 0 0
38 0 1 0 0
39 0 1 0 0
40 0 1 0 0
41 0 1 0 0
42 0 1 0 0
43 0 1 0 0
44 0 1 0 0
45 0 1 0 0
46 0 1 0 0
47 0 1 0 0
48 0 1 0 0
attr(,"assign")
[1] 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Treat
[1] "contr.treatment"

However my "old" design, the "des", looks like this:

> des
~timepoints + status

Running with "new" design:
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design=design)

Gives error:

Error in formula.default(design) : invalid formula

Not too surprising. But I'm stuck: what should I do? Am I totally off?

Thanks!

Last edited by sindrle; 10-18-2013 at 10:02 AM.
Old 10-20-2013, 12:51 AM   #2
jingerlu

What does colData(ddsHTSeq) give?
Old 10-20-2013, 02:16 AM   #3
sindrle

The answer is, thanks to dpryan:

status <- factor(c(rep("Healthy",26), rep("Diabetic",22)), levels=c("Healthy", "Diabetic"));
timepoints = as.factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2));
patients <- factor(c(04,21,53,55,61,62,67,73,76,77,79,04,21,53,55,61,62,67,73,76,77,79, 01,08,13,70,71,72,75,81,82,83,86,87,88,01,08,13,70,71,72,75,81,82,83,86,87,88));
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, status=status, timepoints=timepoints, patients=patients);

and

des <- formula(~patients + timepoints*status);
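Since the pairing now lives entirely in the patients column, a quick consistency check before fitting may be worthwhile. As posted, the patients vector lists 11 IDs twice and then 13 IDs twice, while status lists 26 Healthy followed by 22 Diabetic, so it seems worth confirming that the three vectors line up with the file order. The following is a minimal base-R sketch on made-up toy data (not the samples from this thread); check_pairing is a hypothetical helper name.

```r
# Hypothetical helper: TRUE if every patient has exactly one sample per
# time point and belongs to exactly one status group.
check_pairing <- function(patients, timepoints, status) {
  per_time   <- table(patients, timepoints)     # samples per patient and time point
  per_status <- table(patients, status) > 0     # groups each patient appears in
  all(per_time == 1) && all(rowSums(per_status) == 1)
}

# Toy data (made up): 3 healthy and 2 diabetic patients, each sampled twice.
patients   <- factor(c(1, 2, 3, 1, 2, 3, 4, 5, 4, 5))
timepoints <- factor(c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2))
status     <- factor(c(rep("Healthy", 6), rep("Diabetic", 4)),
                     levels = c("Healthy", "Diabetic"))

ok <- check_pairing(patients, timepoints, status)

# Same patients, but with status labels shifted so that patients 2 and 3
# end up in both groups.
bad_status <- factor(c(rep("Healthy", 4), rep("Diabetic", 6)),
                     levels = c("Healthy", "Diabetic"))
bad <- check_pairing(patients, timepoints, bad_status)
```

On real data the same call would take the patients, timepoints and status columns of sampleTable; a FALSE result would suggest that the vectors are not in the same order as the count files.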

Last edited by sindrle; 10-20-2013 at 02:22 AM.
Old 10-20-2013, 05:33 PM   #4
jingerlu

I suppose this could give you the group combinations you want:
Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
design <- formula(~ 0 + Treat)

but I am not sure whether the subsequent DESeq(), nbinomWaldTest() or nbinomLRT() functions work with this design formula. I am so tempted to switch to edgeR, which has a nicer tutorial and examples to follow.
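The "invalid formula" error in the first post came from handing a model matrix to an argument that, in the DESeq2 version used in this thread, was coerced to a formula. The base-R sketch below, on a made-up toy sample table, shows the difference between the two objects and that the formula expands to the same four group columns. It assumes, as far as I know, that DESeq2 looks up design variables in the sample table (colData), so Treat is stored as a column there rather than as a free-standing variable.

```r
# Toy sample table (made up): two samples per status/time-point combination.
sampleTable <- data.frame(
  status     = factor(rep(c("Healthy", "Diabetic"), each = 4)),
  timepoints = factor(rep(c(1, 2, 1, 2), each = 2))
)

# Store the combined factor as a column so a formula can refer to it by name.
sampleTable$Treat <- factor(paste(sampleTable$status, sampleTable$timepoints, sep = ""))

des_matrix  <- model.matrix(~ 0 + Treat, data = sampleTable)  # a numeric matrix
des_formula <- ~ 0 + Treat                                    # a formula object

is_matrix_formula  <- inherits(des_matrix, "formula")
is_formula_formula <- inherits(des_formula, "formula")

# Expanding the formula against the table gives one indicator column per group.
expanded <- model.matrix(des_formula, data = sampleTable)
group_columns <- colnames(expanded)
```

Each row of expanded has exactly one 1, which is the same group-means layout as the matrix printed in the first post.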
Old 10-20-2013, 11:04 PM   #5
sindrle

So you mean I don't have a paired test as I have written it?
Old 10-21-2013, 12:28 AM   #6
Simon Anders

With "~ patients + ...", you get paired tests. You are currently testing for an interaction between time and status, i.e., you will get genes for which the amount of change between the time points differs significantly between healthy and diabetic subjects.
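To make "the amount of change differs between groups" concrete, here is a minimal sketch with an ordinary linear model on made-up log2 expression values for one gene. It is not DESeq2's negative binomial fit and it leaves out the patient term; it only illustrates that the interaction coefficient is a difference of differences. The coefficient names follow lm's convention, not the result names DESeq2 prints.

```r
# Made-up log2 expression of one gene, one value per status/time-point cell.
toy <- data.frame(
  status     = factor(c("Healthy", "Healthy", "Diabetic", "Diabetic"),
                      levels = c("Healthy", "Diabetic")),
  timepoints = factor(c(1, 2, 1, 2)),
  log2expr   = c(5, 6, 5.5, 8.5)
)

fit   <- lm(log2expr ~ timepoints * status, data = toy)
coefs <- coef(fit)

# Change over time within each group, computed directly from the cells.
change_healthy  <- 6 - 5
change_diabetic <- 8.5 - 5.5

# The interaction coefficient equals the difference between those changes.
interaction   <- coefs[["timepoints2:statusDiabetic"]]
diff_of_diffs <- change_diabetic - change_healthy
```

A gene with an interaction of zero changes by the same amount over time in both groups, even if the groups start at different levels.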
Old 10-21-2013, 12:35 AM   #7
sindrle

Thanks!
That's what I thought, but is that the only thing tested?

I get a lot of results, of which I have chosen three:

statusResults <- results(dds, "status_Healthy_vs_Diabetic");
timepointsResults <- results(dds, "timepoints_2_vs_1");
statusTreatmentResults <- results(dds, "timepoints2.statusHealthy");

So you described the "timepoints2.statusHealthy" result, right?

But am I correct to use "status_Healthy_vs_Diabetic" to look for differences between the groups (disregarding time), and vice versa for "timepoints_2_vs_1"?

Thanks!
Old 10-21-2013, 12:41 AM   #8
dpryan
Devon Ryan
 

Code:
Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
design <- formula(~ 0 + Treat)
This will create 4 groups: "Diabetic1", "Diabetic2", "Healthy1", and "Healthy2". So no, that won't keep the pairing. This is a similar design to what cuffdiff would do if you input the files as 4 groups. Aside from the pairing issue, some people prefer this since it makes it easier to compare groups directly, which is what they want to do. It's mostly a matter of which question you really want to ask.
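As a rough illustration of why the four-group layout makes direct comparisons easy, the sketch below fits an ordinary linear model (not DESeq2) to made-up log2 values. With "~ 0 + Treat" each coefficient is simply a group mean, so any pairwise comparison is a contrast between two coefficients. The data and the contrast vector are hypothetical.

```r
# Made-up log2 expression of one gene, one value per group.
toy <- data.frame(
  Treat    = factor(c("Healthy1", "Healthy2", "Diabetic1", "Diabetic2")),
  log2expr = c(5, 6, 5.5, 8.5)
)

fit   <- lm(log2expr ~ 0 + Treat, data = toy)
means <- coef(fit)   # one coefficient per group: the group mean

# Contrast: Diabetic versus Healthy at time point 2.
contrast <- c(TreatDiabetic1 = 0, TreatDiabetic2 = 1,
              TreatHealthy1  = 0, TreatHealthy2  = -1)
d_vs_h_t2 <- sum(contrast[names(means)] * means)
```

Because no patient term is in the model, this comparison treats the two time points of one patient as unrelated samples, which is the pairing issue mentioned above.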
Old 10-21-2013, 12:47 AM   #9
sindrle

So to compare Cuffdiff2 with DESeq2 I could run that design (since I have already done that with Cuffdiff2), but as you said it's not my actual question of interest, which is what led me to DESeq2 in the first place.
Old 10-21-2013, 07:22 AM   #10
Simon Anders

Quote:
Originally Posted by sindrle View Post
But am I correct to use "status_Healthy_vs_Diabetic" to look for differences between the groups (disregarding time), and vice versa for "timepoints_2_vs_1"?
For those genes for which the interaction is not significant: Yes.

For the others: not quite. You cannot disregard time, because the difference between the groups depends on the time point.

If you want to average over the two time points, you can add half of the interaction effect to the status main effect (or, vice versa, add half of the interaction to the time-point main effect to get the average over disease states).
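The "add half of the interaction" rule can be checked on a small example. This is a sketch with an ordinary linear model on made-up log2 values, assuming standard treatment (reference-level) coding with Healthy and time point 1 as the reference levels; it is not DESeq2 output, and the coefficient names are lm's.

```r
# Made-up log2 expression of one gene, one value per status/time-point cell.
toy <- data.frame(
  status     = factor(c("Healthy", "Healthy", "Diabetic", "Diabetic"),
                      levels = c("Healthy", "Diabetic")),
  timepoints = factor(c(1, 2, 1, 2)),
  log2expr   = c(5, 6, 5.5, 8.5)
)
b <- coef(lm(log2expr ~ timepoints * status, data = toy))

# Status effect averaged over both time points: main effect + half the interaction.
avg_status <- b[["statusDiabetic"]] + 0.5 * b[["timepoints2:statusDiabetic"]]
# Time effect averaged over both disease states.
avg_time <- b[["timepoints2"]] + 0.5 * b[["timepoints2:statusDiabetic"]]

# The same averages computed directly from the four cell values.
direct_status <- mean(c(5.5 - 5, 8.5 - 6))   # Diabetic minus Healthy at time 1 and 2
direct_time   <- mean(c(6 - 5, 8.5 - 5.5))   # time 2 minus time 1 in Healthy and Diabetic
```

The main effect alone is the difference at the reference level only (here, the status difference at time point 1), which is why it cannot be read as a time-independent group difference when the interaction is non-zero.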
Old 10-21-2013, 07:47 AM   #11
sindrle

What if time had the same effect on both groups?
Or if the groups are the same, but both changed over time?

Could you then do as I said?

Last edited by sindrle; 10-24-2013 at 03:28 PM.