SEQanswers

11-28-2013, 01:34 AM   #1
frymor
Independent filtering and experimental design in DESeq

Hi everybody,

I know this problem has been discussed quite a lot. I have read the posts here and here (and also the papers mentioned in them).

I have two questions concerning my experiment. The first is about the experimental design, the second about how to set the filtering. I think the two are connected, so I would like to put them in one post.

In my experiment we have three conditions (ctrl, KO1 and KO2) and three separate cell types (I, P and NP).
I would like to better understand how to analyse the data in one go.

The aim of the experiment is not only to compare the ctrl vs. KO1 and/or KO2, but also to analyse the efficiency of cellular processes by comparing NP vs. P in ctrl and/or KO1 and KO2.

I first ran the analysis with all genes (without any filtering at all), comparing ctrl vs. KO1 and ctrl vs. KO2. Interestingly, every ctrl vs. KO1 comparison gives a long list of significantly deregulated genes (FDR=0.1%), whereas the ctrl vs. KO2 comparisons give only 2-5 genes.
So I thought that filtering out the low-count genes might help with this. In search of a good cutoff I tried the genefilter package and got the following rank plot:
rank_scatterplot.png

Q1: I was wondering if cutting the data set at 0.57 is a good decision.

Then I looked for an FDR value and made the rejection plot, to see how many genes I am left with at each of the different FDR values.
rejection_plot.png
It was interesting to see that from 0% to 50% the curves all overlap each other.

Q2: Does that mean, that there is no difference between ϑ=0.5 and ϑ=0.1?
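
To make Q2 concrete, here is a minimal base-R sketch of what I understand the rejection plot to summarise: the number of Benjamini-Hochberg rejections after removing the fraction ϑ of genes with the lowest filter statistic. The counts and p-values are simulated, not my data, and the helper name count_rejections is made up for this illustration; as far as I can tell, filtered_R in genefilter does this kind of calculation properly.

```r
# Hypothetical helper: number of BH rejections after dropping the fraction
# `theta` of genes with the smallest filter statistic.
count_rejections <- function(filter_stat, pvals, theta, alpha = 0.1) {
  cutoff <- quantile(filter_stat, probs = theta, names = FALSE)
  keep <- filter_stat >= cutoff
  padj <- p.adjust(pvals[keep], method = "BH")
  sum(padj < alpha, na.rm = TRUE)
}

set.seed(1)
n_genes <- 2000
# Simulated data (NOT from the experiment): a per-gene filter statistic
# (e.g. total counts) and p-values; only high-count genes carry signal.
filter_stat <- rnbinom(n_genes, mu = 100, size = 0.5)
is_de <- filter_stat > 150 & runif(n_genes) < 0.3
pvals <- ifelse(is_de, rbeta(n_genes, 0.1, 10), runif(n_genes))

# Compare the number of rejections for several filtering fractions.
thetas <- c(0, 0.1, 0.3, 0.5, 0.57)
rejections <- sapply(thetas, function(th) count_rejections(filter_stat, pvals, th))
names(rejections) <- thetas
print(rejections)
```

If the number of rejections hardly changes between two values of ϑ, I would expect the corresponding curves in the rejection plot to lie on top of each other.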

pair-wise vs. multifactor design:

I read the DESeq manual and ran the analysis as described here:
Code:
library(DESeq)

# Sample annotation; the 'comparison' column holds the combined group label
pd <- read.delim2("../phenoData.txt", sep="\t", quote="", row.names=1)

# Raw read counts: genes in rows, samples in columns
featureCountTable <- read.table("countTable.txt", header=TRUE, row.names=1, quote="")

# Nine conditions: ctrl_I, ctrl_NP, ctrl_P, KO1_I, KO1_NP, KO1_P, KO2_I, KO2_NP and KO2_P
conditions <- factor(pd$comparison)

cds <- newCountDataSet(featureCountTable, conditions)

# Normalisation
cds <- estimateSizeFactors(cds)
normResults <- counts(cds, normalized=TRUE)

# Variance estimation
cds <- estimateDispersions(cds)

# I then ran a negative binomial test for each comparison
res_I_ctrl_KO1 <- nbinomTest(cds, "ctrl_I", "KO1_I")
res_P_ctrl_KO1 <- nbinomTest(cds, "ctrl_P", "KO1_P")
...
I was wondering if DESeq can work this way or if I need to run a multi-factor design such as

Code:
fit1 = fitNbinomGLMs( cdsFullDataSet, count ~ libType + condition )
fit0 = fitNbinomGLMs( cdsFullDataSet, count ~ libType )
where libType would be ctrl, KO1 and KO2, and condition would be I, NP and P.
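
To show what I mean by the two factors, here is a small base-R sketch that splits the combined labels into a genotype factor (the libType role) and a cell-type factor (the condition role), and builds the design matrices of the full and reduced models. The three replicates per group and the column names genotype and cellType are assumptions made only for this illustration. My understanding of the manual is that such a data frame would be passed to newCountDataSet in place of the single conditions factor, but I have not verified this on my data.

```r
# Combined labels as used in the pairwise analysis; three replicates per
# group is an assumption for illustration only.
groups <- c("ctrl_I", "ctrl_NP", "ctrl_P",
            "KO1_I",  "KO1_NP",  "KO1_P",
            "KO2_I",  "KO2_NP",  "KO2_P")
comparison <- rep(groups, each = 3)

# Split "genotype_cellType" into two separate factors.
parts <- do.call(rbind, strsplit(comparison, "_", fixed = TRUE))
design <- data.frame(
  genotype = factor(parts[, 1], levels = c("ctrl", "KO1", "KO2")),
  cellType = factor(parts[, 2], levels = c("I", "NP", "P"))
)

# Design matrices mirroring the full and reduced formulas above:
# full = genotype + cell type, reduced = genotype only.
full    <- model.matrix(~ genotype + cellType, data = design)
reduced <- model.matrix(~ genotype, data = design)

print(colnames(full))
print(table(design$genotype, design$cellType))
```

Comparing the full against the reduced model in this form would test the cell-type effect while accounting for genotype; swapping the roles of the two factors would test the genotype effect instead.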

It would be great if I could get some help.

Thanks a lot,

Assa

Tags
deseq, filtering, multi-factor design
