SEQanswers

Old 06-05-2011, 04:24 AM   #1
katussa10
Member
 
Location: Columbus, OH, USA

Join Date: Jun 2010
Posts: 11
RNAseq analysis using DESeq

We are using DESeq to find differentially expressed genes in an RNAseq experiment with two biological replicates per group. When we ran the analysis treating them as biological replicates, we found that many of the genes called differentially expressed had a very high count in one biorep compared to the others. We then compared biorep1 of the experimental group with biorep1 of the control group alone, and only 3 genes were differentially expressed at padj 0.05, even though a lot of genes have thousands of reads in one group vs. 0 in the other. When we did the same with the second biorep, we found about 100 differentially expressed genes. Does anyone know why this is happening?
Old 06-13-2011, 01:44 PM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

I'm not sure I understand your question. Could you give an example, please?
Old 06-14-2011, 07:35 AM   #3
katussa10
Member
 
Location: Columbus, OH, USA

Join Date: Jun 2010
Posts: 11

Hi Simon,
Here is an example (attached txt file) of genes that were called differentially expressed although the variation within the same experimental group was very high. Please let me know if it is still not clear, and I will try to explain again.
Attached Files
File Type: txt example.txt (2.1 KB, 110 views)

Last edited by katussa10; 06-14-2011 at 11:42 AM. Reason: Attachment was not correct
Old 07-06-2011, 11:39 PM   #4
Gangcai
Member
 
Location: Shanghai, China

Join Date: Nov 2009
Posts: 30

Dear Simon,
I have a quite similar problem with the significant genes detected by DESeq and edgeR. Both of them report some significant candidates that have quite large variation within groups, such as:
wt1: 2
wt2: 345
treat1: 3
treat2: 1

or
wt1: 0
wt2: 345
treat1: 0
treat2: 0

Is it normal to get a low p value for this kind of expression pattern? Thanks.
Old 07-07-2011, 01:00 AM   #5
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Short answer: Please try again with the 'devel' version of DESeq (version 1.5.19), and this oddity should vanish.

Long answer: In the current release version of DESeq (version 1.4.1), we estimate a variance for each gene, fit a line through the mean-variance plot, and then use the fitted value of the variance, i.e., the value typical for a gene of the same expression strength. The 'nbinomTest' function gives you, besides the p values, two columns with the "variance residuals", i.e., the ratio of the gene's variance estimate over the fitted value. Cases such as yours should show up as having a large value there, and the vignette advises disregarding such hits in the downstream analysis.

Nobody ever read this sentence in the vignette, and the solution was rather unsatisfactory anyway, so we have now changed this. We no longer always use the fitted value, but instead the maximum of the per-gene estimate and the fitted value. This avoids artifacts like the ones you see. Have a look at the help page for 'estimateDispersions' in the new version, and also at the vignette, which we have extensively overhauled.
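
To make the difference between the two rules concrete, here is a minimal base-R sketch using the counts from the example earlier in the thread. It is not DESeq's actual code: the size factors are assumed to be 1, the per-gene estimate is a plain method-of-moments dispersion, and `fitted_value` is a made-up number standing in for the value read off the fitted line.

```r
# Counts from the example earlier in the thread (size factors assumed to be 1).
wt    <- c(2, 345)
treat <- c(3, 1)

# Method-of-moments dispersion for a negative binomial:
# var = mu + alpha * mu^2, so alpha = (var - mu) / mu^2, floored at zero.
raw_dispersion <- function(counts) {
  mu <- mean(counts)
  if (mu == 0) return(0)  # all-zero counts carry no dispersion information
  max((var(counts) - mu) / mu^2, 0)
}

per_gene_wt <- raw_dispersion(wt)

# Hypothetical fitted value for a gene of this expression strength; in a real
# analysis it would come from the fit through the mean-dispersion plot.
fitted_value <- 0.1

old_rule <- fitted_value                    # always use the fitted value
new_rule <- max(per_gene_wt, fitted_value)  # maximum of per-gene estimate and fit

cat("per-gene:", per_gene_wt, " old rule:", old_rule, " new rule:", new_rule, "\n")
```

With the old rule, the gene is tested as if its replicates agreed as well as those of a typical gene, so the single count of 345 looks like a real effect. With the maximum rule, the gene's own large estimate is used instead, which makes the test far more conservative for such a pattern.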
Old 07-07-2011, 01:24 AM   #6
Gangcai
Member
 
Location: Shanghai, China

Join Date: Nov 2009
Posts: 30

Hi Simon,
Thanks for your quick reply. One more question, about installing the devel version of DESeq.
I have downloaded the newest version of Biobase from Bioconductor, but DESeq requires an even newer version (http://www.bioconductor.org/packages...l/Biobase.html).
Do you know where to download Biobase >= 2.13.6? Thanks.
"
Error : package 'Biobase' 2.12.2 was found, but >= 2.13.6 is required by 'DESeq'
"
Old 07-07-2011, 01:26 AM   #7
labunit
Member
 
Location: Giessen, Germany

Join Date: Sep 2010
Posts: 10

You need to download the development version of R (2.14) to be able to install the development branch of Bioconductor packages, including DESeq 1.5.19.
Old 07-07-2011, 01:39 AM   #8
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

At http://www.bioconductor.org/packages...l/Biobase.html

However, installing a 'devel' version of Biobase over a 'release' installation of Bioconductor might cause chaos. It is better to install the devel version of R; 'biocLite' will then pull the 'devel' versions of all Bioc packages.
Old 07-07-2011, 02:42 AM   #9
Gangcai
Member
 
Location: Shanghai, China

Join Date: Nov 2009
Posts: 30

Hi Simon,
I have tried the devel version of DESeq. The number of significant genes dropped quite a lot, and the result does look better than the previous one. But some significant genes still have high variation within biological replicates:
wt1   wt2   treat1   treat2   pvalue_adjusted
928   0     0        0        <<0.01
0     135   0        0        <<0.01
Old 08-29-2011, 06:32 AM   #10
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

Could you please try again with

Code:
cds <- estimateDispersions( cds, method="pooled" )
It seems that our improved method removes these oddities reliably only if one uses a pooled dispersion estimate. I guess we should therefore change the default in 'estimateDispersions' to this, but at the moment it is still method="per-condition" (which is the same as method="normal" in the old version).
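
As a rough illustration of how a per-condition and a pooled estimate can differ for such a gene, here is a base-R sketch using the first row of the table in the previous post. It is a simplified assumption of what the two modes compute, not DESeq's implementation: size factors are taken to be 1, the helper `moment_dispersion` is hypothetical, and the fitted curve is ignored entirely.

```r
# One gene from the previous post, two replicates per condition
# (size factors assumed to be 1).
counts     <- c(wt1 = 928, wt2 = 0, treat1 = 0, treat2 = 0)
conditions <- factor(c("wt", "wt", "treat", "treat"))

# Method-of-moments dispersion: var = mu + alpha * mu^2.
moment_dispersion <- function(mu, v) {
  if (mu == 0) return(NA_real_)  # all-zero counts give no per-gene estimate
  max((v - mu) / mu^2, 0)
}

# Per-condition: each condition gets its own estimate from its own replicates.
means <- sapply(split(counts, conditions), mean)
vars  <- sapply(split(counts, conditions), var)
per_condition <- mapply(moment_dispersion, means, vars)

# Pooled: a single estimate from the overall mean and the average
# within-condition variance (equal replicate numbers, so a plain mean works).
pooled <- moment_dispersion(mean(counts), mean(vars))

print(per_condition)
print(pooled)
```

In this sketch the all-zero 'treat' condition yields no per-gene estimate of its own, so in per-condition mode only the fitted value would be available for it, whereas the pooled estimate carries the large wt variability over to both conditions. Whether this is exactly why the pooled mode behaves better in DESeq is an assumption on my part; the help page for 'estimateDispersions' is the authoritative description.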