SEQanswers

07-04-2012, 05:08 AM   #1
iceage (Junior Member, France)

Differential gene expression analysis with biological replicates using edgeR/DESeq

Hi everyone,

I have read and searched a lot about this topic but cannot find a solution to my problem. Maybe you will be able to help me.

I am doing an internship in bioinformatics for my master's degree and I have to deal with RNA-seq data. I have two sets of experiments (A and B), each with two Illumina runs covering two stages (1 and 2) of a plant. A and B were not done at the same time and the technology differs slightly, giving:
about 30M reads per run for A,
about 80M reads per run for B.

For a given stage, the log(RPKM) values of the replicates are very well correlated.

When I use edgeR to estimate a common dispersion from the counts of each run, looking for differentially expressed genes between the two stages, I obtain 0.86, which seems far too large given how well the RPKM values correlate. Moreover, the number of differentially expressed genes is not consistent with what we know from Affymetrix data (about 250 genes when we expected about 1000).
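To put the dispersion value in perspective: in the negative binomial model used by edgeR, the variance of a count with mean mu and dispersion phi is mu + phi * mu^2, and the square root of the dispersion is the biological coefficient of variation (BCV). The sketch below is only an illustration in plain Python, not edgeR code; the function names are made up for this example.

```python
import math

def bcv(dispersion):
    """Biological coefficient of variation implied by a negative binomial dispersion."""
    return math.sqrt(dispersion)

def nb_variance(mean, dispersion):
    """Negative binomial variance in the mean-dispersion form: mu + phi * mu^2."""
    return mean + dispersion * mean ** 2

# Dispersion values quoted in the post (before and after filtering).
for phi in (0.86, 0.76):
    print(f"dispersion {phi:.2f} -> BCV {bcv(phi):.2f}")

# Hypothetical gene with a mean count of 100, to show how the variance scales.
print(nb_variance(100, 0.86))
```

Read this way, a dispersion of 0.86 corresponds to replicate-to-replicate variation of roughly 93% of the mean expression level, which is why it looks inconsistent with well-correlated replicates.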

I first thought about filtering out the genes that have a count per million below 1 in all conditions. I then obtain a dispersion of 0.76: still too high...
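For readers who want to see what that filter amounts to, here is a minimal sketch in plain Python. It assumes the counts are held in a dictionary mapping gene IDs to per-sample raw counts and that the library size is the column total; the gene names and numbers are invented toy data, and this is not how edgeR itself stores or filters counts.

```python
def cpm(counts, lib_sizes):
    """Convert one gene's raw counts to counts per million, one value per sample."""
    return [c / n * 1e6 for c, n in zip(counts, lib_sizes)]

def filter_low_expression(count_table, min_cpm=1.0):
    """Drop genes whose CPM is below min_cpm in every sample."""
    if not count_table:
        return {}
    gene_ids = list(count_table)
    n_samples = len(count_table[gene_ids[0]])
    # Library size per sample = total count in that column (an assumption of this sketch).
    lib_sizes = [sum(count_table[g][j] for g in gene_ids) for j in range(n_samples)]
    return {g: row for g, row in count_table.items()
            if max(cpm(row, lib_sizes)) >= min_cpm}

# Hypothetical toy table: four samples, four genes.
toy = {
    "geneA": [1_500_000, 4_000_000, 1_400_000, 3_900_000],
    "geneB": [1, 2, 0, 3],
    "geneC": [40, 90, 35, 100],
    "geneD": [0, 0, 0, 0],
}
kept = filter_low_expression(toy)
print(sorted(kept))
```

Because the threshold is applied on the CPM scale, the 30M-read and 80M-read runs are treated comparably even though their raw counts differ.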

I also thought about getting variance-stabilized data (with DESeq) to use with limma, but that does not make sense if the samples are not paired, does it?

I am wondering whether I am doing something wrong here and whether there is any filtering or computation I should have done to obtain a more consistent common dispersion.

Any ideas would be really appreciated,

François
07-07-2012, 04:58 PM   #2
Gordon Smyth (Member, Melbourne, Australia)

A few points:

edgeR is a Bioconductor package, so more detailed help is available on the Bioconductor mailing list than on SEQanswers.

If you want to get your RNA-seq data into limma, the way to do this is to use the voom() function of the limma package. See the limma User's Guide.

There are any number of things that might be causing problems with your analysis, but there's no way to know from the information that you give. Your dispersion values are very high indeed. Have you used an MDS plot to look at your data?
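As a rough illustration of what an MDS plot would reveal, the sketch below computes the pairwise sample distances that such a plot projects into two dimensions. It is a plain-Python approximation, not the edgeR/limma implementation: it assumes a "leading log-fold-change" style distance (root mean square of the largest log2-CPM differences between two samples), a simplified prior count of 0.5, and invented toy counts for samples named A1, B1, A2, B2.

```python
import math
from itertools import combinations

def log_cpm(counts, prior=0.5):
    """log2 counts per million, with a small prior count to avoid log(0) (simplified)."""
    lib_size = sum(counts)
    return [math.log2((c + prior) / lib_size * 1e6) for c in counts]

def leading_distance(x, y, top=500):
    """Root mean square of the `top` largest log2 differences between two samples."""
    sq = sorted(((a - b) ** 2 for a, b in zip(x, y)), reverse=True)[:top]
    return math.sqrt(sum(sq) / len(sq))

def sample_distances(samples, top=500):
    """samples: dict of sample name -> raw counts, all in the same gene order."""
    logs = {name: log_cpm(c) for name, c in samples.items()}
    return {(a, b): leading_distance(logs[a], logs[b], top)
            for a, b in combinations(logs, 2)}

# Hypothetical counts for five genes; B runs are sequenced more deeply than A runs.
samples = {
    "A1": [100, 200, 50, 400, 250],
    "B1": [270, 520, 140, 1050, 660],
    "A2": [400, 60, 210, 90, 240],
    "B2": [1030, 170, 560, 250, 630],
}
d = sample_distances(samples)
for pair, dist in d.items():
    print(pair, round(dist, 2))
```

If the replicates of a stage (here A1/B1 and A2/B2) sit close together and the stages sit far apart, the stage effect dominates; if instead the samples group by experiment (A versus B), a batch effect is a likely contributor to the large dispersion.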
All times are GMT -8.

