#1 |
|
Junior Member
Join Date: Dec 2009
Location: Granada, Spain
Posts: 8
|
Hi,
I need to analyze differentially expressed genes between samples from two tissues. I was thinking about using the DEGseq or edgeR packages. Has any of you tried these packages? Thanks!
marina
#2 |
|
Member
Join Date: Sep 2009
Location: nl
Posts: 21
|
I had RNA-Seq data that I:
1) mapped with TopHat
2) ran through Cufflinks to determine RPKM/transcript
3) analyzed for differentially expressed transcripts with DEGseq

With these software packages the analysis was pretty straightforward. After TopHat and Cufflinks I only had to use the function "DEGexp" from the DEGseq package, but you should be able to skip Cufflinks and feed DEGseq uniquely aligned sequences directly as well.
svl
#3 | |
|
Senior Member
Join Date: Oct 2009
Location: Tsinghua, Beijing, China
Posts: 236
|
Quote:
Since I haven't used Cufflinks much, I have a detailed question: you said you determined RPKM/transcript with Cufflinks, so I am wondering how you make sure that the transcripts determined by Cufflinks from different samples match. Do you use some kind of gene annotation to guide Cufflinks? Thanks in advance.
__________________
Xi Wang |
#4 |
|
Member
Join Date: Sep 2009
Location: nl
Posts: 21
|
Yep, exactly, using a GFF file. From the cufflinks manual:
"-G/--GTF -> Tells Cufflinks to use the supplied reference annotation to estimate isoform expression. It will not assemble novel transcripts, and the program will ignore alignments not structurally compatible with any reference transcript." |
|
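To illustrate why a shared annotation answers the matching question: when every sample is quantified against the same reference transcripts, the per-sample tables share their transcript IDs and can be joined directly. The following is only a minimal sketch; the function name `merge_by_transcript`, the transcript IDs, and the RPKM values are hypothetical, and it does not read real Cufflinks output.

```python
def merge_by_transcript(sample_a, sample_b):
    """Pair up expression values for transcript IDs present in both samples."""
    shared = sorted(sample_a.keys() & sample_b.keys())
    return {tid: (sample_a[tid], sample_b[tid]) for tid in shared}

# Hypothetical RPKM values keyed by hypothetical annotation transcript IDs.
tissue_1 = {"tx_001": 12.5, "tx_002": 0.8, "tx_003": 40.1}
tissue_2 = {"tx_001": 10.9, "tx_002": 3.2, "tx_003": 38.7}

paired = merge_by_transcript(tissue_1, tissue_2)
for transcript_id, (rpkm_1, rpkm_2) in paired.items():
    print(transcript_id, rpkm_1, rpkm_2)
```

Without a common annotation, each sample could yield its own assembled transcripts, and a join on IDs like this one would presumably find few or no shared keys.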
#5 | |
|
Senior Member
Join Date: Oct 2009
Location: Tsinghua, Beijing, China
Posts: 236
|
Quote:
__________________
Xi Wang |
#6 |
|
Junior Member
Join Date: Dec 2009
Location: Granada, Spain
Posts: 8
|
Hi!
It's good to know that DEGseq works fine. I would like to know whether it's necessary to have several replicates of each sample to use this package. Thanks!
marina
#7 | |
|
Senior Member
Join Date: Oct 2009
Location: Tsinghua, Beijing, China
Posts: 236
|
Quote:
Following the examples in the manual, you can get to know how to use DEGseq.
__________________
Xi Wang |
#8 |
|
Senior Member
Join Date: Feb 2010
Location: Heidelberg, Germany
Posts: 133
|
Hi,
given the title of the thread, I have to use the opportunity to advertise our new package "DESeq", which is now a third option in Bioconductor for determining whether a fold change in RNA-Seq data is significant.

Like edgeR, DESeq uses the negative binomial distribution. However, we use a novel way of estimating the variance between biological replicates that is, in our view, more precise than edgeR's. The package is in Bioc devel (see also here). The paper describing the method is submitted; contact me if you would like to get a preprint.

In this paper, we also argue that the Poisson approximation is not suitable for RNA-Seq analysis and that a dispersion estimate is indispensable. Note that this explicitly contradicts the opinion that the DEGseq authors state in their package vignette. Hence, your choice is as follows: if you go with Xi Wang et al.'s opinion that Poisson is justified, use DEGseq, while if you agree with the negative binomial people, i.e., the authors of edgeR and DESeq, you go with these. I've tried to make DESeq easy to use and fast, and I hope you all will like it. Feedback is very welcome.

In comparing all three methods, you will typically find that edgeR and DESeq find about the same number of hits, but with a different distribution across the range of abundances. In our paper, we argue why we think that our newer tool gets closer to the truth. DEGseq (the Poisson-based method) will give you many more hits than edgeR and DESeq.

I don't want to go into details (this is what we have written the paper for ;-) ) but as Xi Wang has already posted in this thread, it would be impolite to not at least briefly mention why we advise against relying on the Poisson assumption: As Marioni et al. [Genome Res., 2008] have shown, the noise between technical replicates is indeed at the theoretical minimum, i.e., the level predicted by the Poisson distribution. However, the noise between biological replicates is, unsurprisingly, much higher (see the comparison between technical and biological replicates by Nagalakshmi et al. [Science, 2008]) and vastly exceeds the noise predicted by the Poisson assumption.

Hence, if you test against a Poisson null hypothesis and reject it (i.e., call the gene differentially expressed), this informs you that the difference in transcript abundance between your two samples is larger than what you would expect between technical replicates. The question you are typically interested in is, however, whether it is larger than what one would expect between biological replicates, as only then can it be attributed to a difference in the treatment or characteristics of the biological sample. Hence, it is important to measure the noise between biological replicates, and the fact that the noise between technical replicates can be calculated from the Poisson distribution does not help.

Best regards
Simon
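The argument above can be made concrete with a toy calculation. The sketch below is not the test used by DEGseq, edgeR, or DESeq; it is a simple normal-approximation z-test that plugs the observed counts in as means, and the counts and the dispersion value 0.1 are made-up assumptions. It relies only on the standard negative binomial variance, mean + dispersion × mean², which reduces to the Poisson variance when the dispersion is zero.

```python
import math

def nb_variance(mean, dispersion):
    """Negative binomial variance; dispersion = 0 gives the Poisson variance."""
    return mean + dispersion * mean ** 2

def z_score(count_a, count_b, dispersion):
    """Normal-approximation z-score for the difference of two counts,
    using the observed counts as plug-in estimates of the means."""
    var = nb_variance(count_a, dispersion) + nb_variance(count_b, dispersion)
    return (count_a - count_b) / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p-value under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical read counts for one gene in two samples.
count_a, count_b = 1000, 1300

for label, dispersion in [("Poisson (technical noise only)", 0.0),
                          ("negative binomial, assumed dispersion 0.1", 0.1)]:
    z = z_score(count_a, count_b, dispersion)
    print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.3g}")
```

With these made-up numbers, the same 1.3-fold difference is highly significant under the Poisson variance but not once extra biological variance is allowed for, which is the reason a Poisson-based method tends to report many more hits.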
#9 |
|
Junior Member
Join Date: Jan 2010
Location: Barcelona
Posts: 3
|
I recently used edgeR and it works really well. I also used TopHat, but then calculated the RPKMs with my own scripts, and edgeR did the DE analysis. It is easy to use and creates really nice scatterplots of the data.
Cheers, Sergio |
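The scripts mentioned in this post are not shown. For readers who want to do the same, here is a minimal sketch of the usual RPKM definition (reads per kilobase of transcript per million mapped reads); the function name `rpkm` and the example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

# Hypothetical example: 500 reads on a 2,000 bp transcript,
# in a library with 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
print(value)
```

The factor 1e9 combines the per-kilobase (1e3) and per-million-reads (1e6) scalings.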
#10 |
|
Senior Member
Join Date: Nov 2008
Location: Berkeley, CA
Posts: 154
|
With the recently released Cufflinks 0.8.0, you can use the included program "cuffdiff" to test for differential gene and isoform expression, as well as differential splicing and differential promoter use within genes.
#11 |
|
Junior Member
Join Date: Dec 2009
Location: Granada, Spain
Posts: 8
|
Thanks a lot for the answers