#1 |
Junior Member
Location: Odense Join Date: Aug 2012
Posts: 5
Hi everybody
I'm struggling a bit with GSEA of RNA-Seq data. I've settled on a package called GAGE (Generally Applicable Gene-set Enrichment), mainly because it is the only algorithm I've been able to find that does not require 10+ biological replicates. The algorithms employed by GAGE are targeted toward microarray data, so some adjustments are necessary prior to analysis: I need to transform the data to make them homoscedastic, and I need to correct for length bias.
I then ask GAGE to do a paired comparison between treatment and control, which gives me an enrichment score for each pair. What I'm struggling with is which method I should ask GAGE to use for statistical testing, as I have only two replicates. The options are:
I think the correct test to use is the rank-based two-sample t-test, but it would be nice if someone with more statistical knowledge could comment on my workflow.
Last edited by JesperGrud; 01-24-2013 at 08:16 AM. Reason: Layout of post
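The two preprocessing steps mentioned in the post above (a variance-taming transformation and a length correction) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a simple RPKM-style length normalization and a log2 transform with a pseudocount, which only approximately stabilizes variance and is not necessarily what the poster or GAGE actually does. The function names and toy numbers are hypothetical.

```python
import math


def log2_rpkm(counts, lengths_bp, pseudocount=1.0):
    """Length- and depth-normalize one sample, then log2-transform.

    counts:     dict mapping gene -> raw read count for one sample
    lengths_bp: dict mapping gene -> transcript length in base pairs
    """
    total_millions = sum(counts.values()) / 1e6  # library size in millions of reads
    out = {}
    for gene, count in counts.items():
        length_kb = lengths_bp[gene] / 1e3
        rpkm = count / (length_kb * total_millions)
        # The pseudocount keeps log2 finite for genes with zero reads.
        out[gene] = math.log2(rpkm + pseudocount)
    return out


def paired_log_fold_change(treated, control):
    """Per-gene difference of log2 values for one treatment/control pair."""
    return {gene: treated[gene] - control[gene] for gene in treated}


# Toy data (hypothetical): one treated and one control sample, three genes.
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}
treated = log2_rpkm({"geneA": 400, "geneB": 300, "geneC": 300}, lengths)
control = log2_rpkm({"geneA": 100, "geneB": 600, "geneC": 300}, lengths)
print(paired_log_fold_change(treated, control))
```

The per-pair log fold changes are the kind of per-gene input a paired gene-set comparison would then work on.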
#2 |
Senior Member
Location: MDC, Berlin, Germany Join Date: Oct 2009
Posts: 317
Hi,
If you haven't solved this yet, I'd like to suggest you try our newly developed R package SeqGSEA, which is available at http://bioconductor.org/packages/rel...l/SeqGSEA.html.
It can integrate differential expression and differential splicing for GSEA. By modeling read counts with a negative binomial distribution, SeqGSEA can correctly capture technical and biological variance in RNA-Seq data. It doesn't require a large number of biological replicates, but I'd like to know how many you have.
Cheers,
Xi
__________________
Xi Wang
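To illustrate why a negative binomial model is a natural fit for read counts, here is a small sketch of its mean-variance relationship in the parameterization commonly used for RNA-Seq, variance = mean + dispersion × mean². The first term is the Poisson-like sampling (technical) noise and the second term grows with biological variability. The dispersion value of 0.1 is an arbitrary assumption for illustration, and this is not SeqGSEA code.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion."""
    return mean + dispersion * mean ** 2


# Compare against a pure Poisson model, whose variance equals its mean.
for mean in (10, 100, 1000):
    poisson_var = mean
    nb_var = nb_variance(mean, dispersion=0.1)  # 0.1 is an assumed value
    print(f"mean={mean}  Poisson variance={poisson_var}  NB variance={nb_var}")
```

For highly expressed genes the quadratic term dominates, which is why a Poisson model alone tends to understate the variability between biological replicates.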
#3 |
Junior Member
Location: Odense Join Date: Aug 2012
Posts: 5
Hi.
In the end I used a rank-based Wilcoxon test. This might yield some false negatives, in the sense that the p-values are too conservative if the data are in fact normally distributed. However, since I'm only interested in, say, the top 5 pathways from a biological point of view, it does not matter that much.
Our current setup is to have just 2 replicates from cell culture experiments. Our tests show that 2 replicates give us sufficient data to detect most differential expression using DESeq.
I haven't read the details of your package, but the questions that spring to mind are whether it requires paired-end reads and whether the differential splicing analysis can be omitted.
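For readers unfamiliar with the rank-based test mentioned above, here is a minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test comparing per-gene scores inside a gene set against the remaining genes. It assumes the normal approximation without a tie correction to the variance, which is rough for small sets. It is a generic illustration, not GAGE's implementation, and the scores are made up.

```python
import math


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(in_set, background):
    """Return (U, z, two-sided p) using the normal approximation."""
    n1, n2 = len(in_set), len(background)
    ranks = midranks(list(in_set) + list(background))
    rank_sum = sum(ranks[:n1])
    u = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, z, p


# Hypothetical per-gene log fold changes.
pathway_genes = [1.8, 2.1, 1.5, 2.4]
other_genes = [0.1, -0.3, 0.4, 0.0, -0.2, 0.3]
print(rank_sum_test(pathway_genes, other_genes))
```

Because only ranks enter the statistic, the test makes no normality assumption, at the cost of some power when the scores really are normally distributed.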
#4 |
Senior Member
Location: MDC, Berlin, Germany Join Date: Oct 2009
Posts: 317
I am afraid 2 replicates are too few for SeqGSEA, as it's based on sample label permutation for statistical significance.
Regarding your questions: SeqGSEA doesn't require PE reads; it only takes read-count data. SeqGSEA can also run DE-only GSEA.
I hope this info helps with your future data analysis, and helps other researchers too.
Cheers.
__________________
Xi Wang
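The point about sample label permutation can be made concrete with a small sketch. With 2 treated and 2 control samples there are only 6 distinct ways to assign the labels, so an exact one-sided permutation p-value can never fall below 1/6. The code below is a generic label-permutation test on a difference in means, assumed purely for illustration; it does not reproduce SeqGSEA's actual procedure, and the values are invented.

```python
from itertools import combinations
from math import comb


def permutation_p_floor(n_treated, n_control):
    """Smallest one-sided p-value an exact label-permutation test can return."""
    return 1 / comb(n_treated + n_control, n_treated)


def permutation_p_value(treated, control):
    """Exact one-sided permutation p-value for the difference in group means."""
    pooled = list(treated) + list(control)
    n = len(treated)
    observed = sum(treated) / n - sum(control) / len(control)
    at_least_as_extreme = 0
    total = 0
    # Enumerate every way of choosing which samples carry the "treated" label.
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        t = [pooled[i] for i in idx]
        c = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        stat = sum(t) / len(t) - sum(c) / len(c)
        total += 1
        if stat >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Even a very large effect cannot reach significance with 2 vs 2 samples.
print(permutation_p_value([10.0, 11.0], [1.0, 2.0]))
print(permutation_p_floor(2, 2), permutation_p_floor(5, 5))
```

Adding replicates increases the number of distinct labelings quickly, which is what makes permutation-based significance usable.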