Syndicated from PubMed RSS Feeds
A Powerful and Flexible Approach to the Analysis of RNA Sequence Count Data.
Bioinformatics. 2011 Aug 2;
Authors: Zhou YH, Xia K, Wright FA
MOTIVATION: A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA-sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean-variance relationships provides a flexible testing regimen that "borrows" information across genes, while easily incorporating design effects and additional covariates.
RESULTS: We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data, and (ii) an extension of an expression mean-variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with alternative methods for handling RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility.
AVAILABILITY: An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq
CONTACT: [email protected]; [email protected].
PMID: 21810900 [PubMed - as supplied by publisher]
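As a rough illustration of approach (i), the sketch below evaluates a beta-binomial log-likelihood for one gene under a logit link. It is not the BBSeq code or its API. The function names, the (mu, phi) parameterization, and the toy numbers are assumptions made for this example, and the abstract does not specify the link function or how the model is fitted.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, computed through log-gamma for stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(y, n, mu, phi):
    """Log P(Y = y) for a beta-binomial with mean n * mu.

    Assumed parameterization: phi in (0, 1) is the overdispersion, so that
    Var(Y) = n * mu * (1 - mu) * (1 + (n - 1) * phi).
    """
    a = mu * (1.0 - phi) / phi
    b = (1.0 - mu) * (1.0 - phi) / phi
    log_choose = (math.lgamma(n + 1) - math.lgamma(y + 1)
                  - math.lgamma(n - y + 1))
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)


def inv_logit(x):
    """Map a linear predictor to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gene_loglik(counts, totals, design, coefs, phi):
    """Sum of beta-binomial log-likelihoods over the samples of one gene.

    counts: reads mapped to the gene in each sample
    totals: total mapped reads in each sample
    design: one row of covariates per sample (intercept, group, ...)
    coefs:  regression coefficients on the logit scale
    """
    total = 0.0
    for y, n, row in zip(counts, totals, design):
        eta = sum(x * beta for x, beta in zip(row, coefs))
        total += betabinom_logpmf(y, n, inv_logit(eta), phi)
    return total


# Toy data (hypothetical): four samples, two per group.
counts = [12, 15, 30, 28]
totals = [1000, 1100, 1050, 980]
design = [[1, 0], [1, 0], [1, 1], [1, 1]]  # intercept, group indicator
loglik = gene_loglik(counts, totals, design, coefs=[-4.3, 0.8], phi=0.01)
```

In practice the coefficients and phi would be estimated by maximizing this quantity, for example with a numerical optimizer, and a group effect would be tested by comparing fits with and without the corresponding coefficient.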