SEQanswers





03-23-2014, 06:29 PM   #1
sparky
Junior Member
 
Location: Australia

Join Date: Mar 2014
Posts: 5
Statistical analysis of 454 bisulfite sequence data - small sample size

Hello everyone,

I am trying to figure out how to analyse some bisulfite sequencing data that I have, and I am hoping that someone will have some suggestions as to how I should go about doing it. I have looked online and in statistics textbooks, but I am totally stumped.

I have performed 454 BS sequencing of a number of PCR amplicons. I have two different treatment groups, with n=3 biological replicates in each (six sets of read data in total). I want to use two types of statistical analysis to assess differences in methylation between the treatment groups. I would like to test for differences in methylation (1) at individual CpG sites within an amplicon and (2) across each amplicon as a whole. I think that it will be necessary for me to analyse my results as count data rather than %methylation values, as I have a small sample size and the %methylation values probably do not conform to normality or homogeneity of variance assumptions. Similar studies that I have seen in the literature have used a Fisher's exact test for (1) and a negative binomial generalised linear model for (2). However, these studies analysed unreplicated data (where biological replicates were pooled prior to PCR), and as far as I know these tests are unable to accommodate my replicated data. In another post, somebody suggested that the program DESeq could be used for (1). After trying to use DESeq to analyse my data, I realised that this is not possible, as the relatively small number of CpG sites that I have to analyse results in inaccurate mean/dispersion estimates.
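For reference, the pooled per-site approach from the literature can be sketched as below. This is only an illustration in plain Python: the read counts are made up, and `fisher_exact_two_sided` is a hypothetical helper name, not a function from any package. It shows why the test does not fit replicated data: the three replicates per group have to be summed into a single 2x2 table, so between-replicate variation is discarded.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in the first group
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum the probabilities of all tables no more likely than the observed one
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))

# Hypothetical (methylated, unmethylated) read counts at one CpG, per replicate
control = [(30, 70), (45, 55), (25, 75)]
treated = [(60, 40), (50, 50), (70, 30)]

# Pooling step: replicate structure is lost here
c_meth = sum(m for m, _ in control)
c_unmeth = sum(u for _, u in control)
t_meth = sum(m for m, _ in treated)
t_unmeth = sum(u for _, u in treated)

p_value = fisher_exact_two_sided(c_meth, c_unmeth, t_meth, t_unmeth)
print(f"pooled Fisher's exact test p-value: {p_value:.3g}")
```

Because every read is treated as an independent observation, a p-value from the pooled table will tend to overstate the evidence when replicates differ from one another.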

If anyone has any idea as to which statistical tests would be appropriate for my data I would be very grateful.

Thank you in advance
03-26-2014, 03:43 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

I'd be a bit hesitant to try to shoe-horn this into DESeq or one of the other RNAseq tools; the negative binomial distribution doesn't really fit bisulfite sequencing well. This sort of data is generally handled in one of a few ways:

(1) Logistic regression (e.g., in methylKit), which you can do easily enough in R.
(2) Smoothing followed by either a t-test or Wilcoxon test, similar to how bsseq/BSmooth works.
(3) Beta-binomial regression (e.g., in BiSeq).
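As a rough illustration of option (1), a logistic regression with a single two-level group covariate can be tested with a likelihood-ratio test, and in that special case the fitted probabilities are just the per-group methylation rates. The sketch below is plain Python with made-up counts and a hypothetical function name (`logistic_lrt`); it is not how methylKit is implemented. It also treats every read as independent, so it ignores variation between biological replicates.

```python
from math import log, erfc, sqrt

def binom_loglik(counts, p):
    """Binomial log-likelihood (constant terms dropped) for (methylated, total) pairs."""
    ll = 0.0
    for m, n in counts:
        if m > 0:
            ll += m * log(p)
        if n - m > 0:
            ll += (n - m) * log(1 - p)
    return ll

def pooled_rate(counts):
    """Maximum-likelihood methylation rate for a set of (methylated, total) pairs."""
    return sum(m for m, _ in counts) / sum(n for _, n in counts)

def logistic_lrt(group_a, group_b):
    """Likelihood-ratio test of 'one shared rate' vs 'one rate per group'."""
    ll_alt = (binom_loglik(group_a, pooled_rate(group_a))
              + binom_loglik(group_b, pooled_rate(group_b)))
    both = group_a + group_b
    ll_null = binom_loglik(both, pooled_rate(both))
    stat = max(0.0, 2 * (ll_alt - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom
    return stat, erfc(sqrt(stat / 2))

# Hypothetical (methylated reads, total reads) at one CpG, three replicates per group
control = [(30, 100), (45, 100), (25, 100)]
treated = [(60, 100), (50, 100), (70, 100)]

stat, p_value = logistic_lrt(control, treated)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.3g}")
```

The chi-square approximation is asymptotic, so with three replicates per group the p-value should be read as a rough guide only.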

I would say that the beta-binomial methods will win out long term, since they're actually able to model the underlying biology. You can use the betareg package from CRAN in R for something close to this, though note that it fits a beta regression to methylation proportions rather than a beta-binomial model to the read counts. The next thing to think about is whether you're interested in single CpGs or whole regions. Most of the packages actually try to find regions, but if you're looking at a small number of amplicons then you're likely to be more interested in single CpGs, so you might just ignore the packages and use betareg. I should note that none of these methods is ideal as of yet. There are new variants every month, it seems, and I actually have a tweaked version of beta-binomial regression in mind to implement if no one else has already (the downside to new packages appearing every couple of weeks...), so you'll likely find something that works nicely in the not too distant future.
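To make the beta-binomial idea concrete for a single CpG, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the counts are invented, the function names are hypothetical, and the fit uses a crude grid search rather than a proper optimiser, so it is not what BiSeq, betareg, or MOABS actually do. The model lets each replicate have its own methylation level drawn from a beta distribution with mean `mu` and precision `s`, and compares a shared mean against group-specific means with a likelihood-ratio test.

```python
from math import lgamma, erfc, sqrt

def log_beta(a, b):
    """Log of the beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_loglik(counts, mu, s):
    """Beta-binomial log-likelihood for (methylated, total) pairs.

    mu is the mean methylation level; s = alpha + beta is the precision
    (a smaller s means more variation between replicates).
    """
    a, b = mu * s, (1 - mu) * s
    ll = 0.0
    for m, n in counts:
        log_choose = lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)
        ll += log_choose + log_beta(m + a, n - m + b) - log_beta(a, b)
    return ll

# Crude parameter grids (an assumption; a real fit would use an optimiser)
MU_GRID = [i / 100 for i in range(1, 100)]
S_GRID = [2 ** (k / 2) for k in range(-4, 25)]

def best_over_mu(counts, s):
    """Best log-likelihood over the mean grid at a fixed precision."""
    return max(betabinom_loglik(counts, mu, s) for mu in MU_GRID)

def betabinom_lrt(group_a, group_b):
    """LRT of a shared mean vs group-specific means, with a shared precision."""
    both = group_a + group_b
    ll_null = max(best_over_mu(both, s) for s in S_GRID)
    ll_alt = max(best_over_mu(group_a, s) + best_over_mu(group_b, s) for s in S_GRID)
    stat = max(0.0, 2 * (ll_alt - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom
    return stat, erfc(sqrt(stat / 2))

# Hypothetical (methylated reads, total reads) at one CpG, three replicates per group
control = [(30, 100), (45, 100), (25, 100)]
treated = [(60, 100), (50, 100), (70, 100)]

stat, p_value = betabinom_lrt(control, treated)
print(f"beta-binomial LRT statistic = {stat:.2f}, p = {p_value:.3g}")
```

With only three replicates per group the chi-square approximation is shaky and the precision estimate is noisy, which is part of why dedicated packages add shrinkage or other refinements.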
03-26-2014, 09:32 AM   #3
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

In fact, it turns out that MOABS, which just came out, already implements what I had in mind. You'll have to figure out how to get your data into it, but it's likely to give nice results.
04-02-2014, 09:12 PM   #4
sparky
Junior Member
 
Location: Australia

Join Date: Mar 2014
Posts: 5

Thank you for your helpful advice, dpryan.

Tags
454, bisulfite sequencing, methylation, replicates, small sample size
