Fernas 05-20-2014 02:48 AM

RNASeq Biological Replicates
Hi all,

I have RNASeq data for 20 different samples (human cell types and tissues). 10 of these samples are represented by a single biological replicate, while the other 10 are represented by >=2 biological replicates. In total I have 40 RNASeq fastq files, each mapped independently.

Now, the purpose is to compare the samples (cell types) against each other (not between the biological replicates of each sample). So, my question is: how can I deal with the biological replicates that I mapped? I have some options:

1) Take one (random) replicate per sample
2) Generate read counts of mapped reads for each replicate independently and then average the read counts per sample
3) Merge the mapped reads (using bedtools merge) of the replicates

Any other suggestions?

dpryan 05-20-2014 04:09 AM

Don't use any of the options you presented. You absolutely want to use the replicates as replicates to better gauge biological variability. You'll be using one of the normal RNAseq packages (DESeq2, edgeR, limma, etc.), which can handle replicates natively. The problem is actually the comparison of unreplicated samples, for which the results will be questionable.

Fernas 05-20-2014 04:13 AM

Thanks @dpryan.
But actually the initial purpose of this study is not to find differentially expressed genes but to compare the gene expression profile (the list of gene expression values across all 20 cell types) against the histone modification profile (and some other epigenomic markers), which do not have replicates for these cell types. So replicates look useless here. Am I right?

dpryan 05-20-2014 04:18 AM

No, more replicates will still give you a better idea of actual correlation, since you'll be more accurately assessing expression level (and how much histone modification or whatever is actually present). My guess (since you really don't provide enough information for people to know what you're actually attempting) is that you just want to use something like GSEA and look at gene-sets comprised of those genes whose promoters/bodies/whatever show enrichment for your histone modification(s) of interest across groups.

Fernas 05-20-2014 04:30 AM

Many thanks @dpryan for the prompt reply.
I am sorry if I left out some information. I just wanted to keep my question simple. What exactly I want to do is:
find the expression profile (a vector of 20 values for the 20 cell types) at each genomic region (e.g. promoter) and compare it with the epigenomic profile (a vector of 20 values of a histone modification marker, methylation, etc. for the 20 cell types) in order to detect the regions where the two are strongly correlated.

So, your point is to keep the replicates. In this case, I will need to use the same epigenomic value (histone modification value, methylation, etc.) for all n replicates of the same cell type.
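To make the idea concrete, here is a minimal sketch of that per-region comparison, with the single epigenomic value of each cell type reused for every replicate. All names and numbers are hypothetical, the expression values are assumed to be already normalized, and Pearson correlation is only one possible choice of measure; this is not a method anyone in this thread has prescribed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def region_correlations(expr, sample_cell_types, epi):
    """Correlate replicate-level expression with the epigenomic signal per region.

    expr: {region: [normalized expression, one value per RNASeq sample]}
    sample_cell_types: cell type of each RNASeq sample, in the same order
    epi: {region: {cell_type: single unreplicated epigenomic value}}
    """
    result = {}
    for region, values in expr.items():
        # Reuse the one epigenomic value for every replicate of a cell type.
        epi_per_sample = [epi[region][ct] for ct in sample_cell_types]
        result[region] = pearson(values, epi_per_sample)
    return result

# Hypothetical toy data: 3 cell types, 5 RNASeq samples (A and C have 2 replicates).
sample_cell_types = ["A", "A", "B", "C", "C"]
expr = {
    "promoter_1": [1.0, 1.0, 2.0, 3.0, 3.0],
    "promoter_2": [5.0, 5.0, 3.0, 1.0, 1.0],
}
epi = {
    "promoter_1": {"A": 10.0, "B": 20.0, "C": 30.0},
    "promoter_2": {"A": 10.0, "B": 20.0, "C": 30.0},
}
correlations = region_correlations(expr, sample_cell_types, epi)
```

One caveat with this layout: repeating the epigenomic value adds no independent information on the epigenomic side, so any significance estimate based on the number of samples would be too optimistic for replicated cell types.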

I am not sure if GSEA can do this task for me.

dpryan 05-20-2014 06:04 AM

I believe it can, yes.

All times are GMT -8.