02-04-2014, 09:21 AM   #1
Junior Member
Location: UK

Join Date: Aug 2013
Posts: 5
Within Sample Correlation of 2 genes - RNA-seq

Hi Guys,

I am doing some differential expression analysis of RNA-seq data using DESeq2. I have 12 different samples, and I am feeding the raw count matrix into DESeq2.

My question: if I want to look at the correlation between Gene A and Gene B within my samples (the two genes are co-expressed; I am not comparing samples with each other), should I do this on the raw counts or on normalized counts?

So I have 12 values for Gene A across 12 samples
and 12 values for Gene B across 12 samples.

Correlating the raw counts gives me a rho of around 0.8.
However, normalizing with the DESeq2 method scales each sample by a different size factor, and rho goes down to 0.5.
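For readers unfamiliar with size factors, here is a minimal base R sketch of median-of-ratios scaling, the idea behind the DESeq2 normalization mentioned above. The toy matrix and all variable names are invented for illustration, and genes with zero counts would need extra handling that this sketch skips.

```r
# Toy count matrix: 4 genes (rows) x 3 samples (columns); values are invented.
counts <- matrix(c(10, 20, 40,
                   100, 200, 400,
                   5, 10, 20,
                   50, 100, 200),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Per-gene reference: the log of the geometric mean across samples.
log_geo_means <- rowMeans(log(counts))

# Size factor of a sample: median ratio of its counts to the reference.
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})

# Dividing each column by its size factor removes the depth difference.
norm_counts <- sweep(counts, 2, size_factors, "/")
```

In this toy case the three samples differ only by depth, so after scaling the columns come out identical.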

Anyway, I am not sure how I should be doing the correlation (raw or normalized) and, if normalized, which method is preferable for within-sample comparisons of 2 different genes.

Thank you for taking the time to read this and hope someone can give me some advice.
saint_667
02-05-2014, 03:28 AM   #2
Junior Member
Location: UK

Join Date: Aug 2013
Posts: 5
bump!!

Just bumping the post up, as I posted it very late in the evening.
saint_667
02-06-2014, 02:56 PM   #3
Michael Love
Senior Member
Location: Boston

Join Date: Jul 2013
Posts: 333


Note that DESeq2 doesn't really help you out with this question, as it focuses on gene-by-gene differential expression, and the transformations are most useful for visualizing and clustering samples.

You don't want sequencing depth to be a factor in the correlation. Consider a situation where genes A and B are not correlated, but you sequence the samples so that each sample has twice as many reads as the previous one. Then you will get a really high correlation which has no biological significance.
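The doubling scenario can be simulated in a few lines of base R. This is only an illustrative sketch: the expression values are random draws, the depth is applied as a simple multiplier, and Spearman's rho is assumed as the correlation measure.

```r
set.seed(1)
n <- 12

# Two genes whose underlying expression is drawn independently per sample
# (simulated values, purely illustrative).
expr_a <- runif(n, 100, 150)
expr_b <- runif(n, 100, 150)

# Each sample is sequenced twice as deeply as the previous one.
depth <- 2^(0:(n - 1))
raw_a <- expr_a * depth
raw_b <- expr_b * depth

# Raw values: both genes simply track depth, so their ranks agree.
rho_raw <- cor(raw_a, raw_b, method = "spearman")

# Dividing out the depth recovers the independent expression values.
rho_norm <- cor(raw_a / depth, raw_b / depth, method = "spearman")
```

Here `rho_raw` comes out at 1 because depth dominates the ranking of both genes, while `rho_norm` is just whatever the independent draws happen to give.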

So you could* do:

nc <- counts(dds,normalized=TRUE)

where idx gives the index of genes you want to find correlations for.
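The post refers to idx without showing a line that uses it. One plausible way to use it, sketched under assumptions: the matrix below is simulated as a stand-in for the output of counts(dds, normalized=TRUE) so that the example runs without DESeq2, the gene names are hypothetical, and Spearman's rho is an assumed choice.

```r
set.seed(42)

# Stand-in for `nc <- counts(dds, normalized = TRUE)`: a genes x samples
# matrix of simulated values (not real data).
nc <- matrix(rpois(5 * 12, lambda = 200), nrow = 5,
             dimnames = list(paste0("gene", LETTERS[1:5]),
                             paste0("sample", 1:12)))

# `idx` selects the genes of interest (hypothetical names).
idx <- c("geneA", "geneB")

# cor() correlates columns, so transpose to put the genes in columns.
gene_cor <- cor(t(nc[idx, ]), method = "spearman")
```

The result is a symmetric gene-by-gene matrix; the off-diagonal entry is the correlation between the two selected genes across the 12 samples.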

*However, I would also consider batch effects if you are calculating gene-gene correlations and the samples were processed in batches. This would be another way to get spurious correlations that are large in absolute value. You can check for batch effects using either of the DESeq2 transformations (rlog or the variance stabilizing transformation) and the plotPCA workflow in the DESeq2 vignette.
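As a rough illustration of what such a PCA check looks for, here is a base R sketch that uses prcomp() on simulated log-scale data instead of plotPCA on real transformed counts. The batch labels, the size of the shift, and the number of affected genes are all invented.

```r
set.seed(7)
n_genes <- 200
batch <- rep(c("batch1", "batch2"), each = 6)

# Simulated log-scale expression (genes x samples), with a shift added to
# the first 50 genes in batch2 to mimic a batch effect.
log_expr <- matrix(rnorm(n_genes * 12, mean = 8, sd = 0.3), nrow = n_genes)
log_expr[1:50, batch == "batch2"] <- log_expr[1:50, batch == "batch2"] + 2

# PCA with samples as rows, so each sample gets a score on each component.
pca <- prcomp(t(log_expr))
pc1 <- pca$x[, 1]

# If the samples separate by batch, the PC1 batch means sit far apart.
pc1_by_batch <- tapply(pc1, batch, mean)
```

With real data, plotting the first two components and colouring the points by batch gives the same information visually.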

If the samples cluster by batch, then the cqn package vignette explains how to get "normalized expression values", where the normalization takes care of sequencing depth, GC-content bias and gene length bias.
02-07-2014, 01:53 AM   #4
Junior Member
Location: UK

Join Date: Aug 2013
Posts: 5
Thank you for clearing this up

Hi Michael,

Thank you for clearing this up and for giving such a comprehensive response to this problem. I was thinking along the same lines. Someone suggested using the VSD-transformed data from DESeq2 and then plotting these correlations.

- That, however, gives really high correlations, almost in line with the non-normalized data. I understand that the DESeq2 transformations are useful if we want to perform clustering.

- The method that you suggest, i.e. using the normalized data and then looking for batch effects as well, makes sense to me.

Thanks again.
saint_667

correlation, deseq2, rna-seq normalization
