**saint_667** · 02-05-2014, 03:28 AM

bump!!

just bumping the post up - as i posted it very late in the evening.

**Michael Love** · 02-06-2014, 02:56 PM

hi,

Note that DESeq2 doesn't really help you out with this question, as it focuses on gene-by-gene differential expression, and the transformations are most useful for visualizing and clustering samples.

You don't want the sequencing depth as a factor in the correlation. Consider a situation where gene A and B are not correlated, but you sequence the samples so that each sample has double the number of reads as the previous sample. Then the raw counts of both genes rise with depth, and you will get a really high correlation which has no biological significance.
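To make the doubling scenario concrete, here is a minimal base-R sketch with made-up numbers (`expr_a`, `expr_b` and `depth` are hypothetical, not taken from any real dataset). The two genes are constructed to have zero correlation across samples, and depth is assumed to scale both counts proportionally.

```r
# Hypothetical underlying expression of two genes across 6 samples,
# chosen so that their Pearson correlation is zero.
expr_a <- c(10, 12, 10, 12, 10, 12)
expr_b <- c(20, 20, 22, 22, 18, 18)

# Each sample is sequenced twice as deeply as the previous one.
depth <- 2^(0:5)

# Raw counts scale with depth (an idealised, noise-free assumption).
raw_a <- expr_a * depth
raw_b <- expr_b * depth

cor_true <- cor(expr_a, expr_b)               # zero by construction
cor_raw  <- cor(raw_a, raw_b)                 # close to 1, driven by depth alone
cor_norm <- cor(raw_a / depth, raw_b / depth) # depth divided out again

print(c(true = cor_true, raw = cor_raw, normalized = cor_norm))
```

In real data the depth is not known exactly, which is why estimated size factors are used instead of a known `depth` vector.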

So you could* do:

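nc <- counts(dds, normalized=TRUE)
cor(t(nc[idx,]))

where idx gives the index of the genes you want to find correlations for. Note that cor() correlates the columns of a matrix, and nc has genes in rows and samples in columns, so the subset is transposed with t() to get gene-gene rather than sample-sample correlations.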
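The same steps can be tried without a DESeq2 object. The sketch below uses a hypothetical toy count matrix (`cts`) and a simplified median-of-ratios calculation as a stand-in for the size factors DESeq2 would estimate; it is not a replacement for `counts(dds, normalized=TRUE)`. It also shows why orientation matters: `cor()` works on columns, so a genes-by-samples matrix has to be transposed to get gene-gene correlations.

```r
# Hypothetical toy counts: rows are genes, columns are samples.
cts <- matrix(
  c(10, 24,  40,  96,
    20, 40,  88, 176,
    30, 66, 108, 252,
     5, 10,  20,  40),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:4))
)

# Simplified median-of-ratios size factors: for each sample, the median
# ratio of its counts to the per-gene geometric mean across samples.
log_geo_mean <- rowMeans(log(cts))
sf <- apply(cts, 2, function(s) exp(median(log(s) - log_geo_mean)))

# Divide each column (sample) by its size factor.
nc <- sweep(cts, 2, sf, "/")

# Genes of interest (hypothetical selection).
idx <- c("gene1", "gene2", "gene3")

gene_cor   <- cor(t(nc[idx, ]))  # genes x genes
sample_cor <- cor(nc[idx, ])     # samples x samples, not what is wanted here

print(dim(gene_cor))
print(dim(sample_cor))
```

With only four samples the correlations themselves are not meaningful; the point is the shape of the result and the order of operations.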

*However, I would also consider batch effects if you are calculating gene-gene correlations and the samples were processed in batches. This would be another way to get spurious large-in-absolute-value correlations. You can check for batch effects using either of the transformations and the plotPCA workflow in the DESeq2 vignette.
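A small base-R sketch of how a batch effect alone can create a large gene-gene correlation. All numbers are made up, and the per-batch centering at the end is only there to show where the correlation comes from; it is not the cqn method and not a recommended correction.

```r
# Hypothetical design: 8 samples processed in two batches of 4.
batch <- rep(c("b1", "b2"), each = 4)

# Underlying expression of two genes, constructed to be uncorrelated.
base_a <- rep(c(10, 12, 10, 12), times = 2)
base_b <- rep(c(20, 20, 22, 22), times = 2)

# Assumed batch effect: both genes read 15 units higher in batch b2.
shift <- ifelse(batch == "b2", 15, 0)
obs_a <- base_a + shift
obs_b <- base_b + shift

cor_base <- cor(base_a, base_b)  # zero by construction
cor_obs  <- cor(obs_a, obs_b)    # large, driven entirely by the batch shift

# Subtract each batch's mean to show the correlation is batch-driven.
adj_a <- obs_a - ave(obs_a, batch)
adj_b <- obs_b - ave(obs_b, batch)
cor_adj <- cor(adj_a, adj_b)

print(c(base = cor_base, observed = cor_obs, centered = cor_adj))
```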

If the samples cluster by batch, then the cqn package vignette explains how to get "normalized expression values", where the normalization takes care of sequencing depth, GC-content bias and gene length bias:

cqn (a normalization tool for RNA-Seq data, implementing the conditional quantile normalization method): http://www.bioconductor.org/packages/release/bioc/html/cqn.html

**saint_667** · 02-07-2014, 01:53 AM

Thank you for clearing this up

hi michael,

thank you for clearing this up and giving a comprehensive response to this problem. i was thinking along the same lines. someone suggested using the vsd-transformed data from DESeq2 and then plotting these correlations.

- that, however, gives really high correlations, almost in line with the non-normalized data. i understand that the transformations from DESeq2 are useful if we want to perform clustering.

- the method that you suggest, i.e. using the normalized data and then looking for batch effects as well, makes sense to me.

thanks again

Within Sample Correlation of 2 genes - Rna-seq
