**dpryan** · 02-16-2014, 06:57 AM

There are NAs in the data, which "cor()" doesn't handle the way you likely want by default. See the "use=" option.
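
To illustrate the point about "use=", here is a minimal sketch on a made-up toy matrix (the matrix `m` and its column names are assumptions for illustration, not the data from this thread). With the default `use = "everything"`, a single NA makes every correlation involving that column NA; `"complete.obs"` and `"pairwise.complete.obs"` are two of the documented alternatives.

```r
# Toy matrix with one missing value (hypothetical data, not the poster's file)
m <- cbind(a = c(1, 2, 3, 4, NA),
           b = c(2, 4, 6, 8, 10),
           c = c(5, 3, 4, 1, 2))

# Default (use = "everything"): the NA propagates into a's correlations
cor_default <- cor(m)

# Drop every row that contains at least one NA before computing
cor_complete <- cor(m, use = "complete.obs")

# For each pair of columns, use the rows where both values are present
cor_pairwise <- cor(m, use = "pairwise.complete.obs")

print(cor_default)
print(cor_pairwise)
```

Which option is appropriate depends on how the missing values arose; the pairwise variant keeps more data but can produce a matrix computed from different row subsets per pair.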

**sindrle** · 02-16-2014, 07:22 AM

Hi again!
I have updated the script; it now removes the many rows and columns with variance = 0.

There should not be any NAs anymore, but I'm not sure.

Still, the heat map looks very black, and the row clusters seem hard to interpret?

Screen Shot 2014-02-16 at 16.20.05.pdf

library(clusterGenomics)
data <- read.table(file = "~/RNAseq/INFSTK-5010/Oblig1/NEJM_Web_Fig1data.txt", header = FALSE, skip = 1, sep = "\t")

dim(data)

test <- count.fields("~/RNAseq/INFSTK-5010/Oblig1/NEJM_Web_Fig1data.txt", sep="\t")
which(test != 295)

header = scan("~/RNAseq/INFSTK-5010/Oblig1/NEJM_Web_Fig1data.txt", "", n = 295)
colnames(data) = header
fixdata <- data[-7401,-275]

x <- data.matrix(fixdata[,-1:-2], rownames.force = NA)

rownames(x) <- fixdata[,1]
x[is.na(x)] = 0

ind <- apply(x, 2, var) == 0
x <- x[,!ind]
ind <- apply(x, 1, var) == 0
x <- x[!ind,]

colclust <- hclust(as.dist(1-cor(x, method="pearson")), method="average")

rowclust <- hclust(as.dist(1-cor(t(x), method="pearson")), method="average")

z <- x[rev(rowclust$labels[rowclust$order]), colclust$labels[colclust$order]]

plotHeatmap(z, fast = TRUE)

res = part(t(x), B=10, Kmax=10, minSize=40, dist.method="cor")

plotTreeCol(clust=colclust, groups=res$lab.hatK[colclust$order])

res2 = part(x, B=10, Kmax=10, minSize=40, dist.method="cor")

plotTreeRow(clust=rowclust, groups=res2$lab.hatK[rowclust$order])

groups = cutree(colclust, k=3)

groups2 = cutree(colclust, h=2)

comparison <- cbind(res$lab.hatK, groups)

colnames(comparison) <- c("PART", "cutree")

test <- ifelse(comparison[,1]==comparison[,2], 1,NA)

table(is.na(test))["TRUE"]
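
A side note on the comparison at the end of the script: cluster labels from two different methods are arbitrary, so testing the labels for equality can report disagreement even when the two partitions are identical up to renaming. A cross-tabulation is a more robust check. The sketch below uses made-up label vectors (`part_labels` and `cutree_labels` are hypothetical stand-ins for `res$lab.hatK` and `groups`), and the row-wise maximum is only a rough purity-style count, not an optimal one-to-one matching.

```r
# Hypothetical label vectors standing in for res$lab.hatK and the cutree() output
part_labels   <- c(1, 1, 1, 2, 2, 2, 3, 3)
cutree_labels <- c(2, 2, 2, 3, 3, 1, 1, 1)

# Naive check: counts positions where the label numbers happen to coincide
naive_agreement <- sum(part_labels == cutree_labels)

# Contingency table: rows are PART clusters, columns are cutree clusters
xtab <- table(PART = part_labels, cutree = cutree_labels)
print(xtab)

# For each PART cluster, count the samples that fall in its most common cutree cluster
best_match_agreement <- sum(apply(xtab, 1, max))
```

Here the naive count is 0 although 7 of the 8 samples are grouped the same way by both partitions.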

**dpryan** · 02-16-2014, 10:52 AM

This is because ~95% of the values are no more than 10% away from 0 after being normalized. Do a "hist(x)" to see this.
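
A minimal sketch of that check, using a simulated matrix since the original file is not available here (`x_demo` is a hypothetical stand-in for `x`). The clipping step at the end is an assumption of this sketch, not something suggested in the thread: it is one common way to stop a few extreme values from stretching the colour scale so that everything else appears black.

```r
set.seed(1)

# Hypothetical stand-in for the expression matrix: mostly near zero, a few extremes
x_demo <- matrix(rnorm(2000, sd = 0.05), nrow = 100)
x_demo[sample(length(x_demo), 20)] <- runif(20, -3, 3)

# Fraction of values within 10% of the largest absolute value, i.e. close to 0
frac_near_zero <- mean(abs(x_demo) <= 0.1 * max(abs(x_demo)))
print(frac_near_zero)

# The histogram shows the spike around 0 directly
hist(x_demo, breaks = 100, main = "Distribution of matrix values")

# Assumed remedy: clip to the 99th percentile of |value| before plotting
lim <- quantile(abs(x_demo), 0.99, names = FALSE)
x_clipped <- pmin(pmax(x_demo, -lim), lim)
```

The clipped matrix keeps the same dimensions, so it could be passed to the plotting function in place of the original; the 0.99 cutoff is arbitrary and worth adjusting after looking at the histogram.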

I'd never heard of the "clusterGenomics" package before. I guess the treecutting part is interesting.

**sindrle** · 02-16-2014, 01:09 PM

Thanks for the input. That was what I thought, but it doesn't look like the one in the paper...

I also liked the tree cutting!

Troubleshot Heatmap
