SEQanswers

07-08-2014, 12:41 PM   #1
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130
Improving heatmap plots

I sometimes make heatmaps for gene expression data. I start with the basic heatmap.2 call (based on the DESeq recommendation):
Code:
library(gplots)  # provides heatmap.2
heatmap.2( x, scale="row", trace="none",
           dendrogram="both", Rowv=TRUE, Colv=TRUE, col = col )
Based on all the parameters, it should come out okay. However, very often I find that the resulting heatmap does not cluster very well. For a simple two-group experiment, if I give it some differentially expressed genes, I would expect to see the heatmap divided into four sections (up and down for each condition). In my experience, that result has been very difficult to achieve.

Based on heatmap.2 documentation, it seems to be very flexible, but there are a lot of options. Has anyone been able to significantly improve their clustering by adjusting various parameters? Is there a particular combination that works especially well for gene expression data?
07-08-2014, 11:43 PM   #2
WhatsOEver
Senior Member
 
Location: Germany

Join Date: Apr 2012
Posts: 215

You could start by trying a different method for the hierarchical clustering (the hclust default is "complete" linkage -> http://stat.ethz.ch/R-manual/R-patch...ml/hclust.html).
In my case, Ward's method performed much better for clustering gene expression data.
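
A minimal sketch of swapping the clustering method, assuming the gplots package is installed. heatmap.2 takes the clustering function through its hclustfun argument. The toy matrix and the names hclust_ward, row_tree and groups are made up for illustration. "ward.D2" requires R 3.1.0 or later; older versions only offer "ward", which corresponds to "ward.D".

```r
set.seed(1)
# Toy matrix standing in for expression data: 40 genes x 6 samples (hypothetical).
x <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(paste0("gene", 1:40), paste0("s", 1:6)))
x[1:20, 4:6] <- x[1:20, 4:6] + 3   # first 20 genes shifted up in samples 4-6

# hclust() defaults to method = "complete"; this wrapper uses Ward's criterion.
hclust_ward <- function(d) hclust(d, method = "ward.D2")

# The same tree heatmap.2 would build for the rows (default distance: dist()).
row_tree <- hclust_ward(dist(x))
groups <- cutree(row_tree, k = 2)  # cut into two gene clusters for inspection

# Plot only if gplots is available.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, scale = "row", trace = "none", dendrogram = "both",
                    hclustfun = hclust_ward)
}
```

Calling table(groups) afterwards is a quick way to see how the genes were split before judging the picture.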
07-09-2014, 08:28 AM   #3
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by WhatsOEver
You could start by trying a different method for the hierarchical clustering (the hclust default is "complete" linkage -> http://stat.ethz.ch/R-manual/R-patch...ml/hclust.html).
In my case, Ward's method performed much better for clustering gene expression data.
Thanks for that suggestion. Switching the hclust method to Ward's made a very noticeable difference.

I guess my problem is really with the range of values. Most of the values end up in a small subset of the color range. My initial hope was that the scale parameter would solve that, but it only shifts the distribution. The colors at the ends of the range are essentially not represented. Regardless of how good the clustering is, it's difficult to actually see the results. Here is an example of what I mean (the color key and histogram are the important part):
07-09-2014, 08:45 AM   #4
jwfoley
Senior Member
 
Location: Stanford

Join Date: Jun 2009
Posts: 181

Look at your histogram. This is a feature of your data, not of the clustering tool. If you only want contrast within that middle range, chop off the tails of your distribution before you put it into heatmap.2, or set your own breaks for the color bins to get the same result.

Also, consider using a two-hue gradient with something neutral (white, gray, black, whatever) in the middle, since your scale has a zero point and the difference between positive vs. negative vs. neither is probably meaningful.
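
Both suggestions can be sketched roughly as follows, assuming gplots is installed. The cutoff lim, the palette colours and the toy matrix are arbitrary choices for illustration, not values from this thread. Row z-scores are computed by hand here so they can be clipped before plotting, which means the dendrograms are built from the clipped values.

```r
set.seed(2)
# Toy data: 30 genes x 12 samples, with one extreme value creating a long tail.
x <- matrix(rnorm(30 * 12), nrow = 30)
x[1, 1] <- 25

# Row z-scores (the same transformation scale = "row" applies for display).
z <- t(scale(t(x)))

# Option 1: chop off the tails at a hypothetical cutoff.
lim <- 2
z_clipped <- pmin(pmax(z, -lim), lim)

# Option 2: explicit breaks; heatmap.2 needs one more break than colours.
n_col <- 50
pal <- colorRampPalette(c("blue", "white", "red"))(n_col)  # neutral midpoint
brk <- seq(-lim, lim, length.out = n_col + 1)

# Plot only if gplots is available.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(z_clipped, scale = "none", trace = "none",
                    col = pal, breaks = brk)
}
```

Because the breaks are symmetric around zero, white lands on a z-score of 0, so positive, negative and unchanged values stay visually distinct.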
07-09-2014, 08:48 AM   #5
crazyhottommy
Senior Member
 
Location: Gainesville

Join Date: Apr 2012
Posts: 140

If you use Pearson correlation distance to cluster, you will get the desired figure.
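
One way to do this, sketched under the assumption that gplots is installed: pass a custom function through heatmap.2's distfun argument. The helper name cor_dist and the toy matrix are hypothetical. heatmap.2 builds its dendrograms from the unscaled matrix and applies scale="row" only for display, which is one reason a correlation-based distance can behave differently from the default Euclidean one.

```r
# Distance between rows as 1 - Pearson correlation.
# cor() correlates columns, hence the transpose.
cor_dist <- function(m) as.dist(1 - cor(t(m), method = "pearson"))

set.seed(3)
# Toy matrix: 40 genes x 6 samples, first 20 genes shifted up in samples 4-6.
x <- matrix(rnorm(40 * 6), nrow = 40)
x[1:20, 4:6] <- x[1:20, 4:6] + 3

d <- cor_dist(x)   # row distances, ranging from 0 (r = 1) to 2 (r = -1)

# Plot only if gplots is available; distfun is applied to rows and to columns.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, scale = "row", trace = "none",
                    distfun = cor_dist,
                    hclustfun = function(d) hclust(d, method = "average"))
}
```

Rows with zero variance produce NA correlations, so constant genes would need to be filtered out first.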
07-10-2014, 01:27 AM   #6
WhatsOEver
Senior Member
 
Location: Germany

Join Date: Apr 2012
Posts: 215

1) You used a symmetric color key.
2) You have a data point (RMS.T11 / 343867) with a z-score of ~8.
Together these produce a color map ranging from -8 to 8, onto which the rest of your data is mapped.

You can set the symkey parameter to FALSE to get an asymmetric key. Playing with a different color gradient (as suggested by jwfoley) also makes sense in my opinion. You can already see the genes of your BL.C group and the RMS group separating from EWS and from each other, so it's just a matter of fine-tuning the contrast, if that is what you want to show.
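
A small sketch of the effect, assuming gplots is installed. The toy matrix and the outlier value are invented to mimic a single z-score near 8. As far as the heatmap.2 documentation goes, symkey controls the key and the related symbreaks argument controls whether the colour breaks are symmetric about zero, so both are switched off here.

```r
set.seed(4)
# Toy data: 30 genes x 70 samples with a single extreme value.
x <- matrix(rnorm(30 * 70), nrow = 30)
x[1, 1] <- 40
z <- t(scale(t(x)))   # row z-scores; the outlier ends up near 8

# Symmetric breaks span -max|z| .. +max|z|; asymmetric ones span min(z) .. max(z).
sym_breaks  <- seq(-max(abs(z)), max(abs(z)), length.out = 51)
asym_breaks <- seq(min(z), max(z), length.out = 51)

# Plot only if gplots is available.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, scale = "row", trace = "none",
                    symkey = FALSE, symbreaks = FALSE,
                    col = colorRampPalette(c("blue", "white", "red"))(50))
}
```

With asymmetric breaks the middle colour no longer sits at zero, so this trades interpretability of the midpoint for contrast.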
07-11-2014, 01:48 PM   #7
rskr
Senior Member
 
Location: Santa Fe, NM

Join Date: Oct 2010
Posts: 250

I use Cramér's V for heatmaps. It is a measure of association that doesn't suffer from being overgeneralized from continuous variables to discrete variables, and it actually makes sense when clustering genes for which there are no reads.
08-25-2014, 10:57 AM   #8
id0
Senior Member
 
Location: USA

Join Date: Sep 2012
Posts: 130

Quote:
Originally Posted by rskr
I use Cramér's V for heatmaps. It is a measure of association that doesn't suffer from being overgeneralized from continuous variables to discrete variables, and it actually makes sense when clustering genes for which there are no reads.
How would you use Cramer's V for heatmap? Do you have any example?
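
The thread does not show how rskr applies it. One possible reading, sketched in base R under several assumptions: read counts are binned into categories (here "no reads", "low", "high", with arbitrary cutoffs), Cramér's V is computed for every pair of genes, and 1 - V is used as the clustering distance. The helper cramers_v, the cutoffs and the toy counts are all hypothetical.

```r
# Cramér's V for two categorical vectors of equal length.
cramers_v <- function(a, b) {
  tab <- table(a, b)
  k <- min(dim(tab))
  if (k < 2) return(0)   # a constant vector carries no association
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  sqrt(unname(chi2) / (sum(tab) * (k - 1)))
}

set.seed(5)
# Toy read counts: 20 genes x 12 samples.
counts <- matrix(rpois(20 * 12, lambda = 5), nrow = 20)

# Hypothetical binning: 1 = no reads, 2 = low (1-5), 3 = high (> 5).
lev <- matrix(cut(counts, c(-Inf, 0, 5, Inf), labels = FALSE),
              nrow = nrow(counts))

# Pairwise association between genes.
n <- nrow(lev)
v <- matrix(1, n, n)
for (i in seq_len(n - 1)) for (j in (i + 1):n) {
  v[i, j] <- v[j, i] <- cramers_v(lev[i, ], lev[j, ])
}

# Cluster genes on 1 - V.
gene_tree <- hclust(as.dist(1 - v), method = "average")
```

The resulting tree could then be handed to heatmap.2 via Rowv = as.dendrogram(gene_tree). With only a handful of samples per gene the chi-squared approximation is rough, so this is an illustration of the idea rather than a recommendation.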