Hi all,
I'm analyzing some ChIP-seq data, and I'd like to see how different my replicate samples are (pairwise comparisons are fine). To be clear, I want to compare the read density across the whole genome, not just the peaks. I know the process of comparing peaks has been covered in other threads, but I couldn't find a description of how to generate a simple scatterplot of normalized read density at all genomic positions, with the corresponding Pearson correlation coefficient. It seems to be a common form of analysis, and it looks like people generally use ~200 bp windows. It also seems important to exclude windows with no reads in either sample, to avoid artificially inflating the correlation. Any suggestions on how to tackle this would be much appreciated!
I started from FASTQ files, and the reads are already mapped to hg19.
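To make the question concrete, here is a minimal Python sketch of the calculation I have in mind. It assumes the mapped reads have already been reduced to (chromosome, start) pairs. The function names, the reads-per-million normalization, and the toy coordinates are my own placeholders, not an established pipeline, so please correct anything that looks wrong.

```python
from collections import Counter
from math import sqrt

WINDOW = 200  # window size in bp (the ~200 bp mentioned above)


def bin_reads(reads, window=WINDOW):
    """Count reads per (chrom, window index); reads are (chrom, start) pairs."""
    return Counter((chrom, start // window) for chrom, start in reads)


def to_rpm(counts):
    """Scale raw window counts to reads per million mapped reads (assumed normalization)."""
    total = sum(counts.values())
    return {key: value * 1e6 / total for key, value in counts.items()}


def paired_densities(a, b):
    """Pair up densities for every window that has reads in at least one sample.

    Windows with no reads in either sample never appear as keys,
    so they are excluded automatically.
    """
    keys = sorted(set(a) | set(b))
    return [a.get(k, 0.0) for k in keys], [b.get(k, 0.0) for k in keys]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# Toy read start positions (made up), standing in for two replicates.
rep1 = [("chr1", 10), ("chr1", 150), ("chr1", 250), ("chr1", 1000)]
rep2 = [("chr1", 20), ("chr1", 120), ("chr1", 260), ("chr1", 1010)]

x, y = paired_densities(to_rpm(bin_reads(rep1)), to_rpm(bin_reads(rep2)))
r = pearson(x, y)
print(f"{len(x)} windows with reads, Pearson r = {r:.3f}")
```

The x and y lists are what I would feed to a scatterplot. For real data the read positions would come from the alignment files rather than a hard-coded list, and a whole-genome dictionary of 200 bp windows may need something more memory-efficient than this.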