Does anybody have a good way of generating random ChIP-seq peaks?
The goal is to generate random sets to assess statistical significance of overlap of real ChIP-seq peaks with different genomic features.
Simply generating random coordinates does not suffice, because ChIP-seq peaks are biased toward non-repetitive regions, can of course only fall in assembled regions of the genome, and possibly show some GC bias.
I wonder whether simply excluding the repetitive and unassembled regions of the genome would do a good job. Alternatively, could one use large amounts of input data, with the mapped reads serving as "anchors" for randomly generated peaks?
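To make the first idea concrete, here is a minimal sketch of what I have in mind. It assumes the repeats and assembly gaps are already available as half-open (start, end) intervals per chromosome, and that the random peaks should keep the lengths of the real ones. The function names and the data layout are made up for illustration. It does nothing about GC or mappability bias, and the random peaks are allowed to overlap each other.

```python
import random

def allowed_intervals(chrom_sizes, excluded):
    """Return the complement of the excluded regions as (chrom, start, end) tuples.

    chrom_sizes: dict of chromosome name -> length (hypothetical layout)
    excluded:    dict of chromosome name -> list of half-open (start, end)
    """
    allowed = []
    for chrom, size in chrom_sizes.items():
        pos = 0  # first position not yet covered by an excluded region
        for start, end in sorted(excluded.get(chrom, [])):
            # Clamp to the chromosome so malformed input cannot escape it.
            start = min(max(start, 0), size)
            end = min(end, size)
            if start > pos:
                allowed.append((chrom, pos, start))
            pos = max(pos, end)
        if pos < size:
            allowed.append((chrom, pos, size))
    return allowed

def random_peaks(peak_lengths, allowed, rng=random):
    """Place one random peak per length, uniformly over all valid positions."""
    peaks = []
    for length in peak_lengths:
        # Number of valid start positions for this length in each interval.
        weights = [max(0, end - start - length + 1) for _, start, end in allowed]
        if sum(weights) == 0:
            raise ValueError(f"no allowed interval can hold a peak of length {length}")
        chrom, start, end = rng.choices(allowed, weights=weights, k=1)[0]
        peak_start = rng.randrange(start, end - length + 1)
        peaks.append((chrom, peak_start, peak_start + length))
    return peaks

# Toy example with invented coordinates.
sizes = {"chr1": 1000, "chr2": 500}
excl = {"chr1": [(0, 100), (400, 600)], "chr2": [(450, 500)]}
allowed = allowed_intervals(sizes, excl)
print(random_peaks([50, 200, 300], allowed, random.Random(0)))
```

Weighting each interval by its number of valid start positions makes every possible placement equally likely, so short gaps between repeats are not over-sampled. The weights are recomputed for every peak, which is fine for a sketch but would be slow for millions of intervals.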
Thanks!