09-03-2013, 08:35 AM  #1
Member
Location: USA Join Date: Oct 2012
Posts: 10

Probability of Occurrence
How do I determine whether the overlap of genetic mutations, e.g. copy number variants (CNVs), with a particular set of genomic regions is greater than expected by random chance?
I have a subset of a larger set of CNVs that were found to overlap a type of genomic region, and I would like to use a simple statistic to demonstrate that the number of overlapping CNVs is greater than you would expect by random chance. Thanks
09-03-2013, 09:02 AM  #2
Member
Location: Boston, MA Join Date: Aug 2013
Posts: 13

If you can convert the locations of these features into BED format, you can try the shuffleBed function in bedtools. That is, use it to randomly shuffle the positions of the CNVs/features and tabulate how many overlaps occur in each shuffle (over, say, 1000 shuffles). The fraction of shuffles that give at least as many overlaps as you actually observed is then an empirical p-value. You can also pass the function a file of regions to exclude, where a feature cannot exist, for example centromeric regions.
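
To make the shuffling idea concrete, here is a rough pure-Python sketch of the same kind of permutation test. It does not call bedtools. The function names, the toy coordinates, and the choice to keep each CNV on its original chromosome are all my own assumptions for illustration, and it leaves out the excluded-regions step entirely, so treat it as a sketch rather than a replacement for shuffleBed.

```python
import random

def count_overlapping(cnvs, regions):
    """Count CNVs that overlap at least one region (half-open, BED-style intervals)."""
    n = 0
    for chrom, start, end in cnvs:
        if any(c == chrom and start < r_end and r_start < end
               for c, r_start, r_end in regions):
            n += 1
    return n

def shuffle_cnvs(cnvs, chrom_sizes, rng):
    """Move each CNV to a random start on the same chromosome, keeping its length.

    Assumption: CNVs stay on their own chromosome and no regions are excluded.
    """
    shuffled = []
    for chrom, start, end in cnvs:
        length = end - start
        new_start = rng.randint(0, chrom_sizes[chrom] - length)
        shuffled.append((chrom, new_start, new_start + length))
    return shuffled

def permutation_p_value(cnvs, regions, chrom_sizes, n_shuffles=1000, seed=0):
    """Return (observed overlap count, empirical one-sided p-value)."""
    rng = random.Random(seed)
    observed = count_overlapping(cnvs, regions)
    at_least = sum(
        count_overlapping(shuffle_cnvs(cnvs, chrom_sizes, rng), regions) >= observed
        for _ in range(n_shuffles)
    )
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_shuffles + 1)

# Toy, made-up data: one chromosome, one region of interest, three CNVs.
chrom_sizes = {"chr1": 1_000_000}
regions = [("chr1", 1000, 2000)]
cnvs = [("chr1", 1500, 1600), ("chr1", 1900, 2100), ("chr1", 500_000, 500_100)]

observed, p_value = permutation_p_value(cnvs, regions, chrom_sizes)
print(observed, p_value)
```

With real data you would read the tuples from your BED files and use real chromosome sizes. The add-one correction means the smallest p-value you can report with 1000 shuffles is about 0.001, so run more shuffles if you need finer resolution.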

