Hi,
I have a large (~2 million) set of SNPs and I would like to know the GC content within +/- 500 bp of each variant. I have some Python and Unix skills, but am not an expert. I have downloaded the hg19 gc5Base track from the UCSC Table Browser, which has the GC content in 5 bp bins across the genome:
$ more hg19.gc5Base.txt
variableStep chrom=chr1 span=5
10001 40
10006 40
10011 40
10016 60
10021 60
10026 60
But I am not sure of the best way to get specific regions from this file based on the SNPs I am interested in. Any help would be appreciated!
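One possible way to approach this, as a minimal sketch rather than a tested pipeline: read the variableStep lines into sorted per-chromosome lists, then use a binary search to pull out the bins that overlap each SNP's window. The function names (parse_variable_step, mean_gc) and the flank parameter are made up for illustration, and the sketch assumes positions are 1-based, sorted within each chromosome, and that every bin overlapping the window counts equally (partial overlaps at the edges are not down-weighted).

```python
import bisect
from collections import defaultdict


def parse_variable_step(lines):
    """Parse variableStep wiggle lines into per-chromosome starts, values and span."""
    starts = defaultdict(list)   # chrom -> list of bin start positions
    values = defaultdict(list)   # chrom -> list of GC percentages
    spans = {}                   # chrom -> bin width in bp
    chrom = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("variableStep"):
            # e.g. "variableStep chrom=chr1 span=5"
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            spans[chrom] = int(fields.get("span", 1))
            continue
        pos, val = line.split()
        starts[chrom].append(int(pos))
        values[chrom].append(float(val))
    return starts, values, spans


def mean_gc(starts, values, spans, chrom, pos, flank=500):
    """Unweighted mean GC percent of bins overlapping [pos - flank, pos + flank]."""
    span = spans[chrom]
    # A bin starting at s covers s .. s + span - 1, so it overlaps the window
    # when s >= pos - flank - span + 1 and s <= pos + flank.
    lo = bisect.bisect_left(starts[chrom], pos - flank - span + 1)
    hi = bisect.bisect_right(starts[chrom], pos + flank)
    window = values[chrom][lo:hi]
    return sum(window) / len(window) if window else None


# Small demo using the six bins shown in the post.
sample = [
    "variableStep chrom=chr1 span=5",
    "10001 40", "10006 40", "10011 40",
    "10016 60", "10021 60", "10026 60",
]
starts, values, spans = parse_variable_step(sample)
print(mean_gc(starts, values, spans, "chr1", 10015))  # 50.0
```

For the real file, parse_variable_step could be given the open file handle instead of a list. The whole-genome track has several hundred million bins, so holding all of it in Python lists will probably need a lot of memory; handling one chromosome at a time is likely more practical.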