In exome sequencing, each company supplies its own interval list (passed with -L) describing the particular regions its kit captures. But all exons in the genome lie in fixed regions (the default exome regions mentioned here).
So what is the difference between using a custom interval list (the -L option used in the GATK Best Practices) and the default interval list? How is the accuracy of the output affected?
Code:
# refGene.txt columns 3, 5, 6 are chrom, txStart, txEnd (whole transcripts, introns included)
$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/refGene.txt.gz" |
    gunzip -c |
    cut -f 3,5,6 |
    sort -t $'\t' -k1,1 -k2,2n |
    bedtools merge -i - > exome.bed
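One rough way to see how much two target sets differ is to compare how many bases each BED file covers. This is only a minimal sketch: it assumes both files are tab-separated BED files whose intervals are already merged (no overlaps), and `bed_total_bp` is a helper name made up for illustration.

```shell
# Hypothetical helper: total number of bases covered by a BED file.
# Assumes tab-separated columns and non-overlapping (already merged) intervals.
# BED coordinates are 0-based, half-open, so each interval spans end - start bases.
bed_total_bp() {
    awk -F'\t' '{ total += $3 - $2 } END { print total + 0 }' "$1"
}
```

Running `bed_total_bp exome.bed` and then the same on the capture kit's interval file (for example a hypothetical `vendor_targets.bed`) gives two totals to compare. It says nothing about where the regions differ, only how large each target set is.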