Hi,
A bit confusing...
I am using a human reference genome (human_g1k_v37.fasta) from GATK, an Ensembl annotation file and the TruSeq Exome Targeted Regions Manifest v1.2 (http://support.illumina.com/downloads.html).
1) To match the two other files, I removed the 'chr' prefix from the chromosome names, but is the TruSeq .bed file 0- or 1-based? Where does the file come from?
If it is 0-based, I think I have to make it 1-based to match the annotation file, right? How could I do that?
2) Picard complains about the TruSeq interval list not having a header. Is it ok if I append the .dict file (generated from the reference genome with GATK) to it?
What is the format of the interval list file?
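To make the question concrete, here is a minimal Python sketch of what I imagine the conversion would look like. It assumes the TruSeq file follows the standard BED convention (0-based start, end not included) and that Picard expects the sequence dictionary lines as a header, followed by tab-separated lines of chromosome, start, end, strand and name with 1-based inclusive coordinates. The function name `bed_to_interval_list` is made up for illustration and is not part of Picard or GATK.

```python
def bed_to_interval_list(dict_lines, bed_lines):
    """Build interval-list lines from sequence dictionary lines and BED records.

    Assumptions: BED input is 0-based with an exclusive end; the output uses
    1-based inclusive coordinates; chromosome names lose their 'chr' prefix.
    """
    # Keep only the header lines (@HD, @SQ, ...) from the .dict file.
    out = [line.rstrip("\n") for line in dict_lines if line.startswith("@")]

    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED comment/track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")

        chrom = fields[0]
        if chrom.startswith("chr"):
            chrom = chrom[3:]

        start = int(fields[1]) + 1  # 0-based start becomes 1-based
        end = int(fields[2])        # exclusive end equals the 1-based inclusive end

        # Use the BED name and strand columns when present, else fall back.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"

        out.append("\t".join([chrom, str(start), str(end), strand, name]))
    return out


# Tiny in-memory example; real input would be read from the .dict and .bed files.
example_dict = ["@HD\tVN:1.0\tSO:unsorted\n", "@SQ\tSN:1\tLN:249250621\n"]
example_bed = ["chr1\t14361\t14829\tregion1\n"]
result = bed_to_interval_list(example_dict, example_bed)
print("\n".join(result))
```

If the .bed file turns out to be 1-based already, the `+ 1` on the start would have to go, which is why I would like to confirm the convention first.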
Thanks !