#1 |
Junior Member
Location: Los Angeles Join Date: Mar 2010
Posts: 5
I'm using HaplotypeCaller with the -L option to state the interval explicitly. I'm trying to break the task into pieces and assign them to separate jobs so that it finishes faster. In one run, I call the whole interval, say Chr1:1-1000000. In a second run, I split that interval across two jobs: Chr1:1-500000 on one job and Chr1:500001-1000000 on the other. In the split run, I see SNPs within about 100 bases of position 500000 that are not found in the first run. My guess is that asking HC to focus only on a region (i.e. -L 1-500000) does not allow the local aligner to reassemble the reads properly near the boundary, and hence results in spurious reads and SNPs. I was hoping that, when -L is specified, it would run the local aligner over a larger region and report only the SNPs inside the -L region. Has anybody seen this, or know a way around it?
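To make the setup concrete, here is a minimal Python sketch of the workaround hoped for above: each job gets a padded interval to pass to -L, and only calls inside that job's unpadded core region are kept afterwards. The names `Chunk` and `split_interval` and the 500-base padding are hypothetical, chosen only for illustration. Whether padding actually removes the boundary discrepancy is not confirmed here.

```python
from typing import List, NamedTuple


class Chunk(NamedTuple):
    """One job's region: the core it reports on and the padded span it is given."""
    contig: str
    core_start: int
    core_end: int
    padded_start: int
    padded_end: int

    def interval_arg(self) -> str:
        # String to pass to -L for this job (1-based, inclusive coordinates).
        return f"{self.contig}:{self.padded_start}-{self.padded_end}"

    def in_core(self, pos: int) -> bool:
        # True if a variant position belongs to this job's unpadded region.
        return self.core_start <= pos <= self.core_end


def split_interval(contig: str, start: int, end: int,
                   n_chunks: int, pad: int) -> List[Chunk]:
    """Split [start, end] into at most n_chunks cores, each padded by `pad` bases.

    Padding is clamped to [start, end], so the jobs together never cover
    more than the original single-run interval.
    """
    length = end - start + 1
    size = -(-length // n_chunks)  # ceiling division
    chunks = []
    for core_start in range(start, end + 1, size):
        core_end = min(core_start + size - 1, end)
        chunks.append(Chunk(contig, core_start, core_end,
                            max(start, core_start - pad),
                            min(end, core_end + pad)))
    return chunks


# Hypothetical padding of 500 bases; the right value is an assumption.
chunks = split_interval("Chr1", 1, 1000000, n_chunks=2, pad=500)
for chunk in chunks:
    print(chunk.interval_arg(), "core:", chunk.core_start, chunk.core_end)
```

After each job finishes, filtering its calls with `in_core` means a position near 500000 is reported by exactly one job, the one whose core contains it, so the overlap does not produce duplicate calls.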
Tags: gatk, haplotypecaller, interval, snp calling