Dear all,
I am new to GATK and am trying it out to look for indels in Illumina next-generation sequencing data generated from custom pull-down experiments. I have two sets of files, one "disease" file and one "no-disease" file (from the same individual). I know from other experiments that there is a 4 bp duplication in the disease sample, and I also found it using SAMtools pileup.
I wanted to try GATK on the same set of samples and call indels using SomaticIndelDetector.
I first tried without local realignment:
java -Xmx2g -jar GenomeAnalysisTK.jar \
-R ref.fasta \
-T SomaticIndelDetector \
-o indels.vcf \
-verbose indels.txt \
-I:normal normal.bam \
-I:tumor tumor.bam
The output VCF does not contain my duplication. I also tried single-file mode (without supplying a normal file).
I then tried realigning with both RealignerTargetCreator and IndelRealigner. This region does appear in the RealignerTargetCreator output, as it is a bit complex.
I ran this for my disease file and did the same for the normal file:
java -Xmx2g -jar GenomeAnalysisTK.jar \
-I input.bam \
-R ref.fasta \
-T RealignerTargetCreator \
-o forIndelRealigner.intervals
Then I ran IndelRealigner on both files, using the newly created forIndelRealigner.intervals:
java -Xmx4g -jar GenomeAnalysisTK.jar \
-I input.bam \
-R ref.fasta \
-T IndelRealigner \
-targetIntervals forIndelRealigner.intervals \
-o realignedBam.bam
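The two realignment steps above, applied to each sample in turn, can be sketched as a small loop. This is only a sketch under assumptions: the sample names `tumor` and `normal`, the derived output file names, and the `RUN` variable are placeholders introduced here for illustration. By default it only prints the commands (dry run) rather than executing them.

```shell
#!/bin/sh
# Sketch: print (or run) both realignment steps for each sample.
# Assumed placeholders: tumor.bam / normal.bam and the derived output names.
GATK_JAR="GenomeAnalysisTK.jar"
REF="ref.fasta"
# Dry run by default: commands are echoed. Run with RUN= (empty) to execute them.
RUN="${RUN-echo}"

realign() {
    sample="$1"
    # Step 1: find the intervals that may need local realignment
    $RUN java -Xmx2g -jar "$GATK_JAR" -T RealignerTargetCreator \
        -R "$REF" -I "$sample.bam" -o "$sample.intervals"
    # Step 2: realign the reads over those intervals
    $RUN java -Xmx4g -jar "$GATK_JAR" -T IndelRealigner \
        -R "$REF" -I "$sample.bam" \
        -targetIntervals "$sample.intervals" -o "$sample.realigned.bam"
}

for s in tumor normal; do
    realign "$s"
done
```

Keeping one intervals file per sample makes it easy to confirm that each IndelRealigner run is given the intervals produced from the same BAM.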
With the new realigned BAMs I re-ran SomaticIndelDetector and still did not detect the duplication; this region is not called. I also tried single-file mode (without supplying a normal file), again with no calls.
I have checked that this region is not discarded (for containing too few or too many reads).
Could you please help or advise on better settings?
Many thanks