
**gavin.oliver** · 12-05-2011, 03:48 AM

I am now getting all indels up to 29bp in length. I achieved this by increasing the maximum number of permitted gap extensions with bwa aln -e 50.

I will continue to experiment in order to get the larger indels.
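For anyone wanting to check how large the called indels actually are, here is a minimal sketch that tabulates indel sizes from a VCF. It assumes a standard tab-delimited VCF with REF in column 4 and ALT in column 5, and takes the size as the length difference between the two. The function names `indel_sizes` and `max_indel_size` are made up for illustration and are not part of BWA or GATK.

```shell
#!/usr/bin/env bash
# Print the size of every indel allele in a VCF read from stdin, one per line.
# Size is |len(ALT) - len(REF)|; alleles with no length change are skipped.
# Assumption: tab-delimited VCF, REF in column 4, ALT in column 5.
indel_sizes() {
  awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
      n = split($5, alts, ",")             # ALT may list several alleles
      for (i = 1; i <= n; i++) {
        if (alts[i] ~ /^[<.*]/) continue   # skip symbolic or missing alleles
        d = length(alts[i]) - length($4)
        if (d < 0) d = -d
        if (d > 0) print d
      }
    }'
}

# Print the largest indel size seen on stdin (0 when there are no indels).
max_indel_size() {
  indel_sizes | awk 'BEGIN { m = 0 } $1 > m { m = $1 } END { print m }'
}
```

Something like `max_indel_size < sample.vcf` (hypothetical file name) would then report the largest indel in one call set, which makes it easy to compare runs with different -e values.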

**adaptivegenome** · 12-05-2011, 06:30 AM

Do you perform a base recalibration step with GATK before calling indels?

**gavin.oliver** · 12-05-2011, 06:40 AM

Originally posted by genericforms:

Do you perform a base recalibration step with GATK before calling indels?

Indeed I do.

**oiiio** · 12-05-2011, 08:10 AM

I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.

Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the sequencing error rate in the sample may have been too high.

If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.

**gavin.oliver** · 12-05-2011, 08:23 AM

Originally posted by oiiio:

I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.

Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the sequencing error rate in the sample may have been too high.

If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.

I am pretty sure my commands are very standard. Nonetheless, you are welcome to have a look!

for file in *fastq; do bwa aln -e 50 -f ${file%%.fastq}.sai chr17hg19 ${file}; done

for file in *sai; do bwa samse chr17hg19 ${file} ${file%%.sai}.fastq > ${file%%.sai}.sam; done

for file in *sam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/SortSam.jar I=${file} O=${file%%.sam}_sorted.bam SO=coordinate; done

for file in *_sorted.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/MarkDuplicates.jar I=${file} O=${file%%.bam}_ndup.bam M=metric TMP_DIR=./tmp REMOVE_DUPLICATES=TRUE VALIDATION_STRINGENCY=LENIENT; done

for file in *ndup.bam; do java -jar /home/goliver/ngs_software/picard-tools-1.53/AddOrReplaceReadGroups.jar I=${file} O=${file%%.bam}_rg.bam SO=coordinate ID=1 LB=Z PL=illumina PU=Z SM=Z; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/BuildBamIndex.jar I=${file} O=${file}.bai; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T RealignerTargetCreator -R ../ref_chr17.hg19.fa -o ${file%%.bam}.intervals -I ${file}; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -I ${file} -R ../ref_chr17.hg19.fa -T IndelRealigner -o ${file%%.bam}_2.bam -targetIntervals ${file%%.bam}.intervals --known ../GATK/dbsnp_132.b37.vcf; done

for file in *_2.bam; do java -Xmx20g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -R ../ref_chr17.hg19.fa -knownSites ../GATK/dbsnp_132.b37.vcf -I ${file} -T CountCovariates -cov QualityScoreCovariate -cov DinucCovariate -cov ReadGroupCovariate -cov CycleCovariate -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina -nt 4; done

for file in *_2.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -l INFO -R ../ref_chr17.hg19.fa -T TableRecalibration -I ${file} -o ${file%%.bam}.final.bam -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina; done

for file in *final.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T UnifiedGenotyper -glm BOTH -I ${file} -R ../ref_chr17.hg19.fa -o ${file%%.bam}.vcf; done
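Since each loop above picks up the previous step's output by glob and builds the next name with a ${file%%suffix} expansion, it can help to see the whole naming chain for one sample. The following is only a sketch that replays those expansions and prints the names. It runs none of the tools, and the function name `pipeline_outputs` is hypothetical.

```shell
#!/usr/bin/env bash
# Print, in order, the file names the loops above derive from one FASTQ.
# Nothing is executed; this only replays the ${var%%suffix} expansions.
# The function name is hypothetical.
pipeline_outputs() {
  local file=$1
  local base=${file%%.fastq}
  echo "${base}.sai"                                # bwa aln
  echo "${base}.sam"                                # bwa samse
  file="${base}_sorted.bam";       echo "$file"     # SortSam
  file="${file%%.bam}_ndup.bam";   echo "$file"     # MarkDuplicates
  file="${file%%.bam}_rg.bam";     echo "$file"     # AddOrReplaceReadGroups
  echo "${file%%.bam}.intervals"                    # RealignerTargetCreator
  file="${file%%.bam}_2.bam";      echo "$file"     # IndelRealigner
  echo "${file%%.bam}.recal.csv"                    # CountCovariates
  file="${file%%.bam}.final.bam";  echo "$file"     # TableRecalibration
  echo "${file%%.bam}.vcf"                          # UnifiedGenotyper
}
```

Running `pipeline_outputs sample.fastq` (hypothetical sample name) lists every intermediate, which is a quick way to confirm that each glob matches only the files meant for that step.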

**Jon_Keats** · 12-05-2011, 09:00 PM

Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.
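For reference, with paired-end reads the alignment step would change roughly as sketched here: each mate gets its own bwa aln run, and bwa sampe replaces bwa samse. This is a dry run that only prints the commands. The `_1`/`_2` FASTQ naming, the sample name and the function name `paired_end_cmds` are assumptions, and -e 50 is simply carried over from the single-end command posted above.

```shell
#!/usr/bin/env bash
# Dry run: print (do not execute) the bwa commands for one paired-end sample.
# Assumptions: mates are named <sample>_1.fastq and <sample>_2.fastq, and
# the reference index prefix is passed as the first argument.
paired_end_cmds() {
  local ref=$1 sample=$2
  local r1="${sample}_1.fastq" r2="${sample}_2.fastq"
  # One bwa aln run per mate, each writing its own .sai file
  echo "bwa aln -e 50 -f ${sample}_1.sai ${ref} ${r1}"
  echo "bwa aln -e 50 -f ${sample}_2.sai ${ref} ${r2}"
  # bwa sampe pairs the two .sai files with their FASTQ files
  echo "bwa sampe ${ref} ${sample}_1.sai ${sample}_2.sai ${r1} ${r2} > ${sample}.sam"
}
```

Calling `paired_end_cmds chr17hg19 mysample` prints three command lines that can be reviewed before running them by hand.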

**gavin.oliver** · 12-06-2011, 12:52 AM

Originally posted by Jon_Keats:

Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.

This particular dataset is all single end. I am pretty certain the larger indels can still be detected though...

6-99bp indels with BWA/GATK