SEQanswers

Old 11-04-2015, 12:20 PM   #1
cmccabe
Senior Member
 
Location: chicago

Join Date: Jul 2012
Posts: 354
GATK command options

I just want to be sure that I am interpreting the below correctly:

In this GATK command, -L 20 restricts the run to chromosome 20 only; if all chromosomes are to be used, -L can be removed.

The -known option is only used if a file of known indels exists for your organism (maybe hg19 has one, but I have to search). Is it a text file?

Code:
# -L 20 restricts the run to chromosome 20 only; -known is optional
java -jar GenomeAnalysisTK.jar \
    -T RealignerTargetCreator \
    -R reference.fa \
    -I dedup_reads.bam \
    -L 20 \
    -known gold_indels.vcf \
    -o realignment_targets.list

Expected Result

This creates a file called realignment_targets.list containing the list of intervals that the program identified as needing realignment within our target, chromosome 20.

The list of known indel sites (gold_indels.vcf) is used as targets for realignment. Only use it if there is such a list for your organism.
In the GATK command below:

Code:
java -jar GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -R reference.fa \
    -I realigned_reads.bam \
    -L 20 \
    -knownSites dbsnp.vcf \
    -knownSites gold_indels.vcf \
    -o recal_data.table
the -L 20 and -knownSites options do the same as -L 20 and -known in the previous command. Thank you.
Old 11-04-2015, 01:34 PM   #2
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 505

"gold_indels.vcf" is a VCF; see here for the latest specification.
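Since the question was whether this is a text file: an uncompressed VCF is plain, tab-delimited text, so ordinary shell tools can read it. A minimal sketch, assuming a tiny made-up VCF written to a temporary file (the two records and the `count_records` helper are hypothetical, for illustration only; a real `gold_indels.vcf` would be inspected the same way, and a gzipped copy would need decompressing first):

```shell
# Hypothetical two-record VCF written to a temp file, purely for illustration.
vcf="$(mktemp)"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n20\t1000\t.\tAT\tA\t.\tPASS\t.\n20\t2000\t.\tG\tGCC\t.\tPASS\t.\n' > "$vcf"

# Header lines start with '#'; every other line is one variant record.
count_records() {
    grep -vc '^#' "$1"
}

head -n 3 "$vcf"        # readable header lines plus the first record
count_records "$vcf"    # number of variant records in the file
```

Replacing `"$vcf"` with the path to your own known-indels file should show its header and record count in the same way.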
Old 11-05-2015, 03:38 AM   #3
WhatsOEver
Senior Member
 
Location: Germany

Join Date: Apr 2012
Posts: 215

Concerning your question: No, "-L" restricts the analysis to a certain region (chr20 in your example); "-known" specifies a file containing known variants, and these files contain variant info for the whole genome.

The Broad Institute offers a resource bundle for hg19 which also includes the known variants (https://www.broadinstitute.org/gatk/download/)
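To make the effect of -L concrete, here is a rough sketch that only prints the RealignerTargetCreator command from the first post with and without an interval. The `rtc_cmd` helper is hypothetical (not part of GATK), and the file names are the placeholders from the original command; nothing is executed against GATK here.

```shell
# Hypothetical helper: prints (does not run) a RealignerTargetCreator command.
# Pass an interval such as "20" to add -L; pass nothing to cover the whole genome.
rtc_cmd() {
    interval_arg=""
    if [ -n "${1:-}" ]; then
        interval_arg=" -L $1"
    fi
    echo "java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R reference.fa -I dedup_reads.bam${interval_arg} -known gold_indels.vcf -o realignment_targets.list"
}

rtc_cmd 20   # restricted to chromosome 20
rtc_cmd      # no -L: every contig in the reference is processed
```

In both cases the same genome-wide known-indels file is passed; only the region being analysed changes.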
Old 11-05-2015, 02:11 PM   #4
cmccabe
Senior Member
 
Location: chicago

Join Date: Jul 2012
Posts: 354

Thank you.