SEQanswers

11-25-2013, 11:50 AM   #1
frankyue50
Member
 
Location: CA

Join Date: Nov 2008
Posts: 34
GATK recalibration confused, one step or two step?

Based on my reading, the previous approach was:
Quote:
java -Xmx4g -jar GenomeAnalysisTK.jar -l INFO -R hg19.fa --DBSNP dbsnp132.txt -I input.marked.realigned.fixed.bam -T CountCovariates -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate -recalFile input.recal_data.csv

java -Xmx4g -jar GenomeAnalysisTK.jar -l INFO -R hg19.fa -I input.marked.realigned.fixed.bam -T TableRecalibration --out input.marked.realigned.fixed.recal.bam -recalFile input.recal_data.csv

Then someone suggested (http://seqanswers.com/forums/showthr...14038&page=5):
Quote:
java -Xmx4g -jar GenomeAnalysisTK.jar -l INFO -R hg19.fa -knownSites dbsnp132.txt -I input.marked.realigned.fixed.bam -T BaseRecalibrator -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate --out input.recal_data.csv

From the GATK documentation:
java -Xmx4g -jar GenomeAnalysisTK.jar \
-T BaseRecalibrator \
-I my_reads.bam \
-R resources/Homo_sapiens_assembly18.fasta \
-knownSites bundle/hg18/dbsnp_132.hg18.vcf \
-knownSites another/optional/setOfSitesToMask.vcf \
-o recal_data.grp
with this additional step needed only when using several BAM files, if I understand the documentation correctly ("PrintReads can dynamically merge the contents of multiple input BAM files, resulting in merged output sorted in coordinate order"):
Quote:
java -Xmx2g -jar GenomeAnalysisTK.jar \
-R ref.fasta \
-T PrintReads \
-o output.bam \
-I input1.bam \
-I input2.bam \
--read_filter MappingQualityZero

What should I do?
11-25-2013, 12:54 PM   #2
frankyue50
Member
 
Location: CA

Join Date: Nov 2008
Posts: 34

Anyway, here is what I did:

java -Xmx4g -jar GenomeAnalysisTK.jar -l INFO -R hg19.fa -knownSites 00-All.vcf -I input.nodup.realigned.fixed.bam -T BaseRecalibrator -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate --out input.csv

I will run PrintReads now.
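
For reference, here is a minimal sketch of how the two steps are commonly chained in GATK 2.x, assuming the table written by BaseRecalibrator is handed to PrintReads through the -BQSR argument. The names recal.grp and input.recal.bam are hypothetical, the other file names come from the command above, and the script only prints the commands instead of running them:

```shell
#!/bin/sh
# Dry-run sketch of two-step base quality score recalibration (GATK 2.x).
# Assumption: the recalibration table is applied with PrintReads -BQSR.

REF=hg19.fa
KNOWN=00-All.vcf
IN_BAM=input.nodup.realigned.fixed.bam
TABLE=recal.grp            # hypothetical recalibration table name
OUT_BAM=input.recal.bam    # hypothetical recalibrated BAM name

# Step 1: build the recalibration table from the reads and known sites.
step1() {
    echo java -Xmx4g -jar GenomeAnalysisTK.jar -T BaseRecalibrator \
        -R "$REF" -I "$IN_BAM" -knownSites "$KNOWN" -o "$TABLE"
}

# Step 2: rewrite the reads, applying the table from step 1.
step2() {
    echo java -Xmx4g -jar GenomeAnalysisTK.jar -T PrintReads \
        -R "$REF" -I "$IN_BAM" -BQSR "$TABLE" -o "$OUT_BAM"
}

step1
step2
```

Removing the echo in each function would run the commands for real. Under this assumption, recalibration stays a two-step process: BaseRecalibrator takes the place of CountCovariates, and PrintReads with -BQSR takes the place of TableRecalibration.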
11-25-2013, 02:25 PM   #3
vdauwera
Member
 
Location: Boston, MA

Join Date: Apr 2012
Posts: 42

Hi there,

These tools have evolved slightly over the past few months, so make sure to read the latest documentation here: http://www.broadinstitute.org/gatk/guide/best-practices

Also, look at the presentations from the latest GATK workshop here:
http://www.broadinstitute.org/gatk/guide/events