SEQanswers

Old 06-14-2011, 03:36 PM   #1
seq_GA
Senior Member
 
Location: Asiana

Join Date: Feb 2009
Posts: 124
GATK multi-sample realignment and recalibration

Hi,
I have used GATK for single-sample realignment followed by recalibration. I want to try multi-sample realignment and recalibration as described on the best practices page http://www.broadinstitute.org/gsa/wi...th_the_GATK_v2
The example there doesn't show the command line.
Has anyone used this feature? If so, could you please share the GATK command line for it? Thanks.
Old 06-14-2011, 03:47 PM   #2
fpepin
Member
 
Location: Berkeley

Join Date: Feb 2011
Posts: 30

Is this what you have in mind? http://www.broadinstitute.org/gsa/wi..._around_indels

Score recalibration can be found at http://www.broadinstitute.org/gsa/wi..._recalibration

I could offer more details, but they're simply saying it better (and more completely) than I could.
Old 06-14-2011, 07:09 PM   #3
seq_GA
Senior Member
 
Location: Asiana

Join Date: Feb 2009
Posts: 124

Thanks for your response. Right now I am using the following command lines to do realignment and recalibration sample by sample.

Code:
java -Xmx4g -jar /tools/GenomeAnalysisTK-1.0.4013/GenomeAnalysisTK.jar -T RealignerTargetCreator -I test_PCR_removed_sorted.bam -R /Genome/hg19/hg19.fa -o output.intervals 

echo "GATK run realigner"
java -Xmx4g -jar /tools/GenomeAnalysisTK-1.0.4013/GenomeAnalysisTK.jar -T IndelRealigner -U -S Silent -I test_PCR_removed_sorted.bam -R /Genome/hg19/hg19.fa -targetIntervals output.intervals --output test_reAligned.bam

echo "GATK Count covariates and generate csv"
java -Xmx4g -jar /tools/GenomeAnalysisTK-1.0.4013/GenomeAnalysisTK.jar -R /Genome/hg19/hg19.fa --DBSNP /GATK/latest_GATK/resources/dbsnp_129_b37.rod -I test_reAligned_sorted.bam --max_reads_at_locus 20000 --default_platform illumina -T CountCovariates -cov ReadGroupCovariate -cov CycleCovariate -cov DinucCovariate -recalFile test_recal.csv

echo " GATK re-calibration"
java -Xmx4g -jar /tools/GenomeAnalysisTK-1.0.4013/GenomeAnalysisTK.jar -R /Genome/hg19/hg19.fa -I test_reAligned_sorted.bam -T TableRecalibration --default_platform illumina -outputBam test_reAligned_reCal.bam -recalFile test_recal.csv
But I am interested in the "Best: multi-sample realignment with known sites and recalibration" approach described here: http://www.broadinstitute.org/gsa/wi..._recalibration

How do I implement the following?
Code:
for each sample
    lanes.bam <- merged lane.bam's for sample
    dedup.bam <- MarkDuplicates(lanes.bam)

samples.bam <- merged dedup.bam's for all samples
realigned.bam <- realign(samples.bam)
recal.bam <- recal(realigned.bam)
Please let me know whether my interpretation below is correct.
1. Merge all lane BAM files for each sample.
2. Mark duplicates in each sample BAM.
3. Merge the deduplicated BAMs of all samples.
4. Run realignment on the merged BAM.
5. Run recalibration on the realigned BAM.
Thanks.
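The five steps listed above can be sketched as a dry-run shell script that only prints the commands. This is a rough outline, not a tested pipeline: the GATK jar and reference paths are the ones from the commands earlier in this post, while `PICARD_DIR`, the two-lanes-per-sample naming (`lane1`, `lane2`) and all BAM file names are hypothetical placeholders. Known-sites options and BAM indexing are left out.

```shell
#!/bin/sh
# Dry run: print the commands for the five steps instead of executing them.
# GATK and REF are taken from the post; PICARD_DIR and every BAM file name
# are hypothetical placeholders.
GATK=/tools/GenomeAnalysisTK-1.0.4013/GenomeAnalysisTK.jar
REF=/Genome/hg19/hg19.fa
PICARD_DIR="${PICARD_DIR:-/tools/picard}"

plan() {
    merged_inputs=""
    for s in "$@"; do
        # Steps 1-2: merge the lanes of one sample, then mark duplicates.
        echo "java -jar $PICARD_DIR/MergeSamFiles.jar I=$s.lane1.bam I=$s.lane2.bam O=$s.lanes.bam"
        echo "java -jar $PICARD_DIR/MarkDuplicates.jar I=$s.lanes.bam O=$s.dedup.bam M=$s.metrics"
        merged_inputs="$merged_inputs I=$s.dedup.bam"
    done
    # Step 3: merge the deduplicated BAMs of all samples.
    echo "java -jar $PICARD_DIR/MergeSamFiles.jar$merged_inputs O=samples.bam"
    # Step 4: realign across all samples at once.
    echo "java -jar $GATK -T RealignerTargetCreator -I samples.bam -R $REF -o samples.intervals"
    echo "java -jar $GATK -T IndelRealigner -I samples.bam -R $REF -targetIntervals samples.intervals --output realigned.bam"
    # Step 5: CountCovariates and TableRecalibration would follow on
    # realigned.bam with the same options as the single-sample commands.
}

plan sampleA sampleB
```

Reviewing the printed commands before running anything makes it easy to check that every sample ends up in the merged BAM.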
Old 06-14-2011, 09:49 PM   #4
fpepin
Member
 
Location: Berkeley

Join Date: Feb 2011
Posts: 30

I'm not an expert, but that does sound reasonable (and very close to what I have). I also call the indel discovery tool before the realignment to get additional regions to focus on.

I use Picard for the duplicate removal:
Code:
java -Xmx8g -jar MarkDuplicates.jar I=bamFile O=bamDupRemFile \
M=duplicateMetricsFile VALIDATION_STRINGENCY=SILENT ASSUME_SORTED=true \
REMOVE_DUPLICATES=true
Merging can be done with samtools:
Code:
samtools merge mergedBam bam1 bam2 bam3
Unfortunately, you'll need to redo the header, as samtools merge only uses the header from the first BAM file and will miss the @RG lines from the others. Hopefully I'm behind on the news and someone can point to an easy tool for this; I just have a bit of Perl code that does it. I can post it here if you want.
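A quick way to see whether a merged file lost read groups is to count the @RG lines in its header. The helper below is a small sketch; the name `count_rg` and the read group IDs in the demonstration header are made up for illustration.

```shell
#!/bin/sh
# Count @RG lines in SAM header text read from stdin.
count_rg() {
    grep -c '^@RG'
}

# Demonstration on a hand-written header with two read groups (prints 2).
printf '@HD\tVN:1.0\tSO:coordinate\n@RG\tID:lane1\tSM:sampleA\n@RG\tID:lane2\tSM:sampleB\n' | count_rg

# On a real file: samtools view -H mergedBam | count_rg
```

If the count for the merged BAM is lower than the sum of the counts for the inputs, the header needs fixing before realignment and recalibration.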
Old 06-14-2011, 11:56 PM   #5
mard
Member
 
Location: Melbourne

Join Date: Jan 2010
Posts: 21

Quote:
Originally Posted by fpepin

Unfortunately, you'll need to redo the header as it will only use the header from the first bam file and it will miss the @RG from the others. Hopefully I'm behind in the news and someone can mention an easy tool to do it, I just have a bit of perl code to do it. I can post it here if you want.
Picard's MergeSamFiles will merge multiple BAM files and add all @RGs to the header of the merged BAM.
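For reference, a MergeSamFiles call takes one `I=` argument per input and a single `O=` output. The sketch below builds such a command for any number of BAMs and prints it. `PICARD_DIR`, the helper name `merge_cmd` and the file names are assumptions for illustration, and file names containing spaces are not handled.

```shell
#!/bin/sh
# Build a Picard MergeSamFiles command for any number of input BAMs.
# PICARD_DIR is an assumed location for the Picard jars; adjust it locally.
PICARD_DIR="${PICARD_DIR:-/tools/picard}"

merge_cmd() {
    out="$1"; shift
    cmd="java -Xmx4g -jar $PICARD_DIR/MergeSamFiles.jar"
    for bam in "$@"; do
        cmd="$cmd I=$bam"
    done
    # Coordinate-sorted output, lenient validation as in the MarkDuplicates call.
    echo "$cmd O=$out SORT_ORDER=coordinate VALIDATION_STRINGENCY=SILENT"
}

# Print the command for inspection; pipe the output to sh to run it.
merge_cmd samples.bam sampleA.dedup.bam sampleB.dedup.bam
```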
Old 06-15-2011, 01:02 AM   #6
seq_GA
Senior Member
 
Location: Asiana

Join Date: Feb 2009
Posts: 124

Thanks Mard for the info.