Old 06-18-2013, 09:43 AM   #1
Location: Mumbai

Join Date: Sep 2011
Posts: 38
Default Mutect Analysis Criteria: Judgement Calls

Hello everybody, I am working with a dataset of cancer tumours sequenced on a HiSeq. I do not have a matched normal, and I have used MuTect to call somatic variants. I have the following questions:

1) How good is MuTect at calling variants when no matched normal is supplied?

2) Is the judgement ("KEEP" or "REJECT") an absolute criterion? On what basis is it decided? Will I lose a lot of good-quality variants if I discard everything MuTect marks as "REJECT" and carry only the "KEEP" variants into my downstream analysis? There is a lot of ambiguity surrounding this, and I would love to hear the community's thoughts on the subject.

Thanks a lot for your 2 cents!
ron128 is offline   Reply With Quote
Old 06-19-2013, 02:29 AM   #2
Junior Member
Location: Milano

Join Date: Oct 2009
Posts: 2

Same question as ron128.

Moreover, I'm running MuTect on an exome (Agilent SureSelect v5), but in this case I have both normal and tumor samples. I used default parameters for everything except --minimum_mutation_allele_fraction 0.10, --min_qscore 20 and --clipping_bias_pvalue_threshold 0.05.

Only 40 somatic variants were marked "KEEP"!

In your experience, is it "normal" to have so few somatic mutations? I know it depends on the type of cancer sample, but I would like a metric for comparison.

I'm also a little confused about how to deal with possible contamination of the normal (germline) sample by tumor cells. I saw the parameter --minimum_normal_allele_fraction, but how should I interpret it? My current 40 somatic variants always have zero coverage for the "somatic allele" in the normal sample. It is as if the ratio tumor_allele_in_control_sample/tumor_allele_in_tumor_sample has to be zero or very close to it, i.e. only very low contamination of the control sample is tolerated. So maybe --minimum_normal_allele_fraction is set to a high value by default?

Thanks in advance!

Last edited by UltimaSeq; 06-19-2013 at 02:32 AM.
UltimaSeq is offline   Reply With Quote
Old 07-25-2013, 06:43 AM   #3
Location: New York

Join Date: Mar 2012
Posts: 35

Same question.
I am running MuTect on my mouse RNA-seq data and I get ~600,000 calls from MuTect, but all of them are REJECT. Has anyone experienced this before?
Is it a problem with my analysis or with MuTect?
Thanks for the help!
himanshu04 is offline   Reply With Quote
Old 08-22-2013, 11:31 PM   #4
Senior Member

Join Date: Jul 2013
Posts: 142

Can anyone tell me what to use as the normal input for MuTect?
I only have cancer data (exome sequencing of prostate cancer cell lines), with no normal or control data.
What can I use for --input:control or normal in MuTect?
Thanks in advance.
jp. is offline   Reply With Quote
Old 06-07-2015, 09:26 PM   #5
Junior Member
Location: Oregon

Join Date: Jan 2014
Posts: 1
Default Mutect defaults

I'm not sure how pertinent this will be considering the age of the thread, but I thought I would reply, since I wasn't able to find any sources of information when I was struggling with this.
I found the MuTect default filters by running MuTect once with the following parameter:
--enable_extended_output \
This option is not enabled by default. The resulting VCF header will list the default thresholds for MuTect. Here's a clip of my VCF header. THIS IS JUST AN EXAMPLE. I tweaked some of these parameters, so you are not viewing the defaults. Make sure you run your own version of MuTect and look at the VCF header to find the defaults. It seems silly to bury them in the VCF header and not post them ANYWHERE else.

##FILTER=<ID=PASS,Description="Accept as a confident somatic mutation">
##MuTect="analysis_type=MuTect ...
downsample_to_coverage=1000 enable_experimental_downsampling=false baq=OFF baqGapOpenPenalty=40.0 performanceLog=null useOriginalQualities=false BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 defaultBaseQualities=-1 validation_strictness=SILENT remove_program_records=false keep_program_records=false unsafe=null num_threads=1 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false logging_level=INFO log_to_file=null help=false noop=false enable_extended_output=true artifact_detection_mode=false tumor_sample_name=1002_tumor_52 bam_tumor_sample_name=null normal_sample_name=1002_Normal_53 force_output=false force_alleles=false only_passing_calls=false initial_tumor_lod=4.0 tumor_lod=6.3 fraction_contamination=0.02 minimum_mutation_cell_fraction=0.0 normal_lod=2.2 normal_artifact_lod=1.0 strand_artifact_lod=2.0 strand_artifact_power_threshold=0.9 dbsnp_normal_lod=5.5 somatic_classification_normal_power_threshold=0.95 minimum_normal_allele_fraction=0.0 tumor_f_pretest=0.0050 min_qscore=5 gap_events_threshold=3 heavily_clipped_read_fraction=0.3 clipping_bias_pvalue_threshold=0.05 fraction_mapq0_threshold=0.5 pir_median_threshold=10.0 pir_mad_threshold=3.0 required_maximum_alt_allele_mapping_quality_score=20 max_alt_alleles_in_normal_count=2 max_alt_alleles_in_normal_qscore_sum=20 max_alt_allele_in_normal_fraction=0.03 power_constant_qscore=30 absolute_copy_number_data=null power_constant_af=0.30000001192092896
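
For anyone who would rather not read that header by eye, here is a minimal Python sketch that turns a ##MuTect="..." header line into a dictionary of settings. The function name parse_mutect_header is my own, and the short header string in the demo is a trimmed stand-in built from values in the clip above, not full MuTect output. It assumes the settings are space-separated key=value pairs, as in the clip; any value that itself contains spaces would be split incorrectly.

```python
def parse_mutect_header(line):
    """Return a dict of key=value settings from a ##MuTect="..." VCF header line."""
    prefix = "##MuTect="
    if not line.startswith(prefix):
        raise ValueError("not a ##MuTect header line")
    # Drop the prefix and the surrounding double quotes.
    body = line[len(prefix):].strip().strip('"')
    settings = {}
    for token in body.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens without '=', such as an elided "..."
            settings[key] = value
    return settings


# Trimmed stand-in for a real header line (values copied from the clip above).
header = (
    '##MuTect="analysis_type=MuTect ... tumor_lod=6.3 normal_lod=2.2 '
    'max_alt_alleles_in_normal_count=2 max_alt_allele_in_normal_fraction=0.03 '
    'gap_events_threshold=3"'
)

settings = parse_mutect_header(header)
for name in ("tumor_lod", "normal_lod", "gap_events_threshold"):
    print(name, settings[name])
```

All values come back as strings, so convert them with float() or int() before comparing them with numbers from callstats.txt.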

Figuring out why a single mutation failed the MuTect filters is therefore a complicated process. Here's how I do it.
1) Check your MuTect callstats.txt file and find your mutation. The "failure reason" column should give you a reason, e.g. normal_lod, f_star_tumor_lod, alt_allele_in_normal, etc.
2) Go to this website:
This site connects each failure reason to the associated column name in the callstats.txt output.
For example, if the failure reason is "alt_allele_in_normal", go to the "n_alt_count" or "normal_f" column in your extended-output callstats.txt file to find the value.
3) Look at the VCF header to find the default threshold:
(from above)
4) Rerun MuTect with these thresholds adjusted so that your mutation is included, by adding parameters to your command such as:
--max_alt_alleles_in_normal_count=5 \

You could also filter callstats.txt manually if you aren't concerned about getting a corrected VCF and the other output files.
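
Step 1 and the manual filtering can be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes callstats.txt is tab-delimited with "#" comment lines before the column header, and the column names contig, position, judgement and failure_reasons are my assumption about the file layout, so check them against your own file and pass different names if needed. The rows in the demo are made up for illustration and are not real MuTect output.

```python
import csv
from collections import Counter


def read_callstats(lines):
    """Yield one dict per call from tab-delimited call-stats lines, skipping '#' comments."""
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    yield from csv.DictReader(data, delimiter="\t")


def tally_failure_reasons(rows, judgement_col="judgement", reason_col="failure_reasons"):
    """Count how often each failure reason appears among REJECT rows."""
    counts = Counter()
    for row in rows:
        if row.get(judgement_col) != "REJECT":
            continue
        # Several reasons may be listed in one field, separated by commas.
        for reason in (row.get(reason_col) or "").split(","):
            if reason.strip():
                counts[reason.strip()] += 1
    return counts


# Made-up rows for illustration only (column names are assumptions).
example = [
    "## made-up example rows, not real MuTect output\n",
    "contig\tposition\tn_alt_count\tnormal_f\tfailure_reasons\tjudgement\n",
    "chr1\t100\t0\t0.0\t\tKEEP\n",
    "chr1\t200\t3\t0.05\talt_allele_in_normal\tREJECT\n",
    "chr2\t300\t4\t0.08\talt_allele_in_normal,normal_lod\tREJECT\n",
]

rows = list(read_callstats(example))
counts = tally_failure_reasons(rows)
print(counts.most_common())

# Look up one site of interest and show why it was rejected.
site = [r for r in rows if r["contig"] == "chr1" and r["position"] == "200"]
print(site[0]["failure_reasons"], site[0]["n_alt_count"], site[0]["normal_f"])
```

With a real file, replace the example list with an open file handle, e.g. `with open("callstats.txt") as fh: rows = list(read_callstats(fh))`. The tally is also a quick way to see which filter dominates when nearly everything comes back as REJECT.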

It's a wonder that MuTect finds anything interesting at all given the stringency of its filters; it really depends on the purity of your paired normal. About 90% of the time, MuTect's defaults filter out my primary somatic mutation in most of the cancer types I study. This is most likely because of the heterogeneity of the tumor and because the paired normal is often contaminated with tumor. However, even with the alt allelic fraction threshold relaxed to 20% and the count to about 6, it will still sometimes miss mutations that are near indels. There is a gap_events_threshold parameter in MuTect, but if you've followed the GATK best practices protocol and realigned around indels, those gaps occur pretty frequently because you've realigned around them, and I've noticed a marked decrease in sensitivity of mutation detection around these areas. Relaxing the gap_events_threshold default, however, will explode the number of false positives you get, so be careful which parameters you tweak and make sure you have a reason to tweak them.

Broad has yet to fix its deprecated indel detector, but I predict that a future best practices pipeline will start with an indel caller whose output is then used by the SNP detector (MuTect) to discover SNPs. Knowing the location of the indels might fix this decreased sensitivity around indels.

Good luck!

Last edited by patterja; 06-07-2015 at 09:27 PM. Reason: More detailed description
patterja is offline   Reply With Quote
Old 06-08-2015, 01:23 PM   #6
Len Trigg
Registered Vendor
Location: New Zealand

Join Date: Jun 2011
Posts: 29

I just noticed your comment about problems with indels when using MuTect. You might want to try the RTG somatic caller (part of RTG Core), which uses the same haplotype calling engine as the regular RTG variant callers with the addition of Bayesian somatic mutation modelling, and so automatically handles SNPs, indels, and other complex calls. You do need matched tumor/normal samples, however.

The somatic caller in the current release of RTG Core (3.4.5) also outputs variants in putative LOH regions and gain-of-reference calls, so you may want to filter these out for normal somatic small variant detection. Both AVR and the somatic score field are good VCF attributes with which to tailor your precision/recall tradeoff. (We are currently doing additional work on our somatic caller, so expect further improvements in upcoming releases.)

Len Trigg, Ph.D.
Real Time Genomics
Len Trigg is offline   Reply With Quote

exome analysis, gatk, mutect, ngs analysis, snp analysis
