Hi all,
I have a question. I was given exome-seq data for 120 matched tumor/normal pairs (240 samples altogether), sequenced to very high depth (~400x). However, I was only given the SNP calls, not the raw data or mapped BAM files, so I can't go back to the sequence data itself to look at things more closely. The SNPs were called using MuTect, with which I have little experience. I have been trying to read up on it, but I'm not really finding answers to my questions.
In the output VCF files, the FILTER field contains either PASS or mf1 (which is described as filtered out by MuTect v1). Looking at the data, there is a particular position in an interesting gene that has been on our radar for a while for various reasons. At this position, in ~80 of the samples, there is an identified SNP (always somatic, not in the matched normal). Coverage at the position is for the most part >=100x in both the tumor and the normal, and the alt allele fraction in the tumor hovers around 10%. But in all cases the filter flag is 'mf1', suggesting that MuTect filtered it out. By all indications this looks real to me: the mapping quality and other metrics look good. My question is, could this really be a false positive site? If this is a sequencing artifact, how could so many samples have the same SNP? I may be wrong, but I would think that a sequencing artifact would be random and not present at the same location in so many samples. Could this actually be a real site? Any thoughts?
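In case it helps anyone reproduce the count, here is a minimal sketch of how the recurrence at one position could be tallied from the VCFs alone. It assumes one plain-text, tab-delimited VCF per sample with the standard column order (FILTER is the 7th column). The function name `tally_site`, the coordinates, and the two toy records are hypothetical and only there for illustration; they are not taken from the real data.

```python
from collections import Counter


def tally_site(vcf_texts, chrom, pos):
    """Count the FILTER values seen at chrom:pos across per-sample VCF contents.

    vcf_texts: iterable of strings, each holding the full text of one VCF.
    Returns a Counter mapping FILTER value (e.g. 'PASS', 'mf1') to the
    number of records found at that position.
    """
    counts = Counter()
    for text in vcf_texts:
        for line in text.splitlines():
            # Skip blank lines and header lines (## meta lines and #CHROM).
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) < 7:
                continue  # malformed record, ignore
            # Standard VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
            if fields[0] == chrom and int(fields[1]) == pos:
                counts[fields[6]] += 1
    return counts


# Hypothetical toy input: two samples, both carrying the filtered site.
HEADER = "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
example = [
    HEADER + "chr1\t12345\t.\tC\tT\t.\tmf1\t.\n",
    HEADER + "chr1\t999\t.\tG\tA\t.\tPASS\t.\n"
           + "chr1\t12345\t.\tC\tT\t.\tmf1\t.\n",
]

site_counts = tally_site(example, "chr1", 12345)
print(dict(site_counts))
```

With real files, each entry of `vcf_texts` would be the contents of one sample's VCF read from disk. A tally where every record at the position carries the same non-PASS flag is what I am describing above.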