Hello,
I used VarScan somatic today for the first time. I provided the "normal" and "tumor" BAM files with a reference, hg19.fasta. I'm looking for some information that I haven't found yet.
1) First, I would like to know whether documentation exists for this tool, such as a publication or a PDF that describes each option.
2) Then, I would like to understand all the p-values clearly.
In the inputs, we can specify 2 p-values:
- p-value: P-value threshold to call a heterozygote
So I guess the null hypothesis is H0: "the individual is homozygous at this site". We test H0 at a risk of 0.1% (the default value).
If the p-value is less than 0.1%, then H0 is rejected: the individual is heterozygous.
Am I right? The explanation shown when running the software gives a default value of 0.9%. Are they equivalent?
- somatic-p-value: P-value threshold to call a somatic site
I guess the null hypothesis is H0: "the variant is not somatic". We test H0 at a risk of 0.0001% (the default value).
If the p-value is less than 0.0001%, then H0 is rejected: the variant is somatic.
In the outputs, there are also 2 p-values:
- somatic-p-value: Significance of tumor read count vs. normal read count
- variant-p-value: Significance of variant read count vs. baseline error rate
The somatic-p-value is most likely the computed p-value of the test of H0: "the variant is not somatic", for which we gave a threshold in the inputs.
I don't really understand the variant-p-value. From checking the output file, I understood that it is the result of the statistical test of H0: "the variant is not germline", but I'm not sure...
I also wonder how those p-values are computed. On the website, I read that Fisher's exact test is used. But for the tumor read count vs. normal read count, we have only one count per sample, so does it compare one value in each condition? How is it possible to use this test?
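To make this question concrete, here is a minimal sketch of how I imagine such a test could be set up: a 2x2 table of variant and reference read counts in the tumor and in the normal sample, tested with a one-sided Fisher's exact test. The table layout, the function name `fisher_one_sided` and the read counts are my own assumptions for illustration; I don't know whether this matches what VarScan computes internally.

```python
from math import comb

def fisher_one_sided(tumor_var, tumor_ref, normal_var, normal_ref):
    """One-sided Fisher's exact test on a 2x2 table of read counts.

    Returns the probability of seeing at least `tumor_var` variant reads
    in the tumor if variant reads were split between the two samples
    purely by chance (hypergeometric distribution).
    """
    total = tumor_var + tumor_ref + normal_var + normal_ref
    variant_total = tumor_var + normal_var   # all variant-supporting reads
    tumor_total = tumor_var + tumor_ref      # all reads from the tumor

    # Sum the hypergeometric probabilities of tables at least as extreme.
    upper = min(variant_total, tumor_total)
    numerator = 0
    for k in range(tumor_var, upper + 1):
        numerator += comb(variant_total, k) * comb(total - variant_total, tumor_total - k)
    return numerator / comb(total, tumor_total)

# Made-up counts: 10 of 30 tumor reads carry the variant, 0 of 30 normal reads do.
p = fisher_one_sided(tumor_var=10, tumor_ref=20, normal_var=0, normal_ref=30)
print(p)
```

With a layout like this there are four counts (two per sample), not one, which would explain how an exact test can be applied at a single position.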
3) Third, I would like more details about the inputs "purity" and "strand filter". I have kept the default values because I don't know exactly what they are.
4) Finally, I have the impression that the false discovery rate is not taken into account... Is there a way to compute adjusted p-values to control it?
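For reference, the kind of adjustment I have in mind is the Benjamini-Hochberg procedure, applied afterwards to the p-values of all tested positions. Below is a minimal sketch with made-up p-values; the function name `benjamini_hochberg` is hypothetical, and whether it is appropriate to apply this to the somatic p-values from the output is exactly what I am unsure about.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for four positions.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

Positions whose adjusted p-value falls below the chosen FDR level (for example 5%) would then be kept.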
Thank you for any help you can give me on any of these points.
Jane