**Jane M** · 05-22-2012, 12:13 AM

Dan, thank you a lot for your detailed answer.
1) I would like to come back to the Min-Reads2 and Min-Strands2 parameters.

Originally posted by dkoboldt View Post

Min-Reads2 and Min-Strands2
The min-reads2 parameter specifies the minimum number of variant-supporting reads required for VarScan to call a variant. This is a useful parameter for improving the specificity of mutation calling; internally, we require at least 4 supporting reads (in most cases) to call a variant.

I understand what they are used for, but I'm wondering what happens in some cases, for example:

50 (reference in normal sample) 50 (variant in normal sample) 100 (reference in tumor sample) 0 (variant in tumoral sample)
0 (reference in normal sample) 100 (variant in normal sample) 100 (reference in tumor sample) 0 (variant in tumoral sample)

Since there is no variant in the tumor samples, if we use Min-Reads2>0, the position won't be considered as "LOH", whereas it is, am I right?

2) Then, I would like to know if VarScan could be used to compare 2 tumor samples. For several patients, I have a normal sample, an untreated sample and a treated sample.
I want to study the effect of the treatment.
I am comparing normal/untreated (1), normal/treated (2) and finally (1) vs (2). But it would be much faster to compare untreated/treated directly. I am wondering if there is anything that prevents doing so in VarScan. What do you think?

3) Finally, for the issue concerning the number of reads in the .indel file, here are 2 positions:

Varscan output:

Code:

chrom	position	ref	var	normal_reads1	normal_reads2	normal_var_freq	normal_gt	tumor_reads1	tumor_reads2	tumor_var_freq	tumor_gt	somatic_status	variant_p_value	somatic_p_value	tumor_reads1_plus	tumor_reads1_minus	tumor_reads2_plus	tumor_reads2_minus
chr1	20098	C	-AG	10	0	0%	C	6	2	25%	*/-AG	Somatic	1.0	0.1830065359477151	7	1	1	1
chr1	3418647	G	-C	16	0	0%	G	7	5	41,67%	*/-C	Somatic	1.0	0.008058608058608	4	6	4	1

Mpileup output:

Code:

$ grep -w 20098 fibros.mpileup
chr1	20098	c	10	.$..,,,.,.,	EJJIH?EEDD

$ grep -w 20098 tumor.mpileup
chr1	20098	c	8	....-2AG.,-2ag..	JJIEJEHF

Thank you for your help,
Jane
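
A note on the read counts in point 3: in mpileup output an indel is written right after the read base that precedes it, so a read carrying the deletion still shows `.` or `,` at position 20098. The sketch below tallies the two base strings above by strand. It is a minimal Python parser written for illustration only, not VarScan code, and `count_pileup` is a hypothetical helper name.

```python
import re

def count_pileup(bases):
    """Tally reference matches and indel-carrying reads per strand
    from the base column of one mpileup line."""
    counts = {"ref_plus": 0, "ref_minus": 0, "indel_plus": 0, "indel_minus": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":            # read start; the next character is a mapping quality
            i += 2
        elif c == ".":          # reference match, forward strand
            counts["ref_plus"] += 1
            i += 1
        elif c == ",":          # reference match, reverse strand
            counts["ref_minus"] += 1
            i += 1
        elif c in "+-":         # indel: sign, length digits, then the sequence
            digits = re.match(r"\d+", bases[i + 1:]).group()
            start = i + 1 + len(digits)
            seq = bases[start:start + int(digits)]
            # Uppercase sequence means forward strand, lowercase means reverse.
            counts["indel_plus" if seq.isupper() else "indel_minus"] += 1
            i = start + int(digits)
        else:                   # '$' (read end), mismatches, '*': not tallied here
            i += 1
    return counts

normal = count_pileup(".$..,,,.,.,")
tumor = count_pileup("....-2AG.,-2ag..")
print(normal)
print(tumor)
```

On the tumor string this gives 7 forward and 1 reverse reference bases, plus one deletion read on each strand. That would be consistent with tumor_reads1_plus/minus = 7/1 and tumor_reads2_plus/minus = 1/1 in the VarScan row, with tumor_reads1 = 6 being the 8 reference bases minus the 2 deletion reads. Whether VarScan actually derives the columns this way is an assumption based on these two lines only.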

**cdry7ue** · 06-11-2012, 03:18 PM

Dan,
I was wondering how I could get each position specific hypothesis outputted from
pileup2snp, irrespective of quality,%variant etc. So absolutely no filtering of any kind.

I tried
--variant 1
--validation 1
--variant 0

etc but I still get filtered output only.

Also, is there a way for the p-value not to come out as 0.98 by default? I worked around that
problem with the hack of setting --p-value 0.9779, but is there an official way?

Thanks

-Ashish

**shyam_la** · 06-18-2012, 09:52 AM

Hi Jane,

I am new to bioinformatics. In fact, I'm a biologist.
Since you appear to have dissected VarScan over months, what is your opinion of it now? Do you think it fares well? I have used MuTect (beta) for my mutation calls on tumor-normal paired exomes and it seems to work well.
Have you used it?
Which one do you think is more reliable, from your experience?

I am confused about the VarScan output too, particularly the somatic p-value column, which ranges from about ten orders of magnitude below 1 all the way up to 1 for sites that have been called somatic.
Even after manually removing all the sites with somatic p > .05 from the output, VarScan2 still outputs 25% more sites than MuTect. I don't know which one to choose, if it is possible to say one is better than the other at all.

Thanks.

**Jane M** · 06-18-2012, 10:39 AM

Hello,

I haven't used MuTect, so I cannot compare. I would like to hear about your experience with it. Could you give me the range of the number of mutations that you have detected with MuTect? Did you check them with classical sequencing?

I have been using VarScan for a few months, but only VarScan2. I have detected somatic mutations and LOH, but we haven't verified them by sequencing yet.
I have added some filters to reduce my list of variants, as the biologists wanted. To increase reliability, we can tune the different thresholds (and it takes time), but it would be great to make a cautious choice based on a solid argument. I am rather satisfied with the results now. Overall, the tool is good, apart from some potential issues that I reported on this website.

Instead of removing all the sites with somatic p > .05, I would advise you to take a look at the adjusted p-values. You will remove many more variants and it is more logical to use them. Here again, the threshold will be an "arbitrary" choice...

When a site is "somatic", the somatic p-value shouldn't be 1, it's weird... By the way, did you read the post of dkoboldt regarding the p-values?
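
For reference, the somatic p-values in the two indel rows quoted earlier in this thread can be reproduced with a one-tailed Fisher's exact test on the four read counts (normal_reads1, normal_reads2, tumor_reads1, tumor_reads2). The sketch below is a plain-Python version written for illustration, under the assumption that this is how the column is computed. `fisher_right_tail` is a hypothetical name and this is not VarScan's own code.

```python
from math import comb

def fisher_right_tail(normal_ref, normal_var, tumor_ref, tumor_var):
    """One-tailed Fisher's exact test: probability of seeing at least
    `tumor_var` variant reads in the tumor, given the table margins."""
    total = normal_ref + normal_var + tumor_ref + tumor_var
    var_total = normal_var + tumor_var
    tumor_total = tumor_ref + tumor_var
    normal_total = total - tumor_total
    numerator = 0
    for k in range(tumor_var, min(var_total, tumor_total) + 1):
        # k variant reads fall in the tumor, the rest in the normal
        numerator += comb(tumor_total, k) * comb(normal_total, var_total - k)
    return numerator / comb(total, var_total)

# Read counts taken from the two rows posted above.
p_20098 = fisher_right_tail(10, 0, 6, 2)
p_3418647 = fisher_right_tail(16, 0, 7, 5)
print(p_20098, p_3418647)
```

Up to floating-point rounding, the two results agree with the reported values 0.1830065359477151 and 0.008058608058608, which suggests that a p-value well above 0.05 on a "Somatic" line is simply what the test returns for a handful of supporting reads.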

**shyam_la** · 06-18-2012, 01:37 PM

Hi,

Well, I started on this project only 3 weeks back and learnt everything from scratch. Right now I have only 1 sample; more raw reads will be coming in soon. The mutation rate was ~100 per megabase on this sample, which, even though high, is consistent with this patient's medical history.

I trust MuTect with its calls now, because the SNPs it gave me were over 95% C > T and G > A, which is expected for this type of tumor, and there were also stop codons in genes that were expected to have them in this type of tumor. But no, they are not verified by classical sequencing yet. I will do that when there are many more samples.

The only reason I was unhappy was that I had to downgrade to older reference databases (dbSNP132 and the fasta file available on the MuTect ftp) to get it to work in the first place, and so I was searching for alternatives. I tested SomaticSniper yesterday and VarScan2 today. I think both give too many false positives and require too much tweaking (for a biologist).
But I managed to get MuTect to work with GRCh37.67 and dbSNP135 today (I just like to use the most updated stuff :P ). So my search ends here. I am sticking with MuTect.
In my opinion, you should give it a shot: it's much easier to use than VarScan, runs faster (-Xmx8g for both) and definitely makes great calls. But it doesn't provide info about LOH, if that is important to you.

Thanks.

PS: MuTect doesn't run with Java 7. I just discovered it today. You will have to use JRE 6.

**Jane M** · 06-19-2012, 01:55 AM

Thank you for the information! I will try MuTect, at least to compare the results with the ones of VarScan2.

**david.tamborero** · 06-22-2012, 06:34 AM

Hi!

I've returned to the 'somatic mutation detectors' world after a long time, and I am going to try out VarScan2.

I am interested in not taking into account any SNP found near an indel. These can be removed with the somaticFilter command of VarScan2 (--indel-file), am I right?

On the other hand, I would love to use the false positive filter mentioned in the VarScan2 manuscript published in Genome Research (Table I). Can anyone tell me how to do it?

Thank you very much!

cheers,
david

**Jane M** · 06-22-2012, 06:52 AM

Hello,
SomaticFilter with the parameter --indel-file removes the SNPs near indels. So it's easy not to remove them!
I would also like to try the false positive filter. Dan Koboldt gave some info in his last post on this topic...

Once you have tried VarScan2, can you please tell us if you notice results in the output files that do not satisfy the depth criteria? Or the min-strand parameter (=0)?

**david.tamborero** · 06-22-2012, 11:47 AM

Sorry for my last post, I've just found the scripts for running the false positive filter on the VarScan SourceForge page.

So it's time to go home. I will report back on the performance I get as soon as possible!

Off topic, do any of you have experience with other tools for retrieving somatic mutations? I mean a specific tool for normal/tumor alignments, not one that calls mutations separately and then intersects them. I would love to use MuTect, but I have not received an answer to my request to test it.

have a good weekend!
david

**Jane M** · 06-25-2012, 01:32 AM

Hello,

I am currently trying MuTect. I sent an email last week to the author and I got an answer quite fast. Did you use this address: [email protected]?
Jane

**vyellapa** · 06-25-2012, 02:26 PM

A question regarding multiple testing was asked and I was trying to find out how VarScan 2 addresses this. I could not find an answer and am curious whether some correction to the p-value is made in the -p--somatic mode.

**Jane M** · 06-25-2012, 11:41 PM

This issue is not addressed in VarScan2, as far as I know. Personally, I do the correction myself...
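
The thread does not say which correction is applied. As one common option, a Benjamini-Hochberg adjustment of the somatic p-values could be sketched as follows. This is plain Python, `benjamini_hochberg` is a hypothetical helper name, and the example p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone (each one is capped by the ones above it).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical somatic p-values for four sites.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Sites would then be kept when the adjusted value falls below the chosen false discovery rate, rather than when the raw somatic p-value falls below 0.05.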

**david.tamborero** · 07-16-2012, 05:50 AM

Hi!

I am trying to use the false positive filter attached as a Perl script within the VarScan source code. I've noticed that I first need to run the bam-readcount script. I've just tried it and I have two questions:

1) If I give a (whole) BAM file to bam-readcount, a segmentation fault occurs. I've read on the https://github.com/genome/bam-readcount page that this appears to be a bug, and they recommend running the script with a separate BAM file for each chromosome and then joining the resulting files. Is that true? Do any of you have a better idea? This takes a lot of time for all the samples I want to process!

2) By the way, I did not dive into the code, but fpfilter.pl requires the VarScan file (I assume that is the output of the somatic command) and the bam-readcount output. But the bam-readcount of what? The normal.bam, the tumor.bam, or both (somehow)?

I am a bit confused, any feedback will be appreciated!
cheers !
david

**dkoboldt** · 08-22-2012, 07:35 AM

David,

Thanks for the question. Currently, we run the filter with the bam-readcounts from the TUMOR BAM only for somatic mutations, as reads supporting the variant allele are required for the filter to work.

However, I haven't yet updated the FP filtering code to be compatible with VCF input. Also, bam-readcount doesn't expect VCF output either. This might help explain your segmentation fault issues.

As soon as I can, I will post an updated version of the script that accepts VCF. I'm not the author of the bam-readcount utility but will talk to them about adding VCF compatibility.

**jjinking** · 11-07-2012, 10:05 PM

VarScan Somatic P-values > threshold

Originally posted by david.tamborero View Post

Yes, I would say that about the Ho. On the other hand, the cut-off value for defining the somatic status (column 13 in the output) should be editable by the user through the --somatic-p-value argument (I did not check it). The default p-value should be the one below which the status is declared as somatic in the output. However, in my results VarScan has defined as somatic several positions with a p larger than 5% (up to p=0.48!), and it makes no sense to me...

I looked inside VarScan source code and saw that in order to classify a variant as somatic, it checks to see if the computed somatic p-value is less than the somatic p-value threshold parameter, as well as whether the normal genotype for the variant position is homozygous reference. If the alternate allele frequency among the reads is 0.0 for the normal sample and the tumor sample genotype is heterozygous, then VarScan will ignore the somatic p-value threshold and classify the variant as "Somatic".
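
The rule described above can be paraphrased as a small decision function. This is a sketch of the description in this post only, not a copy of the VarScan source: the function name, argument names and the "NotSomatic" label are hypothetical, and the other statuses (Germline, LOH) are not modelled.

```python
def classify_somatic(somatic_p, threshold, normal_is_hom_ref,
                     normal_var_freq, tumor_is_het):
    """Return "Somatic" or "NotSomatic" following the rule described above."""
    # Override: no variant reads in the normal and a heterozygous tumor
    # genotype means the p-value threshold is ignored.
    if normal_var_freq == 0.0 and tumor_is_het:
        return "Somatic"
    # Regular path: the p-value must pass the threshold and the normal
    # genotype must be homozygous reference.
    if normal_is_hom_ref and somatic_p < threshold:
        return "Somatic"
    return "NotSomatic"

# The chr1:20098 row from earlier in the thread: normal 0%, tumor */-AG,
# somatic p-value 0.183, with the 5% cut-off mentioned in the quoted post.
status = classify_somatic(0.1830065359477151, 0.05, True, 0.0, True)
print(status)
```

Under this reading, a site with 0% variant frequency in the normal and a heterozygous tumor call is reported as "Somatic" even when its somatic p-value is far above the threshold, which would account for lines such as the p=0.48 ones mentioned in the quoted post.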
