SEQanswers

04-27-2012, 12:48 AM   #1
dkrtndhkd
Member
Location: Seoul
Join Date: Jan 2012
Posts: 42

VarScan annotation pipeline?

How can I connect VarScan output to annotation tools?

Is there a tool that can annotate VarScan's output file directly?

Or do I have to convert the output file to VCF format first?
05-01-2012, 08:17 AM   #2
dkoboldt
Member
Location: St. Louis
Join Date: Mar 2009
Posts: 62

Hello,

The latest version of VarScan (v2.2.11, just posted) includes a VCF output option for somatic mutations.

This option was already available for multi-sample germline variant calling (mpileup2snp, mpileup2cns, mpileup2indel commands).

Just set --output-vcf to 1.

Yours,

Dan Koboldt
05-02-2012, 07:03 AM   #3
mark.dunning
Junior Member
Location: Cambridge, UK
Join Date: Feb 2012
Posts: 2

Can I ask whether the VCF produced by VarScan is valid, though? I used the latest version and tried to annotate the output with ANNOVAR (via its conversion Perl script), but I get an error.

NOTICE: for SNPs, column 6 and beyond MAY BE heterozygosity status, quality score, read depth, RMS mapping quality, quality by depth, if these information can be recognized automatically
NOTICE: for indels, column 6 and beyond MAY BE heterozygosity status, quality score, read depth, read count supporting indel call, RMS mapping quality, if these information can be recognized automatically

Similarly, running vcf-stats from vcftools also gives an error:

Different number of columns at chr1:12198 (expected 10, got 9)
Error not recoverable, exiting.


Here is the head of my VarScan VCF file:

##fileformat=VCFv4.0
##source=VarScan2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
chr1 12198 G C . PASS DP=107 GT:GQ:DP 1/1:35:107
chr1 12266 G A . PASS DP=53 GT:GQ:DP 0/1:4:53


Regards,

Mark
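
The vcf-stats error in the post above comes down to a data line having fewer tab-separated fields than the #CHROM header line. A minimal Python sketch of that check, assuming a tab-delimited file; the function name find_column_mismatches is made up for illustration and is not part of vcftools or VarScan:

```python
def find_column_mismatches(lines):
    """Return (line_number, expected, found) for every data line whose
    tab-separated field count differs from the #CHROM header line."""
    expected = None
    mismatches = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank line or meta-information line
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            expected = len(fields)  # column count every data line should match
            continue
        if expected is not None and len(fields) != expected:
            mismatches.append((number, expected, len(fields)))
    return mismatches
```

Passing it an open file handle, for example find_column_mismatches(open("VarScanfile.vcf")), should list the offending lines before the file is handed to an annotation tool.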
05-04-2012, 09:39 PM   #4
dkrtndhkd
Member
Location: Seoul
Join Date: Jan 2012
Posts: 42

What about the somatic option?

I couldn't find the VCF output option for it...
05-22-2012, 10:19 PM   #5
fjrossello
Member
Location: Melbourne (Victoria) Australia
Join Date: Sep 2011
Posts: 30

Hi dkrtndhkd,

You can also set --output-vcf to 1 for somatic.

Cheers,

Fernando
06-08-2012, 12:41 AM   #6
oliviajm
Member
Location: France
Join Date: Apr 2012
Posts: 13

Hi Mark,

I ran into a similar problem with another program when I gave it a VCF file produced by VarScan mpileup2indel. It seems that in the VCF files written by VarScan the QUAL column is empty, so when another tool opens the file the number of columns is wrong and the values no longer line up with the column names ("PASS" belongs in the FILTER column, but here it appears under QUAL).
So you need to add a column filled with a dot under the QUAL header.
In my case, I used this command:
awk '{ if ($1 ~ "^#") { print $0} else { sub("",".\t",$6); print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7"\t"$8"\t"$9"\t"$10"\t"$11"\t"$12} }' VarScanfile.vcf > outputFile.vcf
and it solved the problem.

Hope this helps.

Olivia

EDIT: just found this: http://seqanswers.com/forums/showthread.php?t=20000

Last edited by oliviajm; 06-08-2012 at 01:00 AM.
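
The awk command in the post above prints a fixed set of fields ($1 through $12), so it is tied to one particular column count. A rough Python sketch of the same idea, inserting a "." before the sixth field while keeping however many columns each line has. It assumes tab-delimited input and that the missing column really is the sixth one, as described in the post; insert_missing_qual and fix_vcf are hypothetical names used only for illustration:

```python
def insert_missing_qual(line, position=6, placeholder="."):
    """Insert `placeholder` as a new tab-separated field at the 1-based
    `position`. Header lines (starting with '#') and blank lines are
    returned unchanged."""
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    fields.insert(position - 1, placeholder)
    return "\t".join(fields) + newline


def fix_vcf(in_path, out_path):
    """Write a copy of in_path with the placeholder column added."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(insert_missing_qual(line))
```

With the file names from the awk command, the call would be fix_vcf("VarScanfile.vcf", "outputFile.vcf"). It is worth checking which column is actually missing in your own file before running it, since the position argument decides where the dot goes.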
All times are GMT -8.

