SEQanswers

Old 02-04-2014, 03:26 PM   #21
bw.
Member
 
Location: San Francisco, CA

Join Date: Mar 2012
Posts: 21
Default

I'm also seeing the slashes with VarScan v2.3.6.
I wrote this script to convert the slashes to commas:

Code:
import sys

if len(sys.argv) < 2:
    sys.exit("Usage: " + sys.argv[0] + "  vcf_filename")

in_fname = sys.argv[1]
# Drop a trailing ".vcf" so the output is named <name>.fixed.vcf
out_fname = (in_fname[:-4] if in_fname.endswith(".vcf") else in_fname) + ".fixed.vcf"
print("Writing to: " + out_fname)
with open(in_fname) as vcf_in, open(out_fname, "w") as out:
    for line in vcf_in:
        # Copy header lines and blank lines through unchanged
        if line.startswith("#") or not line.strip():
            out.write(line)
        else:
            fields = line.split("\t")
            fields[3] = fields[3].replace("/", ",").replace("\\", ",")   # replace any slashes in the REF field with commas
            fields[4] = fields[4].replace("/", ",").replace("\\", ",")   # replace any slashes in the ALT field with commas
            out.write("\t".join(fields))
To use, just copy-paste into a file (let's say script.py) and run:

python script.py file.vcf
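
To make the per-record step concrete, here is a minimal sketch that applies the same replacement to a single line. The function name `slashes_to_commas` and the sample record are made up for illustration; the record is not real VarScan output.

```python
def slashes_to_commas(line):
    """Return a VCF data line with slashes in REF/ALT replaced by commas."""
    fields = line.split("\t")
    # 0-based indexes 3 and 4 are the REF and ALT columns of a VCF record
    for i in (3, 4):
        fields[i] = fields[i].replace("/", ",").replace("\\", ",")
    return "\t".join(fields)


# Hypothetical record with a slash-separated ALT field
record = "1\t12345\t.\tA\tG/T\t.\tPASS\t.\n"
print(slashes_to_commas(record), end="")
```

Only the REF and ALT columns are touched, so slashes elsewhere in the record (for example in genotype fields such as 0/1) are left alone.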


Also, this version of the script just removes the vcf records with slashes:

Code:
import sys

if len(sys.argv) < 2:
    sys.exit("Usage: " + sys.argv[0] + "  vcf_filename")

in_fname = sys.argv[1]
# Drop a trailing ".vcf" so the output is named <name>.fixed.vcf
out_fname = (in_fname[:-4] if in_fname.endswith(".vcf") else in_fname) + ".fixed.vcf"
print("Writing to: " + out_fname)
with open(in_fname) as vcf_in, open(out_fname, "w") as out:
    for line in vcf_in:
        # Copy header lines and blank lines through unchanged
        if line.startswith("#") or not line.strip():
            out.write(line)
        else:
            fields = line.split("\t")
            ref_alt = fields[3] + fields[4]
            # Keep the record only if neither REF nor ALT contains a slash
            if "\\" not in ref_alt and "/" not in ref_alt:
                out.write(line)

Last edited by bw.; 02-05-2014 at 02:16 PM. Reason: Turns out slashes also sometimes appear in the REF field, so added checks for that.
Old 02-11-2014, 10:35 AM   #22
IsmailM
Junior Member
 
Location: Los Angeles

Join Date: Apr 2013
Posts: 7
Default

If you are using VarScan mpileup2snp or mpileup2indel, why does the QUAL column not have a number in it?
Old 02-27-2014, 10:24 AM   #23
coco90417
Junior Member
 
Location: los angeles

Join Date: Feb 2014
Posts: 1
Default Varscan vcf output for indel

Hi,

I am also encountering issues with the VCF output for indels. I have indels that look like this:

1 984171 . CAG AG .
1 1588744 . AGCG GCG .

I checked the genome browser for the context of both mutations (http://genome.ucsc.edu/cgi-bin/hgTra...A984170-984180 and http://genome.ucsc.edu/cgi-bin/hgTra...588740-1588750). It seems that the first one should be a simple deletion of the first base, so it should look like this:

1 984170 . GC G .

The second one can be represented either as a block substitution that looks like this:

1 1588743 . AAG AG .

or a deletion (if you align the deletion to the left) that looks like this:

1 1588742 . GA G .

So I do not know whether I did something wrong, or whether VarScan uses a different VCF representation for indels.

Please help me. Many many thanks.
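
One way to read the records above is that REF and ALT share trailing bases, and trimming that shared suffix recovers the deletion the poster expects. The sketch below shows the idea under stated assumptions: the function names and the `prev_base` argument are hypothetical, and the preceding reference base has to be looked up by the caller (the values used here are inferred from the records quoted in this post, not fetched from a genome). It does not left-align through repeats, which would need the reference sequence.

```python
def trim_common_suffix(ref, alt):
    """Strip bases shared at the right-hand end of REF and ALT."""
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    return ref, alt


def reanchor(pos, ref, alt, prev_base):
    """Trim the shared suffix, then pad with the preceding reference base
    if either allele became empty (VCF alleles must not be empty).

    prev_base is the reference base at pos - 1, supplied by the caller.
    """
    ref, alt = trim_common_suffix(ref, alt)
    if not ref or not alt:
        return pos - 1, prev_base + ref, prev_base + alt
    return pos, ref, alt


# First record from the post, assuming the base at 984170 is G
print(reanchor(984171, "CAG", "AG", "G"))
# Second record, assuming the base at 1588743 is A
print(reanchor(1588744, "AGCG", "GCG", "A"))
```

For the second record this gives a deletion anchored at 1588743; shifting it further left to 1588742 is only possible by checking the reference for the repeated base.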
Tags
annovar, varscan, vcf
