SEQanswers

Old 03-02-2011, 03:00 PM   #1
dbrami
Member
 
Location: San Diego

Join Date: Sep 2008
Posts: 14
Question New consensus from VCF

Hello,

I have a reference file in FASTA format and a list of indels in VCF format (courtesy of GATK).

How can I generate a NEW consensus that incorporates the modifications from the VCF file?
I am positive a tool like this must exist, although I've written my own naive version in Perl.

Would it be part of GATK, VCFtools, or something else?

Regards,

Daniel
Old 03-03-2011, 12:13 AM   #2
iansealy
Member
 
Location: Hitchin, UK

Join Date: Oct 2010
Posts: 15
Default

Dear Daniel,

I haven't tried it yet, but the announcement for the latest version of SAMtools says:

Quote:
Implemented "vcfutils.pl vcf2fq" to generate a consensus sequence similar to "samtools.pl pileup2fq".
See http://sourceforge.net/mailarchive/f...tools-announce and http://samtools.svn.sourceforge.net/...ls.pl?view=log

Cheers,
Ian
Old 03-03-2011, 09:59 AM   #3
dbrami
Member
 
Location: San Diego

Join Date: Sep 2008
Posts: 14
Default

Thank you Ian,

I will give it a look.
Old 03-03-2011, 01:14 PM   #4
iansealy
Member
 
Location: Hitchin, UK

Join Date: Oct 2010
Posts: 15
Default

Sorry, looks like I led you astray:

http://sourceforge.net/mailarchive/f...=samtools-help
Old 03-03-2011, 04:38 PM   #5
dbrami
Member
 
Location: San Diego

Join Date: Sep 2008
Posts: 14
Default

It didn't work for me.
I wrote to the SAMtools developer and got this response:
Quote:
Firstly, vcf2fq only filters SNPs around indels, but does not build indels into the consensus. Secondly, it requires nearly *every* base in the reference genome to be present in the input, no matter whether there is a variant. For now, vcf2fq only works with samtools all-site BCF/VCF as other callers do not generate information at all sites.

vcf2fq is mainly useful to people who not only want to get the SNPs, but also intend to know the regions where a call can be made. This is essential for most popgen studies.

Heng
I think I might have to find a Perl VCF parser (there is one listed on the SAMtools page) and write my own consensus script.
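For anyone following along, here is a minimal sketch of what such a consensus script could look like. It is written in Python rather than Perl, and it is not the script mentioned above. The function names `read_variants` and `apply_variants` are made up for illustration. It assumes a single reference sequence already loaded as a string, takes only the first ALT allele of each record, skips symbolic alleles such as `<DEL>`, and ignores genotypes and filters entirely. Variants are applied from right to left so that an indel never shifts the coordinates of the variants still to be applied.

```python
def read_variants(vcf_lines, chrom):
    """Yield (pos, ref, alt) for records on `chrom` (first ALT allele only)."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chrom:
            continue
        alt = fields[4].split(",")[0]
        if alt == "." or alt.startswith("<"):
            continue  # no alternate allele, or a symbolic one we cannot apply
        yield int(fields[1]), fields[3], alt


def apply_variants(reference, variants):
    """Return `reference` with the (pos, ref, alt) variants applied.

    `pos` is 1-based, as in the VCF POS column.
    """
    seq = reference
    leftmost_applied = len(reference)  # 0-based start of the last edit made
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        end = start + len(ref)
        if end > leftmost_applied:
            raise ValueError(
                f"variant at {pos} overlaps a later variant or runs past the reference end"
            )
        if seq[start:end].upper() != ref.upper():
            raise ValueError(f"REF allele at {pos} does not match the reference")
        seq = seq[:start] + alt + seq[end:]
        leftmost_applied = start
    return seq
```

To use it, read the FASTA sequence for one contig into a string, then call `apply_variants(sequence, read_variants(open_vcf_handle, contig_name))`. The REF check is there to catch a VCF that was called against a different reference.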
Old 03-04-2011, 08:46 AM   #6
wormseq
Junior Member
 
Location: Florida

Join Date: May 2010
Posts: 2
Default

Hi dbrami,

If you find a solution (i.e., a Perl script), could you please post it?

Thank you.
Old 03-08-2011, 08:36 AM   #7
dagarfield
Member
 
Location: Heidelberg, Germany

Join Date: Aug 2010
Posts: 39
Default

Does this link help you out?
It describes going from mpileup (pileup is now outdated) to a consensus sequence by piping the mpileup output through two other utilities.

http://samtools.sourceforge.net/mpileup.shtml
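
For reference, the three-stage pipeline described on that page can be sketched as below. The small Python helper only builds the shell command string; it does not run anything. The helper name and the file names are placeholders. The flags are the ones I recall from the SAMtools 0.1.x-era mpileup page, and later releases changed these subcommands, so check them against the version you have installed.

```python
import shlex


def mpileup_consensus_command(reference, bam, out_fastq):
    """Build the mpileup -> bcftools -> vcf2fq pipeline as a shell string.

    Assumes the SAMtools 0.1.x-era tools; newer releases use different
    subcommands and options.
    """
    stages = [
        ["samtools", "mpileup", "-uf", reference, bam],  # pileup as uncompressed BCF
        ["bcftools", "view", "-cg", "-"],                # call variants and genotypes
        ["vcfutils.pl", "vcf2fq"],                       # all-site VCF to consensus FASTQ
    ]
    pipeline = " | ".join(
        " ".join(shlex.quote(arg) for arg in stage) for stage in stages
    )
    return f"{pipeline} > {shlex.quote(out_fastq)}"
```

Printing `mpileup_consensus_command("ref.fa", "aln.bam", "cns.fq")` gives a line you can paste into a shell. Note that, as Heng says above, vcf2fq does not build indels into the consensus.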
Old 07-24-2012, 07:49 AM   #8
nupurgupta
Member
 
Location: New Jersey

Join Date: Aug 2010
Posts: 29
Default

Yeah, except that vcfutils throws an error if a VCF entry doesn't have an FQ value, which is the case for VCF files in the current format. I don't feel comfortable modifying the vcfutils.pl file, so if anyone knows of another solution, it would be great to know.
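
One way to see how many records are affected before running vcf2fq is to scan the INFO column for the FQ key. This is only a sketch: the function name `records_missing_info_key` is made up, and it assumes an uncompressed, tab-delimited VCF with INFO in the eighth column.

```python
def records_missing_info_key(vcf_lines, key="FQ"):
    """Return (chrom, pos) for every record whose INFO column lacks `key`."""
    missing = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
        info_keys = {item.split("=", 1)[0] for item in fields[7].split(";")}
        if key not in info_keys:
            missing.append((fields[0], int(fields[1])))
    return missing
```

Calling it on an open file handle, e.g. `records_missing_info_key(open("calls.vcf"))`, returns the list of positions that have no FQ value. If that list covers the whole file, the VCF probably did not come from the samtools caller that vcf2fq expects.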
Old 09-20-2012, 10:00 PM   #9
James Hane
Member
 
Location: Perth, Australia

Join Date: Apr 2010
Posts: 11
Default

I haven't used vcf2fq before; however, samtools pileup has had problems in the past with incorporating indels into the consensus. I regularly use GATK's FastaAlternateReferenceMaker for this, which handles both SNPs and indels very well. GATK also allows you to vastly improve the accuracy of your variant calls by running steps like RealignerTargetCreator/IndelRealigner prior to final consensus generation.
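
As a rough illustration of the FastaAlternateReferenceMaker call, the sketch below builds the argument list in Python without running it. The helper name, the jar path, and the file names are placeholders. The argument names follow the GATK 3-style command line as I remember it from the tool documentation, and they may differ in the GATK release you have, so treat them as an assumption and check the tool's help output.

```python
def fasta_alternate_reference_argv(reference, vcf, out_fasta,
                                   gatk_jar="GenomeAnalysisTK.jar"):
    """Build an argument list for GATK's FastaAlternateReferenceMaker.

    Argument names assume a GATK 3-style command line; other releases
    may spell them differently.
    """
    return [
        "java", "-jar", gatk_jar,
        "-T", "FastaAlternateReferenceMaker",  # the walker to run
        "-R", reference,                       # input reference FASTA
        "-V", vcf,                             # variants to build in
        "-o", out_fasta,                       # new consensus FASTA
    ]
```

The returned list can be handed to `subprocess.run(argv, check=True)` once the paths point at real files.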
Old 09-20-2012, 11:35 PM   #10
dagarfield
Member
 
Location: Heidelberg, Germany

Join Date: Aug 2010
Posts: 39
Default

I think this problem with indels has been at least partially addressed in mpileup. That said, I've heard good things about the route James is recommending.

Tags
bwa, gatk, indel, vcf, vcftools
