SEQanswers

06-15-2017, 09:54 AM   #1
kpboh
Junior Member
 
Location: CO

Join Date: Oct 2014
Posts: 2
replacing specific positions in fasta from vcf/list

I have a reference assembly (in fasta format) and a vcf file containing a list of specific sites. I'd like to edit the fasta file to change the bases at these positions to 'N'.

Does anyone have any suggestions for a tool to accomplish this? I also have a trimmed down version of the vcf that just contains chrom# and position...

Thanks in advance for any suggestions!
06-15-2017, 07:36 PM   #2
neavemj
Member
 
Location: MA, USA

Join Date: Feb 2014
Posts: 58

Sounds like a job for python or perl.

You could read through the vcf file and gather the positions in a dictionary. Then read through the fasta file and change the base to N at every position that is in the dictionary.
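
A minimal Python sketch of that approach, for anyone who wants a starting point. It assumes the first two whitespace-separated columns of each non-header line are chromosome and 1-based position (true for a standard vcf and for a trimmed chrom/position list), and that the first word of each fasta header matches the chromosome name. The function names `read_positions`, `read_fasta` and `mask_fasta` are made up for this example, and each sequence is held in memory whole, so treat it as a sketch rather than a tested tool.

```python
from collections import defaultdict


def read_positions(lines):
    """Collect 1-based positions per chromosome from vcf (or chrom/pos) lines."""
    positions = defaultdict(set)
    for line in lines:
        # Skip blank lines and vcf header lines ("##..." and "#CHROM ...")
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split()[:2]
        positions[chrom].add(int(pos))
    return positions


def read_fasta(lines):
    """Yield (header, sequence) pairs from fasta lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def mask_fasta(fasta_lines, positions, width=60):
    """Return fasta text with every listed position replaced by N."""
    out = []
    for header, seq in read_fasta(fasta_lines):
        name = header.split()[0]  # assumption: sequence ID is the first word
        bases = list(seq)
        for pos in positions.get(name, ()):
            if 1 <= pos <= len(bases):
                bases[pos - 1] = "N"  # vcf positions are 1-based
        masked = "".join(bases)
        out.append(">" + header)
        # Re-wrap the sequence at a fixed line width
        out.extend(masked[i:i + width] for i in range(0, len(masked), width))
    return "\n".join(out) + "\n"
```

With hypothetical file names, usage would look like `mask_fasta(open("ref.fa"), read_positions(open("sites.vcf")))`, writing the returned string to a new file. All changes are accumulated per sequence before anything is written, so a single masked fasta comes out at the end.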
06-16-2017, 12:00 PM   #3
kpboh
Junior Member
 
Location: CO

Join Date: Oct 2014
Posts: 2

thanks for the reply. yeah, that seems to be the way to go, but i'm unfortunately not fluent enough in either language.

i did find this example but couldn't get it to run properly. it output an entire new fasta for each individual position as it looped through the vcf, instead of accumulating all the changes in the vcf before printing a single, mutated fasta. i suspect it's a trivial change to get it to work properly.

at any rate, i managed to hack a solution by changing the 'alt' allele in my vcf to 'N', modifying (using sed) all the GT values to "1/1", then feeding this file into GATK's FastaAlternateReferenceMaker tool. clearly far from elegant, but i checked the positions in question in the output and it seems to have worked.
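
For reference, the vcf-editing part of this workaround could also be sketched in Python instead of sed. This is an illustration, not the exact commands used above: it assumes tab-separated single-base records with ALT in the fifth column and, when sample columns exist, GT as the first FORMAT subfield. The function name `force_n_alt` is made up for the example.

```python
def force_n_alt(lines):
    """Yield vcf lines with ALT set to N and every sample genotype set to 1/1."""
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        # Pass header lines, blank lines and malformed records through unchanged
        if line.startswith("#") or len(fields) < 5:
            yield line
            continue
        fields[4] = "N"  # ALT column
        # Sample columns start at column 10; GT is assumed to be the first subfield
        if len(fields) > 9 and fields[8].split(":")[0] == "GT":
            for i in range(9, len(fields)):
                sub = fields[i].split(":")
                sub[0] = "1/1"
                fields[i] = ":".join(sub)
        yield "\t".join(fields)
```

The rewritten lines would then be saved to a new vcf and passed to FastaAlternateReferenceMaker as described above. Records whose REF is longer than one base (indels) would need separate handling, since swapping in a single N changes the sequence length.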
06-18-2017, 04:05 PM   #4
neavemj
Member
 
Location: MA, USA

Join Date: Feb 2014
Posts: 58

Clever solution!