SEQanswers

Old 01-17-2015, 02:22 PM   #1
floem7
Member
 
Location: Poland, Warsaw

Join Date: Jan 2013
Posts: 10
Extract reference and aligned sequences from a BAM file based on a VCF file

I couldn't find relevant information, so I hope this is not a duplicate.

I have resequencing data for two maize lines (BAM and VCF files).
I want to extract sequences (FASTA) for several genes from the reference genome, together with the corresponding sequences from the resequencing data. I also have a GFF3 file with the annotation data. I could probably use the sequence identifiers, since they are present in my files.

Which tool can extract such data and, more generally, extract sequences for
a given variant type (SNP, indel, etc.) and location (exon, intron, etc.)?

Last edited by floem7; 01-17-2015 at 02:24 PM. Reason: Adding info.
Old 01-18-2015, 11:20 PM   #2
sarvidsson
Senior Member
 
Location: Berlin, Germany

Join Date: Jan 2015
Posts: 137

You can try

- "vcf-consensus" from VCFtools: http://vcftools.sourceforge.net/perl...#vcf-consensus. Click on "Read more" for an example of how to get the consensus for a given region of the reference sequence (you need to extract the region coordinates from your GFF).

or

- FastaAlternateReferenceMaker within GATK (https://www.broadinstitute.org/gatk/...renceMaker.php)

Read the documentation thoroughly - there are several caveats!
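
As a rough illustration of the "extract this information from your GFF" step, here is a minimal Python sketch that turns GFF3 gene records into `seqid:start-end` region strings, the form commonly passed to tools such as `samtools faidx`. The function name `gff3_gene_regions` and the example records are hypothetical, and the sketch assumes the genes carry an `ID=` attribute and a feature type of `gene`; real annotation files may differ.

```python
def gff3_gene_regions(gff3_lines, gene_ids):
    """Return {gene_id: "seqid:start-end"} for the requested gene IDs.

    GFF3 coordinates are 1-based and inclusive, so they are copied
    unchanged into the region string.
    """
    wanted = set(gene_ids)
    regions = {}
    for line in gff3_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GFF3 record has nine tab-separated columns; column 3 is the type.
        if len(fields) != 9 or fields[2] != "gene":
            continue
        seqid, start, end, attributes = fields[0], fields[3], fields[4], fields[8]
        attrs = dict(
            item.split("=", 1) for item in attributes.split(";") if "=" in item
        )
        gene_id = attrs.get("ID")
        if gene_id in wanted:
            regions[gene_id] = f"{seqid}:{start}-{end}"
    return regions


# Made-up records, for illustration only.
example = [
    "##gff-version 3",
    "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=geneA;Name=demo",
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\tParent=geneA",
    "chr2\texample\tgene\t500\t900\t.\t-\t.\tID=geneB",
]
print(gff3_gene_regions(example, ["geneA", "geneB"]))
# {'geneA': 'chr1:1000-2000', 'geneB': 'chr2:500-900'}
```

Each region string can then be used to cut the reference region before applying the variants, as in the example on the VCFtools page.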
Old 01-18-2015, 11:25 PM   #3
sarvidsson
Senior Member
 
Location: Berlin, Germany

Join Date: Jan 2015
Posts: 137

I forgot to mention: if you just want to extract FASTA sequences for GFF features (i.e. without any called variants applied), you can use BEDTools getfasta (http://bedtools.readthedocs.org/en/l.../getfasta.html).
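
To make the coordinate handling concrete, here is a small Python sketch of what extracting a feature sequence amounts to. It is not how BEDTools is implemented; the helper names and the toy reference are hypothetical, and the reference is assumed to be already loaded into a dictionary.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N; case preserved)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def feature_sequence(reference, seqid, start, end, strand="+"):
    """Return the sequence of one GFF feature.

    GFF coordinates are 1-based and inclusive, so the Python slice is
    [start - 1:end]. Minus-strand features are reverse-complemented.
    """
    seq = reference[seqid][start - 1:end]
    return reverse_complement(seq) if strand == "-" else seq


# Toy reference, for illustration only.
reference = {"chr1": "ACGTACGTAAGG"}
print(feature_sequence(reference, "chr1", 3, 6))         # GTAC
print(feature_sequence(reference, "chr1", 9, 12, "-"))   # CCTT
```

Note that BEDTools getfasta only takes the strand into account when run with its `-s` option, so it is worth checking the output for genes on the minus strand.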
Old 01-19-2015, 03:08 AM   #4
floem7
Member
 
Location: Poland, Warsaw

Join Date: Jan 2013
Posts: 10

Thanks, I followed the example on the VCFtools page and it works :-)

Many thanks!

Edit: however, I realized that an aligned FASTA format would be better. The aim is to quickly generate
a friendly MSA view for a given region, for example for primer design.

Ordinary MSA programs certainly don't produce an alignment sufficiently similar to the one in the BAM file,
so the result requires manual inspection.
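
One possible way to get such a gapped view without re-aligning is to build it directly from the called variants. The Python sketch below is only an illustration under strong assumptions: one sample, homozygous, non-overlapping variants with VCF-style positions and alleles, and the reference region already extracted. The function name `aligned_pair` and the toy data are hypothetical, and the result is a pairwise view, not a true MSA of several lines.

```python
def aligned_pair(ref_seq, region_start, variants):
    """Return (ref_aligned, alt_aligned) as equal-length gapped strings.

    ref_seq      -- reference sequence of the region
    region_start -- 1-based chromosome position of ref_seq[0]
    variants     -- iterable of (pos, ref_allele, alt_allele), VCF style
    """
    ref_out, alt_out = [], []
    cursor = 0  # 0-based offset of the next unprocessed reference base
    for pos, ref_allele, alt_allele in sorted(variants):
        offset = pos - region_start
        observed = ref_seq[offset:offset + len(ref_allele)] if offset >= 0 else ""
        if offset < cursor or observed != ref_allele:
            raise ValueError(f"variant at {pos} overlaps or does not match the reference")
        # Copy the unchanged stretch before the variant into both rows.
        ref_out.append(ref_seq[cursor:offset])
        alt_out.append(ref_seq[cursor:offset])
        # Pad the shorter allele with gaps so the columns stay aligned.
        width = max(len(ref_allele), len(alt_allele))
        ref_out.append(ref_allele.ljust(width, "-"))
        alt_out.append(alt_allele.ljust(width, "-"))
        cursor = offset + len(ref_allele)
    # Copy whatever follows the last variant.
    ref_out.append(ref_seq[cursor:])
    alt_out.append(ref_seq[cursor:])
    return "".join(ref_out), "".join(alt_out)


# Toy region starting at position 101 with a SNP, an insertion and a deletion.
ref_row, alt_row = aligned_pair(
    "ACGTACGT",
    101,
    [(102, "C", "T"), (104, "T", "TGG"), (106, "CG", "C")],
)
print(ref_row)  # ACGT--ACGT
print(alt_row)  # ATGTGGAC-T
```

Combining two or more lines in one view would additionally require a shared set of gap columns, which this sketch does not attempt.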

Last edited by floem7; 01-19-2015 at 01:45 PM. Reason: adding info.
Tags
bam, fasta extraction, vcf
