#1 |
Member
Location: Poland, Warsaw Join Date: Jan 2013
Posts: 10
I couldn't find relevant information; I hope this isn't a duplicate.
I have resequencing data for two maize lines (BAM and VCF files). I want to extract sequences (FASTA) for several genes from the reference genome, along with the corresponding sequences from the resequencing data. I also have a GFF3 file with the annotation. I could probably use sequence identifiers, as they are present in my files. Which tool can extract such data and, more generally, extract sequences for a given variant type (SNP, indel, etc.) and location (exon, intron, etc.)?
Last edited by floem7; 01-17-2015 at 03:24 PM. Reason: Adding info.
#2 |
Senior Member
Location: Berlin, Germany Join Date: Jan 2015
Posts: 137
You can try:
- "vcf-consensus" from VCFtools: http://vcftools.sourceforge.net/perl...#vcf-consensus. Click on "Read more" for an example of how to get the consensus for a given region within the reference sequence (you need to extract the region coordinates from your GFF).
- FastaAlternateReferenceMaker within GATK (https://www.broadinstitute.org/gatk/...renceMaker.php).
Read the documentation thoroughly; there are several caveats!
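For the "extract this information from your GFF" step, here is a minimal Python sketch of one way to turn GFF3 gene records into `seqid:start-end` region strings. The function name `gff3_gene_regions`, the example gene IDs, and the assumption that your genes are typed `gene` and carry an `ID=` attribute are all illustrative; check them against your own annotation.

```python
import io


def gff3_gene_regions(handle, wanted_ids):
    """Yield (gene_id, 'seqid:start-end') for each requested gene.

    Assumes tab-separated GFF3 with nine columns, features of type
    'gene', and an ID=... entry in the attributes column.
    """
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        seqid, start, end, attrs = cols[0], cols[3], cols[4], cols[8]
        attributes = dict(
            field.split("=", 1) for field in attrs.split(";") if "=" in field
        )
        gene_id = attributes.get("ID")
        if gene_id in wanted_ids:
            # GFF3 coordinates are 1-based and inclusive, so they can be
            # written into a region string without any shifting.
            yield gene_id, f"{seqid}:{start}-{end}"


# Made-up records, only to show the call.
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=geneA;Name=demo\n"
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\tID=exonA1;Parent=geneA\n"
    "chr2\texample\tgene\t500\t900\t.\t-\t.\tID=geneB\n"
)
regions = dict(gff3_gene_regions(example, {"geneA", "geneB"}))
print(regions)
```

Each region string can then be used wherever the vcf-consensus example expects a region of the reference. Note that the sketch ignores strand, so genes on the minus strand come out in reference orientation.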
#3 |
Senior Member
Location: Berlin, Germany Join Date: Jan 2015
Posts: 137
Forgot to mention, if you just want to extract FASTA sequences for GFF features (i.e. without any called variants applied), you can use BEDTools getfasta (http://bedtools.readthedocs.org/en/l.../getfasta.html).
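To make clear what "FASTA sequences for GFF features" means in terms of coordinates, here is a small pure-Python sketch. It is not how BEDTools is implemented, just an illustration; `feature_sequence`, `revcomp` and the toy genome are hypothetical names and data.

```python
def revcomp(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(str.maketrans("ACGTacgtNn", "TGCAtgcaNn"))[::-1]


def feature_sequence(genome, seqid, start, end, strand="+"):
    """Return the sequence of one GFF-style feature.

    GFF start/end are 1-based and inclusive; Python slices are 0-based
    and half-open, hence the start - 1.
    """
    seq = genome[seqid][start - 1:end]
    return revcomp(seq) if strand == "-" else seq


# Toy genome held in memory; a real one would be read from a FASTA file.
genome = {"chr1": "AACCGGTTAC"}
fwd = feature_sequence(genome, "chr1", 3, 6)        # plus-strand feature
rev = feature_sequence(genome, "chr1", 7, 10, "-")  # minus-strand feature
print(fwd, rev)
```

The same off-by-one matters if you convert GFF records to BED yourself, because BED starts are 0-based. Also keep in mind that getfasta only reverse-complements minus-strand features when run with `-s`.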
#4 |
Member
Location: Poland, Warsaw Join Date: Jan 2013
Posts: 10
Thanks, I followed the example on the VCFtools page and it works :-)
Many thanks!
Edit: however, I realized that an aligned FASTA format would be better. The aim is to quickly generate a readable MSA view for a given region, for example for primer design. Ordinary MSA programs certainly don't produce an alignment sufficiently similar to the one in the BAM file, so their output requires manual inspection.
Last edited by floem7; 01-19-2015 at 02:45 PM. Reason: adding info.
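One possible way to get an aligned view without re-running an MSA program is to build the gapped sequences directly from the called variants. Below is a rough Python sketch under strong assumptions: a single sample compared with the reference, every variant treated as homozygous alternate, VCF-style alleles (indels keep the anchor base), and no overlapping variants. `aligned_pair` and the toy data are made up for illustration.

```python
def aligned_pair(ref_seq, region_start, variants):
    """Return (gapped_reference, gapped_sample) of equal length.

    ref_seq       reference bases of the region
    region_start  1-based genomic position of ref_seq[0]
    variants      iterable of (pos, ref_allele, alt_allele), 1-based pos
    """
    ref_out, alt_out = [], []
    cursor = 0  # 0-based offset of the next unprocessed reference base
    for pos, ref_allele, alt_allele in sorted(variants):
        offset = pos - region_start
        observed = ref_seq[offset:offset + len(ref_allele)]
        if offset < cursor or observed != ref_allele:
            raise ValueError(f"variant at {pos} overlaps or mismatches reference")
        # Bases between variants are identical in both rows.
        ref_out.append(ref_seq[cursor:offset])
        alt_out.append(ref_seq[cursor:offset])
        # Pad the shorter allele with gaps so the columns stay in register.
        width = max(len(ref_allele), len(alt_allele))
        ref_out.append(ref_allele.ljust(width, "-"))
        alt_out.append(alt_allele.ljust(width, "-"))
        cursor = offset + len(ref_allele)
    ref_out.append(ref_seq[cursor:])
    alt_out.append(ref_seq[cursor:])
    return "".join(ref_out), "".join(alt_out)


# Toy region starting at position 101: a SNP, a deletion and an insertion.
ref_row, sample_row = aligned_pair(
    "ACGTACGTAC",
    101,
    [(103, "G", "T"), (105, "AC", "A"), (108, "T", "TGG")],
)
print(">reference\n" + ref_row)
print(">sample\n" + sample_row)
```

For more than one sample, the insertion columns of all samples would have to be merged before padding, which this sketch does not attempt.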
Tags |
bam, fasta extraction, vcf |