#1 |
Member
Location: Cupertino, CA Join Date: Nov 2011
Posts: 59
Hi all,
I downloaded FASTA files from NCBI and tried to use them in my sequencing pipeline. The issue is that the sequences in the FASTA file are named by their GenBank accessions, so those accessions also appear in the VCF file, which is undesirable. Is there a way to prepare a FASTA file so that chromosome numbers are used instead, or more generally, a way to "prepare" a FASTA file for sequencing use? Thanks in advance.
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
What "genome" is this?
You can get pre-formatted sequence, annotation and index files for a number of common organisms at the iGenomes site: http://support.illumina.com/sequenci...e/igenome.html
#3 |
Member
Location: Cupertino, CA Join Date: Nov 2011
Posts: 59
That site only goes back as far as hg18 for human, but what if I want to use much older genome assemblies? My work is on human genome analysis.
#4 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
I would suggest that you get older data from UCSC: http://hgdownload.soe.ucsc.edu/downloads.html#human Look for "chromFa.zip" files in the "Full Dataset" links.
#5 |
Member
Location: Cupertino, CA Join Date: Nov 2011
Posts: 59
We did take those files and there is no issue with them. The issue is with some other old genomes that we require.
I am wondering if there is some code that can standardize sequence names across FASTA files?
#6 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
You could manually change the FASTA headers to suit your purposes and remake the indexes.
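If there are too many headers to edit by hand, the renaming can be scripted. Below is a minimal sketch in Python, not a definitive tool. The function name `rename_fasta_headers` and the `NAME_MAP` table are made up for illustration, and the accession-to-chromosome pairs shown are only examples; you would need to build the real mapping for your own assembly. It assumes the accession is either the first word of the header or one of its pipe-separated fields (as in older NCBI-style headers), and it drops the rest of the description line when it renames.

```python
import io


def rename_fasta_headers(src, dst, name_map):
    """Copy FASTA text from src to dst, replacing matching header IDs.

    name_map maps an accession to the desired sequence name. Headers with
    no matching accession are written unchanged. Returns the number of
    headers that were renamed.
    """
    renamed = 0
    for line in src:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            # Older NCBI headers look like gi|...|ref|ACCESSION|, so test
            # each pipe-separated field as well as the whole ID.
            for candidate in [seq_id] + seq_id.split("|"):
                if candidate in name_map:
                    line = ">" + name_map[candidate] + "\n"
                    renamed += 1
                    break
        dst.write(line)
    return renamed


# Illustrative mapping only; fill in the accessions for your assembly.
NAME_MAP = {"NC_000001.10": "chr1", "NC_000002.11": "chr2"}

# Small in-memory demonstration.
demo_in = io.StringIO(
    ">NC_000001.10 Homo sapiens chromosome 1\nACGT\n>unplaced_scaffold\nTTGA\n"
)
demo_out = io.StringIO()
count = rename_fasta_headers(demo_in, demo_out, NAME_MAP)
print(count, "header(s) renamed")
print(demo_out.getvalue(), end="")
```

For real data you would pass open file handles instead of the `StringIO` objects, write to a new file rather than overwriting the original, and then rebuild every index that depends on the sequence names (for example with `samtools faidx`, plus your aligner's own index).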
Tags: fasta, reference file, vcf