SEQanswers

Old 09-07-2011, 11:21 AM   #1
jgarbe
Member
 
Location: Saint Paul, MN

Join Date: Mar 2010
Posts: 13
How to download gene annotation from NCBI?

The NCBI Map Viewer has the latest pig genome build and shows the
locations of all the genes. I would like to download this gene
annotation so I can load it into my own GBrowse genome browser.
So I need the NCBI gene annotation for the latest pig genome build in
gff3 format, and the way to do it seems to be to download an asn.1
file from NCBI, convert it to genbank format, and then use the bioperl
script bp_genbank2gff3.pl to convert from genbank to gff3.

I downloaded the gene annotation for the pig genome from the NCBI ftp site at
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA..._scrofa.ags.gz

I downloaded the asn2gb conversion program from
ftp://ftp.ncbi.nlm.nih.gov/asn1-conv...latform/linux/

I run ./linux.asn2gb -i Sus_scrofa.ags -b T
and get the error "Asn io_failure for input file 'Sus_scrofa.ags'"
I've tried all the options for the -a and -t flags without luck.

I'm able to convert the Sus_scrofa.ags file to xml format using the
gene2xml program, but I don't know of any tool that can convert from
XML to gff3.
I downloaded a genbank format file of pig genes from
ftp://ftp.ncbi.nlm.nih.gov/genomes/S...RNA/rna.gbk.gz but the
file doesn't give chromosome coordinates for the genes, so I can't
make a gff3 file out of it.

Any pointers on how to use the asn tools properly, or how to get NCBI
annotation in gff format in general, would be much appreciated.

Thanks

-John
Old 09-08-2011, 07:18 AM   #2
darked89
Member
 
Location: Barcelona, Spain

Join Date: Jun 2009
Posts: 36

I managed to run:

./gene2xml.linux -i Sus_scrofa.ags -b T -c T

This prints XML output. Strangely, Sus_scrofa.ags had to be gzipped and named Sus_scrofa.ags.gz.
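For anyone reproducing this step, here is a minimal sketch of producing the gzipped copy from Python instead of the shell. The helper name `gzip_copy` is made up for illustration, and whether a given gene2xml build really needs the `.gz` name is only what was observed above, not something documented here.

```python
import gzip
import shutil


def gzip_copy(src_path, dest_path=None):
    """Write a gzip-compressed copy of src_path and return the new path.

    By default the copy is named src_path + ".gz",
    e.g. Sus_scrofa.ags -> Sus_scrofa.ags.gz.
    """
    if dest_path is None:
        dest_path = src_path + ".gz"
    # Stream the bytes so a large annotation file is never held in memory.
    with open(src_path, "rb") as src, gzip.open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)
    return dest_path
```

Calling `gzip_copy("Sus_scrofa.ags")` would leave the original file in place and write `Sus_scrofa.ags.gz` next to it.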

XML to GFF conversion should be fairly easy, but I do not know of a tool yet. You could check:
http://brendelgroup.org/mespar1/gthxml/gthxmlToGFF.py
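To illustrate what such an XML to GFF3 conversion involves, here is a rough sketch using only the Python standard library. Important assumption: the XML layout below (`<gene>`, `<chrom>`, `<start>`, `<end>`, `<strand>`) is invented for the example and is NOT the schema that gene2xml writes, so the element lookups would have to be rewritten against the real output. The coordinates in the toy input are assumed to be 1-based and inclusive already, which is what GFF3 expects.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical, simplified input. Real gene2xml output is structured differently.
HYPOTHETICAL_XML = """<genes>
  <gene id="gene1" name="DEMO1">
    <chrom>chr1</chrom><start>100</start><end>900</end><strand>+</strand>
  </gene>
</genes>"""


def escape(value):
    """Percent-encode characters that are reserved in GFF3 column 9 (; = & , etc.)."""
    return quote(value, safe=" :")


def xml_to_gff3(xml_text, source="example"):
    """Turn the toy XML above into GFF3 text with one 'gene' row per <gene>."""
    root = ET.fromstring(xml_text)
    lines = ["##gff-version 3"]
    for gene in root.iter("gene"):
        attributes = "ID=%s;Name=%s" % (
            escape(gene.get("id")),
            escape(gene.get("name")),
        )
        # GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
        lines.append("\t".join([
            gene.findtext("chrom"), source, "gene",
            gene.findtext("start"), gene.findtext("end"),
            ".", gene.findtext("strand"), ".", attributes,
        ]))
    return "\n".join(lines) + "\n"


gff3_text = xml_to_gff3(HYPOTHETICAL_XML)
```

The hard part with real data is not writing the nine columns but locating the chromosome, coordinates and strand inside the nested XML, and checking whether the source coordinates need shifting to 1-based.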
Old 02-16-2012, 06:44 AM   #3
jjw14
Member
 
Location: Missouri

Join Date: Apr 2010
Posts: 39
Question

John,

I'm running into the same problem that you had. The NCBI Sus scrofa genome FTP site provides .asn, .fa, .gbk, .gbs, and .mfa files for each chromosome (last updated 10-12-2011).

Were you able to convert the .asn data to .gff3 or .gtf format for annotation? I'd be interested to hear the best method you found for generating the annotation file that corresponds to the most recent S. scrofa genome.

Thanks in advance,
jjw
Old 02-16-2012, 06:55 AM   #4
jgarbe
Member
 
Location: Saint Paul, MN

Join Date: Mar 2010
Posts: 13

I was not able to figure out how to convert any of the NCBI annotation data into a usable form. I sent an email to NCBI but didn't get a useful reply from them. Thankfully another group has generated a good gene build for Sscr10.2. As described here: http://animalgenome.org/pig/newsletter/No.110.html, you can download annotation at this site: http://gbi.agrsci.dk/pig/sscrofa10_2_annotation/
Alternatively, Ensembl is running 10.2 through their pipeline and should have a gene build available in two or three months. If you can wait that long that would be another good alternative to NCBI's annotation.
Old 02-16-2012, 07:06 AM   #5
jjw14
Member
 
Location: Missouri

Join Date: Apr 2010
Posts: 39
Thumbs up

Thanks for the quick reply, John.

No doubt, you've saved me a lot of frustration. I appreciate it.

jjw
Old 03-21-2012, 03:17 PM   #6
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Quote:
Originally Posted by jgarbe
Any pointers on how to use the asn tools properly, or how to get NCBI annotation in gff format in general, would be much appreciated.
The NCBI are currently revising all their GFF3 output (it hadn't been compliant with the standards), so this should be much easier now/soon.

Try ftp://ftp.ncbi.nlm.nih.gov/genomes/Sus_scrofa/GFF/ for the NCBI RefSeq annotation of pig Sscrofa10.2
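Once a GFF3 file is in hand, a quick sanity check before loading it into GBrowse is to pull out the gene rows and look at their coordinates. The sketch below assumes only the standard nine-column, tab-separated GFF3 layout; the sample lines are made up for illustration and are not taken from the NCBI pig files.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 line into a dict, or return None for blanks and comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds key=value pairs separated by ';', percent-encoded.
    attrs = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[unquote(key)] = unquote(value)
    record["attributes"] = attrs
    return record


def iter_features(lines, feature_type="gene"):
    """Yield parsed records of one feature type, stopping at an embedded ##FASTA section."""
    for line in lines:
        if line.startswith("##FASTA"):
            break
        record = parse_gff3_line(line)
        if record is not None and record["type"] == feature_type:
            yield record


# Made-up sample lines, for illustration only.
SAMPLE = [
    "##gff-version 3\n",
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=DEMO1\n",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=rna1;Parent=gene1\n",
]
genes = list(iter_features(SAMPLE))
```

For a downloaded file, `iter_features(open(path))` works the same way, where `path` is wherever the file was saved.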
Old 03-22-2012, 10:51 AM   #7
jjw14
Member
 
Location: Missouri

Join Date: Apr 2010
Posts: 39
Thumbs up

Thanks Peter,

I'll take a look.

jjw14
Old 08-11-2014, 12:20 PM   #8
noha osman
Junior Member
 
Location: Los Angeles

Join Date: Aug 2014
Posts: 9

Hi all,
I need a buffalo GFF or GFF3 file from NCBI, but I do not know how to get it.
Could anyone point me to it?
Thanks
Old 08-12-2014, 01:57 AM   #9
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Bubalus bubalis? ftp://ftp.ncbi.nlm.nih.gov/genomes/Bubalus_bubalis/GFF/
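For scripted access, here is a minimal sketch that lists the GFF files in a directory like this one with Python's standard `ftplib`. The helper names `split_ftp_url` and `list_gff_files` are made up for the example, and it assumes the server accepts anonymous FTP logins and that the directory still exists at this path, which may not hold if the site layout has changed.

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL, got %r" % url)
    return parts.hostname, parts.path


def list_gff_files(url):
    """Return the names in the remote directory that look like GFF files.

    Assumes anonymous login is allowed; needs network access when called.
    """
    host, path = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()      # anonymous login
        ftp.cwd(path)
        return [name for name in ftp.nlst() if ".gff" in name.lower()]
```

Something like `list_gff_files("ftp://ftp.ncbi.nlm.nih.gov/genomes/Bubalus_bubalis/GFF/")` would then show which files are available before downloading any of them.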

Last edited by maubp; 08-12-2014 at 02:00 AM.
Old 01-14-2015, 10:26 AM   #10
noha osman
Junior Member
 
Location: Los Angeles

Join Date: Aug 2014
Posts: 9

Thanks Peter

Tags
annotation, gff3, ncbi
