SEQanswers

01-13-2022, 04:25 PM   #1
Cannon
Junior Member
 
Location: London

Join Date: Jan 2022
Posts: 1
Downloading data from NCBI

Hello,
I've been trying to get the hang of NCBI's esearch suite because I want to download their gene summary paragraphs. Could anyone clarify the correct command format to produce an output file of the form:

Gene Summary
PCSK9 'This gene encodes...'

Either all genes or a supplied list of genes would work. Thanks.

Last edited by Cannon; 01-13-2022 at 04:30 PM.
01-14-2022, 06:41 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,138

Using Entrez Direct:

Code:
$ esearch -db gene -query "PCSK9 [GENE] AND human [ORGN]" | efetch -format acc

1. PCSK9
Official Symbol: PCSK9 and Name: proprotein convertase subtilisin/kexin type 9 [Homo sapiens (human)]
Other Aliases: FH3, FHCL3, HCHOLA3, LDLCQ1, NARC-1, NARC1, PC9
Other Designations: proprotein convertase subtilisin/kexin type 9; convertase subtilisin/kexin type 9 preproprotein; neural apoptosis regulated convertase 1; subtilisin/kexin-like protease PC9
Chromosome: 1; Location: 1p32.3
Annotation: Chromosome 1 NC_000001.11 (55039548..55064852)
MIM: 607786
ID: 255738

2. PCSK9
Official Symbol: PCSK9 and Name: proprotein convertase subtilisin/kexin type 9 [Homo sapiens (human)]
Other Aliases: FH3, HCHOLA3, NARC-1, NARC1
Other Designations: Hypercholesterolemia, familial, 3; hypercholesterolemia, autosomal dominant 3
Chromosome: 1; Location: 1p34.1-p32
This record was replaced with GeneID: 255738
ID: 353175
01-14-2022, 06:43 PM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,138

Another variation:

Code:
$ esearch -db gene -query "PCSK9 [GENE] AND human [ORGN]" | esummary | xtract -pattern DocumentSummary -element Name,Summary
PCSK9	This gene encodes a member of the subtilisin-like proprotein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway. The encoded protein undergoes an autocatalytic processing event with its prosegment in the ER and is constitutively secreted as an inactive protease into the extracellular matrix and trans-Golgi network. It is expressed in liver, intestine and kidney tissues and escorts specific receptors for lysosomal degradation. It plays a role in cholesterol and fatty acid metabolism. Mutations in this gene have been associated with autosomal dominant familial hypercholesterolemia. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Feb 2014]
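
To get the file layout asked for in the first post (a "Gene" / "Summary" header followed by one row per gene), the tab-separated output can be passed through a small wrapper. This is only a sketch: `add_header` and the file name `gene_summaries.tsv` are made-up names, not part of Entrez Direct, and the real pipeline is left commented out because it needs Entrez Direct and network access.

```shell
#!/bin/sh
# Hypothetical helper (not part of Entrez Direct): print a header row,
# then pass the tab-separated Name/Summary lines from stdin through unchanged.
add_header() {
    printf 'Gene\tSummary\n'
    cat
}

# Intended use, assuming Entrez Direct is installed (commented out here):
# esearch -db gene -query "PCSK9 [GENE] AND human [ORGN]" \
#   | esummary \
#   | xtract -pattern DocumentSummary -element Name,Summary \
#   | add_header > gene_summaries.tsv

# Offline demonstration with a stand-in line instead of a real query:
printf 'PCSK9\tThis gene encodes...\n' | add_header
```

The same wrapper should work for the multi-gene loop if the pipe into `add_header` is placed after `done`, so the header is written once rather than once per gene.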

For more than one gene, put the gene symbols in a file, one per line:

Code:
$ more id
BRCA2
TP53
PCSK9

$ for i in `cat id`; do esearch -db gene -query "${i} [GENE] AND human [ORGN]" | esummary | xtract -pattern DocumentSummary -element Name,Summary; done
BRCA2	Inherited mutations in BRCA1 and this gene, BRCA2, confer increased lifetime risk of developing breast or ovarian cancer. Both BRCA1 and BRCA2 are involved in maintenance of genome stability, specifically the homologous recombination pathway for double-strand DNA repair. The largest exon in both genes is exon 11, which harbors the most important and frequent mutations in breast cancer patients. The BRCA2 gene was found on chromosome 13q12.3 in human. The BRCA2 protein contains several copies of a 70 aa motif called the BRC motif, and these motifs mediate binding to the RAD51 recombinase which functions in DNA repair. BRCA2 is considered a tumor suppressor gene, as tumors with BRCA2 mutations generally exhibit loss of heterozygosity (LOH) of the wild-type allele. [provided by RefSeq, May 2020]
TP53	This gene encodes a tumor suppressor protein containing transcriptional activation, DNA binding, and oligomerization domains. The encoded protein responds to diverse cellular stresses to regulate expression of target genes, thereby inducing cell cycle arrest, apoptosis, senescence, DNA repair, or changes in metabolism. Mutations in this gene are associated with a variety of human cancers, including hereditary cancers such as Li-Fraumeni syndrome. Alternative splicing of this gene and the use of alternate promoters result in multiple transcript variants and isoforms. Additional isoforms have also been shown to result from the use of alternate translation initiation codons from identical transcript variants (PMIDs: 12032546, 20937277). [provided by RefSeq, Dec 2016]
PCSK9	This gene encodes a member of the subtilisin-like proprotein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway. The encoded protein undergoes an autocatalytic processing event with its prosegment in the ER and is constitutively secreted as an inactive protease into the extracellular matrix and trans-Golgi network. It is expressed in liver, intestine and kidney tissues and escorts specific receptors for lysosomal degradation. It plays a role in cholesterol and fatty acid metabolism. Mutations in this gene have been associated with autosomal dominant familial hypercholesterolemia. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Feb 2014]
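
When looping over a longer list, a symbol that matches nothing may simply produce no output line, which is easy to miss. A possible check, assuming the loop output was saved to a tab-separated file, is sketched below. `missing_genes`, the file names, and the placeholder symbol `NOTAGENE` are all hypothetical and only serve the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: list symbols from a gene list that have no row
# in a tab-separated Name/Summary results file.
#   $1 = file of gene symbols, one per line
#   $2 = results file (gene name in column 1)
missing_genes() {
    awk -F'\t' '
        FILENAME == ARGV[1] { seen[$1] = 1; next }   # first file: results
        $1 != "" && !($1 in seen)                    # second file: gene list
    ' "$2" "$1"
}

# Offline demonstration with stand-in files instead of real query output:
tmp=$(mktemp -d)
printf 'BRCA2\nTP53\nNOTAGENE\n' > "$tmp/id"
printf 'BRCA2\tsummary text\nTP53\tsummary text\n' > "$tmp/results.tsv"
missing_genes "$tmp/id" "$tmp/results.tsv"   # prints NOTAGENE
rm -r "$tmp"
```

Comparing on `FILENAME` rather than the usual `NR == FNR` idiom keeps the check correct even when the results file is empty.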

Last edited by GenoMax; 01-14-2022 at 06:47 PM.
Tags
gene annotation