SEQanswers

06-18-2018, 11:56 PM   #1
Toliman
Junior Member
 
Location: Cambridge

Join Date: Apr 2012
Posts: 2
Entrez esearch.fcgi large set of sequences download: fluctuating number of sequences

Hello everybody!

I have a small problem: I'm trying to download a large set of FASTA sequences from Entrez, but the number of sequences I retrieve differs from the count reported by the search.
When I open this link in a browser:
Code:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=Phaeophyceae[Organism]&usehistory=y
it reports a certain number of sequences.
But when I use the Perl script (as explained here: https://www.ncbi.nlm.nih.gov/books/N..._esayers-5-4-3), the final number of FASTA sequences is always lower...
My script differs only slightly from the tutorial:
Code:
        $efetch_url = $base ."efetch.fcgi?db=nucleotide&WebEnv=$web";
        $efetch_url .= "&query_key=$key&retstart=$retstart";
        $efetch_url .= "&retmax=$retmax&rettype=fasta&retmode=text";
Has anybody had the same problem before? Does anybody know where it comes from and how to fix it?
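
One way to narrow this down is to check each batch as it arrives instead of only counting the final file. Below is a minimal sketch in Python (not the tutorial's Perl) that builds the same efetch URLs and counts FASTA headers per batch. The function names are made up for illustration, the batch size is whatever you pass as retmax, and the HTTP request itself is deliberately left out:

```python
from urllib.parse import urlencode

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"

def batch_starts(count, retmax):
    """Return the retstart offset of every batch needed to cover `count` records."""
    return list(range(0, count, retmax))

def efetch_url(web_env, query_key, retstart, retmax):
    """Build an efetch URL with the same parameters as the Perl snippet above."""
    params = {
        "db": "nucleotide",
        "WebEnv": web_env,
        "query_key": query_key,
        "retstart": retstart,
        "retmax": retmax,
        "rettype": "fasta",
        "retmode": "text",
    }
    return BASE + "efetch.fcgi?" + urlencode(params)

def count_fasta_records(text):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))

def expected_in_batch(count, retstart, retmax):
    """Number of records a batch should hold if nothing was dropped."""
    return max(0, min(retmax, count - retstart))
```

Comparing count_fasta_records(batch_text) against expected_in_batch(count, retstart, retmax) after each request should show which batch came back short, so that batch can be requested again.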

Thanks in advance,
Denis
06-21-2018, 06:01 AM   #2
Toliman
Junior Member
 
Location: Cambridge

Join Date: Apr 2012
Posts: 2

OK, I found the problem: DO NOT USE eUtils!

Use Entrez Direct instead. It really works (be sure to install the latest version from NCBI and not the aptitude package):
https://www.ncbi.nlm.nih.gov/books/NBK179288/
06-21-2018, 06:21 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,759

NCBI requires the use of API keys with eUtils as of May 1st, 2018. If you are running a large number of queries, some of them may fail, which may explain the variable numbers. Just a thought.
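
If failed requests are indeed the cause, retrying them with a pause between attempts may help. Here is a rough Python sketch. It assumes the requests go through a function you supply yourself: `fetch` is a hypothetical callable that returns the response text or raises on failure. `api_key` is the E-utilities URL parameter for the key; the attempt count and delay are arbitrary choices:

```python
import time

def with_api_key(url, api_key):
    """Append the E-utilities api_key parameter to an existing request URL."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}api_key={api_key}"

def fetch_with_retry(fetch, url, attempts=3, delay=1.0):
    """Call fetch(url), retrying on any exception with a growing pause.

    `fetch` is a hypothetical callable supplied by the caller; it should
    return the response body or raise if the request failed.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay * (attempt + 1))  # wait longer after each failure
    raise RuntimeError(f"giving up after {attempts} attempts") from last_error
```

Wrapping each batch request this way means a single failed request raises an error after the retries run out, so it does not just leave sequences missing from the output.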

Last edited by GenoMax; 06-21-2018 at 06:26 AM.
All times are GMT -8.