
  • How to get FASTA sequences from GI number

    Hey guys, I need help.

    I need to download a large number of FASTA sequences for a set of GI numbers.

    Is there a script to do this?

    I know I could do it with http://www.ncbi.nlm.nih.gov/sites/batchentrez , but I have too many sequences (it says to split the list if there are too many), and I really don't want to do it via the browser.


    Thank you

  • #2
    Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.


    • #3
      Originally posted by bt27uk
      Did you mean to cross post this in Biostars? Maybe remove the question from this list or from that list.
      Since I've been stuck on this problem for a couple of days, I tried asking in different forums, with different people, in order to get an answer as soon as possible.

      I can't see the problem.

      If it is against any SEQanswers rules, I'll delete it.


      • #4
        cross posted on biostars: https://www.biostars.org/p/112410/


        • #5
          One way to do this would be to use the blastdbcmd command that is part of the "blast" suite. You will need to have access to (or download) the nt blast database indexes.

          You can put a list of your GI numbers in a file like so (one per line):

          Code:
          $ more gi_list.txt 
          4
          7
          78
          324
          
          $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
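A small aside on preparing the input: the list above has one identifier per line, so it can help to tidy a hand-assembled list before passing it to -entry_batch. The sketch below is only an illustration. `clean_gi_list` and `raw_ids.txt` are hypothetical names (not part of BLAST), and it assumes GI numbers are plain integers.

```shell
# Hypothetical helper: print the GI numbers found in file $1, one per line,
# dropping blanks, stray whitespace, non-numeric lines and duplicates.
clean_gi_list() {
    # tr strips spaces, tabs and Windows carriage returns;
    # grep keeps only lines made up entirely of digits;
    # awk prints each GI the first time it is seen (input order is kept).
    tr -d ' \t\r' < "$1" | grep -E '^[0-9]+$' | awk '!seen[$0]++'
}
```

For example, `clean_gi_list raw_ids.txt > gi_list.txt` would write a file in the one-per-line layout shown above.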


          • #6
            Originally posted by GenoMax
            One way to do this would be to use the blastdbcmd command that is part of the "blast" suite. You will need to have access to (or download) the nt blast database indexes.

            You can put a list of your GI numbers in a file like so (one per line):

            Code:
            $ more gi_list.txt 
            4
            7
            78
            324
            
            $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
            Thank you very much.

            It is exactly what I was looking for!!!


            • #7
              Originally posted by fefe89
              Since I've been stuck on this problem for a couple of days, I tried asking in different forums, with different people, in order to get an answer as soon as possible.

              I can't see the problem.

              If it is against any SEQanswers rules, I'll delete it.
              As has been said before, it creates more work for the folks who are answering the questions.

              It is OK to cross-post, but please close your post out on all forums (cross-referencing the solution once you find one that you like).


              • #8
                Originally posted by GenoMax
                As has been said before, it creates more work for the folks who are answering the questions.

                It is OK to cross-post, but please close your post out on all forums (cross-referencing the solution once you find one that you like).
                OK. The other post has already been closed.


                • #9
                  Originally posted by fefe89
                  I need to download a large number of FASTA sequences for a set of GI numbers.

                  Is there a script to do this?
                  The recommended way to do this is with Eutils. Eutils is a web service offered by the NCBI.

                  There are already several threads about using Eutils, both in this forum and on Biostars.
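Since E-utilities requests take the IDs as a comma-separated list, a long GI list usually has to be cut into batches first. Here is a minimal sketch of that step, assuming one GI per line in the input file. `batch_ids` is a hypothetical helper name, and the default of 200 per batch is an assumption based on NCBI's advice to switch to HTTP POST above roughly 200 IDs.

```shell
# Hypothetical helper: read IDs (one per line) from file $1 and print
# one comma-separated line per batch of at most $2 IDs (default: 200).
batch_ids() {
    awk -v n="${2:-200}" '
        # Put a comma before every ID except the first one in a batch.
        { printf "%s%s", (NR % n == 1 || n == 1 ? "" : ","), $0 }
        # End the line once a batch is full.
        NR % n == 0 { print "" }
        # End the last, partially filled batch (if there is one).
        END { if (NR % n != 0) print "" }
    ' "$1"
}
```

Each output line can then be used as the id value of one request, e.g. `batch_ids gi_list.txt 200 | while read -r ids; do ...; done`.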


                  • #10
                    Originally posted by GenoMax
                    One way to do this would be to use the blastdbcmd command that is part of the "blast" suite. You will need to have access to (or download) the nt blast database indexes.

                    You can put a list of your GI numbers in a file like so (one per line):

                    Code:
                    $ more gi_list.txt 
                    4
                    7
                    78
                    324
                    
                    $ blastdbcmd -entry_batch gi_list.txt -db /path_to/nt -outfmt "%f" -out seq_fasta_filename.fa
                    Hi!

                    I'm a beginner in bioinformatics (and new to the forum), and I have the same problem as fefe89. Your answer (above) seems entirely appropriate for my problem, but I have a very naive question (it seems simple, but I can't find an adequate answer on Google): how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                    Thank you in advance for your understanding (certainly a newbie question...)


                    • #11
                      Originally posted by ericaf
                      Hi!
                      how can I use the blastdbcmd command line if I don't want to download the (heavy) nt database to my own computer? Or am I forced to download nt locally before running the command?

                      Thank you in advance for your understanding (certainly a newbie question...)
                      If you don't want to download the blast database locally, take a look at the NCBI e-utils option (referred to in one of the posts above). You will need to do some additional work to create the right query URLs.
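As a rough idea of what such a query URL looks like, here is a minimal sketch that builds an EFetch request for FASTA output. The helper name `efetch_fasta_url` is hypothetical, db=nuccore and the GI numbers are just example values, and the sketch does not check whether a given GI can still be resolved by NCBI.

```shell
# Base address of the NCBI EFetch service.
EFETCH_BASE="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Hypothetical helper: print an EFetch URL that asks for plain-text FASTA
# records for the comma-separated ID list given as $1.
efetch_fasta_url() {
    printf '%s?db=nuccore&id=%s&rettype=fasta&retmode=text\n' "$EFETCH_BASE" "$1"
}
```

The resulting URL can then be handed to a downloader, for example `curl -s "$(efetch_fasta_url 4,7,78,324)" > seq_fasta_filename.fa`. For long ID lists it is probably safer to send several smaller requests and to stay within NCBI's limit of about three requests per second without an API key.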
