  • "Invalid byte in GI list"

    Hello everyone,

    I am currently working with a local BLAST nucleotide database. After getting it set up I am able to BLAST FASTA files without any bother.

    What I want to do is search only the viral sequences in the nt database.

    To this end I have downloaded the virus accession list from NCBI.

    I try to use the following command:

    $ blastn -db nt -query sequence.fasta -num_alignments 10 -num_descriptions 10 -evalue 1e-6 -gilist viruses.nbr -num_threads 4 -out sequence.tab

    When I input this command I get a result saying "Invalid byte in GI list" and the command does not run. Can anyone help me out with this error message? Has there been a problem downloading the accession list file?

    Thanks for the help.

  • #2
    Is there a header present in your gilist file? If there is, try removing it.
    Last edited by GenoMax; 10-12-2015, 07:23 AM.
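
A minimal sketch of the header-stripping step suggested above, assuming Python 3 is available and that the list should hold one numeric GI per line, which is what the "Invalid byte" message points to. The function names and the output file name are hypothetical.

```python
def extract_gis(lines):
    """Return the lines that consist only of digits, skipping everything else."""
    gis = []
    for line in lines:
        token = line.strip()
        # A GI is a run of ASCII digits; header lines, accessions such as
        # "NC_001416", and blank lines all fail this check and are dropped.
        if token.isascii() and token.isdigit():
            gis.append(token)
    return gis


def clean_gi_file(src_path, dst_path):
    """Write only the numeric lines of src_path to dst_path; return how many were kept."""
    with open(src_path) as src:
        gis = extract_gis(src)
    with open(dst_path, "w") as dst:
        dst.write("".join(gi + "\n" for gi in gis))
    return len(gis)
```

Calling `clean_gi_file("viruses.nbr", "viruses.gi")` would write the cleaned list next to the download. If it returns 0, the file probably holds accessions rather than GIs, and stripping the header alone will not make it usable with -gilist.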


    • #3
      Hi Genomax,

      Thanks for the advice. Yeah, there were header lines indicating accession number, organism name etc.

      Instead of using 'gilist' I ended up using 'seqidlist', which accepted my downloaded file. I am not sure whether the results would differ with a working 'gilist', so I will remove the headers and re-run with 'gilist' to see if there are any differences.

      Cheers!


      • #4
        If you had "gi's" then it may be best to stick with the gilist option. I am not sure if the gi is equivalent to the seqid.

        If you expect to do this often then consider sub-setting the viruses set permanently.
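
One way to sub-set permanently, sketched under assumptions: BLAST+ ships a `blastdb_aliastool` program that can create an alias database restricted by a GI list. The helper below only builds the command line; the file name `viruses.gi`, the alias name `nt_viruses`, and the title are hypothetical, and the GI file is assumed to be header-free.

```python
import shlex


def alias_db_command(source_db, gi_file, out_name, title):
    """Build a blastdb_aliastool call that restricts source_db to the GIs in gi_file."""
    return [
        "blastdb_aliastool",
        "-db", source_db,
        "-dbtype", "nucl",   # nt is a nucleotide database
        "-gilist", gi_file,
        "-out", out_name,
        "-title", title,
    ]


cmd = alias_db_command("nt", "viruses.gi", "nt_viruses", "nt restricted to viruses")
# shlex.join quotes the title so the printed line can be pasted into a shell.
print(shlex.join(cmd))
```

Once the alias exists, searches could pass `-db nt_viruses` without repeating `-gilist` on every run.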


        • #5
          I had limited success with the GI List option. I have realised that the virus taxa file I downloaded was for whole genomes and not partially sequenced genomes.

          I went back and downloaded the GenBank viral database in a FASTA file.

          From this I want to make a custom viral database to run my sequences against, in order to speed up processing and get the data I want without any bacterial sequences etc. However, a new problem has occurred.

          When typing $ makeblastdb -help just to see the possible options, I get a 'segmentation fault' error. Is this due to RAM limitations or problems with the BLAST+ application?

          Cheers.
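
For reference, a sketch of the two commands this plan amounts to, assuming a working `makeblastdb` (which is not the case on the machine described in this post). The file names `viral.fasta` and `viral_db` are hypothetical; the search settings mirror the original blastn command from the first post.

```python
import shlex


def makeblastdb_command(fasta_path, db_name, title):
    """Build the makeblastdb call that formats a nucleotide FASTA file as a BLAST database."""
    return [
        "makeblastdb",
        "-in", fasta_path,
        "-dbtype", "nucl",
        "-out", db_name,
        "-title", title,
    ]


def blastn_command(query_path, db_name, out_path, threads=4):
    """Build a blastn search against the custom database, with no GI list needed."""
    return [
        "blastn",
        "-db", db_name,
        "-query", query_path,
        "-num_alignments", "10",
        "-num_descriptions", "10",
        "-evalue", "1e-6",
        "-num_threads", str(threads),
        "-out", out_path,
    ]


print(shlex.join(makeblastdb_command("viral.fasta", "viral_db", "GenBank viral sequences")))
print(shlex.join(blastn_command("sequence.fasta", "viral_db", "sequence.tab")))
```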


          • #6
            What OS are you using?


            • #7
              GenoMax,

              I am using GNOME CentOS 2.16.0.

              Outdated I am sure but my institution are picky about software unfortunately.


              • #8
                Looking at your blast command line you appear to be using the latest blast+ package. Can you confirm that? If blastn from that package worked then I am not sure why you are getting a seg fault with makeblastdb. Perhaps that is using a library that is missing from your system. Going to be hard to fix.

                BTW: Are you really using a 10+ year old OS (if I googled it right)?
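
The missing-library idea above can be checked with `ldd`, which lists the shared libraries a binary needs. The sketch below, assuming Python 3.7+ and a dynamically linked `makeblastdb`, collects every line that `ldd` reports as "not found"; the function names are hypothetical.

```python
import subprocess


def unresolved_lines(ldd_output):
    """Return the ldd output lines that report something as 'not found'."""
    # Covers both a missing library ("libfoo.so.1 => not found") and a
    # missing symbol version ("version `GLIBC_2.14' not found ...").
    return [line.strip() for line in ldd_output.splitlines() if "not found" in line]


def check_binary(path):
    """Run ldd on a binary and return whatever it could not resolve."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    # Version errors may be written to stderr, so scan both streams.
    return unresolved_lines(result.stdout + result.stderr)
```

An empty list from `check_binary` on the `makeblastdb` binary would suggest the seg fault has some other cause than an unresolved library.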


                • #9
                  GenoMax,

                  Yep, I am using the latest BLAST+ package - 2.2.31 along with the most recent nt database.

                  Could it be possible an OS update could fix the problem?

                  And yes ha, we are using a 10 year old OS. As I mentioned my institute can be ridiculously picky when installing new software due to security measures. Even so a 10 year old OS is a bit ridiculous really.


                  • #10
                    You need a complete reinstall of a newer vintage OS.

                    On a serious note, if you are not able to update the OS you could try compiling blast from source code (I am not even sure if that will work). Blast may expect latest libraries and such that are likely not going to be available in a 10 yr old OS. Even the compiler you have available will likely not work.


                    • #11
                      Thanks for the help GenoMax. I may just have to find a different computer to do this all on unfortunately.

                      Just to recap, in case I have installed something incorrectly:

                      I downloaded the latest NCBI BLAST+ package, which I then extracted.

                      I also downloaded the most recent nucleotide database, which I extracted into the .bin folder of the extracted BLAST+ package. Does this all sound correct?

                      I do not have much experience with the command line, so perhaps I have installed something incorrectly.


                      • #12
                        For the purpose of what you were trying to run that all sounds right. I am surprised that even blastn worked considering makeblastdb generates a seg fault.


                        • #13
                          Thanks GenoMax.

                          Seems I will have to find an alternative route.

                          Sometimes if I am BLASTing a particularly large FASTA file I will get a segmentation fault a short time after the command has been input and it stops the process then and there resulting in incomplete output.
