  • obtaining sequence names after local blast

    Dear all,

    I am trying to create an annotation for a custom array of a non-model species. I am running the latest version of BLAST+ (2.2.29+) and using blastx. My output does not contain the gene names, and I would like to have them.

    I know this topic has been covered in a previous thread (http://seqanswers.com/forums/showthread.php?t=14031) and that it is not possible to obtain gene information using BLAST+.

    I am wondering whether there are alternative ways in which other people achieve this task (I am not keen on using Blast2GO as it runs very slowly). I would appreciate some hints.

  • #2
    The manual. You could try e.g. the stitle (subject title) flag. Alternatively, it shouldn't be very hard to link subject GIs or accessions to other information with Entrez Direct.
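As a rough sketch of the Entrez Direct route mentioned above: the first function pulls the unique subject accessions out of column 2 of a tabular BLAST result, and the second asks NCBI for a title for one accession. The function names and file names are made up for illustration, and the `Caption` and `Title` element names are an assumption about the protein document summary, so it is worth looking at the raw `esummary` output before relying on them.

```shell
#!/usr/bin/env bash

# Print the unique subject accessions found in column 2 of tabular BLAST output.
# Handles plain IDs and the older "gi|123|ref|NP_000001.1|" style identifiers.
subject_accessions() {
    cut -f 2 "$1" | awk -F '|' '{ print (NF >= 4 ? $4 : $0) }' | sort -u
}

# Look up one accession with Entrez Direct (needs EDirect installed and
# network access). Element names are assumed, see the note above.
lookup_title() {
    esummary -db protein -id "$1" |
        xtract -pattern DocumentSummary -element Caption Title
}
```

Something like `subject_accessions hits.tsv | while read -r acc; do lookup_title "$acc"; done` would then print one line per subject sequence.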
    savetherhino.org


    • #3
      Thanks rhinoceros,

      I was looking at the options via blastx -help on the terminal, and for some reason stitle is not among them. I will certainly try that.

      Also, thanks for the pointer to Entrez Direct; I wasn't aware of it. I am new to BLAST and to bioinformatics in general, so I apologize if the question was very basic. I guess we all have to start somewhere. :-)


      • #4
        Originally posted by cdes79
        Thanks rhinoceros,
        I was looking at the options via blastx -help on the terminal, and for some reason stitle is not among them. I will certainly try that.
        It's an option within -outfmt, e.g. -outfmt '6 std stitle' gives you the standard tabular output + stitle as the last column.
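A minimal sketch of how that could look in practice, assuming a BLAST+ version that knows the stitle specifier. The function names and the arguments (query file, database name, output file) are placeholders. With '6 std stitle' the subject title is column 13, after the 12 standard columns.

```shell
#!/usr/bin/env bash

# Run blastx with the 12 standard tabular columns plus the subject title.
# $1 = query FASTA, $2 = protein database, $3 = output file (all placeholders).
run_blastx_with_titles() {
    blastx -query "$1" -db "$2" -outfmt '6 std stitle' -out "$3"
}

# Print "query<TAB>subject title" for the first hit listed for each query,
# which is normally the best-scoring one.
first_hit_titles() {
    awk -F '\t' '!seen[$1]++ { print $1 "\t" $13 }' "$1"
}
```

For example, `run_blastx_with_titles query.fasta my_protein_db hits.tsv` followed by `first_hit_titles hits.tsv` gives one annotated line per query sequence.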
        savetherhino.org


        • #5
          Originally posted by rhinoceros
          It's an option within -outfmt, e.g. -outfmt '6 std stitle' gives you the standard tabular output + stitle as the last column.
          I know, I read the help text in the terminal, but again it is not there. I have pasted the relevant section below. Anyway, I am running it as we speak and it is running fine. I'll check the output when it finishes.

          *** Formatting options
          -outfmt <String>
          alignment view options:
          0 = pairwise,
          1 = query-anchored showing identities,
          2 = query-anchored no identities,
          3 = flat query-anchored, show identities,
          4 = flat query-anchored, no identities,
          5 = XML Blast output,
          6 = tabular,
          7 = tabular with comment lines,
          8 = Text ASN.1,
          9 = Binary ASN.1,
          10 = Comma-separated values,
          11 = BLAST archive format (ASN.1)

          Options 6, 7, and 10 can be additionally configured to produce
          a custom format specified by space delimited format specifiers.
          The supported format specifiers are:
          qseqid means Query Seq-id
          qgi means Query GI
          qacc means Query accesion
          qaccver means Query accesion.version
          qlen means Query sequence length
          sseqid means Subject Seq-id
          sallseqid means All subject Seq-id(s), separated by a ';'
          sgi means Subject GI
          sallgi means All subject GIs
          sacc means Subject accession
          saccver means Subject accession.version
          sallacc means All subject accessions
          slen means Subject sequence length
          qstart means Start of alignment in query
          qend means End of alignment in query
          sstart means Start of alignment in subject
          send means End of alignment in subject
          qseq means Aligned part of query sequence
          sseq means Aligned part of subject sequence
          evalue means Expect value
          bitscore means Bit score
          score means Raw score
          length means Alignment length
          pident means Percentage of identical matches
          nident means Number of identical matches
          mismatch means Number of mismatches
          positive means Number of positive-scoring matches
          gapopen means Number of gap openings
          gaps means Total number of gaps
          ppos means Percentage of positive-scoring matches
          frames means Query and subject frames separated by a '/'
          qframe means Query frame
          sframe means Subject frame
          btop means Blast traceback operations (BTOP)
          When not provided, the default value is:
          'qseqid sseqid pident length mismatch gapopen qstart qend sstart send
          evalue bitscore', which is equivalent to the keyword 'std'
          Default = `0'


          • #6
            Just use the XML output (-outfmt 5) and parse it to obtain gene names.


            • #7
              @cdes79: You need a more recent version of BLAST. Mine is blastx: 2.2.29+
              Package: blast 2.2.29, build Dec 10 2013 14:41:40 and has 'stitle' in it.

              @Birdman: IMHO parsing XML is not that easy. Oh, you and I can do it but the casual user will have more problems. Did you see the recent note from NCBI saying that they want input on how to make their XML more standard/parsable?


              • #8
                Originally posted by westerman
                @cdes79: You need a more recent version of BLAST. Mine is blastx: 2.2.29+
                Package: blast 2.2.29, build Dec 10 2013 14:41:40 and has 'stitle' in it.

                @Birdman: IMHO parsing XML is not that easy. Oh, you and I can do it but the casual user will have more problems. Did you see the recent note from NCBI saying that they want input on how to make their XML more standard/parsable?
                Thanks, westerman, for the support. I think it is easy to forget how daunting this field can be, particularly for people who are not dedicated bioinformaticians but biologists trying to use new tools. Anyway, back to the problem: I think I have figured out what it might be. I said before that I could not find "stitle", and in fact when I used it, it did not add the sequence info to the output.

                Then I noticed that although I installed the latest version (2.2.29+), running blastx -h tells me in the description that I have 2.2.25+. I had a previous BLAST version installed, and that is probably why I am experiencing the problem.

                I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?

                Thanks, Christian


                • #9
                  Originally posted by cdes79
                  I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?
                  Changing the PATH is a good idea. I am sure there are many tutorials on how to do so out there. In general, from Bash,

                  export PATH=/new/path:$PATH
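To make that concrete for this thread, here is a small sketch that puts the directory holding the new BLAST+ binaries at the front of PATH for the current shell session and then reports which blastx will run. The function name and the example directory are placeholders, not paths from anyone's actual install.

```shell
#!/usr/bin/env bash

# Put the directory with the new BLAST+ binaries first on PATH,
# then report which blastx the shell will run from now on.
use_blast_from() {
    local blast_bin=$1            # e.g. /opt/ncbi-blast-2.2.29+/bin (placeholder)
    export PATH="$blast_bin:$PATH"
    hash -r                       # forget any remembered location of the old blastx
    command -v blastx             # prints the path that now wins the lookup
}
```

After `use_blast_from /opt/ncbi-blast-2.2.29+/bin`, running `type -a blastx` in Bash lists every copy on the PATH in lookup order, which makes a leftover older install easy to spot. The change only lasts for the current session.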


                  • #10
                    Originally posted by cdes79
                    I am now running the command with the full path to the right blastx to see what happens (it is running now). Do you know how I can fix this problem and make sure blastx runs from the right folder? I assume I should change the PATH? How so?
                    At the command line:

                    Code:
                    which blastx
                    That shows which blastx your shell actually runs. If it is the old install, go ahead and delete its whole directory (unless it is something like /usr/bin or /usr/local/bin, in which case delete only the BLAST binaries). Then update the PATH in your .bashrc (or the equivalent for your shell and OS).
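For the .bashrc part, a hedged sketch: the function below appends an `export PATH=...` line to a startup file unless that exact line is already there, so running it twice does not pile up duplicates. The function name and both arguments are illustrative; which startup file applies depends on your shell and OS.

```shell
#!/usr/bin/env bash

# Append 'export PATH="<dir>:$PATH"' to a startup file, once only.
# $1 = startup file (e.g. ~/.bashrc), $2 = directory with the BLAST+ binaries.
persist_blast_path() {
    local rc_file=$1
    local blast_bin=$2
    local line="export PATH=\"$blast_bin:\$PATH\""
    # -F fixed string, -x whole-line match, -q quiet; append only if missing
    grep -Fxq "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}
```

The new line takes effect in shells started afterwards, or right away after `source ~/.bashrc`.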
                    Last edited by rhinoceros; 03-19-2014, 05:59 AM.
                    savetherhino.org


                    • #11
                      Thanks all of you for the fantastic help! Everything worked fine!!!


                      • #12
                        You need BLAST+ 2.2.28 or later for the stitle field and related new columns, see:
                        This is an open letter to the NCBI BLAST+ team to request two simple enhancements which I think would be extremely useful - first and foremo...
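A quick way to check for that before relying on stitle, sketched under the assumption that `blastx -version` prints a first line like "blastx: 2.2.29+" (as in the output quoted earlier in the thread). The function name is made up for illustration.

```shell
#!/usr/bin/env bash

# Succeed if version $1 (e.g. "2.2.29+") is at least version $2 (e.g. "2.2.28").
# A trailing "+" is ignored; the dotted parts are compared as numbers.
version_at_least() {
    awk -v have="${1%+}" -v need="${2%+}" 'BEGIN {
        n = split(have, h, "."); m = split(need, r, ".")
        for (i = 1; i <= (n > m ? n : m); i++) {
            if ((h[i] + 0) > (r[i] + 0)) exit 0
            if ((h[i] + 0) < (r[i] + 0)) exit 1
        }
        exit 0
    }'
}
```

For example, `version_at_least "$(blastx -version | awk 'NR == 1 { print $2 }')" 2.2.28 && echo "stitle should be available"` tests whichever blastx is first on the PATH.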
