  • Searching databases

    Dear all,

    I am new to whole genome sequencing; I have never sequenced complete genomes. I used to sequence small pieces of DNA.
    However, I have been sequencing a lot of DNA lately, and I realised I never learned how to search for multiple DNA sequences at the same time in, for example, the NCBI database.

    Normally I have only 1 or 2, at most 3, sequences, and I simply enter them manually and search for matches. But now I have 98 contigs/sequences. How can I enter them all at once so that I don't need to enter 98 sequences one by one manually?

    There must be programs out there, but I am guessing they cost a lot of money?
    And since I am not good at writing my own programs, programming such a thing isn't much of an option, I am afraid.

    Or is there an option to enter several sequences at once on the NCBI website?


    Thanks in advance.

  • #2
    Originally posted by phillie
    Or is there an option to enter more sequences at once at the ncbi website?
    If you have sequence IDs for which you want to retrieve information, see if Batch Entrez suits you: http://www.ncbi.nlm.nih.gov/sites/batchentrez

    If you have a file of sequences (e.g. FASTA) to BLAST, then the web interface of BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) lets you upload that file and search different databases.

    If you can be more specific about what you want to achieve, maybe you can get better answers...

    Best
    Dario


    • #3
      Hello dariober,

      What I have are simple FASTA files with nucleotide sequences.
      And what I do is pretty simple: I use the second link you gave me to search for matches in the databases.

      However, I want to know how I can make it easier for myself when I have, for example, 50 different files/contigs.

      At the moment, I manually copy and paste each file into the NCBI search form, search for matches, check the matches, and repeat this for the second sequence (until now I have sequenced perhaps 1 or 2 contigs each week).
      But what if I get 50 contigs at once? I don't want to repeat the search 50 times, once for each file.
      So I wonder if I can simply "load" the 50 files all at once...

      PS: On the NCBI website I can upload a file, and it says I can upload a list of sequences, but I don't seem to be able to do this. When I copy the contents of file 2 into file 1, it does not seem to work.
      Also, I would still need to open each file, copy the sequence, and paste it into another file (with all the sequences).
      So I wonder whether I can simply upload all the files at once, or something like that.

      ("Use the browse button to upload a file from your local disk. The file may contain a single sequence or a list of sequences. The data may be either a list of database accession numbers, NCBI gi numbers, or sequences in FASTA format")
      Last edited by phillie; 07-15-2012, 01:48 AM.


      • #4
        Originally posted by phillie
        So I wonder if I can simple "load" the 50 files all at once...
        Hi (I hope I'm not misunderstanding your question...). I don't think loading more than one file is possible; however, the workaround is quite simple: concatenate all the files into a single one and upload this big file to BLAST.
        If you are on Mac, Linux, or Windows with Cygwin, it's very simple to do on the command line:
        Code:
        cd /path/to/my/fastas ## Move to dir with your FASTA files
        ## Concatenate files (assuming you have 3 files):
        cat myfile1.fasta myfile2.fasta myfile3.fasta > catfile.fasta
        ## Or, if all the files whose names end in "fasta" have to be concatenated:
        cat *.fasta > catfile.fasta
        Now, just upload catfile.fasta to BLAST using "Browse" / "Upload file".
        (If you end up with different sequences having the same name, I'm not sure how BLAST is going to handle it, though...)
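
A slightly more defensive variant of the concatenation above can be sketched as follows. It is only an illustration under a few assumptions: the function names `combine_fasta` and `duplicate_names` and the output name `all_contigs.fa` are made up for this example, and the sequence "name" is taken to be the first word after `>` on each header line. The sketch makes sure every input file ends with a newline (so a header is not glued to the last line of the previous file) and lists names that occur more than once, which addresses the duplicate-name worry. The output deliberately does not end in ".fasta", because re-running `cat *.fasta > catfile.fasta` would pick up catfile.fasta itself as an input.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of BLAST or any NCBI tool.

# combine_fasta OUT IN1 [IN2 ...]
# Writes all input files into OUT. awk re-prints every line with a
# trailing newline, so a file whose last line lacks one cannot run
# into the next file's ">" header.
# Choose an OUT name that is not one of the inputs: OUT is emptied first.
combine_fasta() {
  local out=$1
  shift
  local f
  : > "$out"                      # create or empty the output file
  for f in "$@"; do
    awk '{ print }' "$f" >> "$out"
  done
}

# duplicate_names FILE
# Prints each sequence name (first word of a ">" header line) that
# appears more than once in FILE; prints nothing if all are unique.
duplicate_names() {
  grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort | uniq -d
}

# Example usage (run inside the directory holding the FASTA files):
#   combine_fasta all_contigs.fa *.fasta
#   grep -c '^>' all_contigs.fa      # number of sequences in the result
#   duplicate_names all_contigs.fa   # any names listed here need renaming
```

If `duplicate_names` prints anything, renaming those headers by hand before uploading is probably the safest option, since it is not clear from this thread how BLAST treats identical names.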

        Good luck!
        Dario


        • #5
          Hello,

          I tried this with some .txt files (I just replaced .fasta with .txt, because I don't have the FASTA files on my home computer, only some .txt files). However, it did not work: it created the new files, but those files are empty...

          The command prompt also says it does not recognize cat as an internal or external ...
          If I leave "cat" out of the command, the same thing happens: it just creates a text file that is empty...
