  • new usage of SRA toolkit/ SRA archive data download

    It seems that the NCBI SRA changed the way files can be downloaded. Until now we used the link from the SRA website to download files with Aspera Connect, then used the SRA toolkit to extract FASTA sequences. Now there is a new, and I must say somewhat confusing, description that we have not been able to apply. It seems that the SRA toolkit can be used to process data directly from the NCBI website. Has anybody solved this? We are working in a Windows environment. Thanks.
    link to SRA description:

  • #2
    Can you post an example of an accession # that is not working as expected?


    • #3
      The change applies to all SRA files. A random example:


      When I go to the download tab, there used to be links to FTP and Aspera downloads. Now there is only the new description of how to use the SRA toolkit.


      • #4
        There is always the option of getting the fastq files directly from ENA, avoiding sratoolkit altogether.

        Corresponding URL for the example you posted above:

        ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR617/SRR617107/

        ftp://ftp.sra.ebi.ac.uk/vol1/srr/SRR617/SRR617107
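The fastq URL above appears to follow a simple pattern: the first six characters of the run accession, then the full accession. A minimal sketch that builds it, assuming a nine-character accession like the one in this thread (longer accessions use an extra sub-directory on the ENA server, which this sketch deliberately rejects rather than guesses). `ena_fastq_url` is a made-up helper name, not part of any toolkit.

```shell
#!/bin/sh
# Hypothetical helper: print the ENA FTP fastq directory for a run accession,
# following the pattern of the URL shown above.
# Assumption: only nine-character accessions (e.g. SRR617107) are handled.
ena_fastq_url() {
    efu_acc="$1"
    case "$efu_acc" in
        [SED]RR[0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *)
            echo "unsupported accession: $efu_acc" >&2
            return 1
            ;;
    esac
    # The first six characters form the parent directory, e.g. SRR617.
    efu_prefix=$(printf '%.6s' "$efu_acc")
    echo "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${efu_prefix}/${efu_acc}/"
}
```

Calling `ena_fastq_url SRR617107` prints the fastq directory posted above, which can then be handed to any FTP client.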


        • #5
          Corresponding NCBI SRA direct URL (using information from SRA link you included above):

          ftp://ftp-trace.ncbi.nih.gov/sra/sra...617/SRR617107/


          • #6
            OK, that works, thanks.

            However, it goes through a regular download; the Aspera connection was much better. If I understand correctly, the SRA toolkit now allows processing the files directly from the NCBI site without the need to download them first, for example using fastq-dump to convert .sra files to FASTA. Based on the description available on NCBI (link below), I was not able to do it though.


            • #7
              Originally posted by Retro
              However, it goes through a regular download; the Aspera connection was much better. If I understand correctly, the SRA toolkit now allows processing the files directly from the NCBI site without the need to download them first, for example using fastq-dump to convert .sra files to FASTA. Based on the description available on NCBI (link below), I was not able to do it though.

              http://www.ncbi.nlm.nih.gov/books/NB...sra_data_using
              After upgrading to the latest sratoolkit (v.2.4.2-1) I tried the new method out. Here is what I discovered.

              In order to get the downloads to work, every user (especially if you are on a shared system/cluster) will have to run the configuration utility (help located at: http://trace.ncbi.nlm.nih.gov/Traces...lkit_doc&f=std) and set an appropriate path for storing configuration directories/files. Remember to save settings before you exit the utility.

              Hint: Do the following in an xterm/X11 window if you want the text to be properly formatted.

              Code:
              $ /path_to/vdb-config -i
              Once this is done, you will be able to download fastq files (and other data) directly from NCBI without downloading the .sra files.

              The following example prints only five reads to the screen:

              Code:
              $ /path_to/fastq-dump -X 5 -Z SRR390729
              This command will then download the full data file as fastq to the current directory:
              Code:
              $ /path_to/fastq-dump SRR390729
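To repeat the last command for several runs, a small wrapper may help. This is only a sketch: `dump_runs` and the `DRY_RUN` variable are names made up for illustration, and it assumes `fastq-dump` is on your PATH and has already been configured with `vdb-config -i` as described above.

```shell
#!/bin/sh
# Hypothetical wrapper: call fastq-dump once per run accession.
# With DRY_RUN=1 the commands are only printed, not executed.
dump_runs() {
    for dr_acc in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "fastq-dump $dr_acc"
        else
            # Keep going if one accession fails, but report it on stderr.
            fastq-dump "$dr_acc" || echo "failed: $dr_acc" >&2
        fi
    done
}
```

Setting `DRY_RUN=1` first and then calling `dump_runs SRR390729 SRR617107` lets you check the command lines before starting any large download.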
              Last edited by GenoMax; 11-20-2014, 10:37 AM.


              • #8
                I can confirm GenoMax's last reply. I updated my version to 2.5.2 and it works with the commands mentioned.

                This new version adds a proxy setting to the 'vdb-config -i' window, which in my case I had to enable and fill in as 'proxy:port'. Otherwise, the process stayed stuck with no warnings.

                If you don't specify an output directory, the file is downloaded to your current working directory.

                Remember to add '--split-files' when you are downloading paired-end (PE) reads.
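Putting these two points together might look like the sketch below. `dump_paired` is a hypothetical helper name and `FASTQ_DUMP` is a made-up override variable (handy if the binary lives under /path_to/); `--split-files` and `--outdir` are the fastq-dump options for writing each read of a pair to its own file and for choosing the output directory.

```shell
#!/bin/sh
# Hypothetical helper: dump a paired-end run into a chosen directory.
# --split-files writes each read of a spot to its own file (_1, _2);
# --outdir sets the output directory instead of the current one.
dump_paired() {
    dp_outdir="$1"
    dp_acc="$2"
    if [ -z "$dp_outdir" ] || [ -z "$dp_acc" ]; then
        echo "usage: dump_paired OUTDIR ACCESSION" >&2
        return 2
    fi
    # Create the target directory up front; harmless if it already exists.
    mkdir -p "$dp_outdir" || return 1
    "${FASTQ_DUMP:-fastq-dump}" --split-files --outdir "$dp_outdir" "$dp_acc"
}
```

For example, `dump_paired ./fastq SRRxxxxxx` (with a real paired-end accession in place of the placeholder) should leave the two read files in ./fastq.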
