  • julio514
    Member
    • May 2011
    • 12

    Fetch fastqs from basespace with command line

    Dear seqanswers community,
    I'd like to know whether any of you has ever downloaded FASTQ files from BaseSpace from the command line. If so, I'd really appreciate some help or a pointer in the right direction. Just to clarify, I don't want to use any BaseSpace apps at all: a collaborator's data sits in BaseSpace and I just want to download it to my HPC cluster.

    Many thanks,
    Julio
  • lh3
    Senior Member
    • Feb 2008
    • 686

    #2
    See https://gist.github.com/lh3/54f535b11a9ee5d3be8e

    • julio514
      Member
      • May 2011
      • 12

      #3
      Thanks lh3. This works! Do you know if there is a way to download all .fastq.gz files for a given run? I found out how to download BCL files, but would like to get FASTQ files instead.
      Cheers,

      • lh3
        Senior Member
        • Feb 2008
        • 686

        #4
        I don't know how, but there must be ways with their APIs.

        • dariober
          Senior Member
          • May 2010
          • 311

          #5
          I've used BaseSpaceR to get FASTQ files via R. (This was a while ago.)

          If I remember correctly and things haven't changed, it's a bit long-winded to get started because you need to obtain an access token. Also, it's not enough to have a run shared with you; the owner of the project has to share the entire project (I think...). Then something along these lines should work:

          Code:
          library(BaseSpaceR)
          ACCESS_TOKEN <- 'dd9...mytoken...43'
          PROJECT_ID <- '123456'  ## Get proj ID from url of the project

          ## Authenticate with the token and select the shared project
          aAuth <- AppAuth(access_token = ACCESS_TOKEN)
          selProj <- Projects(aAuth, id = PROJECT_ID, simplify = TRUE)
          ## List the samples in the project, then fetch each sample object
          sampl <- listSamples(selProj, limit = 1000)
          inSample <- Samples(aAuth, id = Id(sampl), simplify = TRUE)
          ## For each sample, list its .gz files and download them to outdir/
          for (s in inSample) {
              f <- listFiles(s, Extensions = ".gz")
              print(Name(f))
              getFiles(aAuth, id = Id(f), destDir = 'outdir/', verbose = TRUE)
          }
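
For anyone who prefers to avoid R, the same project -> samples -> files walk could probably be done against the REST API directly. The following is only a rough sketch, not a tested client: the base URL is the one used elsewhere in this thread, but the `projects/<id>/samples` and `samples/<id>/files` paths, the `Limit` parameter, and the `Response`/`Items`/`Id`/`Name` layout of the JSON are assumptions that should be checked against the BaseSpace API documentation. All function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base URL as it appears elsewhere in this thread.
API_ROOT = "https://api.basespace.illumina.com/v1pre3"


def api_get(path, token, **params):
    """GET an API path and return the parsed JSON body (performs network I/O)."""
    url = f"{API_ROOT}/{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    # The token is sent in a header rather than in the URL.
    req = urllib.request.Request(url, headers={"x-access-token": token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def items(payload):
    """Return the entries of a listing response.

    ASSUMPTION: listings are wrapped as {"Response": {"Items": [...]}}.
    """
    return payload.get("Response", {}).get("Items", [])


def fastq_files(file_items):
    """Keep only entries whose Name ends in .fastq.gz."""
    return [f for f in file_items if f.get("Name", "").endswith(".fastq.gz")]


def list_project_fastqs(project_id, token, limit=1000):
    """Walk project -> samples -> files and return (file_id, name) pairs.

    ASSUMPTION: the endpoint paths and the Limit parameter below are
    illustrative; verify them before relying on this.
    """
    found = []
    samples = items(api_get(f"projects/{project_id}/samples", token, Limit=limit))
    for sample in samples:
        files = items(api_get(f"samples/{sample['Id']}/files", token, Limit=limit))
        found.extend((f["Id"], f["Name"]) for f in fastq_files(files))
    return found
```

Each file ID returned by `list_project_fastqs` could then be fetched from the `files/<id>/content` endpoint.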

          • julio514
            Member
            • May 2011
            • 12

            #6
            Thanks dariober, works like a charm!

            • jdv
              Junior Member
              • Dec 2014
              • 3

              #7
              A while ago I wrote an interactive BaseSpace command-line client for downloading data to headless servers. You might wish to try it as an alternative to the other options here. It has an FTP-like interface that allows you to browse the data in your account based on the Project/Sample/File hierarchy currently used. You can download individual files or entire projects at a time. As with the other alternatives, you need to obtain a developer's access token to use with it, but it can store this token to disk using symmetric encryption to make future use a bit easier. It's meant for interactive use, so if you want batch or scripting capabilities use one of the other suggestions.

              I just got around today to uploading it to SourceForge:

              bsclient is an interactive text-based client for browsing and downloading files from Illumina BaseSpace. It has a simple interface similar to FTP and can be used to easily download files onto a remote server or in any situation where the web-based interface is not accessible or desirable.


              It is written in Perl and I use it on Linux. I've also tested it briefly on Windows, where it seems to work with the exception of password masking on the command line.
              Last edited by jdv; 12-06-2014, 12:19 PM.

              • jdv
                Junior Member
                • Dec 2014
                • 3

                #8
                Also, a quick note for anyone using lh3's method to manually construct URLs: not only does this expose your access_token in the process table to anyone else on the same machine (e.g. with top, ps, etc.), as already mentioned, but the token also becomes part of the request URL. URLs are commonly recorded in shell history and in server or proxy logs, and if the request is ever sent over plain HTTP the token can be trivially captured by anyone watching the network traffic.

                The secure way of sending the request is with the access token specified in the 'x-access-token' HTTP header, which is encrypted in the SSL connection. See here:



                A modification of lh3's method using curl instead of wget would be like this:

                Code:
                curl -L -J --config token_header.txt https://api.basespace.illumina.com/v1pre3/files/YOUR-FILE-ID/content -O
                where the arbitrarily named 'token_header.txt' contains something like this:

                Code:
                header = "x-access-token: YOUR-TOKEN-HERE"
                This prevents snooping via the process table or network traffic. Obviously you won't want to leave the 'token_header.txt' file sitting around on shared disk space.
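
The same header-based request can be sketched in Python with only the standard library. This is a minimal illustration, not a drop-in tool: the endpoint is the one from the curl command above, while the function names, the token-file check, and the error message are all hypothetical.

```python
import os
import shutil
import stat
import urllib.request

# Endpoint pattern taken from the curl example in this post.
CONTENT_URL = "https://api.basespace.illumina.com/v1pre3/files/{file_id}/content"


def read_token(path):
    """Read the token from a file, refusing files readable by group or others.

    The permission check relies on POSIX mode bits (Linux/macOS).
    """
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} is accessible to other users; chmod 600 it first")
    with open(path) as fh:
        return fh.read().strip()


def build_request(file_id, token):
    """Build a request that carries the token in a header, never in the URL."""
    req = urllib.request.Request(CONTENT_URL.format(file_id=file_id))
    req.add_header("x-access-token", token)
    return req


def download(file_id, token, dest):
    """Stream the file content to dest (performs network I/O).

    urllib follows HTTP redirects by default, similar to curl -L.
    """
    with urllib.request.urlopen(build_request(file_id, token)) as resp, \
            open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Usage would be something like `download("YOUR-FILE-ID", read_token("token.txt"), "reads.fastq.gz")`. Because the token is read from a file inside the process, it never appears on the command line.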

                • lh3
                  Senior Member
                  • Feb 2008
                  • 686

                  #9
                  Nice tips. Thanks!

                  • SF_mallish
                    Member
                    • Jan 2011
                    • 10

                    #10
                    Originally posted by julio514:
                    Thanks lh3. This works! Do you know if there is a way to download all .fastq.gz for a given run. I found out how to download bcl files, but would like to get fastqs instead.
                    Cheers,
                    It might be too late, but the first link that lh3 provided in his gist leads to a Python script that lets you download all the files in one run by specifying the run ID.


                    I also have a small Python script that lets you download all FASTQ files in one project by specifying the project name and access token.

                    • julio514
                      Member
                      • May 2011
                      • 12

                      #11
                      These Python scripts also work well (tested):

                      Cheers,

                      • seahym
                        Junior Member
                        • Jun 2018
                        • 1

                        #12
                        Too late to answer OP, but in case anyone else needs an alternative, the BaseSpace command line interface was really straightforward to install and use on Ubuntu. To download fastq files by project:

                        Code:
                        bs download project --id <project_id> -o <target_directory>
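
If several projects need to be fetched in one go, the command above can be wrapped in a small script. This is just a sketch: it uses only the flags shown in the command above, and it assumes the `bs` executable is on your PATH and already authenticated. The function names and the one-directory-per-project layout are made up for illustration.

```python
import subprocess


def bs_download_cmd(project_id, target_dir):
    """Build the argument list for: bs download project --id <id> -o <dir>."""
    return ["bs", "download", "project", "--id", str(project_id), "-o", str(target_dir)]


def download_projects(project_ids, root_dir):
    """Run one download per project; stops at the first failing command."""
    for pid in project_ids:
        # Hypothetical layout: each project goes to <root_dir>/<project_id>.
        subprocess.run(bs_download_cmd(pid, f"{root_dir}/{pid}"), check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems with unusual directory names.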
