SEQanswers

Old 10-21-2014, 01:24 PM   #1
julio514
Member
 
Location: Montreal, QC

Join Date: May 2011
Posts: 12
Fetch fastqs from basespace with command line

Dear seqanswers community,
I'd like to know if any of you has ever downloaded fastqs from BaseSpace from the command line. If so, I'd really appreciate some help or a pointer in the right direction. Just to clarify, I don't want to use any BaseSpace apps at all: a collaborator's data sits in BaseSpace and I just want to download it to my HPC cluster.

Many thanks,
Julio
Old 10-21-2014, 04:02 PM   #2
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

See https://gist.github.com/lh3/54f535b11a9ee5d3be8e
Old 10-21-2014, 04:55 PM   #3
julio514
Member
 
Location: Montreal, QC

Join Date: May 2011
Posts: 12

Thanks lh3. This works! Do you know if there is a way to download all .fastq.gz files for a given run? I found out how to download bcl files, but I would like to get fastqs instead.
Cheers,
Old 10-21-2014, 05:17 PM   #4
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

I don't know how, but there must be ways with their APIs.
Old 10-21-2014, 11:41 PM   #5
dariober
Senior Member
 
Location: Cambridge, UK

Join Date: May 2010
Posts: 310

Quote:
Originally Posted by julio514 View Post
Dear seqanswers community,
I'd like to know if any of you has ever downloaded fastqs from BaseSpace from the command line. If so, I'd really appreciate some help or a pointer in the right direction. Just to clarify, I don't want to use any BaseSpace apps at all: a collaborator's data sits in BaseSpace and I just want to download it to my HPC cluster.

Many thanks,
Julio
I've used BaseSpaceR to get fastq files via R. (This was a while ago.)

If I remember correctly and things haven't changed, it's a bit long-winded to get started, as you need to get an access token. Also, it's not enough to have a run shared with you: the owner of the project has to share the entire project (I think...). Then something along these lines should work:

Code:
library(BaseSpaceR)

ACCESS_TOKEN <- 'dd9...mytoken...43'
PROJECT_ID <- '123456'  ## Get the project ID from the URL of the project

## Authenticate with the pre-generated access token
aAuth <- AppAuth(access_token = ACCESS_TOKEN)

## Fetch the project, then list its samples
selProj <- Projects(aAuth, id = PROJECT_ID, simplify = TRUE)
sampl <- listSamples(selProj, limit = 1000)
inSample <- Samples(aAuth, id = Id(sampl), simplify = TRUE)

## Download the .gz files (the fastq.gz files) of each sample into outdir/
for (s in inSample) {
    f <- listFiles(s, Extensions = ".gz")
    print(Name(f))
    getFiles(aAuth, id = Id(f), destDir = 'outdir/', verbose = TRUE)
}
Old 10-22-2014, 04:57 AM   #6
julio514
Member
 
Location: Montreal, QC

Join Date: May 2011
Posts: 12

Thanks dariober, works like a charm!
Old 12-06-2014, 11:15 AM   #7
jdv
Junior Member
 
Location: Wisconsin, USA

Join Date: Dec 2014
Posts: 3

A while ago I wrote an interactive BaseSpace command-line client for downloading data to headless servers. You might wish to try it as an alternative to the other options here. It has an FTP-like interface that allows you to browse the data in your account based on the Project/Sample/File hierarchy currently used. You can download individual files or entire projects at a time. As with the other alternatives, you need to obtain a developer's access token to use with it, but it can store this token to disk using symmetric encryption to make future use a bit easier. It's meant for interactive use, so if you want batch or scripting capabilities use one of the other suggestions.

I just got around today to uploading it to SourceForge:

http://sourceforge.net/projects/bsfetch/

It is written in Perl and I use it on Linux. I've also tested it briefly on Windows, where it seems to work with the exception of password masking on the command line.

Last edited by jdv; 12-06-2014 at 11:19 AM.
Old 12-07-2014, 07:31 AM   #8
jdv
Junior Member
 
Location: Wisconsin, USA

Join Date: Dec 2014
Posts: 3

Also, a quick note for anyone using lh3's method of manually constructing URLs: as already mentioned, this exposes your access_token in the process table to anyone else on the same machine (e.g. with top or ps). Because the token is part of the request URL, it can also end up in shell history and in server or proxy logs, even though the URL itself is encrypted in transit over HTTPS.

A safer way to send the request is to specify the access token in the 'x-access-token' HTTP header, which is encrypted in the SSL connection and never appears in the URL. See here:

https://developer.basespace.illumina...e_Access_Token

A modification of lh3's method using curl instead of wget would be like this:

Code:
curl -L -J --config token_header.txt https://api.basespace.illumina.com/v1pre3/files/YOUR-FILE-ID/content -O
where the arbitrarily named 'token_header.txt' contains something like this:

Code:
header = "x-access-token: YOUR-TOKEN-HERE"
This keeps the token out of the process table and out of the request URL. Obviously, you won't want to leave the 'token_header.txt' file sitting around on shared disk space.
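
One way the header file could be kept private and short-lived is sketched below. This is only a sketch under assumptions: a POSIX shell, and the token and file ID supplied in the environment variables BASESPACE_TOKEN and FILE_ID (both names are hypothetical, not part of BaseSpace or curl). In shells where printf is a built-in (bash and dash, for example), writing the file this way does not put the token on the command line of an external program.

```shell
#!/bin/sh
# Sketch: keep the BaseSpace token in a private, short-lived curl config file.
# BASESPACE_TOKEN and FILE_ID are assumed (hypothetical) environment variables.

# Write a curl config file holding the x-access-token header.
# umask 077 means a newly created file gets mode 600 (owner read/write only).
write_token_header() {
    token=$1
    path=$2
    ( umask 077 && printf 'header = "x-access-token: %s"\n' "$token" > "$path" )
}

# Download one file by ID, sending the token via the config file
# instead of a URL parameter.
fetch_file() {
    file_id=$1
    config=$2
    curl -L -J -O --config "$config" \
        "https://api.basespace.illumina.com/v1pre3/files/${file_id}/content"
}

# Run the download only when both variables are set.
if [ -n "${BASESPACE_TOKEN:-}" ] && [ -n "${FILE_ID:-}" ]; then
    config=$(mktemp) || exit 1
    # Delete the config file when the script exits, even after an error.
    trap 'rm -f "$config"' EXIT
    write_token_header "$BASESPACE_TOKEN" "$config"
    fetch_file "$FILE_ID" "$config"
fi
```

The temporary file made by mktemp is readable only by its owner, and the trap removes it at exit, so nothing is left behind on shared disk space.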
Old 12-07-2014, 08:49 AM   #9
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

Nice tips. Thanks!
Old 04-15-2015, 07:00 AM   #10
SF_mallish
Member
 
Location: Champaign-Urbana

Join Date: Jan 2011
Posts: 10

Quote:
Originally Posted by julio514 View Post
Thanks lh3. This works! Do you know if there is a way to download all .fastq.gz files for a given run? I found out how to download bcl files, but I would like to get fastqs instead.
Cheers,
It might be too late, but the first link lh3 provided in his gist has a Python script that lets you download all files in a run by specifying the run ID.
https://support.basespace.illumina.c...run-downloader

I also have a small Python script that lets you download all fastq files in a project by specifying the project name and access token.
https://github.com/yu68/tools/tree/m...aseSpace-tools
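
If you already know the IDs of the files you want, the curl approach from post #8 can also be looped over a list. A minimal sketch, assuming a file named file_ids.txt with one file ID per line (a hypothetical name) and the token_header.txt curl config described in post #8; it does not look up the IDs for you:

```shell
#!/bin/sh
# Sketch: download several BaseSpace files by ID with one curl config file.
# file_ids.txt and token_header.txt are assumed (hypothetical) file names.

API_BASE="https://api.basespace.illumina.com/v1pre3"

# Print the content URL for one file ID.
content_url() {
    printf '%s/files/%s/content\n' "$API_BASE" "$1"
}

# Read file IDs (one per line) from $1 and fetch each one,
# passing the token through the curl config file given as $2.
download_all() {
    ids_file=$1
    config=$2
    while IFS= read -r id; do
        # Skip blank lines.
        [ -n "$id" ] || continue
        # Stop at the first failed download; keep curl away from the loop's stdin.
        curl -L -J -O --config "$config" "$(content_url "$id")" </dev/null || return 1
    done < "$ids_file"
}

# Run only when both input files are present in the current directory.
if [ -f file_ids.txt ] && [ -f token_header.txt ]; then
    download_all file_ids.txt token_header.txt
fi
```

For whole projects or runs, the Python scripts linked in this thread are probably the easier route, since they discover the file IDs themselves.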
Old 10-25-2016, 10:22 AM   #11
julio514
Member
 
Location: Montreal, QC

Join Date: May 2011
Posts: 12

These Python scripts also work well (tested):
https://github.com/nh13/basespace-invaders
Cheers,

Tags
basespace, command line, download, fastq, fetch
