Download all assemblies in a bioproject
Dear SEQanswers community,
I am trying to download all the assemblies in a BioProject: https://www.ncbi.nlm.nih.gov/bioproject/?term=474907

Can anyone tell me how to download them all at once, without manually copying each link and downloading each assembly like this:

wget --recursive -e robots=off --reject "index.html" --no-host-directories --cut-dirs=6 ftp://ftp.ncbi.nlm.nih.gov/genomes/a....1_ASM479397v1 ./

Any info will be greatly appreciated!

Best,
Wenhan
Here's an easy way to do it, though not quite a one liner ...
1) Go to the "run selector" for your project of interest: https://www.ncbi.nlm.nih.gov/Traces/...re&query_key=2
2) Download the "runinfo table" (to a file called SraRunTable.txt).
3) Create a URL based on each SRR ID. Example:

wget ftp://ftp-trace.ncbi.nih.gov/sra/sra.../SRR001115.sra

Use these URLs with wget in a script, like this:

cat SraRunTable.txt | cut -f10 | awk '{print "wget ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/"substr($1,1,6)"/"$1"/"$1".sra"}' | bash

Note that "cut -f10" selects the field with the SRR IDs.
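A slightly more defensive variant of the same idea, as a rough sketch only: it looks up the accession column by its header name instead of relying on field 10, and prints the URLs rather than piping generated commands straight into bash. The function name sra_urls and the header name "Run" are assumptions made for illustration, as is the tab-separated layout (implied by "cut -f10"). The FTP path is copied from the one-liner in this post and may have changed since it was written.

```shell
#!/usr/bin/env bash
# Sketch: print one .sra URL per accession listed in a tab-separated run table.
# sra_urls is a hypothetical helper name; "Run" is an assumed header name.
sra_urls() {
  local table="$1" column="${2:-Run}"
  awk -F'\t' -v col="$column" '
    NR == 1 {
      # Find the column whose header matches the requested name.
      for (i = 1; i <= NF; i++) if ($i == col) c = i
      if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    # URL layout taken from the post: first six characters, then the full ID twice.
    $c != "" {
      printf "ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/%s/%s/%s.sra\n", substr($c, 1, 6), $c, $c
    }
  ' "$table"
}
```

To use it, write the list to a file with "sra_urls SraRunTable.txt > urls.txt", check a few entries by hand, then run "wget -i urls.txt". Keeping the URL list as a separate file makes it easy to inspect or resume before anything is downloaded.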
Cross posted and answered on Biostars: https://www.biostars.org/p/385930/