SEQanswers


dazhudou1122 06-21-2019 08:02 AM

Download all assemblies in a bioproject
 
Dear SEQanswers community,

I am trying to download all the assemblies in a BioProject: https://www.ncbi.nlm.nih.gov/bioproject/?term=474907 Can anyone tell me how to download them all without manually copying each link and downloading the assemblies one at a time like this:

wget --recursive -e robots=off --reject "index.html" --no-host-directories --cut-dirs=6 ftp://ftp.ncbi.nlm.nih.gov/genomes/a....1_ASM479397v1 ./

Any info will be greatly appreciated!

Best,

Wenhan

Richard Finney 06-21-2019 10:04 AM

Here's an easy way to do it, though not quite a one liner ...

1) Go to the "run selector" for your project of interest:

https://www.ncbi.nlm.nih.gov/Traces/...re&query_key=2


2) Download the "runinfo table" (to a file called SraRunTable.txt)

3) Create a URL from each SRR id (example: wget ftp://ftp-trace.ncbi.nih.gov/sra/sra.../SRR001115.sra ), then use these URLs with wget in a script, like this:

cut -f10 SraRunTable.txt | awk '/^SRR/ {print "wget ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/"substr($1,1,6)"/"$1"/"$1".sra"}' | bash


Note that "cut -f10" selects the field with the SRR ids; adjust the number if the ids are in a different column of your table.
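
If you would rather check the URLs before downloading anything, the same steps can be sketched as two small shell functions. This is only a rough sketch: the names srr_url and list_urls are made up for illustration, the FTP path layout is copied from the one-liner above and may not match what NCBI currently serves, and the column number is whatever holds the SRR ids in your table.

```shell
# Build the download URL for one SRR accession.
# Assumption: the path layout is the one used in the one-liner above.
srr_url() {
    # %.6s keeps the first six characters, like substr($1,1,6) in the awk version
    printf 'ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/%.6s/%s/%s.sra\n' "$1" "$1" "$1"
}

# Print one URL per SRR accession found in a tab-separated run table.
# $1 = table file, $2 = 1-based column holding the SRR ids (10 in the post above).
# Header rows and anything that does not look like an SRR id are skipped.
list_urls() {
    cut -f"$2" "$1" | grep -E '^SRR[0-9]+$' | while read -r id; do
        srr_url "$id"
    done
}
```

For example, "list_urls SraRunTable.txt 10 > urls.txt" writes the list to a file you can inspect, and "wget -i urls.txt" then fetches everything in it.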

GenoMax 06-21-2019 11:05 AM

Cross posted and answered on Biostars: https://www.biostars.org/p/385930/


All times are GMT -8.
