#1 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
Hi guys,
I am trying to download the metadata for all the studies submitted to SRA, but I am not able to find a complete list. Can anybody help me out with this? I want the metadata (mainly abstract and description), preferably in XML format, for all the studies/samples in SRA to date. Thanks in advance.
#2 |
Senior Member
Location: Cambridge UK Join Date: Sep 2008
Posts: 151
|
#3 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
Thanks for the reply.
I have checked it, but it doesn't contain the information I want, i.e. study abstract and description. It only contains IDs, which I already have. I am trying a few other things; let's hope they work.
#4 |
Member
Location: Cambridge, UK Join Date: Sep 2009
Posts: 37
|
Grab the full XML dump and parse it:
ftp://ftp-trace.ncbi.nlm.nih.gov/sra...0111101.tar.gz
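For anyone wanting a starting point, here is a rough Python sketch of the parsing step. It is not the poster's script. It assumes the dump has already been downloaded to a local path, that study files are named `*.study.xml` inside per-submission folders, and that each `STUDY` element carries its text under `DESCRIPTOR/STUDY_TITLE`, `DESCRIPTOR/STUDY_ABSTRACT` and `DESCRIPTOR/STUDY_DESCRIPTION`. Check those element names against the XML schema before relying on them. The function name `iter_study_records` is made up for this example.

```python
import tarfile
import xml.etree.ElementTree as ET


def iter_study_records(tar_path):
    """Yield (accession, title, abstract, description) for each STUDY found.

    Assumption: study metadata lives in members named "*.study.xml" and
    uses the DESCRIPTOR/STUDY_* element layout described above.
    """
    with tarfile.open(tar_path, "r:gz") as archive:
        for member in archive:
            if not (member.isfile() and member.name.endswith(".study.xml")):
                continue
            handle = archive.extractfile(member)
            try:
                root = ET.parse(handle).getroot()
            except ET.ParseError:
                continue  # skip malformed files rather than abort the whole scan
            for study in root.iter("STUDY"):
                yield (
                    study.get("accession"),
                    study.findtext("DESCRIPTOR/STUDY_TITLE"),
                    study.findtext("DESCRIPTOR/STUDY_ABSTRACT"),
                    study.findtext("DESCRIPTOR/STUDY_DESCRIPTION"),
                )
```

Missing elements come back as `None`, so studies without an abstract or description still show up in the output and can be filtered afterwards.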
#5 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
@vadim... thanks a lot.
This is what I was looking for. One issue remains: not every SRA ID has a study.xml, but that's OK. I can live with that.
#6 |
Member
Location: Cambridge, UK Join Date: Sep 2009
Posts: 37
|
What do you mean by SRA ID? Each SRA run should be associated with a study through an SRA experiment. The XML schema might be useful:
http://www.ncbi.nlm.nih.gov/viewvc/v...a/doc/SRA_1-3/
or
http://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_3/
Also have a look here for a complete XML dump (including EBI SRA):
http://ftp.sra.ebi.ac.uk/meta/xml/xml.all.tar.gz
#7 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
Both links go to the same page.
I understand SRA runs; they have IDs starting with SRR. For example, take this case:
http://www.ncbi.nlm.nih.gov/sra?term=SRA028225
http://www.ncbi.nlm.nih.gov/sra?term=SRA028192
http://www.ncbi.nlm.nih.gov/sra?term=SRA028059
Out of these, only the SRA028059 folder in the SRA metadata has a *.study.xml.
SRP = Study
SRX = Experiment
SRS = Sample
SRR = Run
But what basically is SRA for? I am confused here.
#8 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
This is a reply I got from a person at SRA:
The SRA number acts as a collector for the information. This means that when a center submits metadata or data they create a submission (SRAXXXXXX), but the data or metadata in the submission links to another submission.
This is OK with me, but I don't understand why there are separate SRA IDs for the same study, even if one has to submit more samples to the same study at a later stage.
#9 |
Member
Location: Cambridge, UK Join Date: Sep 2009
Posts: 37
|
SRA* accessions are NCBI submission accessions; similarly, ERA* accessions are EBI submission accessions and DRA* are DDBJ submission accessions.
Sometimes a study is submitted before the run data, but since metadata dumps are organized by submission accession, in such cases the run and study metadata end up in separate folders. To get the proper association, use the livelists:
NCBI: ftp://ftp-trace.ncbi.nlm.nih.gov/sra...Accessions.tab
EBI: ftp://ftp.sra.ebi.ac.uk/meta/list/livelist.gz
As for the submissions you are asking about, it appears that the run was submitted in SRA028225, the experiment in SRA028192 and the study in SRA028059.
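A rough Python sketch of using the NCBI tab file to tie runs back to studies, offered as an illustration only. It assumes the file is tab-separated with a header row containing `Accession`, `Submission`, `Type` and `Study` columns, that run rows have `Type` equal to `RUN`, and that a missing value is written as `-`. Verify all of that against the header of the actual file. The function name `index_by_study` is made up for this example.

```python
import csv
from collections import defaultdict


def index_by_study(lines):
    """Map each study accession to its runs and the submissions holding them.

    Assumption: columns named Accession, Submission, Type and Study exist,
    and "-" marks a missing value.
    """
    index = defaultdict(lambda: {"runs": set(), "submissions": set()})
    # QUOTE_NONE: treat quote characters in free-text columns as plain text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get("Type") != "RUN":
            continue  # only run rows are collected here
        study = row.get("Study")
        if not study or study == "-":
            continue  # run with no study recorded
        index[study]["runs"].add(row["Accession"])
        index[study]["submissions"].add(row["Submission"])
    return index
```

It takes any iterable of lines, so an open file handle works: `index_by_study(open("SRA_Accessions.tab", newline=""))`. Looking up one study accession then gives both its runs and the set of submission accessions those runs were deposited under.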
#10 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
Yeah, got it. That *.tab file is very useful indeed. Thanks.
#11 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
I guess I can conclude this:
"NCBI SRA doesn't have a single ID from which we can get everything related to it, i.e. study, run, experiment, sample. The only way to go about it is to use SRA_Accessions.tab: using the study ID (*RP*), get the *RA*(s), then using the *RA*(s), get the *RR*(s). There is no direct way of getting *RR* from *RP*."
RA = SRA Accessions
RR = Run
RP = Study
#12 |
Senior Member
Location: Birmingham, UK Join Date: Jul 2009
Posts: 356
|
Maybe not relevant for this question, but I am finding the DNAnexus SRA interface much nicer than NCBI's: http://sra.dnanexus.com/
#13 | |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
OK, will have a look at it as well.
#14 |
Member
Location: India Join Date: Jun 2011
Posts: 26
|
Can anybody explain this to me?
Many SRA (accession) IDs having the same SRP (study) ID is OK, as justified by an answer above. But many SRP (study) IDs having the same SRA (accession) ID?
#15 |
Member
Location: Maryland Join Date: Jan 2010
Posts: 14
|
You might take a look at:
http://www.bioconductor.org/packages...tml/SRAdb.html
While this is an R/Bioconductor package, the underlying data are stored in a SQLite database that can be downloaded separately and used directly or from any language with a SQLite driver (most languages). What is done to create the database is to download all the SRA XML files containing metadata, parse those files, and then load them into a relational database. This makes bulk operations on the data easier and more flexible since SQL can be used. Some full-text searching capabilities are also included since SQLite supports that in later versions.
Last edited by sdavis; 12-02-2011 at 10:21 AM.
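To illustrate the "any language with a SQLite driver" point, here is a small Python sketch using the standard `sqlite3` module. The table name `study` and the columns `study_accession`, `study_title` and `study_abstract` are assumptions about the database layout, so inspect the real schema first (for example with `SELECT name FROM sqlite_master`). The function name `studies_matching` is made up for this example.

```python
import sqlite3


def studies_matching(connection, keyword):
    """Return (accession, title, abstract) rows whose abstract mentions keyword.

    Assumption: a table called "study" with study_accession, study_title
    and study_abstract columns exists in the connected database.
    """
    query = (
        "SELECT study_accession, study_title, study_abstract "
        "FROM study WHERE study_abstract LIKE ? ORDER BY study_accession"
    )
    # The keyword is passed as a bound parameter, never pasted into the SQL.
    return connection.execute(query, (f"%{keyword}%",)).fetchall()
```

Usage would be along the lines of `studies_matching(sqlite3.connect("path/to/downloaded.sqlite"), "metagenome")`, where the file path is whatever the downloaded database was saved as.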
#16 |
Member
Location: Cambridge, UK Join Date: Sep 2009
Posts: 37
|