Currently, I'm looking into the Gene Expression Omnibus. Are there any other good websites that curate RNA-seq data (or perhaps provide links to where RNA-seq data can be found)? A bit of background: I'm looking for any publicly available RNA-seq data sets containing at least 10 individuals with some form of cancer (ideally breast cancer).
-
You can check the TCGA data portal for cancer sample data.
The Cancer Genome Atlas (TCGA) is a landmark cancer genomics program that sequenced and molecularly characterized over 11,000 cases of primary cancer samples. Learn more about how the program transformed the cancer research community and beyond.
There are access tiers so you may need to look through those: https://tcga-data.nci.nih.gov/tcga/tcgaAccessTiers.jsp
-
TCGA sequence data is censored. There is no public primary or metastatic tumor data for U.S. studies. There are cell lines. But, are cell lines cancer? Patients should be allowed to release their genomes (both tumor and normal) for public inspection. Hopefully future studies will accommodate them and there will be great benefits from making the data public.
-
ENCODE data
ENCODE RNA-seq data from human cell lines can be found here:
For raw .fastq Read1/Read2 files select fastqRd1/2 in the "View" column.
-
"Asian Gastric Cancer" : http://trace.ncbi.nlm.nih.gov/Traces...tudy=SRP012016
Great! I knew that Asian samples were showing up in GEO with Affy SNP6 data.
Good to see SOLiD RNA-seq data is now showing up in SRA.
(I wish NCBI SRA would turn off their "freeze the machine" ajax/javascript nonsense)
-
Thanks for all the great suggestions. So far, I've mostly been looking into TCGA (https://tcga-data.nci.nih.gov/tcga/findArchives.htm). I downloaded the BRCA RNASeqV2 dataset (https://tcga-data.nci.nih.gov/tcga/s...rchiveId=10418). Does anyone know how to determine whether each of the sample IDs came from a distinct individual? If they did, there would be ~800-900 individuals in this data set, which seems unlikely given the size of other data sets. I wish there were a way to tell which samples came from distinct individuals.
The ENCODE database also looks very promising. Here is a tool that I've been using to find RNASeq data: http://genome.crg.es/~jlagarde/encode_RNA_dashboard/
-
Note that TCGA BRCA (breast cancer) data from UNC is just the idf/sdrf MAGE-TAB files.
It's just a description of the data processing.
See here: http://tab2mage.sourceforge.net/docs/magetab_docs.html for MAGE-TAB info.
The RNA-Seq "Asian Gastric Cancer" samples can be downloaded here:
ftp://ftp-trace.ncbi.nlm.nih.gov/sra...012/SRP012016/
Use "wget -r" to get the whole thing.
The whole business of separating out the SRA study/sample/experiment/run levels is frustrating, but doable. (The scars heal eventually.)
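If it helps with the study/sample/experiment/run confusion: the accession prefix tells you which level you are looking at. The first letter is the archive (S for NCBI SRA, E for EBI, D for DDBJ) and the third letter is the level (P study, S sample, X experiment, R run). Here is a rough Python sketch of that rule; the function name is my own, it ignores submission accessions (SRA...), and the run accession in the usage note is only illustrative.

```python
import re

# Third letter of the accession encodes the level in the SRA hierarchy.
LEVELS = {"P": "study", "S": "sample", "X": "experiment", "R": "run"}

# First letter: S = NCBI SRA, E = EBI, D = DDBJ. Submission accessions
# (e.g. SRA...) are deliberately not handled in this sketch.
ACCESSION_RE = re.compile(r"^[SED]R([PSXR])\d+$")


def accession_level(accession):
    """Return 'study', 'sample', 'experiment' or 'run' for an accession."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"unrecognised accession: {accession!r}")
    return LEVELS[match.group(1)]
```

For example, accession_level("SRP012016") gives "study" for the gastric cancer project, while something like "SRR000001" would come back as "run", which is the level the actual read files hang off.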
-
Original data files for TCGA are available from CGHub: https://cghub.ucsc.edu/
You will have to apply to get access: https://cghub.ucsc.edu/get_access.html
All the samples should be unique. Breast cancer was one of the major types included so there are many samples.
Additional information here: http://www.ncbi.nlm.nih.gov/projects...hs000178.v5.p5
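On checking whether sample IDs map to distinct individuals: TCGA barcodes carry the participant in their first three dash-separated fields (TCGA, tissue source site, participant), and the first two digits of the fourth field give the sample type (01 primary tumor, 06 metastatic, 11 solid tissue normal). A tumor and a matched normal from the same patient therefore share the same prefix. A minimal Python sketch, assuming you have already pulled the barcodes out of the archive into a list; the barcodes below are made up for illustration and the function names are my own.

```python
from collections import defaultdict

# Subset of TCGA sample-type codes (first two digits of the 4th field).
SAMPLE_TYPES = {
    "01": "primary tumor",
    "06": "metastatic",
    "11": "solid tissue normal",
}


def participant_id(barcode):
    """Return the participant prefix, e.g. 'TCGA-AA-0001'."""
    fields = barcode.split("-")
    if len(fields) < 3 or fields[0] != "TCGA":
        raise ValueError(f"not a TCGA barcode: {barcode!r}")
    return "-".join(fields[:3])


def sample_type(barcode):
    """Return a readable sample type, or None if the barcode is too short."""
    fields = barcode.split("-")
    if len(fields) < 4:
        return None
    code = fields[3][:2]
    return SAMPLE_TYPES.get(code, f"other ({code})")


def group_by_participant(barcodes):
    """Map each participant prefix to the list of its sample barcodes."""
    groups = defaultdict(list)
    for barcode in barcodes:
        groups[participant_id(barcode)].append(barcode)
    return dict(groups)


# Made-up barcodes for illustration only (not real TCGA samples).
EXAMPLE_BARCODES = [
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0001-11A-11R-0000-07",
    "TCGA-BB-0002-01A-11R-0000-07",
]

if __name__ == "__main__":
    for participant, samples in group_by_participant(EXAMPLE_BARCODES).items():
        types = [sample_type(s) for s in samples]
        print(participant, len(samples), types)
```

Counting the keys of group_by_participant() gives the number of individuals; any participant with more than one barcode has multiple samples (typically tumor plus matched normal) in the set.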
-
Here's a dataset with 79 samples of RNA-seq for breast cancer patient samples.
https://www.ebi.ac.uk/ega/studies/EGAS00001000132
-
NB: The EMBL-EBI data is controlled access:
From https://www.ebi.ac.uk/ega/datasets/EGAD00001000113
Who controls access to this dataset
For each dataset that requires access control, there is a corresponding Data Access Committee (DAC) who determine access permissions. Data access is not the responsibility of the EGA. If you need to request access to this data set, please contact: Department of Molecular Oncology, BC Cancer Research Centre, Data Access Committee
-
Wow, the EBI dataset looks really promising. And it has a nice Nature journal article to accompany it too (http://www.nature.com/nature/journal...ture10933.html). Very cool.