SEQanswers

Old 05-17-2014, 06:48 AM   #1
watermark
Member
 
Location: Germany

Join Date: Nov 2013
Posts: 20
Question Which kind of microRNA expression is measured in TCGA?

Hi all,

I have downloaded the mRNA and microRNA NGS data from TCGA for BRCA.
To download the microRNA data, I chose miRNASeq in the filter settings.
Now I have the expression levels of miRNAs in different samples.
My question is: which kind of microRNA did they quantify? Is it all mature miRNA, or are miRNA precursors included as well? Did they sequence miRNA isolated by gel electrophoresis (to make sure that all molecules have the same length, as expected for mature miRNA)?
When I look at the miRNA IDs, though, there is a problem:
for example, they report an expression level for hsa-mir-135a-2. When I search for it in miRBase, it is a stem-loop, and its mature form in miRBase is hsa-miR-135a-5p. So now I really struggle to understand which type of miRNA the expression levels refer to.

Could someone clarify this?
Old 11-22-2014, 02:50 AM   #2
GenePool
Registered Vendor
 
Location: San Francisco, CA

Join Date: Mar 2014
Posts: 18
Default

If you look at the "DESCRIPTION.TXT" files included in each package, there is the following description:

This data archive contains the miRNA expression data for cancer samples
of The Cancer Genome Atlas (TCGA) project. The experiments were
performed by the BCCA Genome Sciences Centre in BC using the miRNA-Seq approach
on the Illumina platform.

Please see DESCRIPTION.txt in the mage-tab for algorithm description of the data protocols.

The .adf file format describing miRNA annotations is as follows:

MiRNA ID
miRBase version
genome version and coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
precursor sequence
mature strand coordinates relative to precursor coordinates, as <relative start>-<relative end>
mature strand accession
alternate mature strand coordinates, if provided by miRBase
alternate mature form accession
star strand name, if provided by miRBase
star strand form accession

The .mirna.quantification.txt data file describing summed expression for each miRNA is as follows:

miRNA name
raw read count
reads per million miRNA reads
cross-mapped to other miRNA forms (Y or N)

The .isoform.quantification.txt data file describing every individual sequence isoform observed is as follows:

miRNA name
alignment coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
raw read count
reads per million miRNA reads
cross-mapped to other miRNA forms (Y or N)
region within miRNA

------------------

When one looks at the associated files such as *.isoform.quantification.txt and *.mirna.quantification.txt, one can get a sense of the coordinates for which counts were calculated.

As an example, let's take a look at "hsa-let-7a-1" below. The last column labels each isoform as mature or star. In column two, you'll see that the reads for the mature miRNA of hsa-let-7a-1 all align within a window of about 25 nucleotides (hg19:9:96938242-96938268:+), and the reads for the star miRNA of hsa-let-7a-1 align within a similar window (hg19:9:96938292-96938317:+).

From *.isoform.quantification.txt for one sample:

hsa-let-7a-1 hg19:9:96938242-96938264:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938242-96938266:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938264:+ 8 2.991692 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938265:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938266:+ 6 2.243769 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938263:+ 157 58.711957 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938264:+ 5196 1943.104000 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938265:+ 3954 1478.643806 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938266:+ 7029 2628.575446 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938267:+ 278 103.961299 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938268:+ 18 6.731307 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938264:+ 9 3.365654 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938265:+ 16 5.983384 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938266:+ 44 16.454306 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938267:+ 3 1.121885 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938246-96938264:+ 1 0.373962 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938265:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938266:+ 10 3.739615 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938267:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938248-96938266:+ 1 0.373962 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938248-96938267:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938292-96938311:+ 1 0.373962 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938314:+ 3 1.121885 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938315:+ 6 2.243769 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938316:+ 13 4.861500 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938317:+ 20 7.479230 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938296-96938316:+ 3 1.121885 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938296-96938317:+ 1 0.373962 N star,MIMAT0004481

From *.mirna.quantification.txt for the same sample

miRNA_ID read_count reads_per_million_miRNA_mapped cross-mapped
hsa-let-7a-1 16795 6280.683542 N

==================

The count of 16795 in *.mirna.quantification.txt for hsa-let-7a-1 is literally the sum of all the counts mapped to the 25-nucleotide-ranges for the mature and star regions of hsa-let-7a-1.
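
The arithmetic can be checked with a short Python sketch. The counts are copied from the hsa-let-7a-1 isoform listing above, with coordinates shortened to start-end; the variable and function names are only illustrative. The last step is an assumption based on the column description: it treats reads_per_million_miRNA_mapped as read_count divided by the sample's total miRNA-mapped reads, times one million, and uses that to back out the implied total.

```python
from collections import defaultdict

# "start-end read_count region" for every hsa-let-7a-1 isoform listed above.
ISOFORM_ROWS = """\
96938242-96938264 4 mature
96938242-96938266 4 mature
96938243-96938264 8 mature
96938243-96938265 4 mature
96938243-96938266 6 mature
96938244-96938263 157 mature
96938244-96938264 5196 mature
96938244-96938265 3954 mature
96938244-96938266 7029 mature
96938244-96938267 278 mature
96938244-96938268 18 mature
96938245-96938264 9 mature
96938245-96938265 16 mature
96938245-96938266 44 mature
96938245-96938267 3 mature
96938246-96938264 1 mature
96938247-96938265 2 mature
96938247-96938266 10 mature
96938247-96938267 2 mature
96938248-96938266 1 mature
96938248-96938267 2 mature
96938292-96938311 1 star
96938295-96938314 3 star
96938295-96938315 6 star
96938295-96938316 13 star
96938295-96938317 20 star
96938296-96938316 3 star
96938296-96938317 1 star
"""


def sum_counts_by_region(rows_text):
    """Add up raw read counts separately for each region label."""
    totals = defaultdict(int)
    for line in rows_text.splitlines():
        _span, count, region = line.split()
        totals[region] += int(count)
    return dict(totals)


totals = sum_counts_by_region(ISOFORM_ROWS)
grand_total = sum(totals.values())
print(totals)       # {'mature': 16748, 'star': 47}
print(grand_total)  # 16795, the read_count in *.mirna.quantification.txt

# Assumption: RPM = read_count / total miRNA-mapped reads * 1e6.
implied_library_size = 16795 / 6280.683542 * 1_000_000
print(round(implied_library_size))  # roughly 2.67 million miRNA-mapped reads
```

In this sample the mature strand accounts for nearly all of the reads; the star strand contributes only 47 of the 16795.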

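So in conclusion, although each row is named after a stem loop (e.g. hsa-let-7a-1), the values represent expression for the mature miRNAs (mature plus star strand reads), not the stem loops themselves.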

Incidentally, if you're interested, we have imported all of the miRNA-Seq data along with all of the other TCGA assays into GenePool and linked in all of the patient & sample metadata. GenePool makes it very simple to slice and dice the samples according to patient characteristics and clinical metadata, then run analyses.

Here are the links to related threads:

http://seqanswers.com/forums/showthread.php?t=48485
http://seqanswers.com/forums/showthread.php?t=42471

Good Luck!

------------------------------
GenePool is making genomics data management, analysis, and sharing easier!
Products @ www.stationxinc.com

Last edited by GenePool; 11-23-2014 at 08:26 PM.
Old 02-02-2015, 12:01 AM   #3
LisaXiao
Junior Member
 
Location: Shanghai,China

Join Date: Feb 2015
Posts: 1
Default Hey

Hey, this problem troubles me too. However, after checking as you described, it seems that the values represent expression for the mature miRNAs PLUS the stem loops and PRECURSORS.





Attached Images
File Type: png ??1.PNG (51.1 KB, 6 views)
File Type: png ??.PNG (2.6 KB, 2 views)

Tags
bioinformatic analaysis, ngs 3g, ngs data analysis, rna sequencing, tcga
