  • Which kind of microRNA expression is measured in TCGA?

    Hi all,

    I have downloaded the mRNA and microRNA NGS data for BRCA from TCGA.
    To download the microRNA data, I chose miRNASeq in the filter settings.
    Now I have the expression levels of miRNAs in different samples.
    My question is: which kind of microRNA is quantified? Is it all mature miRNA, or are miRNA precursors included as well? Did they sequence miRNA isolated by gel electrophoresis (to make sure that all of them have the same length, as expected for mature miRNA)?
    When I look at the miRNA IDs, there is a problem.
    For example, they report an expression level for hsa-mir-135a-2. When I search for it in miRBase, it is a stem-loop, and its mature form in miRBase is hsa-miR-135a-5p. So now I am struggling to understand which type of miRNA the expression levels refer to.

    Could someone clarify this?

  • #2
    If you look at the "DESCRIPTION.TXT" files included in each package, there is the following description:

    This data archive contains the miRNA expression data for cancer samples
    of The Cancer Genome Atlas (TCGA) project. The experiments were
    performed by the BCCA Genome Sciences Centre in BC using the miRNA-Seq approach
    on the Illumina platform.

    Please see DESCRIPTION.txt in the mage-tab for algorithm description of the data protocols.

    The .adf file format describing miRNA annotations is as follows:

    MiRNA ID
    miRBase version
    genome version and coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
    precursor sequence
    mature strand coordinates relative to precursor coordinates, as <relative start>-<relative end>
    mature strand accession
    alternate mature strand coordinates, if provided by miRBase
    alternate mature form accession
    star strand name, if provided by miRBase
    star strand form accession

    The .mirna.quantification.txt data file describing summed expression for each miRNA is as follows:

    miRNA name
    raw read count
    reads per million miRNA reads
    cross-mapped to other miRNA forms (Y or N)

    The .isoform.quantification.txt data file describing every individual sequence isoform observed is as follows:

    miRNA name
    alignment coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
    raw read count
    reads per million miRNA reads
    cross-mapped to other miRNA forms (Y or N)
    region within miRNA
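
As a rough illustration of the isoform column layout described above, here is a minimal Python sketch that splits one row into named fields. The `IsoformRecord` class and `parse_isoform_line` function are hypothetical names made up for this example, and the sketch assumes whitespace-delimited fields in the order listed; the accession after the comma in the region field is treated as optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IsoformRecord:
    """One row of an isoform quantification file (hypothetical container)."""
    mirna_id: str
    genome: str
    chrom: str
    start: int
    end: int
    strand: str
    read_count: int
    rpm: float
    cross_mapped: bool
    region: str
    accession: Optional[str]


def parse_isoform_line(line: str) -> IsoformRecord:
    """Parse one whitespace-delimited isoform row into an IsoformRecord."""
    mirna_id, coords, count, rpm, cross, region_field = line.split()
    # Coordinates look like hg19:9:96938244-96938266:+
    genome, chrom, span, strand = coords.split(":")
    start, end = span.split("-")
    # The region field looks like "mature,MIMAT0000062"; if there is no
    # comma, the accession is left as None.
    region, _, accession = region_field.partition(",")
    return IsoformRecord(
        mirna_id=mirna_id,
        genome=genome,
        chrom=chrom,
        start=int(start),
        end=int(end),
        strand=strand,
        read_count=int(count),
        rpm=float(rpm),
        cross_mapped=(cross == "Y"),
        region=region,
        accession=accession or None,
    )


record = parse_isoform_line(
    "hsa-let-7a-1 hg19:9:96938244-96938266:+ 7029 2628.575446 N mature,MIMAT0000062"
)
print(record.region, record.accession, record.read_count)
```

Keeping the region and the accession as separate fields makes it easy to group reads by mature versus star form later on.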

    ------------------

    When one looks at the associated files such as *.isoform.quantification.txt and *.mirna.quantification.txt, one can get a sense of the coordinates for which counts were calculated.

    As an example, let's take a look at "hsa-let-7a-1" below. You'll notice that the mature and star forms are mentioned in the last column. If you look at column two, you'll see that the reads all align within a window of roughly 25 nucleotides (hg19:9:96938242-96938268:+) for the mature miRNA of hsa-let-7a-1 and a window of 25 nucleotides (hg19:9:96938292-96938317:+) for the star miRNA of hsa-let-7a-1.

    From *.isoform.quantification.txt for one sample:

    hsa-let-7a-1 hg19:9:96938242-96938264:+ 4 1.495846 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938242-96938266:+ 4 1.495846 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938243-96938264:+ 8 2.991692 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938243-96938265:+ 4 1.495846 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938243-96938266:+ 6 2.243769 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938263:+ 157 58.711957 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938264:+ 5196 1943.104000 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938265:+ 3954 1478.643806 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938266:+ 7029 2628.575446 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938267:+ 278 103.961299 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938244-96938268:+ 18 6.731307 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938245-96938264:+ 9 3.365654 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938245-96938265:+ 16 5.983384 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938245-96938266:+ 44 16.454306 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938245-96938267:+ 3 1.121885 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938246-96938264:+ 1 0.373962 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938247-96938265:+ 2 0.747923 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938247-96938266:+ 10 3.739615 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938247-96938267:+ 2 0.747923 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938248-96938266:+ 1 0.373962 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938248-96938267:+ 2 0.747923 N mature,MIMAT0000062
    hsa-let-7a-1 hg19:9:96938292-96938311:+ 1 0.373962 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938295-96938314:+ 3 1.121885 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938295-96938315:+ 6 2.243769 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938295-96938316:+ 13 4.861500 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938295-96938317:+ 20 7.479230 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938296-96938316:+ 3 1.121885 N star,MIMAT0004481
    hsa-let-7a-1 hg19:9:96938296-96938317:+ 1 0.373962 N star,MIMAT0004481

    From *.mirna.quantification.txt for the same sample

    miRNA_ID read_count reads_per_million_miRNA_mapped cross-mapped
    hsa-let-7a-1 16795 6280.683542 N
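
The "reads per million miRNA reads" column can be related to the raw count with a short calculation. The sketch below assumes the usual definition (count divided by the total number of reads mapped to miRNAs in the sample, times one million); the function names are hypothetical, and the sample's total is not given in the files shown here, so it is back-calculated from the posted numbers.

```python
def reads_per_million(count: int, total_mirna_reads: float) -> float:
    """Normalize a raw read count to reads per million mapped miRNA reads."""
    return count / total_mirna_reads * 1_000_000


def implied_total(count: int, rpm: float) -> float:
    """Back-calculate the total mapped miRNA reads from a count and its RPM."""
    return count / rpm * 1_000_000


# Values taken from the hsa-let-7a-1 example in this post.
total_from_summary = implied_total(16795, 6280.683542)  # mirna.quantification row
total_from_isoform = implied_total(5196, 1943.104000)   # one isoform row

# Both rows should imply the same library size, up to rounding of the RPM.
print(round(total_from_summary), round(total_from_isoform))
```

If the two implied totals agree, the isoform-level and summary-level RPM values were normalized against the same denominator.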

    ==================

    The count of 16795 in *.mirna.quantification.txt for hsa-let-7a-1 is exactly the sum of all the counts mapped to the roughly 25-nucleotide ranges for the mature and star regions of hsa-let-7a-1.
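
The sum can be checked directly. The following Python sketch copies the isoform rows for hsa-let-7a-1 from this post and adds up the raw read counts per region; `sum_counts_by_region` is a hypothetical helper written for this check, and it assumes whitespace-delimited columns in the order shown above.

```python
from collections import defaultdict

# Isoform rows for hsa-let-7a-1, copied from the example in this post.
ISOFORM_ROWS = """\
hsa-let-7a-1 hg19:9:96938242-96938264:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938242-96938266:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938264:+ 8 2.991692 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938265:+ 4 1.495846 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938243-96938266:+ 6 2.243769 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938263:+ 157 58.711957 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938264:+ 5196 1943.104000 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938265:+ 3954 1478.643806 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938266:+ 7029 2628.575446 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938267:+ 278 103.961299 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938244-96938268:+ 18 6.731307 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938264:+ 9 3.365654 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938265:+ 16 5.983384 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938266:+ 44 16.454306 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938245-96938267:+ 3 1.121885 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938246-96938264:+ 1 0.373962 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938265:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938266:+ 10 3.739615 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938247-96938267:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938248-96938266:+ 1 0.373962 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938248-96938267:+ 2 0.747923 N mature,MIMAT0000062
hsa-let-7a-1 hg19:9:96938292-96938311:+ 1 0.373962 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938314:+ 3 1.121885 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938315:+ 6 2.243769 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938316:+ 13 4.861500 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938295-96938317:+ 20 7.479230 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938296-96938316:+ 3 1.121885 N star,MIMAT0004481
hsa-let-7a-1 hg19:9:96938296-96938317:+ 1 0.373962 N star,MIMAT0004481
"""


def sum_counts_by_region(text: str) -> dict:
    """Sum the raw read counts (third column) per region (mature, star, ...)."""
    totals = defaultdict(int)
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        count = int(fields[2])
        region = fields[5].split(",")[0]  # drop the MIMAT accession
        totals[region] += count
    return dict(totals)


totals = sum_counts_by_region(ISOFORM_ROWS)
grand_total = sum(totals.values())
print(totals, grand_total)
```

For this sample the mature rows add up to 16748 reads and the star rows to 47, which together give the 16795 reported in *.mirna.quantification.txt.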

    So in conclusion, the values represent expression for the mature miRNAs, not the stem loops.

    Incidentally, if you're interested, we have imported all of the miRNA-Seq data along with all of the other TCGA assays into GenePool and linked in all of the patient & sample metadata. GenePool makes it very simple to slice and dice the samples according to patient characteristics and clinical metadata, then run analyses.



    Good Luck!

    ------------------------------
    GenePool is making genomics data management, analysis, and sharing easier!
    Products @ www.stationxinc.com
    Last edited by GenePool; 11-23-2014, 09:26 PM.



    • #3
      Hey

      Hey, this problem troubles me too. However, after checking the files as you described, it seems to me that the values represent expression of the mature miRNAs PLUS the reads from the stem-loop and precursor regions.





