  • Papers on the sensitivity of mRNA-seq?

    Is there an article discussing the sensitivity of mRNA-seq?

    I am looking for the answer to this question:

    For a given transcriptome, how many reads of length x are needed to reliably discover a rare transcript (say, 1~2 copies per cell)?

    Thanks!
    Last edited by liux; 05-10-2010, 01:11 PM.

  • #2
    Hi

    I don't know of any papers, but it should be possible to calculate this yourself.

    Let's say a typical cell has N transcript molecules; then the concentration of your rare transcript is roughly 1/N, and the probability that a given read comes from your transcript is also roughly 1/N. If your sequencing run produces M reads (typically, M is up to 20 million), you expect about M/N reads from it.

    The probability that none of the M reads shows your transcript is (1-1/N)^M, hence the probability of seeing it at least once is 1-(1-1/N)^M. If you want to see it at least, say, k=10 times, you can easily calculate this with the Poisson distribution with mean M/N.
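
    A minimal sketch of this calculation in Python, assuming each read comes from the rare transcript independently with probability 1/N, so that the number of reads hitting it is approximately Poisson with mean M/N. The value of N used here is an illustrative assumption, not a measured number, and the helper names are made up for this example. Transcript length, mappability and library bias are ignored.

```python
import math

def prob_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    # Sum the probabilities of seeing 0..k-1 reads, then take the complement.
    term = math.exp(-lam)  # P(X = 0)
    below = 0.0
    for i in range(k):
        below += term
        term *= lam / (i + 1)  # P(X = i+1) derived from P(X = i)
    return 1.0 - below

def reads_needed(k, n_transcripts, target=0.95):
    """Smallest number of reads M such that P(at least k hits) >= target."""
    lo, hi = 0, 1
    # Double the upper bound until it is large enough ...
    while prob_at_least(k, hi / n_transcripts) < target:
        lo, hi = hi, hi * 2
    # ... then bisect between the last failing and first passing value.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prob_at_least(k, mid / n_transcripts) >= target:
            hi = mid
        else:
            lo = mid
    return hi

N = 300_000     # ASSUMPTION: transcript molecules per cell, for illustration only
M = 20_000_000  # reads per run, as mentioned in the post

print("expected reads from a single-copy transcript:", M / N)
print("P(at least 10 reads):", prob_at_least(10, M / N))
print("reads needed for >= 10 hits with 95% probability:", reads_needed(10, N))
```

    Replace N with a figure appropriate for your cell type; for a transcript present at c copies per cell, the mean becomes c*M/N.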

    Now, how do you know how many transcripts there are in a cell, i.e., what is the value of N? For such questions, the following nice paper and its web site, that collects a lot of such numbers, might be useful: Phillips and Milo, A feeling for numbers in biology, PNAS, Vol. 106, 21465-71 (2009).

    Finally, as you are looking for rare transcripts, you might also be interested in this new method to reduce the number of common transcripts, that a colleague happened to have shown me just an hour ago: Bogdanov et al., Normalizing cDNA Libraries, Curr Prot Mol Biol, 5.12.1, Apr 2010

    Simon



    • #3
      Originally posted by liux
      Is there an article discussing the sensitivity of mRNA-seq?

      I am looking for the answer to this question:

      For a given transcriptome, how many reads of length x are needed to reliably discover a rare transcript (say, 1~2 copies per cell)?

      Thanks!
      Hello,
      See if Trapnell et al. 2010 (Nature Biotech) helps. Figure 4 shows how many reads you need to recover a transcript expressed at a given RPKM.
      Maybe not exactly what you are asking but possibly you can get a feel for it.
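
      To relate an RPKM value to a read count, here is a rough sketch based on the standard definition of RPKM (reads per kilobase of transcript per million mapped reads). The function names and the example numbers are assumptions for illustration and are not taken from the paper.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def expected_reads(rpkm_value, transcript_length_bp, total_mapped_reads):
    """Invert the RPKM definition: reads expected on one transcript."""
    return rpkm_value * (transcript_length_bp / 1e3) * (total_mapped_reads / 1e6)

# Hypothetical example: a 1.5 kb transcript at RPKM 2 in a run with
# 20 million mapped reads.
print(expected_reads(2.0, 1500, 20_000_000))
```

      The expected count can then be used as the mean of a Poisson distribution to estimate how likely the transcript is to be seen at all.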

      Dario



      • #4
        Three apropos papers

        Genome Biol. 2010 May 11;11(5):R50. [Epub ahead of print]
        Modeling non-uniformity in short-read rates in RNA-Seq data.
        Li J, Jiang H, Wong WH.

        Abstract
        After mapping, RNA-Seq data can be summarized by a sequence of read counts commonly modeled as Poisson variables with constant rates along each transcript, which actually fit data poorly. We suggest using variable rates for different positions, and propose two models to predict these rates based on local sequences. These models explain more than 50% of the variations and can lead to improved estimates of gene and isoform expressions for both Illumina and Applied Biosystems (ABI) data.

        PMID: 20459815




        BMC Bioinformatics. 2010 Apr 29;11 Suppl 3:S6.
        Towards reliable isoform quantification using RNA-SEQ data.
        Howard BE, Heber S.

        Bioinformatics Research Center, North Carolina State University, Raleigh, 27606, USA.
        Abstract
        BACKGROUND: In eukaryotes, alternative splicing often generates multiple splice variants from a single gene. Here we explore the use of RNA sequencing (RNA-Seq) datasets to address the isoform quantification problem. Given a set of known splice variants, the goal is to estimate the relative abundance of the individual variants.
        METHODS: Our method employs a linear models framework to estimate the ratios of known isoforms in a sample. A key feature of our method is that it takes into account the non-uniformity of RNA-Seq read positions along the targeted transcripts.
        RESULTS: Preliminary tests indicate that the model performs well on both simulated and real data. In two publicly available RNA-Seq datasets, we identified several alternatively-spliced genes with switch-like, on/off expression properties, as well as a number of other genes that varied more subtly in isoform expression. In many cases, genes exhibiting differential expression of alternatively spliced transcripts were not differentially expressed at the gene level.
        CONCLUSIONS: Given that changes in isoform expression level frequently involve a continuum of isoform ratios, rather than all-or-nothing expression, and that they are often independent of general gene expression changes, we anticipate that our research will contribute to revealing a so far uninvestigated layer of the transcriptome. We believe that, in the future, researchers will prioritize genes for functional analysis based not only on observed changes in gene expression levels, but also on changes in alternative splicing.

        PMID: 20438653


        BMC Genomics. 2010 May 5;11(1):282. [Epub ahead of print]
        A comparison of massively parallel nucleotide sequencing with oligonucleotide microarrays for global transcription profiling.
        Bradford JR, Hey Y, Yates T, Li Y, Pepper SD, Miller CJ.

        Abstract
        BACKGROUND: RNA-Seq exploits the rapid generation of gigabases of sequence data by Massively Parallel Nucleotide Sequencing, allowing for the mapping and digital quantification of whole transcriptomes. Whilst previous comparisons between RNA-Seq and microarrays have been performed at the level of gene expression, in this study we adopt a more fine-grained approach. Using RNA samples from a normal human breast epithelial cell line (MCF-10a) and a breast cancer cell line (MCF-7), we present a comprehensive comparison between RNA-Seq data generated on the Applied Biosystems SOLiD platform and data from Affymetrix Exon 1.0ST arrays. The use of Exon arrays makes it possible to assess the performance of RNA-Seq in two key areas: detection of expression at the granularity of individual exons, and discovery of transcription outside annotated loci.
        RESULTS: We found a high degree of correspondence between the two platforms in terms of exon-level fold changes and detection. For example, over 80% of exons detected in RNA-Seq were also detected on the Exon array, and 91% of exons flagged as changing from Absent to Present on at least one platform had fold-changes in the same direction. The greatest detection correspondence was seen when the read count threshold at which to flag exons Absent in the SOLiD data was set to t<1 suggesting that the background error rate is extremely low in RNA-Seq. We also found RNA-Seq more sensitive to detecting differentially expressed exons than the Exon array, reflecting the wider dynamic range achievable on the SOLiD platform. In addition, we find significant evidence of novel protein coding regions outside known exons, 93% of which map to Exon array probesets, and are able to infer the presence of thousands of novel transcripts through the detection of previously unreported exon-exon junctions.
        CONCLUSIONS: By focusing on exon-level expression, we present the most fine-grained comparison between the RNA-Seq and microarrays to date. Overall, our study demonstrates that data from a SOLiD RNA-Seq experiment are sufficient to generate results comparable to those produced from Affymetrix Exon arrays, even using only a single replicate from each platform, and when presented with a large genome.

        PMID: 20444259
