  • Very low Sn/Sp outputs from Cufflinks?

    I have obtained an RNA-seq library from my collaborator with a total of more than 100M reads, each 36bp long, from three Illumina sequencing lanes.

    I tried to use tophat + cufflinks to discover novel splice isoforms from this library. I realize that paired-end reads with longer lengths such as 75bp would be ideal; I just want to see what I can get from cufflinks. However, the output after running cuffcompare is a bit disappointing:

    #--------------------|   Sn |   Sp |  fSn |  fSp
    Base level:            59.0   17.8     -      -
    Exon level:             1.7    0.4   18.6    4.0
    Intron level:           7.5   47.0    7.6   47.3
    Intron chain level:     0.1    0.1    0.1    0.1
    Transcript level:       0.0    0.0    0.0    0.0
    Locus level:            0.1    0.0    0.2    0.0

    Missed exons:     66987/206780  ( 32.4%)
    Wrong exons:     813919/958710  ( 84.9%)
    Missed introns:  167142/185318  ( 90.2%)
    Wrong introns:    11318/29587   ( 38.3%)
    Missed loci:       5737/21602   ( 26.6%)
    Wrong loci:      782485/927668  ( 84.3%)

    At the transcript level, both Sn and Sp are zero! Does that mean cufflinks is not supposed to be run on short single-end RNA-seq data? Is this typical, or did I do something wrong? Any input?

    - L
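
As a quick sanity check on the summary above, the "Missed" and "Wrong" percentages are plain ratios of the two counts on each line. The sketch below recomputes them from the numbers as posted; the dictionary and the `percent` helper are illustrative names, not part of cuffcompare. My assumption is that the "Missed" denominators count reference features and the "Wrong" denominators count assembled features, which would explain why the two totals differ so much.

```python
# Counts copied from the cuffcompare summary in the post above.
# Each entry is (numerator, denominator) exactly as printed.
counts = {
    "Missed exons": (66987, 206780),
    "Wrong exons": (813919, 958710),
    "Missed introns": (167142, 185318),
    "Wrong introns": (11318, 29587),
    "Missed loci": (5737, 21602),
    "Wrong loci": (782485, 927668),
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


for label, (num, den) in counts.items():
    print(f"{label}: {num}/{den} ({percent(num, den):.1f}%)")
```

If that reading of the denominators is right, the run assembled roughly 959 thousand exons against about 207 thousand reference exons, so most of the assembled features are short fragments that match nothing in the annotation.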

  • #2
    I'm trying to bump this post. It's strange that nobody has replied to it. Does that mean nobody knows the answer? ...



    • #3
      Reads of 36bp are really short for a Tophat + Scripture/Cufflinks approach. The software will run, but you will take a performance hit in addition to getting less informative output. We have analyzed libraries of ~200 million paired 36-mers + 42-mers (20% and 80% of the data respectively) with Tophat + Scripture/Cufflinks. The output was interesting but did not work nearly as well as an approach involving mapping reads directly to a database of junctions, transcripts and genomic sequences.

      This is not a failing of the Tophat + Cufflinks/Scripture approach; it is simply that these methods are not optimized for such short reads. TopHat attempts to identify splice junctions by splitting the reads (an oversimplification). Any method that takes this type of approach will suffer when the reads are that short.

      If you want decent sensitivity/specificity for detecting junctions, you could try mapping to a database of known and predicted junction sequences of suitable length... Once you are analyzing libraries like paired 75-mers you will find that Cufflinks and Scripture shine a lot brighter... Another option is to use Trans-ABySS or some other de novo assembly approach to make longer contigs out of your 36-mers and then align these instead...
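
To make the junction-database suggestion concrete, here is a minimal sketch of how such sequences could be built from exon coordinates. It is an illustration under stated assumptions, not the pipeline used in the reply above: the function name, the 0-based half-open coordinate convention, and the `min_overhang` default of 4 are all hypothetical choices. The idea is that each junction sequence carries `read_length - min_overhang` bases from either side of the splice site, so a read can only align to it if it actually crosses the junction by at least `min_overhang` bases.

```python
from itertools import combinations


def junction_sequences(chrom_seq, exons, read_length=36, min_overhang=4,
                       all_pairs=False):
    """Build exon-exon junction sequences for one transcript or gene.

    chrom_seq : forward-strand chromosome sequence (str).
    exons     : list of (start, end) tuples, 0-based half-open, sorted by start.
    all_pairs : False joins adjacent exons only ("known" junctions);
                True joins every upstream/downstream pair ("predicted").
    Returns a dict mapping "donorEnd-acceptorStart" to the junction sequence.
    """
    flank = read_length - min_overhang
    if flank <= 0:
        raise ValueError("min_overhang must be smaller than read_length")

    pairs = combinations(exons, 2) if all_pairs else zip(exons, exons[1:])
    junctions = {}
    for (s1, e1), (s2, e2) in pairs:
        # Simplification: flanks are clipped at the exon boundary rather
        # than extended through neighbouring exons when an exon is short.
        left = chrom_seq[max(s1, e1 - flank):e1]
        right = chrom_seq[s2:min(e2, s2 + flank)]
        junctions[f"{e1}-{s2}"] = left + right
    return junctions


# Toy example: three exons on a 40-base "chromosome".
toy_chrom = "A" * 10 + "C" * 10 + "G" * 10 + "T" * 10
toy_exons = [(0, 10), (20, 30), (35, 40)]
known = junction_sequences(toy_chrom, toy_exons, read_length=6, min_overhang=2)
predicted = junction_sequences(toy_chrom, toy_exons, read_length=6,
                               min_overhang=2, all_pairs=True)
```

The resulting sequences would then be written out as FASTA and indexed alongside the genome and transcript sequences for whichever short-read aligner is in use; that step is left out here because it depends on the aligner.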



      • #4
        Yeah... That's what I thought. Our library is mainly good for gene expression profiling. Thanks for sharing your experience!
