  • TopHat & Cufflinks failing to assemble full length transcripts

    Hi,

    First post on SeqAnswers. The discussions here are very useful.

    We are using TopHat (v1.0.13) and Cufflinks (v0.8.3) without a reference GTF, and then using Cuffcompare to identify the assembled transcripts.

    We are finding many transcripts reported as novel isoforms that we suspect are actually just the main transcript divided into two fragments, or the main transcript missing a few exons at the beginning even though those exons are clearly covered by reads.

    For example, for a gene with 34 exons, novel isoform j1 is reported as the first 23 exons and j2 as the last 11 exons. In another example, a novel transcript is reported that begins at exon 2, even though an equal number of reads cover the first exon.

    Examination of the .wig file shows coverage of the complete transcript, but for some reason the full-length transcript is not being assembled. We've changed the number of bp required on either side of the splice junction, to no avail. We also ran the butterfly search, and that option made no difference either.

    Does anyone have suggestions for us?

    Thanks,
    Jessica

  • #2
    Are you certain there are spliced reads connecting those exons as well? You want to visualize the read alignments in IGV to ensure that you have reads spanning all the junctions.
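
If IGV is not convenient, the same check can be roughed out in a script. The following is only a minimal sketch, assuming the alignments are available as plain SAM text and that spliced alignments carry an `N` operation in the CIGAR string (as the SAM format defines). The function names and the `expected` junction list are hypothetical, made up for illustration; they are not part of TopHat or Cufflinks.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def junctions_from_sam_line(line):
    """Yield (chrom, intron_start, intron_end) for every N operation.

    Coordinates are 1-based and inclusive, matching the SAM POS field.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
    if cigar == "*":  # unaligned read, nothing to report
        return
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region is the intron spanned by this read.
            yield (chrom, ref_pos, ref_pos + length - 1)
        if op in REF_CONSUMING:
            ref_pos += length


def count_junction_support(sam_lines):
    """Count how many alignments span each junction."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        for junction in junctions_from_sam_line(line):
            counts[junction] += 1
    return counts


def unsupported_junctions(expected, counts, min_reads=1):
    """Return the expected junctions spanned by fewer than min_reads reads."""
    return [j for j in expected if counts[j] < min_reads]
```

Feeding it the alignment lines plus the intron coordinates you expect from the annotation (the hypothetical `expected` list) gives back the junctions with no spliced-read support. Those are the likely places where the assembler would break a transcript into fragments even though the exons on both sides show coverage in the .wig file.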



    • #3
      I'm also interested in assembling mRNA-seq reads. It seems that there is no good software for this job, because alternative splicing makes it difficult.



      • #4
        Try a newer version of Cufflinks instead of 0.8.3. I got greatly improved transcript assembly results with my single-read data. A lot of the transcripts were "full" length when compared with existing annotations.
