  • How to quantify expression values of unannotated transcripts?

    Dear all,

    I need to quantify the expression of several transcripts that are not annotated in the standard ENSEMBL genomes. I have already aligned the reads to the reference genome/transcriptome and quantified gene/transcript expression levels.

    I think I should combine the unannotated transcripts with the reference genome and then map all reads to the combined reference genome/transcriptome, instead of mapping reads only to the unannotated transcripts. Am I right? Or are there other ways to quantify those unannotated transcripts?

    I also don't have a GTF file for the unannotated transcripts; I only have their sequences and RefSeq IDs. So another question is: how can I generate a GTF file for the unannotated transcripts?

    Thanks a lot in advance.
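
A minimal sketch of one way to approach the GTF question above when only sequences and IDs are available. It assumes each transcript sequence is used as its own reference sequence (for example, appended to a transcriptome FASTA), so every feature simply spans 1 to the sequence length on the + strand. The function names, the `custom` source label, and the example record are hypothetical. If genome coordinates are needed, the sequences would have to be aligned to the genome first, which this sketch does not do.

```python
import io

def read_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def fasta_to_gtf_lines(handle, source="custom"):
    """Emit one transcript line and one exon line per FASTA record.

    Assumption: each sequence acts as its own reference sequence, so the
    feature spans 1..len(sequence) on the '+' strand.
    """
    for tid, seq in read_fasta(handle):
        attrs = f'gene_id "{tid}"; transcript_id "{tid}";'
        for feature in ("transcript", "exon"):
            yield "\t".join(
                [tid, source, feature, "1", str(len(seq)), ".", "+", ".", attrs]
            )

# Hypothetical record: the ID and sequence are placeholders, not real data.
example = io.StringIO(">NM_EXAMPLE.1 placeholder\nACGTACGTAC\nGGGTTT\n")
gtf_lines = list(fasta_to_gtf_lines(example))
print("\n".join(gtf_lines))
```

With a real file, `open("my_transcripts.fa")` (a hypothetical path) would replace the `io.StringIO` object.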

  • #2
    I have a similar question. I would like to know whether unannotated transcripts are differentially expressed in RNA-seq data. I have a particular unannotated transcript that I can see as a track in the UCSC Genome Browser when I upload TopHat or Cufflinks files, but I would like to know whether this transcript is differentially expressed among the samples.


    • #3
      I am in the same boat. As far as I know, the TopHat/Cufflinks pipeline can detect some novel genes relative to the reference genome. If I want to know whether those novel genes are differentially expressed, how can I do that? Currently, my workflow is RNA-seq -> TopHat -> htseq-count -> edgeR.


      • #4
        Just a suggestion, but have you tried CuffDiff or RSEM? Or do they not do what you want?
        sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});


        • #5
          I think the main problem is that the unannotated transcripts are likely to be different in the two samples. What you need is a common list of unannotated transcripts for the two samples, which would be a "quasi-annotation".

          For the ENCODE RNA-seq data we used Cuffmerge to merge "de novo" Cufflinks transcripts from different samples. A big problem with this approach is the over-extension of transcripts, which is a very common issue with both Cufflinks and Cuffmerge.
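
The idea of a common "quasi-annotation" can be illustrated with a rough sketch that takes the union of overlapping transcript intervals across samples. This is not how Cuffmerge works internally; it is only a simplified interval merge, and the sample names and coordinates are made up.

```python
from collections import defaultdict

def merge_transcript_intervals(per_sample):
    """Union overlapping (chrom, strand, start, end) intervals across samples.

    Coordinates are treated as 1-based and inclusive.
    """
    by_locus = defaultdict(list)
    for intervals in per_sample.values():
        for chrom, strand, start, end in intervals:
            by_locus[(chrom, strand)].append((start, end))

    merged = []
    for (chrom, strand), spans in sorted(by_locus.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlaps the current region: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, strand, cur_start, cur_end))
    return merged

# Made-up de novo transcript intervals from two hypothetical samples.
samples = {
    "sample_A": [("chr1", "+", 100, 500), ("chr1", "+", 2000, 2600)],
    "sample_B": [("chr1", "+", 450, 900), ("chr2", "-", 10, 300)],
}
common = merge_transcript_intervals(samples)
for region in common:
    print(region)
```

Note that the first merged chr1 region (100 to 900) is longer than either input interval, which is a small-scale picture of how merging across samples can extend transcripts.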


          • #6
            In my experience, Cuffmerge and Cuffdiff detect around 4000 novel genes, about 10% of all genes. That is far more than we expected and is probably far from the truth. So it seems to me that Cuffmerge overestimates the number of novel genes. I would like to hear other people's opinions.


            • #7
              Out of curiosity, have you looked at the genomic distribution of these 'novel genes'? When we did this type of analysis we found that a high proportion were intronic, and it turned out that they were not novel genes at all. Instead they represented immature (nascent) transcripts of the surrounding gene where the introns have not yet been spliced. Also, we found that nascent transcripts are more abundant in some tissues (like brain) compared to others.

              Don't know if this is what is going on here, but I suspect some programs could by mistake report nascent transcripts as being 'novel genes'.
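
A quick check along these lines could look like the sketch below: derive the introns of an annotated transcript from its exons and flag "novel" candidates that lie entirely inside one of them. It assumes the candidates are on the same chromosome and strand as the annotated gene, uses 1-based inclusive coordinates, and all names and coordinates are invented for illustration.

```python
def introns_from_exons(exons):
    """Return intron intervals (1-based, inclusive) between the exons of one transcript."""
    exons = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start - prev_end > 1
    ]

def is_intronic(candidate, introns):
    """True if the candidate interval lies entirely within a single intron."""
    start, end = candidate
    return any(i_start <= start and end <= i_end for i_start, i_end in introns)

# Invented exon structure of one annotated transcript.
annotated_exons = [(100, 200), (1000, 1100), (5000, 5200)]
introns = introns_from_exons(annotated_exons)

# Invented "novel gene" candidates on the same chromosome and strand.
candidates = {
    "novel_1": (300, 800),    # inside the first intron
    "novel_2": (6000, 6500),  # downstream of the gene
    "novel_3": (900, 1050),   # overlaps an exon
}
intronic = {name: is_intronic(span, introns) for name, span in candidates.items()}
print(intronic)
```

A candidate flagged as intronic is not necessarily a nascent transcript; it only marks it for a closer look.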


              • #8
                Hi, Adameur

                Thanks for your information. Could you please provide further information concerning how to distinguish nascent transcripts from true novel genes?

                Thanks


                • #9
                  Hi wangli,

                  Nascent transcripts have a negative gradient of coverage across introns, with more reads in the 5' end of the intron compared to the 3' end. We have described this in detail in this publication in Nat Struct Mol Biol.
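
As a rough illustration of the coverage gradient described above (not the method from the publication), one could fit a straight line to read depth along an intron and look at the sign of the slope. The sketch assumes the coverage values are ordered from the 5' end to the 3' end of the intron, so for minus-strand genes the list would need to be reversed first. The coverage numbers are invented.

```python
def coverage_slope(coverage):
    """Least-squares slope of read depth against position (5' -> 3')."""
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions")
    mean_x = (n - 1) / 2
    mean_y = sum(coverage) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(coverage))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx

# Invented per-bin coverage along two introns, ordered 5' -> 3'.
declining = [40, 35, 30, 25, 20, 15, 10, 5]  # pattern expected for nascent RNA
flat = [12] * 8                              # no gradient

slope_declining = coverage_slope(declining)
slope_flat = coverage_slope(flat)
print(slope_declining, slope_flat)
```

A clearly negative slope across an intron is consistent with nascent transcription; real data are noisy, so binning the coverage and looking at many introns together is likely to be more informative than any single fit.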

                  Also, it's important to note that total RNA-seq captures more nascent transcripts than polyA+ RNA-seq.

                  Adam
