  • How to quantify expression values of unannotated transcripts?

    Dear all,

    I need to quantify the expression value for several transcripts that are not annotated in standard ENSEMBL genomes after I have aligned the reads to reference genome/transcriptome and quantified the gene/transcript expression level.

    I think I should combine the unannotated transcripts with the reference genome, and then map all reads to the reference genome/transcriptome, instead of only map reads to the unannotated transcripts. Am I right? Or is there any other ways to quantify those annotated transcripts?

    And I don't have gtf file for the unannotated transcripts. I only have the sequences and RefSeq ID of those transcripts. So another question is: how to generate gtf file for the unannotated transcripts?

    Thanks a lot in advance.
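
One possible way to get a GTF when only the transcript sequences are available is to treat each transcript as its own reference sequence (an extra "contig" appended to the reference FASTA) and describe it with a single feature spanning its full length. The sketch below assumes that setup; the sequence IDs and the `custom` source label are made up for illustration, and the coordinates are transcript coordinates, not genomic ones. An alternative not shown here is to align the transcript sequences to the genome with a spliced aligner to obtain genomic exon coordinates.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence_id: sequence}.

    The ID is the first whitespace-delimited word of each header line.
    """
    records: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def transcripts_to_gtf(records: Dict[str, str], source: str = "custom") -> List[str]:
    """Emit one 'transcript' line and one 'exon' line per sequence.

    Assumption: each transcript is its own reference sequence, so the
    feature spans positions 1..len(sequence) on the '+' strand.
    """
    gtf = []
    for name, seq in records.items():
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        for feature in ("transcript", "exon"):
            fields = [name, source, feature, "1", str(len(seq)), ".", "+", ".", attrs]
            gtf.append("\t".join(fields))
    return gtf


# Hypothetical input: the IDs below are placeholders, not real RefSeq accessions.
example_fasta = """>NM_HYPOTHETICAL1 made-up description
ACGTACGTAC
GTACG
>NR_HYPOTHETICAL2
TTTTGGGGCC
"""
gtf_lines = transcripts_to_gtf(read_fasta(example_fasta.splitlines()))
print("\n".join(gtf_lines))
```

If the same transcripts also exist as loci in the genome, reads may map both to the genomic locus and to the appended sequence, so multi-mapping reads need to be handled deliberately in the counting step.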

  • #2
    I have a similar question. I would like to know whether unannotated transcripts are differentially expressed in RNA-seq data. There is a particular unannotated transcript that I can see as a track in the UCSC Genome Browser when I upload TopHat or Cufflinks files, and I would like to know whether it is differentially expressed among the samples.

    • #3
      I am kind of in the same boat. As far as I know, the TopHat/Cufflinks pipeline can detect some novel genes compared to the reference genome annotation. If I want to know whether those novel genes are differentially expressed, how can I do that? My current workflow is RNA-seq -> TopHat -> htseq-count -> edgeR.

      • #4
        Just a suggestion, but have you tried CuffDiff or RSEM? Or do they not do what you want?
        sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});

        • #5
          I think the main problem is that the unannotated transcripts are likely to be different in the two samples. What you need is a common list of unannotated transcripts for the two samples, which would be a "quasi-annotation".

          For the ENCODE RNA-seq data we used Cuffmerge to merge "de novo" Cufflinks transcripts from different samples. A big problem with this approach is the over-extension of transcripts, which is a very common issue with both Cufflinks and Cuffmerge.
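
To make the idea of a common "quasi-annotation" concrete, here is a minimal sketch that pools transcript intervals from several samples and unions the overlapping ones into shared loci. It is a deliberately simple interval merge and is not a description of how Cuffmerge works; the sample names and coordinates are invented. It does show one simple way a merged locus can end up longer than any transcript seen in a single sample.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end, strand), with closed 1-based coordinates
Interval = Tuple[str, int, int, str]


def merge_across_samples(samples: Dict[str, List[Interval]]) -> List[Interval]:
    """Union overlapping same-strand intervals from all samples into shared loci."""
    pooled = sorted(
        (iv for intervals in samples.values() for iv in intervals),
        key=lambda iv: (iv[0], iv[3], iv[1], iv[2]),  # chrom, strand, start, end
    )
    merged: List[Interval] = []
    for chrom, start, end, strand in pooled:
        if merged:
            m_chrom, m_start, m_end, m_strand = merged[-1]
            if (chrom, strand) == (m_chrom, m_strand) and start <= m_end:
                # Overlaps the previous locus: extend it instead of adding a new one.
                merged[-1] = (m_chrom, m_start, max(m_end, end), m_strand)
                continue
        merged.append((chrom, start, end, strand))
    return merged


# Hypothetical de novo transcripts from two samples.
samples = {
    "sample_A": [("chr1", 100, 500, "+"), ("chr1", 2000, 2600, "-")],
    "sample_B": [("chr1", 450, 900, "+"), ("chr1", 5000, 5400, "+")],
}
shared_loci = merge_across_samples(samples)
for locus in shared_loci:
    print(locus)
```

Every sample can then be counted against the same `shared_loci` list, which is what makes the expression values comparable between samples.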

          • #6
            In my experience, "cuffmerge" and "cuffdiff" detect around 4000 novel genes, which is about 10% of all genes. That is far more than we expected and is probably far from the truth. So it seems to me that "cuffmerge" overestimates the number of novel genes. I would like to hear other people's opinions.

            • #7
              Out of curiosity, have you looked at the genomic distribution of these 'novel genes'? When we did this type of analysis we found that a high proportion were intronic, and it turned out that they were not novel genes at all. Instead they represented immature (nascent) transcripts of the surrounding gene where the introns have not yet been spliced. Also, we found that nascent transcripts are more abundant in some tissues (like brain) compared to others.

              Don't know if this is what is going on here, but I suspect some programs could by mistake report nascent transcripts as being 'novel genes'.
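
A quick way to look at the genomic distribution is to label each candidate locus by where it falls relative to annotated genes. The sketch below is a toy version that checks candidates against a single annotated gene on one chromosome and strand; the gene model, candidate names, and coordinates are all invented, and a real analysis would load them from the annotation and the assembled transcripts.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # closed interval (start, end) on one chromosome and strand


def overlaps(a: Span, b: Span) -> bool:
    """True if the two closed intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_locus(locus: Span, gene: Span, exons: List[Span]) -> str:
    """Label a candidate locus relative to one annotated gene."""
    if not overlaps(locus, gene):
        return "intergenic"
    if any(overlaps(locus, exon) for exon in exons):
        return "exon-overlapping"
    return "intronic"


# Hypothetical annotated gene with three exons.
gene = (1000, 9000)
exons = [(1000, 1500), (4000, 4500), (8500, 9000)]

# Hypothetical 'novel gene' candidates.
candidates = {
    "novel_1": (2000, 2600),
    "novel_2": (1400, 1700),
    "novel_3": (12000, 12500),
}
labels = {name: classify_locus(span, gene, exons) for name, span in candidates.items()}
intronic_fraction = sum(label == "intronic" for label in labels.values()) / len(labels)
print(labels, intronic_fraction)
```

A high intronic fraction would be a reason to check whether the candidates are nascent transcripts of the surrounding gene before treating them as novel genes.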

              • #8
                Hi, Adameur

                Thanks for your information. Could you please provide further information concerning how to distinguish nascent transcripts from true novel genes?

                Thanks

                • #9
                  Hi wangli,

                  Nascent transcripts have a negative gradient of coverage across introns, with more reads in the 5' end of the intron compared to the 3' end. We have described this in detail in this publication in Nat Struct Mol Biol.

                  Also, it's important to note that total RNA-seq captures more nascent transcripts than polyA+ RNA-seq.

                  Adam
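
As a rough illustration of the coverage gradient described above, the following sketch fits a least-squares line to binned read coverage across one intron, oriented 5' to 3' for the gene's strand. A clearly negative slope is the pattern described for nascent transcripts. The binning, the example coverage values, and any threshold for calling a slope "negative" are assumptions made here and are not taken from the publication.

```python
from typing import List


def coverage_slope(coverage: List[float]) -> float:
    """Least-squares slope of coverage against relative position.

    Position 0 is the 5' end of the intron and position 1 is the 3' end.
    """
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two coverage bins")
    xs = [i / (n - 1) for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(coverage) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, coverage))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def intron_gradient(binned_coverage: List[float], strand: str) -> float:
    """Orient genomic-order bins 5'->3' for the gene's strand, then fit the slope."""
    oriented = binned_coverage if strand == "+" else binned_coverage[::-1]
    return coverage_slope(oriented)


# Hypothetical mean coverage in five equal-width bins, listed in genomic order.
plus_strand_intron = [50.0, 40.0, 30.0, 20.0, 10.0]
minus_strand_intron = [10.0, 20.0, 30.0, 40.0, 50.0]
flat_intron = [20.0, 20.0, 20.0, 20.0, 20.0]

slope_plus = intron_gradient(plus_strand_intron, "+")
slope_minus = intron_gradient(minus_strand_intron, "-")
slope_flat = intron_gradient(flat_intron, "+")
print(slope_plus, slope_minus, slope_flat)
```

Both decaying examples give the same negative slope once strand is taken into account, while the flat intron gives a slope of zero.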
