SEQanswers

06-04-2012, 12:17 AM   #1
icebsd
Junior Member
 
Location: ap

Join Date: Jul 2009
Posts: 3
How to quantify expression values of unannotated transcripts?

Dear all,

I need to quantify expression for several transcripts that are not annotated in the standard ENSEMBL genomes. I have already aligned the reads to the reference genome/transcriptome and quantified gene/transcript expression levels.

I think I should combine the unannotated transcripts with the reference genome and then map all reads to the combined reference, instead of mapping reads only to the unannotated transcripts. Am I right? Or are there other ways to quantify those unannotated transcripts?

Also, I don't have a GTF file for the unannotated transcripts; I only have their sequences and RefSeq IDs. So another question is: how can I generate a GTF file for the unannotated transcripts?

Thanks a lot in advance.
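
One possible way to get a GTF from sequences alone is sketched below. It assumes each transcript sequence is appended to the reference FASTA as its own extra contig, so the matching GTF record is a single unspliced exon covering the whole sequence on the plus strand. The function names, the "custom" source label and the IDs txA/txB are made up for illustration; this does not place the transcripts in genomic coordinates, which would need a spliced alignment of the sequences to the genome.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_gtf(lines, source="custom"):
    """Return one GTF exon line per sequence, treating each as its own contig."""
    gtf = []
    for name, seq in read_fasta(lines):
        attrs = f'gene_id "{name}"; transcript_id "{name}";'
        # GTF coordinates are 1-based and inclusive, so the exon spans 1..len(seq).
        fields = [name, source, "exon", "1", str(len(seq)), ".", "+", ".", attrs]
        gtf.append("\t".join(fields))
    return gtf


# Hypothetical input: txA and txB are placeholder IDs, not real records.
example = [">txA some description", "ACGTACGTAC", "GGTTAA", ">txB", "TTTT"]
for row in fasta_to_gtf(example):
    print(row)
```

The resulting lines can be concatenated onto the reference GTF, provided the contig names in the combined FASTA match the first column exactly.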
06-06-2012, 11:45 AM   #2
chknbio
Member
 
Location: Baltimore

Join Date: May 2012
Posts: 14

I have a similar question. I would like to know whether unannotated transcripts are differentially expressed in RNA-seq data. I have a particular unannotated transcript that I can see as a track in the UCSC Genome Browser when I upload TopHat or Cufflinks files, but I would like to know whether this transcript is differentially expressed among the samples.
06-06-2012, 04:11 PM   #3
wangli
Member
 
Location: Texas

Join Date: Apr 2012
Posts: 48

I am kind of in the same boat. As far as I know, the TopHat/Cufflinks pipeline can detect some novel genes relative to the reference. If I want to know whether those novel genes are differentially expressed, how can I do that? Currently, my workflow is RNA-seq -> TopHat -> htseq-count -> edgeR.
06-06-2012, 05:51 PM   #4
Pseudonym
Research Engineer
 
Location: NICTA VRL, Melbourne, Australia

Join Date: Jun 2011
Posts: 12

Just a suggestion, but have you tried CuffDiff or RSEM? Or do they not do what you want?
__________________
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
06-07-2012, 07:47 AM   #5
alexdobin
Senior Member
 
Location: NY

Join Date: Feb 2009
Posts: 161

I think the main problem is that the unannotated transcripts are likely to be different in the two samples. What you need is a common list of unannotated transcripts for the two samples, which would be a "quasi-annotation".

For the ENCODE RNA-seq data we used Cuffmerge to merge "de novo" Cufflinks transcripts from different samples. A big problem with this approach is the over-extension of transcripts, which is a very common issue with both Cufflinks and Cuffmerge.
06-07-2012, 09:32 AM   #6
wangli
Member
 
Location: Texas

Join Date: Apr 2012
Posts: 48

In my experience, "cuffmerge" and "cuffdiff" detect around 4000 novel genes, about 10% of all genes, which is far more than we expected and might be far from the truth. So it seems to me that "cuffmerge" overestimates the number of novel genes. I would like to hear other people's opinions.
06-07-2012, 01:56 PM   #7
adameur
Member
 
Location: Uppsala, Sweden

Join Date: Nov 2009
Posts: 23

Out of curiosity, have you looked at the genomic distribution of these 'novel genes'? When we did this type of analysis we found that a high proportion were intronic, and it turned out that they were not novel genes at all. Instead they represented immature (nascent) transcripts of the surrounding gene where the introns have not yet been spliced. Also, we found that nascent transcripts are more abundant in some tissues (like brain) compared to others.

Don't know if this is what is going on here, but I suspect some programs could by mistake report nascent transcripts as being 'novel genes'.
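
A quick way to look at that distribution might be to tally transcripts by class code in the merged GTF. The sketch below assumes the GTF attributes carry a class_code tag in the Cuffcompare/Cuffmerge style, where, as far as I know, "i" marks a transfrag lying entirely within a reference intron and "u" marks an unknown intergenic transcript. The example lines and IDs are made up for illustration.

```python
import re
from collections import Counter

CLASS_CODE = re.compile(r'class_code "([^"]+)"')
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def tally_class_codes(gtf_lines):
    """Count distinct transcripts per class_code across GTF lines."""
    code_by_transcript = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed lines instead of failing
        tid = TRANSCRIPT_ID.search(fields[8])
        code = CLASS_CODE.search(fields[8])
        if tid and code:
            # Several exon lines share one transcript; count it once.
            code_by_transcript[tid.group(1)] = code.group(1)
    return Counter(code_by_transcript.values())


# Hypothetical records standing in for a merged GTF.
example = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "XLOC_1"; transcript_id "TCONS_1"; class_code "=";',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\tgene_id "XLOC_1"; transcript_id "TCONS_1"; class_code "=";',
    'chr1\tCufflinks\texon\t220\t260\t.\t+\t.\tgene_id "XLOC_2"; transcript_id "TCONS_2"; class_code "i";',
    'chr1\tCufflinks\texon\t270\t290\t.\t+\t.\tgene_id "XLOC_3"; transcript_id "TCONS_3"; class_code "i";',
    'chr2\tCufflinks\texon\t500\t900\t.\t-\t.\tgene_id "XLOC_4"; transcript_id "TCONS_4"; class_code "u";',
]
counts = tally_class_codes(example)
print(counts)
```

A large share of "i" transcripts among the supposedly novel ones would be consistent with the intronic pattern described above, though it does not prove they are nascent.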
06-07-2012, 02:04 PM   #8
wangli
Member
 
Location: Texas

Join Date: Apr 2012
Posts: 48

Hi, Adameur

Thanks for your information. Could you please provide further information concerning how to distinguish nascent transcripts from true novel genes?

Thanks
06-07-2012, 11:06 PM   #9
adameur
Member
 
Location: Uppsala, Sweden

Join Date: Nov 2009
Posts: 23

Hi wangli,

Nascent transcripts have a negative gradient of coverage across introns, with more reads in the 5' end of the intron compared to the 3' end. We have described this in detail in this publication in Nat Struct Mol Biol.

Also, it's important to note that total RNA-seq captures more nascent transcripts than polyA+ RNA-seq.

Adam
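
To make the coverage-gradient idea concrete, here is a minimal sketch that fits a least-squares line to per-base read depth across one intron and returns the slope in the 5'->3' direction. It is only an illustration and not the procedure from the publication: the function name is hypothetical, the depths are assumed to be given in genomic order (so they are reversed for minus-strand genes), and no cutoff for calling a transcript nascent is implied.

```python
def coverage_slope(depths, strand="+"):
    """Least-squares slope of read depth versus position, oriented 5'->3'."""
    if strand == "-":
        # Genomic order runs 3'->5' on the minus strand, so flip it.
        depths = depths[::-1]
    n = len(depths)
    if n < 2:
        raise ValueError("need at least two positions to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(depths) / n
    sxy = sum((i - mean_x) * (d - mean_y) for i, d in enumerate(depths))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Toy depths: coverage falls steadily from the 5' end to the 3' end.
plus_intron = [10, 8, 6, 4, 2]
print(coverage_slope(plus_intron))
# The same intron on a minus-strand gene, listed in genomic order.
minus_intron = [2, 4, 6, 8, 10]
print(coverage_slope(minus_intron, strand="-"))
```

A negative slope means more reads near the 5' end of the intron than near the 3' end, which is the pattern described above. Real coverage is noisy, so in practice binning the intron or comparing many introns per gene is probably needed before drawing conclusions.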
All times are GMT -8.