SEQanswers

Old 03-25-2010, 07:05 PM   #1
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191
Cuffdiff multi-protein vs multi-promoter

I was hoping someone could clarify. I don't quite understand the difference between the x_y_tss and x_y_CDS files. I know it has something to do with the tss_ids and p_ids, and I've read the manual 10 times. Do the x_y_tss tests mean that transcripts from the same gene, but at the same promoter site, are expressed differently, and do the x_y_CDS tests mean alternative transcripts with the same promoter? In other words, is tss for exon skipping and CDS for alternative promoters?

If this is the case, how do these differential expression tests differ from the ones in the cds.diff, splicing.diff, and promoters.diff files?
Old 03-26-2010, 09:15 AM   #2
Cole Trapnell
Senior Member
 
Location: Boston, MA

Join Date: Nov 2008
Posts: 212

We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
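To make the two cases concrete, here is a minimal sketch in Python. It is not Cufflinks code: the dictionaries, the ID values, and the helper names `proteins_by_tss` and `switch_can_change_protein` are all hypothetical, and the only thing taken from the post is the idea that each isoform carries a tss_id and a p_id.

```python
def proteins_by_tss(isoforms):
    """Map each tss_id to the set of p_ids of the isoforms that start there."""
    out = {}
    for iso in isoforms:
        out.setdefault(iso["tss_id"], set()).add(iso["p_id"])
    return out


def switch_can_change_protein(isoforms):
    """True if at least two TSSs of the gene lead to different sets of p_ids."""
    sets = list(proteins_by_tss(isoforms).values())
    return any(s != sets[0] for s in sets[1:])


# Case 1 (hypothetical IDs): two TSSs, same protein, e.g. only the UTR differs.
utr_only = [
    {"tss_id": "TSS1", "p_id": "P1"},
    {"tss_id": "TSS2", "p_id": "P1"},
]

# Case 2 (hypothetical IDs): two TSSs, each isoform codes for a different protein.
different_proteins = [
    {"tss_id": "TSS1", "p_id": "P1"},
    {"tss_id": "TSS2", "p_id": "P2"},
]

print(switch_can_change_protein(utr_only))
print(switch_can_change_protein(different_proteins))
```

In the first case a promoter switch leaves the protein unchanged; in the second it can change which protein dominates.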

In our manual and terminology, "splicing" refers only to the processing of a primary transcript, so alternative TSS doesn't strictly fall under "splicing". I realize that many people group alternative TSS under "alternative splicing".

So within a given gene:

X_Y_tss_group_exp has rows that are groups of transcripts that share a tss_id, and gives the total FPKM for each TSS group
X_Y_gene_exp has rows that are groups of transcripts that share a gene_id, and gives the total FPKM for each gene
X_Y_cds_exp has rows that are groups of transcripts that share a p_id, and gives the total FPKM for each CDS group

X_Y_splicing has rows that are groups of transcripts that share a tss_id, and gives the change in relative abundance of transcripts that share a tss_id

X_Y_promoters has rows that are groups of primary transcripts that share a gene_id. There is one primary transcript for each tss_id, and its expression is given in X_Y_tss_group_exp. X_Y_promoters gives the change in relative abundance of primary transcripts that share a gene_id, i.e. genes with promoter switching.

X_Y_cds (not X_Y_cds_exp) is just like X_Y_promoters, except instead of primary transcripts (transcripts grouped by tss_id), we're working with groups of transcripts that code for the same protein (transcripts grouped by p_id).
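
The groupings above can be sketched on toy data. This is only an illustration of which transcripts get grouped together and what "relative abundance" means for each file; it is not the statistical test Cuffdiff performs, and the transcript IDs, FPKM values, and function names are invented for the example. CDS groups are keyed by p_id here, as in X_Y_cds_exp.

```python
from collections import defaultdict

# Hypothetical toy data: one gene, two TSSs, two CDSs, samples X and Y.
transcripts = [
    {"id": "T1", "gene_id": "G1", "tss_id": "TSS1", "p_id": "P1", "fpkm": {"X": 10.0, "Y": 2.0}},
    {"id": "T2", "gene_id": "G1", "tss_id": "TSS1", "p_id": "P2", "fpkm": {"X": 10.0, "Y": 2.0}},
    {"id": "T3", "gene_id": "G1", "tss_id": "TSS2", "p_id": "P1", "fpkm": {"X": 20.0, "Y": 16.0}},
]


def group_totals(rows, key, sample):
    """Total FPKM per group, like the *_exp files (key: gene_id, tss_id or p_id)."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r["fpkm"][sample]
    return dict(totals)


def shares(totals):
    """Relative abundance of each entry within its parent group."""
    whole = sum(totals.values())
    return {k: (v / whole if whole else 0.0) for k, v in totals.items()}


def promoter_shares(rows, gene_id, sample):
    """Share of each TSS group within a gene (the quantity X_Y_promoters compares)."""
    in_gene = [r for r in rows if r["gene_id"] == gene_id]
    return shares(group_totals(in_gene, "tss_id", sample))


def cds_shares(rows, gene_id, sample):
    """Share of each CDS group within a gene (the quantity X_Y_cds compares)."""
    in_gene = [r for r in rows if r["gene_id"] == gene_id]
    return shares(group_totals(in_gene, "p_id", sample))


def splicing_shares(rows, tss_id, sample):
    """Share of each transcript within a TSS group (the quantity X_Y_splicing compares)."""
    in_group = [r for r in rows if r["tss_id"] == tss_id]
    return shares({r["id"]: r["fpkm"][sample] for r in in_group})


for sample in ("X", "Y"):
    print(sample, "promoters:", promoter_shares(transcripts, "G1", sample))
    print(sample, "cds:      ", cds_shares(transcripts, "G1", sample))
    print(sample, "splicing: ", splicing_shares(transcripts, "TSS1", sample))
```

In this toy gene the TSS shares move from an even split in X to mostly TSS2 in Y, while the two transcripts inside TSS1 keep the same proportions, so it would look like promoter switching without a splicing change.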

Last edited by Cole Trapnell; 03-26-2010 at 09:18 AM.
Old 03-26-2010, 10:26 AM   #3
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191

Quote:
Originally Posted by Cole Trapnell
We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
So transcripts sharing p_id means they have alternative UTRs (but same protein sequence) whereas those that have different p_id are involved in exon skipping?

I may just have to wait for the picture. How soon do you think the Cufflinks paper will be out?