
**Cole Trapnell** · 03-26-2010, 09:15 AM

We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.

In our manual and terminology, "splicing" refers only to the processing of a primary transcript, so alternative TSS doesn't strictly fall under "splicing". I realize that many people group alternative TSS under "alternative splicing".

So within a given gene:

X_Y_tss_group_exp has rows that are groups of transcripts that share a tss_id, and gives the total FPKM for each TSS group
X_Y_gene_exp has rows that are groups of transcripts that share a gene_id, and gives the total FPKM for each gene
X_Y_cds_exp has rows that are groups of transcripts that share a p_id, and gives the total FPKM for each CDS group
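To make the three groupings concrete, here is a minimal sketch in Python. It is not Cuffdiff code: the transcript records, IDs, and FPKM values are made up for illustration, and `total_fpkm_by` is a hypothetical helper. It only shows that the same transcripts are summed three different ways, depending on which ID they share.

```python
from collections import defaultdict

# Hypothetical toy data for one gene; IDs and FPKM values are invented.
transcripts = [
    {"id": "tx1", "gene_id": "G1", "tss_id": "TSS1", "p_id": "P1", "fpkm": 10.0},
    {"id": "tx2", "gene_id": "G1", "tss_id": "TSS1", "p_id": "P2", "fpkm": 5.0},
    {"id": "tx3", "gene_id": "G1", "tss_id": "TSS2", "p_id": "P1", "fpkm": 20.0},
]

def total_fpkm_by(records, key):
    """Sum FPKM over all transcripts that share the same value of `key`."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["fpkm"]
    return dict(totals)

tss_group_exp = total_fpkm_by(transcripts, "tss_id")   # one row per TSS group
gene_exp = total_fpkm_by(transcripts, "gene_id")       # one row per gene
cds_exp = total_fpkm_by(transcripts, "p_id")           # one row per CDS group

print(tss_group_exp)
print(gene_exp)
print(cds_exp)
```

Note that tx1 and tx3 start at different TSSs but share a p_id, so they land in different TSS groups and in the same CDS group.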

X_Y_splicing has rows that are groups of transcripts that share a tss_id, and gives the change in relative abundance of the transcripts within each such group

X_Y_promoters has rows that are groups of primary transcripts that share a gene_id. There is one primary transcript for each tss_id, and its expression is given in X_Y_tss_group_exp. X_Y_promoters gives the change in relative abundance of primary transcripts that share a gene_id, i.e. genes with promoter switching.

X_Y_cds (not X_Y_cds_exp) is just like X_Y_promoters, except instead of primary transcripts (transcripts grouped by tss_id), we're working with groups of transcripts that code for the same protein (transcripts grouped by p_id).
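As a rough illustration of the promoter vs. CDS distinction, the sketch below computes within-gene relative abundances under two conditions. Everything in it is an assumption for illustration: the condition names `fpkm_x` and `fpkm_y`, the toy values, and the helper `relative_abundance`. It shows only the fractions being compared, not the statistical test Cuffdiff applies to them.

```python
from collections import defaultdict

# Hypothetical gene with two isoforms that start at different TSSs
# but code for the same protein (same p_id). Values are invented.
transcripts = [
    {"tss_id": "TSS1", "p_id": "P1", "fpkm_x": 30.0, "fpkm_y": 10.0},
    {"tss_id": "TSS2", "p_id": "P1", "fpkm_x": 10.0, "fpkm_y": 30.0},
]

def relative_abundance(records, key, condition):
    """Fraction of the gene's total FPKM contributed by each group sharing `key`."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec[condition]
    gene_total = sum(totals.values())
    if gene_total == 0:
        # Gene not expressed in this condition; report zero for every group.
        return {group: 0.0 for group in totals}
    return {group: value / gene_total for group, value in totals.items()}

# Promoter view: groups are primary transcripts, i.e. transcripts sharing a tss_id.
promoters_x = relative_abundance(transcripts, "tss_id", "fpkm_x")
promoters_y = relative_abundance(transcripts, "tss_id", "fpkm_y")

# CDS view: groups are transcripts sharing a p_id.
cds_x = relative_abundance(transcripts, "p_id", "fpkm_x")
cds_y = relative_abundance(transcripts, "p_id", "fpkm_y")

print(promoters_x, promoters_y)
print(cds_x, cds_y)
```

In this toy gene the dominant promoter switches between the two conditions, but the CDS fractions do not move, because both isoforms share a p_id. That is the case where a promoter switch does not imply a switch in the dominant protein.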

**RockChalkJayhawk** · 03-26-2010, 10:26 AM

Originally posted by Cole Trapnell:

We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.

So transcripts sharing p_id means they have alternative UTRs (but same protein sequence) whereas those that have different p_id are involved in exon skipping?

I may just have to wait for the picture. How soon do you think the Cufflinks paper will be out?


Cuffdiff multi-protein vs multi-promoter
