
  • Cuffdiff multi-protein vs multi-promoter

    I was hoping someone could clarify. I don't quite understand the difference between the x_y_tss and x_y_CDS tests. I know it has something to do with the tss_ids and p_ids, and I've read the manual 10 times. Do the x_y_tss tests mean that transcripts from the same gene and the same promoter site are expressed differently, and do the x_y_CDS tests mean alternative transcripts with the same promoter? In other words, is tss for exon skipping and CDS for alternative promoters?

    If this is the case, how do these differential expression tests differ from the ones in the cds.diff, splicing.diff, and promoters.diff files?

  • #2
    We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.

    In our manual and terminology, "splicing" refers only to the processing of a primary transcript, so alternative TSS doesn't strictly fall under "splicing". I realize that many people group alternative TSS under "alternative splicing".

    So within a given gene:

    X_Y_tss_group_exp has rows that are groups of transcripts that share a tss_id, and gives the total FPKM for each TSS group
    X_Y_gene_exp has rows that are groups of transcripts that share a gene_id, and gives the total FPKM for each gene
    X_Y_cds_exp has rows that are groups of transcripts that share a p_id, and gives the total FPKM for each CDS group

    X_Y_splicing has rows that are groups of transcripts that share a tss_id, and gives the change in relative abundance of transcripts that share a tss_id

    X_Y_promoters has rows that are groups of primary transcripts that share a gene_id. There is one primary transcript for each tss_id, and its expression is given in X_Y_tss_group_exp. X_Y_promoters gives the change in relative abundance of primary transcripts that share a gene_id, i.e. genes with promoter switching.

    X_Y_cds (not X_Y_cds_exp) is just like X_Y_promoters, except instead of primary transcripts (transcripts grouped by tss_id), we're working with groups of transcripts that code for the same protein (transcripts grouped by p_id).
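
    To make the groupings above concrete, here is a minimal sketch in Python on a made-up gene with three isoforms. The transcript names, IDs, and FPKM values are all invented for illustration, and the helper functions are hypothetical. It only shows the bookkeeping (summing FPKM by gene_id, tss_id, or p_id, then taking relative abundances within the gene); it does not reproduce the statistical tests Cuffdiff runs on those quantities.

```python
from collections import defaultdict

# Hypothetical toy data: one gene, three isoforms, FPKM in conditions X and Y.
# Every identifier and number here is invented for illustration.
# Columns: transcript_id, gene_id, tss_id, p_id, fpkm_x, fpkm_y
TRANSCRIPTS = [
    ("T1", "G1", "TSS1", "P1", 30.0, 10.0),
    ("T2", "G1", "TSS1", "P2", 10.0, 10.0),
    ("T3", "G1", "TSS2", "P1", 10.0, 30.0),
]

# Column index of each grouping key in the tuples above.
FIELDS = {"gene_id": 1, "tss_id": 2, "p_id": 3}


def group_fpkm(transcripts, key):
    """Sum FPKM per group and condition; returns {group: [fpkm_x, fpkm_y]}."""
    idx = FIELDS[key]
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in transcripts:
        totals[row[idx]][0] += row[4]
        totals[row[idx]][1] += row[5]
    return dict(totals)


def relative_abundance(totals):
    """Express each group's total as a fraction of the gene total, per condition."""
    sum_x = sum(v[0] for v in totals.values())
    sum_y = sum(v[1] for v in totals.values())
    return {k: (v[0] / sum_x, v[1] / sum_y) for k, v in totals.items()}


gene_totals = group_fpkm(TRANSCRIPTS, "gene_id")  # analogous to X_Y_gene_exp
tss_totals = group_fpkm(TRANSCRIPTS, "tss_id")    # analogous to X_Y_tss_group_exp
cds_totals = group_fpkm(TRANSCRIPTS, "p_id")      # analogous to X_Y_cds_exp

# Shares within the gene: the quantities whose change X_Y_promoters
# (by tss_id) and X_Y_cds (by p_id) are concerned with.
promoter_usage = relative_abundance(tss_totals)
cds_usage = relative_abundance(cds_totals)

for name, table in [("promoter usage", promoter_usage), ("CDS usage", cds_usage)]:
    for group, (frac_x, frac_y) in sorted(table.items()):
        print(f"{name}: {group} {frac_x:.2f} -> {frac_y:.2f}")
```

    In this toy gene the total FPKM is the same in both conditions, the share of TSS1 drops while TSS2 rises (a promoter switch), and yet the P1 and P2 shares do not move, because T1 and T3 start at different TSSs but carry the same p_id. That is the kind of case the promoter and CDS groupings are meant to tell apart.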
    Last edited by Cole Trapnell; 03-26-2010, 09:18 AM.


    • #3
      Originally posted by Cole Trapnell
      We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
      So transcripts sharing p_id means they have alternative UTRs (but same protein sequence) whereas those that have different p_id are involved in exon skipping?

      I may just have to wait for the picture. How soon do you think the Cufflinks paper will be out?
