  • Checking Cuffdiff

    I am using an interesting dataset to "test" differential isoform expression programs.

    Unfortunately, I am not an expert in every (any?) program, so I could use some sanity checking.

    I have three separate tissues: A, B, and C. I want to use (in this case) cuffdiff to identify isoforms which are uniquely expressed in A, B, or C, as I can use other "ground truth" runs to verify these claims.

    I ran the program as follows, once with each of A, B, and C in the first position:
    Code:
     cuffdiff -p 8 -c 10 <ucsc.gtf> A1,A2,A3 B1,B2,B3,C1,C2,C3 -o outdir
    I'm not using a cufflinks-derived gtf or (exclusively) tophat-mapped reads. I imagine I'm doing it all wrong. I have two main questions:

    1) Can I get away with not using the entire cufflinks pathway here? (If not, why doesn't the program complain?)
    2) Am I properly comparing the 3 tissues? Does A vs B,C return transcripts DE in only A, as I intend it to?
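
The one-vs-rest design described in this post can be sketched as a small script that builds the three command lines. cuffdiff reads space-separated arguments as conditions and comma-separated files as replicates of one condition, so `B1,B2,B3,C1,C2,C3` is treated as a single six-replicate condition. The flags `-p`, `-c`, and `-o` are taken from the command above; the `.bam` file names, the `ucsc.gtf` path, and the output directory names are placeholders, and the function name is hypothetical.

```python
# Sketch: build one cuffdiff command per tissue, comparing that tissue
# against the pooled replicates of the other two tissues.
# All file names and output directories are placeholders, not real paths.

def one_vs_rest_commands(replicates, gtf="ucsc.gtf", threads=8, min_count=10):
    """Return {tissue: argv list} for a one-vs-rest cuffdiff design."""
    commands = {}
    for tissue, own in replicates.items():
        # Pool every replicate that does not belong to the tissue of interest.
        rest = [
            rep
            for other, reps in replicates.items()
            if other != tissue
            for rep in reps
        ]
        commands[tissue] = [
            "cuffdiff", "-p", str(threads), "-c", str(min_count),
            "-o", f"outdir_{tissue}_vs_rest",
            gtf,
            ",".join(own),   # condition 1: the tissue of interest
            ",".join(rest),  # condition 2: the other two tissues, pooled
        ]
    return commands


replicates = {
    "A": ["A1.bam", "A2.bam", "A3.bam"],
    "B": ["B1.bam", "B2.bam", "B3.bam"],
    "C": ["C1.bam", "C2.bam", "C3.bam"],
}

commands = one_vs_rest_commands(replicates)
for tissue, argv in commands.items():
    # Print the commands for inspection; nothing is executed here.
    print(" ".join(argv))
```

Each run tests the first tissue against the pool, so a transcript reported as DE differs between that tissue and the B+C (or A+C, A+B) pool; whether it is expressed in only one tissue still has to be checked from the per-condition expression values.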

  • #2
    Hello jparsons,

    I used cufflinks and cuffdiff with GSNAP alignments and it worked fine, so you do not necessarily need to stick to TopHat, as long as the SAM/BAM files have all required columns.
    However, I used the cufflinks -> cuffmerge -> cuffdiff workflow to check my genes, since that is the route the authors suggest (though it was not very successful for me).

    After following some discussions in this forum, see
    http://seqanswers.com/forums/showthread.php?t=20702
    and
    http://seqanswers.com/forums/showthread.php?t=16528

    I concluded that cufflinks/cuffdiff have a problem in their correction for variance. In my analysis, the bigger my sample groups were, the fewer genes were found significantly DE, until none were left. I therefore assume that pooling groups B and C will cause a similar problem, because of the high variance between the two groups.

    Besides that, your command looks fine, so please keep us posted on your progress.
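
The pooling concern above can be illustrated with a little arithmetic. This is only a sketch with made-up expression values for a single gene, not data from this thread, and it shows sample variance in general rather than how cuffdiff models dispersion internally.

```python
from statistics import variance

# Made-up expression values for one gene in two tissues (illustration only).
b = [10.0, 11.0, 9.0]
c = [50.0, 52.0, 48.0]

# Treating B and C as replicates of a single condition merges the two lists.
pooled = b + c

var_b = variance(b)            # spread within B alone
var_c = variance(c)            # spread within C alone
var_pooled = variance(pooled)  # spread once B and C are pooled

print(f"B: {var_b:.1f}  C: {var_c:.1f}  pooled B+C: {var_pooled:.1f}")
```

When B and C differ from each other, the pooled group looks far noisier than either tissue alone, which would make a real difference from A harder to call as significant.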

    • #3
      Rboettcher,

      Thanks for the response. I eventually compared the output from tophat->cufflinks->cuffmerge->cuffdiff to that from only cuffdiff and found that they were (mostly) identical. I am content using cuffdiff without going through the entire pipeline.

      I got results for cuffdiff and finally managed to get RSEM to like me for long enough to spit out quantitations. When compared to the "truth" set (sadly only available on the gene level for now), the RSEM/cuffdiff lists are 'decent' individually, coming close to the expected ratio on average, but having numerous outliers. Taking the overlap set of genes called by both RSEM and cuffdiff makes for a much cleaner picture, with far less deviation from the ratio, and fewer false positives.
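
The overlap approach described above can be sketched with plain set operations. The gene IDs and the truth set here are hypothetical placeholders, not results from this comparison.

```python
# Hypothetical DE calls from each tool and a hypothetical gene-level truth set.
cuffdiff_calls = {"geneA", "geneB", "geneC", "geneD"}
rsem_calls = {"geneB", "geneC", "geneE"}
truth = {"geneB", "geneC", "geneD"}


def precision(calls, truth):
    """Fraction of called genes that are in the truth set."""
    return len(calls & truth) / len(calls) if calls else 0.0


def recall(calls, truth):
    """Fraction of truth-set genes that were called."""
    return len(calls & truth) / len(truth) if truth else 0.0


# Consensus list: genes called by both tools.
consensus = cuffdiff_calls & rsem_calls

for name, calls in [("cuffdiff", cuffdiff_calls),
                    ("RSEM", rsem_calls),
                    ("overlap", consensus)]:
    print(name, sorted(calls),
          f"precision={precision(calls, truth):.2f}",
          f"recall={recall(calls, truth):.2f}")
```

In this toy case the overlap removes both false positives but also drops a true positive that only one tool called, so the cleaner picture comes at some cost in sensitivity.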

      I'm still working on metrics that make sense, so 'decent' and 'cleaner' are the best I can offer for now. I imagine I will develop permissive and restrictive "true positive" lists at each ratio and then generate ROCs for each algorithm I can successfully test.
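
A minimal sketch of the ROC idea above, assuming each algorithm can give every gene a confidence score where higher means "more likely DE" (for example 1 minus a q-value). The scores, gene IDs, and both truth lists are made up for illustration, and tied scores are not given special treatment.

```python
def roc_points(scores, truth):
    """Return (FPR, TPR) points, from the strictest threshold to the loosest.

    scores: {gene: score}, higher score = more confident DE call.
    truth:  set of genes considered true positives.
    Only genes present in `scores` are counted.
    """
    positives = sum(1 for gene in scores if gene in truth)
    negatives = len(scores) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative gene")

    ranked = sorted(scores, key=scores.get, reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for gene in ranked:
        # Lower the threshold one gene at a time and update the counts.
        if gene in truth:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores for six genes from one hypothetical algorithm.
scores = {"g1": 0.99, "g2": 0.95, "g3": 0.80, "g4": 0.60, "g5": 0.30, "g6": 0.10}

restrictive_truth = {"g1", "g2"}              # high-confidence true positives
permissive_truth = {"g1", "g2", "g4", "g5"}   # looser definition

auc_restrictive = auc(roc_points(scores, restrictive_truth))
auc_permissive = auc(roc_points(scores, permissive_truth))
print(f"restrictive AUC={auc_restrictive:.2f}  permissive AUC={auc_permissive:.2f}")
```

Running the same scores against both truth lists shows how much the apparent performance depends on the definition of "true positive", which is one reason to report both curves per algorithm.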

      I'm currently worried about algorithms making calls for downregulated genes or calling them as differentially expressed in cases where the assumption that "A>>B+C or A<<B+C" doesn't hold. I don't know how to handle that yet, and it may be the source of the outliers I mentioned before.

      Overall, I am actually impressed with cuffdiff's performance, given how much grief it gets here. Neither algorithm is even remotely perfect, and neither is obviously superior.
