  • Does Cufflinks Give Me Transcriptomes?

    Hi Everyone,

    I'm a beginner in this area, so please forgive any silly questions.

    I have raw whole-genome scaffold sequences for my organism, with no annotation at all. I ran TopHat and Cufflinks and got some results. But do the Cuffdiff results refer to transcripts or to scaffolds?

    I discussed this with some friends, and they all worry that what Cuffdiff reported is per scaffold rather than per transcript, i.e. that Cuffdiff counted how many reads fell on each scaffold instead of on each transcript. But the Cufflinks website and manual seem to say that Cufflinks does assemble transcripts.

    I pasted some output from my gene_exp.diff:

    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
    XLOC_000001 XLOC_000001 - Scaffold1:84918-92189 5D 20D OK 310.481 1.31159 -7.88704 6.88788 5.66303e-12 1.023e-10 yes
    XLOC_000002 XLOC_000002 - Scaffold1:92592-96046 5D 20D OK 162.253 1.31639 -6.94551 4.25586 2.08245e-05 0.000137689 yes


    I appreciate any comments!
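
One way to sanity-check the excerpt above: the locus column gives a coordinate range inside Scaffold1 rather than the whole scaffold, and the reported log2(fold_change) can be recomputed from value_1 and value_2. A minimal Python sketch, assuming whitespace-separated columns in the order of the header line and inclusive coordinates (the inclusive convention is an assumption made for illustration, and `parse_row` is a hypothetical helper, not part of Cufflinks):

```python
import math

# The two rows from the gene_exp.diff excerpt, whitespace-separated.
ROWS = """\
XLOC_000001 XLOC_000001 - Scaffold1:84918-92189 5D 20D OK 310.481 1.31159 -7.88704 6.88788 5.66303e-12 1.023e-10 yes
XLOC_000002 XLOC_000002 - Scaffold1:92592-96046 5D 20D OK 162.253 1.31639 -6.94551 4.25586 2.08245e-05 0.000137689 yes
"""


def parse_row(line):
    """Split one gene_exp.diff row into the fields used below (hypothetical helper)."""
    fields = line.split()
    scaffold, span = fields[3].split(":")
    start, end = (int(x) for x in span.split("-"))
    return {
        "gene_id": fields[1],
        "scaffold": scaffold,
        "start": start,
        "end": end,
        "value_1": float(fields[7]),
        "value_2": float(fields[8]),
        "log2_fc": float(fields[9]),
    }


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for rec in records:
    # Locus length, treating both coordinates as inclusive (assumption).
    locus_length = rec["end"] - rec["start"] + 1
    # Fold change recomputed as log2(value_2 / value_1).
    recomputed = math.log2(rec["value_2"] / rec["value_1"])
    print(rec["gene_id"], rec["scaffold"], locus_length,
          round(recomputed, 3), rec["log2_fc"])
```

Each tested locus covers only a few kilobases of Scaffold1, which suggests the tests are per assembled locus rather than per scaffold.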

  • #2
    I'm not totally clear on what you've got there. Pretty much the only application for Cufflinks is when you have RNA-Seq reads that you align to some reference. Cufflinks processes those alignments and decides how the reads, as aligned to the reference, might form transcripts. It then returns an annotation of those transcripts as a GTF file, which basically just gives the start and end coordinates of the estimated exons, with attributes grouping them into transcripts. The Cufflinks suite uses ids like "XLOC_xxxxxx" to name the loci (genes) it predicts from the alignments; the transcripts within them get "TCONS_xxxxxxxx" ids.

    Does this match your experiment?
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */
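
To make the GTF description above concrete, here is a small Python sketch that groups exon records by their transcript_id attribute. The GTF lines, coordinates, and ids are made up for illustration, and `exons_by_transcript` is a hypothetical helper; only the nine-column, tab-separated GTF layout is assumed.

```python
import re
from collections import defaultdict

# Hypothetical GTF lines; coordinates and ids are invented for this example.
GTF = "\n".join([
    'Scaffold1\tCufflinks\texon\t100\t250\t.\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1";',
    'Scaffold1\tCufflinks\texon\t400\t520\t.\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "2";',
    'Scaffold1\tCufflinks\texon\t100\t250\t.\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.2"; exon_number "1";',
    'Scaffold1\tCufflinks\texon\t700\t900\t.\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.2"; exon_number "2";',
])

# Matches attribute pairs of the form: key "value";
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)";')


def exons_by_transcript(gtf_text):
    """Return {transcript_id: [(seqname, start, end), ...]} sorted by start."""
    grouped = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # skip malformed lines and non-exon features
        attrs = dict(ATTRIBUTE.findall(cols[8]))
        grouped[attrs["transcript_id"]].append((cols[0], int(cols[3]), int(cols[4])))
    for exons in grouped.values():
        exons.sort(key=lambda exon: exon[1])
    return dict(grouped)


transcripts = exons_by_transcript(GTF)
for transcript_id, exons in transcripts.items():
    print(transcript_id, exons)
```

In this toy input the two transcripts share their first exon and differ in the second, which is the kind of structure the exon-to-transcript grouping in a GTF file is meant to capture.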


    • #3
      Thank you for responding.

      The problem is that I found more than one gene within a single Cufflinks-predicted transcript. So my friend questioned whether the Cufflinks assembly is any good. Is it a real transcript, whose differential expression would be meaningful, or is it a long fragment/contig containing several genes, in which case the differential expression is not accurate?


      • #4
        So cuffdiff won't give you genomic scaffold FPKMs unless the annotation file you are feeding it has the whole scaffold listed as a gene or transcript, which is definitely not the right thing to do.

        If you have no annotation, I wouldn't recommend relying on Cufflinks RABT (reference annotation based transcript) assembly by itself. Instead, I would suggest assembling your RNA-Seq data de novo with Trinity or Trans-ABySS, then annotating the genome with MAKER using the RNA-Seq-derived transcripts. Once that is complete, you will have a much more reliable GTF annotation file to feed Cufflinks, and you can run the RABT assembly to add additional genes/transcripts if you wish. That said, I would trust PASA more than Cufflinks for adding transcripts to your MAKER annotation file.

        I would warn you that this process is pretty involved, even for RNA-Seq veterans. But if you or your group went through the trouble of constructing a decent genome, you'd be doing yourselves a disservice by not creating a decent annotation to go with it.
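
A quick way to check for the whole-scaffold case described above is to compare each annotated feature's span with the length of its scaffold. The sketch below is only an illustration: the scaffold length, the "whole_scaffold_entry" record, the 0.95 cutoff, and the `whole_scaffold_features` helper are all hypothetical; only the XLOC_000001 coordinates come from the gene_exp.diff excerpt in the first post.

```python
def whole_scaffold_features(features, scaffold_lengths, min_fraction=0.95):
    """Return names of features covering at least min_fraction of their scaffold.

    features: iterable of (name, scaffold, start, end), inclusive coordinates.
    scaffold_lengths: {scaffold: length in bases}.
    """
    flagged = []
    for name, scaffold, start, end in features:
        span = end - start + 1
        if span / scaffold_lengths[scaffold] >= min_fraction:
            flagged.append(name)
    return flagged


# Hypothetical scaffold length; the real value would come from the assembly FASTA.
scaffold_lengths = {"Scaffold1": 500_000}

features = [
    ("XLOC_000001", "Scaffold1", 84918, 92189),        # locus from the excerpt
    ("whole_scaffold_entry", "Scaffold1", 1, 500_000),  # made-up bad record
]

print(whole_scaffold_features(features, scaffold_lengths))
```

With these inputs only the made-up whole-scaffold record is flagged; a locus spanning a few kilobases of a much longer scaffold is not.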


        • #5
          Agreed. Also, bioinformatics with unannotated species is no simple matter. I mostly have the luxury of working with mouse data, which is nicely supported, and even that gets complicated at times. I have been collaborating with someone working with frog data, and it's a real mess trying to use Cufflinks and RNA-Seq reads to construct a real reference. For one thing, Cufflinks will give you real intron chains, but it doesn't do any biologically informed analysis to determine the end points of transcripts. Plus, you're at the mercy of the randomness of RNA-Seq data. Cufflinks tries to fill in gaps and make guesses, but these are based entirely on simple thresholds: if a gap in coverage is less than 50 bases it will call that a continuous feature, and to call the end of a 3' or 5' exon it just uses a pileup cutoff threshold. In addition, it does whatever it can to report the smallest number of transcripts that "explain" the coverage, so it can easily be tricked by a complex locus.
          /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
          Salk Institute for Biological Studies, La Jolla, CA, USA */
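
The gap rule described above can be pictured with a toy example. This is not Cufflinks' actual code, just a minimal Python sketch of a threshold-based merge, assuming inclusive coordinates and using the 50-base figure from the post; `merge_covered_blocks` and the input blocks are hypothetical.

```python
def merge_covered_blocks(blocks, max_gap=50):
    """Merge covered blocks separated by fewer than max_gap uncovered bases.

    blocks: iterable of (start, end) with inclusive coordinates.
    """
    merged = []
    for start, end in sorted(blocks):
        # Number of uncovered bases between the previous block and this one.
        if merged and start - merged[-1][1] - 1 < max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


# Made-up coverage blocks: a 29-base gap, then a 99-base gap.
blocks = [(100, 200), (230, 300), (400, 450)]
print(merge_covered_blocks(blocks))
```

The first two blocks are joined because only 29 uncovered bases separate them, while the third stays separate. A rule like this has no notion of gene boundaries, so two neighbouring genes with a short, lightly covered stretch between them could end up in one feature.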
