  • Daniel1977
    Junior Member
    • Mar 2013
    • 6

    No Replicates Cuffdiff; How does the 'variability model good fit' apply? and other...

    Hi,

    I want to clarify (in my head of course) how the system works so that I can make the proper choices when I use it.

    I have RNA-seq data for several conditions: a cell line with or without a virus (and deletion mutants) and/or different treatments. There are no biological or technical replicates, and yes, I understand what this means for the results.

    So, what Cuffdiff will do is "pool the conditions together to derive a dispersion model", which assumes that most genes won't be differentially expressed and then describes overall gene variances as a function of expression level. Roughly: my gene's FPKM is 500; I don't know its variance because I have no replicates, but most genes around that expression level have a variance of 620, so a variance of 620 is assigned to my gene for testing.

    Q1- Is this right?
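
To make the idea in Q1 concrete, here is a minimal Python sketch of pooling conditions to get a mean-variance trend. It is not Cuffdiff's actual implementation: the quantile binning, the use of the median, and the names `fit_pooled_variance` and `predict_variance` are assumptions made purely for illustration. It treats the conditions as if they were replicates, computes each gene's mean and variance across them, and reads off a typical variance for a given expression level.

```python
import numpy as np

def fit_pooled_variance(expr, n_bins=10):
    """Fit a crude mean-variance trend from a genes x conditions matrix.

    Hypothetical helper: conditions are treated as replicates, which only
    makes sense if most genes are not differentially expressed.
    """
    expr = np.asarray(expr, dtype=float)
    means = expr.mean(axis=1)
    variances = expr.var(axis=1, ddof=1)  # sample variance across conditions
    keep = means > 0                      # log10 needs positive means
    log_means = np.log10(means[keep])
    variances = variances[keep]
    # Quantile-based edges so every bin holds a similar number of genes.
    edges = np.quantile(log_means, np.linspace(0, 1, n_bins + 1))
    centers, med_vars = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (log_means >= lo) & (log_means <= hi)
        if in_bin.any():
            centers.append(np.median(log_means[in_bin]))
            med_vars.append(np.median(variances[in_bin]))
    return np.array(centers), np.array(med_vars)

def predict_variance(fpkm, centers, med_vars):
    """Look up the typical variance for one expression value.

    Linear interpolation between bin centers; values outside the fitted
    range are clamped to the first or last bin by np.interp.
    """
    return float(np.interp(np.log10(fpkm), centers, med_vars))
```

With a real table one would call `fit_pooled_variance` once on all conditions and then `predict_variance(500, centers, med_vars)` for a gene at FPKM 500. The more conditions go into the matrix, the more values each per-gene variance is based on.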

    Next, "Cuffdiff assumes that genes and transcripts with similar expression levels have similar variances in those expression levels. However, that's not always the case - some genes have unusually high variability, for biologically meaningful reasons. Thus, Cuffdiff checks that the variability model is a good fit before performing any significance testing. This works as follows: Cuffdiff first calculates the expression level of the gene or transcript in each condition being compared. It does so by pooling the data for the replicates of a condition together. The model gives an expression and variance estimate for each condition, along with confidence intervals around the expression estimates. Then, Cuffdiff calculates the expression level of the gene in each replicate independently. If the expression level of one or more replicates lies outside the confidence interval generated by the model, Cuffdiff flags the transcript as poorly fit by the model, and no significance testing is performed."
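
The check described in the quoted passage can be sketched roughly as follows. This is an illustration only: the passage does not say how the confidence interval is built, so the normal-approximation interval, the `z` value, and the function name `poorly_fit` are all assumptions.

```python
import math

def poorly_fit(pooled_mean, pooled_variance, replicate_values, z=1.96):
    """Return True if any replicate falls outside the model's interval.

    Assumed interval: pooled_mean +/- z * sqrt(pooled_variance).
    This is a stand-in, not Cuffdiff's real interval construction.
    """
    half_width = z * math.sqrt(pooled_variance)
    lower = pooled_mean - half_width
    upper = pooled_mean + half_width
    return any(v < lower or v > upper for v in replicate_values)
```

In this sketch, a condition with a single sample has a pooled estimate equal to its only "replicate", so the value always sits at the centre of the interval and the flag can never be raised. Whether Cuffdiff itself behaves this way without replicates is exactly what Q2 asks.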

    Q2- How does this apply to my setting (no replicates)?

    Q3- What exactly are the 'NOTEST', 'HIGHDATA' and 'FAIL' flags?

    Q4- How does the --min-outlier-p option apply to my setting (no replicates)?

    Q5- Should I run Cuffdiff with all my conditions at once so that the variance model benefits from all the possible data instead of doing pairwise Cuffdiff runs?

    Q6- In case I want to use the expression values with tools not included in Cuffdiff (PCAs, correlations etc.) is the FPKM value alone the one to use?
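
For the kind of downstream analysis mentioned in Q6, a minimal Python sketch of a PCA on an FPKM table could look like this. The log2 transform with a pseudocount of 1 is a common convention and an assumption here, not something Cuffdiff prescribes; `pca_samples` and its parameters are hypothetical names.

```python
import numpy as np

def pca_samples(fpkm, n_components=2, pseudocount=1.0):
    """PCA of samples from a genes x samples FPKM matrix.

    Returns (scores, explained), where scores has one row per sample and
    explained holds the fraction of variance of each returned component.
    """
    # Log-transform and transpose so rows are samples, columns are genes.
    X = np.log2(np.asarray(fpkm, dtype=float) + pseudocount).T
    X = X - X.mean(axis=0)  # centre every gene across samples
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    total = float((S ** 2).sum())
    if total > 0:
        explained = (S[:n_components] ** 2) / total
    else:
        explained = np.zeros(min(n_components, S.size))
    return scores, explained
```

The same log-transformed matrix can be passed to `np.corrcoef` for sample-to-sample correlations. Without replicates, such plots describe the samples but say nothing about significance.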

    Any comments etc. would be greatly appreciated.
    D
  • Daniel1977
    Junior Member
    • Mar 2013
    • 6

    #2
    Any comments at least for Q4 and Q5?
    cheers


    • Krish_143
      Member
      • Jan 2012
      • 45

      #3
      If I do not have replicates, I would either:
      1) run cuffdiff on pairs of samples, (ctl, sample1), (ctl, sample2), and so on for all of them, and then compare the results together,
      or
      a) prefer the Cufflinks first step on all samples, so that I have an FPKM value for every gene in every sample, and then analyze the genes of interest in R.

      Choose whichever you like.
      Last edited by Krish_143; 03-21-2013, 09:46 AM.
      Krishna


      • Daniel1977
        Junior Member
        • Mar 2013
        • 6

        #4
        Thanks Krish_143.

        Yes, one can always use option (a).

        But if you want to use cuffdiff, it seems better to run it with all the conditions at once so that the variance model is better informed; you will then probably get more precise results.

        Also, regarding Q4, I've tried --min-outlier-p values of 0.01, 0.05 and 0.99 and got:

        P=0.99
        Performed 0 isoform-level transcription difference tests
        Performed 0 tss-level transcription difference tests
        Performed 38023 gene-level transcription difference tests
        Performed 0 CDS-level transcription difference tests

        P=0.05
        Performed 45266 isoform-level transcription difference tests
        Performed 40546 tss-level transcription difference tests
        Performed 38023 gene-level transcription difference tests
        Performed 39875 CDS-level transcription difference tests

        P=0.01
        Performed 45266 isoform-level transcription difference tests
        Performed 40546 tss-level transcription difference tests
        Performed 38025 gene-level transcription difference tests
        Performed 39875 CDS-level transcription difference tests

        It seems that the option isn't really doing anything (judging also from other comments in the post), but it puzzles me that the 0.99 run performs no tests at all except the gene-level ones.


        • pengchy
          Senior Member
          • Feb 2009
          • 116

          #5
          For experiments without any replicates, the following threads give suggestions:
          How to detect alternative splicing when there is no replicate
          DESeq, experimental design
          DESeq without replicates

          The DESeq documentation also describes a pipeline for experiments without any replicates.
          Last edited by pengchy; 05-11-2013, 07:09 PM.


          • Daniel1977
            Junior Member
            • Mar 2013
            • 6

            #6
            Thanks for that, I'll look into it.


            • IBseq
              Member
              • Jul 2012
              • 56

              #7
              Hi Daniel1977,
              Did you get any answer on Q1 and Q2?
              I am also running cuffdiff without replicates and would like to understand how it handles this.

              thanks
              ibseq
