
  • Cufflinks FPKM_conf_hi smaller than FPKM

    I am running into an issue with the current version of Cufflinks (2.1.1) that I have not seen in previous versions. Some of the FPKM_conf_hi values are smaller than the FPKM values. Since the Cufflinks manual says FPKM_conf_hi is supposed to represent "the upper limit of the 95% FPKM confidence interval", I cannot see how this can be the case.

    Here is a sampling of some of the problem values for FPKM, FPKM_conf_lo, and FPKM_conf_hi:

    Code:
    23.2928 4.09379 12.6907
    910.117 734.208 787.416
    59.0504 12.6523 27.3453
    31.9551 0       5.38742
    40.1109 16.4458 30.6233
    864.432 660.221 711.678
    95.3611 82.1519 94.1825
    325.557 0       8.91302
    16.3467 6.4136  14.623
    61.4582 49.2812 60.396
    9.48964 0       4.59155
    20.9856 13.6854 20.8668
    195.925 156.216 176.781
    77.0662 60.9163 74.946
    23.0498 13.7803 22.639
    709.762 496.554 549.887
    7.83772 1.44049 7.56256
    65.5262 50.9647 63.5416
    36.7331 21.2266 33.8123
    10.3437 3.33931 10.2962
    38.6569 15.8121 28.5183
    22.1078 1.89032 9.9242
    80.9467 66.8287 79.0195
    61.4388 47.2034 58.6141
    12.8293 0       4.77232
    In this sample, it occurs 75 times out of 23424 annotations.
    The version of Cufflinks I was using before (2.0.2) does not have this issue at all. Another person at my institution using Cufflinks 2.1.1 with a separate organism, genome, annotation, and aligner is also seeing this issue.
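
    For anyone who wants to check their own output, here is a minimal sketch for counting the affected rows. It assumes the three values (FPKM, FPKM_conf_lo, FPKM_conf_hi) have already been extracted into whitespace-separated columns as in the listing above; the function name `find_inconsistent` is made up for illustration, and the last row of the sample is a hypothetical consistent row added for contrast.

```python
def find_inconsistent(rows):
    """Return (fpkm, lo, hi) tuples whose point estimate lies outside [lo, hi]."""
    bad = []
    for line in rows.strip().splitlines():
        fpkm, lo, hi = (float(x) for x in line.split())
        if not (lo <= fpkm <= hi):
            bad.append((fpkm, lo, hi))
    return bad


# First three rows are taken from the listing above;
# the fourth is a hypothetical row where FPKM sits inside its interval.
sample = """
23.2928 4.09379 12.6907
910.117 734.208 787.416
31.9551 0       5.38742
100.0   80.0    120.0
"""

flagged = find_inconsistent(sample)
print(len(flagged), "of 4 rows have FPKM outside the reported interval")
```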

    Anyone else notice this?

  • #2
    I have also observed this in version 2.1 output. The version update information on the Cufflinks website reads:

    "The high and low confidence intervals reported by Cufflinks and Cuffdiff are now constructed from the samples generated from the beta negative binomial model, rather than estimated as twice the standard deviation. This better reflects the underlying distribution of the FPKM."

    In previous versions, the CI was taken as the FPKM estimate plus/minus 2*stdev, so the point estimate was, by definition, always in the middle of the CI. But plus/minus 2*stdev does not correspond to a particular % CI, because FPKMs are not normally distributed. By Chebyshev's inequality, going 2 stdevs out from the mean only guarantees a minimum coverage probability of 75%.
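
    To illustrate the Chebyshev point, here is a rough sketch using simulated draws. The log-normal distribution and its parameters are assumptions chosen only because they are right-skewed; they are not what Cufflinks uses. The sketch measures how much of the sample falls inside mean plus/minus 2*stdev and compares that with the Chebyshev floor of 1 - 1/2^2.

```python
import random
import statistics


def two_sd_coverage(values):
    """Return (lo, hi, fraction of values strictly inside mean +/- 2*stdev)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population stdev of the sample itself
    lo, hi = mean - 2 * sd, mean + 2 * sd
    inside = sum(lo < v < hi for v in values)
    return lo, hi, inside / len(values)


rng = random.Random(0)
# Hypothetical right-skewed, non-negative draws (not Cufflinks output).
draws = [rng.lognormvariate(0.0, 1.5) for _ in range(10_000)]

lo, hi, coverage = two_sd_coverage(draws)
chebyshev_floor = 1 - 1 / 2**2  # 0.75 for k = 2

print(f"interval: [{lo:.2f}, {hi:.2f}]")
print(f"coverage: {coverage:.3f} (Chebyshev guarantees at least {chebyshev_floor})")
```

    For skewed draws like these, the lower bound also comes out negative, which makes no sense for an FPKM and is presumably why the old output clamped FPKM_conf_lo at 0.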

    In version 2.1, it is not clear how Cuffdiff chooses the FPKM point estimate and CI from the beta negative binomial samples, but one can imagine how the situation you describe might arise. As I understand it, there are two general ways Cuffdiff might be defining a Bayesian-style 95% credible interval: either they are taking the narrowest interval containing 95% of all simulated draws, or they are taking the interval with 2.5% of draws below it and 2.5% of draws above it (these turn out to be different intervals for non-symmetric distributions). If they are defining the FPKM point estimate as the mean of the beta negative binomial sample, there is no guarantee that the mean lies within an interval built either way.
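
    Here is a small sketch of both interval definitions, to show that the mean really can land above either upper bound. The draws are made up (a skewed bulk between 1 and 10 plus a few extreme values), and the helper names `equal_tailed` and `narrowest` are mine; none of this is taken from the Cufflinks source, so treat it as an illustration of the idea and not as what Cuffdiff actually does.

```python
import statistics


def equal_tailed(draws, mass=0.95):
    """Interval leaving (1 - mass) / 2 of the draws in each tail."""
    s = sorted(draws)
    tail = round(len(s) * (1 - mass) / 2)
    return s[tail], s[len(s) - 1 - tail]


def narrowest(draws, mass=0.95):
    """Shortest interval containing a fraction `mass` of the draws."""
    s = sorted(draws)
    k = len(s) - round(len(s) * (1 - mass))  # number of draws to cover
    start = min(range(len(s) - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]


# Hypothetical draws: 980 values between 1 and 10 (denser near 1),
# plus 20 extreme values that pull the mean far to the right.
bulk = [1 + 9 * (i / 979) ** 2 for i in range(980)]
draws = bulk + [10_000.0] * 20

mean_value = statistics.mean(draws)
et = equal_tailed(draws)
hpd = narrowest(draws)

print("mean:", round(mean_value, 2))
print("equal-tailed 95% interval:", et)
print("narrowest 95% interval:", hpd)
```

    Only 2% of the draws are extreme, so both 95% intervals stay inside the bulk, while the mean is dragged well above both upper bounds.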

    Hope that helps!



    • #3
      I too am noticing that many of the FPKM_conf_hi values are smaller than the FPKM values using Cuffdiff v2.1.1, a problem I did not observe with Cuffdiff v2.0.2. For example:

      stim_FPKM stim_conf_lo stim_conf_hi
      5.63254 1.72857 5.80799
      197.624 52.7574 118.929
      6.47284 2.25466 6.32556
      115.021 45.1935 100.745
      42.5205 10.6235 27.279
      8.39448 2.64285 7.7857
      176.191 61.9773 136.513

      I tried changing several of the parameters, but to no avail. I think my solution is to simply go back to Cuffdiff v2.0.2, unless someone else has found a fix? An upper bound on a 95% CI that is less than the mean is nonsensical in my simple mind.



      • #4
        Anything new about this issue? Many of my FPKM values are higher than my FPKM_conf_hi, even when I use version 2.0.2.

        I am therefore wondering if I should trust those FPKM values or simply aim for the mid-point of the conf_lo-conf_hi interval.
