  • Cuffdiff DE significance of zero FPKM values

    Hi-

    I was using cuffdiff 1.3.0 to compare four BAMs aligned with TopHat 1.4.1. I have noticed that zero FPKM values are never considered significant when compared with large FPKM values. Here is one example that I am concerned about:

    XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FF FMC OK 0.257545 46.5301 7.4972 -4.50075 6.77156e-06 0.000658507 yes
    XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FFC FMC OK 0 46.5301 1.79769e+308 1.79769e+308 0.0789649 0.365761 no
    XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FM FMC OK 0.0380371 46.5301 10.2565 -5.37179 7.79602e-08 2.39776e-05 yes

    In case line wrapping destroys the genes_exp.diff output above, here is a brief summary: FMC has an FPKM of 46.5301, and when it is compared to FF (FPKM=0.257545) and FM (FPKM=0.0380371) the q-value is below the FDR of 0.05, so the significance value is appropriately labeled 'yes'. However, FFC has an FPKM of zero, and that differential test is not significant. The goal of the experiment was to identify uniquely expressed genes, i.e. genes found only in the FMC data set. As such, an infinite log(fold_change) seems more significant than the other comparisons, not less. Can anyone explain this?
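A minimal sketch for checking the three rows above by hand. The column names are an assumption (the post does not show a header), and `parse_rows`, `log2_fold_change` and `looks_like_infinity` are hypothetical helpers, not part of Cuffdiff. It recomputes log2(value_2 / value_1) and flags values close to the largest double, which is presumably how an infinite fold change ends up printed as 1.79769e+308.

```python
import math

# Rows copied from the post. Column names are assumed, not shown in the post.
ROWS = """\
XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FF FMC OK 0.257545 46.5301 7.4972 -4.50075 6.77156e-06 0.000658507 yes
XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FFC FMC OK 0 46.5301 1.79769e+308 1.79769e+308 0.0789649 0.365761 no
XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FM FMC OK 0.0380371 46.5301 10.2565 -5.37179 7.79602e-08 2.39776e-05 yes
"""

COLUMNS = [
    "test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
    "value_1", "value_2", "log2_fold_change", "test_stat", "p_value",
    "q_value", "significant",
]
NUMERIC = ("value_1", "value_2", "log2_fold_change", "test_stat", "p_value", "q_value")


def parse_rows(text):
    """Split whitespace-separated rows into dicts with numeric fields as floats."""
    rows = []
    for line in text.strip().splitlines():
        rec = dict(zip(COLUMNS, line.split()))
        for key in NUMERIC:
            rec[key] = float(rec[key])
        rows.append(rec)
    return rows


def log2_fold_change(v1, v2):
    """log2(v2 / v1), returning +/- infinity when exactly one side is zero."""
    if v1 == 0 and v2 == 0:
        return 0.0
    if v1 == 0:
        return math.inf
    if v2 == 0:
        return -math.inf
    return math.log2(v2 / v1)


def looks_like_infinity(x, threshold=1e300):
    """True for values near the largest double (the printed value is rounded)."""
    return abs(x) > threshold


rows = parse_rows(ROWS)
for rec in rows:
    recomputed = log2_fold_change(rec["value_1"], rec["value_2"])
    print(rec["sample_1"], rec["sample_2"], recomputed,
          looks_like_infinity(rec["log2_fold_change"]), rec["significant"])
```

The recomputed values for FF and FM agree with the reported column to the printed precision, which suggests that column is a base-2 log of value_2 over value_1; the FFC row is the only one flagged as a stand-in for infinity.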

  • #2
    Hi,
    that is what has been bugging me as well these days.

    Maybe the math does not work properly with a zero value, but that is only a guess. My idea is to replace all the zeros with something much smaller than the minimum FPKM found in the GTF files. If the minimum FPKM is 0.01, for example, I will try replacing all exact zeros with 0.00001, then rerun the DE analysis and see what the difference is.
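A rough sketch of the zero-replacement idea, applied only to the fold change. `floor_zero_fpkm` and `log2_fc` are hypothetical helper names, and the pseudo-values are arbitrary choices for illustration. It does not reproduce Cuffdiff's significance test; it only shows how strongly the resulting fold change depends on the value substituted for zero.

```python
import math


def floor_zero_fpkm(fpkm, pseudo):
    """Return `pseudo` in place of an exact zero; leave other values unchanged."""
    return pseudo if fpkm == 0 else fpkm


def log2_fc(v1, v2, pseudo):
    """log2(v2 / v1) after replacing exact zeros with `pseudo`."""
    a = floor_zero_fpkm(v1, pseudo)
    b = floor_zero_fpkm(v2, pseudo)
    return math.log2(b / a)


# FFC (FPKM = 0) versus FMC (FPKM = 46.5301) from the first post,
# with three arbitrary replacement values for the zero.
for pseudo in (1e-2, 1e-3, 1e-5):
    print(pseudo, log2_fc(0.0, 46.5301, pseudo))
```

Each tenfold reduction of the replacement value adds log2(10), about 3.3, to the fold change, so the number obtained this way is set by the chosen replacement value rather than by the data.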

    • #3
      Hi, did you find an answer to your question?
      I have the same problem and can't find a solution.

      best,
      ib

      • #4
        Hi,

        About this point:
        I do see significant ones in my results, shown here.

        I am using TopHat 2.0 and Cufflinks 2.0.

        • #5
          But did you have replicates? I have a one-to-one comparison with no replicates (biological or technical).

          • #6
            I do have replicates (3 in total).

            Therefore I checked two other samples, for which I only have one replicate each.
            The result is the same as yours: a zero value in one sample is never considered significant. So I guess their algorithm does not treat a comparison of one zero and one non-zero value as significant.
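A sketch of how this kind of check could be tallied across a whole genes_exp.diff file. It assumes the file is tab-delimited with the header names listed below (an assumption, not something shown in this thread); `tally_zero_rows` and `as_tsv` are hypothetical helpers. The three rows from the first post stand in for a real file.

```python
import csv
import io
from collections import Counter

# Assumed header names; a real file carries its own header row.
HEADER = [
    "test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
    "value_1", "value_2", "log2_fold_change", "test_stat", "p_value",
    "q_value", "significant",
]

# Rows copied from the first post, used here in place of a real file.
SAMPLE_ROWS = [
    "XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FF FMC OK 0.257545 46.5301 7.4972 -4.50075 6.77156e-06 0.000658507 yes",
    "XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FFC FMC OK 0 46.5301 1.79769e+308 1.79769e+308 0.0789649 0.365761 no",
    "XLOC_017834 XLOC_017834 Olfr1307 2:111784672-111785611 FM FMC OK 0.0380371 46.5301 10.2565 -5.37179 7.79602e-08 2.39776e-05 yes",
]


def as_tsv(rows):
    """Build tab-delimited text with a header from whitespace-separated rows."""
    lines = ["\t".join(HEADER)] + ["\t".join(r.split()) for r in rows]
    return "\n".join(lines) + "\n"


def tally_zero_rows(handle):
    """Count yes/no calls, split by whether exactly one FPKM is zero."""
    counts = Counter()
    for rec in csv.DictReader(handle, delimiter="\t"):
        v1 = float(rec["value_1"])
        v2 = float(rec["value_2"])
        one_sided_zero = (v1 == 0) != (v2 == 0)
        counts[(one_sided_zero, rec["significant"])] += 1
    return counts


counts = tally_zero_rows(io.StringIO(as_tsv(SAMPLE_ROWS)))
for (one_sided_zero, call), n in sorted(counts.items()):
    print("one-sided zero:", one_sided_zero, "significant:", call, "rows:", n)
```

To run it on real output, pass an opened file handle to `tally_zero_rows` instead of the `io.StringIO` stand-in. A nonzero count for one-sided-zero rows called 'yes' would show that such tests can reach significance in that data set.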

            • #7
              Thanks. How would you approach this issue? I cannot sequence replicates, so I have to stick with these results, but does that mean THEY ARE NOT RELIABLE?
              I have used a recently published program called GFOLD. It ranks all the DE genes but does not tell you whether they are significant; at least I can see the assigned values. I still can't figure out how to test significance.

              ib

              • #8
                Hi,

                I am not a real expert on statistics,
                so I still think some of them may be reliable even though they might not pass the statistical test.
                I haven't tried to tackle that yet, but I am planning to combine some other similar control samples. That way we would have more than one control sample and one test sample, which may give us more confidence in DE detection. This is my plan, but I haven't tested it yet. Just my 2 cents.

                liye
