  • FPKMs and Limma R package

    Hi!

    I have generated a dataset with 9 different biological samples (plus replicates) and analyzed it using TopHat and Cufflinks, so I currently have a table with the FPKM values for every gene in each sample.

    I am trying to use the limma R package to model and extract differentially expressed genes across these samples (instead of the 2-by-2 comparisons that can be made using Cuffdiff), and I have run into the following problem, on which I would really appreciate some advice.

    I have to transform the FPKM values into log2 values before passing them to the lmFit() function. However, since the table contains zeros, doing this directly on the FPKM table generates a lot of infinite values (log2(0) = -Inf). I was therefore thinking of adding a specific number to all of the FPKM values before transforming them into log2 data. So my questions are:

    1. Is this a good approach?
    Are there better alternatives?

    2. Is there a specific value that should be added?
    I was thinking of adding a small value (e.g. 10^-10, for which log2(10^-10) ~ -33, at the opposite end of the range from the positive log2 values; in my table the maximum log2(FPKM) is ~22).
    But I am not sure if this is correct and would also like to know if there is a "normal" value that people usually add.

    Thanks!!!

    Note: I also have the read counts and could eventually do everything with the voom function and then limma, but since all my initial analysis uses the FPKMs I would really like to stick with them for consistency... so any help is deeply appreciated!
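
To make the offset question concrete, here is a minimal sketch of how the choice of offset changes the log2 values. It is written in Python only to show the arithmetic (it is not limma code), and the function name and the FPKM values are made up for illustration.

```python
import math


def log2_with_offset(fpkm_values, offset):
    """Return log2(x + offset) for each FPKM value.

    The offset must be positive so that zero FPKMs map to a finite number.
    """
    if offset <= 0:
        raise ValueError("offset must be positive so that zeros stay finite")
    return [math.log2(x + offset) for x in fpkm_values]


# Hypothetical FPKM values for one sample: a zero, a low value, two higher ones.
fpkm = [0.0, 0.5, 10.0, 1000.0]

for offset in (1e-10, 0.25, 1.0):
    logged = log2_with_offset(fpkm, offset)
    # Distance on the log2 scale between FPKM = 0 and FPKM = 0.5.
    gap = logged[1] - logged[0]
    print(f"offset={offset:g}: log2(0 + offset)={logged[0]:.2f}, gap to FPKM 0.5={gap:.2f}")
```

With a very small offset, the distance on the log2 scale between a zero and a barely expressed gene becomes very large, so the offset itself can dominate the apparent fold changes for lowly expressed genes.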

  • #2
    Adding a small offset seems to be the common method. If you look at how edgeR calculates log2(RPKM), for example, you'll see that it adds a small value (0.25 by default) to the raw counts before computing CPM, which is then used to get RPKM. For comparison, an offset of 0.25 on the raw count scale corresponds to 0.25 FPKM for a 1 kb gene in a library of 1 million reads, or about 0.01 FPKM at 25 million reads (depending on how library sizes were computed), which is many orders of magnitude larger than 10^-10.
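
To see how a count-scale offset translates to the FPKM scale, here is a small sketch using the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function name and the library sizes are assumptions chosen for illustration. This is not edgeR's exact procedure, which also scales the prior count by library size.

```python
def fpkm_from_count(count, gene_length_bp, library_size):
    """FPKM = count * 1e9 / (gene length in bp * total mapped fragments)."""
    if gene_length_bp <= 0 or library_size <= 0:
        raise ValueError("gene length and library size must be positive")
    return count * 1e9 / (gene_length_bp * library_size)


# FPKM equivalent of a 0.25-count offset for a 1 kb gene,
# at a few hypothetical library sizes.
for library_size in (1_000_000, 10_000_000, 25_000_000):
    value = fpkm_from_count(0.25, 1000, library_size)
    print(f"{library_size:>10,d} fragments: {value:g} FPKM")
```

The FPKM equivalent of a fixed count offset shrinks as the library gets deeper, which is one reason a single "normal" FPKM offset is hard to define.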

    • #3
      Thanks!
      I have tried this but I am not happy with the results... I get really strange volcano plots (see figure), which I guess are a consequence of different variance stabilization methods...
      Therefore, I think I will stick with the use of the read counts (even if it means going back and re-doing my previous analysis).

      file:///Users/elsaabranches/Desktop/volcanos/volcano_plots.jpg

      • #4
        Yes, that's a wise decision. Use the voom function to process the counts prior to lmFit. The voom-limma pipeline needs to work with counts, rather than with FPKM.
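
For reference, the log-CPM step that voom applies to the counts can be sketched as below. This is only the transformation, written in Python to show the arithmetic; voom itself is an R function in limma and additionally estimates the mean-variance trend and per-observation precision weights, which are not shown here. The function name and the example counts are made up.

```python
import math


def voom_style_log_cpm(counts, library_size=None):
    """Compute log2((count + 0.5) / (library_size + 1) * 1e6) for one sample.

    If library_size is not given, the sum of the counts is used.
    The +0.5 keeps zero counts finite without a user-chosen offset.
    """
    if library_size is None:
        library_size = sum(counts)
    return [math.log2((c + 0.5) / (library_size + 1.0) * 1e6) for c in counts]


# Hypothetical counts for four genes in one sample, including a zero.
sample_counts = [0, 12, 250, 4800]
for count, value in zip(sample_counts, voom_style_log_cpm(sample_counts)):
    print(f"count={count:>5d}  log2-CPM={value:.3f}")
```

Because the offset is applied on the count scale, zeros are handled without having to pick an FPKM offset by hand.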
