  • FPKMs and Limma R package

    Hi!

    I have generated a dataset with 9 different biological samples (plus replicates) and have analyzed it using TopHat and Cufflinks, so I currently have a table with the FPKM values for every gene in each sample.

    I am trying to use the limma R package to model and extract differentially expressed genes across these samples (instead of the pairwise comparisons that can be made using Cuffdiff), and have run into the following problem, on which I would really appreciate some advice.

    I have to transform the FPKM values into log2 values to use them in the lmFit() function. However, since there are zeros, doing this directly on the FPKM table generates a lot of infinite values. I was therefore thinking of adding a fixed number to all of the FPKM values before log2-transforming them. So my questions are:

    1. Is this a good approach?
    Are there better alternatives?

    2. Is there a specific value that should be added?
    I was thinking of adding a small value (e.g. 10^-10, whose log2 is about -33, at the opposite end of the scale from the positive log2 values; in my table the maximum log2(FPKM) is about 22).
    But I am not sure this is correct, and would also like to know if there is a "standard" value that people usually add.

    Thanks!!!

    Note: I also have the count numbers and could possibly do everything with the voom function and then limma, but since I did all my initial analysis using the FPKMs I would really like to stick with them for consistency... so any help is deeply appreciated!
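
To make the trade-off in question 2 concrete, here is a minimal Python sketch. It is not limma code: the FPKM values and the helper name `log2_with_offset` are made up for illustration. It shows how the choice of offset sets the distance between a zero and a lowly expressed gene on the log2 scale.

```python
import math

def log2_with_offset(fpkm, offset):
    """Return log2(fpkm + offset); the offset must be > 0 so zeros stay finite."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    return math.log2(fpkm + offset)

# Hypothetical FPKM values for one gene across samples, including a zero.
fpkms = [0.0, 0.5, 8.0, 1024.0]

for offset in (1e-10, 1.0):
    logs = [log2_with_offset(x, offset) for x in fpkms]
    gap = logs[1] - logs[0]  # distance between FPKM 0 and FPKM 0.5
    print(f"offset={offset:g}: "
          + ", ".join(f"{v:.2f}" for v in logs)
          + f"  (gap 0 -> 0.5: {gap:.2f})")
```

With an offset of 10^-10, a zero lands roughly 32 log2 units below a gene at 0.5 FPKM, so zero-versus-low comparisons dominate the apparent fold changes. With an offset of 1, the same gap is under one log2 unit.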

  • #2
    Adding a small count seems to be the common method. If you look at how edgeR calculates log2(rpkm), for example, you'll see that it adds a small value (0.25 by default) to the raw counts before computing CPM, which is then used to get RPKM. For comparison, a minimum of 0.25 on the raw count scale would be ~2.5e-7 FPKM for a 1kb gene (depending on how library sizes were computed).
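
The idea of offsetting the raw counts, rather than the FPKM values, can be sketched as follows. This is a simplified Python illustration and not edgeR's actual code, whose calculation differs in its details. The function name, the gene length, and the library size are assumptions chosen for the example.

```python
import math

def log2_rpkm(count, gene_length_bp, library_size, prior_count=0.25):
    """log2 RPKM with a small prior count added to the raw count first.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads).
    """
    rpkm = (count + prior_count) * 1e9 / (gene_length_bp * library_size)
    return math.log2(rpkm)

# Assumed example: a 1 kb gene in a library of 10 million mapped reads.
length_bp = 1_000
lib_size = 10_000_000

for count in (0, 1, 10, 100):
    print(count, round(log2_rpkm(count, length_bp, lib_size), 2))
```

A zero count maps to a finite floor, and the position of that floor in FPKM units scales inversely with both gene length and library size, so it is not a single fixed number.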



    • #3
      Thanks!
      I have tried this but I am not happy with the results... I get really strange volcano plots (see figure), which I guess are a consequence of different variance stabilization methods...
      Therefore, I think I will stick with the use of the read counts (even if it means going back and re-doing my previous analysis).

      file:///Users/elsaabranches/Desktop/volcanos/volcano_plots.jpg



      • #4
        Yes, that's a wise decision. Use the voom function to process the counts prior to lmFit. The voom-limma pipeline needs to work with counts, rather than with FPKM.
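
For reference, the log transformation that voom applies to counts can be sketched in Python as below. This covers only the log2-CPM step with its 0.5 count offset. voom's estimation of the mean-variance trend and the observation weights it passes on to lmFit are omitted. The function name and the example counts are made up.

```python
import math

def voom_style_log_cpm(counts, library_size=None):
    """log2 counts-per-million using the offsets from voom's transform:
    log2((count + 0.5) / (library_size + 1) * 1e6)."""
    if library_size is None:
        library_size = sum(counts)  # assumption: library size = column total
    return [math.log2((c + 0.5) / (library_size + 1) * 1e6) for c in counts]

# Hypothetical counts for four genes in one sample, including a zero.
sample_counts = [0, 3, 25, 972]
print([round(v, 2) for v in voom_style_log_cpm(sample_counts)])
```

Because the offset is applied on the count scale, zeros stay finite without anyone having to pick an arbitrary FPKM-scale constant.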
