  • rethinking log-log RPKM plots

    I wrote up my thoughts on why the log-log plots of RPKM values we are accustomed to seeing may not be the best way to go. The write-up can be found here. I would be interested to hear what you guys think.

    thanks,
    Justin

  • #2
    Interesting discussion, and it seems worth a try on some of my data. I'd always slipped back to my microarray days and added 16 to each FPKM/RPKM, then logged. Too many days playing with Plier versus Plier+16, I guess.
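For readers who have not met the "add 16, then log" habit, here is a minimal sketch of it. The base-2 logarithm and the function name `started_log2` are assumptions for illustration; the post only says that 16 was added before logging.

```python
import math

PSEUDOCOUNT = 16.0  # offset mentioned in the post; the log base is an assumption


def started_log2(fpkm, offset=PSEUDOCOUNT):
    """Return log2(fpkm + offset), which is defined even when fpkm is 0."""
    return math.log2(fpkm + offset)


# Low values are squeezed together, while high values behave almost like log2(fpkm).
for value in (0.0, 1.0, 16.0, 1000.0):
    print(f"{value:>7.1f} -> {started_log2(value):.3f}")
```

The offset keeps zeros on the plot and damps the noisy low end, at the cost of an arbitrary constant that is not derived from the sampling model.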

    • #3
      Very interesting! Do you think that the asin transformation could also be used for normalization in differential expression analysis or other comparative approaches such as ChIP-seq enrichment (over input) calculations?

      • #4
        Hi Jon,

        I thought it would be a good idea to give it a try on some real data, too. I tried it on some technical replicates from the Marioni paper and put the results here. They seem to support the authors' observation that variation among technical replicates can be captured with a Poisson model, at least for the data they presented.

        Justin

        • #5
          Hi mudshark,

          These transformations are helpful for determining whether or not your data fit the Poisson model. For technical replicates, the Poisson model seems to work well, and the variation can all be accounted for by the effects of random sampling. However, biological replicates generally appear to be over-dispersed, and so do not fit the Poisson model. The variance-stabilizing transformations for over-dispersed data that I have come across in my searches all rely on knowing the over-dispersion parameter, which wouldn't be feasible to estimate on a gene-by-gene basis since the number of samples is usually small. Some methods that try to account for over-dispersion (like DESeq) will pool genes with similar expression levels to try to get some sort of estimate of the over-dispersion.

          So, I don't think this particular transformation would be good for over-dispersed data, but if you used the appropriate transformation and knew the over-dispersion parameter, then I think it could be a useful plot for identifying differentially expressed genes.
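To make the dependence on the over-dispersion parameter concrete, here is a rough sketch under an assumption the post does not state: counts follow a negative binomial with variance mu + alpha * mu**2. Under that assumption the delta method gives the transform (2 / sqrt(alpha)) * asinh(sqrt(alpha * x)), which needs alpha as an input. All function names and the values of mu and alpha are hypothetical. The code computes the variance of the transformed counts exactly by summing the pmf, so no random sampling is involved.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    log_p = math.log(r / (r + mu))
    log_q = math.log(mu / (r + mu))
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + r * log_p + k * log_q)


def transformed_variance(transform, mu, alpha):
    """Variance of transform(X) for X ~ NB(mu, alpha), by summing the pmf."""
    sd = math.sqrt(mu + alpha * mu ** 2)
    upper = int(mu + 20 * sd + 50)  # mass beyond this point is negligible
    ks = range(upper + 1)
    probs = [nb_pmf(k, mu, alpha) for k in ks]
    values = [transform(k) for k in ks]
    mean = sum(p * v for p, v in zip(probs, values))
    return sum(p * (v - mean) ** 2 for p, v in zip(probs, values))


def sqrt_vst(x):
    """Poisson-style square-root transform, scaled to unit variance."""
    return 2.0 * math.sqrt(x)


def asinh_vst(alpha):
    """Build the asinh transform for an assumed over-dispersion alpha."""
    def transform(x):
        return 2.0 / math.sqrt(alpha) * math.asinh(math.sqrt(alpha * x))
    return transform


TRUE_ALPHA = 0.1  # hypothetical over-dispersion used to generate the counts
for mu in (20, 100, 1000):
    print(
        mu,
        round(transformed_variance(sqrt_vst, mu, TRUE_ALPHA), 2),          # grows with mu
        round(transformed_variance(asinh_vst(TRUE_ALPHA), mu, TRUE_ALPHA), 2),  # roughly flat
        round(transformed_variance(asinh_vst(0.01), mu, TRUE_ALPHA), 2),   # wrong alpha
    )
```

In this sketch the square-root transform leaves a variance that keeps growing with the mean, and the asinh transform only flattens it when it is given an alpha close to the true one, which is why a poor per-gene estimate from a handful of samples is a problem.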

          • #6
            Anscombe or sqrt

            Dear Justin,

            I tried your suggestion, which makes mathematical sense, on some RNA-seq data of mine and it gives beautiful scatter plots among the technical replicates. There is not much difference between Anscombe and simple square root transformations.

            Thanks for the suggestion.

            Gunter
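The similarity between the two transforms can be checked without any data. The sketch below assumes purely Poisson counts, as for technical replicates, and uses the usual forms 2 * sqrt(x) and 2 * sqrt(x + 3/8); the function names are made up for illustration. It sums the Poisson pmf to get the exact variance of each transformed count.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events, computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def transformed_variance(transform, lam):
    """Variance of transform(X) for X ~ Poisson(lam), by summing the pmf."""
    upper = int(lam + 12 * math.sqrt(lam) + 30)  # tail beyond this is negligible
    ks = range(upper + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    values = [transform(k) for k in ks]
    mean = sum(p * v for p, v in zip(probs, values))
    return sum(p * (v - mean) ** 2 for p, v in zip(probs, values))


def sqrt_vst(x):
    """Simple square-root transform, scaled so the target variance is 1."""
    return 2.0 * math.sqrt(x)


def anscombe(x):
    """Anscombe transform: the 3/8 offset improves behavior at low counts."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


for lam in (10, 100, 1000):
    print(
        lam,
        round(transformed_variance(float, lam), 3),     # raw counts: variance equals lam
        round(transformed_variance(sqrt_vst, lam), 3),  # close to 1
        round(transformed_variance(anscombe, lam), 3),  # close to 1
    )
```

Both transforms bring the variance close to 1 once the mean count is moderately large, so any visible difference on a scatter plot should be confined to the genes with the lowest counts.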

            • #7
              Hi Gunter,

              I am glad you found that suggestion useful. I'll see what I can do about writing it up as a technical note and submitting it. By the way, my name is Justin, and I am at Washington University in St. Louis, working on the informatics pipeline for next-gen sequencing. Nice to meet you!

              Justin
