  • Statistical models for DE

    Hey guys,

    I have some trouble understanding why statistical models are needed for differential expression (DE) analysis in RNA-Seq. I have already used tools like DESeq2 and NOISeq, but I would also like to understand, at least partially, what these tools are doing. Unfortunately I don't have a strong statistical background, and I haven't found a tutorial that explains the use of statistical models in a way I can follow. I think it would be best if someone could explain it for the Poisson model, as it seems easier to understand than the negative binomial (NB).

    What I know so far: after sequencing, I align my reads to the reference genome and then generate read counts for each annotated gene. I cannot use these counts directly to test for DE because of differences in library size as well as technical and biological variation.

    What I read most of the time is that people fit statistical models to the count data, for example a Poisson model (as it accounts for the technical variance).

    Question 1: Is the model fitted to the read counts of all genes, or does each gene get its own model?

    Question 2: In the case of the Poisson model, where do I get lambda from? Is it calculated from my count data?

    Question 3: Once I have constructed my Poisson model, what is it used for? Do I use it to change my count data? Is it used in the statistical test? This is the step where I have absolutely no clue what is going on.

    I tried to read several publications, including the DESeq papers and, for the Poisson model, the Marioni paper from 2008. But with my limited statistical knowledge I don't get the key idea behind these statistical models or how they can help me with DE analysis in RNA-Seq.

    I really hope someone can explain the general concept in a simple way so that I can understand it.

    Cheers
    Mario

  • #2
    N.B., I'm going to completely ignore the empirical Bayes parts of this for the sake of simplicity.

    1. The model is fit to each gene, one at a time. The actual model used is the same for all of them.
    2. The lambda is part of the fit. Note that there is a lambda per group.
    3. The model is used for a statistical test, which is typically of the form, "Do groups A and B have different lambdas?"
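
Written out, the three points above amount to roughly the following. The notation is my own for illustration and not taken from any tool: $K_{ij}$ is the read count of gene $i$ in sample $j$, and $g(j)$ is the group (A or B) that sample $j$ belongs to. Library-size differences are left out here to keep it simple.

```latex
K_{ij} \sim \mathrm{Poisson}\bigl(\lambda_{i,\,g(j)}\bigr),
\qquad
H_0:\ \lambda_{iA} = \lambda_{iB}
\quad \text{vs.} \quad
H_1:\ \lambda_{iA} \neq \lambda_{iB}
```

So each gene $i$ gets its own pair of lambdas and its own test, but the form of the model is identical for every gene.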


    • #3
      First of all, thanks for your quick answer, dpryan.

      But I still don't understand where lambda comes from or what you mean by "group".


      • #4
        A group is just a group ("Gruppe" in German); it has no special meaning in this context.

        Regarding lambda, each gene has some sort of expression measurement associated with it, normally in the form of a read count per sample. These counts are then used to estimate lambda.
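
To make the estimation step concrete, here is a minimal sketch, assuming made-up counts for a single gene and ignoring sequencing depth. For a Poisson model, the maximum-likelihood estimate of lambda for a group is simply the mean of that group's counts. The function name `estimate_lambda` and all numbers are hypothetical and only serve as an illustration.

```python
# Hypothetical read counts for ONE gene (numbers invented for illustration).
counts_a = [12, 15, 9]    # three samples in group A
counts_b = [30, 28, 35]   # three samples in group B


def estimate_lambda(counts):
    """Maximum-likelihood estimate of a Poisson rate: the sample mean."""
    return sum(counts) / len(counts)


lambda_a = estimate_lambda(counts_a)  # 12.0
lambda_b = estimate_lambda(counts_b)  # 31.0
print(lambda_a, lambda_b)
```

With a single sample per group, the estimate is just that sample's count.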


        • #5
          This is actually where I have a problem. If I estimate lambda from the read counts of a gene (lambda = read count) and then test the null hypothesis that conditions A and B have the same lambda, why do I use the Poisson model at all instead of just testing whether A and B have the same read count?

          Does anyone know a tutorial or lecture with examples?


          • #6
            You are essentially testing whether A and B have the same read count. The question is simply how you test that. One option is to assume Poisson variance, which requires estimating lambda and then performing a test. In most real cases, you'd have multiple groups of samples, so you couldn't just compare two numbers; you would need to come up with group estimates, likely accounting for differences in sequencing depth for each sample.
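
As a rough sketch of what such a test could look like for one gene, the code below runs a likelihood-ratio test under a plain Poisson assumption. This is an illustration only and not what DESeq2 or any other specific tool does. The assumptions are mine: each count is modeled as Poisson with mean (sequencing depth × rate), each group has its own rate, and the test statistic is compared to a chi-square distribution with one degree of freedom. The function name `poisson_rate_test` and all counts and depths are hypothetical.

```python
import math


def poisson_rate_test(counts_a, depths_a, counts_b, depths_b):
    """Likelihood-ratio test of H0: both groups share one Poisson rate.

    Each count is assumed to be Poisson with mean depth * rate, so the
    maximum-likelihood rate of a group is (total counts) / (total depth).
    Returns (rate_a, rate_b, statistic, p_value).
    """
    k_a, k_b = sum(counts_a), sum(counts_b)
    n_a, n_b = sum(depths_a), sum(depths_b)

    rate_a = k_a / n_a                   # group A estimate
    rate_b = k_b / n_b                   # group B estimate
    rate_0 = (k_a + k_b) / (n_a + n_b)   # shared estimate under H0

    # 2 * (log-likelihood with two rates - log-likelihood with one rate);
    # the depth terms cancel, leaving only the k * log(rate / rate_0) terms.
    stat = 0.0
    for k, rate in ((k_a, rate_a), (k_b, rate_b)):
        if k > 0:
            stat += 2.0 * k * math.log(rate / rate_0)
    stat = max(stat, 0.0)  # guard against tiny negative rounding errors

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return rate_a, rate_b, stat, p_value


# Hypothetical gene: three samples per group, one million reads per library.
depths = [1_000_000] * 3
rate_a, rate_b, stat, p_value = poisson_rate_test(
    [12, 15, 9], depths, [30, 28, 35], depths
)
print(rate_a, rate_b, stat, p_value)
```

In a real analysis this would be repeated for every gene, followed by a multiple-testing correction. Real RNA-Seq counts usually vary more between biological replicates than the Poisson model allows, which is one reason tools such as DESeq2 use the negative binomial instead.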


            • #7
              Is there anyone out there who can explain the lambda estimation in more detail (preferably with an example) or who knows a nice tutorial?
