  • Mchicken
    Member
    • Jan 2014
    • 39

    Statistical models for DE

    Hey guys,

    I have some trouble understanding why statistical models are needed when dealing with differential expression in RNA-Seq. I have already used tools like DESeq2 and NOISeq, but I would also like to understand, at least partially, what these tools are doing. Unfortunately I don't have a strong statistical background and have found no tutorial that explains the use of statistical models in a way I can follow. I think it would be best if someone could explain it for the Poisson model, as it seems easier to understand than the negative binomial (NB).

    What I know is that after sequencing I align my reads to the reference genome and then generate read counts for each annotated gene. Of course I cannot use these counts directly to test for DE, because of different library sizes as well as technical and biological variation.

    What I read most of the time is that people fit statistical models to the count data, for example a Poisson model (as this one accounts for the technical variance).

    Question 1: Is the model fitted on the read counts of all genes? Or is each gene getting its own model?

    Question 2: In the case of the Poisson model, where do I get the lambda from? Is it calculated from my count data?

    Question 3: Once I have constructed my Poisson model, what is it used for? Do I use it to change my count data? Is it used in the statistical test? This is the step where I have absolutely no clue what is going on.

    I tried to read different publications, including the DESeq publications and, for the Poisson model, the Marioni paper from 2008. But with my limited statistical knowledge I do not get the key idea of these statistical models and how they can help me when dealing with DE in RNA-Seq.

    I really hope someone can explain this general concept in a really easy way so I can understand it.

    Cheers
    Mario
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    N.B., I'm going to completely ignore the empirical Bayes parts of this for the sake of simplicity.

    1. The model is fit to each gene, one at a time. The actual model used is the same for all of them.
    2. The lambda is part of the fit. Note that there is a lambda per group.
    3. The model is used for a statistical test, which is typically of the form, "Do groups A and B have different lambdas?"
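
The three points above can be sketched in a few lines. This is only a minimal illustration under simplifying assumptions, not what DESeq2 or any other tool actually does: all samples are assumed to have the same sequencing depth, the counts are made up, and the names `poisson_loglik` and `lrt_two_groups` are hypothetical. Each gene gets its own fit, lambda is estimated once per group (as the mean count), and a likelihood-ratio test asks whether one shared lambda explains the data as well as two separate ones.

```python
import math

def poisson_loglik(counts, lam):
    """Log-likelihood of the counts under Poisson(lam); lam must be > 0."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def lrt_two_groups(counts_a, counts_b):
    """Likelihood-ratio test of H0: lambda_A == lambda_B for a single gene.

    Assumes equal sequencing depth and at least one nonzero count per group.
    """
    # Alternative model: one lambda per group (the group mean is the MLE).
    lam_a = sum(counts_a) / len(counts_a)
    lam_b = sum(counts_b) / len(counts_b)
    # Null model: a single lambda shared by both groups.
    lam_0 = (sum(counts_a) + sum(counts_b)) / (len(counts_a) + len(counts_b))

    ll_alt = poisson_loglik(counts_a, lam_a) + poisson_loglik(counts_b, lam_b)
    ll_null = poisson_loglik(counts_a + counts_b, lam_0)

    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return lam_a, lam_b, p_value

# Made-up counts: (replicates of group A, replicates of group B) per gene.
genes = {
    "gene1": ([20, 22, 18], [21, 19, 23]),   # similar expression
    "gene2": ([20, 22, 18], [60, 55, 65]),   # roughly threefold higher in B
}

# The same model is fitted to every gene, one gene at a time.
for name, (a, b) in genes.items():
    lam_a, lam_b, p = lrt_two_groups(a, b)
    print(name, lam_a, lam_b, p)
```

With these numbers the first gene gives a large p-value and the second a very small one, which is the sense in which the model is "used for a statistical test".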


    • Mchicken
      Member
      • Jan 2014
      • 39

      #3
      First of all, thanks for your quick answer dpryan.

      But I still do not get where the lambda is coming from and what you mean by "group".


      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
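        A group is a group (eine "Gruppe" auf Deutsch); it has no special meaning in this context. In practice it is just a set of samples that belong together, such as the replicates of one condition.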

        Regarding lambda, each gene has some sort of expression count associated with it, normally in the form of counts per sample. These counts are then used to estimate lambda.
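
As a rough illustration of "these counts are then used to estimate lambda": for a Poisson model the maximum-likelihood estimate of lambda is simply the mean of the observed counts in the group. The sketch below uses made-up counts for one gene, ignores differences in sequencing depth, and the helper names `estimate_lambda` and `poisson_pmf` are hypothetical.

```python
import math

def estimate_lambda(counts):
    """Maximum-likelihood estimate of a Poisson rate: the sample mean."""
    return sum(counts) / len(counts)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam) with lam > 0, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Made-up counts for one gene in three replicates of one group.
counts_group_a = [18, 22, 20]

lam_a = estimate_lambda(counts_group_a)
print(lam_a)                       # 20.0

# Once lambda is fixed, the model says how likely any count would be.
print(poisson_pmf(20, lam_a))      # probability of observing exactly 20 reads
print(poisson_pmf(40, lam_a))      # observing 40 reads is far less likely

# A Poisson model has variance equal to its mean, so lambda also fixes
# how much scatter around 20 is expected from sampling alone.
print(math.sqrt(lam_a))            # expected standard deviation
```

So lambda is not an extra input; it comes out of the count data, and it determines both the expected count and the expected noise around it.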


        • Mchicken
          Member
          • Jan 2014
          • 39

          #5
          This is actually where I have a problem. I estimate lambda from the read counts of a gene (lambda = read count) and then I test the null hypothesis that conditions A and B have the same lambda. So why do I use the Poisson model and not just test whether A and B have the same read count?

          Does anyone know a tutorial or lecture with examples?


          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            You essentially are testing whether A and B have the same read count. The question is simply how you test that. One option is assuming Poisson variance, which requires estimating lambda and then doing a test. In most real cases, you'd have multiple groups of samples, so you couldn't just compare two numbers, but would need to come up with group estimates, likely accounting for differences in sequencing depth for each sample.
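
A hedged sketch of what "group estimates accounting for differences in sequencing depth" could look like for one gene. It assumes Poisson counts whose expected value is a per-read rate times the library size of the sample. Given the total count, the reads falling in group A are then binomially distributed with a probability set by the depths alone, which gives an exact test. The counts, the depths, and the function names are invented for illustration; real tools use more elaborate normalisation and models.

```python
import math

def log_binom_pmf(k, n, p):
    """log P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def depth_adjusted_test(counts_a, depths_a, counts_b, depths_b):
    """Compare per-read rates of one gene between two groups of samples."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    depth_a, depth_b = sum(depths_a), sum(depths_b)

    # Group estimates: reads for this gene per sequenced read.
    rate_a = total_a / depth_a
    rate_b = total_b / depth_b

    # Under H0 (equal rates), each read of this gene lands in group A
    # with probability depth_a / (depth_a + depth_b).
    n = total_a + total_b
    p0 = depth_a / (depth_a + depth_b)

    # Two-sided exact p-value: add up all outcomes that are no more
    # probable than the observed one.
    log_obs = log_binom_pmf(total_a, n, p0)
    p_value = sum(
        math.exp(lp)
        for lp in (log_binom_pmf(k, n, p0) for k in range(n + 1))
        if lp <= log_obs + 1e-9
    )
    return rate_a, rate_b, min(1.0, p_value)

# Made-up data: raw counts differ, but so do the library sizes.
depths_a = [1_000_000, 2_000_000]
depths_b = [3_000_000, 1_500_000]

# Gene X: 60 vs 90 reads in total, yet the same rate once depth is considered.
print(depth_adjusted_test([20, 40], depths_a, [60, 30], depths_b))

# Gene Y: the rate in group B is twice that in group A.
print(depth_adjusted_test([20, 40], depths_a, [120, 60], depths_b))
```

Gene X shows why comparing raw read counts is not enough: the totals differ only because group B was sequenced more deeply, and the test reports no evidence of a difference. Gene Y has a genuinely higher rate in group B and gets a small p-value.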


            • Mchicken
              Member
              • Jan 2014
              • 39

              #7
              Anyone out there who can explain the lambda estimation in more detail (probably with an example) or knows a nice tutorial?
