  • moredd
    Junior Member
    • May 2013
    • 8

    EdgeR GLM dispersions error

    I'm new to edgeR and have run into a problem that I haven't been able to solve.
    I want to make comparisons both between and within subjects. The animals have one of two susceptibility statuses, resistant (R) or susceptible (S), and were sampled at two time points, pre- and post-challenge. For this, I'm following item 3.5 of the edgeR user's manual (page 32).
    After organizing my data frame and design formula as the manual describes, I get an error when I ask for the dispersion estimates:
    targets
    animal status tempo
    1 1 R Pre
    2 1 R Pos
    3 2 R Pre
    4 2 R Pos
    5 3 R Pre
    6 3 R Pos
    7 4 R Pre
    8 4 R Pos
    9 5 R Pre
    10 5 R Pos
    11 6 R Pre
    12 6 R Pos
    13 7 R Pre
    14 7 R Pos
    15 8 R Pre
    16 8 R Pos
    17 9 R Pre
    18 9 R Pos
    19 10 R Pre
    20 10 R Pos
    21 11 R Pre
    22 11 R Pos
    23 12 R Pre
    24 12 R Pos
    25 13 R Pre
    26 13 R Pos
    27 14 R Pre
    28 14 R Pos
    29 15 R Pre
    30 15 R Pos
    31 16 R Pre
    32 16 R Pos
    33 17 R Pre
    34 17 R Pos
    35 18 R Pre
    36 18 R Pos
    37 19 R Pre
    38 19 R Pos
    39 20 R Pre
    40 20 R Pos
    41 1 S Pre
    42 1 S Pos
    43 2 S Pre
    44 2 S Pos
    45 3 S Pre
    46 3 S Pos
    47 4 S Pre
    48 4 S Pos
    49 5 S Pre
    50 5 S Pos
    51 6 S Pre
    52 6 S Pos
    53 7 S Pre
    54 7 S Pos
    55 8 S Pre
    56 8 S Pos
    57 9 S Pre
    58 9 S Pos
    59 10 S Pre
    60 10 S Pos
    61 11 S Pre
    62 11 S Pos
    63 12 S Pre
    64 12 S Pos
    65 13 S Pre
    66 13 S Pos
    67 14 S Pre
    68 14 S Pos
    69 15 S Pre
    70 15 S Pos
    71 16 S Pre
    72 16 S Pos
    73 17 S Pre
    74 17 S Pos
    75 18 S Pre
    76 18 S Pos
    77 19 S Pre
    78 19 S Pos
    design <- model.matrix(~status+status:animal+status:tempo, data = targets)
    dge <- estimateGLMCommonDisp(dge, design)
    Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
    Design matrix not of full rank. The following coefficients not estimable:
    statusS:animal20
    Note that I have 20 animals in the first condition (R) and 19 in the second (S). I get the same error for the other dispersion estimates.

    Then I changed my data frame to:

    targets2
    animal status tempo
    1 1 R Pre
    2 1 R Pos
    3 2 R Pre
    4 2 R Pos
    5 3 R Pre
    6 3 R Pos
    7 4 R Pre
    8 4 R Pos
    9 5 R Pre
    10 5 R Pos
    11 6 R Pre
    12 6 R Pos
    13 7 R Pre
    14 7 R Pos
    15 8 R Pre
    16 8 R Pos
    17 9 R Pre
    18 9 R Pos
    19 10 R Pre
    20 10 R Pos
    21 11 R Pre
    22 11 R Pos
    23 12 R Pre
    24 12 R Pos
    25 13 R Pre
    26 13 R Pos
    27 14 R Pre
    28 14 R Pos
    29 15 R Pre
    30 15 R Pos
    31 16 R Pre
    32 16 R Pos
    33 17 R Pre
    34 17 R Pos
    35 18 R Pre
    36 18 R Pos
    37 19 R Pre
    38 19 R Pos
    39 20 R Pre
    40 20 R Pos
    41 21 S Pre
    42 21 S Pos
    43 22 S Pre
    44 22 S Pos
    45 23 S Pre
    46 23 S Pos
    47 24 S Pre
    48 24 S Pos
    49 25 S Pre
    50 25 S Pos
    51 26 S Pre
    52 26 S Pos
    53 27 S Pre
    54 27 S Pos
    55 28 S Pre
    56 28 S Pos
    57 29 S Pre
    58 29 S Pos
    59 30 S Pre
    60 30 S Pos
    61 31 S Pre
    62 31 S Pos
    63 32 S Pre
    64 32 S Pos
    65 33 S Pre
    66 33 S Pos
    67 34 S Pre
    68 34 S Pos
    69 35 S Pre
    70 35 S Pos
    71 36 S Pre
    72 36 S Pos
    73 37 S Pre
    74 37 S Pos
    75 38 S Pre
    76 38 S Pos
    77 39 S Pre
    78 39 S Pos
    And now I get the following message:

    dge <- estimateGLMCommonDisp(dge, design2)
    Warning message:
    In estimateGLMCommonDisp.default(y = y$counts, design = design, :
    No residual df: setting dispersion to NA
    My third attempt was to change my design formula to:

    design3 <- model.matrix(~animal+tempo, data = targets2)

    With this design I could estimate the dispersions and fit the model (glmFit). However, this is not exactly what the manual says and doesn't look like the best option for me, since it seems it will be important to include the animal status in the comparisons when I run glmLRT.
    Moreover, it is still not clear to me how to specify the “coef” and “contrast” arguments of this function as a numeric vector.

    Any help would be greatly appreciated.
    Sincerely,
    Daniela Moré
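
    A possible reading of the "No residual df" warning in the second attempt, as a sketch only: the formula behind design2 is not shown in the post, so the code below assumes it was the same formula as in the first attempt, applied to targets2. Under that assumption, numbering the animals 1 to 39 across both groups makes status:animal expand to more coefficients than there are samples, which would be consistent with edgeR reporting no residual degrees of freedom. It uses base R only.

    ```r
    # Assumption: design2 used the same formula as the first attempt.
    # targets2 is rebuilt here from the table in the post.
    targets2 <- data.frame(
      animal = factor(rep(1:39, each = 2)),
      status = factor(rep(c("R", "S"), times = c(40, 38))),
      tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
    )

    design2 <- model.matrix(~ status + status:animal + status:tempo,
                            data = targets2)
    dim(design2)  # 78 samples, 80 coefficients

    # Dropping the all-zero columns alone does not give a full-rank matrix here:
    # within S, every animal still has its own column, so those columns sum to
    # the S group indicator.
    nonzero <- design2[, colSums(design2 != 0) > 0, drop = FALSE]
    c(columns = ncol(nonzero), rank = qr(nonzero)$rank)
    ```

    This is why the layout in the first data frame, where animal numbering restarts within each status, is the one that fits this kind of nested formula.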
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    In the first model, the model matrix will contain one column with all zeros (due to animal #20 being absent from the "S" group). Just remove that column from the design and see if that resolves the problem.
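
    The suggestion above can be sketched in base R as follows. The targets data frame is rebuilt from the table in the first post; the names zero_cols and design_ok are made up for this illustration. The final estimateGLMCommonDisp call is left as a comment because it needs the poster's dge object.

    ```r
    # Rebuild the first data frame: animals 1-20 in R, animals 1-19 in S,
    # each sampled Pre and Pos.
    targets <- data.frame(
      animal = factor(c(rep(1:20, each = 2), rep(1:19, each = 2))),
      status = factor(rep(c("R", "S"), times = c(40, 38))),
      tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
    )

    design <- model.matrix(~ status + status:animal + status:tempo,
                           data = targets)

    # Columns that are zero for every sample cannot be estimated.
    zero_cols <- colnames(design)[colSums(design != 0) == 0]
    zero_cols

    # Drop them and check that the remaining matrix has full column rank.
    design_ok <- design[, !(colnames(design) %in% zero_cols), drop = FALSE]
    qr(design_ok)$rank == ncol(design_ok)

    # With edgeR loaded and a DGEList called dge (not reproduced here):
    # dge <- estimateGLMCommonDisp(dge, design_ok)
    ```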


    • moredd
      Junior Member
      • May 2013
      • 8

      #3
      dpryan, thank you very much

      It seems to have worked perfectly.


      • moredd
        Junior Member
        • May 2013
        • 8

        #4
        One more thing: I've changed my design formula to

        design <- model.matrix(~0+tempo:status+status:animal, data = targets)

        which gives me more sensible design column names for making the comparisons

        Thanks

        Daniela
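
        On the earlier question of how to write a numeric contrast, here is a hedged sketch for the design in this post. It assumes targets is the first data frame from the thread and that the all-zero column has been dropped as suggested in #2. The helper make_contrast is hypothetical, not an edgeR function; it only builds a vector of +1, -1 and 0 aligned with the design columns.

        ```r
        # Rebuild targets from the first post (animal numbering restarts in S).
        targets <- data.frame(
          animal = factor(c(rep(1:20, each = 2), rep(1:19, each = 2))),
          status = factor(rep(c("R", "S"), times = c(40, 38))),
          tempo  = factor(rep(c("Pre", "Pos"), times = 39), levels = c("Pre", "Pos"))
        )

        design <- model.matrix(~ 0 + tempo:status + status:animal, data = targets)
        design <- design[, colSums(design != 0) > 0, drop = FALSE]  # drop empty columns

        # Hypothetical helper: +1 for columns in `plus`, -1 for columns in `minus`,
        # 0 everywhere else, in the same order as the design columns.
        make_contrast <- function(design, plus, minus) {
          stopifnot(all(c(plus, minus) %in% colnames(design)))
          v <- setNames(numeric(ncol(design)), colnames(design))
          v[plus]  <- 1
          v[minus] <- -1
          v
        }

        # Pos vs Pre within resistant animals
        con_R <- make_contrast(design, "tempoPos:statusR", "tempoPre:statusR")

        # Pos vs Pre within susceptible animals
        con_S <- make_contrast(design, "tempoPos:statusS", "tempoPre:statusS")

        # Difference in the Pos-vs-Pre response between S and R
        con_SvsR <- con_S - con_R
        ```

        With a fitted model from glmFit, such a vector would be passed as glmLRT(fit, contrast = con_SvsR), while coef takes column positions or names when single coefficients are tested. Because each animal has its own column, comparing R and S at a single time point through these coefficients would only compare the baseline animals, so the within-animal contrasts above are the ones this design supports.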
