
  • EdgeR GLM dispersions error

    I'm new to edgeR and I've run into a problem that I can't figure out how to solve.
    I want to make comparisons both between and within subjects. The subjects are animals with two different susceptibility statuses, i.e. resistant (R) or susceptible (S), sampled at two time points, i.e. pre and post challenge. For this, I'm following section 3.5 of the edgeR user's manual (page 32).
    After organizing my data frame and the design formula as the manual describes, I get an error when I ask for the dispersion estimates:
    targets
    animal status tempo
    1 1 R Pre
    2 1 R Pos
    3 2 R Pre
    4 2 R Pos
    5 3 R Pre
    6 3 R Pos
    7 4 R Pre
    8 4 R Pos
    9 5 R Pre
    10 5 R Pos
    11 6 R Pre
    12 6 R Pos
    13 7 R Pre
    14 7 R Pos
    15 8 R Pre
    16 8 R Pos
    17 9 R Pre
    18 9 R Pos
    19 10 R Pre
    20 10 R Pos
    21 11 R Pre
    22 11 R Pos
    23 12 R Pre
    24 12 R Pos
    25 13 R Pre
    26 13 R Pos
    27 14 R Pre
    28 14 R Pos
    29 15 R Pre
    30 15 R Pos
    31 16 R Pre
    32 16 R Pos
    33 17 R Pre
    34 17 R Pos
    35 18 R Pre
    36 18 R Pos
    37 19 R Pre
    38 19 R Pos
    39 20 R Pre
    40 20 R Pos
    41 1 S Pre
    42 1 S Pos
    43 2 S Pre
    44 2 S Pos
    45 3 S Pre
    46 3 S Pos
    47 4 S Pre
    48 4 S Pos
    49 5 S Pre
    50 5 S Pos
    51 6 S Pre
    52 6 S Pos
    53 7 S Pre
    54 7 S Pos
    55 8 S Pre
    56 8 S Pos
    57 9 S Pre
    58 9 S Pos
    59 10 S Pre
    60 10 S Pos
    61 11 S Pre
    62 11 S Pos
    63 12 S Pre
    64 12 S Pos
    65 13 S Pre
    66 13 S Pos
    67 14 S Pre
    68 14 S Pos
    69 15 S Pre
    70 15 S Pos
    71 16 S Pre
    72 16 S Pos
    73 17 S Pre
    74 17 S Pos
    75 18 S Pre
    76 18 S Pos
    77 19 S Pre
    78 19 S Pos
    design <- model.matrix(~status+status:animal+status:tempo, data = targets)
    dge <- estimateGLMCommonDisp(dge, design)
    Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
    Design matrix not of full rank. The following coefficients not estimable:
    statusS:animal20
    Note that I have 20 animals in the first condition (R) and 19 in the second (S). The same error occurs for the other dispersion estimates.

    Then I changed my data frame to:

    targets2
    animal status tempo
    1 1 R Pre
    2 1 R Pos
    3 2 R Pre
    4 2 R Pos
    5 3 R Pre
    6 3 R Pos
    7 4 R Pre
    8 4 R Pos
    9 5 R Pre
    10 5 R Pos
    11 6 R Pre
    12 6 R Pos
    13 7 R Pre
    14 7 R Pos
    15 8 R Pre
    16 8 R Pos
    17 9 R Pre
    18 9 R Pos
    19 10 R Pre
    20 10 R Pos
    21 11 R Pre
    22 11 R Pos
    23 12 R Pre
    24 12 R Pos
    25 13 R Pre
    26 13 R Pos
    27 14 R Pre
    28 14 R Pos
    29 15 R Pre
    30 15 R Pos
    31 16 R Pre
    32 16 R Pos
    33 17 R Pre
    34 17 R Pos
    35 18 R Pre
    36 18 R Pos
    37 19 R Pre
    38 19 R Pos
    39 20 R Pre
    40 20 R Pos
    41 21 S Pre
    42 21 S Pos
    43 22 S Pre
    44 22 S Pos
    45 23 S Pre
    46 23 S Pos
    47 24 S Pre
    48 24 S Pos
    49 25 S Pre
    50 25 S Pos
    51 26 S Pre
    52 26 S Pos
    53 27 S Pre
    54 27 S Pos
    55 28 S Pre
    56 28 S Pos
    57 29 S Pre
    58 29 S Pos
    59 30 S Pre
    60 30 S Pos
    61 31 S Pre
    62 31 S Pos
    63 32 S Pre
    64 32 S Pos
    65 33 S Pre
    66 33 S Pos
    67 34 S Pre
    68 34 S Pos
    69 35 S Pre
    70 35 S Pos
    71 36 S Pre
    72 36 S Pos
    73 37 S Pre
    74 37 S Pos
    75 38 S Pre
    76 38 S Pos
    77 39 S Pre
    78 39 S Pos
    And now I get the following message:

    dge <- estimateGLMCommonDisp(dge, design2)
    Warning message:
    In estimateGLMCommonDisp.default(y = y$counts, design = design, :
    No residual df: setting dispersion to NA
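
One plausible reading of this warning, sketched below with a toy data frame, is that the design has at least as many columns as there are samples. The post does not show how `design2` was built, so the sketch assumes it used the same formula as the first design, applied to a data frame where every animal has a unique number. The toy sizes (3 R animals, 2 S animals) and the explicit `tempo` level order are assumptions for illustration only.

```r
# Toy analogue of targets2: animals numbered uniquely across both statuses
targets2 <- data.frame(
  animal = factor(c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)),
  status = factor(rep(c("R", "S"), times = c(6, 4))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 5), levels = c("Pre", "Pos"))
)

# Assumed formula: the same one used for the first design in the post
design2 <- model.matrix(~ status + status:animal + status:tempo, data = targets2)

n_samples <- nrow(design2)   # one row per sample
n_coefs   <- ncol(design2)   # one column per coefficient

# status:animal now creates a column for every status/animal pair,
# including pairs that never occur (e.g. a resistant "animal 4")
n_zero <- sum(colSums(abs(design2)) == 0)

cat("samples:", n_samples, " coefficients:", n_coefs,
    " all-zero columns:", n_zero, "\n")
cat("nominal residual df:", n_samples - n_coefs, "\n")
```

If that reading is right, numbering the animals within each status (as in the first `targets` table) keeps the number of columns below the number of samples.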
    My third attempt was to change my design formula to:

    design3 <- model.matrix(~animal+tempo, data = targets2)

    With this I could calculate the dispersions and fit the model (glmFit). However, this is not exactly what the manual says, and it doesn't look like the best option for me, because when I run glmLRT it seems it will be important to include the animal status in the comparisons.
    Moreover, it is still not clear to me how to specify the “coef” and “contrast” arguments of this function as a numeric vector.

    Any help would be greatly appreciated.
    Sincerely,
    Daniela Moré

  • #2
    In the first model, the model matrix will contain one column with all zeros (due to animal #20 being absent from the "S" group). Just remove that column from the design and see if that resolves the problem.
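
A minimal sketch of that suggestion, using only base R and a toy data frame (3 R animals and 2 S animals rather than the 20 and 19 in the question). The toy data, the explicit `tempo` level order, and the variable names `zero_cols` and `design_fixed` are assumptions for illustration.

```r
# Toy analogue of `targets`: animals numbered within each status,
# with one fewer animal in the S group than in the R group
targets <- data.frame(
  animal = factor(c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2)),
  status = factor(rep(c("R", "S"), times = c(6, 4))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 5), levels = c("Pre", "Pos"))
)

design <- model.matrix(~ status + status:animal + status:tempo, data = targets)

# A column that is zero for every sample corresponds to a status/animal
# pair that does not exist in the data, so its coefficient is not estimable
zero_cols <- colnames(design)[colSums(abs(design)) == 0]
print(zero_cols)

# Drop the all-zero column(s), keeping the result a matrix
design_fixed <- design[, colSums(abs(design)) > 0, drop = FALSE]

# Compare rank and column count before and after
cat("before: rank", qr(design)$rank, "of", ncol(design), "columns\n")
cat("after:  rank", qr(design_fixed)$rank, "of", ncol(design_fixed), "columns\n")
```

The cleaned matrix would then be passed in place of `design`, e.g. `estimateGLMCommonDisp(dge, design_fixed)`.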


    • #3
      dpryan, thank you very much

      It seems to have worked perfectly.


      • #4
        One more thing: I've changed my design formula to

        design <- model.matrix(~0+tempo:status+status:animal, data = targets)

        which gives me more reasonable design column names for making the comparisons.

        Thanks

        Daniela
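
As a rough illustration of how those column names could be turned into the numeric `contrast` vector that glmLRT accepts, here is a base R sketch on a toy data frame. The helper `make_contrast` is hypothetical (it is not an edgeR function), and the toy data and `tempo` level order are assumptions; the all-zero column is dropped first, as suggested in #2.

```r
# Toy data: 3 R animals, 2 S animals, each sampled Pre and Pos
targets <- data.frame(
  animal = factor(c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2)),
  status = factor(rep(c("R", "S"), times = c(6, 4))),
  tempo  = factor(rep(c("Pre", "Pos"), times = 5), levels = c("Pre", "Pos"))
)

design <- model.matrix(~ 0 + tempo:status + status:animal, data = targets)
design <- design[, colSums(abs(design)) > 0, drop = FALSE]

# Hypothetical helper: +1 on the `plus` columns, -1 on the `minus` columns,
# 0 everywhere else, in the column order of the design
make_contrast <- function(design, plus, minus) {
  stopifnot(all(c(plus, minus) %in% colnames(design)))
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[plus]  <- 1
  v[minus] <- -1
  v
}

# Pos vs Pre within resistant animals
con_R <- make_contrast(design, "tempoPos:statusR", "tempoPre:statusR")

# Difference in response: (Pos - Pre in S) minus (Pos - Pre in R)
con_diff <- make_contrast(
  design,
  plus  = c("tempoPos:statusS", "tempoPre:statusR"),
  minus = c("tempoPre:statusS", "tempoPos:statusR")
)

print(con_R)
print(con_diff)

# With a fitted model these would be used along the lines of:
#   fit <- glmFit(dge, design)
#   lrt <- glmLRT(fit, contrast = con_R)
```

Building the vector by name rather than by position avoids having to count columns, which is easy to get wrong once the per-animal columns are included.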
