  • DESeq2: Multi-factor designs

    With great help from dpryan I have come a long way, but now I'm stuck again. I need to be able to compare both within and between samples. I have two groups at two time points; the data are paired within samples, but not between:

    library(DESeq2)
    # Collect the HTSeq count files, one per sample (pattern is a regular expression, not a glob)
    sampleFiles <- list.files(path="/Volumes/timemachine/HTseq_DEseq2", pattern="\\.txt$")
    # 26 healthy samples followed by 22 diabetic samples
    status <- factor(c(rep("Healthy",26), rep("Diabetic",22)))
    # Time point per sample: 13 + 13 for the healthy group, then 11 + 11 for the diabetic group
    timepoints <- factor(rep(c(1,2,1,2), times=c(13,13,11,11)))
    sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, status=status, timepoints=timepoints)
    directory <- "/Volumes/timemachine/HTseq_DEseq2/"
    # Additive design: time point plus disease status, no pairing term
    des <- formula(~timepoints+status)
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design=des)
    ddsHTSeq

    These commands all run, but the design is incorrect because it does not take the paired data into account.
    So I looked here: page 49; http://www.bioconductor.org/packages...usersguide.pdf

    So I tried this instead:

    Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
    design <- model.matrix(~0+Treat);
    colnames(design) <- levels(Treat);

    > design
    Diabetic1 Diabetic2 Healthy1 Healthy2
    1 0 0 1 0
    2 0 0 1 0
    3 0 0 1 0
    4 0 0 1 0
    5 0 0 1 0
    6 0 0 1 0
    7 0 0 1 0
    8 0 0 1 0
    9 0 0 1 0
    10 0 0 1 0
    11 0 0 1 0
    12 0 0 1 0
    13 0 0 1 0
    14 0 0 0 1
    15 0 0 0 1
    16 0 0 0 1
    17 0 0 0 1
    18 0 0 0 1
    19 0 0 0 1
    20 0 0 0 1
    21 0 0 0 1
    22 0 0 0 1
    23 0 0 0 1
    24 0 0 0 1
    25 0 0 0 1
    26 0 0 0 1
    27 1 0 0 0
    28 1 0 0 0
    29 1 0 0 0
    30 1 0 0 0
    31 1 0 0 0
    32 1 0 0 0
    33 1 0 0 0
    34 1 0 0 0
    35 1 0 0 0
    36 1 0 0 0
    37 1 0 0 0
    38 0 1 0 0
    39 0 1 0 0
    40 0 1 0 0
    41 0 1 0 0
    42 0 1 0 0
    43 0 1 0 0
    44 0 1 0 0
    45 0 1 0 0
    46 0 1 0 0
    47 0 1 0 0
    48 0 1 0 0
    attr(,"assign")
    [1] 1 1 1 1
    attr(,"contrasts")
    attr(,"contrasts")$Treat
    [1] "contr.treatment"

    However my "old" design, the "des", looks like this:

    > des
    ~timepoints + status

    Running with "new" design:
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design=design)

    Gives error:

    Error in formula.default(design) : invalid formula

    Not too surprising, but I'm stuck. What should I do? Am I totally off?

    Thanks!
    Last edited by sindrle; 10-18-2013, 10:02 AM.
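
The error message suggests that the `design` argument is coerced to a formula, while `model.matrix()` returns a numeric matrix. A minimal base-R sketch of the difference, using a hypothetical four-sample stand-in for the real sample table (no count files or DESeq2 needed):

```r
# Hypothetical four-sample stand-in for the real sample table
status     <- factor(c("Healthy", "Healthy", "Diabetic", "Diabetic"))
timepoints <- factor(c(1, 2, 1, 2))
Treat      <- factor(paste(status, timepoints, sep = ""))

design_matrix  <- model.matrix(~ 0 + Treat)  # numeric 0/1 matrix, like `design` in the post
design_formula <- ~ 0 + Treat                # a formula object

is_formula_matrix  <- inherits(design_matrix, "formula")
is_formula_formula <- inherits(design_formula, "formula")
print(c(matrix = is_formula_matrix, formula = is_formula_formula))

# Coercing the matrix to a formula raises an error; capture its message
coercion_result <- tryCatch(formula(design_matrix),
                            error = function(e) conditionMessage(e))
print(coercion_result)
```

Passing the formula itself (for example `~ 0 + Treat`) rather than the matrix built from it avoids the coercion step.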

  • #2
    What does colData(ddsHTSeq) give?


    • #3
      The answer is, thanks to dpryan:

      status <- factor(c(rep("Healthy",26), rep("Diabetic",22)), levels=c("Healthy", "Diabetic"));
      timepoints = as.factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2));
      patients <- factor(c(04,21,53,55,61,62,67,73,76,77,79,04,21,53,55,61,62,67,73,76,77,79, 01,08,13,70,71,72,75,81,82,83,86,87,88,01,08,13,70,71,72,75,81,82,83,86,87,88));
      sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, status=status, timepoints=timepoints, patients=patients);

      and

      des <- formula(~patients + timepoints*status);
      Last edited by sindrle; 10-20-2013, 02:22 AM.
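
Before fitting a paired design it can help to confirm that the `patients` column lines up with `status` and `timepoints`. The sketch below uses base R and the vectors as typed in this post; the helper name `check_pairing` is hypothetical. As typed, the blocks in `patients` run 11, 11, 13, 13 while `status` and `timepoints` run 13, 13, 11, 11, so the check flags a mismatch. The real file order is not shown in the thread, so this may only be a slip in the post. The reordered vector `patients_alt` is an assumption for illustration, not the poster's data layout.

```r
status     <- factor(c(rep("Healthy", 26), rep("Diabetic", 22)),
                     levels = c("Healthy", "Diabetic"))
timepoints <- factor(rep(c(1, 2, 1, 2), times = c(13, 13, 11, 11)))

# Patient IDs as typed in the post: an 11-patient block twice, then a 13-patient block twice
block11 <- c(4, 21, 53, 55, 61, 62, 67, 73, 76, 77, 79)
block13 <- c(1, 8, 13, 70, 71, 72, 75, 81, 82, 83, 86, 87, 88)
patients <- factor(c(block11, block11, block13, block13))

# Hypothetical helper: a paired table needs one sample per patient per time point
# and a single status per patient
check_pairing <- function(patients, timepoints, status) {
  per_time <- table(patients, timepoints)
  n_status <- tapply(as.character(status), patients,
                     function(s) length(unique(s)))
  list(one_sample_per_timepoint = all(per_time == 1),
       one_status_per_patient   = all(n_status == 1))
}

check_as_posted <- check_pairing(patients, timepoints, status)
print(check_as_posted)

# Assumption: the 13-patient block belongs to the 26 Healthy samples
patients_alt <- factor(c(block13, block13, block11, block11))
check_reordered <- check_pairing(patients_alt, timepoints, status)
print(check_reordered)
```

Whichever order is correct, the point is that the three vectors must describe the files in the same order as `sampleFiles`.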


      • #4
        I suppose this could give you the pair combinations you want:
        Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
        design <- formula(~ 0 + Treat)

        but I am not sure whether the subsequent DESeq(), nbinomWaldTest() or nbinomLRT() functions can work with this design formula. I am so tempted to switch to edgeR, which has a nicer tutorial and examples to follow.


        • #5
          So you mean I don't have a paired test the way I have written it?


          • #6
            With the "~ patient + ...", you get paired tests. You currently test for interaction between time and status, i.e., you will get genes for which the amount of change between the time points differs significantly between healthy and diabetic subjects.
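
The interaction described here is a difference of differences. A small base-R sketch with hypothetical log2 expression values for one gene; the numbers, the use of `lm()`, and the choice of Healthy and time point 1 as reference levels are assumptions for illustration, and the patient term is left out so the arithmetic stays visible:

```r
# Hypothetical log2 expression for one gene, one value per status/time cell
d <- data.frame(
  status     = factor(c("Healthy", "Healthy", "Diabetic", "Diabetic"),
                      levels = c("Healthy", "Diabetic")),
  timepoints = factor(c(1, 2, 1, 2)),
  y          = c(5, 6, 5.5, 8)
)

fit <- lm(y ~ timepoints * status, data = d)
interaction_coef <- unname(coef(fit)["timepoints2:statusDiabetic"])

# Change over time within each group, computed by hand
change_healthy  <- d$y[2] - d$y[1]
change_diabetic <- d$y[4] - d$y[3]
diff_of_diffs   <- change_diabetic - change_healthy

print(c(interaction = interaction_coef, diff_of_diffs = diff_of_diffs))
```

A gene with a large interaction term is one whose change between the time points differs between the two groups, which is what the interaction result reports.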


            • #7
              Thanks!
              That's what I thought, but is that the only thing tested?

              I get a lot of results, of which I have chosen three:

              statusResults <- results(dds, "status_Healthy_vs_Diabetic");
              timepointsResults <- results(dds, "timepoints_2_vs_1");
              statusTreatmentResults <- results(dds, "timepoints2.statusHealthy");

              So you described the "timepoints2.statusHealthy" result, right?

              But am I correct to use "status_Healthy_vs_Diabetic" to look for differences between the groups (disregarding time), and vice versa for "timepoints_2_vs_1"?

              Thanks!


              • #8
                Code:
                Treat <- factor(paste(sampleTable$status,sampleTable$timepoints,sep=""));
                design <- formula(~ 0 + Treat)
                This will create 4 groups: "Diabetic1", "Diabetic2", "Healthy1", and "Healthy2". So no, that won't keep the pairing. This is a similar design to what cuffdiff would do if you input the files as 4 groups. Aside from the pairing issue, some people prefer this since it's easier to directly compare groups, which is what they want to do. It's mostly a matter of the question you really want to ask.
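
A sketch of what the four-group design looks like and how two groups are compared directly. It uses base R with hypothetical log2 values for eight samples; `lm.fit()` stands in for the count model purely to show the arithmetic, and no patient column appears, which is why the pairing is lost:

```r
# Hypothetical layout: two samples per status/time group
status     <- factor(rep(c("Healthy", "Diabetic"), each = 4))
timepoints <- factor(rep(c(1, 1, 2, 2), times = 2))
Treat      <- factor(paste(status, timepoints, sep = ""))

design <- model.matrix(~ 0 + Treat)
colnames(design) <- levels(Treat)
print(colnames(design))

# Hypothetical log2 expression values for one gene
y <- c(5.0, 5.2, 6.0, 6.2, 5.4, 5.6, 8.0, 8.2)

# With a ~ 0 + group design each coefficient is simply that group's mean
group_means <- lm.fit(x = design, y = y)$coefficients

# Direct comparison of two groups: Diabetic at time 2 versus Diabetic at time 1
contrast <- c(Diabetic1 = -1, Diabetic2 = 1, Healthy1 = 0, Healthy2 = 0)
d2_vs_d1 <- sum(contrast * group_means[names(contrast)])
print(d2_vs_d1)
```

Any pairwise comparison is a contrast vector over the four columns, which is the convenience this parameterization offers.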


                • #9
                  So to compare Cuffdiff2 with DESeq2 I could run that design (since I have already done that in Cuffdiff2), but as you said, it's not my actual question of interest, which is what led me to DESeq2 in the first place.


                  • #10
                    Originally posted by sindrle
                    But am I correct to use "status_Healthy_vs_Diabetic" to look for differences between the groups (disregarding time), and vice versa for "timepoints_2_vs_1"?
                    For those genes for which the interaction is not significant: Yes.

                    For the others: not quite. You cannot disregard time, because the difference between the groups depends on the time point.

                    If you want to average over the two time points, you can add half of the interaction effect to the status main effect (or, vice versa, add half of the interaction to the time-point main effect to get the average over disease states).
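
The "add half of the interaction" rule can be checked with a small base-R sketch. The log2 values are hypothetical, `lm()` stands in for the count model, Healthy and time point 1 are assumed to be the reference levels, and the patient term is omitted for brevity:

```r
# Hypothetical log2 expression for one gene, one value per status/time cell
d <- data.frame(
  status     = factor(c("Healthy", "Healthy", "Diabetic", "Diabetic"),
                      levels = c("Healthy", "Diabetic")),
  timepoints = factor(c(1, 2, 1, 2)),
  y          = c(5, 6, 5.5, 8)
)

b <- coef(lm(y ~ timepoints * status, data = d))

# Status effect averaged over both time points
status_avg <- unname(b["statusDiabetic"] + 0.5 * b["timepoints2:statusDiabetic"])
# Time effect averaged over both disease states
time_avg   <- unname(b["timepoints2"] + 0.5 * b["timepoints2:statusDiabetic"])

# The same averages computed directly from the four cell values
status_direct <- mean(c(d$y[3] - d$y[1], d$y[4] - d$y[2]))
time_direct   <- mean(c(d$y[2] - d$y[1], d$y[4] - d$y[3]))

print(c(status_avg = status_avg, status_direct = status_direct))
print(c(time_avg = time_avg, time_direct = time_direct))
```

The main effect alone is the difference at the reference level only; adding half of the interaction moves it to the midpoint between the two levels of the other factor.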


                    • #11
                      What if time had the same effect on both groups?
                      Or if the groups are the same, but both changed over time?

                      Could you then do as I said?
                      Last edited by sindrle; 10-24-2013, 03:28 PM.
