Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    Dang it! Good catch, my columns did not match up.

    When I get another error I'll screenshot it and post it here. They seem to pop-up at will. I have a normal release of R and the errors occur at my lab computer + personal computer.

    Comment


    • #17
      A safer way to reorder columns would be to build the DESeqDataSet with a design of ~ 1, then rearrange the columns of the DESeqDataSet, then add a new column anim.nested, then add the desired design. What do you mean "at will"? After running a particular function? At what points in the analysis to you get these?

      Comment


      • #18
        What do you mean "at will"? After running a particular function? At what points in the analysis to you get these?
        It's been pretty random when they happen- usually when I'm not running code (often just editing script files prior to running them) so I haven't been able to track down what particular thing is triggering them. All I do know is that they happen in sessions when I've loaded those packages. I think it might be tied into the phyloseq data structures.

        Comment


        • #19
          Yeah, that won't be from DESeq2 then. And maybe ask the phyloseq maintainers if they have seen this on their github site, although their data structures are data.frame and matrix so I wouldn't know why there would be a problem there. If you are loading objects from memory at the beginning those could be corrupted.

          Comment


          • #20
            So I got R to throw up this error (not a segfault...these don't happen too often) right after resultsNames(dds)

            Code:
            2014-10-27 19:31:47.892 R[5511:707] *** RController: caught ObjC exception while processing system events. Update to the latest GUI version and consider reporting this properly (see FAQ) if it persists and is not known. 
            *** reason: *** -[__NSArrayM objectAtIndex:]: index 18446744073709551615 beyond bounds [0 .. 85]
            *** name: NSRangeException, info: (null)
            *** Version: R 3.1.1 () R.app R 3.1.1 GUI 1.65 Mavericks build(null)
            Consider saving your work soon in case this develops into a problem.

            Comment


            • #21
              I'm not sure about this, but I would recommend following the message and updating R to the latest GUI version.

              Comment


              • #22
                Thats the thing... I am running the latest GUI

                Comment


                • #23
                  It might be a problem in .RData, which is loaded every time. Try deleting this file. (although doesn't explain why you get it on two separate computers)

                  Comment


                  • #24
                    Originally posted by Michael Love View Post
                    hi gamb,

                    Firstly, as you are just starting to analyze, best to update to Bioconductor 3.0 and DESeq2 version 1.6 which was just released.

                    As the analysis is a bit complicated, I'd recommend using DESeq(dds, betaPrior=FALSE) as this simplifies the building of contrasts with interactions.

                    The design should be

                    Code:
                    ~ tissue*chem1*chem2
                    ...because you're interested in both the interactions of chem1 and chem2, and in the specific effects for each tissue.

                    The list of effects is given by

                    Code:
                    resultsNames(dds)
                    You can build up the contrasts using a list like below. I'll give some examples and see if you can extend to the others. Comparing the tissues alone is simple by specifying the name of the factor and the two levels to compare, e.g.:

                    Code:
                    results(dds, contrast=c("tissue","C","B"))
                    Then some basics about interaction terms: the effect of chem 1 in tissue B will be:

                    Code:
                    results(dds, contrast=list(c("chem1","chem1.tissueB")))
                    ...because you need to add the main chem 1 effect (the effect for tissue A, the base level of tissue) and the extra chem 1 effect for tissue B.

                    If you wanted to test if chem 1 has an effect in tissue B different than tissue A you would only test the interaction term "chem1.tissueB", because this is that difference.

                    Tissue A is the base level, so the effect of chem 1 in tissue A is simply

                    Code:
                    results(dds, name="chem1")
                    The effect of chem 1 and chem 2 in tissue B is a bit lengthy, you need to add main effects and all relevant interaction terms:

                    Code:
                    results(dds, contrast=list(c("chem1","chem2","chem1.tissueB","chem2.tissueB","chem1.chem2","chem1.chem2.tissueB")))
                    It's simpler for tissue A because this is the base level:

                    Code:
                    results(dds, contrast=list(c("chem1","chem2","chem1.chem2")))
                    And if you want to test only the interaction, answering if the effect of the two chems is different than each separately, you remove some of these effects. For tissue B this contrast would be:

                    Code:
                    results(dds, contrast=list(c("chem1.chem2","chem1.chem2.tissueB")))
                    For tissue A it is again simpler:

                    Code:
                    results(dds, name="chem1.chem2")
                    Note that another way to compare individual groups would be to create a new variable which describes all groups:

                    Code:
                    dds$group = factor(paste(dds$chem1, dds$chem2, dds$tissue))
                    design(dds) = ~ group
                    Then use the following to compare to specific groups:

                    Code:
                    results(dds, contrast=c("group","...","..."))
                    Dear Michael,

                    I am afraid I cannot build my contrsats either.
                    I have cells from 2 species (bovin, porc) that have been treated or not (treatment=pos or treatment=neg)

                    Here is my model
                    design (ddsX)<-~species+treatment+species:treatment

                    resultsNames(ddsX)
                    #[1] "Intercept" "species_porc_vs_bovin" #"treatment_pos_vs_neg" "speciesporc.treatmentpos"

                    If I understand well, I can find the genes with a treatment effect with:
                    results(ddsX, contrast=c("treatment","neg","pos"))

                    then the genes with a species effect with:
                    results(ddsX, contrast=c("species","bovin","porc"))

                    But how can I find the genes:
                    - from species "bovin" with a treatment effect
                    - from species "porc" with a treatment effect
                    - with interaction (that do not react to the treatment in a same way in "bovin" or "porc")?

                    I am using DESeq2_1.6.2

                    Thanks for your help

                    Gwenola

                    Comment


                    • #25
                      hi Gwenola,

                      Check in the Examples section of the help for ?results:

                      You should find, Example 2: two conditions, two groups, with interaction term.

                      Comment


                      • #26
                        Thanks

                        Thanks a lot.

                        ## Example 2: two conditions, two groups, with interaction term

                        # the condition effect. Is this the condition effect in groupX?
                        results(dds, contrast=c("condition","B","A"))


                        # the condition effect in group B. Is there a typo (condition effect in group Y?)
                        results(dds, contrast=c(0,0,1,1))
                        # or, equivalently using list to add these two effects
                        results(dds, contrast=list(c("condition_B_vs_A","groupY.conditionB")))

                        If I want the condition effect (whatever the group), is that
                        results(dds, contrast=c(0,0,1,1/2))?


                        Best regards
                        Gwenola

                        Originally posted by Michael Love View Post
                        hi Gwenola,

                        Check in the Examples section of the help for ?results:

                        You should find, Example 2: two conditions, two groups, with interaction term.

                        Comment


                        • #27
                          Is this the condition effect in groupX?

                          yes

                          Is there a typo (condition effect in group Y?)

                          yes, thanks! I've just fixed this in devel

                          If I want the condition effect (whatever the group), is that results(dds, contrast=c(0,0,1,1/2))?


                          yes, that's the condition effect averaged over both groups

                          Comment


                          • #28
                            Thanks

                            Thanks for your answer. It is clear now.
                            Gwenola

                            Comment


                            • #29
                              Originally posted by Michael Love View Post
                              There is a way to control for the animal effect in this experiment, by recoding the animals within blocks of chem1*chem2 and adding an interaction term. It gets a bit trickier to form contrasts, but I will show how to do this.

                              It helps to clean up the colData a bit, I'll show on a clean version, and you can reorder your samples correspondingly. Note that you should reorder the columns of the count matrix so that they match. If you have a SummarizedExperiment, the colData and matrix are tied together, so simply reordering the columns of the SummarizedExperiment does both for you.

                              I made a sample data.frame like your data:

                              Code:
                              d <- data.frame(anim=factor(rep(1:12,each=3)),
                                chem1=factor(rep(rep(c("N","Y"),each=6),3)),
                                chem2=factor(rep(rep(c("N","Y"),each=3),6)),
                                tiss=factor(rep(1:3,12)))
                              this looks like

                              Code:
                                  anim chem1 chem2 tiss
                               1     1     N     N    1
                               2     1     N     N    2
                               3     1     N     N    3
                               4     2     N     Y    1
                               5     2     N     Y    2
                               6     2     N     Y    3
                               7     3     Y     N    1
                               8     3     Y     N    2
                               9     3     Y     N    3
                               10    4     Y     Y    1
                               11    4     Y     Y    2
                               12    4     Y     Y    3
                               13    5     N     N    1
                               14    5     N     N    2
                               15    5     N     N    3
                               16    6     N     Y    1
                               17    6     N     Y    2
                               18    6     N     Y    3
                               19    7     Y     N    1
                               20    7     Y     N    2
                               21    7     Y     N    3
                               22    8     Y     Y    1
                               23    8     Y     Y    2
                               24    8     Y     Y    3
                               25    9     N     N    1
                               26    9     N     N    2
                               27    9     N     N    3
                               28   10     N     Y    1
                               29   10     N     Y    2
                               30   10     N     Y    3
                               31   11     Y     N    1
                               32   11     Y     N    2
                               33   11     Y     N    3
                               34   12     Y     Y    1
                               35   12     Y     Y    2
                               36   12     Y     Y    3
                              Now reorder by chem1, chem2 and tiss. Then add a new variable which will be used to control animal specific effects within each block of chem1*chem2.

                              Code:
                              idx <- with(d, order(chem1, chem2, tiss))
                              d <- d[idx,]
                              d$anim.nested <- factor(rep(1:3,12))
                              You can examine d to see how this works.

                              The model matrix is then the chem1 and chem2 terms and interaction, as well as the interactions of these with animal, and the interaction of these with tissue:

                              Code:
                              m <- model.matrix(~ chem1 + chem2 + chem1:chem2 + chem1:chem2:anim.nested + chem1:chem2:tiss, data=d)
                              The column names of this are:

                              Code:
                              > colnames(m)
                                [1] "(Intercept)"                "chem1Y"
                                [3] "chem2Y"                     "chem1Y:chem2Y"
                                [5] "chem1N:chem2N:anim.nested2" "chem1Y:chem2N:anim.nested2"
                                [7] "chem1N:chem2Y:anim.nested2" "chem1Y:chem2Y:anim.nested2"
                                [9] "chem1N:chem2N:anim.nested3" "chem1Y:chem2N:anim.nested3"
                               [11] "chem1N:chem2Y:anim.nested3" "chem1Y:chem2Y:anim.nested3"
                               [13] "chem1N:chem2N:tiss2"        "chem1Y:chem2N:tiss2"
                               [15] "chem1N:chem2Y:tiss2"        "chem1Y:chem2Y:tiss2"
                               [17] "chem1N:chem2N:tiss3"        "chem1Y:chem2N:tiss3"
                               [19] "chem1N:chem2Y:tiss3"        "chem1Y:chem2Y:tiss3"

                              Many of the contrasts become hard to form, though possible with numeric contrasts. A trick for getting the numeric contrasts of the other comparisons, is to take the column means of the model matrix, for the groups of samples you want to compare. For instance, if you want to find the main chem1 effect in tissue 2, the contrast would be:

                              Code:
                              ctrst = colMeans(m[d$tiss == "2" & d$chem1 == "Y" & d$chem2 == "N",]) -
                                      colMeans(m[d$tiss == "2" & d$chem1 == "N" & d$chem2 == "N",])
                              This ends up being the chem1 base effect, the chem1Y effect in tissue 2 minus the chem1N effect in tissue 2 and a few 1/3 and -1/3 to the animal interaction effects.

                              Edit: note that interaction terms can be tested using the same logic, but you need to subtract the groups with only one chem and then add the base group. So it looks like: chem1Y chem2Y - chem1N chem2Y - chem1Y chem2N + chem1N chem2N.

                              Also, I had to update that custom line of size factor code for when all rows have a zero. If you were using that line of code to avoid the issue that all rows of your count matrix have one or more zeros, you should get the edited version:



                              We're implementing a solution here which was on the list of todos for a while now, but it won't be out for 6 months. Typical RNA-Seq doesn't have this issue, but since phyloseq we are seeing more metagenomics datasets which often have no genes without a zero.

                              Hi Michael,

                              I have also kinda same experimental design and I tried to use your suggestions in this thread but still I get the same error regarding to linear combination.
                              In my project each sample has been treated with one type of gut_microbiota: A, B, C, D, E and we have measured expression level of each mouse at three different tissues: s, p, l. Now I would like to see DE genes resulted from comparing of every gut_microbiota status (B, C, D, E) to the baseline (A) on each tissue separately. For each type of gut_microbiota I have different number of replicates, varying from 2 up to 7 replicates. Here is my design matrix:

                              Code:
                              d <- data.frame(animal=c(rep(c('5231', '5232', '5233', '5234', '5161', '5162', '5163'), each=3), rep('5164', 4) , rep(c('5165', '5166', '5167', '5168', '5169', '5170', '5171', '5172', '5173', '5174', '5175', '5176', '5177', '5178'), each=3), rep('5179', 2)), 
                                                         gut_microbiota = c(rep('A', 12), rep('B', 9), rep('E', 10), rep('D', 12), rep('B', 9), rep('D', 9), rep('C', 8)), 
                                                         tissue=c(rep(c('s', 'p', 'l'), each=1, 7), 's', 'p', 'l', 'l', rep(c('s', 'p', 'l'), each=1, 14), c('s', 'p')))
                              resulted in:

                              Code:
                              animal gut_microbiota tissue
                              1    5231              A      s
                              2    5231              A      p
                              3    5231              A      l
                              4    5232              A      s
                              5    5232              A      p
                              6    5232              A      l
                              7    5233              A      s
                              8    5233              A      p
                              9    5233              A      l
                              10   5234              A      s
                              11   5234              A      p
                              12   5234              A      l
                              13   5161              B      s
                              14   5161              B      p
                              15   5161              B      l
                              16   5162              B      s
                              17   5162              B      p
                              18   5162              B      l
                              19   5163              B      s
                              20   5163              B      p
                              21   5163              B      l
                              22   5164              E      s
                              23   5164              E      p
                              24   5164              E      l
                              25   5164              E      l
                              26   5165              E      s
                              27   5165              E      p
                              28   5165              E      l
                              29   5166              E      s
                              30   5166              E      p
                              31   5166              E      l
                              32   5167              D      s
                              33   5167              D      p
                              34   5167              D      l
                              35   5168              D      s
                              36   5168              D      p
                              37   5168              D      l
                              38   5169              D      s
                              39   5169              D      p
                              40   5169              D      l
                              41   5170              D      s
                              42   5170              D      p
                              43   5170              D      l
                              44   5171              B      s
                              45   5171              B      p
                              46   5171              B      l
                              47   5172              B      s
                              48   5172              B      p
                              49   5172              B      l
                              50   5173              B      s
                              51   5173              B      p
                              52   5173              B      l
                              53   5174              D      s
                              54   5174              D      p
                              55   5174              D      l
                              56   5175              D      s
                              57   5175              D      p
                              58   5175              D      l
                              59   5176              D      s
                              60   5176              D      p
                              61   5176              D      l
                              62   5177              C      s
                              63   5177              C      p
                              64   5177              C      l
                              65   5178              C      s
                              66   5178              C      p
                              67   5178              C      l
                              68   5179              C      s
                              69   5179              C      p
                              I have reordered design matrix as you mentioned and add a new term called, animal.nested:
                              Code:
                              idx <- with(d, order(gut_microbiota, tissue))
                              d <- d[idx, ]
                              d$animal.nested <- factor(c(rep(1:4, 3), rep(1:6, 3), c(1:2), rep(1:3, 2), rep(1:7, 3), 1, rep(1:3, 3)))
                              And finally this is my fitting design:
                              Code:
                              dds = DESeqDataSetFromHTSeqCount(sampleTable=sample_table, directory='~/htseq/', design= ~ gut_microbiota + gut_microbiota:animal.nested + gut_microbiota:tissue)
                              Can you please help with this? What is wrong in my codes?
                              Last edited by rozitaa; 03-30-2015, 04:41 AM.

                              Comment


                              • #30
                                hi rozitaa,

                                The issue here is that there are more animals in certain tissues. So when the R function model.matrix tries to form the model matrix internal to the DESeq() function, this results in a column of zeros for the nested animal 6, microbiota A combination for example.

                                There is an easy solution though. In the next release of Bioconductor, which will be released 17 April 2015, and has DESeq2 v1.8, you can provide a custom model matrix to DESeq(). (You can test the development version of DESeq2 v1.7 already, but this requires downloading the development version of R, and this is not impossible but not always easy.)

                                So all you need to do is to remove these interaction columns which have no samples:

                                Code:
                                design <- ~ gut_microbiota + gut_microbiota:animal.nested + gut_microbiota:tissue
                                mm <- model.matix(design, data=colData(dds))
                                ## remove columns of mm which have all zeros:
                                (all.zero <- apply(mm, 2, function(x) all(x == 0)))
                                mm <- mm[, !all.zero]
                                dds <- DESeq(dds, full=mm, betaPrior=FALSE)
                                We do have to turn off the log2 fold change shrinkage for this options though (betaPrior=FALSE). But the inference and results tables work the same.
                                Last edited by Michael Love; 03-31-2015, 05:45 AM. Reason: fixed code

                                Comment

                                Latest Articles

                                Collapse

                                • seqadmin
                                  Essential Discoveries and Tools in Epitranscriptomics
                                  by seqadmin


                                  The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist on Modified Bases...
                                  Yesterday, 07:01 AM
                                • seqadmin
                                  Current Approaches to Protein Sequencing
                                  by seqadmin


                                  Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                                  04-04-2024, 04:25 PM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by seqadmin, 04-11-2024, 12:08 PM
                                0 responses
                                55 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-10-2024, 10:19 PM
                                0 responses
                                52 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-10-2024, 09:21 AM
                                0 responses
                                45 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-04-2024, 09:00 AM
                                0 responses
                                55 views
                                0 likes
                                Last Post seqadmin  
                                Working...
                                X