  • Deseq2 Design formulas; Too many possible roads!

    Dear NGS community,

    I am analysing RNA-seq data with DESeq2 and I would truly appreciate any feedback on my current design formulas.

    I have 3 biological replicates for each one of the following 7 experiments:

    UND = Undifferentiated epidermal stem cells
    DIF = Differentiated epidermal stem cells
    HPLC = HAT family inhibitor
    DMSO = DMSO dilution used as control against HPLC
    siRNA1 = Targeting the same protein
    siRNA2 = Targeting the same protein
    siCTRL = siRNA Control

    stage......... treatment
    UND ......... siCTRL
    UND ......... DMSO
    DIF ......... siCTRL
    DIF ......... DMSO
    DIF ......... siRNA1
    DIF ......... siRNA2
    DIF ......... HPLC
    1. DE genes in Differentiation
    Null hypothesis: no changes in gene expression during differentiation, controlling for treatment:
    I'll be using only these 4 experiments to test this:

    stage......... treatment
    UND ......... siCTRL
    UND ......... DMSO
    DIF ......... siCTRL
    DIF ......... DMSO
    Code:
    library(DESeq2)

    # Sample table: 3 replicates per stage/treatment combination,
    # listed in the same order as the columns of data1
    design1 <- data.frame(experiment=colnames(data1),
     stage=c("UND","UND","UND",
            "UND","UND","UND",
            "DIF","DIF","DIF",
            "DIF","DIF","DIF"),

     treatment=c("siCTRL","siCTRL","siCTRL",
             "DMSO","DMSO","DMSO",
             "siCTRL","siCTRL","siCTRL",
             "DMSO", "DMSO", "DMSO") )

    # Full model: treatment, stage and their interaction
    dLRT <- DESeqDataSetFromMatrix(countData = data1, colData = design1, 
                      design = ~ treatment + stage:treatment + stage )
    # Likelihood ratio test: full model vs. a model with treatment only
    dLRT <- DESeq(dLRT, test="LRT", 
               full= ~ treatment + stage:treatment + stage, 
               reduced= ~ treatment )
    dLRT_res <- results(dLRT)
    # Flip the sign so that positive L2FC values correspond to DIF
    dLRT_res$log2FoldChange <- dLRT_res$log2FoldChange*-1 
    It is my understanding that here the genes with small padj values will correspond to genes changing expression in differentiation
    and not because of the treatment. Is my design correct to address the question of which genes are DE during differentiation?
    Should I remove the "stage:treatment" from the design formula? Or completely change my design formula?

    2. DE genes in Differentiated cells after knocking down by siRNA

    Null hypothesis: no changes in gene expression after treatment with siRNA in Differentiated cells.
    So far I've been using these experiments:
    condition
    DIF siCTRL
    DIF siRNA1
    DIF siRNA2
    Code:
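    library(DESeq2)

    # 3 replicates per condition; assumes the columns of data2 are grouped
    # in this order (siCTRL, siRNA1, siRNA2), as in design1
    design2 <- data.frame(experiment=colnames(data2),
              condition=rep(c("siCTRL","siRNA1","siRNA2"), each=3) )
    
    dLRT <- DESeqDataSetFromMatrix(countData = data2, colData = design2,
              design = ~ condition )
    
    # LRT against the intercept-only model: any difference among the 3 conditions
    dLRT <- DESeq(dLRT, test="LRT", reduced= ~ 1 )
    
    dLRT_res <- results(dLRT)
    
    # Pairwise log2 fold changes, taken from the same fitted object
    dDif_siRNA1 <-results(dLRT, contrast=c("condition","siRNA1","siCTRL"))
    dDif_siRNA2 <-results(dLRT, contrast=c("condition","siRNA2","siCTRL"))
    dDif_siRNAvs <-results(dLRT, contrast=c("condition","siRNA1","siRNA2"))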
    Finally, to select differentially expressed genes that are down-regulated upon treatment with the siRNAs, I use:

    Code:
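    # Significant in the LRT, down more than 2-fold with both siRNAs,
    # and less than 2-fold different between the two siRNAs
    select = which(dLRT_res$padj<0.01 & 
               dDif_siRNA1$log2FoldChange<(-1) & 
               dDif_siRNA2$log2FoldChange<(-1) & 
               abs(dDif_siRNAvs$log2FoldChange)<1 )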
    I would like to know whether this is a sound way to find the genes that are DE upon siRNA knock-down, or whether there is a better way to answer this question.

    Also, is it possible to use DIF DMSO as a control as well, for example by adding another column:

    condition......... treatmentGroup
    DIF DMSO ......... CNT
    DIF siCTRL ......... CNT
    DIF siRNA1 ......... TRT
    DIF siRNA2 ......... TRT
    And then try something like "full = ~ condition + treatmentGroup, reduced = ~ condition" to also use DIF DMSO in the control group.
    But there is the problem of linear combinations, so I thought of using the edgeR trick "individuals in nested groups" from the DESeq2 vignette to bypass this. In the end I decided it was better to ask for help.
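
The linear-combination problem mentioned here can be seen directly from the model matrix. The following is a minimal sketch using only base R, assuming 3 replicates for each of the four DIF groups in the table above; the object names (`coldata`, `mm`, `n_columns`, `matrix_rank`) are illustrative and not part of the original analysis.

```r
# Four DIF groups, 3 replicates each; treatmentGroup is fully determined by condition
coldata <- data.frame(
  condition = factor(rep(c("DMSO", "siCTRL", "siRNA1", "siRNA2"), each = 3))
)
coldata$treatmentGroup <- factor(
  ifelse(coldata$condition %in% c("siRNA1", "siRNA2"), "TRT", "CNT")
)

mm <- model.matrix(~ condition + treatmentGroup, data = coldata)

n_columns   <- ncol(mm)      # coefficients the formula asks for
matrix_rank <- qr(mm)$rank   # coefficients that can actually be estimated

print(c(columns = n_columns, rank = matrix_rank))
```

The rank comes out one lower than the number of columns because the treatmentGroup column is the sum of the siRNA1 and siRNA2 condition columns, so the two factors cannot both be fitted as written.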

    I am not convinced by this last design since, intuitively, it makes no sense to compare siRNAs against DMSO treatment. In the end, though, my goal is to compare the DE genes from siBRD4 treatment against the DE genes in differentiation, and perhaps there is a better way to make this comparison than just overlapping gene names from 2 independent tests (for example by involving the other dataset in differentiated cells: DIF DMSO).

    Thanks a lot in advance for the help and attention! Cheers!
    Rob TM
    Last edited by RTM; 05-27-2015, 12:30 AM.

  • #2
    1. Yes, you're testing for whether there's an effect due to "stage" while controlling for "treatment" with that design. Whether you want to include "stage:treatment" in the reduced model or not depends on exactly what you want to argue using the results. As is, you can say that there's a difference due to "stage", but possibly only due to it interacting with "treatment". Perhaps that's a good biological point to make, I don't know the biological background here. I suspect, though, that you want to make the argument that there are some changes due to "stage" in and of itself, in which case you would want "stage:treatment" in the reduced model.

    2. An adjusted p-value threshold of 0.01 seems rather strict unless you're getting a crazy number of DE genes otherwise. I'm not sure you really want to test "abs(dDif_siBRD4vs)<1". I understand that you're trying to ensure that the knock-down isn't just due to a weird siRNA, but given that the siRNAs could well have different efficiencies I worry that you'll be tossing some meaningful results.

    3. Regarding adding in a DMSO control, you'd have to give more details. I can only presume that the siRNAs have a bit of DMSO mixed in for solubility, in which case it'd be nice to control for that. However, isn't your siCTRL group already controlling for that (I assume that this is a scrambled siRNA delivered in an identical manner)? In that case, you're really not gaining anything by including the DMSO group.
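
On point 1, it can help to check what each candidate formula expands to before running the LRT. The sketch below is a minimal illustration using only base R's `model.matrix`, assuming the 12-sample layout of design1 from the first post; `coldata` and `n_coef` are hypothetical names introduced here, not part of the original analysis.

```r
# Sample table mirroring design1 (3 replicates per stage/treatment group)
coldata <- data.frame(
  stage     = factor(rep(c("UND", "DIF"), each = 6), levels = c("UND", "DIF")),
  treatment = factor(rep(rep(c("siCTRL", "DMSO"), each = 3), times = 2),
                     levels = c("siCTRL", "DMSO"))
)

# Number of model-matrix columns a formula expands to for this sample table
n_coef <- function(formula) ncol(model.matrix(formula, data = coldata))

full_cols        <- n_coef(~ treatment + stage + stage:treatment)
reduced_trt      <- n_coef(~ treatment)                    # drops stage and interaction
reduced_additive <- n_coef(~ treatment + stage)            # drops only the interaction
reduced_inter    <- n_coef(~ treatment + stage:treatment)  # stage effect per treatment

print(c(full = full_cols, treatment_only = reduced_trt,
        additive = reduced_additive, with_interaction = reduced_inter))
```

Under R's default coding, `~ treatment + stage:treatment` expands to one stage effect per treatment and ends up with as many columns as the full model, so a comparison against the full model would have no degrees of freedom left to test. Comparing `~ treatment + stage` against `~ treatment` is one way to test a stage effect that is not carried by the interaction, assuming an additive model is acceptable.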

    • #3
      Dear Devon Ryan,

      Thank you so much for the prompt reply. It is really appreciated.

      I'll be including the interaction "stage:treatment" in the reduced model, as you are right that I want to argue that those genes are DE because of the differentiation process.

      Indeed, you are correct in your assumptions, and I agree that there is not much to gain by adding DIF DMSO. Perhaps a more appropriate question would be: is there a better way to find which of the genes that are DE in differentiation are affected by the siRNA knock-down?

      So far, my approach has been to ask which genes are DE in the differentiation process, then which genes are DE in differentiated cells upon treatment with siRNAs, and finally to intersect the two gene lists (to leave out genes that are DE because of siRNA treatment but not in the differentiation process).
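
The list-intersection step described above could be sketched as follows. This is a minimal illustration in base R, not the code used in the thread: `sig_genes` is a hypothetical helper, the 0.05 cutoff is an arbitrary default, and the two small tables are made-up stand-ins for result tables with a `padj` column and gene IDs as row names.

```r
# Return the IDs of genes passing an adjusted p-value cutoff.
# `res` is any data frame with a `padj` column and gene IDs as row names.
sig_genes <- function(res, alpha = 0.05) {
  keep <- !is.na(res$padj) & res$padj < alpha   # padj can be NA for filtered genes
  rownames(res)[keep]
}

# Toy stand-ins for the two result tables (values invented for illustration)
res_diff  <- data.frame(padj = c(0.001, 0.20, 0.03, NA),
                        row.names = c("geneA", "geneB", "geneC", "geneD"))
res_sirna <- data.frame(padj = c(0.04, 0.01, 0.50, 0.002),
                        row.names = c("geneA", "geneB", "geneC", "geneD"))

# Genes significant in both the differentiation and the siRNA comparison
shared <- intersect(sig_genes(res_diff), sig_genes(res_sirna))
print(shared)
```

Handling `NA` explicitly keeps genes without an adjusted p-value out of both lists instead of relying on how `which()` drops them.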

      Thanks again!

      • #4
        Regarding a different method, you might consider just putting both siRNA groups together. This would increase the variance, but you're only interested in consistent changes anyway, so that should be fine. You then have the added benefit of not filtering out genes that are DE due to either of the siRNAs but significantly more so in one versus the other.
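
Pooling the two siRNA groups might look like the sketch below. It assumes 3 replicates per condition with the count matrix columns in the order shown; `coldata`, `group` and `run_pooled_test` are hypothetical names, and the helper is only defined here, not run, since it needs the DESeq2 package and a real count matrix.

```r
# Sample table for the DIF knock-down samples, 3 replicates per condition
# (assumes the count matrix columns are in this same order)
coldata <- data.frame(
  condition = rep(c("siCTRL", "siRNA1", "siRNA2"), each = 3)
)

# Collapse both siRNAs into a single "KD" level; siCTRL is the reference
coldata$group <- factor(
  ifelse(coldata$condition == "siCTRL", "siCTRL", "KD"),
  levels = c("siCTRL", "KD")
)

# Hypothetical helper: Wald test of pooled knock-down vs. control.
# `counts` is a genes x samples integer matrix matching the rows of `coldata`.
run_pooled_test <- function(counts, coldata) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ group)
  dds <- DESeq2::DESeq(dds)
  DESeq2::results(dds, contrast = c("group", "KD", "siCTRL"))
}

# Check how the original conditions map onto the pooled groups
print(table(coldata$condition, coldata$group))
```

With this grouping, the six siRNA samples are treated as replicates of one knock-down condition, so differences between the two siRNAs end up in the dispersion estimate rather than in a separate filter.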

        • #5
          Thanks again for the prompt reply.

          I think I have not succeeded in making myself clear, and I feel it is because I have not described the experimental design in enough detail.

          What I want to know is which genes are being affected by the siRNA knock down during the differentiation process. Here is a summary of the experimental design:

          1. Count the cells, seed them and transfect them with corresponding siRNAs (CNTRL, 1 and 2), DMSO and HPLC. At this point they are all Undif.
          2. 24 hours post-transfection, differentiation is induced in all the required wells. (The Undif wells stay untouched.)
          3. Harvest the Undif siCTRL and DMSO samples.
          4. siRNA retransfection is done (to maintain the efficiency of the knock-down), and fresh DMSO and HPLC treatment is added with a media change.
          5. Wait until the control cells (siCTRL & DMSO) are differentiated and harvest them all.

          Then an even better question might be: how can I model or account for the interaction between the siRNAs and the differentiation process, in order to find which of the genes that are DE in the differentiation process are affected (DE) by the siRNA treatment?

          Thank you very much for the help provided so far.

          • #6
            Hmm, well the experiment isn't really designed to properly disentangle the siRNA:differentiation interaction from the direct siRNA effect. There's really no way around that.
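
One way to see this limitation is to cross-tabulate the groups that were actually sequenced. The sketch below uses only base R and assumes the 7 stage/treatment groups listed in the first post with 3 replicates each; the object names are illustrative.

```r
# The 7 groups actually sequenced, 3 replicates each
groups <- data.frame(
  stage     = c("UND", "UND", "DIF", "DIF", "DIF", "DIF", "DIF"),
  treatment = c("siCTRL", "DMSO", "siCTRL", "DMSO", "siRNA1", "siRNA2", "HPLC")
)
coldata <- groups[rep(seq_len(nrow(groups)), each = 3), ]
coldata$stage     <- factor(coldata$stage, levels = c("UND", "DIF"))
coldata$treatment <- factor(coldata$treatment,
                            levels = c("siCTRL", "DMSO", "siRNA1", "siRNA2", "HPLC"))

# Cross-tabulation: zero cells are stage/treatment combinations never measured
cells <- table(coldata$stage, coldata$treatment)
print(cells)

# A full interaction model asks for more coefficients than the data support
mm <- model.matrix(~ stage * treatment, data = coldata)
n_columns   <- ncol(mm)
matrix_rank <- qr(mm)$rank
print(c(columns = n_columns, rank = matrix_rank))
```

The empty cells are the UND samples treated with siRNA1, siRNA2 or HPLC. Without them, the siRNA effect in DIF cells and a siRNA:stage interaction occupy the same model-matrix columns and cannot be told apart.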

            • #7
              Thanks a lot for the help and comments!
