Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • [DESeq2] Multiple replicates, multiple treatments

    I'm experiencing issues again with the replicates of our experiment. We should be comparing 3 types of treatments in one heatmap.
    • Ctrl vs T
    • Ctrl vs B
    • Ctrl vs A


    Each of which have 3 replicates. The following code works very well when I don't use replicates, and simply compare one treatment vs control. Also, when calling mcols, it should show the 3 comparisons we will use in our heatmap, though it only says CTRL vs T (if I don't use replicates) or T vs A in this case.

    So here are my two questions:
    1. Is there a better way to define samples$condition?
    2. How should I use mcols to have the 3 comparisons?


    Thanks (ps: I'm not an R expert)

    Code:
    # DESeq1 libraries
    library( "DESeq2" )
    library("Biobase")
    
    # Heatmap libraries
    library(RColorBrewer)
    library( "genefilter" )
    library(gplots) 
    
    # Start loading matrix data
    clba = read.table("matrix_duplicates_merged_CLBA.txt", header=TRUE, row.names=1)
    head(clba)
    
    samplesclba <- data.frame(row.names=c("C1", "C2", "C3", "T1", "T2", "T3", "B1", "B2", "B3", "A1", "A2", "A3"), condition=as.factor(c(rep("C",3), rep("T", 3), rep("B", 3), rep("A", 3))))
    
    ## Relevel doesn't work with replicates
    samples$condition <- relevel(samples$condition, "C")
    [B]Error in samples$condition : object of type 'closure' is not subsettable[/B]
    
    # Launch DESeq2
    ddsclba <- DESeqDataSetFromMatrix(countData = as.matrix(clba), colData=samplesclba, design=~condition)
    ddsclba <- DESeq(ddsclba, betaPrior=FALSE)
    
    # Results
    resclba <- results( ddsclba )
    mcols(resclba, use.names=TRUE)
    DataFrame with 6 rows and 2 columns
                           type                           description
                    <character>                           <character>
    baseMean       intermediate           the base mean over all rows
    log2FoldChange      results  log2 fold change: condition T vs A
    lfcSE               results    standard error: condition T vs A
    stat                results    Wald statistic: condition T vs A
    pvalue              results Wald test p-value: condition T vs A
    padj                results                  BH adjusted p-values
    
    
    
    # Heatmap with top 35 genes
    rldclba <- rlogTransformation(ddsclba)
    topVarGenesclba <- order( rowVars( assay(rldclba) ), decreasing=TRUE ) [1:35]
    hmcol <- colorRampPalette( rev(brewer.pal(9, "RdBu")))(255)
    
    heatmap.2( assay(rldclba)[ topVarGenesclba, ], Colv=FALSE, scale="row", 
    trace="none", dendrogram="row", col = hmcol
    Last edited by KYR; 07-28-2014, 03:41 PM. Reason: typo

  • #2
    One problem that I can see in your script is samples$condition. There is no such variable. Relevel works on DESeqDataSet.

    Comment


    • #3
      We strongly recommend to read in a CSV file for the sample table. This helps to avoid little mistakes (accidental keystroke when scrolling through script) which can change the whole result of the analysis.

      As Tibor mentioned:

      dataframe$column <- value

      only works if dataframe exists already.

      I'm not sure what you expect from mcols(). This function just tells you more information about the columns of the results table. res is the results table itself. And you can customize the results table -- e.g. which comparison to make -- using the arguments to results():

      resTC = results(dds, contrast=c("condition","T","C"))

      this will give a results table of log fold changes (LFC) of T over C. Here a positive LFC means T has higher counts than C. A negative LFC means T has lower counts than C.

      Comment


      • #4
        Originally posted by Michael Love View Post

        I'm not sure what you expect from mcols(). This function just tells you more information about the columns of the results table. res is the results table itself. And you can customize the results table -- e.g. which comparison to make -- using the arguments to results():

        resTC = results(dds, contrast=c("condition","T","C"))

        this will give a results table of log fold changes (LFC) of T over C. Here a positive LFC means T has higher counts than C. A negative LFC means T has lower counts than C.
        Thanks for your answer. I use mcols() to double-check that the comparison is done as I need it, i.e.,

        Ctrl vs T
        Ctrl vs B
        Ctrl vs A

        I still don't understand if, despite what mcols says, the pipeline I wrote is giving me the heatmap I need with the 3 comparisons I wrote above..

        My guess is that mcols() should show the 3 comparisons?!...

        Comment


        • #5
          In your heatmap above you are plotting the rlog counts for the rows with the highest variance. The rlog is a transformation which has nothing to do with differential expression testing. So you haven't told the software that you want to visualize genes which are differentially expressed in some contrast.

          You can subset and plot genes which are differentially expressed in some comparison like so:

          Code:
          res = results(dds, contrast=c("condition","B","A"))
          sig = which(res$padj < 0.1)
          
          # check if this is too many rows to plot
          length(sig)
          
          # rlog or VST
          rld <- rlog(dds)
          heatmap.2( assay(rld)[sig, ])
          Last edited by Michael Love; 08-03-2014, 05:51 AM. Reason: test =>visualize

          Comment

          Latest Articles

          Collapse

          • seqadmin
            Strategies for Sequencing Challenging Samples
            by seqadmin


            Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
            03-22-2024, 06:39 AM
          • seqadmin
            Techniques and Challenges in Conservation Genomics
            by seqadmin



            The field of conservation genomics centers on applying genomics technologies in support of conservation efforts and the preservation of biodiversity. This article features interviews with two researchers who showcase their innovative work and highlight the current state and future of conservation genomics.

            Avian Conservation
            Matthew DeSaix, a recent doctoral graduate from Kristen Ruegg’s lab at The University of Colorado, shared that most of his research...
            03-08-2024, 10:41 AM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by seqadmin, Yesterday, 06:37 PM
          0 responses
          10 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, Yesterday, 06:07 PM
          0 responses
          9 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 03-22-2024, 10:03 AM
          0 responses
          50 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 03-21-2024, 07:32 AM
          0 responses
          67 views
          0 likes
          Last Post seqadmin  
          Working...
          X