Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • [DESeq2] Multiple replicates, multiple treatments

    I'm experiencing issues again with the replicates of our experiment. We should be comparing 3 types of treatments in one heatmap.
    • Ctrl vs T
    • Ctrl vs B
    • Ctrl vs A


    Each of which have 3 replicates. The following code works very well when I don't use replicates, and simply compare one treatment vs control. Also, when calling mcols, it should show the 3 comparisons we will use in our heatmap, though it only says CTRL vs T (if I don't use replicates) or T vs A in this case.

    So here are my two questions:
    1. Is there a better way to define samples$condition?
    2. How should I use mcols to have the 3 comparisons?


    Thanks (ps: I'm not an R expert)

    Code:
    # DESeq1 libraries
    library( "DESeq2" )
    library("Biobase")
    
    # Heatmap libraries
    library(RColorBrewer)
    library( "genefilter" )
    library(gplots) 
    
    # Start loading matrix data
    clba = read.table("matrix_duplicates_merged_CLBA.txt", header=TRUE, row.names=1)
    head(clba)
    
    samplesclba <- data.frame(row.names=c("C1", "C2", "C3", "T1", "T2", "T3", "B1", "B2", "B3", "A1", "A2", "A3"), condition=as.factor(c(rep("C",3), rep("T", 3), rep("B", 3), rep("A", 3))))
    
    ## Relevel doesn't work with replicates
    samples$condition <- relevel(samples$condition, "C")
    [B]Error in samples$condition : object of type 'closure' is not subsettable[/B]
    
    # Launch DESeq2
    ddsclba <- DESeqDataSetFromMatrix(countData = as.matrix(clba), colData=samplesclba, design=~condition)
    ddsclba <- DESeq(ddsclba, betaPrior=FALSE)
    
    # Results
    resclba <- results( ddsclba )
    mcols(resclba, use.names=TRUE)
    DataFrame with 6 rows and 2 columns
                           type                           description
                    <character>                           <character>
    baseMean       intermediate           the base mean over all rows
    log2FoldChange      results  log2 fold change: condition T vs A
    lfcSE               results    standard error: condition T vs A
    stat                results    Wald statistic: condition T vs A
    pvalue              results Wald test p-value: condition T vs A
    padj                results                  BH adjusted p-values
    
    
    
    # Heatmap with top 35 genes
    rldclba <- rlogTransformation(ddsclba)
    topVarGenesclba <- order( rowVars( assay(rldclba) ), decreasing=TRUE ) [1:35]
    hmcol <- colorRampPalette( rev(brewer.pal(9, "RdBu")))(255)
    
    heatmap.2( assay(rldclba)[ topVarGenesclba, ], Colv=FALSE, scale="row", 
    trace="none", dendrogram="row", col = hmcol
    Last edited by KYR; 07-28-2014, 03:41 PM. Reason: typo

  • #2
    One problem that I can see in your script is samples$condition. There is no such variable. Relevel works on DESeqDataSet.

    Comment


    • #3
      We strongly recommend to read in a CSV file for the sample table. This helps to avoid little mistakes (accidental keystroke when scrolling through script) which can change the whole result of the analysis.

      As Tibor mentioned:

      dataframe$column <- value

      only works if dataframe exists already.

      I'm not sure what you expect from mcols(). This function just tells you more information about the columns of the results table. res is the results table itself. And you can customize the results table -- e.g. which comparison to make -- using the arguments to results():

      resTC = results(dds, contrast=c("condition","T","C"))

      this will give a results table of log fold changes (LFC) of T over C. Here a positive LFC means T has higher counts than C. A negative LFC means T has lower counts than C.

      Comment


      • #4
        Originally posted by Michael Love View Post

        I'm not sure what you expect from mcols(). This function just tells you more information about the columns of the results table. res is the results table itself. And you can customize the results table -- e.g. which comparison to make -- using the arguments to results():

        resTC = results(dds, contrast=c("condition","T","C"))

        this will give a results table of log fold changes (LFC) of T over C. Here a positive LFC means T has higher counts than C. A negative LFC means T has lower counts than C.
        Thanks for your answer. I use mcols() to double-check that the comparison is done as I need it, i.e.,

        Ctrl vs T
        Ctrl vs B
        Ctrl vs A

        I still don't understand if, despite what mcols says, the pipeline I wrote is giving me the heatmap I need with the 3 comparisons I wrote above..

        My guess is that mcols() should show the 3 comparisons?!...

        Comment


        • #5
          In your heatmap above you are plotting the rlog counts for the rows with the highest variance. The rlog is a transformation which has nothing to do with differential expression testing. So you haven't told the software that you want to visualize genes which are differentially expressed in some contrast.

          You can subset and plot genes which are differentially expressed in some comparison like so:

          Code:
          res = results(dds, contrast=c("condition","B","A"))
          sig = which(res$padj < 0.1)
          
          # check if this is too many rows to plot
          length(sig)
          
          # rlog or VST
          rld <- rlog(dds)
          heatmap.2( assay(rld)[sig, ])
          Last edited by Michael Love; 08-03-2014, 05:51 AM. Reason: test =>visualize

          Comment

          Latest Articles

          Collapse

          • seqadmin
            Current Approaches to Protein Sequencing
            by seqadmin


            Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
            04-04-2024, 04:25 PM
          • seqadmin
            Strategies for Sequencing Challenging Samples
            by seqadmin


            Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
            03-22-2024, 06:39 AM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by seqadmin, 04-11-2024, 12:08 PM
          0 responses
          18 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-10-2024, 10:19 PM
          0 responses
          22 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-10-2024, 09:21 AM
          0 responses
          17 views
          0 likes
          Last Post seqadmin  
          Started by seqadmin, 04-04-2024, 09:00 AM
          0 responses
          49 views
          0 likes
          Last Post seqadmin  
          Working...
          X