Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deseq using replicates and time-matched controls

    Hi All,

    I have a question regarding Deseq and have the following miRNA dataset (4 timepoints, 2 conditions with 6 biological replicates each; 42 samples in total):

    Control (C) t=0 (n=6)
    Control (C) t=6h (n=6)
    Control (C) t=24h (n=6)
    Control (C) t=48h (n=6)

    Treated (T) t=6h (n=6)
    Treated (T) t=24h (n=6)
    Treated (T) t=48h (n=6)

    I would like to find differential miRNA expression as a result of the treatment at each time point using the time-matched controls. Would this be possible using Deseq?

    To identify differences in miRNA expression as a result of the treatment after 6h, I created the attached file (see below).

    I followed the vignette (2012-05-02) and used the following script:

    library( "DESeq" )
    countTable<- read.table("./test_deseq.txt",row.names=1, header=TRUE,)
    mirnaDesign <- data.frame( row.names = colnames( countTable ), condition = c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" ), libType = c( "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end" ) )
    mirnaDesign

    singleSamples <- mirnaDesign$libType == "single-end"
    countTable <- countTable[ , singleSamples ]
    conds <- mirnaDesign$condition[ singleSamples ]

    library( "DESeq" )
    cds <- newCountDataSet( countTable, conds )
    head( counts(cds) )
    cds <- estimateSizeFactors( cds )
    sizeFactors( cds )

    head( counts( cds, normalized=TRUE ) )
    cds <- estimateDispersions( cds )
    str( fitInfo(cds) )
    # plotDispEsts( cds ) <-- for some reason this did not work!!!

    res <- nbinomTest( cds, "C", "T" )
    head(res)

    plotDE <- function( res )
    plot(res$baseMean, res$log2FoldChange, log="x", pch=20, cex=.3, col = ifelse( res$padj < .1, "red", "black" ) )
    plotDE( res )

    hist(res$pval, breaks=100, col="skyblue", border="slateblue", main="")
    resSig <- res[ res$padj < 0.1, ]

    head( resSig[ order( resSig$foldChange, -resSig$baseMean ), ] )
    head( resSig[ order( -resSig$foldChange, -resSig$baseMean ), ] )
    write.table( res, file="results_sample_data.txt" )


    Unfortunately, it didn't find any significantly altered miRNAs and I'm wondering if I'm doing anything wrong? I would appreciate if someone could send in me in the right direction!

    Many thanks in advance,
    Ron
    Attached Files

  • #2
    You code has a certain baroque complexity. All you would have been supposed to write is

    Code:
    library( DESeq )
    tbl <- read.table( "sample_data.txt", header=TRUE, row.names=1 )
    conds <- c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" )
    # or, shorter: conds <- rep( c( "C", "T" ), each=6 )
    cds <- newCountDataSet( tbl, conds )
    cds <- estimateSizeFactors(cds)
    cds <- estimateDispersions( cds )
    res <- nbinomTest( cds, "C", "T" )
    However, this does not change anything. There is nothing significant in your data.

    To see, have a look at your five "best" miRNAs

    Code:
    > # which are the five lowest p values?
    > res[ head( order( res$pval ), 5 ), ]
               id     baseMean    baseMeanA    baseMeanB foldChange log2FoldChange       pval padj
    237   miR-993 2.452984e+00 4.166098e+00 7.398694e-01  0.1775929     -2.4933542 0.02405918    1
    204    miR-67 1.165563e+02 1.381480e+02 9.496464e+01  0.6874124     -0.5407522 0.04087454    1
    113 miR-281-2 1.611593e+05 2.044279e+05 1.178907e+05  0.5766861     -0.7941418 0.05051547    1
    115   miR-283 3.586856e+01 4.908510e+01 2.265203e+01  0.4614848     -1.1156451 0.07412905    1
    54    miR-193 2.891413e+02 2.177138e+02 3.605689e+02  1.6561603      0.7278423 0.08299859    1
    > 
    > # what are the normalized counts?
    > counts( cds, normalized=TRUE )[ head( order( res$pval ), 5 ), ]
                       C61          C62          C63          C64          C65          C66          T61          T62          T63          T64          T65          T66
    miR-993   9.796406e+00      0.00000 7.723074e-01 6.227026e+00 5.816508e+00 2.384339e+00      0.00000     0.000000 1.775529e+00 2.663688e+00      0.00000     0.000000
    miR-67    1.737229e+02    131.36554 3.552614e+01 1.307676e+02 1.900059e+02 1.674998e+02    125.68614   176.704522 5.149033e+01 1.083233e+02     54.67499    52.908566
    miR-281-2 2.646349e+05 341244.96963 2.011668e+05 1.287514e+05 1.432063e+05 1.475632e+05 104422.73840 82969.569595 1.720328e+05 1.984048e+05 100890.08972 48624.528183
    miR-283   4.441037e+01     45.13172 8.495382e+00 4.912432e+01 5.137915e+01 9.596965e+01     52.36923     6.796328 7.102115e+00 4.794638e+01     12.36130     9.336806
    miR-193   2.867081e+02    161.18472 2.618122e+02 1.536000e+02 2.927642e+02 1.502134e+02    142.14504   414.575995 5.220055e+02 4.679211e+02    236.29103   380.474834
    It does not seem that these change in any clear direction, sorry.

    Simon

    Comment


    • #3
      Dear Simon,

      Many thanks for your very quick reply! I think it is obvious that I'm new to this... Just a follow-up question: for the dataset I described earllier, should I work out differential gene expression between the treated and control samples at each timepoint using the scripts you provided above or is there a way to analyse the samples at the 3 different timepoints (treated vs controls at 6h, 24h and 48h) in one go?

      Best wishes,
      Ronny

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Essential Discoveries and Tools in Epitranscriptomics
        by seqadmin


        The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist on Modified Bases...
        Today, 07:01 AM
      • seqadmin
        Current Approaches to Protein Sequencing
        by seqadmin


        Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
        04-04-2024, 04:25 PM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 04-11-2024, 12:08 PM
      0 responses
      37 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 10:19 PM
      0 responses
      41 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 09:21 AM
      0 responses
      35 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-04-2024, 09:00 AM
      0 responses
      54 views
      0 likes
      Last Post seqadmin  
      Working...
      X