Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deseq using replicates and time-matched controls

    Hi All,

    I have a question regarding Deseq and have the following miRNA dataset (4 timepoints, 2 conditions with 6 biological replicates each; 42 samples in total):

    Control (C) t=0 (n=6)
    Control (C) t=6h (n=6)
    Control (C) t=24h (n=6)
    Control (C) t=48h (n=6)

    Treated (T) t=6h (n=6)
    Treated (T) t=24h (n=6)
    Treated (T) t=48h (n=6)

    I would like to find differential miRNA expression as a result of the treatment at each time point using the time-matched controls. Would this be possible using Deseq?

    To identify differences in miRNA expression as a result of the treatment after 6h, I created the attached file (see below).

    I followed the vignette (2012-05-02) and used the following script:

    library( "DESeq" )
    countTable<- read.table("./test_deseq.txt",row.names=1, header=TRUE,)
    mirnaDesign <- data.frame( row.names = colnames( countTable ), condition = c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" ), libType = c( "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end" ) )
    mirnaDesign

    singleSamples <- mirnaDesign$libType == "single-end"
    countTable <- countTable[ , singleSamples ]
    conds <- mirnaDesign$condition[ singleSamples ]

    library( "DESeq" )
    cds <- newCountDataSet( countTable, conds )
    head( counts(cds) )
    cds <- estimateSizeFactors( cds )
    sizeFactors( cds )

    head( counts( cds, normalized=TRUE ) )
    cds <- estimateDispersions( cds )
    str( fitInfo(cds) )
    # plotDispEsts( cds ) <-- for some reason this did not work!!!

    res <- nbinomTest( cds, "C", "T" )
    head(res)

    plotDE <- function( res )
    plot(res$baseMean, res$log2FoldChange, log="x", pch=20, cex=.3, col = ifelse( res$padj < .1, "red", "black" ) )
    plotDE( res )

    hist(res$pval, breaks=100, col="skyblue", border="slateblue", main="")
    resSig <- res[ res$padj < 0.1, ]

    head( resSig[ order( resSig$foldChange, -resSig$baseMean ), ] )
    head( resSig[ order( -resSig$foldChange, -resSig$baseMean ), ] )
    write.table( res, file="results_sample_data.txt" )


    Unfortunately, it didn't find any significantly altered miRNAs and I'm wondering if I'm doing anything wrong? I would appreciate if someone could send in me in the right direction!

    Many thanks in advance,
    Ron
    Attached Files

  • #2
    You code has a certain baroque complexity. All you would have been supposed to write is

    Code:
    library( DESeq )
    tbl <- read.table( "sample_data.txt", header=TRUE, row.names=1 )
    conds <- c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" )
    # or, shorter: conds <- rep( c( "C", "T" ), each=6 )
    cds <- newCountDataSet( tbl, conds )
    cds <- estimateSizeFactors(cds)
    cds <- estimateDispersions( cds )
    res <- nbinomTest( cds, "C", "T" )
    However, this does not change anything. There is nothing significant in your data.

    To see, have a look at your five "best" miRNAs

    Code:
    > # which are the five lowest p values?
    > res[ head( order( res$pval ), 5 ), ]
               id     baseMean    baseMeanA    baseMeanB foldChange log2FoldChange       pval padj
    237   miR-993 2.452984e+00 4.166098e+00 7.398694e-01  0.1775929     -2.4933542 0.02405918    1
    204    miR-67 1.165563e+02 1.381480e+02 9.496464e+01  0.6874124     -0.5407522 0.04087454    1
    113 miR-281-2 1.611593e+05 2.044279e+05 1.178907e+05  0.5766861     -0.7941418 0.05051547    1
    115   miR-283 3.586856e+01 4.908510e+01 2.265203e+01  0.4614848     -1.1156451 0.07412905    1
    54    miR-193 2.891413e+02 2.177138e+02 3.605689e+02  1.6561603      0.7278423 0.08299859    1
    > 
    > # what are the normalized counts?
    > counts( cds, normalized=TRUE )[ head( order( res$pval ), 5 ), ]
                       C61          C62          C63          C64          C65          C66          T61          T62          T63          T64          T65          T66
    miR-993   9.796406e+00      0.00000 7.723074e-01 6.227026e+00 5.816508e+00 2.384339e+00      0.00000     0.000000 1.775529e+00 2.663688e+00      0.00000     0.000000
    miR-67    1.737229e+02    131.36554 3.552614e+01 1.307676e+02 1.900059e+02 1.674998e+02    125.68614   176.704522 5.149033e+01 1.083233e+02     54.67499    52.908566
    miR-281-2 2.646349e+05 341244.96963 2.011668e+05 1.287514e+05 1.432063e+05 1.475632e+05 104422.73840 82969.569595 1.720328e+05 1.984048e+05 100890.08972 48624.528183
    miR-283   4.441037e+01     45.13172 8.495382e+00 4.912432e+01 5.137915e+01 9.596965e+01     52.36923     6.796328 7.102115e+00 4.794638e+01     12.36130     9.336806
    miR-193   2.867081e+02    161.18472 2.618122e+02 1.536000e+02 2.927642e+02 1.502134e+02    142.14504   414.575995 5.220055e+02 4.679211e+02    236.29103   380.474834
    It does not seem that these change in any clear direction, sorry.

    Simon

    Comment


    • #3
      Dear Simon,

      Many thanks for your very quick reply! I think it is obvious that I'm new to this... Just a follow-up question: for the dataset I described earllier, should I work out differential gene expression between the treated and control samples at each timepoint using the scripts you provided above or is there a way to analyse the samples at the 3 different timepoints (treated vs controls at 6h, 24h and 48h) in one go?

      Best wishes,
      Ronny

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Current Approaches to Protein Sequencing
        by seqadmin


        Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
        04-04-2024, 04:25 PM
      • seqadmin
        Strategies for Sequencing Challenging Samples
        by seqadmin


        Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
        03-22-2024, 06:39 AM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 04-11-2024, 12:08 PM
      0 responses
      25 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 10:19 PM
      0 responses
      28 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-10-2024, 09:21 AM
      0 responses
      24 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 04-04-2024, 09:00 AM
      0 responses
      52 views
      0 likes
      Last Post seqadmin  
      Working...
      X