Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deseq using replicates and time-matched controls

    Hi All,

    I have a question regarding Deseq and have the following miRNA dataset (4 timepoints, 2 conditions with 6 biological replicates each; 42 samples in total):

    Control (C) t=0 (n=6)
    Control (C) t=6h (n=6)
    Control (C) t=24h (n=6)
    Control (C) t=48h (n=6)

    Treated (T) t=6h (n=6)
    Treated (T) t=24h (n=6)
    Treated (T) t=48h (n=6)

    I would like to find differential miRNA expression as a result of the treatment at each time point using the time-matched controls. Would this be possible using Deseq?

    To identify differences in miRNA expression as a result of the treatment after 6h, I created the attached file (see below).

    I followed the vignette (2012-05-02) and used the following script:

    library( "DESeq" )
    countTable<- read.table("./test_deseq.txt",row.names=1, header=TRUE,)
    mirnaDesign <- data.frame( row.names = colnames( countTable ), condition = c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" ), libType = c( "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end", "single-end" ) )
    mirnaDesign

    singleSamples <- mirnaDesign$libType == "single-end"
    countTable <- countTable[ , singleSamples ]
    conds <- mirnaDesign$condition[ singleSamples ]

    library( "DESeq" )
    cds <- newCountDataSet( countTable, conds )
    head( counts(cds) )
    cds <- estimateSizeFactors( cds )
    sizeFactors( cds )

    head( counts( cds, normalized=TRUE ) )
    cds <- estimateDispersions( cds )
    str( fitInfo(cds) )
    # plotDispEsts( cds ) <-- for some reason this did not work!!!

    res <- nbinomTest( cds, "C", "T" )
    head(res)

    plotDE <- function( res )
    plot(res$baseMean, res$log2FoldChange, log="x", pch=20, cex=.3, col = ifelse( res$padj < .1, "red", "black" ) )
    plotDE( res )

    hist(res$pval, breaks=100, col="skyblue", border="slateblue", main="")
    resSig <- res[ res$padj < 0.1, ]

    head( resSig[ order( resSig$foldChange, -resSig$baseMean ), ] )
    head( resSig[ order( -resSig$foldChange, -resSig$baseMean ), ] )
    write.table( res, file="results_sample_data.txt" )


    Unfortunately, it didn't find any significantly altered miRNAs and I'm wondering if I'm doing anything wrong? I would appreciate if someone could send in me in the right direction!

    Many thanks in advance,
    Ron
    Attached Files

  • #2
    You code has a certain baroque complexity. All you would have been supposed to write is

    Code:
    library( DESeq )
    tbl <- read.table( "sample_data.txt", header=TRUE, row.names=1 )
    conds <- c( "C", "C", "C", "C", "C", "C", "T", "T", "T", "T", "T", "T" )
    # or, shorter: conds <- rep( c( "C", "T" ), each=6 )
    cds <- newCountDataSet( tbl, conds )
    cds <- estimateSizeFactors(cds)
    cds <- estimateDispersions( cds )
    res <- nbinomTest( cds, "C", "T" )
    However, this does not change anything. There is nothing significant in your data.

    To see, have a look at your five "best" miRNAs

    Code:
    > # which are the five lowest p values?
    > res[ head( order( res$pval ), 5 ), ]
               id     baseMean    baseMeanA    baseMeanB foldChange log2FoldChange       pval padj
    237   miR-993 2.452984e+00 4.166098e+00 7.398694e-01  0.1775929     -2.4933542 0.02405918    1
    204    miR-67 1.165563e+02 1.381480e+02 9.496464e+01  0.6874124     -0.5407522 0.04087454    1
    113 miR-281-2 1.611593e+05 2.044279e+05 1.178907e+05  0.5766861     -0.7941418 0.05051547    1
    115   miR-283 3.586856e+01 4.908510e+01 2.265203e+01  0.4614848     -1.1156451 0.07412905    1
    54    miR-193 2.891413e+02 2.177138e+02 3.605689e+02  1.6561603      0.7278423 0.08299859    1
    > 
    > # what are the normalized counts?
    > counts( cds, normalized=TRUE )[ head( order( res$pval ), 5 ), ]
                       C61          C62          C63          C64          C65          C66          T61          T62          T63          T64          T65          T66
    miR-993   9.796406e+00      0.00000 7.723074e-01 6.227026e+00 5.816508e+00 2.384339e+00      0.00000     0.000000 1.775529e+00 2.663688e+00      0.00000     0.000000
    miR-67    1.737229e+02    131.36554 3.552614e+01 1.307676e+02 1.900059e+02 1.674998e+02    125.68614   176.704522 5.149033e+01 1.083233e+02     54.67499    52.908566
    miR-281-2 2.646349e+05 341244.96963 2.011668e+05 1.287514e+05 1.432063e+05 1.475632e+05 104422.73840 82969.569595 1.720328e+05 1.984048e+05 100890.08972 48624.528183
    miR-283   4.441037e+01     45.13172 8.495382e+00 4.912432e+01 5.137915e+01 9.596965e+01     52.36923     6.796328 7.102115e+00 4.794638e+01     12.36130     9.336806
    miR-193   2.867081e+02    161.18472 2.618122e+02 1.536000e+02 2.927642e+02 1.502134e+02    142.14504   414.575995 5.220055e+02 4.679211e+02    236.29103   380.474834
    It does not seem that these change in any clear direction, sorry.

    Simon

    Comment


    • #3
      Dear Simon,

      Many thanks for your very quick reply! I think it is obvious that I'm new to this... Just a follow-up question: for the dataset I described earllier, should I work out differential gene expression between the treated and control samples at each timepoint using the scripts you provided above or is there a way to analyse the samples at the 3 different timepoints (treated vs controls at 6h, 24h and 48h) in one go?

      Best wishes,
      Ronny

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Advancing Precision Medicine for Rare Diseases in Children
        by seqadmin




        Many organizations study rare diseases, but few have a mission as impactful as Rady Children’s Institute for Genomic Medicine (RCIGM). “We are all about changing outcomes for children,” explained Dr. Stephen Kingsmore, President and CEO of the group. The institute’s initial goal was to provide rapid diagnoses for critically ill children and shorten their diagnostic odyssey, a term used to describe the long and arduous process it takes patients to obtain an accurate...
        12-16-2024, 07:57 AM
      • seqadmin
        Recent Advances in Sequencing Technologies
        by seqadmin



        Innovations in next-generation sequencing technologies and techniques are driving more precise and comprehensive exploration of complex biological systems. Current advancements include improved accessibility for long-read sequencing and significant progress in single-cell and 3D genomics. This article explores some of the most impactful developments in the field over the past year.

        Long-Read Sequencing
        Long-read sequencing has seen remarkable advancements,...
        12-02-2024, 01:49 PM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 12-17-2024, 10:28 AM
      0 responses
      27 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 12-13-2024, 08:24 AM
      0 responses
      43 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 12-12-2024, 07:41 AM
      0 responses
      29 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 12-11-2024, 07:45 AM
      0 responses
      42 views
      0 likes
      Last Post seqadmin  
      Working...
      X