  • Contradictory log-fold-change figures from DESeq?

    Hello, I have made a couple of figures in DESeq. The first should plot genes with adjusted p-value <0.05 in red and genes with adjusted p-value >0.05 in grey, and was made with:

    plotMA(res,
           col = ifelse(res$padj>0.05, "gray69", "red3"),
           linecol = "#ff000080",
           xlab = "mean of normalized counts", ylab = expression(log[2]~fold~change),
           log = "x", cex=0.45)
    dev.off()


    The second should just plot genes that have an adjusted p-value <0.05 (i.e. all the red ones from the first figure):

    resSig = res[res$padj <0.05, ]
    plotMA(resSig)


    And yet there are red points in the first (padj<0.05) that are not present in the second! Can anybody explain this to me? I have attached the two figures.

    And on a related note, is it better to use the p-value or adjusted p-value that DESeq gives you?

    Many thanks

    Alex

  • #2
    Always use the adjusted p-value.
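To illustrate why the adjusted value is the safer cutoff, here is a minimal base-R sketch on simulated p-values (not the poster's data). It assumes a Benjamini-Hochberg adjustment via p.adjust(), the same kind of adjustment DESeq reports in padj; all variable names are made up for the example.

```r
# Simulated example (assumption): 1000 genes with no real effect,
# so their p-values are uniform on [0, 1].
set.seed(1)
pval <- runif(1000)

# Benjamini-Hochberg adjustment; p.adjust() is part of base R (stats).
padj <- p.adjust(pval, method = "BH")

# Count how many pure-noise genes pass a 0.05 cutoff on each scale.
n_raw <- sum(pval < 0.05)   # roughly 5% of the genes are expected here
n_adj <- sum(padj < 0.05)   # never more than n_raw
cat("raw p < 0.05:", n_raw, " adjusted p < 0.05:", n_adj, "\n")
```

With thousands of genes tested, a raw cutoff lets through a steady stream of false positives, whereas the adjusted value accounts for the number of tests.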

    The differences look to be simply due to the different scales on the graphs and perhaps the fact that you're using a slightly different coloring/inclusion threshold in each.

    If you instead did something like:
    Code:
    # Minimal MA-style plot: genes with padj < 0.05 in red, the rest in grey
    plotDE <- function(res)
        plot(res$baseMean, res$log2FoldChange, col = ifelse(res$padj>=0.05, "gray69", "red3"), log="x", pch=6, xlim=c(0.1,1e5), ylim=c(-4,4))
    plotDE(res)
    # which() drops rows whose padj is NA
    resSig <- res[which(res$padj<0.05),]
    plotDE(resSig)
    then the two graphs should be more or less identical. The plotMA function is just a nicer version of the "plotDE" function I pasted above (in fact, if you just type "plotMA", without quotes, you'll see exactly what the function does).
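One detail in the subsetting above is easy to miss: wrapping the comparison in which() changes the result whenever padj contains NA (which can happen, for example, for genes with zero counts in every sample). The toy table below is hypothetical and only sketches the difference in base R.

```r
# Toy stand-in for a results table (hypothetical values, not real DESeq output).
res_toy <- data.frame(
  id   = c("g1", "g2", "g3", "g4"),
  padj = c(0.01, NA, 0.20, 0.03)
)

# Plain logical indexing: the NA comparison yields an extra row of NAs.
sig_logical <- res_toy[res_toy$padj < 0.05, ]

# which() keeps only the positions where the comparison is TRUE.
sig_which <- res_toy[which(res_toy$padj < 0.05), ]

print(sig_logical)
print(sig_which)
```

The first subset carries an all-NA row along with the two significant genes; the second contains the two significant genes only.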


    • #3
      Thanks, I'm still not sure what was wrong with plotMA, but the plotDE line you gave me works fine!


      Thanks

      Alex


      • #4
        Hi, I've got a related problem now: the scatterplot of ordinary vs moderated log ratios in DESeq seems to be wrong. The points seem to be upside down (picture attached and lines of code below).

        Would be grateful for any advice! (and do you know how I can retrieve the variance stabilised data into an excel/txt/csv file?)

        Thanks

        Alex

        mod_lfc = (rowMeans( exprs(vsd)[, conditions(cds)=="WT", drop=FALSE] ) -
                   rowMeans( exprs(vsd)[, conditions(cds)=="KO", drop=FALSE] ))

        lfc = res$log2FoldChange
        table(lfc[!is.finite(lfc)], useNA="always")

        logdecade = 1 + round( log10( 1+rowMeans(counts(cdsBlind, normalized=TRUE)) ) )
        lfccol = colorRampPalette( c( "gray", "blue" ) )(6)[logdecade]

        ymax = 4.5
        pdf("Ordinary vs Moderated Log ratios.pdf")
        plot( pmax(-ymax, pmin(ymax, lfc)), mod_lfc,
              xlab = "ordinary log-ratio", ylab = "moderated log-ratio",
              cex=0.45, asp=1, col = lfccol,
              pch = ifelse(lfc<(-ymax), 60, ifelse(lfc>ymax, 62, 16)))
        abline( a=0, b=1, col="red3")
        dev.off()  # close the PDF device so the file is written

        • #5
          Regarding the plot, I would check that res$log2FoldChange is log2(WT/KO) rather than log2(KO/WT), as it appears that the sign is simply flipped on one axis. Regarding writing the vsd data, the exprs() function returns a matrix, so you can just use write.csv or write.table to write it to a file.
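Both points can be sketched in base R with toy numbers. Everything below is hypothetical stand-in data (wt_mean, ko_mean, vsd_mat and the output file name are invented for the example); in the real session the matrix would come from exprs(vsd).

```r
# Toy per-gene mean transformed expression for each condition (hypothetical).
wt_mean <- c(geneA = 5.0, geneB = 7.5, geneC = 3.0)
ko_mean <- c(geneA = 6.0, geneB = 7.0, geneC = 3.0)

# The same comparison taken in the two directions differs only in sign,
# which is what makes a scatterplot look "upside down".
lfc_wt_over_ko <- wt_mean - ko_mean   # log2(WT/KO) on the log scale
lfc_ko_over_wt <- ko_mean - wt_mean   # log2(KO/WT) on the log scale
print(rbind(lfc_wt_over_ko, lfc_ko_over_wt))

# Writing a numeric matrix to CSV and reading it back.
vsd_mat  <- cbind(WT1 = wt_mean, KO1 = ko_mean)
out_file <- file.path(tempdir(), "vsd_values.csv")
write.csv(vsd_mat, file = out_file)
back <- read.csv(out_file, row.names = 1)
print(back)
```

If the moderated ratio is computed as WT minus KO while the ordinary one is KO over WT, negating either of them should bring the points back onto the diagonal.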


          • #6
            Great, both worked - re: the vsd data (the numbers are counts?): can/should this be used in the same way as the original counts data to look at DEGs (since the problem of 0 counts seems to have been dealt with), or is it just a form that the data needs to be in to produce PCAs and sample-to-sample difference heatmaps?


            Thanks, I really appreciate your help!

            Alex


            • #7
              Well, variance stabilized numbers are transformed counts. If you're interested in doing the DEG analysis in DESeq (or edgeR or other similar packages), then you'll find these sorts of transformations most useful for QC, such as PCA and heatmaps. That doesn't mean that you can't use variance stabilized data for DEG analysis. voom, which is used to transform RNAseq data for analysis by limma, is effectively doing something like this: it models the mean-variance relationship and then transforms the data so that it is closer to normally distributed and the regular microarray tools work. In practice, both approaches tend to give pretty similar results, at least in my hands.
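As a rough picture of the QC use, here is a base-R PCA on simulated counts. This is only a sketch: the counts are invented, and log2(count + 1) is used as a crude stand-in for a proper variance-stabilising transformation, so it shows the workflow rather than DESeq's actual method.

```r
set.seed(42)

# Simulated counts (assumption): 200 genes, 3 WT and 3 KO samples,
# with the first 40 genes 8-fold higher in KO.
n_genes   <- 200
cond      <- rep(c("WT", "KO"), each = 3)
base_mean <- rgamma(n_genes, shape = 2, scale = 50)
counts <- sapply(cond, function(cn) {
  mu <- base_mean
  if (cn == "KO") mu[1:40] <- mu[1:40] * 8
  rnbinom(n_genes, mu = mu, size = 10)
})
colnames(counts) <- paste0(cond, 1:3)

# Crude stand-in for a variance-stabilising transformation.
log_counts <- log2(counts + 1)

# PCA with samples as rows; prcomp() centres the data by default.
pca <- prcomp(t(log_counts))
pc1 <- pca$x[, 1]
print(round(pc1, 2))
```

In this simulation the first principal component should split the WT samples from the KO samples, which is the kind of sanity check these transformed values are typically used for.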


              • #8
                Hi,

                I only saw this thread yesterday, and have now had a close look at the problem described in the original post. In short, this is a bug: the 'col' argument of 'plotMA' is broken and messes up the assignment of colours to points. (Internally, we subset the 'res' object to get rid of genes with all-zero counts and we forgot to subset 'col', too.) I'll fix this later today.
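The mechanism described here can be reproduced with a toy table in base R. This is a sketch of the general pitfall, not the actual DESeq source; the table, its values and the variable names are all hypothetical.

```r
# Toy results table (hypothetical): gene g2 has all-zero counts.
res_toy <- data.frame(
  id       = c("g1", "g2", "g3", "g4"),
  baseMean = c(100, 0, 50, 10),
  padj     = c(0.20, NA, 0.01, 0.30)
)

# Colours computed by the user on the FULL table.
col_full <- ifelse(res_toy$padj > 0.05, "gray69", "red3")

# The table is then subset to drop all-zero genes ...
keep <- res_toy$baseMean > 0
kept <- res_toy[keep, ]

# ... but if the colour vector is not subset too, colours are paired
# with the remaining rows by position and shift out of register.
col_buggy <- col_full[seq_len(nrow(kept))]
col_fixed <- col_full[keep]

print(data.frame(id = kept$id, padj = kept$padj,
                 buggy = col_buggy, fixed = col_fixed))
```

Here g4 (padj 0.30) ends up red under the buggy pairing, while the truly significant g3 loses its colour, which matches the symptom of red points in the first figure that have no counterpart in the second.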

                Simon


                • #9
                  I fixed the issue with the 'col' argument in DESeq 1.12.1. In DESeq2, it was already correct.


                  • #10
                    Thanks - if I re-install DESeq 1 from Bioconductor, will it update the copy on my system?

                    Is there any advantage of plotMA over plotDE anyway?


                    Thanks

                    Alex


                    • #11
                      You'll have to wait a day or two for the change to propagate to the web server (or you can download it from the SVN server, if you know how to do this).

                      What is plotDE?


                      • #12
                        I thought it was part of DESeq - dpryan introduced it to me:

                        Originally posted by dpryan
                        Always use the adjusted p-value.

                        The differences look to be simply due to the different scales on the graphs and perhaps the fact that you're using a slightly different coloring/inclusion threshold in each.

                        If you instead did something like:
                        Code:
                        plotDE <- function(res)
                            plot(res$baseMean, res$log2FoldChange, col = ifelse(res$padj>=0.05, "gray69", "red3"), log="x", pch=6, xlim=c(0.1,1e5), ylim=c(-4,4))
                        plotDE(res)
                        resSig <- res[which(res$padj<0.05),]
                        plotDE(resSig)
                        then the two graphs should be more or less identical. The plotMA function is just a nicer version of the "plotDE" function I pasted above (in fact, if you just type "plotMA", without quotes, you'll see exactly what the function does).
