07-23-2014, 02:27 AM   #1
frymor
DESeq2 with no replicates - strange results

Hi all,

I am using the DESeq2 package to analyse my RNA-Seq data set from the fruit fly (D. melanogaster). Unfortunately there are no replicates.

I know this is not optimal and that one can't really rely on the statistical strength of the results, but we can still look into the data and rely on the fold-induction differences between the samples.

This is also the reason for my question.
I know the variance might be over-estimated, but what I do not understand is why I get strange baseMean and log2FoldChange results.

This is how I run DESeq2:
Code:
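library(DESeq2)

# Build the dataset from the raw count matrix (one column per sample, no replicates)
cds <- DESeqDataSetFromMatrix(
  countData = Comp,
  colData   = colData,
  design    = ~ condition
)
# Run normalization, dispersion estimation and the Wald test
fit <- DESeq(cds)
res <- results(fit)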
But when I look at the results, the numbers are not what I expect.
These are the raw values from my samples:
Code:
>Comp[13696:13706,]
            sample1  sample2
FBgn0085379        1    4
FBgn0085380      104  117
FBgn0085382      101  137
FBgn0085383       88  187
FBgn0085384       90  275
FBgn0085385       18   55
FBgn0085386       40   40
FBgn0085387       16  310
FBgn0085388      910 3333
FBgn0085390      192  179
FBgn0085391       96  359
and this is a snippet of the results from the "differential expression" analysis:
Code:
>res[13696:13706,]
log2 fold change (MAP): condition sample2 vs sample1 
Wald test p-value: condition sample2 vs sample1 
DataFrame with 11 rows and 6 columns
              baseMean log2FoldChange     lfcSE       stat    pvalue      padj
             <numeric>      <numeric> <numeric>  <numeric> <numeric> <numeric>
FBgn0085379   2.047768      0.1776917  1.656357  0.1072786 0.9145679  0.999346
FBgn0085380 119.997967     -1.0010375  1.365438 -0.7331255 0.4634819  0.999346
FBgn0085382 123.804622     -0.7832908  1.339541 -0.5847457 0.5587187  0.999346
FBgn0085383 128.899132     -0.2415351  1.299186 -0.1859127 0.8525132  0.999346
FBgn0085384 157.869569      0.2069421  1.275569  0.1622352 0.8711206  0.999346
...                ...            ...       ...        ...       ...       ...
FBgn0085386   44.59838     -1.0634868  1.528435 -0.6958011 0.4865534  0.999346
FBgn0085387  109.25461      2.2342308  1.536826  1.4537959 0.1460029  0.999346
FBgn0085388 1768.01176      0.4434720  1.179468  0.3759932 0.7069220  0.999346
FBgn0085390  210.03007     -1.2372886  1.341767 -0.9221335 0.3564590  0.999346
FBgn0085391  188.81235      0.4581024  1.270345  0.3606124 0.7183892  0.999346
My question concerns the values in the rows "FBgn0085380" and "FBgn0085386", just as examples.

In the raw data, the first gene shows slightly higher read counts in sample2 (104 vs 117), while the counts are equal for the second gene (40 and 40). But the results of the differential expression analysis give a different picture.
For the first gene I get a baseMean of ~120, although both raw counts are lower than that; for the second gene the picture is similar (baseMean of ~44.6). The log2FoldChange values are off in the same way.
For both genes I get a negative log2 fold change (sample2 vs sample1), i.e. a downregulation in sample2, although the number of reads is higher in sample2 for the first gene and equal in the two samples for the second.
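
To make the comparison concrete, here is a minimal sketch of what I would naively expect from the raw counts alone, with no normalization or shrinkage applied. The counts are copied from the Comp excerpt above; the names `raw`, `rawMean` and `rawLog2FC` are only for illustration and are not DESeq2 output.

```r
# Raw counts copied from the Comp[13696:13706,] excerpt above
raw <- data.frame(
  sample1 = c(1, 104, 101, 88, 90, 18, 40, 16, 910, 192, 96),
  sample2 = c(4, 117, 137, 187, 275, 55, 40, 310, 3333, 179, 359),
  row.names = c("FBgn0085379", "FBgn0085380", "FBgn0085382", "FBgn0085383",
                "FBgn0085384", "FBgn0085385", "FBgn0085386", "FBgn0085387",
                "FBgn0085388", "FBgn0085390", "FBgn0085391")
)

# Plain arithmetic mean of the two raw counts (hypothetical column, not baseMean)
raw$rawMean <- (raw$sample1 + raw$sample2) / 2

# Naive log2 ratio sample2 vs sample1 on the raw counts (no size factors, no shrinkage)
raw$rawLog2FC <- log2(raw$sample2 / raw$sample1)

# The two genes I am asking about
print(round(raw[c("FBgn0085380", "FBgn0085386"), ], 3))
```

These naive values are what I compared the baseMean and log2FoldChange columns against.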

Is there an explanation for this behaviour? Are the numbers off because I have no replicates and all the samples are treated as replicates (though this still wouldn't explain the baseMean values)?


Thanks in advance

Assa