SEQanswers

Old 02-29-2016, 08:58 AM   #1
spabinger
Member
 
Location: Europe

Join Date: Jun 2011
Posts: 13
DESeq2 merged vs unmerged replicates

Hi,

I am currently analyzing an RNA-seq experiment with 2 conditions. Each condition has several biological replicates, and each replicate was sequenced multiple times (on different lanes and even different flowcells). For each sub-replicate I have a separate fastq file.

Control 1-1 Treatment 1-1
Control 1-2 Treatment 1-2
Control 1-3 Treatment 1-3
Control 2-1 Treatment 2-1
Control 2-2 Treatment 2-2
Control 2-3 Treatment 2-3
...

I have now performed 2 analyses:
1) Fastq -> STAR -> htseq-count -> DESeq2
2) Fastq -> Merge the fastqs of each replicate (1-1+1-2+1-3) -> STAR -> htseq-count -> DESeq2

I checked the count files and the individual counts of the replicates sum up to the merged count file.
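The check described above can be sketched roughly as follows. This is only an illustration, assuming the usual htseq-count output layout (one `gene_id<TAB>count` row per gene, plus summary rows whose names start with `__`). The file contents are inlined as strings, the counts are invented, and the helper names `parse_htseq`, `sum_runs` and `mismatches` are hypothetical.

```python
# Sketch: confirm that per-run htseq-count outputs add up to the merged output.
# Gene IDs are taken from the tables in this thread; all counts are invented.


def parse_htseq(text):
    """Parse htseq-count style text (gene_id<TAB>count) into a dict.

    Summary rows such as __no_feature start with '__' and are skipped.
    """
    counts = {}
    for line in text.strip().splitlines():
        gene, value = line.split("\t")
        if gene.startswith("__"):
            continue
        counts[gene] = int(value)
    return counts


def sum_runs(runs):
    """Add up per-gene counts over several sequencing runs of one sample."""
    total = {}
    for run in runs:
        for gene, n in run.items():
            total[gene] = total.get(gene, 0) + n
    return total


def mismatches(runs, merged):
    """Return {gene: (summed, merged)} for genes where the two disagree."""
    summed = sum_runs(runs)
    genes = set(summed) | set(merged)
    return {
        g: (summed.get(g, 0), merged.get(g, 0))
        for g in genes
        if summed.get(g, 0) != merged.get(g, 0)
    }


# Three runs of one biological replicate (e.g. Control 1-1, 1-2, 1-3).
runs = [
    parse_htseq("ENSG00000135250.12\t120\nENSG00000108821.9\t95\n__no_feature\t1000\n"),
    parse_htseq("ENSG00000135250.12\t131\nENSG00000108821.9\t102\n__no_feature\t1100\n"),
    parse_htseq("ENSG00000135250.12\t118\nENSG00000108821.9\t99\n__no_feature\t950\n"),
]
# Counts obtained from the merged fastq of the same replicate.
merged = parse_htseq("ENSG00000135250.12\t369\nENSG00000108821.9\t296\n__no_feature\t3050\n")

print(mismatches(runs, merged))  # an empty dict means the runs sum to the merged counts
```

With real data, each string would be replaced by the contents of the corresponding count file.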

However, when I now run DESeq2 on the two count data sets, I get different results (most notably in the adjusted p-values).

Not merged:
Code:
                    baseMean log2FoldChange      lfcSE      stat        pvalue         padj
                   <numeric>      <numeric>  <numeric> <numeric>     <numeric>    <numeric>
ENSG00000143369.10 479.26267      6.1618396 0.28753411  21.42994 7.026598e-102 2.636520e-97
ENSG00000135250.12 381.34094      0.7005228 0.03782914  18.51807  1.476331e-76 2.769745e-72
ENSG00000163898.5   62.74118      5.4375752 0.30318159  17.93504  6.281376e-72 7.856327e-68
ENSG00000154269.10  65.19322      3.9433545 0.23587129  16.71825  9.651700e-63 9.053777e-59
ENSG00000108821.9  306.78844      3.0304242 0.18264854  16.59156  8.020919e-62 6.019218e-58
Merged:
Code:
                     baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                    <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
ENSG00000077327.11   63.88176     -3.1465321 0.5082833 -6.190509 5.997039e-10 1.553113e-05
ENSG00000222022.1    13.70911      3.2297638 0.5455883  5.919782 3.223688e-09 4.174354e-05
ENSG00000135250.12 3787.92187      0.7295915 0.1324829  5.507063 3.648689e-08 3.149791e-04
ENSG00000108821.9  2925.22227      2.4831871 0.4592653  5.406869 6.413598e-08 4.152484e-04
ENSG00000110427.10   74.67557      2.7264235 0.5144769  5.299409 1.161784e-07 6.017577e-04

Could you please tell me if that is expected?
Which approach should I use?

Thanks for your help,
Stephan
Old 02-29-2016, 01:51 PM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

The results differ because, in the unmerged analysis, DESeq2 treats your technical replicates as biological replicates, which overstates the number of independent samples. If you really want to keep them separate, the best you can do is add a factor for the flow cell to the design. I have yet to see a case in the last few years where there is a lane batch effect (other than getting fewer reads, but that is not a batch effect). In practice, you can just make a PCA plot, and if there is no real batch effect, go with the merged results.
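A toy calculation may help show why the unmerged p-values are so much smaller. It is only a sketch: it uses a Welch t statistic on log2 counts as a stand-in, not the negative binomial model that DESeq2 actually fits, and all counts and names (`control`, `treated`, `welch`) are invented for illustration. The pattern matches the tables in the first post, where ENSG00000135250.12 has nearly the same log2FoldChange in both runs (0.70 vs 0.73) but a much smaller lfcSE in the unmerged one (0.038 vs 0.13).

```python
import math
from statistics import mean, variance

# Invented counts per technical run, grouped by biological replicate.
control = {"C1": [100, 104, 96], "C2": [140, 137, 143], "C3": [120, 118, 122]}
treated = {"T1": [200, 205, 195], "T2": [300, 296, 304], "T3": [250, 252, 248]}


def log2_unmerged(group):
    """Treat every technical run as if it were its own sample."""
    return [math.log2(c) for runs in group.values() for c in runs]


def log2_merged(group):
    """Sum the technical runs so each biological replicate is one sample."""
    return [math.log2(sum(runs)) for runs in group.values()]


def welch(a, b):
    """Return (difference of means, standard error, Welch t statistic)."""
    diff = mean(b) - mean(a)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, se, diff / se


lfc_unmerged, se_unmerged, t_unmerged = welch(log2_unmerged(control), log2_unmerged(treated))
lfc_merged, se_merged, t_merged = welch(log2_merged(control), log2_merged(treated))

print("unmerged: lfc=%.3f se=%.3f t=%.2f" % (lfc_unmerged, se_unmerged, t_unmerged))
print("merged:   lfc=%.3f se=%.3f t=%.2f" % (lfc_merged, se_merged, t_merged))
```

The estimated fold change barely moves, but the unmerged standard error shrinks because nine near-identical values per group look like nine independent samples. Summing the runs per biological replicate, as `log2_merged` does, avoids this. In R, DESeq2's `collapseReplicates` function performs a comparable summation on a DESeqDataSet.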
Old 03-01-2016, 02:27 AM   #3
spabinger
Member
 
Location: Europe

Join Date: Jun 2011
Posts: 13

Thanks for the explanation.

In my case, I'll go for the merged files, correct?

Thanks,
Stephan
Old 03-01-2016, 02:37 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,479

Correct

Ignore me: this message is too short otherwise.

Tags
analysis, countdata, deseq, rnaseq
