#1 |
Member
Location: Europe Join Date: Jun 2011
Posts: 13
Hi,
I am currently analyzing an RNA-Seq experiment with 2 conditions. Each condition has several biological replicates, and each replicate was sequenced multiple times (on different lanes and even flowcells). For each sub-replicate I have a separate fastq file:

Control 1-1    Treatment 1-1
Control 1-2    Treatment 1-2
Control 1-3    Treatment 1-3
Control 2-1    Treatment 2-1
Control 2-2    Treatment 2-2
Control 2-3    Treatment 2-3
...

I have now performed 2 analyses:
1) Fastq -> STAR -> htseq count -> DESeq2
2) Fastq -> Merge Fastqs of each replicate (1-1+1-2+1-3) -> STAR -> htseq count -> DESeq2

I checked the count files, and the individual counts of the sub-replicates sum up to the merged count file. However, when I run DESeq2 on the two count data sets I get different results (most notably the adjusted p-values).

Not merged:
Code:
                      baseMean log2FoldChange      lfcSE      stat        pvalue         padj
                     <numeric>      <numeric>  <numeric> <numeric>     <numeric>    <numeric>
    ENSG00000143369.10 479.26267      6.1618396 0.28753411  21.42994 7.026598e-102 2.636520e-97
    ENSG00000135250.12 381.34094      0.7005228 0.03782914  18.51807  1.476331e-76 2.769745e-72
    ENSG00000163898.5   62.74118      5.4375752 0.30318159  17.93504  6.281376e-72 7.856327e-68
    ENSG00000154269.10  65.19322      3.9433545 0.23587129  16.71825  9.651700e-63 9.053777e-59
    ENSG00000108821.9  306.78844      3.0304242 0.18264854  16.59156  8.020919e-62 6.019218e-58

Merged:
Code:
                        baseMean log2FoldChange     lfcSE      stat       pvalue         padj
                       <numeric>      <numeric> <numeric> <numeric>    <numeric>    <numeric>
    ENSG00000077327.11   63.88176     -3.1465321 0.5082833 -6.190509 5.997039e-10 1.553113e-05
    ENSG00000222022.1    13.70911      3.2297638 0.5455883  5.919782 3.223688e-09 4.174354e-05
    ENSG00000135250.12 3787.92187      0.7295915 0.1324829  5.507063 3.648689e-08 3.149791e-04
    ENSG00000108821.9  2925.22227      2.4831871 0.4592653  5.406869 6.413598e-08 4.152484e-04
    ENSG00000110427.10   74.67557      2.7264235 0.5144769  5.299409 1.161784e-07 6.017577e-04

Could you please tell me if that is expected? Which approach should I use?

Thanks for your help,
Stephan
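The consistency check described in this post (per-lane counts summing to the merged counts) could be sketched roughly as follows. This is only an illustration in Python, not the script actually used in the thread: the function names and the toy gene IDs and counts are made up, and it assumes the usual htseq-count layout of one `gene_id<TAB>count` row per gene followed by summary rows whose names start with `__`.

```python
import io


def read_htseq_counts(handle):
    """Parse htseq-count style text into {gene_id: count}.

    Summary rows such as __no_feature (names starting with two
    underscores) are skipped because they are not gene counts.
    """
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        gene_id, value = line.split("\t")
        if gene_id.startswith("__"):
            continue
        counts[gene_id] = int(value)
    return counts


def sum_counts(tables):
    """Add up several count tables gene by gene."""
    total = {}
    for table in tables:
        for gene_id, n in table.items():
            total[gene_id] = total.get(gene_id, 0) + n
    return total


def find_mismatches(lane_tables, merged_table):
    """Return {gene_id: (summed, merged)} for genes where the totals differ."""
    summed = sum_counts(lane_tables)
    genes = set(summed) | set(merged_table)
    return {
        g: (summed.get(g, 0), merged_table.get(g, 0))
        for g in genes
        if summed.get(g, 0) != merged_table.get(g, 0)
    }


# Toy data (hypothetical): three lanes of one biological replicate,
# plus the count file obtained from the merged fastq.
lane_texts = [
    "geneA\t10\ngeneB\t0\n__no_feature\t5\n",
    "geneA\t12\ngeneB\t3\n__no_feature\t6\n",
    "geneA\t8\ngeneB\t1\n__no_feature\t6\n",
]
merged_text = "geneA\t30\ngeneB\t4\n__no_feature\t17\n"

lanes = [read_htseq_counts(io.StringIO(t)) for t in lane_texts]
merged = read_htseq_counts(io.StringIO(merged_text))

mismatches = find_mismatches(lanes, merged)
print("all genes agree" if not mismatches else mismatches)
```

With real data the `io.StringIO` objects would be replaced by opened count files. An empty result only says the two routes give the same counts; it says nothing yet about which sample layout should go into DESeq2.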
#2 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,478
The difference arises because DESeq2 treats your technical replicates as biological replicates, so in the unmerged analysis it sees many more samples with very little variation between them and reports much smaller p-values. If you really want to include them separately, then the best you can do is to add a factor for the flow cell to the design. I have yet to see an instance in the last few years where there's a lane batch effect (other than getting fewer reads... but that's not a batch effect). In practice, you might just make a PCA plot, and if there's no real batch effect then go with the merged results.
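To make the point about technical replicates concrete, here is a small toy illustration in Python. It is a deliberate simplification and not what DESeq2 does internally: it uses a plain Welch t statistic instead of DESeq2's negative binomial model, and the counts for the single gene are invented. The only thing it is meant to show is that treating lanes as independent samples shrinks the apparent variance and inflates the test statistic.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (b minus a)."""
    return (mean(b) - mean(a)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Invented counts for one gene: 3 biological replicates per condition,
# each sequenced on 3 lanes. Each inner list holds the lanes of one replicate.
control = [[100, 102, 98], [120, 118, 122], [80, 82, 78]]
treatment = [[130, 128, 132], [150, 152, 148], [110, 108, 112]]


def as_separate_samples(replicates):
    """Treat every lane as if it were its own biological replicate."""
    return [count for lanes in replicates for count in lanes]


def merged_per_replicate(replicates):
    """Sum the lanes of each biological replicate into a single count."""
    return [sum(lanes) for lanes in replicates]


t_lanes = welch_t(as_separate_samples(control), as_separate_samples(treatment))
t_merged = welch_t(merged_per_replicate(control), merged_per_replicate(treatment))

print(f"lanes as samples (n=9 per group): t = {t_lanes:.2f}")
print(f"merged replicates (n=3 per group): t = {t_merged:.2f}")
```

The group difference is the same in both cases; only the number of "samples" changes, and with it the standard error. In this toy setup the lane-level statistic comes out roughly twice as large as the merged one, which mirrors the direction of the difference between the two result tables in the first post.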
#3 |
Member
Location: Europe Join Date: Jun 2011
Posts: 13
Thanks for the explanation.
In my case, I'll go for the merged files, correct?

Thanks,
Stephan
#4 |
Devon Ryan
Location: Freiburg, Germany Join Date: Jul 2011
Posts: 3,478
Correct
Ignore me: this message is too short otherwise.