Please excuse my ignorance as I am new to DESeq2 and differential gene expression analysis in general. Also please excuse the formatting of this post (this is my first post to this forum). I appreciate in advance any and all advice and guidance.
I am attempting to conduct a paired multi-factor analysis in DESeq2 but I seem to be detecting a lot of differentially expressed genes that are pushed into siginificance due to the presence of an outlier (see attachment of plot of normalized counts). I constructed a PCA plot (attached) and removed the labeled points and re-ran the DESeq2 analysis. I then got no significant genes. I know that DESeq2 has an internal outlier detection and replacement algorithm that requires a specified (I think default is 7) replicates. I have around 200 paired samples, ~100 pre and ~100 post treatment. In accordance with the paired multi-factor DESeq2 vignette, I have structured my sample table like so:
HTSeq_file sex condition nested
sample1 Male pre 1
sample1 Male post 1
sample2 Male pre 2
sample2 Male post 2
sample3 Female pre 1
sample3 Female post 1
... ... ... ...
When I remove the nested column, which to my understanding is the same as removing pairing, DESeq2 calls its internal filtering and outlier replacing methods. Is the inclusion of pairing making DESeq2 think that I have no replicates? How can I keep my paired design but still take advantage of DESeq2's outlier detection?
All comments are greatly appreciated.
Thanks,
Greg
I am attempting to conduct a paired multi-factor analysis in DESeq2 but I seem to be detecting a lot of differentially expressed genes that are pushed into siginificance due to the presence of an outlier (see attachment of plot of normalized counts). I constructed a PCA plot (attached) and removed the labeled points and re-ran the DESeq2 analysis. I then got no significant genes. I know that DESeq2 has an internal outlier detection and replacement algorithm that requires a specified (I think default is 7) replicates. I have around 200 paired samples, ~100 pre and ~100 post treatment. In accordance with the paired multi-factor DESeq2 vignette, I have structured my sample table like so:
HTSeq_file sex condition nested
sample1 Male pre 1
sample1 Male post 1
sample2 Male pre 2
sample2 Male post 2
sample3 Female pre 1
sample3 Female post 1
... ... ... ...
When I remove the nested column, which to my understanding is the same as removing pairing, DESeq2 calls its internal filtering and outlier replacing methods. Is the inclusion of pairing making DESeq2 think that I have no replicates? How can I keep my paired design but still take advantage of DESeq2's outlier detection?
All comments are greatly appreciated.
Thanks,
Greg
Comment