01-24-2017, 08:30 AM   #1
Junior Member
Location: Boston

Join Date: Jan 2017
Posts: 8
DESeq2: outlier detection problem

Please excuse my ignorance as I am new to DESeq2 and differential gene expression analysis in general. Also please excuse the formatting of this post (this is my first post to this forum). I appreciate in advance any and all advice and guidance.

I am attempting to conduct a paired multi-factor analysis in DESeq2, but many of the genes I detect as differentially expressed seem to be pushed into significance by the presence of an outlier (see the attached plot of normalized counts). I constructed a PCA plot (attached), removed the labeled points, and re-ran the DESeq2 analysis. I then got no significant genes. I know that DESeq2 has an internal outlier detection and replacement algorithm that requires a minimum number of replicates (I think the default is 7). I have around 200 paired samples, ~100 pre- and ~100 post-treatment. In accordance with the paired multi-factor section of the DESeq2 vignette, I have structured my sample table like so:

HTSeq_file  sex     condition  nested
sample1     Male    pre        1
sample1     Male    post       1
sample2     Male    pre        2
sample2     Male    post       2
sample3     Female  pre        1
sample3     Female  post       1
...         ...     ...        ...

When I remove the nested column, which to my understanding is the same as removing pairing, DESeq2 calls its internal filtering and outlier replacing methods. Is the inclusion of pairing making DESeq2 think that I have no replicates? How can I keep my paired design but still take advantage of DESeq2's outlier detection?
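One way to make the replicate question concrete is to count how many samples share each combination of design variables, with and without the nested column. The following is a minimal Python sketch of that bookkeeping only, not DESeq2 code; the `samples` list mirrors the table above, `sample4` is a hypothetical extra row added so both sexes have two individuals, and `cell_sizes` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical sample table mirroring the layout in the post:
# (individual, sex, condition, nested). "sample4" is made up for illustration.
samples = [
    ("sample1", "Male", "pre", 1),
    ("sample1", "Male", "post", 1),
    ("sample2", "Male", "pre", 2),
    ("sample2", "Male", "post", 2),
    ("sample3", "Female", "pre", 1),
    ("sample3", "Female", "post", 1),
    ("sample4", "Female", "pre", 2),
    ("sample4", "Female", "post", 2),
]


def cell_sizes(rows, use_nested):
    """Count samples that share the same combination of design variables."""
    cells = Counter()
    for _name, sex, condition, nested in rows:
        key = (sex, condition, nested) if use_nested else (sex, condition)
        cells[key] += 1
    return cells


paired = cell_sizes(samples, use_nested=True)
unpaired = cell_sizes(samples, use_nested=False)

print("largest cell with pairing:", max(paired.values()))
print("largest cell without pairing:", max(unpaired.values()))
```

With the nested column included, every combination holds exactly one sample, so no cell can grow toward a replicate threshold regardless of how many individuals are added; without it, the cells grow with the number of individuals per sex and condition.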

All comments are greatly appreciated.

Attached Images
File Type: png PCA_plot_sickle.png (49.3 KB, 17 views)
File Type: png example_outlier.png (35.2 KB, 13 views)
gstone
01-24-2017, 11:01 AM   #2
Devon Ryan
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

It looks like your biggest issue is samples that should just be excluded, rather than detecting per-gene outliers.

Anyway, the outlier detection code only gets called once you have a sufficient number of samples per group. Once you have pairing you'll never hit that, so it'll never get called. I would suggest that you just remove the obvious outlier samples and see if you get more reasonable results.
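As a rough illustration of removing outlier samples from a paired table, here is a minimal Python sketch. It assumes, beyond what is stated above, that when one sample of an individual is flagged the partner sample is dropped too, so that no individual is left with only a pre or only a post sample. The table rows, the `flagged` set, and the helper names `drop_individuals` and `incomplete_pairs` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample table: (individual, sex, condition, nested)
samples = [
    ("sample1", "Male", "pre", 1),
    ("sample1", "Male", "post", 1),
    ("sample2", "Male", "pre", 2),
    ("sample2", "Male", "post", 2),
    ("sample3", "Female", "pre", 1),
    ("sample3", "Female", "post", 1),
]


def drop_individuals(rows, flagged):
    """Remove every row (pre and post) belonging to a flagged individual."""
    return [row for row in rows if row[0] not in flagged]


def incomplete_pairs(rows):
    """Return individuals that lack either a pre or a post sample."""
    seen = defaultdict(set)
    for name, _sex, condition, _nested in rows:
        seen[name].add(condition)
    return sorted(name for name, conds in seen.items() if conds != {"pre", "post"})


# Hypothetical: suppose the PCA plot flagged one sample from "sample2".
flagged = {"sample2"}
kept = drop_individuals(samples, flagged)

print(len(kept), "samples kept")
print("incomplete pairs:", incomplete_pairs(kept))
```

After dropping individuals, the nested numbering within each sex may have gaps and might need to be regenerated before the design is rebuilt.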
dpryan
01-24-2017, 11:05 AM   #3
Junior Member
Location: Boston

Join Date: Jan 2017
Posts: 8

That's what I suspected. Thank you very much for your quick response, I appreciate the help.
gstone

bioinformatics, deseq2, differential expression