SEQanswers

Old 10-21-2014, 11:52 AM   #1
stormin
Member
 
Location: US

Join Date: Aug 2014
Posts: 23
DESeq2 bizarre error - dates in gene list

Hey guys, I am running into a really weird problem in my RNA-seq analysis pipeline. In the first column of my final DESeq2 output, where the gene names are, a few entries are pure dates, each with corresponding DESeq2 results. It makes no sense; I even have 12/1/2014 in there. This is consistent across multiple runs of the pipeline. Has anyone else had this problem before?

Example of what I see:

HGS 3683.112663 0.130896607 0.111611495 1.17278786 0.240880888 0.932952701
2-Mar 692.3726347 0.194663432 0.165998807 1.172679702 0.240924274 0.932952701
CNTNAP3 167.220247 -0.285623039 0.243693625 -1.172057901 0.24117381 0.933557997

or:

DDX53 0 NA NA NA NA NA
DEAR 0 NA NA NA NA NA
1-Dec 0 NA NA NA NA NA
DEF6 16.53633559 0.56896377 0.269345768 2.112391716 0.034652865 NA
DEFA1 0 NA NA NA NA NA
Old 10-21-2014, 12:41 PM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

You used Excel at some point, I imagine.
Old 10-21-2014, 01:00 PM   #3
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

dpryan is right.
This is a ridiculous problem with Excel converting gene names to dates.
There is no way to turn this "feature" off.
Also, if you have saved the file in Excel, you can never recover the original gene names.
This is one of the fun parts of working as a bioinformatician, dealing with inane bugs.

Here are 3 work-arounds.

1. Never use Excel. Very tempting, but unfortunately Excel is still the most popular application to view spreadsheets.

2. Write directly to Excel from within your DESeq2 script, for example with the R package xlsx.
This is what I do.

3. Instead of opening the file directly, first define the format of the first column as text, and then import the data.

It is difficult to believe, though, that the wealthiest software company in the world is not able to add an option to prevent character strings from being converted to dates.
Old 10-21-2014, 01:19 PM   #4
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

"the wealthiest software company in the world"

Microsoft makes Excel, not Apple, which is the wealthiest company in the world.

It has been a long-running joke: http://www.biomedcentral.com/1471-2105/5/80 (published 2004)

The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered.

Beware the Ides of March.

... and March 1st, and December 2nd and April 5th ...

http://www.biomedcentral.com/content...105-5-80-1.jpg
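
To see which names in a gene list are at risk before a file ever reaches Excel, something like the following could be used. It is only a minimal sketch: the pattern is a rough approximation of month-like symbols (MARCH2, DEC1, SEPT2 and so on), not Excel's actual parsing rules, and the function name `at_risk_symbols` is made up for illustration. It does not cover the floating-point conversions of Riken-style identifiers.

```python
import re

# Month-like prefix followed by a day number (1-31), e.g. MARCH2, SEPT2, DEC1.
# Assumption: this only approximates what Excel treats as a date.
AT_RISK = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)"
    r"-?([1-9]|[12]\d|3[01])$",
    re.IGNORECASE,
)

def at_risk_symbols(symbols):
    """Return the symbols that look like a month name plus a day number."""
    return [s for s in symbols if AT_RISK.match(s)]

genes = ["HGS", "MARCH2", "CNTNAP3", "DEC1", "DEF6", "SEPT2"]
print(at_risk_symbols(genes))  # ['MARCH2', 'DEC1', 'SEPT2']
```

Flagged symbols can then be double-checked after any round trip through a spreadsheet.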

Last edited by Richard Finney; 10-21-2014 at 01:23 PM.
Old 10-21-2014, 02:22 PM   #5
stormin
Member
 
Location: US

Join Date: Aug 2014
Posts: 23

Ahh, thanks for your help. Looks like Excel is indeed to blame for this error. I just checked my dataset using Python, and it looks all clear!
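
The exact check is not shown here, but a minimal sketch of one way to do it in Python might look like this. It assumes a tab-delimited results file with gene names in the first column; the pattern and the function name `find_date_like_names` are illustrative guesses, not an exhaustive list of what Excel can produce.

```python
import csv
import io
import re

# Forms seen in this thread: "2-Mar", "1-Dec", "12/1/2014".
# Assumption: these patterns are illustrative, not exhaustive.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(
    rf"^(\d{{1,2}}-({MONTHS})|({MONTHS})-\d{{1,2}}|\d{{1,2}}/\d{{1,2}}/\d{{2,4}})$",
    re.IGNORECASE,
)

def find_date_like_names(handle, delimiter="\t"):
    """Return (line_number, name) for first-column entries that look like dates."""
    hits = []
    reader = csv.reader(handle, delimiter=delimiter)
    for line_number, row in enumerate(reader, start=1):
        if row and DATE_LIKE.match(row[0].strip()):
            hits.append((line_number, row[0]))
    return hits

# Small in-memory sample built from the rows posted above.
sample = io.StringIO(
    "HGS\t3683.112663\t0.130896607\n"
    "2-Mar\t692.3726347\t0.194663432\n"
    "CNTNAP3\t167.220247\t-0.285623039\n"
    "1-Dec\t0\tNA\n"
)
print(find_date_like_names(sample))  # [(2, '2-Mar'), (4, '1-Dec')]
```

For a real file, pass an open file handle instead of the `io.StringIO` sample. An empty result on the file written by the pipeline, before it is opened in Excel, suggests the names were intact at that point.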

Last edited by stormin; 10-21-2014 at 02:27 PM.