SEQanswers

Old 04-15-2014, 07:59 AM   #1
pm2012
Member
 
Location: DC

Join Date: Apr 2012
Posts: 18
Default Error Running DESeq2

Hi

I am trying to run DESeq2 following the reference manual provided on the Bioconductor website. However, I am running into the following error after this step: ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design= ~ condition) I created my count files using HTSeq.

Error msg.

Error in Ops.factor(a$V1, l[[1]]$V1) :
  level sets of factors are different
In addition: Warning message:
In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length

Any help is appreciated, as I am new to both R/Bioconductor and DESeq2.
Old 04-15-2014, 08:03 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

That's a new one. What are the contents of "sampleTable" and "condition"? Also, which version are you running (just type "sessionInfo()" and post the results)?

Last edited by dpryan; 04-15-2014 at 08:04 AM. Reason: Someday I'll reread things before posting...
Old 04-15-2014, 08:47 AM   #3
pm2012
Member
 
Location: DC

Join Date: Apr 2012
Posts: 18
Default

Thanks for your reply.
Here are the contents of sampleTable including the condition. I have 2 conditions with 2 replicates each.

  sampleNames    sampleFiles condition stringAsFactors
1      231un1  231-un1-DESeq       un1            TRUE
2      231un2  231-un2-DESeq       un1            TRUE
3     231trt1 231-trt1-DESeq      trt2            TRUE
4     231trt2 231-trt2-DESeq      trt2            TRUE

Here's the version info:

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] DESeq2_1.2.10 RcppArmadillo_0.4.200.0 Rcpp_0.11.1
[4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
[7] DESeq_1.14.0 lattice_0.20-24 locfit_1.5-9.1
[10] Biobase_2.22.0 BiocGenerics_0.8.0

loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 DBI_0.2-7
[4] genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2
[7] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2
[10] stats4_3.0.2 survival_2.37-7 XML_3.98-1.1
[13] xtable_1.7-3
Old 04-15-2014, 11:36 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

After a bit of poking around the source code, it looks like this might happen if there's something wrong with the count files, namely if they're different lengths. From the command line, you might run:

Code:
cut -f 1 231-un1-DESeq | sort | uniq -c
If you compare the result for that file to the others, the output of uniq -c, which counts how many times each gene/feature name occurs in a file, should be the same...but may not be for you.
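
A minimal sketch of that comparison for several files at once, assuming tab-separated htseq-count output with the feature name in column 1. Instead of eyeballing the uniq -c output, it compares the sorted name columns directly with cmp. The function name compare_feature_columns is made up for illustration, and the file names in the usage comment are the ones from the sampleTable above.

```shell
# Compare the feature-name column (column 1) of several count files.
# The first argument is the reference file; every other file whose sorted
# name column differs from it is reported. Returns 1 if any file differs.
compare_feature_columns() {
    ref="$1"
    shift
    ref_ids=$(mktemp)
    cut -f 1 "$ref" | sort > "$ref_ids"
    rc=0
    for f in "$@"; do
        cur_ids=$(mktemp)
        cut -f 1 "$f" | sort > "$cur_ids"
        # cmp -s is silent and only signals a difference via its exit code
        if ! cmp -s "$ref_ids" "$cur_ids"; then
            echo "MISMATCH: $f differs from $ref"
            rc=1
        fi
        rm -f "$cur_ids"
    done
    rm -f "$ref_ids"
    return "$rc"
}

# Usage (file names taken from the sampleTable in this thread):
#   compare_feature_columns 231-un1-DESeq 231-un2-DESeq 231-trt1-DESeq 231-trt2-DESeq
```

No output means every file lists exactly the same feature names; any MISMATCH line points at a file worth opening.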

BTW, you can get rid of the last column of sampleTable ("stringAsFactors").
Old 04-16-2014, 08:57 AM   #5
pm2012
Member
 
Location: DC

Join Date: Apr 2012
Posts: 18
Default

Thanks a lot for the help. It was indeed a problem with my count files. I didn't realize I had to redirect the output of HTSeq into a separate file; I was using the file generated with the -o option as input.
I reran the script and was able to generate the correct file (I also filtered out the last few lines starting with __). The rest of the code seems to be working well now.
I also got rid of the last column in sampleTable. It was just one of the many things I had tried while solving my issue.
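
For anyone hitting the same thing, a rough sketch of the fix described above. It assumes htseq-count prints the count table to standard output (its -o option writes per-read SAM records instead, which is not what DESeq2 expects) and that the summary rows start with __. The function name strip_special_counters and all file names are placeholders, not anything from my actual run.

```shell
# Hypothetical invocation: the count table goes to stdout, so redirect it.
#   htseq-count [options] alignments.sam annotation.gtf > sample_counts_raw.txt

# Drop the summary rows (e.g. __no_feature, __ambiguous) from a count file
# and print the remaining gene rows to stdout.
strip_special_counters() {
    grep -v '^__' "$1"
}

# Usage:
#   strip_special_counters sample_counts_raw.txt > sample_counts.txt
```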
Old 08-05-2014, 07:56 AM   #6
essepf
Junior Member
 
Location: Switzerland

Join Date: Aug 2014
Posts: 5
Default

Hello, I have more or less the same problem:

> ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)
Error in Ops.factor(a$V1, l[[1]]$V1) :
level sets of factors are different

The error is always about the factors, but I don't understand why.

I have sampleCondition defined as a factor, and sampleTable=data.frame(sampleName=sampleFiles, fileName=sampleFiles,condition=sampleCondition)

des <- formula(~ condition)

I do not know if you can help with this error.

Thank you very much
Old 08-05-2014, 09:00 AM   #7
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333
Default

hi essepf,

Did you check the length of the count files, as Devon recommended above?

What does sampleCondition look like?

What's your sessionInfo()
Old 08-05-2014, 09:51 AM   #8
essepf
Junior Member
 
Location: Switzerland

Join Date: Aug 2014
Posts: 5
Default DESeq2 --- htseq-count

Hi Michael

Thank you for your suggestion......

Quote:
Originally Posted by Michael Love View Post
hi essepf,

Did you check the length of the count files, as Devon recommended above?

Yes, all files have 38932 lines.

What does sampleCondition look like?

> sampleCondition
[1] pr pr pr wt wt wt
Levels: pr wt


What's your sessionInfo()

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] BiocInstaller_1.14.2 gplots_2.14.1 ggplot2_1.0.0 DESeq_1.16.0 lattice_0.20-29 locfit_1.5-9.1 Biobase_2.24.0
[8] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.4 GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.26.0 DBI_0.2-7 KernSmooth_2.23-12 MASS_7.3-33 RColorBrewer_1.0-5 RSQLite_0.11.4 XML_3.98-1.1 XVector_0.4.0
[9] annotate_1.42.1 bitops_1.0-6 caTools_1.17 colorspace_1.2-4 digest_0.6.4 gdata_2.13.3 genefilter_1.46.1 geneplotter_1.42.0
[17] grid_3.1.0 gtable_0.1.2 gtools_3.4.1 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 reshape2_1.4 scales_0.2.4
[25] splines_3.1.0 stats4_3.1.0 stringr_0.6.2 survival_2.37-7 tools_3.1.0 xtable_1.7-3

Old 08-05-2014, 10:51 AM   #9
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

What happens if you do this instead: des <- ~sampleCondition
Old 08-05-2014, 11:15 AM   #10
essepf
Junior Member
 
Location: Switzerland

Join Date: Aug 2014
Posts: 5
Default

Hi

this is my script:

library("DESeq2")
sampleFiles <- list.files(path="/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
sampleCondition=factor(c(rep("pr",3), rep("wt",3)))
sampleTable=data.frame(sampleName=sampleFiles, fileName=sampleFiles,condition=sampleCondition)
directory <- c("/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
des <- formula(~ condition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)

If I do what you suggest, I get exactly the same error.

> ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)
Error in Ops.factor(a$V1, l[[1]]$V1) :
level sets of factors are different

Thank you
Old 08-06-2014, 05:36 AM   #11
pm2012
Member
 
Location: DC

Join Date: Apr 2012
Posts: 18
Default

Did you check the count files generated by HTSeq? I had an issue with the count file itself, which is why I was getting the error. The count files need to be filtered. See my previous reply in this thread.
Old 08-06-2014, 06:00 AM   #12
essepf
Junior Member
 
Location: Switzerland

Join Date: Aug 2014
Posts: 5
Default

Hi pm2012

Thanks for your reply.

The output files I have are in this format:

gene reads_WR1
610005C13Rik 2473
0610007N19Rik 15
0610007P14Rik 1291
0610008F07Rik 149
0610009B14Rik 0
0610009B22Rik 361
0610009D07Rik 272
0610009E02Rik 4
0610009L18Rik 8

When you say "filtered", what are you referring to?

The command I used to generate the counts:

samtools view file.bam | htseq-count -s no -i gene_name - mus_musculus.gff > WT_results_counts.txt

Thank you for your help
Old 08-06-2014, 06:02 AM   #13
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

You might just post those files somewhere so we can reproduce and track down the cause of this problem.
Old 08-06-2014, 06:04 AM   #14
essepf
Junior Member
 
Location: Switzerland

Join Date: Aug 2014
Posts: 5
Default

Hi dpryan

I can send you 2 output files from htseq-count by email, would that work?
Could you give me your email address?
Old 08-06-2014, 06:06 AM   #15
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

Sure, at least as long as those 2 files are sufficient to cause the problem. You can email me at [email protected].
Old 08-06-2014, 06:50 AM   #16
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,478
Default

Just to keep the group in the loop, there ended up being two problems. The error message posted here was due to an apparent typo in one of the count files. Fixing that solved that problem. There was an additional issue due to a header line having been added (I don't know if this was done by htseq-count or not, I should have asked). Removing that allowed for the creation of a proper DESeqDataSet object.
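
A rough way to catch malformed lines before loading the files into R, assuming each valid line is exactly a feature name, a tab, and a non-negative integer count. This flags things like an added header line; a misspelled gene name still has the right shape, so that kind of typo needs the cross-file comparison of column 1 suggested earlier in the thread. check_count_file is a hypothetical helper name.

```shell
# Print every line of a count file that is not "<name><TAB><integer>",
# prefixed with the file name and line number. Prints nothing for a clean file.
check_count_file() {
    awk -F '\t' 'NF != 2 || $2 !~ /^[0-9]+$/ {
        printf "%s:%d: %s\n", FILENAME, NR, $0
    }' "$1"
}

# Usage:
#   check_count_file WT_results_counts.txt
```

If the only line reported is a header on line 1, something like tail -n +2 will drop it.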
Old 08-06-2014, 09:43 AM   #17
Michael Love
Senior Member
 
Location: Boston

Join Date: Jul 2013
Posts: 333
Default

Thank you, Devon. Good to know.
Old 06-17-2015, 09:07 PM   #18
antoshka
Junior Member
 
Location: Irvine, CA

Join Date: Jun 2015
Posts: 3
Default

Quote:
Originally Posted by pm2012 View Post
Thanks a lot for the help. It was indeed a problem with my count files. I didn't realize I had to redirect the output of HTSeq into a separate file; I was using the file generated with the -o option as input.
I reran the script and was able to generate the correct file (I also filtered out the last few lines starting with __). The rest of the code seems to be working well now.
I also got rid of the last column in sampleTable. It was just one of the many things I had tried while solving my issue.

Hello pm2012,
I am having the same problem that you had back then.
I also just used the file produced by -o option and got the same error message.
How exactly did you redirect your output file to make it compatible with DESeq2?
Thanks

Last edited by antoshka; 05-09-2016 at 07:43 PM. Reason: typo
Old 05-09-2016, 07:20 PM   #19
ronaldrcutler
Member
 
Location: Virginia

Join Date: May 2016
Posts: 80
Default DESeqDataSet creation error

Hi, I'm new to DESeq.

I am having a problem similar to the one discussed here. This is the error:
Code:
Error in Ops.factor(a$V1, l[[1]]$V1) : 
  level sets of factors are different
In addition: Warning message:
In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length
This is the script I am using:
Code:
library("DESeq2")

files = c("merged_sample_2.bam_htseq_out.txt",
          "merged_sample_11.bam_htseq_out.txt",
          "merged_sample_20.bam_htseq_out.txt",
          "merged_sample_3.bam_htseq_out.txt",
          "merged_sample_12.bam_htseq_out.txt",
          "merged_sample_21.bam_htseq_out.txt")

cond = c("GFP","GFP","GFP","DBM","DBM","DBM")

sTable = data.frame(sampleName = files, fileName = files, condition = cond)

dds <- DESeqDataSetFromHTSeqCount(sampleTable = sTable,
                                  directory = "/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/DESeq",
                                  design = ~condition)
I also tried running this code from the command line as mentioned above:
Code:
cut -f merged_sample_2.bam_htseq_out.txt | sort | uniq -c
But got this error:
Code:
cut: [-cf] list: illegal list value
Any help would be appreciated. Thanks!

Last edited by ronaldrcutler; 05-09-2016 at 07:35 PM.
Old 05-09-2016, 08:09 PM   #20
antoshka
Junior Member
 
Location: Irvine, CA

Join Date: Jun 2015
Posts: 3
Default

I think the error you describe may be due to a mismatch between the count tables that you provided (e.g. a different number of rows, non-unique rows, typos).
How did you generate your input files? What do your count txt files look like?
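
On the cut error above: -f expects a field list, so in "cut -f merged_sample_2.bam_htseq_out.txt" the file name was taken as the list, hence "illegal list value". With the field number included the check runs. A small sketch, assuming tab-separated count files; summarize_features is a made-up helper that reports the row count and how many feature names occur more than once.

```shell
# Corrected form of the command from earlier in the thread (note the "1"):
#   cut -f 1 merged_sample_2.bam_htseq_out.txt | sort | uniq -c

# Print the number of rows and the number of duplicated feature names
# (names in column 1 that appear more than once) for one count file.
summarize_features() {
    rows=$(wc -l < "$1" | tr -d ' ')
    dups=$(cut -f 1 "$1" | sort | uniq -d | wc -l | tr -d ' ')
    echo "$1 rows=$rows duplicated_names=$dups"
}

# Usage: run it on every file and compare the lines by eye.
#   for f in merged_sample_*.bam_htseq_out.txt; do summarize_features "$f"; done
```

All files should report the same number of rows and zero duplicated names; a file that stands out is the one to inspect.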
Tags
bioconductor, deseq2, rna-seq, transcriptomics
