SEQanswers

Old 03-23-2014, 05:20 AM   #1
sazz
Member
 
Location: Istanbul, Turkey

Join Date: Oct 2012
Posts: 28
Error at Creating Count Table for DESeq2

I have used the TopHat-Cuffdiff pipeline so far, but I want to give DESeq2 a try. I have 2 conditions with 3 replicates each, and the aim is to find the differentially expressed genes.

For a couple of days I have been trying to use HTSeq to prepare my count files. I think I managed that, but now I am stuck at creating the count table as the DESeq2 input.

I haven't used R much so far, so I am having difficulties. Here is the problem:

Code:
> library('DESeq2')
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply,
    parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

    xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames, duplicated, eval, evalq, Filter, Find,
    get, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, Position, rank, rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unlist

Loading required package: IRanges
Loading required package: XVector
Loading required package: Rcpp
Loading required package: RcppArmadillo

> setwd("C:/Python27/SKMEL-5")
> directory<-"C:/Python27/SKMEL-5/ALL"
> sampleFiles <- grep("SKMEL-5",list.files(directory),value=TRUE)
> sampleCondition<-c("KD","KD","KD","WT","WT","WT")
> sampleTable<-data.frame(sampleName=sampleFiles, fileName=sampleFiles, condition=sampleCondition)
> sampleTable
       sampleName        fileName condition
1 SKMEL-5_I-1.txt SKMEL-5_I-1.txt        KD
2 SKMEL-5_I-2.txt SKMEL-5_I-2.txt        KD
3 SKMEL-5_I-3.txt SKMEL-5_I-3.txt        KD
4 SKMEL-5_L-1.txt SKMEL-5_L-1.txt        WT
5 SKMEL-5_L-2.txt SKMEL-5_L-2.txt        WT
6 SKMEL-5_L-3.txt SKMEL-5_L-3.txt        WT
> ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~condition)
Error in DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory,  : 
  Gene IDs (first column) differ between files.
In addition: There were 36 warnings (use warnings() to see them)
Here are the 36 warnings:

Code:
Warning messages:
1: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
2: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
3: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
4: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
5: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
7: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
8: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
9: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
10: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
11: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
12: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
13: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
14: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
15: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
16: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
17: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
18: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
19: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
20: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
21: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
22: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
23: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
24: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
25: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
26: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
27: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
28: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
29: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
30: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
31: In read.table(file.path(directory, fn)) :
  line 1 appears to contain embedded nulls
32: In read.table(file.path(directory, fn)) :
  line 2 appears to contain embedded nulls
33: In read.table(file.path(directory, fn)) :
  line 3 appears to contain embedded nulls
34: In read.table(file.path(directory, fn)) :
  line 4 appears to contain embedded nulls
35: In read.table(file.path(directory, fn)) :
  line 5 appears to contain embedded nulls
36: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
  embedded nul(s) found in input
Because it says "Gene IDs (first column) differ between files.", I have checked each file: all have the same number of rows, and I assume the first column is the same in all of them (I used the same GTF file for all of them, so it should be).
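
Rather than checking by eye, the first-column comparison can be done in R. The following is only a minimal sketch: `first_columns_match` is a hypothetical helper (not part of DESeq2), and the temporary files stand in for real HTSeq count files, which are assumed here to be headerless and tab-delimited.

```r
# Hypothetical helper: TRUE if every file has the same first column
# (same IDs, same order) as the first file in `paths`.
first_columns_match <- function(paths) {
  ids <- lapply(paths, function(p) {
    read.table(p, sep = "\t", header = FALSE, stringsAsFactors = FALSE,
               quote = "", comment.char = "")[[1]]
  })
  all(vapply(ids[-1], identical, logical(1), ids[[1]]))
}

# Toy count files standing in for real HTSeq output.
tmp_dir <- tempfile("counts")
dir.create(tmp_dir)
file_a <- file.path(tmp_dir, "a.txt")
file_b <- file.path(tmp_dir, "b.txt")
file_c <- file.path(tmp_dir, "c.txt")
writeLines(c("geneA\t10", "geneB\t0"), file_a)
writeLines(c("geneA\t7", "geneB\t3"), file_b)
writeLines(c("geneA\t7", "geneC\t3"), file_c)  # second ID differs

same_ids <- first_columns_match(c(file_a, file_b))
diff_ids <- first_columns_match(c(file_a, file_c))
print(same_ids)
print(diff_ids)
```

If a check like this reports a mismatch even though the files look identical in a text editor, the cause is probably in how the files are stored (delimiter or encoding) rather than in the gene IDs themselves.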

I know the problem is at a very basic stage, but as an R noob I have no clue.

Last edited by sazz; 03-23-2014 at 05:27 AM.
Old 03-23-2014, 05:42 AM   #2
sazz
Member
 
Location: Istanbul, Turkey

Join Date: Oct 2012
Posts: 28

Solved: my files were not in tab-delimited format :/
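
A quick way to test for this before calling DESeq2 is to count the tab-separated fields on each line. This is a sketch under assumptions: `is_two_column_tsv` is a hypothetical helper name, and it assumes each count file should have exactly two columns (feature ID and count). `count.fields` is from R's utils package.

```r
# Hypothetical helper: TRUE if every line splits into exactly two
# tab-separated fields.
is_two_column_tsv <- function(path) {
  n <- count.fields(path, sep = "\t", quote = "", comment.char = "")
  length(n) > 0 && all(!is.na(n)) && all(n == 2)
}

# Toy files: one tab-delimited, one space-delimited.
good_file <- tempfile(fileext = ".txt")
bad_file <- tempfile(fileext = ".txt")
writeLines(c("geneA\t10", "geneB\t0"), good_file)
writeLines(c("geneA 10", "geneB 0"), bad_file)

ok_good <- is_two_column_tsv(good_file)
ok_bad <- is_two_column_tsv(bad_file)
print(ok_good)
print(ok_bad)
```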
Old 11-11-2014, 04:52 AM   #3
angus878
Junior Member
 
Location: London

Join Date: Nov 2014
Posts: 1
Additional answer

I had the same issue and found your post helpful. To solve it, I opened the file in Notepad and changed the encoding from Unicode to ANSI, after which it imported cleanly into R.
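
The same re-encoding can probably be done in R instead of by hand, which helps when there are many files. The sketch below assumes that Notepad's "Unicode" means UTF-16LE with a byte-order mark, and that the IDs and counts are plain ASCII (so UTF-8 output is byte-identical to ANSI). `has_nul_bytes` and `utf16le_to_utf8` are hypothetical helper names, and the temporary file stands in for a real count file.

```r
# Hypothetical helper: does the file contain NUL bytes? ASCII text stored
# as UTF-16 has one in every character, which would explain the
# "embedded nulls" warnings.
has_nul_bytes <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  any(bytes == as.raw(0))
}

# Hypothetical helper: rewrite a file assumed to be UTF-16LE as UTF-8.
utf16le_to_utf8 <- function(src, dest) {
  bytes <- readBin(src, what = "raw", n = file.size(src))
  # Drop the byte-order mark (FF FE) if present.
  if (length(bytes) >= 2 && identical(bytes[1:2], as.raw(c(0xff, 0xfe)))) {
    bytes <- bytes[-(1:2)]
  }
  text <- iconv(list(bytes), from = "UTF-16LE", to = "UTF-8")
  text <- gsub("\r\n", "\n", text, fixed = TRUE)  # normalise Windows line endings
  writeBin(charToRaw(text), dest)
  invisible(dest)
}

# Build a toy UTF-16LE file with BOM, standing in for a real count file.
src <- tempfile(fileext = ".txt")
dest <- tempfile(fileext = ".txt")
payload <- iconv("geneA\t10\r\ngeneB\t0\r\n", from = "UTF-8",
                 to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), payload), src)

nul_before <- has_nul_bytes(src)
utf16le_to_utf8(src, dest)
nul_after <- has_nul_bytes(dest)
counts <- read.table(dest, sep = "\t", header = FALSE, stringsAsFactors = FALSE)
print(counts)
```

Writing the converted file to a new path keeps the original untouched in case the encoding assumption turns out to be wrong.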