I am trying the pathway analysis from this workflow (1). I receive an error saying that parsing a file failed (full session below). My other question: is it enough to just change species="hsa" to species="spo" for S. pombe, or do I also need to change the code upstream?
(1) RNA-Seq Data Pathway and Gene-set Analysis Workflows - Weijun Luo.
Code:
> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, param=param)
> dim(gnCnt)
[1] 7017 2
> pfh1.cnts=assay(gnCnt)
> cnts=pfh1.cnts
> dim(cnts)
[1] 7017 2
> sel.rn=rowSums(cnts) !=0
> cnts=cnts[sel.rn,]
> dim(cnts)
[1] 5968 2
> library(DESeq2)
Loading required package: Rcpp
Loading required package: RcppArmadillo
> grp.idx <- rep(c("control", "experiment"))
> coldat=DataFrame(grp=factor(grp.idx))
> dds <- DESeqDataSetFromMatrix(cnts, colData=coldat, design= ~ grp)
> dds <- DESeq(dds)
estimating size factors
estimating dispersions
same number of samples and coefficients to fit, estimating dispersion by treating samples as replicates
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> deseq2.res <- results(dds)
> deseq2.fc=deseq2.res$log2FoldChange
> names(deseq2.fc)=rownames(deseq2.res)
> exp.fc=deseq2.fc
> out.suffix="deseq2"
> require(gage)
Loading required package: gage
> kg.spo=kegg.gsets("spo")
> fc.kegg.p <- gage(exp.fc, gsets=kg.spo, ref=NULL, samp=NULL)
> sel <- fc.kegg.p$greater[, "q.val"] < 0.1 & !is.na(fc.kegg.p$greater[, "q.val"])
> path.ids <- rownames(fc.kegg.p$greater)[sel]
> sel.l <- fc.kegg.p$less[, "q.val"] < 0.1 & !is.na(fc.kegg.p$less[,"q.val"])
> path.ids.1 <- rownames(fc.kegg.p$less)[sel.l]
> path.ids2 <- substr(c(path.ids, path.ids.l), 1, 8)
Error in substr(c(path.ids, path.ids.l), 1, 8) :
error in evaluating the argument 'x' in selecting a method for function 'substr': Error: object 'path.ids.l' not found
> path.ids2 <- substr(c(path.ids, path.ids.1), 1, 8)
> require(pathview)
Loading required package: pathview
Loading required package: KEGGgraph
Loading required package: XML
Loading required package: graph
Attaching package: ‘graph’
The following object is masked from ‘package:XML’:
addNode
The following object is masked from ‘package:Biostrings’:
complement
Loading required package: org.Hs.eg.db
Loading required package: DBI
##############################################################################
Pathview is an open source software package distributed under GNU General
Public License version 3 (GPLv3). Details of GPLv3 is available at
http://www.gnu.org/licenses/gpl-3.0.html.
The pathview downloads and uses KEGG data. Academic users may freely use the
KEGG website at http://www.kegg.jp/ or its mirror site at GenomeNet
http://www.genome.jp/kegg/. Academic users may also freely link to the KEGG
website. Non-academic users may use the KEGG website as end users for
non-commercial purposes, but any other use requires a license agreement
(details at http://www.kegg.jp/kegg/legal.html).
##############################################################################
> pv.out.list <- sapply(path.ids2[1:3], function(pid) pathview(gene.data= exp.fc, pathway.id = pid, species= "spo", out.suffix=out.suffix))
Getting gene ID data from KEGG...
Done with data retrieval!
[1] "Downloading xml files for spoNA, 1/1 pathways.."
[1] "Downloading png files for spoNA, 1/1 pathways.."
Download of spoNA xml and png files failed!
Failed to download KEGG xml/png files, spoNA skipped!
Getting gene ID data from KEGG...
Done with data retrieval!
Start tag expected, '<' not found
Parsing ./spoNA.xml file failed, please check the file!
Getting gene ID data from KEGG...
Done with data retrieval!
Start tag expected, '<' not found
Parsing ./spoNA.xml file failed, please check the file!
Warning message:
In download.file(png.url, png.target, quiet = T, mode = "wb") :
cannot open: HTTP status was '404 Not Found'
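The name spoNA in the messages above suggests that the pathway ID passed to pathview() was NA. Below is a minimal sketch of how that could happen, assuming that no pathway passed the q.val < 0.1 cutoff so that path.ids and path.ids.1 were both empty (I have not confirmed this for my data). The empty vectors and the name safe.ids are made up for illustration; only base R functions are used.

```r
# Assumption: no pathway passed the cutoff, so both ID vectors are empty.
path.ids   <- character(0)
path.ids.1 <- character(0)
path.ids2  <- substr(c(path.ids, path.ids.1), 1, 8)

# Indexing past the end of a vector returns NA instead of raising an error.
requested <- path.ids2[1:3]
print(requested)

# Pasting a species code onto NA produces the name seen in the log.
print(paste0("spo", requested))

# Guarded selection (hypothetical helper name): keep at most three IDs
# and drop missing ones before calling pathview().
safe.ids <- head(path.ids2[!is.na(path.ids2)], 3)
if (length(safe.ids) == 0) {
  message("No pathways passed the cutoff; nothing to pass to pathview().")
}
```

If that assumption holds, checking length(path.ids2) before indexing with [1:3] would show whether any significant pathways exist at all.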