I am trying the pathway analysis from this workflow (1), but I receive an error saying that parsing a file failed. My other question: is it enough to just change species="hsa" to species="spo" for S. pombe, or do I also need to change the code upstream?
(1) RNA-Seq Data Pathway and Gene-set Analysis Workflows - Weijun Luo.
Code:
> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, param=param)
> dim(gnCnt)
[1] 7017 2
> pfh1.cnts=assay(gnCnt)
> cnts=pfh1.cnts
> dim(cnts)
[1] 7017 2
> sel.rn=rowSums(cnts) !=0
> cnts=cnts[sel.rn,]
> dim(cnts)
[1] 5968 2
> library(DESeq2)
Loading required package: Rcpp
Loading required package: RcppArmadillo
> grp.idx <- rep(c("cotnrol", "experiment")
+
> grp.idx <- rep(c("cotnrol", "experiment"))
> coldat=DataFrame(grp=factor(grp.idx))
> dds <- DESeqDataSetFromMatrix(cnts, colData=coldat, design= ~ grp)
> dds <- DESeq(dds)
estimating size factors
estimating dispersions
same number of samples and coefficients to fit, estimating dispersion by treating samples as replicates
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> deseq2.res <- results(dds)
> deseq2.fc=deseq2.res$log2FoldChange
> names(deseq2.fc)=rownames(deseq2.res)
> exp.fc=deseq2.fc
> out.suffix="deseq2"
> require(gage)
Loading required package: gage
> kg.spo=kegg.gsets("spo")
> fc.kegg.p <- gage(exp.fc, gsets=kg.spo, ref=NULL, samp=NULL)
> sel <- fc.kegg.p$greater[, "q.val"] < 0.1 & !is.na(fc.kegg.p$greater[, "q.val"])
> path.ids <- rownames(fc.kegg.p$greater)[sel]
> sel.l <- fc.kegg.p$less[, "q.val"] < 0.1 & !is.na(fc.kegg.p$less[,"q.val"])
> path.ids.1 <- rownames(fc.kegg.p$less)[sel.l]
> path.ids2 <- substr(c(path.ids, path.ids.l), 1, 8)
Error in substr(c(path.ids, path.ids.l), 1, 8) :
  error in evaluating the argument 'x' in selecting a method for function 'substr': Error: object 'path.ids.l' not found
> path.ids2 <- substr(c(path.ids, path.ids.1), 1, 8)
> require(pathview)
Loading required package: pathview
Loading required package: KEGGgraph
Loading required package: XML
Loading required package: graph
Attaching package: ‘graph’
The following object is masked from ‘package:XML’:
    addNode
The following object is masked from ‘package:Biostrings’:
    complement
Loading required package: org.Hs.eg.db
Loading required package: DBI
##############################################################################
Pathview is an open source software package distributed under GNU General Public License version 3 (GPLv3). Details of GPLv3 is available at http://www.gnu.org/licenses/gpl-3.0.html.
The pathview downloads and uses KEGG data. Academic users may freely use the KEGG website at http://www.kegg.jp/ or its mirror site at GenomeNet http://www.genome.jp/kegg/. Academic users may also freely link to the KEGG website. Non-academic users may use the KEGG website as end users for non-commercial purposes, but any other use requires a license agreement (details at http://www.kegg.jp/kegg/legal.html).
##############################################################################
> pv.out.list <- sapply(path.ids2[1:3], function(pid) pathview(gene.data= exp.fc, pathway.id = pid, species= "spo", out.suffix=out.suffix))
Getting gene ID data from KEGG...
Done with data retrieval!
[1] "Downloading xml files for spoNA, 1/1 pathways.."
[1] "Downloading png files for spoNA, 1/1 pathways.."
Download of spoNA xml and png files failed!
Failed to download KEGG xml/png files, spoNA skipped!
Getting gene ID data from KEGG...
Done with data retrieval!
Start tag expected, '<' not found
Parsing ./spoNA.xml file failed, please check the file!
Getting gene ID data from KEGG...
Done with data retrieval!
Start tag expected, '<' not found
Parsing ./spoNA.xml file failed, please check the file!
Warning message:
In download.file(png.url, png.target, quiet = T, mode = "wb") :
  cannot open: HTTP status was '404 Not Found'
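Note on the "spoNA" in the output: the pathway ID being requested is literally NA, which suggests that path.ids2 holds fewer than three entries, so path.ids2[1:3] is padded with NA. The following base-R sketch reproduces that behaviour. It assumes, purely for illustration, that no pathway passed the q.val < 0.1 cutoff; the empty vectors are hypothetical stand-ins, not values from the session above, and ids.raw and ids.safe are names introduced here.

```r
# Hypothetical stand-ins: assume no pathway passed the q-value cutoff.
path.ids   <- character(0)
path.ids.1 <- character(0)

# Same truncation step as in the session; an empty input gives an empty result.
path.ids2 <- substr(c(path.ids, path.ids.1), 1, 8)

# Indexing past the end of a vector pads the result with NA.
ids.raw <- path.ids2[1:3]
print(ids.raw)

# Pasting a species code onto NA yields the string seen in the log.
print(paste0("spo", ids.raw))

# A guarded alternative: drop NA values and take at most three IDs.
ids.safe <- head(path.ids2[!is.na(path.ids2)], 3)
if (length(ids.safe) == 0) {
  message("No pathways passed the cutoff; nothing to plot.")
}
```

If this is what happened, checking length(path.ids2) before calling pathview() would confirm it. It does not answer whether the gene sets and gene IDs passed to gage() are in the form it expects for "spo", which remains an open part of the question.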