DEXSeq analysis does not complete

[email protected]

Member

Join Date: Jan 2010

Posts: 25
- Share
- Tweet
#1

DEXSeq analysis does not complete

09-21-2015, 12:47 AM

Hi all,

I am having problem in diff. exon usage analysis of my dataset. I generated count data using protocol mentioned in DEXSeq's manual. My runs seem to be halted after some time when DEXSeq function is called. I have also tried manual steps which are being called in DEXSeq function to see where the problem is. And it seems that estimateDispersions() has to o something with this problem. I am running my job on cluster with 16 cores (512GB) for 48 hours but no progress. The output remained the same as shown below without any achanges after first two hours of the run. Below is the sample of my counts data, annotation file and script output produced after 48 hours. Can anybody help me resolving the issue or what I am doing wrong here?

COUNT FILE

ENSG00000000003:001 105
ENSG00000000003:002 145
ENSG00000000003:003 139
ENSG00000000003:004 136
ENSG00000000003:005 139
ENSG00000000003:006 17
ENSG00000000003:007 33
ENSG00000000003:008 46
ENSG00000000003:009 185
.
.
.
_ambiguous 9946
_ambiguous_readpair_position 0
_empty 9520711
_lowaqual 0
_notaligned 0

##################################

FLAT FILE:

chr1 dexseq_prepare_annotation.py aggregate_gene 29554 31109 . + . gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 29554 30039 . + . transcripts "ENST00000473358"; exonic_part_number "001"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 30267 30365 . + . transcripts "ENST00000469289"; exonic_part_number "002"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 30366 30503 . + . transcripts "ENST00000607096+ENST00000469289"; exonic_part_number "003"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 30504 30563 . + . transcripts "ENST00000469289"; exonic_part_number "004"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 30564 30667 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "005"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 30976 31097 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "006"; gene_id "ENSG00000243485"
chr1 dexseq_prepare_annotation.py exonic_part 31098 31109 . + . transcripts "ENST00000469289"; exonic_part_number "007"; gene_id "ENSG00000243485"

###################################

OUTPUT script.Rout

R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
[Previously saved workspace restored]

>
> #source("https://bioconductor.org/biocLite.R")
> #biocLite("DEXSeq")
> #pythonScriptsDir = system.file( "python_scripts", package="DEXSeq" )
> #list.files(pythonScriptsDir)
> #/Library/Frameworks/R.framework/Versions/3.2/Resources/library/DEXSeq/python_scripts
>
> library(DEXSeq)
Loading required package: BiocParallel
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘packagearallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
do.call, duplicated, eval, evalq, Filter, Find, get, intersect,
is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int,
rownames, sapply, setdiff, sort, table, tapply, union, unique,
unlist, unsplit

Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: IRanges
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: DESeq2
Loading required package: Rcpp
Loading required package: RcppArmadillo
> inDir = "dexseq_counts"
> files_fc <- list.files(inDir, pattern = "*fc", full.names = TRUE) # ".tab$"
>
> #-----------------------------Creating meta-data-----------------------------------------------#
>
> sampleTable = data.frame(row.names = c("cer_fetal_cyto_063_4","cer_fetal_cyto_063_6","cer_fetal_cyto_063_8", "cer_fetal_nuc_064_4","cer_fetal_nuc_064_6","cer_fetal_nuc_064_8"),
+ condition = c("cyto", "cyto", "cyto", "nuc", "nuc", "nuc"),
+ individual = c( "1", "2", "3", "1", "2", "3"))
>
> #-------------------------Running DEXSeq---------------------------------------#
>
> BPPARAM = MulticoreParam(workers=16)
>
> fullmodel= ~ sample + exon + individual:exon + condition:exon
> reduced_model = ~ sample + exon + individual:exon
> flatfile = "Homo_sapiens.GRCh37.75_protein_lincRNA.DEXSeq.chr.gff"
>
> dxd_fc = DEXSeqDataSetFromHTSeq(files_fc,sampleData=sampleTable, design= fullmodel, flattenedfile = flatfile)
converting counts to integer mode
> dxr_fc = DEXSeq(dxd_fc, reducedModel = reduced_model,BPPARAM = BPPARAM, quiet = FALSE)
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
-- note: fitType='parametric', but the dispersion trend was not well captured by the
function: y = a/x + b, and a local regression fit was automatically substituted.
specify fitType='local' or 'mean' to avoid this message next time.
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix
using supplied model matrix

regards,
adnan

~Adnan~
Tags: dexseq, exon usage

Previous template Next

Current Approaches to Protein Sequencing

by seqadmin

Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
- Channel: Articles
04-04-2024, 04:25 PM
Strategies for Sequencing Challenging Samples

by seqadmin

Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
- Channel: Articles
03-22-2024, 06:39 AM

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 25 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 28 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 24 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 52 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

DEXSeq analysis does not complete

Latest Articles

ad_right_rmr

News