  • DEXSeq analysis does not complete

    Hi all,

    I am having a problem with the differential exon usage analysis of my dataset. I generated the count data following the protocol in the DEXSeq manual. My runs appear to hang some time after the DEXSeq() function is called. I have also run the individual steps that DEXSeq() calls to see where the problem lies, and it seems that estimateDispersions() has something to do with it. I am running the job on a cluster with 16 cores (512 GB) for 48 hours, but it makes no progress: the output stays the same as shown below, with no changes after the first two hours of the run. Below are a sample of my count data, the annotation file, and the script output produced after 48 hours. Can anybody help me resolve this issue or tell me what I am doing wrong here?


    COUNT FILE

    ENSG00000000003:001 105
    ENSG00000000003:002 145
    ENSG00000000003:003 139
    ENSG00000000003:004 136
    ENSG00000000003:005 139
    ENSG00000000003:006 17
    ENSG00000000003:007 33
    ENSG00000000003:008 46
    ENSG00000000003:009 185
    .
    .
    .
    _ambiguous 9946
    _ambiguous_readpair_position 0
    _empty 9520711
    _lowaqual 0
    _notaligned 0
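
A quick way to sanity-check a count file of this shape is to separate the exon bins from the special counters (the rows starting with "_"). The following is only a sketch in base R. It reuses the rows shown above, so the totals cover just this excerpt and not the full file, and the variable names are made up for illustration.

```r
# Sketch: split a dexseq_count.py-style count file into exon bins and
# special counters. The inline text repeats the sample rows shown above;
# for a real file, pass its path via `file =` instead of `text =`.
sample_text <- "ENSG00000000003:001 105
ENSG00000000003:002 145
ENSG00000000003:003 139
ENSG00000000003:004 136
ENSG00000000003:005 139
ENSG00000000003:006 17
ENSG00000000003:007 33
ENSG00000000003:008 46
ENSG00000000003:009 185
_ambiguous 9946
_ambiguous_readpair_position 0
_empty 9520711
_lowaqual 0
_notaligned 0"

counts <- read.table(text = sample_text, col.names = c("bin", "count"),
                     stringsAsFactors = FALSE)

# Special counters are the rows whose identifier starts with an underscore
is_special <- grepl("^_", counts$bin)

# Reads assigned to exon bins (only the rows included in this excerpt)
bin_total <- sum(counts$count[!is_special])

# Named vector of the special counters, e.g. special_totals[["_empty"]]
special_totals <- setNames(counts$count[is_special], counts$bin[is_special])

print(bin_total)
print(special_totals)
```

Running the same check on each complete count file shows how many reads were assigned to exon bins compared with the special counters.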

    ##################################

    FLAT FILE:

    chr1 dexseq_prepare_annotation.py aggregate_gene 29554 31109 . + . gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 29554 30039 . + . transcripts "ENST00000473358"; exonic_part_number "001"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30267 30365 . + . transcripts "ENST00000469289"; exonic_part_number "002"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30366 30503 . + . transcripts "ENST00000607096+ENST00000469289"; exonic_part_number "003"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30504 30563 . + . transcripts "ENST00000469289"; exonic_part_number "004"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30564 30667 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "005"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30976 31097 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "006"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 31098 31109 . + . transcripts "ENST00000469289"; exonic_part_number "007"; gene_id "ENSG00000243485"

    ###################################

    OUTPUT script.Rout


    R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
    Copyright (C) 2015 The R Foundation for Statistical Computing
    Platform: x86_64-redhat-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
    [Previously saved workspace restored]

    >
    > #source("https://bioconductor.org/biocLite.R")
    > #biocLite("DEXSeq")
    > #pythonScriptsDir = system.file( "python_scripts", package="DEXSeq" )
    > #list.files(pythonScriptsDir)
    > #/Library/Frameworks/R.framework/Versions/3.2/Resources/library/DEXSeq/python_scripts
    >
    > library(DEXSeq)
    Loading required package: BiocParallel
    Loading required package: Biobase
    Loading required package: BiocGenerics
    Loading required package: parallel

    Attaching package: ‘BiocGenerics’

    The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

    The following object is masked from ‘package:stats’:

    xtabs

    The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
    do.call, duplicated, eval, evalq, Filter, Find, get, intersect,
    is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int,
    rownames, sapply, setdiff, sort, table, tapply, union, unique,
    unlist, unsplit

    Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

    Loading required package: IRanges
    Loading required package: S4Vectors
    Loading required package: stats4
    Loading required package: GenomicRanges
    Loading required package: GenomeInfoDb
    Loading required package: DESeq2
    Loading required package: Rcpp
    Loading required package: RcppArmadillo
    > inDir = "dexseq_counts"
    > files_fc <- list.files(inDir, pattern = "*fc", full.names = TRUE) # ".tab$"
    >
    > #-----------------------------Creating meta-data-----------------------------------------------#
    >
    > sampleTable = data.frame(row.names = c("cer_fetal_cyto_063_4","cer_fetal_cyto_063_6","cer_fetal_cyto_063_8", "cer_fetal_nuc_064_4","cer_fetal_nuc_064_6","cer_fetal_nuc_064_8"),
    + condition = c("cyto", "cyto", "cyto", "nuc", "nuc", "nuc"),
    + individual = c( "1", "2", "3", "1", "2", "3"))
    >
    > #-------------------------Running DEXSeq---------------------------------------#
    >
    > BPPARAM = MulticoreParam(workers=16)
    >
    > fullmodel= ~ sample + exon + individual:exon + condition:exon
    > reduced_model = ~ sample + exon + individual:exon
    > flatfile = "Homo_sapiens.GRCh37.75_protein_lincRNA.DEXSeq.chr.gff"
    >
    > dxd_fc = DEXSeqDataSetFromHTSeq(files_fc,sampleData=sampleTable, design= fullmodel, flattenedfile = flatfile)
    converting counts to integer mode
    > dxr_fc = DEXSeq(dxd_fc, reducedModel = reduced_model,BPPARAM = BPPARAM, quiet = FALSE)
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    -- note: fitType='parametric', but the dispersion trend was not well captured by the
    function: y = a/x + b, and a local regression fit was automatically substituted.
    specify fitType='local' or 'mean' to avoid this message next time.
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
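
For reference, the manual steps mentioned at the top can be sketched roughly as follows. This is a minimal sketch, not the exact code that was run. It assumes the usual sequence from the DEXSeq vignette (estimateSizeFactors, estimateDispersions, testForDEU) and times each step on a small subset of genes, to see whether estimateDispersions() is slow or really stuck. The function name `time_dexseq_steps` and the `n_genes` argument are hypothetical.

```r
# Assumes library(DEXSeq) is already attached, as in the script above.
# `time_dexseq_steps` and `n_genes` are hypothetical names for illustration.
time_dexseq_steps <- function(dxd, full_model, reduced_model, n_genes = 200,
                              BPPARAM = BiocParallel::SerialParam()) {
  # Keep only the counting bins that belong to the first n_genes genes
  keep_genes <- head(unique(geneIDs(dxd)), n_genes)
  dxd_small <- dxd[geneIDs(dxd) %in% keep_genes, ]

  timings <- list()

  # Step 1: normalisation
  timings$size_factors <- system.time(
    dxd_small <- estimateSizeFactors(dxd_small)
  )

  # Step 2: dispersion estimation (the step suspected of hanging)
  timings$dispersions <- system.time(
    dxd_small <- estimateDispersions(dxd_small, formula = full_model,
                                     BPPARAM = BPPARAM)
  )

  # Step 3: test for differential exon usage
  timings$test <- system.time(
    dxd_small <- testForDEU(dxd_small, fullModel = full_model,
                            reducedModel = reduced_model, BPPARAM = BPPARAM)
  )

  list(dxd = dxd_small, timings = timings)
}

# Usage with the objects from the script above (not run here):
# res <- time_dexseq_steps(dxd_fc, fullmodel, reduced_model, n_genes = 200)
# res$timings
```

If the subset finishes in reasonable time, scaling `n_genes` up gives a rough idea of how long the full dataset would need with this design. Running it serially first (the default here) also takes the 16 workers out of the picture.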

    regards,
    adnan
    ~Adnan~
