Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • DEXSeq analysis does not complete

    Hi all,

    I am having problem in diff. exon usage analysis of my dataset. I generated count data using protocol mentioned in DEXSeq's manual. My runs seem to be halted after some time when DEXSeq function is called. I have also tried manual steps which are being called in DEXSeq function to see where the problem is. And it seems that estimateDispersions() has to o something with this problem. I am running my job on cluster with 16 cores (512GB) for 48 hours but no progress. The output remained the same as shown below without any achanges after first two hours of the run. Below is the sample of my counts data, annotation file and script output produced after 48 hours. Can anybody help me resolving the issue or what I am doing wrong here?


    COUNT FILE

    ENSG00000000003:001 105
    ENSG00000000003:002 145
    ENSG00000000003:003 139
    ENSG00000000003:004 136
    ENSG00000000003:005 139
    ENSG00000000003:006 17
    ENSG00000000003:007 33
    ENSG00000000003:008 46
    ENSG00000000003:009 185
    .
    .
    .
    _ambiguous 9946
    _ambiguous_readpair_position 0
    _empty 9520711
    _lowaqual 0
    _notaligned 0

    ##################################

    FLAT FILE:

    chr1 dexseq_prepare_annotation.py aggregate_gene 29554 31109 . + . gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 29554 30039 . + . transcripts "ENST00000473358"; exonic_part_number "001"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30267 30365 . + . transcripts "ENST00000469289"; exonic_part_number "002"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30366 30503 . + . transcripts "ENST00000607096+ENST00000469289"; exonic_part_number "003"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30504 30563 . + . transcripts "ENST00000469289"; exonic_part_number "004"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30564 30667 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "005"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 30976 31097 . + . transcripts "ENST00000469289+ENST00000473358"; exonic_part_number "006"; gene_id "ENSG00000243485"
    chr1 dexseq_prepare_annotation.py exonic_part 31098 31109 . + . transcripts "ENST00000469289"; exonic_part_number "007"; gene_id "ENSG00000243485"

    ###################################

    OUTPUT script.Rout


    R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
    Copyright (C) 2015 The R Foundation for Statistical Computing
    Platform: x86_64-redhat-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
    [Previously saved workspace restored]

    >
    > #source("https://bioconductor.org/biocLite.R")
    > #biocLite("DEXSeq")
    > #pythonScriptsDir = system.file( "python_scripts", package="DEXSeq" )
    > #list.files(pythonScriptsDir)
    > #/Library/Frameworks/R.framework/Versions/3.2/Resources/library/DEXSeq/python_scripts
    >
    > library(DEXSeq)
    Loading required package: BiocParallel
    Loading required package: Biobase
    Loading required package: BiocGenerics
    Loading required package: parallel

    Attaching package: ‘BiocGenerics’

    The following objects are masked from ‘packagearallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

    The following object is masked from ‘package:stats’:

    xtabs

    The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
    do.call, duplicated, eval, evalq, Filter, Find, get, intersect,
    is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int,
    rownames, sapply, setdiff, sort, table, tapply, union, unique,
    unlist, unsplit

    Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

    Loading required package: IRanges
    Loading required package: S4Vectors
    Loading required package: stats4
    Loading required package: GenomicRanges
    Loading required package: GenomeInfoDb
    Loading required package: DESeq2
    Loading required package: Rcpp
    Loading required package: RcppArmadillo
    > inDir = "dexseq_counts"
    > files_fc <- list.files(inDir, pattern = "*fc", full.names = TRUE) # ".tab$"
    >
    > #-----------------------------Creating meta-data-----------------------------------------------#
    >
    > sampleTable = data.frame(row.names = c("cer_fetal_cyto_063_4","cer_fetal_cyto_063_6","cer_fetal_cyto_063_8", "cer_fetal_nuc_064_4","cer_fetal_nuc_064_6","cer_fetal_nuc_064_8"),
    + condition = c("cyto", "cyto", "cyto", "nuc", "nuc", "nuc"),
    + individual = c( "1", "2", "3", "1", "2", "3"))
    >
    > #-------------------------Running DEXSeq---------------------------------------#
    >
    > BPPARAM = MulticoreParam(workers=16)
    >
    > fullmodel= ~ sample + exon + individual:exon + condition:exon
    > reduced_model = ~ sample + exon + individual:exon
    > flatfile = "Homo_sapiens.GRCh37.75_protein_lincRNA.DEXSeq.chr.gff"
    >
    > dxd_fc = DEXSeqDataSetFromHTSeq(files_fc,sampleData=sampleTable, design= fullmodel, flattenedfile = flatfile)
    converting counts to integer mode
    > dxr_fc = DEXSeq(dxd_fc, reducedModel = reduced_model,BPPARAM = BPPARAM, quiet = FALSE)
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    -- note: fitType='parametric', but the dispersion trend was not well captured by the
    function: y = a/x + b, and a local regression fit was automatically substituted.
    specify fitType='local' or 'mean' to avoid this message next time.
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix
    using supplied model matrix

    regards,
    adnan
    ~Adnan~

Latest Articles

Collapse

  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM
  • seqadmin
    Strategies for Sequencing Challenging Samples
    by seqadmin


    Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
    03-22-2024, 06:39 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 04-11-2024, 12:08 PM
0 responses
25 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 10:19 PM
0 responses
28 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 09:21 AM
0 responses
24 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-04-2024, 09:00 AM
0 responses
52 views
0 likes
Last Post seqadmin  
Working...
X