
  • easyRNASeq errors / experiences

    Hi all,

    I'm currently trying to set up easyRNASeq + edgeR in R to analyse my paired-end data. I'm following the easyRNASeq manual to obtain a table of read counts that can then be used to build a DGEList object for edgeR.

    Unfortunately, I don't even get as far as the DGEList object, because the easyRNASeq function fails with the following error:

    Code:
    Checking arguments... 
    Fetching annotations... 
    Computing gene models... 
    Summarizing counts... 
    Processing EMC_18_alignment.bam 
    Updating the read length information. 
    The alignments are gapped. 
    Minimum length of 1 bp. 
    Maximum length of 101 bp. 
    Error in mk_singleBracketReplacementValue(x, value) : 
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(organism = "Hsapiens", annotationMethod = "biomaRt",  :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Has anyone seen a similar error message?
    My bamfiles list consists of 4 samples aligned with GSNAP, and this is how I call the function:

    Code:
    count.genes <- easyRNASeq(organism="Hsapiens",
                         annotationMethod="biomaRt",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filesDirectory=getwd(),
                         filenames=bamfiles,
                         outputFormat="RNAseq")
    I use the devel version of easyRNASeq since it supports varying read lengths.


    Any help is greatly appreciated.


    Code:
    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
     [1] BiocInstaller_1.5.12               BSgenome.Hsapiens.UCSC.hg19_1.3.19
     [3] easyRNASeq_1.3.14                  ShortRead_1.15.11                 
     [5] latticeExtra_0.6-24                RColorBrewer_1.0-5                
     [7] Rsamtools_1.9.30                   DESeq_1.9.14                      
     [9] lattice_0.20-10                    locfit_1.5-8                      
    [11] BSgenome_1.25.8                    GenomicRanges_1.9.65              
    [13] Biostrings_2.25.12                 IRanges_1.15.44                   
    [15] edgeR_2.99.8                       limma_3.13.20                     
    [17] biomaRt_2.13.2                     Biobase_2.17.7                    
    [19] genomeIntervals_1.13.3             BiocGenerics_0.3.1                
    [21] intervals_0.13.3                  
    
    loaded via a namespace (and not attached):
     [1] annotate_1.35.3       AnnotationDbi_1.19.37 bitops_1.0-4.1       
     [4] DBI_0.2-5             genefilter_1.39.0     geneplotter_1.35.1   
     [7] grid_2.15.1           hwriter_1.3           RCurl_1.91-1         
    [10] RSQLite_0.11.2        splines_2.15.1        stats4_2.15.1        
    [13] survival_2.36-14      tools_2.15.1          XML_3.9-4            
    [16] xtable_1.7-0          zlibbioc_1.3.0

  • #2
    I get the same error:

    #get annotation
    RNASeq <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         #chr.sizes=chr.sizes,
                         #readLength=80L,
                         annotationMethod="biomaRt",
                         format="bam",
                         count="genes",
                         summarization="geneModels",
                         filenames=bamfiles[1],
                         outputFormat="RNAseq")
    gAnnot <- genomicAnnotation(RNASeq)




    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing RU_009_final.sorted.bam
    Updating the read length information.
    The reads have been trimmed.
    Minimum length of 50 bp.
    Maximum length of 80 bp.
    Error in mk_singleBracketReplacementValue(x, value) :
    'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter, :
    You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Last edited by hollandorange; 09-20-2012, 01:28 AM.
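
Both posts above show the warning that the provided alignments do not follow UCSC chromosome naming. A minimal sketch of how one might check this outside easyRNASeq is given here. It assumes that a "chr" prefix is a good enough test for UCSC-style names (an approximation), that `bamfiles` holds BAM paths as in the posts, and that Rsamtools is installed. `nonUcscNames` is a hypothetical helper, not part of easyRNASeq, and the check does not address the `CompressedIntegerList` error itself.

```r
# Hypothetical helper (not part of easyRNASeq): return the sequence names
# that lack the UCSC-style "chr" prefix, e.g. Ensembl-style "1" or "MT".
nonUcscNames <- function(seq.names) {
  seq.names[!grepl("^chr", seq.names)]
}

# Example with made-up names.
print(nonUcscNames(c("chr1", "1", "chrX", "MT")))

# With a real BAM file, read the reference names from the header.
# scanBamHeader() returns one list element per file; 'targets' is a
# named integer vector of reference lengths, named by sequence name.
if (exists("bamfiles") && requireNamespace("Rsamtools", quietly = TRUE)) {
  header <- Rsamtools::scanBamHeader(bamfiles[1])[[1]]
  print(nonUcscNames(names(header$targets)))
}
```

An empty result would suggest the BAM header already uses "chr"-prefixed names; any names returned are candidates for the renaming that the warning says easyRNASeq performs.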



    • #3
      Hi hollandorange,

      Could you tell us which aligner (+ version) you used? I forgot to mention that I aligned to hg19 with GSNAP (version 2012-07-12).

      Cheers



      • #4
        Hi rboettcher,

        Thanks for your email pointing me to this thread.

        There indeed seems to be a bug in a subsetting step when retrieving the read information.

        As I don't usually scan the SEQanswers forum for posts about easyRNASeq, the Bioconductor mailing list is a better place to post about it (I've forwarded your post there). Let's continue the discussion over there.

        Cheers,

        Nico



        • #5
          Hi Rboettcher,

          The BAM files that I used for easyRNASeq were generated with TopHat. I also wanted to try GSNAP, since I heard it is more accurate.

          Could you also point me to the Bioconductor mailing-list thread for this issue? Thanks!

          Hollandorange



          • #6
            Hi hollandorange,

            You can register for that mailing list there: http://www.bioconductor.org/help/mailing-list/ (the best option IMO) or follow it on GMANE there:



            Cheers,

            Nico
