Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • gene tracking with cummerBund - persistent problem

    Hello. I'm having some difficulty with the makeGeneRegionTrack feature of cummerBund. If I rebuild my cuffData.db with 'mm10_ensGene.gtf', then the following happens:

    Code:
    > cuff <- readCufflinks(genome='mm10',gtfFile='mm10_ensGene.gtf',rebuild=TRUE)
    
    > 	cuff
    CuffSet instance with:
    	 4 samples
    	 23285 genes
    	 30073 isoforms
    	 25872 TSS
    	 24748 CDS
    	 139440 promoters
    	 155232 splicing
    	 122340 relCDS
    
    > 	myGene
    CuffGene instance for gene Insm2 
    Short name:	 Insm2 
    Slots:
    	 annotation
    	 features
    	 fpkm
    	 repFpkm
    	 diff
    	 count
    	 isoforms	 CuffFeature instance of size 1 
    	 TSS		 CuffFeature instance of size 1 
    	 CDS		 CuffFeature instance of size 1 
    
    > 	genetrack <- makeGeneRegionTrack(myGene)
    Error in .Call2("Rle_constructor", values, lengths, check, 0L, PACKAGE = "IRanges") : 
      Rle of type 'list' is not supported
    On the other hand, if I rebuild my cuffData.db file with 'genes.gtf' from the Tuxedo package, then the following happens:



    Code:
    > cuff <- readCufflinks(genome='mm10',gtfFile='genes.gtf',rebuild=TRUE)
    
    > 	cuff
    CuffSet instance with:
    	 4 samples
    	 23285 genes
    	 30073 isoforms
    	 25872 TSS
    	 24748 CDS
    	 139440 promoters
    	 155232 splicing
    	 122340 relCDS
    
    > 	myGene
    CuffGene instance for gene Insm2 
    Short name:	 Insm2 
    Slots:
    	 annotation
    	 features
    	 fpkm
    	 repFpkm
    	 diff
    	 count
    	 isoforms	 CuffFeature instance of size 1 
    	 TSS		 CuffFeature instance of size 1 
    	 CDS		 CuffFeature instance of size 1 
    
    > 	genetrack <- makeGeneRegionTrack(myGene)
    Error in `[.data.frame`(features(object), , featCols) : 
      undefined columns selected

    I've noticed similar problems in other posts, but with no solutions. Any suggestions would be appreciated.

    Thanks.

  • #2
    stephenhart: I have the first issue but not that second one.

    Either way, does anyone have any ideas yet?


    Using EMBL annotation:
    Code:
    > cuff <- readCufflinks(genome="vinifera.fa", gtfFile="vinifera.gtf", rebuild=T)
    
    >       cuff
    CuffSet instance with:
             6 samples
             30446 genes
             32764 isoforms
             31480 TSS
             29927 CDS
             456690 promoters
             472200 splicing
             437595 relCDS
    
    >       myGene
    CuffGene instance for gene XLOC_001303
    Short name:      SUT2
    Slots:
             annotation
             features
             fpkm
             repFpkm
             diff
             count
             isoforms        CuffFeature instance of size 1
             TSS             CuffFeature instance of size 1
             CDS             CuffFeature instance of size 1
    > genetrack <- makeGeneRegionTrack(myGene)
    Error in .Call2("Rle_constructor", values, lengths, check, 0L, PACKAGE = "S4Vectors") :
      Rle of type 'list' is not supported

    Using my merged annotation:
    Code:
    > cuff <- readCufflinks(genome="vinifera.fa", gtfFile="merged.gtf", rebuild=T)
    
    >       cuff
    CuffSet instance with:
             6 samples
             30446 genes
             32764 isoforms
             31480 TSS
             29927 CDS
             456690 promoters
             472200 splicing
             437595 relCDS
    
    >       myGene
    CuffGene instance for gene XLOC_001303
    Short name:      SUT2
    Slots:
             annotation
             features
             fpkm
             repFpkm
             diff
             count
             isoforms        CuffFeature instance of size 1
             TSS             CuffFeature instance of size 1
             CDS             CuffFeature instance of size 1
    > genetrack <- makeGeneRegionTrack(myGene)
    Error in .Call2("Rle_constructor", values, lengths, check, 0L, PACKAGE = "S4Vectors") :
      Rle of type 'list' is not supported
    Here's my sessionInfo:
    Code:
    > sessionInfo()
    R version 3.1.1 (2014-07-10)
    Platform: x86_64-unknown-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
    
    attached base packages:
     [1] grid      stats4    parallel  stats     graphics  grDevices utils
     [8] datasets  methods   base
    
    other attached packages:
     [1] cummeRbund_2.8.0     Gviz_1.10.2          rtracklayer_1.26.1
     [4] GenomicRanges_1.18.1 GenomeInfoDb_1.2.2   IRanges_2.0.0
     [7] S4Vectors_0.4.0      fastcluster_1.1.13   reshape2_1.4
    [10] ggplot2_1.0.0        RSQLite_1.0.0        DBI_0.3.1
    [13] BiocGenerics_0.12.0
    
    loaded via a namespace (and not attached):
     [1] acepack_1.3-3.3          AnnotationDbi_1.28.1     base64enc_0.1-2
     [4] BatchJobs_1.5            BBmisc_1.8               Biobase_2.26.0
     [7] BiocParallel_1.0.0       biomaRt_2.22.0           Biostrings_2.34.0
    [10] biovizBase_1.14.0        bitops_1.0-6             brew_1.0-6
    [13] BSgenome_1.34.0          checkmate_1.5.0          cluster_1.15.3
    [16] codetools_0.2-9          colorspace_1.2-4         dichromat_2.0-0
    [19] digest_0.6.4             fail_1.2                 foreach_1.4.2
    [22] foreign_0.8-61           Formula_1.1-2            GenomicAlignments_1.2.0
    [25] GenomicFeatures_1.18.2   gtable_0.1.2             Hmisc_3.14-5
    [28] iterators_1.0.7          lattice_0.20-29          latticeExtra_0.6-26
    [31] MASS_7.3-35              matrixStats_0.10.3       munsell_0.4.2
    [34] nnet_7.3-8               plyr_1.8.1               proto_0.3-10
    [37] RColorBrewer_1.0-5       Rcpp_0.11.3              RCurl_1.95-4.3
    [40] R.methodsS3_1.6.1        rpart_4.1-8              Rsamtools_1.18.1
    [43] scales_0.2.4             sendmailR_1.2-1          splines_3.1.1
    [46] stringr_0.6.2            survival_2.37-7          tools_3.1.1
    [49] VariantAnnotation_1.12.2 XML_3.98-1.1             XVector_0.6.0
    [52] zlibbioc_1.12.0

    Comment


    • #3
      hey stephenhart:
      Would you mind posting this function's output:

      Code:
      > head(features(myGene))
        seqnames    start      end width strand    source type score phase
      1        1 12442003 12442535   533      - Cufflinks exon    NA    NA
      2        1 12442642 12442922   281      - Cufflinks exon    NA    NA
      3        1 12443021 12443186   166      - Cufflinks exon    NA    NA
            gene_id     isoform_id exon_number gene_name                   oId
      1 XLOC_001339 TCONS_00001459           1       NAP VIT_01s0026g02710.t01
      2 XLOC_001339 TCONS_00001459           2       NAP VIT_01s0026g02710.t01
      3 XLOC_001339 TCONS_00001459           3       NAP VIT_01s0026g02710.t01
                  nearest_ref class_code TSS_group_id CDS_id contained_in
      1 VIT_01s0026g02710.t01          =      TSS1387  P1304         <NA>
      2 VIT_01s0026g02710.t01          =      TSS1387  P1304         <NA>
      3 VIT_01s0026g02710.t01          =      TSS1387  P1304         <NA>
      I'm curious only because in the CummeRbund manual the example gives:
      Code:
      > head(features(myGene))
       
         seqnames    start      end width strand source type score
      1     chr1 20959948 20960428   481      + coding exon    NA
      2     chr1 20964335 20964622   288      + coding exon    NA
      3     chr1 20966385 20966485   101      + coding exon    NA
      4     chr1 20970983 20971165   183      + coding exon    NA
      5     chr1 20972053 20972216   164      + coding exon    NA
      6     chr1 20974998 20975125   128      + coding exon    NA
        phase class_code exon_number     gene_id gene_name nearest_ref
      1    NA          =           1 XLOC_000172     PINK1  uc001bdm.2
      2    NA          =           2 XLOC_000172     PINK1  uc001bdm.2
      3    NA          =           3 XLOC_000172     PINK1  uc001bdm.2
      4    NA          =           4 XLOC_000172     PINK1  uc001bdm.2
      5    NA          =           5 XLOC_000172     PINK1  uc001bdm.2
      6    NA          =           6 XLOC_000172     PINK1  uc001bdm.2
               oId CDS_id     isoform_id TSS_group_id
      1 uc001bdm.2   P364 TCONS_00000480       TSS264
      2 uc001bdm.2   P364 TCONS_00000480       TSS264
      3 uc001bdm.2   P364 TCONS_00000480       TSS264
      4 uc001bdm.2   P364 TCONS_00000480       TSS264
      5 uc001bdm.2   P364 TCONS_00000480       TSS264
      6 uc001bdm.2   P364 TCONS_00000480       TSS264

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Strategies for Sequencing Challenging Samples
        by seqadmin


        Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
        03-22-2024, 06:39 AM
      • seqadmin
        Techniques and Challenges in Conservation Genomics
        by seqadmin



        The field of conservation genomics centers on applying genomics technologies in support of conservation efforts and the preservation of biodiversity. This article features interviews with two researchers who showcase their innovative work and highlight the current state and future of conservation genomics.

        Avian Conservation
        Matthew DeSaix, a recent doctoral graduate from Kristen Ruegg’s lab at The University of Colorado, shared that most of his research...
        03-08-2024, 10:41 AM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, Yesterday, 06:37 PM
      0 responses
      10 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, Yesterday, 06:07 PM
      0 responses
      9 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 03-22-2024, 10:03 AM
      0 responses
      49 views
      0 likes
      Last Post seqadmin  
      Started by seqadmin, 03-21-2024, 07:32 AM
      0 responses
      67 views
      0 likes
      Last Post seqadmin  
      Working...
      X