Empty instance in cummeRbund

rubbertjes

Member

Join Date: Jun 2013

Posts: 13
- Share
- Tweet
#1

Empty instance in cummeRbund

07-01-2013, 02:57 AM

Hi everyone,

I'm using cuffdiff to do my differential expression analysis for my mouse data. I am using annotation file to map only known transcripts using TopHat. I'm using the same GTF file for my Cuffdiff analysis.
Most of the output files look fine I think, but when I try to load the files into CummeRbund, the resulting variable is empty. The ouput:
> cuff_data <- readCufflinks('diff_out_KOvsWT')
> cuff_data
CuffSet instance with:
0 samples
0 genes
0 isoforms
0 TSS
0 CDS
0 promoters
0 splicing
0 relCDS

I've not been able to figure out why this is, does anybody have an idea? I have no idea whether the problem is in my installation of cummerbund,or if it has to do with the output from cuffdiff.

thanks in advance!

Below is some of the relevant information.

My output in R:
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(cummeRbund)
Loading required package: BiocGenerics

Attaching package: â€˜BiocGenericsâ€™

The following object(s) are masked from â€˜package:statsâ€™:

xtabs

The following object(s) are masked from â€˜package:baseâ€™:

anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
rownames, sapply, setdiff, table, tapply, union, unique

Loading required package: RSQLite
Loading required package: DBI
Loading required package: ggplot2
Loading required package: reshape2
Loading required package: fastcluster

Attaching package: â€˜fastclusterâ€™

The following object(s) are masked from â€˜package:statsâ€™:

hclust

Loading required package: rtracklayer
Loading required package: GenomicRanges
Loading required package: IRanges
Loading required package: Gviz
Loading required package: grid
> cuff_data <- readCufflinks('diff_out_KOvsWT')
> cuff_data
CuffSet instance with:
0 samples
0 genes
0 isoforms
0 TSS
0 CDS
0 promoters
0 splicing
0 relCDS

The cuffdiff run output (last couple of lines):
> Default Std Dev: 80
> Map Properties:
> Normalized Map Mass: 45818586.86
> Raw Map Mass: 50853518.89
> Number of Multi-Reads: 2689569 (with 7422256 total hits)
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
> Map Properties:
> Normalized Map Mass: 45818586.86
> Raw Map Mass: 45467086.16
> Number of Multi-Reads: 2561480 (with 7010002 total hits)
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
> Map Properties:
> Normalized Map Mass: 45818586.86
> Raw Map Mass: 45613387.71
> Number of Multi-Reads: 2548387 (with 6777011 total hits)
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
> Map Properties:
> Normalized Map Mass: 45818586.86
> Raw Map Mass: 47467692.15
> Number of Multi-Reads: 2705326 (with 7047321 total hits)
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
> Map Properties:
> Normalized Map Mass: 45818586.86
> Raw Map Mass: 52463715.19
> Number of Multi-Reads: 2987109 (with 7526927 total hits)
> Fragment Length Distribution: Truncated Gaussian (default)
> Default Mean: 200
> Default Std Dev: 80
[13:59:58] Calculating preliminary abundance estimates
> Processed 21109 loci. [*************************] 100%
[14:56:21] Learning bias parameters.
[15:14:08] Testing for differential expression and regulation in locus.
> Processed 21109 loci. [*************************] 100%
Performed 12350 isoform-level transcription difference tests
Performed 0 tss-level transcription difference tests
Performed 10502 gene-level transcription difference tests
Performed 0 CDS-level transcription difference tests
Performed 0 splicing tests
Performed 0 promoter preference tests
Performing 0 relative CDS output tests
Writing isoform-level FPKM tracking
Writing TSS group-level FPKM tracking
Writing gene-level FPKM tracking
Writing CDS-level FPKM tracking
Writing isoform-level count tracking
Writing TSS group-level count tracking
Writing gene-level count tracking
Writing CDS-level count tracking
Writing isoform-level read group tracking
Writing TSS group-level read group tracking
Writing gene-level read group tracking
Writing CDS-level read group tracking
Writing read group info
Writing run info

My files look like (first 10 lines):
rhorst@fm:~/RNAanalysis/data/20130306_Martijn_Sander/diff_out_KOvsWT$ head -n10 *
==> bias_params.info <==
condition_name replicate_num param pos_i pos_j value
WT 0 start_seq 0 0 1.04493
WT 0 start_seq 0 1 1.02886
WT 0 start_seq 0 2 1.03362
WT 0 start_seq 0 3 1.03065
WT 0 start_seq 0 4 1.02129
WT 0 start_seq 0 5 1.02415
WT 0 start_seq 0 6 0.976724
WT 0 start_seq 0 7 0.879226
WT 0 start_seq 0 8 1.06228

==> cds.count_tracking <==
tracking_id

==> cds.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

==> cds_exp.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

==> cds.fpkm_tracking <==
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage

==> cds.read_group_tracking <==
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status

==> cuffData.db <==
SQLite format 3@ A#-â&—AûöñìçâÝØÓÎÉÄ¿ºµ°«¦¡—%9indexsqlite_autoindex_TSS_1TSgtablesamplessamplesCREATE TABLE "samples"(
"sample_index" INTEGER NOT NULL,
"sample_name" VARCHAR(45) PRIMARY KEY NOT NULL
)-Aindexsqlite_autoindex_samples_1samplesdablebiasDatabiasDataCREATE TABLE "biasData"(
"biasData_id" INTEGER PRIMARY KEY NOT NULL
)wƒMtablegenesgenesCREATE TABLE "genes"(
"gene_id" VARCHAR(45) PRIMARY KEY NOT NULL,
"class_code" VARCHAR(45),
"nearest_ref_id" VARCHAR(45),
"gene_short_name" VARCHAR(45),

==> gene_exp.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
0610005C13Rik 0610005C13Rik 0610005C13Rik chr7:45567794-45589710 WT KO OK 0 0.604775 inf -nan 0.2076 0.639265 no
0610007N19Rik 0610007N19Rik 0610007N19Rik chr15:32240567-32244662 WT KO OK 5.46571 7.5372 0.463622 0.26062 0.6309 0.891824 no
0610007P14Rik 0610007P14Rik 0610007P14Rik chr12:85815454-85824545 WT KO OK 0.00531297 2.53232 8.89673 1.65699 0.13885 0.592104 no
0610008F07Rik 0610008F07Rik 0610008F07Rik chr2:163492317-163541721 WT KO NOTEST 0 0 0 0 1 1 no
0610009B14Rik 0610009B14Rik 0610009B14Rik chr12:80686374-80692466 WT KO NOTEST 0 0 0 0 1 1 no
0610009B22Rik 0610009B22Rik 0610009B22Rik chr11:51685384-51688634 WT KO OK 115.715 87.1343 -0.409267 -0.755209 0.29795 0.701585 no
0610009D07Rik 0610009D07Rik 0610009D07Rik chr12:4817607-4827659 WT KO OK 106.758 92.7211 -0.203369 -0.399578 0.5278 0.841627 no
0610009L18Rik 0610009L18Rik 0610009L18Rik chr11:120348677-120351190 WT KO OK 26.0873 43.6863 0.743836 0.908239 0.3811 0.76145 no
0610009O20Rik 0610009O20Rik 0610009O20Rik chr18:38238404-38262629 WT KO OK 26.684 43.2191 0.695695 1.08275 0.063 0.533389 no

==> genes.count_tracking <==
tracking_id WT_count WT_count_variance WT_count_uncertainty_var WT_count_dispersion_var WT_status KO_count KO_count_variance KO_count_uncertainty_var KO_count_dispersion_var KO_status
0610005C13Rik 0 2.01378 0 0 OK 19.7291 8094.36 0 6927.97 OK
0610007N19Rik 353.184 92023.9 0 91760 OK 478.051 122087 0 122086 OK
0610007P14Rik 0.300445 1 0 0.374955 OK 135.823 34661.3 0 33302.6 OK
0610008F07Rik 0 0 0 0 OK 0 0 0 0 OK
0610009B14Rik 0 0 0 0 OK 0 0 0 0 OK
0610009B22Rik 3247.34 1.08884e+06 0 1.08884e+06 OK 2501.94 804035 0 804035 OK
0610009D07Rik 4741.38 1.70538e+06 0 1.70538e+06 OK 4301.16 1.51118e+06 0 1.51118e+06 OK
0610009L18Rik 397.699 104366 0 104365 OK 736.619 202041 0 201922 OK
0610009O20Rik 2694.43 859079 0 859079 OK 4362.3 1.51431e+06 0 1.51431e+06 OK

==> genes.fpkm_tracking <==
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage WT_FPKM WT_conf_lo WT_conf_hi WT_status KO_FPKM KO_conf_lo KO_conf_hi KO_status
0610005C13Rik - - 0610005C13Rik 0610005C13Rik - chr7:45567794-45589710 - - 0 0 0.0238527 OK 0.604775 0 3.14855 OK
0610007N19Rik - - 0610007N19Rik 0610007N19Rik - chr15:32240567-32244662 - - 5.46571 0 16.2838 OK 7.5372 0.0170511 19.7281 OK
0610007P14Rik - - 0610007P14Rik 0610007P14Rik - chr12:85815454-85824545 - - 0.00531297 0 0.0552537 OK 2.53232 0 9.22736 OK
0610008F07Rik - - 0610008F07Rik 0610008F07Rik - chr2:163492317-163541721 - - 0 0 0 OK 0 0 0 OK
0610009B14Rik - - 0610009B14Rik 0610009B14Rik - chr12:80686374-80692466 - - 0 0 0 OK 0 0 0 OK
0610009B22Rik - - 0610009B22Rik 0610009B22Rik - chr11:51685384-51688634 - - 115.715 48.0154 183.03 OK 87.1343 33.1359 141.027 OK
0610009D07Rik - - 0610009D07Rik 0610009D07Rik - chr12:4817607-4827659 - - 106.758 53.8199 159.367 OK 92.7211 46.1145 139.028 OK
0610009L18Rik - - 0610009L18Rik 0610009L18Rik - chr11:120348677-120351190 - - 26.0873 1.30458 50.5611 OK 43.6863 5.88822 80.4606 OK
0610009O20Rik - - 0610009O20Rik 0610009O20Rik - chr18:38238404-38262629 - - 26.684 9.45996 43.8886 OK 43.2191 19.8421 66.5767 OK

==> genes.read_group_tracking <==
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
0610005C13Rik WT 1 0 0 0 0 - OK
0610005C13Rik WT 0 0 0 0 0 - OK
0610005C13Rik WT 2 0 0 0 0 - OK
0610005C13Rik KO 1 62.0006 59.1874 59.1874 1.81433 - OK
0610005C13Rik KO 0 0 0 0 0 - OK
0610005C13Rik KO 2 0 0 0 0 - OK
0610007N19Rik WT 1 2 1.85224 1.85224 0.0315828 - OK
0610007N19Rik WT 0 545 491.228 491.228 6.70655 - OK
0610007N19Rik WT 2 526 566.472 566.472 9.65898 - OK

==> isoform_exp.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
NM_001001130 Zfp85-rs1 Zfp85-rs1 chr13:67729261-67756838 WT KO OK 1.90369 0.00601979 -8.30487 -2.05805 0.1332 0.664166 no
NM_001001144 Scap Scap chr9:110333292-110384946 WT KO NOTEST 0.0524217 0.040569 -0.369786 0 1 1 no
NM_001001152 Zfp458 Zfp458 chr13:67254917-67269068 WT KO NOTEST 0.00205102 0 -inf 0 1 1 no
NM_001001160 Fbxo41 Fbxo41 chr6:85469577-85502886 WT KO NOTEST 0 0.0688398 inf 0 1 1 no
NM_001001176 Taf9b Taf9b chrX:106206873-106221158 WT KO NOTEST 8.89874e-05 0.000169376 0.928556 0 1 1 no
NM_001001177 BC051142 BC051142 chr17:34398819-34460734 WT KO NOTEST 0 0 0 0 1 1 no
NM_001001178 Ccdc148 Ccdc148 chr2:58821697-59018606 WT KO NOTEST 0.00455913 0.00357061 -0.352587 0 1 1 no
NM_001001179 BC048546 BC048546 chr6:128539821-128581606 WT KO NOTEST 0 0.00134926 inf 0 1 1 no
NM_001001180 Zfp941 Zfp941 chr7:140809676-140822178 WT KO NOTEST 0.0276003 0.0376914 0.449551 0 1 1 no

==> isoforms.count_tracking <==
tracking_id WT_count WT_count_variance WT_count_uncertainty_var WT_count_dispersion_var WT_status KO_count KO_count_variance KO_count_uncertainty_var KO_count_dispersion_var KO_status
NM_001001130 175.756 41657 0 41656.4 OK 0.55577 2 0 1.13185 OK
NM_001001144 8.42453 6833.56 0 576.583 OK 7.58851 1855 0 1854.66 OK
NM_001001152 0.308707 1 0 0.392771 OK 0 0 0 0 OK
NM_001001160 0 0 0 0 OK 17.8208 1299.74 0 1294.42 OK
NM_001001176 0.013785 2 0 1.01778 OK 0.0215626 764.266 0 2.2876 OK
NM_001001177 0 0 0 0 OK 0 0 0 0 OK
NM_001001178 0.717963 2 0 1.81451 OK 0.65445 1 0 0.866675 OK
NM_001001179 0 0 0 0 OK 0.277885 1 0 0.326309 OK
NM_001001180 5.28247 37.5317 0 36.6546 OK 6.40012 81.3041 0 79.5666 OK

==> isoforms.fpkm_tracking <==
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage WT_FPKM WT_conf_lo WT_conf_hi WT_status KO_FPKM KO_conf_lo KO_conf_hi KO_status
NM_001001130 - - Zfp85-rs1 Zfp85-rs1 - chr13:67729261-67756838 2218 - 1.90369 0 4.39756 OK 0.00601979 0 0.0108314 OK
NM_001001144 - - Scap Scap - chr9:110333292-110384946 4286 - 0.0524217 0 0.908718 OK 0.040569 0 0.106908 OK
NM_001001152 - - Zfp458 Zfp458 - chr13:67254917-67269068 3488 - 0.00205102 0 0.0187717 OK 0 0 0 OK
NM_001001160 - - Fbxo41 Fbxo41 - chr6:85469577-85502886 6562 - 0 0 0 OK 0.0688398 0 0.289362 OK
NM_001001176 - - Taf9b Taf9b - chrX:106206873-106221158 2583 - 8.89874e-05 0 0 OK 0.000169376 0 0.0091703 OK
NM_001001177 - - BC051142 BC051142 - chr17:34398819-34460734 1900 - 0 0 0 OK 0 0 0 OK
NM_001001178 - - Ccdc148 Ccdc148 - chr2:58821697-59018606 3640 - 0.00455913 0 0.0179878 OK 0.00357061 0 0.0119919 OK
NM_001001179 - - BC048546 BC048546 - chr6:128539821-128581606 4698 - 0 0 0 OK 0.00134926 0 0.0139369 OK
NM_001001180 - - Zfp941 Zfp941 - chr7:140809676-140822178 3909 - 0.0276003 0 0.0949165 OK 0.0376914 0 0.134 OK

==> isoforms.read_group_tracking <==
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
NM_001001130 WT 1 517 478.804 478.804 5.18614 - OK
NM_001001130 WT 0 0 0 0 0 - OK
NM_001001130 WT 2 45 48.4625 48.4625 0.524918 - OK
NM_001001130 KO 1 0 0 0 0 - OK
NM_001001130 KO 0 0 0 0 0 - OK
NM_001001130 KO 2 2 1.66731 1.66731 0.0180594 - OK
NM_001001144 WT 1 0.00702877 0.00650949 0.00650949 3.47958e-05 - OK
NM_001001144 WT 0 28.0258 25.2607 25.2607 0.157196 - OK
NM_001001144 WT 2 0.00596457 0.0064235 0.0064235 3.43362e-05 - OK

==> promoters.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

==> read_groups.info <==
file condition replicate_num total_mass norm_mass internal_scale external_scale
./KO918_C1BGEACXX_3_3413/accepted_hits.bam WT 0 4.85925e+07 4.58186e+07 1.10947 1
./KO922_C1BGEACXX_6_3415/accepted_hits.bam WT 1 5.08535e+07 4.58186e+07 1.07977 1
./KO957_C1BGEACXX_6_3416/accepted_hits.bam WT 2 4.54671e+07 4.58186e+07 0.928554 1
./WT915_C1BGEACXX_3_3412/accepted_hits.bam KO 0 4.56134e+07 4.58186e+07 0.991352 1
./WT921_C1BGEACXX_3_3414/accepted_hits.bam KO 1 4.74677e+07 4.58186e+07 1.04753 1
./WT962_C1BGEACXX_6_3417/accepted_hits.bam KO 2 5.24637e+07 4.58186e+07 1.19954 1

==> run.info <==
param value
cmd_line cuffdiff -o diff_out_KOvsWT -b ../../RefData/index_files/Mouse/UCSC/Bowtie2Index/genome.fa -p 20 -L WT,KO -u ../../RefData/gtf_refs/mouse/refGene_mouse_UCSC_adapted.gtf ./KO918_C1BGEACXX_3_3413/accepted_hits.bam,./KO922_C1BGEACXX_6_3415/accepted_hits.bam,./KO957_C1BGEACXX_6_3416/accepted_hits.bam ./WT915_C1BGEACXX_3_3412/accepted_hits.bam,./WT921_C1BGEACXX_3_3414/accepted_hits.bam,./WT962_C1BGEACXX_6_3417/accepted_hits.bam
version 2.1.1
SVN_revision 4046M
boost_version 104900

==> splicing.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

==> tss_group_exp.diff <==
test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

==> tss_groups.count_tracking <==
tracking_id

==> tss_groups.fpkm_tracking <==
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage

==> tss_groups.read_group_tracking <==
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status

==> var_model.info <==
condition locus compatible_count_mean compatible_count_var total_count_mean total_count_var fitted_var
WT chr1:3214481-3671498 0 0 9.5328 32.1283 0
WT chr1:4290845-4409241 286.716 38731.3 531.42 257374 63261.5
WT chr1:4490927-4496413 14.5177 362.867 16.2529 409.386 237.869
WT chr1:4773199-4785726 11866.9 910828 12014 896817 5.98831e+06
WT chr1:4807892-4846735 4219.87 1.56711e+06 4257.21 1.57699e+06 1.44602e+06
WT chr1:4857693-4897909 1609.49 636541 1723.2 617485 472116
WT chr1:4909575-5070285 13.033 2.7366 60.1232 8594.32 190.845
WT chr1:5083172-5162549 2522.24 1.12535e+06 2666.59 1.21034e+06 776043
WT chr1:5588492-5606133 0 0 419.29 13007.3 0
Tags: cuffdiff, cummerbund
jp.

Senior Member

Join Date: Jul 2013

Posts: 142
- Share
- Tweet
#2

07-30-2013, 06:42 PM

Hi there
I got the same CuffSet instance with:
4 samples
39098 genes
0 isoforms
0 TSS
0 CDS
0 promoters
0 splicing
0 relCDS
can you guess please ?

Thank you

Originally posted by rubbertjes View Post

Hi everyone,

I'm using cuffdiff to do my differential expression analysis for my mouse data. I am using annotation file to map only known transcripts using TopHat. I'm using the same GTF file for my Cuffdiff analysis.
Most of the output files look fine I think, but when I try to load the files into CummeRbund, the resulting variable is empty. The ouput:
> cuff_data <- readCufflinks('diff_out_KOvsWT')
> cuff_data
CuffSet instance with:
0 samples
0 genes
0 isoforms
0 TSS
0 CDS
0 promoters
0 splicing
0 relCDS

................................
WT chr1:5083172-5162549 2522.24 1.12535e+06 2666.59 1.21034e+06 776043
WT chr1:5588492-5606133 0 0 419.29 13007.3 0
Comment

Previous template Next

Current Approaches to Protein Sequencing

by seqadmin

Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
- Channel: Articles
04-04-2024, 04:25 PM
Strategies for Sequencing Challenging Samples

by seqadmin

Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
- Channel: Articles
03-22-2024, 06:39 AM

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 27 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 30 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 26 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 52 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

Empty instance in cummeRbund

Comment

Latest Articles

ad_right_rmr

News