**lgoff** · 03-13-2012, 10:28 AM

Hi turnersd,
The workflow for cummeRbund has been simplified a bit since the paper was submitted. The recommended approach (for cummeRbund 1.1.3 or greater) is as follows:

Code:

> cuff <- readCufflinks()

#Retrieve significant gene IDs (XLOC) with a pre-specified alpha
> diffGeneIDs <- getSig(cuff,level="genes",alpha=0.05)

#Use returned identifiers to create a CuffGeneSet object with all relevant info for given genes
> diffGenes<-getGenes(cuff,diffGeneIDs)

#gene_short_name values (and corresponding XLOC_* values) can be retrieved from the CuffGeneSet by using:
> featureNames(diffGenes)

fpkm(), fpkmMatrix(), features(), and diffData() are all available methods for the CuffGeneSet object as well.

Cheers,
Loyal

**kareldegendt** · 03-25-2012, 10:35 PM

weird...

cummeRbund tells me this:

> diffGeneIDs <-getSig(cuff_data_Input,level="genes",alpha=0.05)
Error: could not find function "getSig"

Did I do something wrong?

**kareldegendt** · 03-25-2012, 10:38 PM

OK, got it, I got an older version (0.1.3)....
But where can I find the latest version then?

K.

**lgoff** · 03-26-2012, 05:03 PM

Hi,
you can find the freshest cummeRbund at:

CummeRbund - An R package for persistent storage, analysis, and visualization of RNA-Seq from cufflinks output

http://compbio.mit.edu/cummeRbund/

Also, be sure to sign up for the bowtie-bio-announce mailing list if you would like to be notified of new releases/features.

Cheers,
Loyal

**shurjo** · 03-26-2012, 09:40 PM

Hi Loyal,

I'm facing the same situation as kareldegendt.

Can you provide a set of instructions for upgrading to a newer version when cummeRbund is already on a system (64-bit Linux)? This would be very helpful for R novices like myself. I've tried downloading and unzipping the tarball and using make, but that doesn't seem to work.

Thanks,

Shurjo

**kareldegendt** · 03-27-2012, 11:22 AM

OK, here's what I did:

I first downloaded the cummeRbund Mac OS X binary (for version 1.1.5)

Then I did this in R:
> install.packages('/Users/kareldegendt/Downloads/cummeRbund_1.1.5.tgz',repos = NULL)

then I added cummeRbund to the current session:

> library(cummeRbund)

It loaded a bunch of things and told me that the package cummeRbund was built under R version 2.15.0.
No clue if that's gonna hurt anything. I'll test and if so, I'll probably upgrade R...

best,
Karel
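
A quick way to confirm which version ended up installed before calling getSig() (which, per the post above, needs cummeRbund 1.1.3 or greater). This is only a sketch: has_min_version() is a hypothetical helper name, not part of cummeRbund, and it relies only on base R's requireNamespace() and packageVersion().

```r
# Hypothetical helper (not part of cummeRbund): TRUE only if the package is
# installed and its version is at least min_version.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# getSig() is described in this thread as available from cummeRbund 1.1.3 on.
if (has_min_version("cummeRbund", "1.1.3")) {
  message("cummeRbund is recent enough for getSig()")
} else {
  message("cummeRbund is missing or older than 1.1.3")
}
```

If this reports an older version, an old copy is probably still first in the library path; .libPaths() shows where R is looking.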

**kareldegendt** · 03-27-2012, 11:23 AM

OK, this works BUT:

I still did not get gene symbols (e.g. Akt or Bact), only the XLOC_000.... and uc00.... names...
Not really a solution :-/

K.

**shurjo** · 03-27-2012, 03:39 PM

Hi Karel,

Thanks for the tips. They worked for me as well.

Regards,

Shurjo

**Thomas Doktor** · 03-28-2012, 01:50 AM

You could do it with merge:

cuff <- readCufflinks()

# Retrieve significant gene IDs (XLOC) with a pre-specified alpha
diffGeneIDs <- getSig(cuff, level="genes", alpha=0.05)

# Use returned identifiers to create a CuffGeneSet object with all relevant info for given genes
diffGenes <- getGenes(cuff, diffGeneIDs)

# gene_short_name values (and corresponding XLOC_* values) can be retrieved from the CuffGeneSet by using:
# (stored as geneNames so that base R's names() is not masked)
geneNames <- featureNames(diffGenes)

# get the data for the significant genes
diffGenesData <- diffData(diffGenes)

# merge the two data frames on the gene identifier columns;
# unlike merging on row names, this still works if gene_id is repeated in diffGenesData
diffGenesOutput <- merge(geneNames, diffGenesData, by.x="tracking_id", by.y="gene_id")
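
For anyone who wants to try the merge step without a cuffdiff run, here is a minimal sketch on mock data. The two data frames are invented stand-ins for the output of featureNames() and diffData(); the column names sample_1, sample_2 and q_value are assumptions about what diffData() returns, and the gene names are made up. It also shows why merging on the ID columns is safer than setting row names: a gene_id that appears in more than one comparison cannot be used as a row name.

```r
# Mock stand-in for featureNames(diffGenes); values are invented.
geneNames <- data.frame(
  tracking_id     = c("XLOC_000001", "XLOC_000002"),
  gene_short_name = c("GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Mock stand-in for diffData(diffGenes); XLOC_000001 appears in two comparisons.
diffGenesData <- data.frame(
  gene_id  = c("XLOC_000001", "XLOC_000001", "XLOC_000002"),
  sample_1 = c("A", "A", "A"),
  sample_2 = c("B", "C", "B"),
  q_value  = c(0.001, 0.020, 0.004),
  stringsAsFactors = FALSE
)

# Repeated IDs mean gene_id could not be assigned as row names.
hasRepeatedIDs <- anyDuplicated(diffGenesData$gene_id) > 0

# Merge on the ID columns; every comparison row picks up its gene name.
diffGenesOutput <- merge(geneNames, diffGenesData,
                         by.x = "tracking_id", by.y = "gene_id")
print(diffGenesOutput)
```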

**lgoff** · 03-28-2012, 06:34 AM

Hi All,
Sorry I missed the earlier posts in this thread, and sorry for the trouble getting updates to cummeRbund installed, but it appears that you were both successful.

The reason for this setup is that a 'stable' version of cummeRbund (v1.0.0) has to be maintained with the current 'release' version of Bioconductor. Active development of new features (including getSig, etc.) is done on the 'development' version of Bioconductor, which is attached to the 'development' release of R (currently v2.15). When the new version of Bioconductor is released (in the next few weeks), all of the development features of cummeRbund will be available using the standard BioC install methods. The benefit of using the development version is obviously earlier access to newer features, but the drawback is of course a moderate amount of instability and growing pains. You can also install the development version of R, and the most recent version of cummeRbund will then be installed by biocLite() by default.

The way I am trying to write cummeRbund, the 'development' versions should also be compatible with earlier versions of R (at least 2.13 or greater).

To answer the question of gene names directly, they can always be accessed as part of the 'features' data.frame returned from a call to features() on a CuffData or CuffGeneSet object. The 'featureNames()' function is a shorthand that in most cases returns only the gene_id and gene_short_name (when present).

Another common way to represent the FPKM data is in a 'matrix' format of features x conditions. You can use fpkmMatrix(myGeneSet,fullnames=TRUE) to generate this matrix.

The general problem with using gene names is that they are inherently non-unique (despite efforts to enforce this). This causes significant problems for a lot of the behind-the-scenes data wrangling in both cufflinks/cuffdiff and cummeRbund. This is why the XLOC_* and TCONS_* ids are essential to track individual features. Our suggestion, as mentioned above, is to use the 'features()' method to get all annotation associated with features in a CuffData, CuffGeneSet, or CuffGene object. The output of this method should be a standard R data.frame on which you can do any manipulations/merges that you would like. Please let me know if you have specific workflows in which you are having difficulty mapping these ids to gene names and I can help with the syntax.

Cheers!

Loyal
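
To illustrate the point about non-unique gene names, here is a small base R sketch on a mock data frame. It is only a stand-in for what features() might return (the column names gene_id and gene_short_name are assumed, and the values are invented). It finds names shared by more than one XLOC id and builds a lookup keyed on the unique id instead.

```r
# Mock annotation table; two different loci carry the same short name.
featuresDF <- data.frame(
  gene_id         = c("XLOC_000001", "XLOC_000002", "XLOC_000003"),
  gene_short_name = c("GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Gene names that map to more than one XLOC id.
dupNames <- unique(
  featuresDF$gene_short_name[duplicated(featuresDF$gene_short_name)]
)

# Lookup keyed on the unique XLOC id, so every id resolves to exactly one name.
idToName <- setNames(featuresDF$gene_short_name, featuresDF$gene_id)
idToName[c("XLOC_000002", "XLOC_000003")]
```

Keying on the XLOC id and attaching the name as an extra column keeps both loci in the example; using the name as the key would have to drop or overwrite one of them.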

**lgoff** · 03-28-2012, 06:35 AM

Originally posted by Thomas Doktor:

You could do it with merge:

Thanks Thomas for posting this solution...

Cheers,
Loyal

**Kittykat22** · 03-30-2012, 06:12 PM

Hi everyone,
very helpful responses so far! Retrieving the gene names works great for me, but unfortunately only for differentially expressed genes. If I want to look at splicing or isoforms, I cannot get it to work. I just get a list of identifiers or, if I follow the approach above and use "Isoforms" or "Splicing" instead of "Genes", the list is empty.
I am very new to all these things, so I am not sure what I am doing wrong or what I have to do to determine the significantly differentially expressed isoforms, splicing, promoters,... and get a list including the gene names.
Will be very grateful for any help!
Cheers,
K

**kareldegendt** · 03-31-2012, 10:35 PM

Hi all,
It finally worked for me too. I re-ran tophat and cufflinks with the refFlat file as an annotation file and that did the trick :-)

Thanks for all your help!!

Karel

**lgoff** · 04-01-2012, 04:15 AM

Hi Kittykat22,
This is something that I'm actively working on including in a future release of cummeRbund. Right now, it's very easy to retrieve all information for a particular gene. However, several people have asked for a 'getFeatures()' method (similar to getGenes()) that would retrieve just the information you are looking for. I will try to post to this thread when I have it working, but please also keep checking the website for an update.

As an alternative, you can get the significantly different isoforms list by using getSig (level='isoforms'). And you can retrieve ALL isoform annotation by using 'features(isoforms(cuff))' and/or 'fpkm(isoforms(cuff))'. You should be able to filter those data.frames using the list generated from the call to getSig().

Cheers,
Loyal
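
The filtering step described above can be sketched in base R as follows. The data here is mock: sigIsoformIDs stands in for the vector returned by getSig(cuff, level='isoforms'), and isoformAnnot for the data.frame from features(isoforms(cuff)). The column names isoform_id, gene_id and gene_short_name are assumptions, and all values are invented.

```r
# Mock stand-in for getSig(cuff, level = "isoforms", alpha = 0.05).
sigIsoformIDs <- c("TCONS_00000002", "TCONS_00000003")

# Mock stand-in for features(isoforms(cuff)).
isoformAnnot <- data.frame(
  isoform_id      = c("TCONS_00000001", "TCONS_00000002", "TCONS_00000003"),
  gene_id         = c("XLOC_000001", "XLOC_000001", "XLOC_000002"),
  gene_short_name = c("GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Keep only the annotation rows whose isoform id is in the significant list.
sigIsoformAnnot <- isoformAnnot[isoformAnnot$isoform_id %in% sigIsoformIDs, ]
print(sigIsoformAnnot)
```

The same %in% filter should work on the fpkm(isoforms(cuff)) data.frame, provided it carries the isoform id in a column.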

cummeRbund - how to get gene name in diffData output
