SEQanswers

Old 04-21-2012, 07:24 AM   #1
rvaerle
Junior Member
 
Location: United Kingdom

Join Date: Oct 2011
Posts: 6
Default Some cummeRbund questions

I worked my way through the cummeRbund manual and so far I'm really impressed with it. However, I have a few questions as I'm new to R and I couldn't find the solution to these in the manual:

1). After finding similarly expressed genes using:

mySimilar<-findSimilar(cuff,"PINK1",n=20)

how do I write the expression data (with statistics) for these genes to a file?

2). How can I produce a heatmap of all differentially expressed genes?

3). I have a list of gene IDs in a text file - how can I produce a heatmap of these?
Old 04-21-2012, 02:32 PM   #2
rvaerle
Junior Member
 
Location: United Kingdom

Join Date: Oct 2011
Posts: 6
Default

I managed to create a heatmap for all differentially expressed genes (2 above) using a post by Loyal elsewhere in this forum:

cuff <- readCufflinks()

#Retrieve significant gene IDs (XLOC) with a pre-specified alpha
diffGeneIDs <- getSig(cuff,level="genes",alpha=0.05)

#Use returned identifiers to create a CuffGeneSet object with all relevant info for given genes
diffGenes<-getGenes(cuff,diffGeneIDs)

h<-csHeatmap(diffGenes,cluster='both')

However, I'm still struggling with writing similarly expressed genes to a text file and with importing a list of gene IDs from a file. I would appreciate it if someone could point me in the right direction. Thanks!
Old 04-22-2012, 05:21 PM   #3
lgoff
Member
 
Location: Cambridge, MA

Join Date: Feb 2008
Posts: 82
Default

Quote:
Originally Posted by rvaerle View Post
I worked my way through the cummeRbund manual and so far I'm really impressed with it. However, I have a few questions as I'm new to R and I couldn't find the solution to these in the manual:

1). After finding similarly expressed genes using:

mySimilar<-findSimilar(cuff,"PINK1",n=20)

how do I write the expression data (with statistics) for these genes to a file?

2). How can I produce a heatmap of all differentially expressed genes?

3). I have a list of gene IDs in a text file - how can I produce a heatmap of these?
Hi rvaerle,
Here's how you can do this with cummeRbund:

1)
Code:
>write.table(diffData(mySimilar),"mySimilar.diff")
>write.table(fpkm(mySimilar),"mySimilar.fpkm")
>write.table(features(mySimilar),"mySimilar.features")
2) It seems you got this one working already

3)
Code:
myIDs<-read.table("Ids.txt")
myGenes<-getGenes(cuff,myIDs)
heat<-csHeatmap(myGenes)
Please let me know if you need any additional help!

Cheers,
Loyal
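
A note on the files written in (1): by default `write.table` produces space-separated output with quoted strings and row names. Below is a minimal sketch of tab-delimited output instead. The data frame `exprs` and its columns are made up for illustration; in practice the input would be something like `diffData(mySimilar)` or `fpkm(mySimilar)`.

```r
# Stand-in for the data frame returned by diffData(mySimilar) or
# fpkm(mySimilar); column names and values here are illustrative only.
exprs <- data.frame(
  gene_id = c("XLOC_000001", "XLOC_000002"),
  fpkm    = c(12.5, 0.8),
  stringsAsFactors = FALSE
)

out_file <- file.path(tempdir(), "mySimilar.fpkm.tsv")

# Tab-separated, no quotes, no row names
write.table(exprs, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back to confirm the round trip
check <- read.delim(out_file, stringsAsFactors = FALSE)
print(check)
```

Files written this way open directly in a spreadsheet or can be read back with `read.delim`.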
Old 04-22-2012, 05:36 PM   #4
rvaerle
Junior Member
 
Location: United Kingdom

Join Date: Oct 2011
Posts: 6
Default

Many thanks for this, Loyal! It looks so easy after seeing the code... Also, thanks for this great package!

BW
Ronny
Old 05-29-2012, 06:45 AM   #5
waspboyz
Junior Member
 
Location: germany

Join Date: May 2012
Posts: 3
Default

Hello, this thread has already been extremely useful to me (as another novice in R and bioinformatics in general). I am also using the findSimilar function and have found some interesting results. My problem is that it mostly returns genes that cuffdiff deemed not significant (in general this set of genes is very lowly expressed, and any patterns detected are likely to be noise, I guess). If I increase the number of loci to return, then in addition to more of these noisy genes I also get more genes that might be "real" hits.
So my question is if there is any way to exclude the non-significant genes in the first place? Or can anyone suggest an efficient way to sort through the noise?
Thanks,
Jeremy
Old 06-04-2012, 05:23 AM   #6
lgoff
Member
 
Location: Cambridge, MA

Join Date: Feb 2008
Posts: 82
Default

Quote:
Originally Posted by waspboyz View Post
Hello, this thread has already been extremely useful to me (as another novice in R and bioinformatics in general). I am also using the findSimilar function and have found some interesting results. My problem is that it mostly returns genes that cuffdiff deemed not significant (in general this set of genes is very lowly expressed, and any patterns detected are likely to be noise, I guess). If I increase the number of loci to return, then in addition to more of these noisy genes I also get more genes that might be "real" hits.
So my question is if there is any way to exclude the non-significant genes in the first place? Or can anyone suggest an efficient way to sort through the noise?
Thanks,
Jeremy
Hi waspboyz,
findSimilar and getSig are essentially asking two very different questions of the data. Since different restrictions and filters can be applied to both questions, the best way to find the set of genes that you are interested in may be to generate a list of significant genes, generate a second list of similar genes, and then use intersect() to find the genes common to both lists.

Cheers,
Loyal
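
The intersect() approach described above can be sketched in base R as follows. The IDs are invented for illustration. With cummeRbund, `sigIDs` would presumably come from `getSig(cuff, level="genes", alpha=0.05)` as shown earlier in the thread, and `similarIDs` from the ID column of `features(mySimilar)`; the exact column name is an assumption to check against your own object.

```r
# Illustrative gene IDs only; replace with the output of getSig() and the
# IDs extracted from the findSimilar() result.
sigIDs     <- c("XLOC_000010", "XLOC_000042", "XLOC_000107")
similarIDs <- c("XLOC_000042", "XLOC_000055", "XLOC_000107", "XLOC_000200")

# Genes that are both significant and similar to the query profile
commonIDs <- intersect(sigIDs, similarIDs)
print(commonIDs)
```

The resulting character vector could then be passed to getGenes(), in the same way as diffGeneIDs in the heatmap example earlier in the thread.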
Old 06-04-2012, 11:39 AM   #7
wangli
Member
 
Location: Texas

Join Date: Apr 2012
Posts: 48
Default error in cummeRbund

Hi, I found this post very helpful. I am struggling with cummeRbund now. I tried some of the code listed here and ran into some errors.

> cuff_data <- readCufflinks('diff_out')
> csDensity(genes(cuff_data))
Error in dat$fpkm + pseudocount : non-numeric argument to binary operator

> diffGeneIDs <- getSig(cuff_data, level="genes", alpha=0.05)
> diffGenes <- getGenes(cuff_data, diffGeneIDs)
Error in sqliteExecStatement(conn, statement, ...) :
RS-DBI driver: (RS_SQLite_exec: could not execute1: cannot start a transaction within a transaction)

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] cummeRbund_1.2.0 reshape2_1.2.1 ggplot2_0.9.1 RSQLite_0.11.1
[5] DBI_0.2-5 BiocInstaller_1.4.6

loaded via a namespace (and not attached):
[1] MASS_7.3-18 RColorBrewer_1.0-5 colorspace_1.1-1 dichromat_1.2-4
[5] digest_0.5.2 grid_2.15.0 labeling_0.1 memoise_0.1
[9] munsell_0.3 plyr_1.7.1 proto_0.3-9.2 scales_0.2.1
[13] stringr_0.6 tools_2.15.0

Does anyone have any idea how to get rid of these errors?
Thanks
Li
Old 06-05-2012, 11:27 AM   #8
RUlearner
Junior Member
 
Location: New York

Join Date: Jun 2012
Posts: 1
Default

Hi all,

I am getting the same error as Li when trying to run csDensity with cummeRbund. Any suggestions?

> epi <- readCufflinks( )
> epi
CuffSet instance with:
4 samples
25078 genes
42661 isoforms
30261 TSS
24612 CDS
150468 promoters
181566 splicing
121338 relCDS
> dens <- csDensity(genes(epi))
Error in dat$fpkm + pseudocount : non-numeric argument to binary operator

Thanks so much
Old 06-15-2012, 10:19 AM   #9
lgoff
Member
 
Location: Cambridge, MA

Join Date: Feb 2008
Posts: 82
Default Version info...

Hi Wangli and RULearner,
Can you provide me with version information for both cuffdiff and cummeRbund so that I can try to help? Also, if either of you wants to email me your cuffData.db files so that I can reproduce the error, I can try to see what's going on.

Cheers,
Loyal
Old 06-20-2012, 02:26 PM   #10
Starr_Hazard
Member
 
Location: South Carolina

Join Date: Nov 2010
Posts: 19
Default Reading in a table of Gene names for cummeRbund analysis

I want to examine the CuffDiff/cummeRbund analyses for a list of genes that other RNAseq software has flagged as significant for my data.

I made both plain text and csv tables of the gene names
say,

Fcrls
Ndst4
Prokr2
Grm2
Snap25

Then I use read.table to enter the names to R/cummeRbund as suggested by Loyal

Code:
myIDs<-read.table("Ids.txt")
myGenes<-getGenes(cuff,myIDs)
heat<-csHeatmap(myGenes)
The heat map is never produced and there are no errors.

When I then attempt to examine the CuffGeneSet "myGenes"
it contains only a single gene (the first one in the list)

myGeneFeatureNames<-featureNames(myGenes)

the object gives only the XLOC and gene_short_name for the first entry of the list

if I

print(myIDs)

I see every entry. So read.table works ( so does read.csv)

I can however use this syntax

myGeneID<-myIDs[[i,1]]
myGene<-getGene(cuff,myGeneID)

to get every gene one at a time and I can make isoform plots for example.

What more can I do to get a list of gene names entered as a CuffGeneSet?
Old 07-11-2012, 04:05 PM   #11
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Hi all,

I am having the same problem as described in the last post: I can see all my genes when I run:

print(myIDs)

Then if I produce the heatmap I have the heatmap only of the first gene.

Has anyone found a solution already? Thanks!

Federica
Old 07-11-2012, 04:21 PM   #12
lgoff
Member
 
Location: Cambridge, MA

Join Date: Feb 2008
Posts: 82
Default

Hi Starr and fed,

myIDs must be a vector of gene IDs. If you are reading these in with read.table, then myIDs is most likely a data.frame. The following can confirm this:

is.vector(myIDs)

If this is FALSE, then getGenes will not read the list of identifiers correctly, and will most likely only use the first gene. You can probably do something like the following:

myIDs<-as.vector(myIDs$V1)

This coerces the first column of the data.frame (named V1 by default; note that R is case-sensitive) to a vector.

Cheers,
Loyal
Old 07-11-2012, 05:35 PM   #13
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Hi Loyal,

Here is what I got:

> is.vector(myIDs)
[1] FALSE
> myIDs<-as.vector(myIDs$v1)

But it still isn't a vector. Am I doing something wrong?

> is.vector(myIDs)
[1] FALSE

Fed
Old 07-11-2012, 05:44 PM   #14
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Actually, when I do it, myIDs becomes 'empty' (NULL), and the list disappears:

> myIDs<-as.vector(myIDs$v1)
> is.vector(myIDs)
[1] FALSE
> myIDs
NULL

Fed
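
A likely explanation for the NULL above: when a file has no header, `read.table` names the columns `V1`, `V2`, ... with a capital V, and R is case-sensitive, so `myIDs$v1` refers to a column that does not exist. The sketch below reproduces this with a small temporary file; the file name and gene names are only for illustration.

```r
# Write a small one-column ID file to a temporary location
id_file <- file.path(tempdir(), "Ids.txt")
writeLines(c("Fcrls", "Ndst4", "Prokr2"), id_file)

myIDs <- read.table(id_file, stringsAsFactors = FALSE)

print(names(myIDs))          # default column name assigned by read.table
print(is.null(myIDs$v1))     # lowercase name: no such column
ids <- as.vector(myIDs$V1)   # uppercase V1 is the actual column
print(is.vector(ids))
```

Checking `names(myIDs)` first is a quick way to see which column names are actually available before extracting one.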
Old 07-11-2012, 05:58 PM   #15
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Hi,

trying to get the heatmap of the significant genes:

> diffGeneIDs <- getSig(cuff_data,level="genes",alpha=0.0001)
> diffGenes<-getGenes(cuff_data,diffGeneIDs)
> h<-csHeatmap(diffGenes,cluster='both')
> h

I get back a heatmap where each gene name is preceded by NA|, but my gene.diff doesn't have any NA fields.

Fed
Old 07-11-2012, 06:11 PM   #16
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Actually I noticed that my 'gene' column has a '-' all the way.
I think I am missing the gene short name.
Fed
Old 07-12-2012, 12:41 AM   #17
rboettcher
Member
 
Location: Berlin

Join Date: Oct 2010
Posts: 71
Default

Try using unlist() before as.vector().
Old 07-12-2012, 09:43 AM   #18
ftorri
Member
 
Location: Orange County

Join Date: Oct 2010
Posts: 11
Default

Thanks for your reply!
I tried, but it is still not a vector:

> myIDs<-read.table("Ids_test.txt")
> unlist()
Error in unlist() : argument "x" is missing, with no default
> unlist(myIDs)
V11 V12 V13 V14 V15 V16 V17 V18 V19 V110 V111 V112 V113 V114 V115 V116 V117 V118 V119
DUSP4 KLF4 RALGAPA1 PCDH11X PTCH1 RAB3IP PALM2 LMLN CDK12 PIP5K1A CSNK1G3 ARSA AKAP2 RNF182 TP73-AS1 RXFP1
{...}
> myIDs<-read.table("Ids_test.txt")
> is.vector(myIDs)
[1] FALSE

Federica
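
In the session above, the result of `unlist(myIDs)` is printed but never assigned, and the table is then read in again, so `myIDs` is a data.frame once more when `is.vector` is called. One way to avoid the data.frame altogether is base R's `scan`, which reads whitespace-separated tokens straight into a vector. A sketch with an illustrative temporary file:

```r
# Illustrative ID file, one gene name per line
id_file <- file.path(tempdir(), "Ids_test.txt")
writeLines(c("DUSP4", "KLF4", "PTCH1"), id_file)

# what = character() makes scan() return a character vector
ids <- scan(id_file, what = character(), quiet = TRUE)
print(is.vector(ids))
print(ids)

# Equivalent route via read.table, this time keeping the assignment
ids2 <- as.character(unlist(read.table(id_file, stringsAsFactors = FALSE),
                            use.names = FALSE))
```

Either `ids` or `ids2` is a plain character vector of the kind getGenes is said to expect earlier in the thread.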
Old 07-16-2012, 11:15 AM   #19
billstevens
Senior Member
 
Location: Baltimore

Join Date: Mar 2012
Posts: 120
Default

Hey guys,

Quick question. When I use csScatter following the Nature Protocols paper, it takes a while (3-5 minutes) for the image to load. Is there a way to output the image to a file and not have it load up right then, thereby saving time?
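
One general R approach to this question is to open a file graphics device before plotting and close it afterwards, so nothing is drawn on screen. The sketch below uses a base-graphics stand-in for the scatter plot. It assumes that the csScatter result can be stored in a variable and drawn with `print()` like other ggplot2 objects; the sample names in the comment are placeholders.

```r
# With cummeRbund this might look like (assumption, placeholder sample names):
#   s <- csScatter(genes(cuff), "sampleA", "sampleB")
out_file <- file.path(tempdir(), "scatter.pdf")

# Stand-in data for a dense scatter plot
set.seed(1)
x <- rnorm(1000)
y <- x + rnorm(1000, sd = 0.3)

pdf(out_file, width = 6, height = 6)   # open a file device instead of the screen
plot(x, y, pch = 16, cex = 0.4)        # for a ggplot object, call print(s) here
invisible(dev.off())                   # close the device to finish writing the file
```

`pdf()` is used here because it is available in every R build; `png()` follows the same open, plot, `dev.off()` pattern and usually gives a smaller file for plots with many points.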
Old 07-17-2012, 12:10 PM   #20
aslihan
Member
 
Location: USA

Join Date: Jun 2011
Posts: 23
Default

I have the same problem as Starr and Fed.


myIDs<-read.table("gene_name.txt")

It works. There are two columns: the first is a number and the second is the gene_short_name.

myGenes<-getGenes(cuff_data,myIDs)
heat<-csHeatmap(myGenes)

This only gives a heatmap for the first gene!
I tried what you said.

> myIDs<-read.table("gene_name.txt")
> myIDs<-as.vector(myIDs$v1)
> myGenes<-getGenes(cuff_data,myIDs)
Error in sqliteExecStatement(conn, statement, ...) :
RS-DBI driver: (RS_SQLite_exec: could not execute1: cannot start a transaction within a transaction)


How can I solve this problem?

Thanks a lot.
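
With a two-column file like the one described above, the gene names sit in the second column, which `read.table` names `V2` by default, so extracting the first column would give the numbers rather than the names. A sketch of pulling out the right column, using an illustrative temporary file (whether this also resolves the SQLite error is not established in this thread):

```r
# Illustrative two-column file: a row number, then the gene_short_name
gene_file <- file.path(tempdir(), "gene_name.txt")
writeLines(c("1 Fcrls", "2 Ndst4", "3 Snap25"), gene_file)

myIDs <- read.table(gene_file, stringsAsFactors = FALSE)
str(myIDs)              # V1 holds the numbers, V2 holds the gene names

geneNames <- myIDs$V2   # myIDs[[2]] selects the same column by position
print(geneNames)
```

`geneNames` is then a character vector that can be checked with `is.vector` before it is handed to getGenes.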