SEQanswers

06-13-2012, 01:33 AM   #1
SanderEST
Junior Member
Location: Tartu, Estonia
Join Date: Jun 2012
Posts: 4
GOSeq analysis problem with geneLenDataBase

Hi!

I am analysing my RNA-Seq data. When I reached the GOSeq step I ran into a problem: there is no built-in support for the genome and gene ID combination I need. The error message:

> pwf=nullp(genes,"mm10","refGene")
Error in getlength(names(DEgenes), genome, id) :
Length information for genome mm10 and gene ID refGene is not in the geneLenDataBase database. You will have to specify bias.data manually.

Is it possible to add length data manually, and if so, how should I proceed? I could easily get the lengths from the reference files, but how can I import them into GOSeq? Or is my only option to wait for an update of geneLenDataBase?

Thanks in advance!

06-13-2012, 05:52 AM   #2
dariober
Senior Member
Location: Cambridge, UK
Join Date: May 2010
Posts: 311

Hi,

If you can get gene length data, you can pass it as a numeric vector to the bias.data argument of nullp. The required format is (from http://www.bioconductor.org/packages.../doc/goseq.pdf):

Quote:
5.1 Length data format
The length data must be formatted as a numeric vector, of the same length as the main named vector specifying gene names/DE genes. Each entry should give the length of the corresponding gene in bp. If length data is unavailable for some genes, that entry should be set to NA.
Good luck!
Dario
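
The format quoted above can be illustrated with a minimal sketch in base R. The gene names and lengths below are made-up toy values, and `known.lengths` is a hypothetical name for whatever lengths were extracted from the reference files. The point is only to show one way of putting the lengths in the same order as the gene vector, with NA where a length is missing.

```r
# Hypothetical toy data: 0/1 vector named by gene ID (1 = differentially expressed).
genes <- c(geneA = 1, geneB = 0, geneC = 0, geneD = 1)

# Hypothetical lengths in bp from an annotation file; geneC is missing
# and the order differs from `genes`.
known.lengths <- c(geneD = 2100, geneA = 1500, geneB = 980)

# Indexing by name reorders the lengths to match `genes`
# and yields NA for genes without a known length.
bias.data <- unname(known.lengths[names(genes)])

# Same number of entries as the gene vector, as the manual requires.
stopifnot(length(bias.data) == length(genes))
print(bias.data)

# With goseq installed, the vector would then be passed on as:
# pwf <- nullp(genes, bias.data = bias.data)
```

Indexing by name rather than by position avoids silently pairing a gene with the wrong length when the two files are sorted differently.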

06-13-2012, 06:16 AM   #3
SanderEST
Junior Member
Location: Tartu, Estonia
Join Date: Jun 2012
Posts: 4

Thanks a lot! I should have read the manual more carefully. It actually resolves my current issue clearly, but thank you for pointing that out!

Sander

03-16-2013, 04:16 AM   #4
jfrias
Junior Member
Location: Boston
Join Date: Aug 2012
Posts: 1

Dear Dario,

I was having the same problem Sander had, and even after following the format described in the manual I could not get rid of it. I really do not know what I am doing wrong.
I created a mock set of results to test the procedure. This is the code I am using:

> de.genes <- scan("de_genes.txt", what=character() )
Read 27 items
> assayed.genes <- scan("all_genes.txt", what=character() )
Read 37 items
> gene.length=scan("gene_lengths.txt", what=numeric() )
Read 27 items
> names(gene.vector) = assayed.genes
> pwf=nullp(gene.vector,bias.data=gene.length)
Error in nullp(gene.vector, bias.data = gene.length) :
bias.data vector must have the same length as DEgenes vector!

R is telling me that de.genes and gene.length have the same size, but it still gives me the error message. I would really appreciate it if someone could help me with this problem.

Thanks

Jorge
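
One possible reading of the transcript above, offered as an assumption rather than a confirmed diagnosis: the error message compares bias.data with the vector passed as the first argument of nullp (here gene.vector, named by the 37 assayed genes), not with de.genes, so 27 lengths cannot match. The sketch below uses made-up stand-ins for the three files and builds the 0/1 vector with the `%in%` idiom, with one length per assayed gene.

```r
# Hypothetical stand-ins for the contents of the three files in the post.
assayed.genes <- paste0("gene", 1:37)                   # all assayed genes
de.genes      <- assayed.genes[1:27]                    # the DE subset
gene.length   <- seq(500, by = 100, length.out = 37)    # one length per ASSAYED gene

# 0/1 vector over all assayed genes, named by gene ID.
gene.vector <- as.integer(assayed.genes %in% de.genes)
names(gene.vector) <- assayed.genes

# Both vectors now have one entry per assayed gene.
stopifnot(length(gene.vector) == length(gene.length))

# With goseq installed, the call from the post would then be:
# pwf <- nullp(gene.vector, bias.data = gene.length)
```

The lengths must also be in the same order as assayed.genes; if the length file is ordered differently, matching by gene name before the call is safer than relying on position.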

10-24-2013, 09:26 AM   #5
thanhhoang
Member
Location: Ohio, USA
Join Date: Jul 2013
Posts: 16

Hi guys,
I have a similar problem when working with GOSeq. The mm10 genome is supported, but my gene ID type is not (I am using geneSymbol).
I tried to get the length information by following the goseq manual, but I still don't understand how. Could you please show me how to get the length information for the mm10 genome and the geneSymbol gene ID?

> genes = as.integer(all.genes %in% F.genes)
> names(genes) = all.genes
> head(genes)
Cryba1 Cryba4  Cryga  Crygb  Crygc  Crygd
     1      1      1      1      1      1
> pwf=nullp(genes,"mm10", "geneSymbol")
Can't find mm10/geneSymbol length data in genLenDataBase... Trying to download from UCSC. This might take a couple of minutes.
Error in value[[3L]](cond) :
Length information for genome mm10 and gene ID geneSymbol is not available. You will have to specify bias.data manually.

Thank you so much
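
A minimal base R sketch of one way to build bias.data keyed by gene symbol, under stated assumptions: the data frame `tx` is a hypothetical stand-in for a transcript annotation table (for example one exported from a genome browser), its column names and values are invented, and taking the median transcript length per gene is only one possible convention for collapsing several transcripts into a single gene length.

```r
# Hypothetical transcript table; column names and values are assumptions.
tx <- data.frame(
  symbol = c("Cryba1", "Cryba1", "Cryba4", "Cryga"),
  length = c(900, 1100, 1200, 700),   # transcript length in bp (made-up)
  stringsAsFactors = FALSE
)

# One length per gene symbol: the median over that gene's transcripts.
len.by.symbol <- vapply(split(tx$length, tx$symbol), median, numeric(1))

# 0/1 vector built as in the post (Crygb has no entry in the toy table).
all.genes <- c("Cryba1", "Cryba4", "Cryga", "Crygb")
F.genes   <- c("Cryba1", "Cryga")
genes <- as.integer(all.genes %in% F.genes)
names(genes) <- all.genes

# Align the lengths to the gene vector; symbols without a length become NA.
bias.data <- unname(len.by.symbol[names(genes)])
print(bias.data)

# With goseq installed, the lengths would be supplied manually as:
# pwf <- nullp(genes, bias.data = bias.data)
```

Because the genome and ID arguments only serve to look up lengths, they can be left out of the nullp call once bias.data is supplied by hand.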

Tags
genelendatabase, goseq

All times are GMT -8.