SEQanswers

Old 10-09-2012, 10:10 PM   #1
svj
Junior Member
 
Location: USA

Join Date: Jul 2012
Posts: 8
Eukaryotic ORF finder

Hi All,

I am looking for a eukaryotic ORF finder algorithm or source code. I am trying to build a training model for an unknown eukaryotic genome using GlimmerHMM, and I need to collect ORFs for the training set. I ran BLASTp against known eukaryotic protein sequences (from the closest neighbour of the unknown eukaryote), but I am unable to build the training model with the resulting ORFs. The output I get from trainGlimmerHMM is:
Training data created successfully! Check exons.dat and seqs for accuracy.


Acceptor sites for training: 18292
False acceptor sites for training: 853751
Donor sites for training: 18219
False donor sites for training: 672464


ERROR 69: /GlimmerHMM/train/score exited funny: 35584


If this is the right way to build a training model, can anyone help me with this error? If not, what should I do instead? Should I look for acceptor and donor sites upstream and downstream of the ORFs I got from BLASTp?
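For reference, a naive ORF scan over all six reading frames can be sketched as below. This is only a minimal illustration of the general idea and not part of GlimmerHMM: the function names (`find_orfs`, `reverse_complement`) and the `min_len` threshold are made up for this example, and the scan ignores introns, so on eukaryotic genomic DNA it will only find uninterrupted ATG-to-stop stretches, not full multi-exon genes.

```python
from typing import Iterator, Tuple

STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _orfs_one_strand(seq: str, min_len: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) of ATG..stop ORFs in the three frames of one strand."""
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                end = i + 3                    # include the stop codon
                if end - start >= min_len:
                    yield start, end
                start = None                   # look for the next ORF


def find_orfs(seq: str, min_len: int = 300) -> Iterator[Tuple[str, int, int]]:
    """Yield (strand, start, end) with 0-based, half-open forward-strand coordinates."""
    n = len(seq)
    for s, e in _orfs_one_strand(seq, min_len):
        yield "+", s, e
    for s, e in _orfs_one_strand(reverse_complement(seq), min_len):
        yield "-", n - e, n - s                # map back to forward coordinates


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG", min_len=9):
        print(orf)
```

The coordinates are reported on the forward strand for both strands, which makes it easier to compare hits against BLAST results on the same contig.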
Old 07-16-2013, 11:07 AM   #2
dong01
Junior Member
 
Location: sd

Join Date: Jun 2011
Posts: 4
Default

Have you solved this problem?
Old 05-15-2014, 04:33 AM   #3
hi-koike
Member
 
Location: Japan

Join Date: Jul 2013
Posts: 13
Question

I would like to know whether anyone has solved this problem.

Thanks in advance,
Hideaki
Old 08-19-2014, 02:15 AM   #4
MVictoria
Junior Member
 
Location: Utrecht

Join Date: Aug 2014
Posts: 5
Default

Hi!!

Did you manage to solve this problem?

I am getting a similar error:

Simple Consensus = cgttgtggtggtgggggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtggtgg
Markov Consensus = ggatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatgatg
******** Old Way = ctctgaggatgatgaggatgatgatgatgatgatgatgatgatgatgagatgatgatgatgatgatgatgatgatgatga
Segmentation fault (core dumped)
ERROR 69: /media/sdb1/genome_assembly/GlimmerHMM/train/score exited funny: 35584 at ./../trainGlimmerHMM line 445.
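The number 35584 in the error above can be decoded. The message format suggests (this is an assumption) that trainGlimmerHMM is a Perl script reporting the raw wait status of the `score` program, where the high byte is the exit code and the low bits hold a signal number. A small sketch of that decoding, with the hypothetical helper name `decode_wait_status`:

```python
def decode_wait_status(status: int) -> dict:
    """Split a 16-bit wait status (as in Perl's $?) into its fields."""
    exit_code = status >> 8          # high byte: exit code of the child
    sig = status & 0x7F              # low 7 bits: signal that killed the child
    core = bool(status & 0x80)       # bit 7: core dump flag
    info = {"exit_code": exit_code, "signal": sig, "core_dumped": core}
    # When the command is run through a shell, the shell reports a child
    # killed by signal N as exit code 128 + N.
    if sig == 0 and exit_code > 128:
        info["shell_reported_signal"] = exit_code - 128
    return info


if __name__ == "__main__":
    print(decode_wait_status(35584))
```

For 35584 this gives exit code 139, i.e. 128 + 11, and signal 11 is SIGSEGV on Linux. That is consistent with the "Segmentation fault (core dumped)" line printed just before the error, so the number itself carries no extra information beyond the crash.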


And the log file:


Code:
more TrainGlimmM2014-08-18D15\:53\:24.log
    Training data created successfully! Check exons.dat and seqs for accuracy.


    Acceptor sites for training: 35581
    False acceptor sites for training: 412224
    Donor sites for training: 35572
    False donor sites for training: 410763
The training files look like this:
Code:
1. mfasta

    >supercontig_01
    GATCATACAAATCATCCCCTTGGCCTCTGTTAGCCTTCTGCGATCTATCGTGCTCGGAGCAGCTGCAAGC
    CCCGCCAAGTGACAATCCGAAACGGACTCAATAAGATTTGGCGTTGTCGACTTCATTTCAGTTCCGCCGA
    CCTTCCAGCTGCAGCTATCGACTGTCGAAGCCGACCCTCCACGAGTCAAACAGATTGGAAACGATAATAA
    ACCGATCTCCCGAGATAAGAATGGCGCTTTGGTCAAACATGAAGGCGTGAGTGAACACTCTGCTGACTTC
    ATGTAAGTGAGGAGAATATCGCTAAATGTGATACGGACATGACATTAGACTTGCAACAGAAAGAATAATA
    CATGCAGGTCCGAGATGAACAACGAGACAAACCTTGTGTGGTGCTCAACATAGTTTGCTAATAGAAACGT
    GATTGACCGTCACATGGCTCCTTGACTGTCTAGATACATCCGGCTGATCATACTTTGTTCTAGTGTATCC
    ATGACGGAGAAAAGTGCATTTATGATTTTTATGATCGATCTGTTGAATGCCAATAGGCACTTGCGGCTGG
    CCGGCGGAATTGGAAAGGAGCAGGTAGCACTCAACATCAGAGGTGTAACAACCAGCGAACCCATTCAACG
    TTGGAGTCATTTATTGTTTATCTCCGCTCTAGTTTCAGTTTCCTCTCGCGACTTGCTTGTTTGTATCTGA
    GTAAGCACCCGATAATAAAGTAGTTGTCATCACTGGCTTGAAAAATCAAACAATTACTCGCATCTCGCGA
    GAAAGAACAGACTGCTCGTAACAAGCAAGCAAACGCCAAGCTCTTATTCAGATAACATTACTGGATCCCC
    TTCTGCTATCTGATTTATTTAGTGACTGGTCCCGGGCCCGAAGCCGCCACCCTGTGCCACCTCATTTTAA


2. exon file

    supercontig_01 678584 678745
    supercontig_01 678804 678855
    supercontig_01 678924 679629
    supercontig_01 679711 679801

    supercontig_01 681196 681196
    supercontig_01 681108 681102
    supercontig_01 680978 680798
    supercontig_01 680562 680452
    supercontig_01 680342 680256

    supercontig_01 683416 683414
    supercontig_01 683197 682953
    supercontig_01 682896 682791
    supercontig_01 682737 682599
    supercontig_01 682548 682162
    supercontig_01 682111 681695
    supercontig_01 681579 681549
    supercontig_01 681489 681408
    supercontig_01 681372 681265
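
Independently of the OS question, the two training files can be checked against each other before a long run. The sketch below assumes the layout shown above (one `name start end` exon per line, genes separated by blank lines, start greater than end for the reverse strand). It does not diagnose the segmentation fault; it only flags exons whose sequence name is missing from the FASTA file, whose coordinates fall outside the sequence, or whose direction disagrees with the other exons of the same gene. The function names are invented for this example.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for a multi-FASTA given as lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def check_exons(exon_lines: Iterable[str], lengths: Dict[str, int]) -> List[str]:
    """Return a list of human-readable problems found in the exon file."""
    problems: List[str] = []
    gene, exons_in_gene, strand = 1, 0, None
    for lineno, raw in enumerate(exon_lines, start=1):
        fields = raw.split()
        if not fields:                       # blank line closes the current gene
            if exons_in_gene:
                gene += 1
                exons_in_gene, strand = 0, None
            continue
        if len(fields) != 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            problems.append(f"line {lineno}: expected 'name start end'")
            continue
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        exons_in_gene += 1
        if name not in lengths:
            problems.append(f"line {lineno}: sequence {name!r} not in FASTA")
        elif not (1 <= start <= lengths[name] and 1 <= end <= lengths[name]):
            problems.append(f"line {lineno}: coordinates outside 1..{lengths[name]}")
        if start != end:                     # single-base exons carry no direction
            this = "+" if start < end else "-"
            if strand is None:
                strand = this
            elif this != strand:
                problems.append(
                    f"line {lineno}: strand differs from earlier exons of gene {gene}")
    return problems
```

With real files, `read_fasta_lengths(open("genome.mfasta"))` and `check_exons(open("exons.dat"), lengths)` would be the typical calls; both file names here are placeholders.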

Thanks in advance!!
Victoria
__________________
--
M. Victoria Aguilar Pontes
PhD student, Fungal Physiology

CBS-KNAW FUNGAL BIODIVERSITY CENTRE
Institute of the Royal Netherlands Academy of Arts and Sciences(KNAW)
Fungal Molecular Physiology, Utrecht University
v.aguilar@cbs.knaw.nl
Old 08-19-2014, 09:08 AM   #5
hi-koike
Member
 
Location: Japan

Join Date: Jul 2013
Posts: 13
Lightbulb

Hi Victoria,

Which Linux OS did you use?
I tried to run the training on Ubuntu, but it failed.
Then I tried it on CentOS and it worked.

I am not sure of the reason, but I managed to solve the problem that way.
Once the training succeeded, I could run glimmer with the trained files on Ubuntu.

Cheers,
Hideaki
Old 08-20-2014, 01:48 AM   #6
MVictoria
Junior Member
 
Location: Utrecht

Join Date: Aug 2014
Posts: 5
Default

Hi Hi-koike,

I am using a server running Ubuntu 12.04.5 LTS precise.

I also tried to train on another dataset, and after running for 4 days I got the same error. Any ideas?

Thanks in advance,
Victoria
Old 08-20-2014, 07:27 AM   #7
hi-koike
Member
 
Location: Japan

Join Date: Jul 2013
Posts: 13
Default

Hi Victoria,

Can you run glimmer using already trained files?
If you can, it might be the same problem I experienced.

Can you get a computer running CentOS or Red Hat?

I used an old computer that formerly ran Windows.
CentOS is easy to install, and you can then install glimmer on it.

You might need to install some libraries (I forget the exact names,
but you can find them by searching the web for the error message).
In my case, training on CentOS finished within a day.

Cheers,
Hideaki
Old 08-21-2014, 01:37 AM   #8
MVictoria
Junior Member
 
Location: Utrecht

Join Date: Aug 2014
Posts: 5
Default

Hi Hi-koike,

I ran trainGlimmerHMM on our server (Ubuntu 12.04.5 LTS precise) with both the trained files and my own files, and I always got the same error (see my previous post).

Now I am running trainGlimmerHMM on my own computer, which also uses Ubuntu 12.04.5 LTS precise, and at least the trained files work there. I am now waiting for the results with my own files, but this might take longer.

As a backup plan, I am installing CentOS in VirtualBox, just in case.

Thank you very much for your help.

Victoria
Old 08-22-2014, 05:02 AM   #9
MVictoria
Junior Member
 
Location: Utrecht

Join Date: Aug 2014
Posts: 5
Default

Hi Hi-koike,

As I said before, I got trainGlimmerHMM running with the example data on Ubuntu 12.04.5 LTS precise, but it crashes with my own files. It is always the same error.

Now I am running the example file on CentOS 7 and I get the same error. Do you remember which CentOS version you used?

Thanks

Victoria

Last edited by MVictoria; 08-22-2014 at 06:20 AM.
Old 08-29-2014, 08:40 AM   #10
hi-koike
Member
 
Location: Japan

Join Date: Jul 2013
Posts: 13
Default

Hi Victoria,

I am sorry to hear that you could not run it on CentOS either.

I am not sure which version of CentOS I used, because I am traveling
abroad. It might be CentOS 6, because I installed it in April.

I have succeeded in running it on two Red Hat machines and one CentOS machine,
but it failed on two Ubuntu machines.

On one Red Hat machine, I could not run it because the machine did not
have libstdc++ installed.

If you got the same error, the problem might be different from mine.
I am very sorry that I cannot help.

Best regards,
Hideaki
All times are GMT -8.

