SEQanswers

05-29-2014, 12:15 PM   #1
greigite
Senior Member
Location: Cambridge, MA
Join Date: Mar 2009
Posts: 141

RDP classifier training issues

I am trying, without success, to retrain the RDP classifier (version 2.7) on the UNITE fungal database. I reformatted the data and generated a custom tax file with entries like this, following the examples in the sample files folder:
Code:
0*Root*-1*0*rootrank
1*Fungi*0*1*domain
2*Chytridiomycota*1*2*phylum
3*Neocallimastigomycetes*2*3*class
4*Neocallimastigales*3*4*order
5*Neocallimastigaceae*4*5*family
6*Piromyces*5*6*genus
7*Piromyces_sp_I_GRL_10*6*7*species
8*Piromyces_sp_D_GRL_5*6*7*species
9*Piromyces_sp_AF_CTS_BTP1*6*7*species
10*Orpinomyces*5*6*genus
11*Orpinomyces_sp_NIANP60*10*7*species
12*Orpinomyces_sp_AF_CTS_BTO1*10*7*species
13*Orpinomyces_sp_AF_CTS_CHO3*10*7*species
I reformatted the headers in the UNITE file to look like this:
Code:
>Phaeoacremonium_pallidum|EU128053|SH114132.06FU|refs|r__Root;d__Fungi;p__Ascomycota;c__Sordariomycetes;o__Diaporthales;f__Togniniaceae;g__Phaeoacremonium;s__Phaeoacremonium_pallidum;
However, running the following command
Code:
java -jar rdp_classifier_2.7/dist/classifier.jar train --seq sh_input.fasta -t rdp_tax_file.txt -o ~/rdp_classifier/
gives the following error:
Code:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:504)
        at java.lang.Integer.<init>(Integer.java:677)
        at edu.msu.cme.rdp.classifier.train.TreeFactory.addSequencewithTaxid(TreeFactory.java:157)
        at edu.msu.cme.rdp.classifier.train.TreeFactory.addSequence(TreeFactory.java:141)
        at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:72)
        at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:171)
        at edu.msu.cme.rdp.classifier.cli.ClassifierMain.main(ClassifierMain.java:60)
Can anyone help me figure this out? I am fairly sure I am missing something obvious, such as needing to change the FASTA headers to match the numbers in the tax file.
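One way to narrow this down is to check the two inputs independently before handing them to the trainer. The sketch below is only a sanity check, not part of the RDP tools. It assumes the five star-separated fields in the tax file are taxid, name, parent taxid, depth and rank (as the sample lines above suggest), and that the lineage sits in the last "|" field of the header. The function names are made up for illustration.

```python
from typing import Dict, List, NamedTuple


class Taxon(NamedTuple):
    taxid: int
    name: str
    parent: int
    depth: int
    rank: str


def parse_tax_lines(lines: List[str]) -> Dict[int, Taxon]:
    """Parse 'taxid*name*parent*depth*rank' lines into a dict keyed by taxid."""
    taxa: Dict[int, Taxon] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed field order, inferred from the sample lines in this post.
        taxid, name, parent, depth, rank = line.split("*")
        taxa[int(taxid)] = Taxon(int(taxid), name, int(parent), int(depth), rank)
    return taxa


def check_tree(taxa: Dict[int, Taxon]) -> List[str]:
    """Report entries whose parent is missing or whose depth is not parent depth + 1."""
    problems: List[str] = []
    for t in taxa.values():
        if t.parent == -1:
            continue  # the root entry has no parent
        parent = taxa.get(t.parent)
        if parent is None:
            problems.append(f"{t.taxid}: parent {t.parent} not found")
        elif t.depth != parent.depth + 1:
            problems.append(f"{t.taxid}: depth {t.depth}, parent depth {parent.depth}")
    return problems


def header_lineage(header: str) -> List[str]:
    """Extract taxon names from the last '|' field of a UNITE-style header."""
    lineage_field = header.lstrip(">").split("|")[-1]
    names: List[str] = []
    for part in lineage_field.split(";"):
        if part:  # skip the empty piece after the trailing ';'
            names.append(part.split("__", 1)[1] if "__" in part else part)
    return names
```

Running `check_tree(parse_tax_lines(open("rdp_tax_file.txt").read().splitlines()))` should return an empty list for a consistent file, and comparing `header_lineage(...)` against the names in the tax file shows which lineage entries have no matching taxon. Whether the trainer also expects a different header layout is not settled in this thread, so treat this as a first diagnostic step only.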
03-16-2016, 12:57 PM   #2
yingeddi2008
Junior Member
Location: Denton, TX
Join Date: Oct 2013
Posts: 6

I am trying to use the SILVA database to train the RDP classifier to species level. I don't even know how the taxonomy file is generated, and the RDP README file doesn't say either... I don't see how this retrain feature is useful.
07-26-2016, 11:49 AM   #3
bio_informatics
Senior Member
Location: USA
Join Date: Nov 2013
Posts: 182

Did you get any leads on this? I have a similar query on this page.
__________________
Bioinformaticscally calm
10-13-2016, 07:39 AM   #4
danova
Member
Location: France
Join Date: Sep 2010
Posts: 27

I found this script useful for retraining on SILVA.

https://github.com/geraldinepascal/F...2retrainRDP.py
11-30-2016, 09:49 AM   #5
yingeddi2008
Junior Member
Location: Denton, TX
Join Date: Oct 2013
Posts: 6

Still trying to figure out how to construct the taxonomy file

Hi everyone,

I am still trying to figure out how to construct the taxonomy file. What is the rule for putting the taxonomy ID and name together? Any leads will be appreciated.

Thanks,

Eddi
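
Judging only from the sample tax file lines in the first post, each line appears to be taxid*name*parent taxid*depth*rank, where the parent taxid points at the line one level up and depth counts levels below Root. A minimal sketch of generating such lines from a list of lineages follows. The function name `build_tax_lines` and the fixed `RANKS` list are assumptions made for illustration, and the output has not been checked against the RDP trainer itself.

```python
from typing import Dict, List, Tuple

# Assumed rank label per depth, mirroring the sample lines in the first post.
RANKS = ["rootrank", "domain", "phylum", "class", "order", "family", "genus", "species"]


def build_tax_lines(lineages: List[List[str]]) -> List[str]:
    """Assign a taxid to every distinct lineage prefix and emit tax file lines.

    Each lineage is a list of names from Root downwards, for example
    ["Root", "Fungi", "Chytridiomycota", ...]. Depth is the position in the list.
    Lineages longer than RANKS are not handled in this sketch.
    """
    ids: Dict[Tuple[str, ...], int] = {}
    lines: List[str] = []
    for lineage in lineages:
        for depth in range(len(lineage)):
            # Key on the full path so the same name under two parents gets two ids.
            path = tuple(lineage[: depth + 1])
            if path in ids:
                continue
            taxid = len(ids)  # ids are handed out in order of first appearance
            ids[path] = taxid
            parent = ids[path[:-1]] if depth > 0 else -1
            lines.append(f"{taxid}*{lineage[depth]}*{parent}*{depth}*{RANKS[depth]}")
    return lines


if __name__ == "__main__":
    base = ["Root", "Fungi", "Chytridiomycota", "Neocallimastigomycetes",
            "Neocallimastigales", "Neocallimastigaceae"]
    example = [
        base + ["Piromyces", "Piromyces_sp_I_GRL_10"],
        base + ["Orpinomyces", "Orpinomyces_sp_NIANP60"],
    ]
    print("\n".join(build_tax_lines(example)))
```

Keying on the whole path rather than the bare name is a deliberate choice: placeholder names can recur under different parents, and each occurrence then still gets its own taxid and the correct parent.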
All times are GMT -8.