SEQanswers

Old 10-28-2014, 02:41 AM   #1
carolW
Senior Member
 
Location: US

Join Date: Apr 2013
Posts: 103
raxml

Hi,
I don't know how species names should be formatted in a PHYLIP file for RAxML. Even when a name contains no spaces, I get this error message:

ERROR: Problem reading number of species and sites

I get this message with either of the formats below. How should the names appear?

Look forward to your reply,

Carol
---------------------------------------
>1112142-1113254_NC_014152.1_Thermincola_potens_JR_chromosome,_complete_genome
ATGCGCAGTTTGAAGGGGGTAATTTCCACATCTGCTCTGGAACTGGGGGTGGACTTGCCG
GAACTGGA----GTTTTGCGTTTTGCTTCAACGGCTTCTGGCAGAGAGTAGG

>1112142-1113254 NC_014152.1 Thermincola potens JR chromosome, complete genome
ATGCGCAGTTTGAAGGGGGTAATTTCCACATCTGCTCTGGAACTGGGGGTGGACTTGCCG
GAACTGGA----GTTTTGCGTTTTGCTTCAACGGCTTCTGGCAGAGAGTAGG
>1109551-1110576 NC_014152.1 Thermincola potens JR chromosome, complete genome
Old 10-28-2014, 04:00 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

An example of the PHYLIP file format for DNA is here: http://www.molecularevolution.org/re...ats/phylip_dna

That said, the RAxML manual says this about names:

Quote:
Prohibited Character(s) in taxon names are names that contain any form of whitespace character, like blanks, tabulators, and carriage returns, as well as one of the following prohibited characters: : or () or []
Have you tried replacing the "." in the GenBank IDs with something else?
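
As a rough illustration of the rule quoted above, here is a minimal Python sketch that replaces the prohibited characters in a name with underscores. The function name `sanitize_taxon_name` and the `strict` switch are hypothetical, made up for this example. Replacing ".", "-" and "," when `strict` is on is only the extra precaution discussed in this thread, not something the quoted passage requires.

```python
import re

# Characters named in the quoted manual passage: any whitespace, ':', '(', ')', '[', ']'.
_PROHIBITED = re.compile(r"[\s:()\[\]]")
# Extra characters replaced only as a precaution (an assumption, not a documented rule).
_EXTRA = re.compile(r"[.,\-]")


def sanitize_taxon_name(name, strict=True):
    """Return a copy of `name` with problematic characters replaced by '_'."""
    cleaned = _PROHIBITED.sub("_", name.strip())
    if strict:
        cleaned = _EXTRA.sub("_", cleaned)
    # Collapse runs of underscores and trim them from both ends.
    cleaned = re.sub(r"_+", "_", cleaned)
    return cleaned.strip("_")


if __name__ == "__main__":
    header = "1112142-1113254 NC_014152.1 Thermincola potens JR chromosome, complete genome"
    print(sanitize_taxon_name(header))
```

If two cleaned names end up identical, they still need to be made unique by hand.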
Old 10-28-2014, 04:05 AM   #3
carolW
Senior Member
 
Location: US

Join Date: Apr 2013
Posts: 103

No, I haven't. Your comment is correct, but I think the "-" in the genomic coordinates should be replaced.

It seems that every species name has to be preprocessed, which is extra work. I don't know whether the same applies to other phylogeny software.

Many thanks

Carol
Old 10-28-2014, 04:11 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

It doesn't hurt to try replacing the "-" too.

Other phylogeny software packages have many interesting requirements (e.g. a program may truncate names to the first 8 characters, so you need to make that part unique). RAxML claims that it will accept names of up to 256 characters, but we shall see whether you can get past this first step by making the two replacements.
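
To check in advance whether names would collide in a program that truncates them, something like the following Python sketch could help. The function `find_truncation_clashes` and its `width` parameter are hypothetical names for this example; the default of 8 simply mirrors the case mentioned above, and the actual limit depends on the program.

```python
from collections import defaultdict


def find_truncation_clashes(names, width=8):
    """Group names that become identical once cut to `width` characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:width]].append(name)
    # Keep only prefixes shared by more than one name.
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    names = [
        "1112142-1113254_NC_014152.1",
        "1112142-1109551_NC_014152.1",  # made-up name, only to show a clash
        "Thermincola_potens",
    ]
    for prefix, members in find_truncation_clashes(names).items():
        print(prefix, "->", members)
```

An empty result means every name is still unique after truncation to that width.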
Old 10-28-2014, 04:16 AM   #5
carolW
Senior Member
 
Location: US

Join Date: Apr 2013
Posts: 103

Yes, it does. I should also have replaced the spaces and the carriage return between the name and the sequence, and added the number of sequences and the sequence length on the first line, etc. In other words, the FASTA-formatted sequences have to be converted before they can be used with RAxML.
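
The conversion steps described here could be sketched in Python roughly as follows. This is a minimal sketch, not a tested converter: it assumes the FASTA input is already aligned (all sequences the same length) and writes a relaxed, sequential PHYLIP layout with one "name sequence" pair per line. The function names `read_fasta` and `fasta_to_phylip` are hypothetical, and replacing "," and ";" in names is an extra precaution beyond the characters listed earlier in the thread.

```python
import re


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined, removing the line breaks inside a record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fasta_to_phylip(text):
    """Convert aligned FASTA text to relaxed sequential PHYLIP text."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("sequences must be non-empty and of equal length")
    # Replace whitespace and bracket/colon characters; ',' and ';' are an added precaution.
    names = [re.sub(r"[\s:()\[\],;]+", "_", h.strip()) for h, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("taxon names are not unique after cleaning")
    # First line: number of sequences, then alignment length.
    lines = [f"{len(records)} {lengths.pop()}"]
    lines += [f"{name} {seq}" for name, (_, seq) in zip(names, records)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">seq one\nACGT\nAC--\n>seq_two\nACGTACGT\n"
    print(fasta_to_phylip(example), end="")
```

A header with no sequence under it, or sequences of different lengths, makes the sketch raise an error instead of writing a file that would fail later.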
Tags
phylip, raxml
