  • RAxML error: Problem reading number of species and sites

    Hi ,
    I am trying to make a tree with RAxML that I will then use with pplacer.

    When I run the raxml command I get the error: Problem reading number of species and sites.

    I googled the error and used sed to remove the spaces from the headers of the aligned nucleotide file. I used MUSCLE to align my sequences. This is what a header looks like after removing the spaces:

    >ENA|CAJ48085|CAJ48085.1Bordetellaavium197Nbiodegradativeargininedecarboxylase
    ATGAAATTTCGCTTCCCCATTTTCATCATCGACGAAGACTTCCGTTCCGAGAACGCCTCG

    The raxml manual indicates that identical sequences are a problem. This is from the "Alignment Error Checking" section of the manual:

    2. Identical Sequence(s) that have different names but are exactly identical. This mostly happens when you excluded some hard-to-align alignment regions from your alignment and does not make sense to use.

    My sequences have a high percent identity. The UCLUST percent identity output for my sequences looks like 100%, 100%, 99.9%, 99.9%, 99.8%, 99.8%, and some of the genus/species names are the same too. Is this the source of the error, or is it something else?

    Is there tree-building software that I can use with pplacer that can handle sequences where some have 100% identity?

    Thanks!! Sorry this is so long.
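
One way to check whether the alignment really contains exactly identical sequences under different names (the case the manual excerpt describes) is to group the records by sequence. The following is a minimal sketch using only the Python standard library; `read_fasta` and `find_identical` are illustrative names, not part of RAxML or pplacer, and the input is assumed to be plain FASTA text. Note that 99.9% identity is not the same as exactly identical, so only the 100% pairs would show up here.

```python
from collections import defaultdict


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def find_identical(records):
    """Return lists of names that share exactly the same sequence."""
    groups = defaultdict(list)
    for name, seq in records:
        # Compare case-insensitively; gaps are kept as part of the sequence.
        groups[seq.upper()].append(name)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    example = ">a\nACGT\nAC\n>b\nacgtac\n>c\nACGTAA\n"
    for names in find_identical(read_fasta(example)):
        print("identical:", ", ".join(names))
```

To use it on a real file, read the file contents into a string first and pass that to `read_fasta`.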

  • #2
    It looks like your sequences are in FASTA format; I think you need to get them into PHYLIP format for RAxML. I do that with Mesquite on the Mac; Biopython's AlignIO will also work.
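
The error message fits this diagnosis: a PHYLIP file begins with a line giving the number of sequences and the number of alignment columns, which a FASTA file lacks. With Biopython installed, the conversion should be a single call along the lines of `AlignIO.convert("in.fasta", "fasta", "out.phy", "phylip-relaxed")` (file names here are placeholders). As a dependency-free sketch of what that conversion does, assuming an already aligned FASTA input and a relaxed PHYLIP output with one sequence per line, something like the following could work; `fasta_to_phylip` is a hypothetical helper name.

```python
def fasta_to_phylip(fasta_text):
    """Convert aligned FASTA text to relaxed (sequential) PHYLIP text."""
    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # PHYLIP names must not contain whitespace, so replace it.
            name, chunks = "_".join(line[1:].split()), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))

    if not records:
        raise ValueError("no FASTA records found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; is the input aligned?")

    # Header line: number of sequences, then number of alignment columns.
    lines = [f"{len(records)} {lengths.pop()}"]
    lines.extend(f"{rec_name} {seq}" for rec_name, seq in records)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fasta_to_phylip(">seq one\nAC-T\n>seq2\nACGT\n"), end="")
```

The length check is deliberate: unequal sequence lengths usually mean the file was not aligned, and it is better to stop there than to write a PHYLIP header that does not match the data.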

    • #3
      Hi cliffbeall,
      Thx for the reply. I installed RAxML 8.2 and it accepts fasta files. I am no longer getting that error message when I run RAxML. But now when I run pplacer I get an error message:

      Running pplacer v1.1.alpha17-6-g5cecf99 analysis on Q2KYQ0_BORA1.fasta...
      Didn't find any reference sequences in given alignment file. Using supplied reference alignment.
      Warning: using a statistics file directly is now deprecated. We suggest using a reference package. If you already are, then please use the latest version of taxtastic.
      WARNING: your stats file is from RAxML 8.2.8; RAxML has been tested with the following versions: 7.0.4; 7.2.3; 7.2.5; 7.2.6; 7.2.7
      I'm going to try parsing as if this was version 7.2.3
      Problem parsing info or stats fileRAxML_info.10_Q2KYQ0
      Uncaught exception: Parse_stats.Stats_parsing_error("too many partitions. Only one is allowed.")

      I did include a reference alignment. It is the same alignment that RAxML used to make the RAxML_result file. Maybe pplacer does not accept the newer version of RAxML?

      I am at a loss. I am also a novice. Any additional help would be appreciated!
      Thx,
      Brian
