I used the following command sequence to build the reference index for archaeal, bacterial and viral genomes:
centrifuge-download -o taxonomy taxonomy
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
cat library/*/*.fna > input-sequences.fna
centrifuge-build -p 4 --conversion-table seqid2taxid.map --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp input-sequences.fna abv
On the final step, it reports 'Warning: Taxonomy ID 227859 is not in the provided taxonomy tree (taxonomy/nodes.dmp)' and then hangs.
If I remove this ID from seqid2taxid.map and run again, it still reports 'Warning: Taxonomy ID 227859 is not in the provided taxonomy tree (taxonomy/nodes.dmp)' and hangs again. The entry for ID 227859, including its accession number, no longer exists in seqid2taxid.map at this point.
How do I diagnose and resolve this issue?
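One check that might help narrow this down is listing every taxonomy ID that the conversion table references but that nodes.dmp does not contain. The following is only a minimal sketch, not something taken from the Centrifuge documentation. The function name missing_taxids is made up for illustration, and it assumes that seqid2taxid.map is tab-separated with the taxonomy ID in the second column and that each line of nodes.dmp starts with the taxonomy ID followed by a tab.

```shell
# Print each taxonomy ID that appears in the conversion table but not in nodes.dmp.
# Assumptions (not confirmed by the Centrifuge docs):
#   - the map file is tab-separated, taxonomy ID in column 2
#   - nodes.dmp has the taxonomy ID as its first tab-delimited field
missing_taxids() {
    map_file=$1
    nodes_file=$2
    awk -F'\t' '
        # First file (nodes.dmp): remember every taxonomy ID in the tree.
        FILENAME == ARGV[1] { known[$1]; next }
        # Second file (the map): report each unknown taxonomy ID once.
        !($2 in known) && !seen[$2]++ { print $2 }
    ' "$nodes_file" "$map_file"
}
```

It could be called as missing_taxids seqid2taxid.map taxonomy/nodes.dmp. If 227859 shows up in the output after the map was edited, the edited map is probably not the file being read, or the ID is still present on another line.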