**miguelangel** · 07-03-2014, 08:23 AM

I tried the same thing using 'makeblastdb' and now I get a different error:

Error: (803.7) Blast-def-line-set.E.title
Bad char [0x9] in string at byte 38
uncultured bacterium; L2Sp-13 Lineage=Root;rootrank;Bacteria;domain;"Actinobacteria";phylum;Actinobacteria;class;Acidimicrobidae;subclass;Acidimicrobiales;order;"Acidimicrobineae";suborder;Acidimicrobiaceae;family;Ilumatobacter;genus

It also produces a .nhr file almost identical to the one generated with 'formatdb'.

I am sure the problem is in the format of the original FASTA file, whose entries look like this:

>S000655540 uncultured bacterium; L2Sp-13 Lineage=Root;rootrank;Bacteria;domain;"Actinobacteria";phylum;Actinobacteria;class;Acidimicrobidae;subclass;Acidimicrobiales;order;"Acidimicrobineae";suborder;Acidimicrobiaceae;family;Ilumatobacter;genus
ggaatcttgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgcgggatgaaggccttcgggctgtaaaccgctttc
agcaggaacgaaaatgacggtacctgcagaagaaggagcggccaactacgtgccagcagccgcggtgacacgtaggctcc
aagcgttgtccggatttattgggcgtaaagagctcgtaggcggttgagtaagtcgggtgtgaaaactctgggcttaaccc
ggagacgccatccgatactgctctgactagagttcaggaggggagtggggaattcctagtgtagcggtgaaatgcgcaga
tattaggaggaacaccggtggcgaaggcgccactctggactgaaactgacgctgaggagcgaaagcatgggtatcaaaca
ggattagataccctggtactccatgccgtaaacggtgggcactaggtgtgggttccaactaacgggatccgcgccgtcgc
taacgcattaagtgccccgcctggggagtacggtcgcaagactaaaactcaaatgaattgacgg

Any idea how I can change this format so that it works with the formatdb/makeblastdb commands?

Thanks again

**GenoMax** · 07-03-2014, 08:58 AM

The problem is likely a tab character in the header, judging by the 0x9 code in your error (perhaps between the S* ID and the rest of the header?). The header line also appears to wrap onto a second line, unless your copy/paste did that. You will likely need to reformat the headers.
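One way to check for tabs in the headers is sketched below. It is a minimal example, not something that was run on the RDP file in this thread. The function names `count_tab_headers` and `replace_header_tabs` are made up for illustration, and the sketch assumes an awk that understands `\t` in regular expressions (gawk, mawk and BSD awk do).

```shell
# Count header lines (starting with ">") that contain a tab character (0x9).
count_tab_headers() {
    # $1: path to a FASTA file
    awk '/^>/ && /\t/ { n++ } END { print n + 0 }' "$1"
}

# Write a copy of the file in which every tab in a header line is
# replaced by a single space. Sequence lines pass through unchanged.
replace_header_tabs() {
    # $1: input FASTA, $2: output FASTA
    awk '/^>/ { gsub(/\t/, " ") } { print }' "$1" > "$2"
}
```

Running `count_tab_headers release11_2_Bacteria_unaligned.fa` prints the number of affected headers. If it is non-zero, `replace_header_tabs` can write a cleaned copy to a new file name of your choice.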

**GenoMax** · 07-03-2014, 11:47 AM

Using the "release11_2_Bacteria_unaligned.fa" file downloaded from the link you posted, I was able to create the indexes using makeblastdb (v. 2.2.29+) without the errors you saw. I ran:

Code:

$ makeblastdb -dbtype nucl -in release11_2_Bacteria_unaligned.fa

I did get a number of errors like the one below, which may or may not indicate a real problem (see http://www.acgt.me/blog/2014/5/15/fu...rom-ncbi-blast):

Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 45% ambiguous nucleotides (shouldn't be over 40%)
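To see which records might trigger this warning, something like the following could be used. It is only a rough sketch: it assumes "ambiguous" means any character other than a, c, g or t (either case), which may not match exactly how the NCBI FASTA reader counts. The function name `list_ambiguous_first_lines` is hypothetical, and the threshold is passed as a fraction.

```shell
# Print the IDs of records whose first sequence line has more than a
# given fraction of characters other than a, c, g, t (either case).
list_ambiguous_first_lines() {
    # $1: FASTA file, $2: threshold as a fraction (e.g. 0.40)
    awk -v thr="$2" '
        # Header line: remember the ID (first word without ">").
        /^>/ { id = substr($1, 2); first = 1; next }
        # Only the first non-empty sequence line of each record is checked.
        first && length($0) > 0 {
            line = $0
            total = length(line)
            # gsub returns the number of replacements, i.e. plain bases.
            plain = gsub(/[ACGTacgt]/, "", line)
            if ((total - plain) / total > thr) print id
            first = 0
        }
    ' "$1"
}
```

For example, `list_ambiguous_first_lines release11_2_Bacteria_unaligned.fa 0.40` lists the S* IDs worth inspecting by eye.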

**miguelangel** · 07-04-2014, 12:27 AM

Thanks a lot

I got exactly the same errors, so I will check whether these new files are properly formatted to run BLAST.

**GenoMax** · 07-04-2014, 12:17 PM

I tried a test BLAST search with a few sequences from the RDP FASTA file. It worked without any problems.

If you do not need all the extra information in the FASTA header, you could remove most of it with the following command (leaving only the S* IDs):

Code:

$ sed -e '/^>/ s/[[:space:]].*$//' release11_2_Bacteria_unaligned.fa > release11_2_Bacteria_unaligned_truncated_header.fa

Then build the indexes from the new file.
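Before building the indexes, a quick sanity check on the truncated file may be worthwhile. The sketch below uses a hypothetical helper, `check_truncated`, that compares record counts between the two files and confirms that no header still contains a space or tab. It was not part of the original workflow.

```shell
# Compare the number of records in two FASTA files and confirm that no
# header in the truncated file still contains a space or a tab.
check_truncated() {
    # $1: original FASTA, $2: truncated FASTA
    ct_orig=$(awk '/^>/ { n++ } END { print n + 0 }' "$1")
    ct_new=$(awk '/^>/ { n++ } END { print n + 0 }' "$2")
    ct_bad=$(awk '/^>/ && /[ \t]/ { n++ } END { print n + 0 }' "$2")
    echo "records: $ct_orig -> $ct_new, headers with whitespace: $ct_bad"
    # Succeed only if the counts match and every header is a bare ID.
    [ "$ct_orig" -eq "$ct_new" ] && [ "$ct_bad" -eq 0 ]
}
```

If `check_truncated release11_2_Bacteria_unaligned.fa release11_2_Bacteria_unaligned_truncated_header.fa` succeeds, run `makeblastdb -dbtype nucl -in` on the truncated file as before.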

How can I format RDP database to be used in a BLAST search?