**GenoMax** · 07-16-2015, 06:08 AM

What is the size of the transcript.fa file, and how much RAM do you have on this machine? Do you get the seg fault right away or after some time?

**Kasfen** · 07-16-2015, 06:21 AM

The file size of transcript.fa is about 40 MB, and the RAM of my machine is 300 GB.
I got the seg fault right away.

**GenoMax** · 07-16-2015, 10:00 AM

The problem is likely something other than 80 characters. Can you post an example of your fasta sequence IDs?

Just noticed that you have

"makeblastdb protein"

in your first post. Is this nucleotide or protein sequence?
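One rough way to answer the nucleotide-or-protein question from the file itself is to count sequence lines containing letters outside the IUPAC nucleotide codes. This is only a heuristic sketch: the file names `demo_nt.fa` and `demo_aa.fa` and the helper `count_non_nt` are made up for illustration, and a short protein can by chance consist only of letters that are also nucleotide codes.

```shell
# Two tiny stand-in files (hypothetical sequences, for illustration only)
cat > demo_nt.fa <<'EOF'
>seq1
ACGTNNACGT
>seq2
acgtacgtac
EOF
cat > demo_aa.fa <<'EOF'
>prot1
MKTLLEPFQ
EOF

# Count sequence lines that contain a character outside the IUPAC
# nucleotide alphabet. grep -c exits non-zero when the count is 0,
# hence the "|| true".
count_non_nt() {
    grep -v '^>' "$1" | tr 'a-z' 'A-Z' | grep -c '[^ACGTUNRYSWKMBDHV]' || true
}

echo "demo_nt.fa: $(count_non_nt demo_nt.fa) non-nucleotide lines"
echo "demo_aa.fa: $(count_non_nt demo_aa.fa) non-nucleotide lines"
```

A count of zero suggests `-dbtype nucl` is the right choice for makeblastdb; a clearly non-zero count suggests `-dbtype prot`.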

**Kasfen** · 07-16-2015, 09:58 PM

**GenoMax** · 07-17-2015, 03:52 AM

The problem is with the format of your IDs. I am able to make a nucleotide database with your IDs (using blast v.2.2.31), but if I try to retrieve the accession numbers I get the error

Code:

$ blastdbcmd -entry all -db ./transcript -outfmt '%a'
Error: [blastdbcmd] FASTA-style ID LCL|1017.G16854.T1_RECNAME|_FULL=LETHAL(3)MALIGNANT_BRAIN_TUMOR-LIKE_PROTEIN_1|_SHORT=H-L(3)MBT|_SHORT=H-L(3)MBT_PROTEIN|_SHORT=L(3)MBT-LIKE|_ALTNAME|_FULL=L(3)MBT_PROTEIN_HOMOLOG has too many parts.
Error: [blastdbcmd] FASTA-style ID LCL|1056.G17143.T1_RECNAME|_FULL=LETHAL(3)MALIGNANT_BRAIN_TUMOR-LIKE_PROTEIN_1|_SHORT=H-L(3)MBT|_SHORT=H-L(3)MBT_PROTEIN|_SHORT=L(3)MBT-LIKE|_ALTNAME|_FULL=L(3)MBT_PROTEIN_HOMOLOG has too many parts.
Error: [blastdbcmd] FASTA-style ID LCL|1017.G16884.T1_RECNAME|_FULL=PRELI_DOMAIN-CONTAINING_PROTEIN_1|_MITOCHONDRIAL|_ALTNAME|_FULL=PX19-LIKE_PROTEIN|_FLAGS|_PRECURSOR_&GT has too many parts.

If you are able to live with shortened header IDs, you can truncate them at the first "|", e.g. like this:

Code:

$  awk -F "|" '{if (/^>/) print $1; else print $0;}' your_file.fa > new_file.fa

This now gives you short IDs:

Code:

>1017.g16854.t1_RecName
>1056.g17143.t1_RecName
>1017.g16884.t1_RecName

With these short IDs, makeblastdb/blastdbcmd will work.
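If the long names still matter downstream, a variation on the awk one-liner can write a lookup table alongside the shortened file, so each short ID can be mapped back to its full header later. This is a minimal sketch, not something tested against the original data: `demo.fa`, `demo_short.fa` and `id_map.tsv` are placeholder names, and the two records are invented stand-ins with the same "|"-separated layout.

```shell
# Tiny stand-in for your_file.fa (hypothetical records, for illustration only)
cat > demo.fa <<'EOF'
>1017.g16854.t1_RecName|_Full=Example_protein_1|_Short=Ex1
ACGTACGT
>1056.g17143.t1_RecName|_Full=Example_protein_1|_Short=Ex1
GGCCTTAA
EOF

# For header lines: print the part before the first "|" to the new FASTA,
# and record "short ID <TAB> full original header" in id_map.tsv.
# All other lines (the sequence) are passed through unchanged.
awk -F '|' '
    /^>/ { print $1
           print substr($1, 2) "\t" substr($0, 2) > "id_map.tsv"
           next }
         { print }
' demo.fa > demo_short.fa

# Sanity check: any output here means two records ended up
# with the same short ID after truncation.
grep '^>' demo_short.fa | sort | uniq -d
```

The duplicate check is worth running before building the database, since truncating at the first "|" only works if that first field is unique for every record.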

**Kasfen** · 07-17-2015, 05:08 AM

Thank you so much!!!!!

The problem is due to the length of the ID (too long), right?

What should I do if I want to keep the IDs untouched?

**GenoMax** · 07-17-2015, 06:10 AM

What version of blast are you using? Have you tried using the latest (v.2.2.31)? I was able to build the database fine with that version.

The error I saw with your IDs is similar to the one in Peter Cock's blog entry (http://blastedbio.blogspot.com/2012/...argetonly.html), though that entry is not about the command I am using. The -target_only option is working fine in 2.2.31.

~~Perhaps it is the leading "_" that you have in the names that is causing the problem (e.g. _Short=H-l(3)mbt). Let me see if I can find a way to remove those easily.~~

Update: That does not seem to be a problem. It must be something else.

Peter also participates on this forum and he may come along with a suggestion later today.

**GSviral** · 10-13-2015, 03:28 AM

Sorry to dig up an old thread.

I am having the same problem with the makeblastdb command. I get a segmentation fault even when I type makeblastdb -help. It's as if the command doesn't want to run at all.

Was there any eventual solution to this problem?

Cheers.


blast makeblastdb problem
