#1
Member
Location: Italy Join Date: Jul 2012
Posts: 30
Hi all,
I'm new to BLAST and am currently exploring its functions from the command line. I already know that to run blastx I first have to index my FASTA database with makeblastdb. I have used BLAST enough to learn how it works, and I would like the output to show only the sequence code rather than all the information about each sequence (code, description, etc.). How can I do this? I read somewhere that I have to pass some parameter to the makeblastdb command. Does anyone here know which one? Thanks to all.
#2
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
When you do a BLAST search (e.g. blastp or blastn), there are several different output formats. The plain text and XML formats include the original FASTA record descriptions; however, these are not (currently) available in the tabular output.
http://blastedbio.blogspot.co.uk/201...criptions.html
Is that what you meant?
#3
Member
Location: Italy Join Date: Jul 2012
Posts: 30
Yes, that may be useful. I think I might also be able to do it with makeblastdb. My problem is that I would like BLAST to use only the sequence ID, not the complete file with all the information for each sequence.
So, for example, the command could be this:
makeblastdb -in db.fasta -title db -parse_seqids -gi_mask
What do you think? Later I could perhaps run blastx with -outfmt "6 qgi sgi" to get only a table of results showing just the GI for the query and the subject sequence. I'm trying these commands out now, since I don't know whether there is a way to inspect how makeblastdb built the database.
#4
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
I only use -parse_seqids if my FASTA files are labeled using the NCBI style with pipe characters (the vertical bars, |, are called pipes). Otherwise I find it doesn't work very well.
#5
Member
Location: Italy Join Date: Jul 2012
Posts: 30
My FASTA file is in NCBI format and looks like this:
tr|H3ISY8|H3ISY8_STRPU description OrganismType Other params
I want BLAST to use only the first sequence code, H3ISY8, and to show me only that in the results. The command I wrote gives me a "0 0 0" file, and I don't know why. If I drop -outfmt "6 qgi sgi" and give it only -outfmt 6, it returns a correct table. I'm still trying different input parameters.
#6
Member
Location: Italy Join Date: Jul 2012
Posts: 30
So finally, I have looked at a lot of parameters and cannot do it. Can I conclude that it is not possible to create the binary database that BLAST uses from the sequence ID alone?
And is there also no way to get only this code in the blastx results, instead of the three parts separated by pipes (|)?
#7
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Personally I'd use the database as is and process the BLAST output in a script instead.
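A minimal Python sketch of that kind of post-processing, assuming the hits were saved with the default -outfmt 6 column order (query ID first, subject ID second) and that subject IDs follow the db|accession|name layout seen in this thread. The function names and the example row are made up for illustration; they are not part of BLAST or Biopython.

```python
import csv
import io


def accession_from_id(seq_id):
    """Return the middle field of an ID such as tr|H3ISY8|H3ISY8_STRPU."""
    parts = seq_id.split("|")
    # IDs that do not have the db|accession|name layout are returned unchanged.
    return parts[1] if len(parts) >= 3 and parts[1] else seq_id


def simplify_hits(handle):
    """Yield (query_id, subject_accession) pairs from tab-separated BLAST rows."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows too short to hold two IDs.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        yield row[0], accession_from_id(row[1])


# Hypothetical single hit in the default 12-column tabular layout.
example = (
    "query1\ttr|H3ISY8|H3ISY8_STRPU\t98.5\t200\t3\t0"
    "\t1\t600\t1\t200\t1e-100\t400\n"
)
hits = list(simplify_hits(io.StringIO(example)))
for query_id, accession in hits:
    print(query_id, accession, sep="\t")
```

For a real run, the io.StringIO example would be replaced by an open file handle on the saved BLAST output.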
#8
Member
Location: Italy Join Date: Jul 2012
Posts: 30
OK, thanks. Someone told me that there is a parameter to give to makeblastdb, but maybe he's wrong.
#9
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
If your FASTA file identifiers are not already in the expected format, you'd have to modify the FASTA file - and in my view in that case you might as well avoid using this option, and simply format the identifiers exactly as you want them.
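One way the identifiers could be reformatted before running makeblastdb, sketched in Python. It assumes every header starts with a db|accession|name ID like the ones posted in this thread and that the accessions are unique within the file; the function names and the placeholder sequence are hypothetical.

```python
import io


def shorten_header(line):
    """Reduce '>tr|I1GCL2|I1GCL2_AMPQE description' to '>I1GCL2'."""
    fields = line[1:].split(None, 1)
    if not fields:
        # A bare '>' header has nothing to shorten.
        return line.rstrip("\n")
    parts = fields[0].split("|")
    # Fall back to the whole first word if the ID is not pipe-delimited.
    accession = parts[1] if len(parts) >= 3 and parts[1] else fields[0]
    return ">" + accession


def rewrite_fasta(src, dst):
    """Copy FASTA records from src to dst, shortening each header line."""
    for line in src:
        if line.startswith(">"):
            dst.write(shorten_header(line) + "\n")
        else:
            dst.write(line)


# Header taken from the thread; the sequence line is a made-up placeholder.
source = (
    ">tr|I1GCL2|I1GCL2_AMPQE Uncharacterized protein "
    "OS=Amphimedon queenslandica GN=LOC100637533 PE=4 SV=1\n"
    "MKTAYIAK\n"
)
out = io.StringIO()
rewrite_fasta(io.StringIO(source), out)
result = out.getvalue()
print(result, end="")
```

The rewritten file, rather than the original, would then be given to makeblastdb via -in.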
#10
Member
Location: Italy Join Date: Jul 2012
Posts: 30
tr|I1GCL2|I1GCL2_AMPQE Uncharacterized protein OS=Amphimedon queenslandica GN=LOC100637533 PE=4 SV=1
I would like makeblastdb to use only the ID I1GCL2 as the identifier. This matters to me because I want the database to be as light as possible to manage; I already have the other information collected in a db. I used this command:
makeblastdb -in uniprot_kb_2012_06.fasta -title uniprot_kb_2012_06 -parse_seqids
but it doesn't work as I expected: it stores all the information from the header :-(
Last edited by angeloulivieri; 07-26-2012 at 03:53 AM.
#11
Member
Location: Italy Join Date: Jul 2012
Posts: 30
Does no one know how to do it?
#12
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
You haven't said which output format you are using. The specially formatted identifiers (with the pipe characters) are how BLAST identifies an accession number, which you can ask for explicitly when using the tabular output.
Last edited by maubp; 07-30-2012 at 03:39 AM. Reason: corrected typo
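To illustrate asking for the accession in the tabular output, here is a small Python sketch that assembles a blastx call requesting the qseqid, sacc, evalue and bitscore columns. The file and database names are placeholders, and whether sacc comes back as the bare accession depends on BLAST recognising the identifier format, so treat this as something to try rather than a guaranteed fix.

```python
import shlex


def blastx_command(query, db, out,
                   columns=("qseqid", "sacc", "evalue", "bitscore")):
    """Build a blastx argument list asking for specific tabular columns."""
    # Format 6 is tabular; the column names follow it in one quoted argument.
    outfmt = "6 " + " ".join(columns)
    return ["blastx", "-query", query, "-db", db,
            "-outfmt", outfmt, "-out", out]


# Placeholder names: substitute your own query file, database and output file.
cmd = blastx_command("reads.fasta", "my_db", "hits.tsv")

# Show the command as it could be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it, provided blastx is installed and on the PATH.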
#13
Member
Location: Italy Join Date: Jul 2012
Posts: 30
I know that when I run blastx I can get tabular output with only the accession numbers, but that is a different problem. I would like makeblastdb to keep only the accession when it creates its binary database. The reason is that I already have the accession-to-description mapping in a db, so this would reduce the amount of information to manage when I later run blastx. I hope that is clear.
(Maybe something could be done with the formatdb command, but I see that it's an old command.)
#14
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Anything you could do with 'formatdb' would (I hope) be supported in 'makeblastdb'.