#1 |
Junior Member
Location: Europe Join Date: Oct 2012
Posts: 2
Hello to all the members!
This is my first post here on SEQanswers.

Today I was updating the BLASTp application on the nodes of our grid to the latest version and, after running some jobs to test it, I noticed that the output files have a different number of lines depending on the output format (CSV versus tabular). I used the same database and query file for both runs; the only difference was the output format parameter:

blastp -evalue 0.1 -db F10DRD -out test_output_f10drd_180.txt -outfmt '10 qseqid sseqid qstart qend evalue' -query f10drd_180.fas
blastp -evalue 0.1 -db F10DRD -out test_output_f10drd_180.txt -outfmt '6 qseqid sseqid qstart qend evalue' -query f10drd_180.fas

The CSV output file contained 1288845 lines; the tabular file contained 1293150. I replaced the \t characters with commas in the tabular file and compared the two outputs with diff. It showed that the tabular file contains all the lines from the CSV file, plus 4305 more.

Has anyone noticed the same problem before? Thank you for your time and your answers!
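The comparison described above (tabs replaced with commas, then a diff) could be sketched roughly as follows. This is only an illustration, not the poster's actual commands: the function name `extra_tabular_lines` is hypothetical, and it assumes no field itself contains a comma or tab.

```shell
# Hypothetical helper (not from the thread): print lines that appear in the
# tabular (-outfmt 6) file but not in the CSV (-outfmt 10) file.
extra_tabular_lines() {
    csv_file=$1
    tsv_file=$2
    work_dir=$(mktemp -d) || return 1
    # Use the C locale so that sort and comm agree on ordering
    LC_ALL=C sort "$csv_file" > "$work_dir/csv.sorted"
    # Convert tabs to commas so both files share one delimiter, then sort
    tr '\t' ',' < "$tsv_file" | LC_ALL=C sort > "$work_dir/tsv.sorted"
    # -13 hides lines unique to the first file and lines common to both,
    # leaving only lines unique to the converted tabular file
    LC_ALL=C comm -13 "$work_dir/csv.sorted" "$work_dir/tsv.sorted"
    rm -rf "$work_dir"
}
```

With two output files saved under different names (say `out.csv` and `out.tsv`, both hypothetical), `extra_tabular_lines out.csv out.tsv | wc -l` would count the extra lines.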
#2 |
Senior Member
Location: The University of Melbourne, AUSTRALIA Join Date: Apr 2008
Posts: 275
I tried to replicate your results with BLAST 2.2.27+ using blastn and 1000 sequences:
Code:
formatdb -i contigs.fa -p F -o T
blastn -query contigs.fa -db ./contigs.fa -evalue 0.1 -out blast.csv -outfmt '10 qseqid sseqid qstart qend evalue'
blastn -query contigs.fa -db ./contigs.fa -evalue 0.1 -out blast.tsv -outfmt '6 qseqid sseqid qstart qend evalue'
wc -l blast.*
14858 blast.csv
14858 blast.tsv

Are you using version 2.2.27? Did you use "-parse_seqids" for makeblastdb (or -o T for formatdb)? Are the sequence IDs unique in your database file?
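One way to answer the question about unique sequence IDs is sketched below. The function name `duplicate_fasta_ids` is hypothetical, and the sketch assumes that an ID is the first whitespace-delimited token after the `>` on each FASTA header line.

```shell
# Hypothetical helper: print every sequence ID that occurs more than once
# in a FASTA file. Assumes the ID is the first token after '>'.
duplicate_fasta_ids() {
    grep '^>' "$1" |          # keep header lines only
        sed 's/^>//' |        # drop the leading '>'
        awk '{print $1}' |    # keep the ID, discard the description
        LC_ALL=C sort |
        uniq -d               # print each duplicated ID once
}
```

An empty result from `duplicate_fasta_ids contigs.fa` would suggest that the IDs are unique.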
#3 |
Junior Member
Location: Europe Join Date: Oct 2012
Posts: 2
Thank you for your reply, Torst!
Yes, -o T was used for formatdb, the IDs are unique, and it is version 2.2.27. The problem was caused by something else.

After spending days running tests with different queries to find the source of the problem, I found that test jobs that completed in less than ~3 hours produced the same output in both CSV and tabular format. This led me to ask the administrator for our computing grid's error logs. When I finally got the logs, they revealed that the differing output files were the result of an incorrectly set CPU limit assigned to our account. It was a recent change that we were unaware of. Now that they have corrected it, my test runs give correct and identical results.

I am sorry for taking your time with this question.

Last edited by easolvig; 10-13-2012 at 03:27 AM.
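A guard against silently keeping truncated output might look like the sketch below. It is an assumption that a job stopped by a resource limit exits with a non-zero status (how the grid in this thread actually ended the jobs is not stated), and the wrapper name `run_and_verify` is hypothetical.

```shell
# Hypothetical wrapper: run a command and discard its output file if the
# command did not exit cleanly, so partial results are never mistaken for
# complete ones.
run_and_verify() {
    out_file=$1
    shift
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        # In most shells a status above 128 means the process was ended by a signal
        echo "command failed with exit status $status; discarding $out_file" >&2
        rm -f "$out_file"
        return "$status"
    fi
    return 0
}
```

For example, `run_and_verify out.tsv blastp -evalue 0.1 -db F10DRD -out out.tsv -outfmt '6 qseqid sseqid qstart qend evalue' -query f10drd_180.fas`, where `out.tsv` is a placeholder name. Running `ulimit -t` in the job's shell shows the CPU-time limit (in seconds) that applies there, if one is set that way.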
#4 |
Senior Member
Location: The University of Melbourne, AUSTRALIA Join Date: Apr 2008
Posts: 275
Glad it worked out, and there wasn't a bug in BLAST+.