I am new to BLAST and to using bioinformatics tools on the command line. I recently cleaned the data I was given (groomed, filtered, trimmed, and converted to FASTA). I have read the NCBI documentation and the command-line quick guide, but I still have questions.
I deeply appreciate any comments or assistance that could be rendered.
I have the following two requirements given to me by my boss for blastn:
- The E-value has to be smaller than 10E-100 for a hit to be reported
- Use the whole NCBI nt as my database that I blast against (it would be good if I could filter out tick sequences, if possible)
I have the following questions:
- What parameter allows me to choose 10E-100 or smaller to be significant?
- How can I reference the whole NCBI nt db?
- How can I filter out certain tick sequences?
- I have about 63 fasta files I need to blast, I think. Do I run this in parallel or is there a way I can define a directory as the input with blastn?
Here is my current skeleton I have written for the blastn command, but I am stuck:
Code:
/homes/hlyates/ncbi-blast-2.2.28+/bin/blastn -db ? -query ~/scripts/Sample_Index2/Sample1.fasta
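For reference, here is a rough sketch of how the skeleton above might be filled in. It is not a verified pipeline. It assumes that a preformatted copy of nt is available locally (for example, fetched with update_blastdb.pl and found via the BLASTDB variable), that "10E-100" is meant as 1e-100, and that tabular output (-outfmt 6) is acceptable. The variable and function names (BLASTN, QUERY_DIR, OUT_DIR, DRY_RUN, blast_one, blast_all) are made up for illustration. blastn takes one query file per run, so the sketch loops over the directory instead of passing the directory itself.

```shell
#!/usr/bin/env bash
# Sketch only: every name below is hypothetical and can be overridden.
BLASTN="${BLASTN:-blastn}"                             # path to the blastn binary
QUERY_DIR="${QUERY_DIR:-$HOME/scripts/Sample_Index2}"  # directory holding the *.fasta files
OUT_DIR="${OUT_DIR:-./blast_out}"                      # where result tables are written
DRY_RUN="${DRY_RUN:-1}"                                # 1 = only print the commands

# Build (and optionally run) one blastn command for a single query file.
blast_one() {
  local query="$1"
  local base
  base="$(basename "$query" .fasta)"
  # -evalue sets the reporting threshold; -db nt assumes a local nt database.
  local cmd=("$BLASTN" -db nt -query "$query" -evalue 1e-100 \
             -outfmt 6 -out "$OUT_DIR/$base.tsv")
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Run blast_one on every .fasta file in QUERY_DIR, one after another.
blast_all() {
  if [ "$DRY_RUN" != "1" ]; then
    mkdir -p "$OUT_DIR"
  fi
  local f
  for f in "$QUERY_DIR"/*.fasta; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    blast_one "$f"
  done
}
```

Calling blast_all with DRY_RUN=1 only prints the commands, and setting DRY_RUN=0 would actually run them. Excluding tick sequences is not shown. The -negative_gilist option might cover it, but that assumes a file of tick GI numbers has been prepared separately.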