**GenoMax** · 09-24-2015, 01:08 PM

In that case you need to use -db 16sBLASTdb when you are doing the blast search.

Code:

$ blastn -query unibac.fasta -db 16sBLASTdb -out blastn.outfmt6 -evalue 1e-5 -num_threads 6 -max_target_seqs 1 -outfmt 6

That said, are you not using the latest BLAST+ package, where the command is now makeblastdb?

**Naphtap** · 09-24-2015, 04:16 PM

Ok this time I used makeblastdb with the following command:

makeblastdb -in 16s.fasta -out 16sdatabaseBLAST -dbtype nucl -parse_seqids

I ended up with six files named 16sdatabaseBLAST, each with a different file extension.

I then ran the blast query with the command that was mentioned previously and got a single file called blastn.outfmt6. The only problem is that the output file is empty... At least there isn't an error this time.

**GenoMax** · 09-24-2015, 04:55 PM

Start by removing the e-value restriction and other limits. Once you are sure that blast is working, you can start adding filters back in as needed.
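As a rough sketch of what that could look like (not a definitive recipe): the command below drops -evalue and -max_target_seqs, and a small helper reports how many hit lines ended up in the tabular output. The file and database names are the ones used earlier in this thread; note that -db has to match the -out name given to makeblastdb. The function names run_unfiltered_blast and count_hits are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: file and database names come from the posts above;
# the helper function names are hypothetical.

# Re-run the search without the -evalue and -max_target_seqs limits.
run_unfiltered_blast() {
  blastn -query unibac.fasta -db 16sdatabaseBLAST \
         -out blastn.outfmt6 -outfmt 6 -num_threads 6
}

# Print the number of hit lines in a tabular output file
# (0 if the file is empty or missing).
count_hits() {
  if [ -s "$1" ]; then
    wc -l < "$1" | tr -d ' '
  else
    echo 0
  fi
}

# Only run the search if blastn is installed and the query file exists.
if command -v blastn >/dev/null 2>&1 && [ -f unibac.fasta ]; then
  run_unfiltered_blast
  echo "hits: $(count_hits blastn.outfmt6)"
fi
```

If the count is still 0 with no filters at all, the query or the database is the more likely culprit rather than the thresholds.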

**Naphtap** · 09-26-2015, 12:41 PM

I just recalled that I gave only one short sequence that belonged to the forward primer that's used to amplify Bacteroidales sequences. When I used a fasta file that contained many more sequences, I received a 6.5 GB file.

I'm also looking through all the types of outputs I can generate... I'm just not sure which one's the best to use for downstream analyses, or what some of these terms mean.

Here's the list of formats that can be generated.

0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = XML Blast output,
6 = tabular,
7 = tabular with comment lines,
8 = Text ASN.1,
9 = Binary ASN.1,
10 = Comma-separated values,
11 = BLAST archive format (ASN.1),
12 = JSON Seqalign output,
13 = JSON Blast output,
14 = XML2 Blast output

**GenoMax** · 09-27-2015, 04:16 AM

I am not sure why you are using blast here, but there are well-known programs (QIIME and mothur) that are designed for NGS data and computational ecology. They are going to be more efficient than using blast.

**Naphtap** · 09-27-2015, 10:57 AM

I'm mainly using blast because my profs suggested it for the course. For my project, I plan on working with QIIME; a fellow Master's student has already prepared a pipeline for processing sequences with QIIME. I heard that he had trouble incorporating a check for chimeric sequences, though. Once he's finished his Master's project and I've finished the course, I'll most likely take over his pipeline.

**GenoMax** · 09-28-2015, 04:04 AM

Originally posted by Naphtap:

I just recalled that I gave only one short sequence that belonged to the forward primer that's used to amplify Bacteroidales sequences. When I used a fasta file that contained many more sequences, I received a 6.5 GB file.

I'm also looking through all the types of outputs I can generate... I'm just not sure which one's the best to use for downstream analyses, or what some of these terms mean.

Here's the list of formats that can be generated.

0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = XML Blast output,
6 = tabular,
7 = tabular with comment lines,
8 = Text ASN.1,
9 = Binary ASN.1,
10 = Comma-separated values,
11 = BLAST archive format (ASN.1),
12 = JSON Seqalign output,
13 = JSON Blast output,
14 = XML2 Blast output

Formats 6 and 7 are popular when one needs to parse the output programmatically. The first few options are more visual (compare at a glance) formats. Peter Cock has a post on these here: http://blastedbio.blogspot.com/2012/...criptions.html
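To give an idea of what parsing format 6 can look like, here is a minimal sketch using awk. It assumes the default 12 tab-separated columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and keeps, for each query, the hit with the highest bitscore that meets a minimum percent identity. The function name best_hits and the 97% default are illustrative choices, not anything prescribed by BLAST.

```shell
#!/usr/bin/env bash
# Sketch only: assumes default -outfmt 6 column order.
# best_hits FILE [MIN_IDENTITY]
# Prints qseqid, sseqid, pident, bitscore for the top-scoring hit per query.
best_hits() {
  awk -F'\t' -v min_id="${2:-97}" '
    # Column 3 is percent identity, column 12 is the bitscore.
    ($3 + 0) >= min_id && (!($1 in best) || ($12 + 0) > best[$1]) {
      best[$1] = $12 + 0
      line[$1] = $1 "\t" $2 "\t" $3 "\t" $12
    }
    END { for (q in line) print line[q] }
  ' "$1" | sort
}

# Example call, only attempted if the output file from this thread exists.
if [ -s blastn.outfmt6 ]; then
  best_hits blastn.outfmt6 97
fi
```

For a multi-gigabyte output file, a single streaming pass like this is usually more practical than loading everything into memory; format 7 would need its comment lines (starting with #) skipped first.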
