SEQanswers

10-19-2012, 09:28 AM   #1
dgio
Junior Member
 
Location: Ancona

Join Date: Oct 2012
Posts: 3
454 contigs blastn no hits found!!!

Hi everyone,
I'm new to the forum and a newbie in bioinformatics.
I'm trying to blast some contigs so I can use the output in MEGAN. I installed the BLAST+ suite (on a Mac) and downloaded the nr database in FASTA format from the NCBI FTP site.
I then formatted the database using:
Code:
makeblastdb -in nr -dbtype nucl -out nr.db
The output looks OK to me:
Code:
Building a new DB, current time: 10/18/2012 19:12:48
New DB name:   nr.db
New DB title:  nr
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 21062489 sequences in 1624.48 seconds.
Now I'm trying to run blastn against my local database, but I get ***** No hits found ***** for all my contigs. I've tried blasting a few of them on the BLAST website and they get many good hits.
What am I doing wrong??

These are a few contigs from my .fasta file:
Code:
>Assembly_Contig_1 
GGGCGGTCGCCTCCGTAAAAAGTAACGGGAGGACGTTACAAAGTTCGGBTCAGGTGGGTTGGAAWHCCACCGTAGAGTATAATGGCATAAGCCGGACTGACTGTGAGACATACAAGTCGAGCAGAGTCGAAAGACGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGGAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGCGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTTCCGTGGGCGTAGGAACGTTGARGAGAGCTGACCCTAGTACGAGAGGACCGGGTTGGACGTGCCACTGGTGCACCAGTTGTTCTGCCAAGAGCATCGCTGGGTAGCTACGCACGGATGAGATAACCGCTGAAAGCATCTAAGCGGGAAGCCAACTCYGAGATGAACGTTCCCTGAAGTACGCTTGAAGACTACAAGCTTGAKASKMKGSWKGTTGTACCGCACGAGTAATCT
>Assembly_Contig_2 
CTCCCCGTCGATGTGAGCTCTTGGGGGAGATCAGCCTGTTATCCCCGTGCACCTTTACTATAGCTTGACACTGCAATTGGGATATWYWTGTGCAGGATAGGTGGGARSCWTTGATTCATAGTCGCYAGATTATGATGAGSYATCCTTGAGATACCACCCTTATATATTCTGATTGCTAACTTGCKMCAGTTATCCTGKSSGAGGACAATGTCTGGTGGGTAGTTTGACTGGGGCGGTCGCCTCCTAAAAAGTAACGGAGGCTTACAAAGGTTGGYTCAGATGGGTTGGAAATCCATCGYAGAGTATAATGGTACAARCCAGCTTAACTGYGAGACRTACAKGTCGARCAGAGACGAAAGTCGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGAAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGCGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGGCGTTGGATGATTGAGGAGAGTTGCCCCTAGTACGAGAGGACCGGGGTGAACGAACCACTRGTGCACCARTTKTBSTGCCAAGRGCATMGSTKGGKWRGCTACGTTCGGATGG
>Assembly_Contig_3 
CTACGGTGGATTTCCAACCCACCTGAGCCGAACTTTGTAAGCCTCCGTTACTTTTTAGGAGGCTTACAAAGGTTGGCTCATATCGGTTGGAAAYCSATMGCAGAGTATAATGGTACAARCCAGCTTAACTGCGAGACRTACATGTCGAGCAGAGACGAAAGTCGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCRCATCCTGGGGCTGAAGCAGGTCCCAAGGGTAYGGCTGTTCGCCRTTTAAAGYGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGGCGTTGGATGATTGAGGAGAGTTGCCCCTAGTACGAGAGGACCGGGGTGAACGAACCACTAGTGCACCAATTGTTCTGCCAAGAGCATAGTTGGGTAGCTACGTTCGGATGWGATAACCGCTGAAGGCATCTAAGCGGGAAGCCAACTCCAAGATTAATCATCCCTGAAGATCCCAAGAAGACTACTTGGTTGATAGGCTGGGTGTGTAAGCGATGTAAGTCGTTTAGCTGACCAGTACTAATAGATCGTTTRKHTWWAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>Assembly_Contig_4 
CATATATATCCCAATTGCAGTGTCAAGCTGTAGTGGAGGTGAAAATTCCTCCTACCCGCGGAAGACGGAAAGACCCCGTGCACCTTTACTATAGCTTGACACTGCTGTTGGKAWWTTCATGTGCAGGATAGGTGGGAGCCATTGATTCATRGWCGCCAGWTTATGATGAGGCATCCYTKRRRWWMCACCCTTGAATATTCTGATAGCTAACTCCGTACAATTATCTTGTGCGAGGACAATGTCTGGTGGGTAGTTTGACTGGGGCGGTCGCCTCCTAAAAAGTAACGGAGGCTTACAAAGTTCGGCTCAGGTGGGTTGGAAATCCACCGTAGAGTATAATGGCATAAGCCGGACTGACTGTGAGACATACAWGTCGAGCAGAGTCGAAAGACGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGGAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAASSGGDVSGSSRRSYKGKTYHVRACGTCGTGAGACAGTTCGGTCCCTTA
And this is the command I'm using:
Code:
blastn -db nr.db -query contigs.fasta -out  outblastn.txt -export_search_strategy  blastn_parameters.txt  -num_threads 4
I need the top 10 hits for every sequence to feed to MEGAN. At the moment my output is:

Code:
BLASTN 2.2.27+

Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.

Database: nr
           21,062,489 sequences; 7,218,481,314 total letters

Query= Assembly_Contig_1
Length=603

***** No hits found *****

Lambda      K        H
    1.35    0.627     1.14 
Gapped
Lambda      K        H
    1.28    0.460    0.850 
Effective search space used: 3755491256660

Query= Assembly_Contig_2
Length=717

***** No hits found *****
and so on.....

Can someone help me please?

Don
10-19-2012, 11:02 AM   #2
cliffbeall
Senior Member
 
Location: Ohio

Join Date: Jan 2010
Posts: 144

The nr database at NCBI contains protein sequences, so you could format it as a protein database and run a blastx search against it. That is what many people do for MEGAN, because distant similarities are much easier to see at the protein level.
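
A quick way to catch this kind of mismatch before formatting is to look at the residues in the FASTA file itself. The sketch below is only a heuristic and not part of BLAST+: guess_fasta_type is a made-up helper name, and it assumes that letters such as E, F, I, L, P and Q (amino acids that are not IUPAC nucleotide codes) turn up within the first 1000 sequence lines of any protein file.

```shell
#!/bin/sh
# Heuristic sketch: guess whether a FASTA file holds protein or nucleotide
# sequences. guess_fasta_type is a hypothetical helper, not a BLAST+ tool.
guess_fasta_type() {
    # Skip header lines, sample the first 1000 sequence lines, and look for
    # letters that are amino acids but not IUPAC nucleotide codes.
    if grep -v '^>' "$1" | head -n 1000 | grep -qi '[EFILPQ]'; then
        echo prot
    else
        echo nucl
    fi
}

# Example call (assumes a FASTA file named nr in the current directory):
# guess_fasta_type nr
```

The printed value is the one to pass to makeblastdb -dbtype. Contigs that contain IUPAC ambiguity codes such as R, Y, W or K should still be reported as nucl, because none of those letters is in the protein-only set.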

If you really want to search nucleic acid against nucleic acid, download the database labeled nt (aka nr/nt).

A timesaver: you can download either one preformatted, so you don't have to format it with makeblastdb yourself.
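
The two routes above can be sketched as follows. The file names nr, nr.db and contigs.fasta come from the original post; nt.db, outblastx.txt and outblastn.txt are made-up names, and the run helper with its DRY_RUN variable is a hypothetical convenience, not something BLAST+ provides. The -num_descriptions and -num_alignments options limit the default pairwise report to 10 hits per query; whether that is the format MEGAN should be given is an assumption here. For the preformatted databases, BLAST+ ships an update_blastdb.pl script, in which case the makeblastdb step is not needed.

```shell
#!/bin/sh
# Sketch only. Print each command; run it only when DRY_RUN is unset or empty.
# (run and DRY_RUN are hypothetical helpers, not part of BLAST+.)
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Route 1: nr is protein, so format it as protein and search with blastx,
# which translates the nucleotide contigs before comparing them.
search_protein_nr() {
    run makeblastdb -in nr -dbtype prot -out nr.db &&
    run blastx -db nr.db -query contigs.fasta -out outblastx.txt \
        -num_descriptions 10 -num_alignments 10 -num_threads 4
}

# Route 2: nucleotide against nucleotide, using the nt FASTA file instead.
search_nucleotide_nt() {
    run makeblastdb -in nt -dbtype nucl -out nt.db &&
    run blastn -db nt.db -query contigs.fasta -out outblastn.txt \
        -num_descriptions 10 -num_alignments 10 -num_threads 4
}
```

Setting DRY_RUN=1 before calling search_protein_nr or search_nucleotide_nt only prints the commands, which is a cheap way to check them before starting a long run.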
10-19-2012, 11:08 AM   #3
dgio
Junior Member
 
Location: Ancona

Join Date: Oct 2012
Posts: 3

ooooppppss.....my bad...
I'm reformatting the database with the prot option and I'll try again.

I have one more question. Some people suggest turning off the low-complexity filtering. How do I do that in BLAST+, and is it a good idea?

Thank you very much for your suggestion. Greatly appreciated!
Don
10-22-2012, 08:04 AM   #4
cliffbeall
Senior Member
 
Location: Ohio

Join Date: Jan 2010
Posts: 144

I think masking is off by default for protein searches.

Maybe you could check a small subset and see what you get. If you see what seem to be erroneous taxonomic assignments and the alignments are to low-complexity regions, then turn the filtering on?
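
For the "how do I do it in BLAST+" part: query filtering is controlled by -dust for blastn and by -seg for the protein and translated programs, and passing "no" switches it off. The default differs between programs and may differ between versions, so the -help output of the installed program (for example blastx -help) is the place to confirm it. The helper below is a hypothetical convenience for picking the right option; mask_off_flag is a made-up name.

```shell
#!/bin/sh
# Sketch: print the BLAST+ option that switches off low-complexity filtering
# of the query for a given program. mask_off_flag is a hypothetical helper.
mask_off_flag() {
    case "$1" in
        blastn)                        echo "-dust no" ;;
        blastp|blastx|tblastn|tblastx) echo "-seg no" ;;
        *) echo "unknown program: $1" >&2; return 1 ;;
    esac
}

# Example (left unquoted on purpose so the two words are passed separately):
# blastx -db nr.db -query contigs.fasta $(mask_off_flag blastx) -out unmasked.txt
```

Running the same small subset with and without the option, as suggested above, shows whether the extra hits are real or just low-complexity matches. The output name unmasked.txt is only an illustration.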
Tags
454 data analysis, blast+, blastn, megan