#1 |
Junior Member
Location: USA Join Date: Jan 2011
Posts: 2
|
Hi everybody,
I am new to CLC Genomics and 454 data. I am working on a non-model species (a limpet), so I don't have a reference genome. I did a 454 run on a cDNA library (transcriptome) and successfully trimmed and aligned the sequences. Now I would like to BLAST the contigs against all organisms in NCBI using blastx or blastn to find out which genes these contigs correspond to. I would like to know whether I can do that directly with the NCBI BLAST available in CLC Genomics, or whether I have to download RefSeq from NCBI to run a local BLAST. I have around 30,000 contigs to BLAST. I know that when you submit too many sequences to NCBI as a batch through a piece of software, you can sometimes be "blacklisted" and blocked from using NCBI (it happened to a researcher from my previous lab who didn't know about this beforehand...), so I don't want this to happen. I guess it may depend on the software you use (maybe different software submits the batch in different ways; I am not a bioinformatician...).
Thank you in advance for your help!
All the best,
Sophie
#2 |
Senior Member
Location: Pathum Thani, Thailand Join Date: Nov 2009
Posts: 190
|
Get yourself Blast2GO. It is very easy to use (no programming required) and does exactly what you want.
http://www.blast2go.org/start_blast2go
However, in the contigs output file, each contig has the previous contigs from the isotig it belongs to appended to it, so you need to do some data manipulation (compare the listed contig length with the actual length of the sequence for contigs with status=isotig and you will see what I mean). Below is a good example: the actual sequence length is 521 bp, but the contig is listed as 125 bp. The previous contig has 396 bp and has been appended to the start of this contig due to a programming error (Roche are aware of it), e.g.
Code:
>contig17281 length=125 numreads=55 gene=isogroup00117 status=isotig
TCCTTCCATgTTGTTTACATGGGGATAAAACCGCCTTGTTTTTtCTAAAGAGGGATGAAa
CCTATgCTCCCTAAAGCgtATGAATCcTGGgcGaCCAAAgTCCAATCcAcAtGGTACAAC
TTTGaCATCTCTTTTTCTgAGTgCATAGTCTATAATaGCTTCATTCTCCGGAAtCATCAC
aGAACAagTTGAGTAgACTACAAaTCCTCCTGATTTGGAaTTAgcGTCcACTAAATCAAT
TGCTgCTAAAATCaGTtGCTTTTgAAgAAAAgCaCAATTTcGTACATCTTCAATGGACTT
GGATGTTTTAATAGATTGTTGATCTGGGCATATAGTCCCACTGCCGGTGCAGGGAGCATC
CAATAATACTCTATCAACAGAATTTAATCCAAGGATcttcggtagctccttcccATcata
gTTCttCAGGTTGTTCTCCCTTGCTCTtGCAATACTGTGCTCCTCcAACCTTTCTcTtCT
TCAAGGCTTTCCTTTTCTCCTCTCTGGCTTGTGAAATTTCC
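The data manipulation described above could be scripted roughly as follows. This is only a sketch under the assumption stated in the post, namely that the extra bases are always prepended to the start of the record, so the real contig is the last `length=` bases. The function names `parse_fasta`, `declared_length` and `trim_prepended` are made up for this illustration and are not part of any Roche or Blast2GO tool.

```python
import re


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def declared_length(header):
    """Return the integer from the 'length=' field of the header, or None."""
    match = re.search(r"\blength=(\d+)", header)
    return int(match.group(1)) if match else None


def trim_prepended(header, seq):
    """Keep only the last 'length=' bases of the sequence.

    Assumption (from the post): the surplus sequence sits at the START
    of the record, so the true contig is the tail of the sequence.
    Records without a length field, or that are not too long, are
    returned unchanged.
    """
    n = declared_length(header)
    if n is None or len(seq) <= n:
        return seq
    return seq[len(seq) - n:]
```

For the example record above, `declared_length` would give 125 and the sequence has 521 bases, so `trim_prepended` would drop the first 396 bases. It is worth spot-checking a few trimmed contigs by hand before running BLAST on the whole file.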
#3 |
Junior Member
Location: USA Join Date: Jan 2011
Posts: 2
|
Thank you Jeremy for your reply, that was very helpful.
Blast2GO is now running, doing exactly what I want!!!
Sophie
#4 |
Junior Member
Location: Chennai Join Date: Dec 2013
Posts: 6
|
Hi, I would also like to run blastx for my non-model plant species. I assembled some 72,000 reads into 29059 unigenes. I would like to know whether Blast2GO can keep running when the system is hibernating or asleep. I would also like to know how to obtain the exact gene function for these unigenes: I tried the first 10 unigenes as a sample, and I could annotate and obtain pathway information for only 2 of them. Kindly help me with this.
-Swetha
#5 | |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,091
|
Quote:
BTW: What kind of sequencing is this? (72,000 reads is pretty small for NGS, but a good size for Sanger.) 29,000 unigenes from 72,000 reads does not look very promising.
|
#6 |
Junior Member
Location: Chennai Join Date: Dec 2013
Posts: 6
|
I would like to run blastx against the nr database. I'm currently working on transcriptome reads from a non-model plant, obtained by 454 sequencing. The datasets were downloaded from a database, so I personally don't know much about the sequencing details. After preprocessing and QC, I was left with 72018 of the 81146 reads. I performed de novo assembly, which produced 29051 unigenes. In the paper I referred to, from the same number of raw reads, they obtained about 20000 unique sequences (around 12k singletons and 8k contigs), so I thought my assembly was not that bad. What tools are used to check the quality of an assembly? How do I validate my assembly stats?
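As a rough first pass on the "assembly stats" question, the basic length summary (number of sequences, total bases, longest sequence, N50) can be computed directly from the contig lengths. The sketch below is not a replacement for a dedicated assembly-evaluation tool; `assembly_stats` is a made-up name, and N50 is taken here to mean the length L such that sequences of length L or longer cover at least half of the total bases. For transcriptome assemblies these numbers are only a coarse indicator.

```python
def assembly_stats(lengths):
    """Summarize a list of contig/unigene lengths.

    N50 is the length of the sequence at which the running total,
    taken from the longest sequence downwards, first reaches half
    of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    n50 = 0
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "count": len(ordered),
        "total_bp": total,
        "longest": ordered[0] if ordered else 0,
        "n50": n50,
    }
```

The lengths themselves can be collected with any FASTA reader, one integer per unigene.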
|
#7 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,091
|
Swetha: ~30K sequences is going to be a big blastx job to run against the nr db. You will need to use some sort of cluster if you have any hope of finishing in a reasonable period of time.
I am not sure what you are trying to do (are you just trying to recreate the analysis reported in the paper?). If you are only interested in the contigs and want to do something else with that data, consider contacting the authors of the paper to see if they can share the contig file. There are threads on this forum about tools for checking assembly quality (search for them).
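One common way to spread a large blastx job over a cluster is to split the query FASTA into smaller files and submit each one as a separate job. The sketch below only does the splitting; `split_fasta` and the chunk size are illustrative choices, and how the pieces are then submitted depends entirely on the scheduler available to you.

```python
def split_fasta(text, records_per_chunk):
    """Split FASTA text into a list of FASTA strings.

    Each returned string holds at most `records_per_chunk` records.
    Lines before the first '>' header are ignored.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    chunks, current, count = [], [], 0
    for line in text.splitlines():
        if line.startswith(">"):
            if count == records_per_chunk:
                # Current chunk is full: close it and start a new one.
                chunks.append("\n".join(current) + "\n")
                current, count = [], 0
            count += 1
        if count:
            current.append(line)
    if current:
        chunks.append("\n".join(current) + "\n")
    return chunks
```

With roughly 30,000 sequences, a chunk size of 500 would give about 60 query files. Each chunk can be written to its own file and searched independently, and the per-chunk results collected afterwards.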
#8 |
Junior Member
Location: Chennai Join Date: Dec 2013
Posts: 6
|
Yes, I would like to perform the analysis. I'm new to NGS data analysis, so I want to learn all the basic steps, starting from QC and assembly. I also asked the author for the supplementary files, but I didn't get any reply. My other question is: can we run BLASTX using cloud computing services and import the results into Blast2GO for the further annotation process? The BLASTX run I performed for 29501 sequences gave me output in txt format. How do I import it into Blast2GO for the further annotation steps? I know that B2G can perform BLASTX on its own, but the cloud services are pretty fast at obtaining the blastx results. So please suggest a cloud pipeline for annotating de novo assembled unigenes.
|
#9 |
Member
Location: Guangzhou China Join Date: Aug 2013
Posts: 82
|
What we do is run both BLAST and Blast2GO locally, but that requires access to a cluster.
I'm not familiar with cloud computing, but I think you can convert the BLAST results from txt to XML and then feed them to the B2G pipeline.
#10 |
Junior Member
Location: Chennai Join Date: Dec 2013
Posts: 6
|
Thank you so much for your reply. Are there any online converters to convert a txt file to XML format? When I searched, I came across only XML-to-txt converters. And what does getting access to a cluster mean?
|
#11 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,091
|
@Swetha: Since your original blastx search finished quickly, perhaps you can go back and re-run it, this time saving the output as XML (-outfmt 5).
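A re-run with XML output might look like the sketch below, which just assembles a BLAST+ `blastx` command line and hands it to `subprocess`. The file names, the E-value cutoff and the thread count are placeholders chosen for illustration, not values from this thread; `build_blastx_command` and `run_blastx` are made-up helper names. It assumes BLAST+ is installed and a local protein database is available.

```python
import subprocess


def build_blastx_command(query, db, out, evalue="1e-3", threads=1):
    """Return the argument list for a blastx run that writes XML.

    '-outfmt 5' selects BLAST XML output, as suggested above.
    The evalue and threads defaults are illustrative assumptions.
    """
    return [
        "blastx",
        "-query", query,
        "-db", db,
        "-outfmt", "5",
        "-out", out,
        "-evalue", str(evalue),
        "-num_threads", str(threads),
    ]


def run_blastx(query, db, out, evalue="1e-3", threads=1):
    """Run blastx and raise CalledProcessError if it exits with an error."""
    cmd = build_blastx_command(query, db, out, evalue, threads)
    subprocess.run(cmd, check=True)
    return out
```

For example, `run_blastx("unigenes.fasta", "nr", "unigenes_vs_nr.xml", threads=4)` would write the XML results to `unigenes_vs_nr.xml`, which is the file you would then try loading into Blast2GO.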
|
Tags |
454, blast, ncbi |