SEQanswers

Old 08-17-2010, 10:34 AM   #1
Anelda
Member
 
Location: Cape Town, South Africa

Join Date: May 2010
Posts: 30
Annotation of NGS data with CLC Bio

Hi there,

Has anyone used CLC Bio (or can it be used) to annotate complete genomes or large contigs obtained from either de novo assembly or mapping to a reference genome? I'm talking about gene prediction, TFBS finding, GO annotation etc.

Thanks!
Old 08-17-2010, 10:47 AM   #2
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

Yes, a PDF explaining how to do this was posted previously. Note that for large mammalian genomes the files become very large with each annotation added. For human I usually annotate only the exons and SNPs.
Old 08-17-2010, 10:50 AM   #3
Anelda
Member
 
Location: Cape Town, South Africa

Join Date: May 2010
Posts: 30

Can someone please send the pdf or refer me to the link where I can find it?
Old 08-17-2010, 11:37 AM   #4
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319

I think this might be it: http://www.clcbio.com/index.php?id=1343
Old 08-17-2010, 11:41 AM   #5
Anelda
Member
 
Location: Cape Town, South Africa

Join Date: May 2010
Posts: 30

Quote:
Originally Posted by kopi-o
I think this might be it: http://www.clcbio.com/index.php?id=1343
Hi, thanks for your help. Unfortunately, this is for uploading a GFF file. That means you have already annotated the genome, created a GFF file, and just want to upload your annotations. What I am looking for is how to actually create such a file using CLC, if this can be done.

Thanks again.
Old 08-17-2010, 04:44 PM   #6
husamia
Member
 
Location: cinci

Join Date: Apr 2010
Posts: 66

Can you give an example of what you mean by "annotate"?

Last edited by husamia; 08-17-2010 at 04:44 PM. Reason: short
Old 08-17-2010, 07:31 PM   #7
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 203

I would actually use Artemis instead:
http://www.sanger.ac.uk/resources/software/artemis/

I believe CLC bio has the ability to do the same, but I got lost in the menus.
Artemis was written specifically for this purpose, and it handles EMBL format, which might be useful for submission.

But correct me if I am wrong: you are not really trying to annotate NGS data, but contigs derived from NGS data.
Old 08-17-2010, 10:17 PM   #8
Anelda
Member
 
Location: Cape Town, South Africa

Join Date: May 2010
Posts: 30

I have come across several very good tools designed specifically for annotation, but I've been told that CLC can give me the same annotation. Unfortunately, no one can tell me how to get it from CLC. I was hoping someone in the community has had some experience. Sorry, I realize this is moving a bit away from NGS.

Any help would be much appreciated.
Old 08-17-2010, 10:41 PM   #9
KevinLam
Senior Member
 
Location: SEA

Join Date: Nov 2009
Posts: 203

Oh, no worries, I just wanted to clarify the problem in case I was misunderstanding the question.
I would think the commercial provider would be able to answer your question better.
You might get an answer here, but getting the information from the source would be much more direct and helpful.
You did pay for the software, so support is obligatory.
Old 08-17-2010, 11:38 PM   #10
Anelda
Member
 
Location: Cape Town, South Africa

Join Date: May 2010
Posts: 30

Quote:
Originally Posted by KevinLam
You did pay for the software, so support is obligatory.
That's just the thing: I haven't bought it yet. I'm still trying to determine what percentage of our work could be done with it, and I have the trial version. The vendor has been helpful, but has not really answered my question so far. That's why I've gone to the community in the meantime. I will poke the vendor again today.
Old 11-17-2010, 11:49 PM   #11
alpapan
Junior Member
 
Location: Canberra

Join Date: Nov 2010
Posts: 3
Alternatively

Hey

We've been using Geneious Pro rather successfully to accomplish this for small contigs (I guess it depends on how much memory your computer has). It lets you annotate a sequence and export it as GFF. You can also import other GFFs as annotations only (i.e. a GFF file without a FASTA section, provided you have a Geneious document with the reference sequence).
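For readers unsure what "a GFF file with or without a FASTA section" looks like on disk, here is a minimal sketch in Python. It is not how Geneious or CLC bio produce their exports; the `Feature` class, the `write_gff3` helper, the source name "example" and the contig and gene names are all made up for illustration. It only relies on the GFF3 layout itself: nine tab-separated columns, 1-based inclusive coordinates, and an optional `##FASTA` section after the features.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One annotation; coordinates are 1-based and inclusive, as in GFF3."""
    seqid: str
    ftype: str          # e.g. "gene" or "mRNA"
    start: int
    end: int
    strand: str = "+"   # "+", "-" or "." when unknown
    attributes: dict = field(default_factory=dict)

    def to_gff3(self, source: str = "example") -> str:
        attrs = ";".join(f"{k}={v}" for k, v in self.attributes.items()) or "."
        # Columns: seqid, source, type, start, end, score, strand, phase, attributes
        cols = [self.seqid, source, self.ftype, str(self.start), str(self.end),
                ".", self.strand, ".", attrs]
        return "\t".join(cols)


def write_gff3(features, fasta=None) -> str:
    """Return GFF3 text; append a ##FASTA section only if sequences are given."""
    lines = ["##gff-version 3"]
    lines += [f.to_gff3() for f in features]
    if fasta:
        lines.append("##FASTA")
        for name, seq in fasta.items():
            lines += [f">{name}", seq]
    return "\n".join(lines) + "\n"


# Hypothetical features on a hypothetical contig
features = [
    Feature("contig1", "gene", 101, 400, "+", {"ID": "gene1"}),
    Feature("contig1", "mRNA", 101, 400, "+", {"ID": "mrna1", "Parent": "gene1"}),
]
annotations_only = write_gff3(features)                            # no FASTA section
with_sequence = write_gff3(features, fasta={"contig1": "ACGT" * 100})
```

The `annotations_only` form is the kind of file that needs the reference sequence to be loaded separately, while `with_sequence` carries the sequence along with the annotations.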

Geneious is also prettier than CLC bio and much much cheaper...

cheers,
a
Old 11-18-2010, 06:58 AM   #12
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

CLC Bio is a bit buggy. You have to force-import your reference genome as normal sequence data, NOT next gen sequence data. Once you do that, it is easy to annotate it using a GFF file. I tried to attach the PDF describing it, but this website gave an error message. If you private message me with your email address, I can email it to you.
Old 11-18-2010, 07:05 AM   #13
johnny
Member
 
Location: Germany

Join Date: Dec 2009
Posts: 15

When you mark a base, you can right-click on it and use "Add Annotation".
For extracting annotations you have to install a plugin called "Extract annotations":

http://www.clcbio.com/index.php?id=873

There you will also find the "Annotate Sequence with GFF File" plugin.
Old 11-18-2010, 10:50 AM   #14
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

I have the same question as husamia: what do you mean by "annotate"?
Labeling a genomic region as a "contig/cluster/whatever of reads/ESTs/whatever" is one thing; performing automatic gene structure prediction is a totally different exercise (although both can produce a GFF file). I do not believe that either CLC or Geneious does the latter.
Old 11-18-2010, 11:18 AM   #15
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

You have to annotate your reference genome to know where the coding exons and SNPs are. After the assembly and SNP/DIP detection the software tells you if the variations are coding or known SNPs. You have to know this for mutation discovery.
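As a rough illustration of the idea (not of what CLC bio does internally), the sketch below reads exon intervals from GFF text and labels variant positions by whether they fall inside one. The function names, the `demo` source, and the coordinates are invented for the example. It also assumes that "exon" features are what you want to test against; if your annotation distinguishes coding sequence, passing `feature_type="CDS"` would be closer to "coding".

```python
def parse_features(gff_text, feature_type="exon"):
    """Collect sorted (start, end) intervals per sequence for one feature type."""
    intervals = {}
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # anything after this marker is sequence, not annotation
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        # GFF coordinates are 1-based and inclusive
        intervals.setdefault(cols[0], []).append((int(cols[3]), int(cols[4])))
    for spans in intervals.values():
        spans.sort()
    return intervals


def overlaps_feature(intervals, seqid, pos):
    """True if a 1-based position falls inside any interval on that sequence."""
    return any(start <= pos <= end for start, end in intervals.get(seqid, []))


# Made-up reference annotation and variant positions, for illustration only
gff = (
    "##gff-version 3\n"
    "chr1\tdemo\texon\t100\t200\t.\t+\t.\tID=exon1\n"
    "chr1\tdemo\texon\t500\t650\t.\t+\t.\tID=exon2\n"
)
variants = [("chr1", 150), ("chr1", 300), ("chr1", 650), ("chr2", 150)]

exons = parse_features(gff)
labels = [
    (seqid, pos, "exonic" if overlaps_feature(exons, seqid, pos) else "non-exonic")
    for seqid, pos in variants
]
```

The linear scan is fine for a handful of variants; for a whole genome an interval tree or a sorted-interval search would be the more usual choice.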
Old 11-18-2010, 11:31 AM   #16
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Quote:
Originally Posted by NextGenSeq
You have to annotate your reference genome to know where the coding exons and SNPs are. After the assembly and SNP/DIP detection the software tells you if the variations are coding or known SNPs. You have to know this for mutation discovery.
OK, so that is not "gene prediction" as mentioned in the first post, if I get it right. If you want to automatically annotate gene structures on a genomic sequence, then look for something like G-Mo.R-Se or Cufflinks, or a traditional gene finder.
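To show how different this is from loading an existing GFF file, here is a deliberately naive sketch of the simplest kind of prediction: scanning both strands for open reading frames that run from ATG to a stop codon. This is a toy, not how G-Mo.R-Se, Cufflinks, or any real gene finder works (they use RNA-Seq evidence or statistical models and handle introns); the `find_orfs` name and the `min_codons` threshold are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(seq, min_codons=3):
    """Return (start, end, strand) tuples for ATG..stop ORFs.

    Coordinates are 1-based, inclusive, and always given on the forward
    strand. min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    reverse = seq.translate(COMPLEMENT)[::-1]  # reverse complement
    for strand, s in (("+", seq), ("-", reverse)):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        if strand == "+":
                            hits.append((start + 1, i + 3, strand))
                        else:
                            # map reverse-complement offsets back to forward coordinates
                            hits.append((n - (i + 3) + 1, n - start, strand))
                    start = None
    return hits


# Made-up sequence containing one short ORF: ATG AAA TTT TAG
orfs = find_orfs("CCATGAAATTTTAGCC")
```

Each hit could then be written out as a feature line in a GFF file, which is the point where prediction and "annotate with a GFF file" meet.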
Old 11-18-2010, 11:48 AM   #17
bioinfosm
Senior Member
 
Location: USA

Join Date: Jan 2008
Posts: 482

Sorry, but what's DIP?
__________________
--
bioinfosm
Old 11-18-2010, 11:51 AM   #18
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

I bet on deletion/insertion polymorphism.
Old 03-29-2011, 08:17 AM   #19
pedrotanno
Junior Member
 
Location: brazil

Join Date: Oct 2009
Posts: 3

Quote:
Originally Posted by NextGenSeq
CLC Bio is a bit buggy. You have to force-import your reference genome as normal sequence data, NOT next gen sequence data. Once you do that, it is easy to annotate it using a GFF file. I tried to attach the PDF describing it, but this website gave an error message. If you private message me with your email address, I can email it to you.

Hi guys,
I'm having the same problem using CLC, and I couldn't find any viable solution. About this PDF: is it from CLC? We could not make it work even with their example data from the 'Annotation Plug-in'.
Old 03-29-2011, 08:48 AM   #20
zee
NGS specialist
 
Location: Malaysia

Join Date: Apr 2008
Posts: 249

If you're looking to do basic analysis, have a look at some open solutions, e.g. DIYA. It's quite simple to run a set of programs in serial. The hard part of annotation is manual curation, and Artemis is a good, tried and tested option for that. Geneious has some good offerings and it is prettier.

Tags
annotate, clc bio, genome

All times are GMT -8.

