Old 08-15-2008, 04:40 AM   #1
Location: Chapel Hill

Join Date: Aug 2008
Posts: 22
Default CLC Bio

Is anyone using the CLC Bio genomics tools? If so, would you be willing to share your experiences?
dcfargo is offline   Reply With Quote
Old 08-15-2008, 05:22 AM   #2
Senior Member
Location: Switzerland

Join Date: Aug 2008
Posts: 116

Will be attending a workshop here by CLC bio on 26 Aug. The CLC Genomics Workbench sounds cool. Will find out soon.
Melissa is offline   Reply With Quote
Old 08-15-2008, 07:27 AM   #3
--Site Admin--
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358

Their software is very well put together and runs well on Mac and Windows. I really liked the general package.

For now, their SOLiD analysis methodology leaves a bit to be desired (they just convert all reads to base space immediately), but they are working on it.
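To illustrate why immediate conversion to base space is a concern, here is a minimal sketch of the standard SOLiD two-base decoding. It is not CLC's code; the function name and the example reads are made up for illustration. Each color is the XOR of the 2-bit codes of two adjacent bases, so decoding is a running XOR from the known primer base, and a single miscalled color changes every base after it.

```python
# 2-bit codes chosen so that color = code(previous base) XOR code(next base).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def decode_colorspace(read):
    """Decode a color-space read such as 'T320010' (primer base + colors)."""
    prev = BASE_TO_BITS[read[0]]  # known primer base, not part of the output
    bases = []
    for color in read[1:]:
        prev ^= int(color)  # each color encodes the transition to the next base
        bases.append(BITS_TO_BASE[prev])
    return "".join(bases)


good = decode_colorspace("T320010")
bad = decode_colorspace("T310010")  # hypothetical read with one miscalled color
mismatches = sum(a != b for a, b in zip(good, bad))
print(good, bad, mismatches)
```

In the example, the second color is miscalled and every decoded base from that position onward differs. Aligning in color space instead lets an aligner treat an isolated color mismatch as a likely sequencing error.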
ECO is offline   Reply With Quote
Old 08-15-2008, 07:48 AM   #4
Senior Member
Location: USA

Join Date: Jan 2008
Posts: 482

Originally Posted by Melissa View Post
Will be attending a workshop here by CLC bio on 26 Aug. The CLC Genomics Workbench sounds cool. Will find out soon.
It will be great to hear back how the tool performs. They talk a lot about the tertiary analysis, but how does the REAL alignment work, and how does it compare to others?
bioinfosm is offline   Reply With Quote
Old 08-29-2008, 05:48 AM   #5
Director at CLC bio
Location: Denmark

Join Date: Aug 2008
Posts: 26
Default CLC Genomics Workbench and more

Dear all,

Several people have requested that we write an introduction to the CLC Genomics Workbench, so here goes.

Next generation sequencing technologies are causing some dramatic changes in the high-throughput sequencing landscape and in turn generating a lot of challenges to the field of bioinformatics. The Genomics Workbench was created to address these challenges.
The objective of the CLC Genomics Workbench is to create an integrated bioinformatics environment which combines the power to handle the magnitude of NGS data with a carefully designed graphical user interface.

For the first version we have focused on handling the secondary level of NGS bioinformatics, namely de novo assembly and reference assembly. However, we have also included some tertiary analyses like SNP detection and graphical identification of large scale genomic events.
For a full feature list, have a look here.

Version 2.0 of the software is out in a few days, and for this release we have focused on bringing our Workbench to a state where it can comfortably handle human genome size data sets. This includes the following improvements:
  • A completely new short read assembler delivering the world's fastest reference assembly (click here for more info and a white paper)
  • Improved memory handling
  • Options to mask reference genomes
  • Smoother handling of hybrid data sets (cross-platform, cross-experiment-design)

Alongside Genomics WB 2.0, we are also releasing a command line program package for de novo and reference assembly which will give users access to these tools in a scripting environment. This package is a separate product which includes the fast assembly algorithms and a number of utilities for handling assembly results.

Having established a firm basis for secondary analysis we have an ambitious roadmap for including more tertiary analysis tools later this year. These include:
  • Tag and array based transcriptomics
  • Advanced feature queries feature tracks
  • ChIP-seq framework
  • Improved de-novo assembly
  • Improved detection of genome scale events
  • Full support for color space analysis

Further down the line we are looking at including features like:
  • RNA-seq
  • CNV detection
  • Metagenomics analyses
  • And lots more

However, although we intend to provide a very comprehensive tool set, we know that we cannot cover every application there is. For this reason, we are focusing on providing an open, industry-strength platform that users can modify and extend: we provide a Software Developer Kit which gives access to an extensive and well-supported API and a developer community.

I hope this was of help and please feel free to post any questions or comments to this that you may have.


Roald is offline   Reply With Quote
Old 09-01-2008, 07:25 PM   #6
Senior Member
Location: Switzerland

Join Date: Aug 2008
Posts: 116

Originally Posted by bioinfosm View Post
It will be great to hear back how the tool performs. They talk a lot about the tertiary analysis, but how does the REAL alignment work, and how does it compare to others?
Although I have no prior knowledge in assembling NGS data, I'll try my best to answer your question.

At first glance, the software is user-friendly, especially for people like me who can neither read scripts nor write code. I like the graphical view that shows coverage at every position. The colour coding in the alignment is very useful for telling different kinds of reads apart. CLC does local alignment, and it's fast. 2 GB of RAM should be good enough. Running the analysis on your own laptop is pretty attractive, but you have to make sure you close all other programs. CLC can map 85% of the human sequence reads to the reference genome, compared to 83% by SOAP and MAQ (2 percentage points does make a lot of difference).

The website is self-explanatory.
Check out this video on how to handle multiplexing data assembly

Here's a few things I noticed about this software
  • Although it can assemble all sorts of sequence data, there is a size limit for Sanger reads (max = 10,000 sequences, if I'm not mistaken).
  • If a read can be mapped to several locations in the genome, it will be mapped to only one of them. Such a read is shown in yellow, which distinguishes it from other reads. Alternatively, you can choose to remove these reads.
  • My greatest interest, besides assembling hybrid data, is how well the SNP detection performs. It's based on the quality of neighbouring bases (which sounds good). I'm still not very confident about how well that filters out false positives. They could not give me any validation rate for the SNPs discovered.
  • Another thing is the quality score. All quality scores are converted to Phred scores. As far as I know, quality scores from different sequencing platforms mean different things. For example, a 454 quality score only indicates whether the homopolymer length has been called correctly.
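
For readers unfamiliar with the Phred scale mentioned in the last point, here is a minimal sketch of the standard definitions. It is general background, not CLC's conversion code, and the function names are made up for illustration. A Phred score is Q = -10 * log10(p), where p is the probability that the base call is wrong. Early Solexa pipelines instead reported an odds-based score, Q = -10 * log10(p / (1 - p)), which is one concrete case where the same number means something different depending on the platform.

```python
import math


def phred_from_error_prob(p):
    """Phred quality for a base-call error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def error_prob_from_phred(q):
    """Error probability implied by a Phred quality q."""
    return 10 ** (-q / 10)


def solexa_to_phred(q_solexa):
    """Convert an odds-based (early Solexa) score to the Phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


# The two scales diverge at low quality and converge at high quality.
for q in (-5, 0, 10, 40):
    print(q, round(solexa_to_phred(q), 2))
```

The conversion only rescales numbers that already estimate a per-base error probability. It cannot by itself make a score that measures something else, such as homopolymer length confidence, comparable across platforms.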

I hope these problems can be resolved in version 2.0. I was told that Genomics WB 2.0 will support barcoded Solexa paired-end reads.
At the end of the day, my questions remain unanswered: how good are the de novo assembly and the SNP detection? We won't know until someone has tried it out. The 30-day trial version is available for download.
Melissa is offline   Reply With Quote
