SEQanswers

Old 01-29-2013, 03:56 AM   #1
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266
Sequencing technology database?

Hi all,

I'm planning to build a database of sequencing technologies, companies, instruments (platforms), and capabilities (i.e. run-types). It won't necessarily be limited to NGS, but that's where it will start.

An example data-point could be:
* Pyrosequencing
** Roche
*** GS-FLX
**** PE

Before we get bogged down adding data to the database, I'd like to discuss what types of data you think should be collected, i.e. what fields to include in the database.

Some ideas include: machine cost, reads per run, read length (statistics), run time, errors (error types and statistics), ...
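A minimal sketch of how the technology > company > instrument > capability hierarchy and a few of the suggested fields might be modelled. The class and attribute names here are assumptions made for illustration, not an agreed data model, and the wiki implementation could look quite different.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Capability:
    """One run-type of an instrument, e.g. paired-end (PE)."""
    name: str
    # Candidate fields from the post; None means "not yet recorded".
    reads_per_run: Optional[int] = None
    read_length_bases: Optional[int] = None
    run_time_hours: Optional[float] = None


@dataclass
class Instrument:
    name: str
    machine_cost: Optional[float] = None
    capabilities: List[Capability] = field(default_factory=list)


@dataclass
class Company:
    name: str
    instruments: List[Instrument] = field(default_factory=list)


@dataclass
class Technology:
    name: str
    companies: List[Company] = field(default_factory=list)


def paths(tech: Technology) -> List[Tuple[str, str, str, str]]:
    """Flatten the tree into (technology, company, instrument, capability) rows."""
    return [
        (tech.name, company.name, instrument.name, capability.name)
        for company in tech.companies
        for instrument in company.instruments
        for capability in instrument.capabilities
    ]


# The example data-point from the post, with no measurements filled in yet.
pyro = Technology(
    "Pyrosequencing",
    [Company("Roche", [Instrument("GS-FLX", capabilities=[Capability("PE")])])],
)
rows = paths(pyro)
```

Flattening the tree into rows like this is one way to get the kind of table a wiki page could display, with one line per capability.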

This will be a database in the wiki, so once we've settled on a data-model, everyone can contribute to the data, allowing the community to more easily compare platforms and keep track of trends and updates.

I'm thinking simple is best, but please let me know what you think should be included in such a database :-)
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 01-29-2013, 05:16 AM   #2
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

Hi Dan,

That makes me think of this: http://www.molecularecologist.com/ne...eldguide-2012/

__________________
Marco
Old 01-29-2013, 05:20 AM   #3
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

Another thought:

Machine and reagent costs vary from country to country. I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

Put another way, I do think such a database could uncover these hidden marketing strategies and reflect the true costs.
__________________
Marco

Last edited by marcowanger; 01-29-2013 at 05:21 AM. Reason: remove the quote
Old 01-29-2013, 05:48 AM   #4
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Oooh! Nice! What license is that data under? :-D
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 01-29-2013, 06:02 AM   #5
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Quote:
Originally Posted by marcowanger View Post
Another thought:

Machine and reagent costs vary from country to country. I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

Put another way, I do think such a database could uncover these hidden marketing strategies and reflect the true costs.
One way round this would be to allow people to submit estimates. Then we could provide upper and lower bounds, median etc... However, the idea of the wiki is to let that kind of consensus emerge through discussion... Not sure what's best here... I guess we could just have one value, and if people complain bitterly, provide for multiple values to be added?
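A rough sketch of the "submit estimates, report bounds and median" idea, using only the Python standard library. The function name and the numbers are made up for illustration; they are not real prices or a proposed interface.

```python
from statistics import median


def summarise(estimates):
    """Collapse several submitted estimates into lower bound, median and upper bound."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return {
        "n": len(estimates),
        "lower": min(estimates),
        "median": median(estimates),
        "upper": max(estimates),
    }


# Hypothetical submissions for one field of one instrument (illustrative values only).
example = summarise([100.0, 120.0, 150.0])
```

Storing every submitted value and computing the summary on display keeps both options open: a single headline number for readers, with the spread still available for anyone who wants it.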
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 01-29-2013, 06:18 AM   #6
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

I think multiple values may be better. If only a single value is allowed, it might become a reference price list from the manufacturer (which, IMHO, is useless and misses the point of being community-oriented).
__________________
Marco
Old 01-29-2013, 07:32 AM   #7
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747

I think a key bit for such a database is to timestamp all the entries on price, performance, etc. That way you could pull out trends.
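One way the timestamping could work, sketched under the assumption that each entry is stored as a dated observation rather than overwritten in place. The record layout, the instrument names and the values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    instrument: str
    field_name: str      # e.g. "cost_per_run"; the field names are assumptions
    value: float
    recorded_on: date    # the timestamp that makes trend queries possible


def trend(observations, instrument, field_name):
    """Return (date, value) pairs for one instrument and field, oldest first."""
    rows = [
        o for o in observations
        if o.instrument == instrument and o.field_name == field_name
    ]
    return [(o.recorded_on, o.value) for o in sorted(rows, key=lambda o: o.recorded_on)]


# Placeholder data, not real measurements.
obs = [
    Observation("InstrumentA", "cost_per_run", 1000.0, date(2013, 3, 1)),
    Observation("InstrumentA", "cost_per_run", 1200.0, date(2012, 9, 1)),
    Observation("InstrumentB", "cost_per_run", 500.0, date(2013, 1, 1)),
]
cost_trend = trend(obs, "InstrumentA", "cost_per_run")
```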

The mix-and-match nature of the technologies can make life more complicated and may be fun to model. For example, with PacBio there are separate loading and running chemistries, and right now there are two choices for each (sadly, with the same names in each one): C3 and XL. So you can load C3 and run XL (but I think nobody does, since it puts each in its weak spot), load XL and run C3 (longer reads at higher quality), load XL and run XL (longest reads, but quality drops), or load C3 and run C3 (if you don't have any XL kits).
Old 01-29-2013, 08:37 AM   #8
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Cry...

So ... I guess... I'd call each one of these combinations a different 'capability' of the one machine?
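Treating each loading/running combination as its own capability could be sketched like this. The chemistry names come from the post above; the naming scheme for the resulting capabilities is an assumption.

```python
from itertools import product

# Chemistry choices as described in the thread.
LOADING = ["C3", "XL"]
RUNNING = ["C3", "XL"]


def capability_names(loading, running):
    """Build one capability label per (loading, running) chemistry pair."""
    return [f"load {load} / run {run}" for load, run in product(loading, running)]


names = capability_names(LOADING, RUNNING)
# names == ["load C3 / run C3", "load C3 / run XL",
#           "load XL / run C3", "load XL / run XL"]
```

Generating the combinations keeps the list complete even when a pairing is rarely used; whether such pairings deserve their own database rows is a separate editorial choice.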
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 01-29-2013, 08:43 PM   #9
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

I think that is why we need a database for this ..
__________________
Marco
Old 03-06-2013, 11:34 PM   #10
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Any more thoughts on fields to collect? Based on the link from Marco, I currently have this list of fields:
* Instrument
* Run time
* Millions of Reads/run
* Bases / read
* Yield (MB/run)
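Of these fields, yield can be derived from two of the others, so it could serve as a consistency check on submitted rows. A small sketch, assuming "MB" means megabases, that each read contributes its full length, and with made-up numbers:

```python
def yield_mb_per_run(millions_of_reads, bases_per_read):
    """Yield in megabases per run = (millions of reads) x (bases per read)."""
    return millions_of_reads * bases_per_read


def is_consistent(millions_of_reads, bases_per_read, reported_yield_mb, tolerance=0.1):
    """True if the reported yield is within a relative tolerance of the derived one."""
    expected = yield_mb_per_run(millions_of_reads, bases_per_read)
    return abs(reported_yield_mb - expected) <= tolerance * expected


# Illustrative values only: 1 million reads of 400 bases is 400 megabases.
derived = yield_mb_per_run(1.0, 400)
```

The 10% default tolerance is arbitrary; vendors often quote rounded or best-case figures, so a loose check is probably more useful than an exact one.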

What about costs? Cost per run, cost per box? Cost of sample prep? Kits?
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.
Old 03-06-2013, 11:41 PM   #11
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

I think they all make sense.
__________________
Marco
Old 03-07-2013, 12:17 AM   #12
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Thing is, I don't want to implement a LIMS (I don't mind doing it, but I don't have time to do it!)
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.

Tags
database, technology, wiki

All times are GMT -8. The time now is 02:01 AM.

