Old 07-30-2013, 02:36 PM   #4
dan
wiki wiki
 
Location: Cambridge, England

Join Date: Jul 2008
Posts: 266

Hi lpantano,

Great news that you can be involved!

I like both these ideas :-)

Currently the form fields use 'auto completion' [1] to help users pick standard terms, but you are right, a lot of 'overlapping' terminology has been used... You can see all unique values here:
http://seqanswers.com/wiki/Special:BrowseData
As an aside, that page is a great place to start when working on data standardization. You can quickly see common vs. rare terms, and a few clicks take you to the pages to update.
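
To make the 'overlapping' terminology point concrete, here is a rough Python sketch of how values that differ only in case or punctuation could be grouped and counted. The sample values are made up for illustration (they are not pulled from the wiki), and `group_variants` is a hypothetical helper, not something the site runs.

```python
from collections import Counter, defaultdict


def group_variants(values):
    """Group field values that differ only in case, spacing or punctuation.

    Returns {normalized_key: Counter of the original spellings}, keeping
    only keys that were written in more than one way.
    """
    groups = defaultdict(Counter)
    for value in values:
        # Keep letters and digits only, lowercased, as the comparison key.
        key = "".join(ch for ch in value.lower() if ch.isalnum())
        groups[key][value] += 1
    return {key: counts for key, counts in groups.items() if len(counts) > 1}


# Made-up sample values, for illustration only.
values = ["ChIP-Seq", "ChIP-seq", "chip seq", "RNA-Seq", "ChIP-Seq"]
overlaps = group_variants(values)
for key, counts in overlaps.items():
    print(key, counts.most_common())
```

The most common spelling in each group is a reasonable candidate for the 'standard' term, and the rarer ones are the pages to update.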


Additionally, we build a page for each term, which tries to integrate data from the EDAM biological software ontology [2]. So we could flag all terms that aren't formally in the ontology (and either rename them or request that they be added to EDAM).
As an aside, we clearly need to make EDAM integration more visible... I just build the pages and give up ;-)
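
For what it's worth, the flagging idea could look something like the sketch below. It assumes we already have the ontology labels and synonyms in hand as a plain dict (how they get fetched is left out), and the sample data and function names are hypothetical.

```python
def normalize(term):
    """Lowercase and unify separators so 'ChIP-Seq' and 'chip seq' compare equal."""
    return " ".join(term.lower().replace("-", " ").replace("_", " ").split())


def flag_terms(wiki_terms, ontology_labels):
    """Return the wiki terms that match no ontology label or synonym.

    ontology_labels maps each preferred label to a list of its synonyms.
    """
    known = set()
    for label, synonyms in ontology_labels.items():
        known.add(normalize(label))
        known.update(normalize(s) for s in synonyms)
    return sorted(t for t in wiki_terms if normalize(t) not in known)


# Made-up sample data, for illustration only (not real wiki or EDAM exports).
ontology = {"Peak calling": ["Peak finding"], "ChIP-seq": []}
wiki = ["Peak calling", "peak-finding", "ChIP-Seq", "Peak detection thing"]
unmatched = flag_terms(wiki, ontology)
print(unmatched)
```

Anything left in `unmatched` is a candidate for either renaming locally or proposing as a new term.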


Here are some examples:
* Biological domain - http://seqanswers.com/wiki/ChIP-Seq
* Bioinfx method - http://seqanswers.com/wiki/Peak_calling

On those pages the description text and synonyms are queried from EDAM via the BioPortal API [3]. It would be possible to key the form auto completion on that source rather than on the terms used locally, but it's technically challenging to present users with definitions while auto completing (although I think that would be ideal).
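
Just to sketch what 'auto completion with definitions' might mean, here is a small Python example that matches a typed prefix against labels and synonyms and returns each preferred label together with its definition. It works on a local list rather than calling BioPortal, and the `Term` class, the sample entries and the placeholder definitions are all assumptions for illustration, not the Semantic Forms or BioPortal behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    definition: str
    synonyms: list = field(default_factory=list)


def suggest(prefix, terms, limit=5):
    """Return (label, definition) pairs whose label or synonym starts with prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    hits = []
    for term in terms:
        names = [term.label] + term.synonyms
        if any(name.lower().startswith(prefix) for name in names):
            # Always show the preferred label, even when a synonym matched.
            hits.append((term.label, term.definition))
    hits.sort(key=lambda hit: hit[0].lower())
    return hits[:limit]


# Hypothetical entries with placeholder definitions, for illustration only.
TERMS = [
    Term("Peak calling", "Placeholder definition text.", ["Peak finding"]),
    Term("Read mapping", "Placeholder definition text.", ["Read alignment"]),
]
results = suggest("pe", TERMS)
by_synonym = suggest("read a", TERMS)
print(results, by_synonym)
```

Matching on synonyms but displaying the preferred label is the design choice that would nudge users toward the standard term while they type.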

About point 2, I forget whether we collect contact email data for each tool? If not, I could add that field and then we could mail everyone on the list with a 'please check your data' shout... I think at least this would be a good way to get more logos uploaded... Again, the display of logos is something that can be improved a lot (I hate design!)

Many thanks again for your interest in this project! I think cleaning data could be a great way to get your hands dirty to begin with.


Cheers,
Dan.
  1. MediaWiki.Org/Extension:Semantic_Forms/Autocompletion
  2. http://EDAMOntology.Org (the "Biological domain" field on our site matches the EDAM 'topic' branch, while the "Bioinfx method" field matches the EDAM 'operation' branch).
  3. Funnily enough, we're one of the biggest users of the BioPortal API! Here is a link to the EDAM widgets on BioPortal.
__________________
Homepage: Dan Bolser
MetaBase the database of biological databases.

Last edited by dan; 07-30-2013 at 02:37 PM. Reason: Wanted the aside text to be small, not big!