Re 'tagging' the tools in the wiki?

08-30-2011, 01:52 AM
Many of the 'tags' for tools are not good: they don't give an accurate picture of how the tools work or what they do. For example, the following kinds of tag are clearly important, but missing:

* short read alignment
* de novo assembly
* transcript assembly
* transcriptomics
* structural rearrangements

I'm happy to go through and clean up the tags, possibly dropping the distinction between 'biological domain' and 'bioinformatics method' (which was never thought through) and possibly adding other tag categories. However, before redesigning what was never properly designed in the first place, I thought I should open it up to discussion.

Can anyone suggest useful tags or new tag categories?

08-30-2011, 05:55 AM
"Biological method" and "Biological domain" carry different meanings. But so far, things that should be in "method" are misplaced in "domain", e.g. read alignment...

What you suggest is missing may actually be there, but under a subtly different name. For example, structural rearrangement is listed as "structural variants".

Maybe we should define the scope of each category first and clean them up together.

I also suggest rearranging the order of the categories: show "Biological domain" first, then "biological method", "Technology", "Maintenance", "OS", then "license" and "language" (in order of interest to an NGS software user, at least for me).

08-30-2011, 05:56 AM
That sounds great!

08-30-2011, 06:09 AM
dan, I cannot edit the "http://seqanswers.com/wiki/Special:BrowseData". Is it a special page that requires further permission?

I think "Biological method" can be subdivided into "Analysis type" and "Algorithms".

Anyway, defining the scope:

1> Language: Programming language
2> Operating system: OS that the package runs on
3> Is the software maintained?: Maintenance status
4> Technology: High-throughput sequencing technologies
5> Analysis type (the task done by the package): Read mapping, alignment, peak calling, de novo assembly...
6> Algorithms (how the package does things): such as Burrows-Wheeler, De Bruijn graph, Maximum Likelihood, suffix array
7> Domain: I would classify this broadly, as the BIG picture of the software. For example, the GATK pipeline should be classified as Genome. Other tags include Transcriptomics and ChIP-Seq. There can certainly be a little sub-categorization, but not all the way down to the task done, otherwise it will overlap with 5>.
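
As a rough illustration of the scopes listed above, here is a minimal Python sketch of the categories as a lookup table, with a check that a tool is only tagged under a known category. The `ToolEntry` class and `add_tag` method are hypothetical names made up for this example; this is not how the wiki or Semantic Drilldown stores anything.

```python
from dataclasses import dataclass, field

# Category names and scopes taken from the list above, in the same order.
CATEGORIES = {
    "Language": "Programming language",
    "Operating system": "OS that the package runs on",
    "Is the software maintained?": "Maintenance status",
    "Technology": "High-throughput sequencing technologies",
    "Analysis type": "The task done by the package",
    "Algorithms": "How the package does things",
    "Domain": "The big picture of the software",
}


@dataclass
class ToolEntry:
    """Hypothetical in-memory record for one tool page."""
    name: str
    tags: dict = field(default_factory=dict)  # category -> list of tag values

    def add_tag(self, category, value):
        # Reject categories outside the agreed list so stray ones cannot creep in.
        if category not in CATEGORIES:
            raise KeyError(f"unknown category: {category}")
        self.tags.setdefault(category, []).append(value)


# Example from the post: the GATK pipeline is classified under Domain as Genome.
gatk = ToolEntry("GATK")
gatk.add_tag("Domain", "Genome")
print(gatk.tags)
```

Keeping the category list in one place means a tag such as "Read mapping" has exactly one home (Analysis type), which is the overlap between 5> and 7> that the list is trying to avoid.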

08-30-2011, 12:59 PM
Special:BrowseData is dynamically created by Semantic Drilldown, and isn't directly editable. For details on how to configure SD, see: http://www.mediawiki.org/wiki/Extension:Semantic_Drilldown

Something I've been meaning to investigate is using SD components on other pages. Not sure how that is done, but in theory we can make more than one SD page.

Thanks for kicking things off above. What I'd like to do is transfer the discussion to the wiki, so it's easier to edit the list of things and so the definitions of properties can go in the property pages.


08-31-2011, 05:09 AM
It would also be valuable to be able to do batch edits and rewrites of keys, such as changing all instances of key "X" to key "Y". I've at times tried to impose my own discipline on the keywords.
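
To make the batch rewrite concrete, here is a minimal Python sketch that renames a key across a set of pages held in memory. The `rename_tag` function and the page names `ToolA` and `ToolB` are hypothetical; applying this to the real wiki would need a bot or an extension, which is not shown here.

```python
def rename_tag(pages, old, new):
    """Return a copy of `pages` with every `old` tag replaced by `new`.

    `pages` maps a page title to its list of tags. If a page already
    carries `new`, the renamed tag is dropped rather than duplicated.
    """
    renamed = {}
    for title, tags in pages.items():
        out = []
        for tag in tags:
            tag = new if tag == old else tag
            if tag not in out:  # avoid duplicates created by the rename
                out.append(tag)
        renamed[title] = out
    return renamed


pages = {
    "ToolA": ["structural rearrangements", "Genome"],
    "ToolB": ["structural variants", "structural rearrangements"],
}
result = rename_tag(pages, "structural rearrangements", "structural variants")
print(result)
# {'ToolA': ['structural variants', 'Genome'], 'ToolB': ['structural variants']}
```

Returning a new mapping rather than editing in place makes it easy to review the changes before anything is written back.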

Also, if keywords were listed with a case-insensitive sort it would help; folks changing the capitalization are a source of unnecessary splitting.
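
A minimal Python sketch of both ideas, assuming the keywords are available as a plain list of strings: a case-insensitive sort, plus a helper that reports keywords differing only in capitalization. The function names and sample keywords are made up for illustration.

```python
from collections import defaultdict


def sort_case_insensitive(tags):
    """Sort tags ignoring case, so 'Assembly' and 'assembly' end up adjacent."""
    return sorted(tags, key=str.casefold)


def case_variants(tags):
    """Group tags that differ only by capitalization.

    Returns a dict mapping the case-folded form to the sorted list of
    spellings, keeping only forms that have more than one spelling.
    """
    groups = defaultdict(set)
    for tag in tags:
        groups[tag.casefold()].add(tag)
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}


tags = ["alignment", "Assembly", "ChIP-Seq", "Chip-Seq", "assembly"]
print(sort_case_insensitive(tags))
# ['alignment', 'Assembly', 'assembly', 'ChIP-Seq', 'Chip-Seq']
print(case_variants(tags))
# {'assembly': ['Assembly', 'assembly'], 'chip-seq': ['ChIP-Seq', 'Chip-Seq']}
```

The output of `case_variants` is a ready-made list of candidates for merging into one spelling.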