Tuesday, January 09, 2007

Relation types

OmegaWiki includes one thesaurus at the moment. The GEMET thesaurus was a boon: it demonstrated really well that what was then WiktionaryZ can include a thesaurus and does a good job of showing relations.

The next step will be to demonstrate that we can reliably include multiple thesauri. This is a lot more complicated. The problem has to do with the relation types used and what they mean: you cannot infer that a particular phrase like "is part of" in one thesaurus means the same thing in another.

This means that you have to tread carefully. The first thing that you can do is treat a collection as a self-contained unit. As a consequence, its relation types would be available and applicable only to those DefinedMeanings that are part of the collection.
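
To make the idea concrete, here is a minimal sketch of a relation type that is scoped to one collection. It is not how OmegaWiki stores things; the class names, fields and sample meanings are all hypothetical and only illustrate the rule that both ends of a relation must belong to the collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefinedMeaning:
    # Hypothetical stand-in for an OmegaWiki DefinedMeaning.
    id: int
    label: str
    collections: frozenset  # names of the collections this meaning belongs to


@dataclass(frozen=True)
class CollectionRelationType:
    # A relation type that is only valid inside one collection (assumed model).
    name: str
    collection: str

    def applies_to(self, source: DefinedMeaning, target: DefinedMeaning) -> bool:
        # Both ends of the relation must be members of the collection.
        return (self.collection in source.collections
                and self.collection in target.collections)


# Illustrative data, not taken from GEMET itself.
gemet_part_of = CollectionRelationType("is part of", "GEMET")
river = DefinedMeaning(1, "river", frozenset({"GEMET"}))
basin = DefinedMeaning(2, "river basin", frozenset({"GEMET"}))
wheel = DefinedMeaning(3, "wheel", frozenset())

inside = gemet_part_of.applies_to(river, basin)
outside = gemet_part_of.applies_to(wheel, basin)
```

Here `inside` is true because both meanings are in the collection, while `outside` is false because "wheel" is not.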

When a collection is to be integrated, those DefinedMeanings that are conceptually the same will need to be merged. This may bring pre-existing relations and collection relations together on the same DefinedMeaning. In effect this may demonstrate that certain relation types are indeed the same, and consequently the collection relations may then get a relation type of a higher level.
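
One way to picture this, as a rough sketch only: after merging, look for pairs of DefinedMeanings that are linked by more than one relation type. Those types are candidates for being the same. The function names, the `core:` and `GEMET:` prefixes and the sample data are assumptions made up for the illustration, and a real decision would still need a human to look at it.

```python
from collections import defaultdict


def merge_ids(relations, same_as):
    """Rewrite relation endpoints so that merged meanings share one id.

    `relations` is a set of (source_id, relation_type, target_id) triples and
    `same_as` maps a collection meaning id to the pre-existing id it merges into.
    """
    return {(same_as.get(s, s), rel_type, same_as.get(t, t))
            for s, rel_type, t in relations}


def candidate_equivalent_types(relations):
    """Group relation types that connect the same ordered pair of meanings."""
    by_pair = defaultdict(set)
    for s, rel_type, t in relations:
        by_pair[(s, t)].add(rel_type)
    # Only pairs linked by more than one type hint at an equivalence.
    return {frozenset(types) for types in by_pair.values() if len(types) > 1}


# Illustrative data: ids 1 and 2 pre-exist, ids 101-103 come from the collection.
relations = {
    (1, "core:part of", 2),
    (101, "GEMET:is part of", 102),
    (101, "GEMET:broader term", 103),
}
same_as = {101: 1, 102: 2}  # meanings judged to be conceptually the same

merged = merge_ids(relations, same_as)
candidates = candidate_equivalent_types(merged)
```

With this data the only candidate is the pair "core:part of" and "GEMET:is part of", because after the merge both connect the same two meanings.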

The higher-level relations are based on domains. You will agree with me that only organisms include proteins. The consequence is that both parts of such a relation will have to be either an organism or a protein.
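
A small sketch of such a domain-based relation type could look like the following. It assumes that one side of the relation must be an organism and the other a protein, which is my reading of the example; the class, the relation name and the domain labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRelationType:
    # Assumed model: a relation type restricted to one source and one target domain.
    name: str
    source_domain: str
    target_domain: str

    def accepts(self, source_domain: str, target_domain: str) -> bool:
        # The relation is valid only when both ends fall in the expected domains.
        return (source_domain == self.source_domain
                and target_domain == self.target_domain)


includes_protein = DomainRelationType("includes protein", "organism", "protein")

ok = includes_protein.accepts("organism", "protein")
wrong_source = includes_protein.accepts("mineral", "protein")
wrong_target = includes_protein.accepts("organism", "organism")
```

Only the first check passes; the other two are rejected because one end is outside its domain.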

The last level would be the universal relation types. They will hold regardless of the domain. Currently ALL relation types can be universally applied. The current GEMET relation types will not remain that way. They will prove to be quite arbitrary, and I expect that we will at some stage restrict their usage. This will likely be accompanied by functionality that offsets the pain of losing a tool that is quite popular.

