Friday, September 28, 2007

Major update for OmegaWiki

OmegaWiki has had a major update: the installed version of MySQL has been upgraded, several tables have been converted to InnoDB, and a lot of functionality has changed behind the scenes.

One of the effects is that performance has improved noticeably, which was really needed. The difference is a relief. It is fun again to work on the data.

One difference is in the way relations work: relations are currently associated with a "class", and this class defines which relation types are possible. We have added a few classes so far; "linguistic entity" is one. The associated relation types allow us to indicate where the linguistic entity fits in and where it is spoken. There will be many more classes and relation types; their quality will make a difference to the quality and usefulness of our data.
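
To make the idea of a class that constrains relation types a bit more concrete, here is a minimal sketch in Python. It is not OmegaWiki's actual code or schema; the relation type names ("is part of", "is spoken in"), the `ALLOWED_RELATION_TYPES` table and the `make_relation` helper are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from a class to the relation types it permits.
# The names below are illustrative assumptions, not OmegaWiki's real data.
ALLOWED_RELATION_TYPES = {
    "linguistic entity": {"is part of", "is spoken in"},
}


@dataclass(frozen=True)
class Relation:
    source: str
    relation_type: str
    target: str


def make_relation(entity_class, source, relation_type, target):
    """Create a relation only if the class allows this relation type."""
    allowed = ALLOWED_RELATION_TYPES.get(entity_class, set())
    if relation_type not in allowed:
        raise ValueError(
            f"{relation_type!r} is not a relation type of class {entity_class!r}"
        )
    return Relation(source, relation_type, target)


# A relation type that the class defines is accepted.
spoken = make_relation("linguistic entity", "Dutch", "is spoken in", "Netherlands")

# A relation type that the class does not define is rejected.
try:
    make_relation("linguistic entity", "Dutch", "is capital of", "Netherlands")
    rejected = False
except ValueError:
    rejected = True
```

The point of the sketch is only that adding a new class means adding a new entry with its own set of relation types, which is why the quality of those definitions matters so much.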

Thanks,
GerardM

Friday, September 07, 2007

Demo Semantic Support on a new URL

At Wikimania I presented what we are doing to bring real-time Semantic Support to Wikipedia. The URL in my presentation is no longer valid; the new location is wikipedia.wikitestsite.org.

You will find a dump of the English Wikipedia, and many of the expressions that we already know are shown in green. We are working towards a situation where new concepts defined in OmegaWiki will be recognised in future data mining of the same article.

What we are discussing at the moment is adding functionality to the concepts found. Some options are obvious, like giving the definition, giving an option to go to OmegaWiki when the definition does not fit, and showing translations of the expression in the language that is of interest to the reader.

We can imagine that there is more functionality that you would consider useful. Please let us know .. :)

Thanks,
GerardM

Monday, September 03, 2007

Connecting data from different databases

In OmegaWiki there are different datasets. These represent different origins and have a different emphasis. What we are working on is connecting the data in these different datasets. Currently over four percent of our Community data is connected to data of the UMLS.

These connections are not without problems. The UMLS does not have the same (lexical) outlook; it is quite happy to have a singular and a plural as part of the same concept. In OmegaWiki we do not support the notion of plurals yet. For the UMLS it is not a problem to include "Geologists", as it is included as a subject heading. We have it connected to "geologist".

Lyme disease has several synonyms that are problematic from a lexical point of view; only "Lyme borreliosis" is what I expect to find in a dictionary. This does not necessarily mean that "Borreliosis, Lyme" is not useful to have. The Community database knows some 15 translations and thereby adds value to the English-only content for Lyme disease.

With four percent of the Community Database connected, in reality we haven't scratched the surface of the UMLS. The UMLS is a well-explored resource and I am sure that there are many resources that have made connections already. I hope we will find the people and the organisations willing to share the work that they have already done.

Thanks,
GerardM