Tuesday, October 25, 2011

OmegaWiki, Wiktionary and the Wikimedia Foundation

Many people still ask me:
  • What is the point of OmegaWiki if we already have Wiktionary?
  • Why is OmegaWiki not a Wikimedia Foundation project?
  • When will OmegaWiki replace the Wiktionaries?
So I'll write my answers in this blog post, so that I can easily refer people to it when they ask ;-)

Why OmegaWiki was created

The idea of OmegaWiki came to several multilingual contributors to Wiktionary, sometime around 2004.
In Wiktionary, when you want to add a French=>English translation, you edit the French word at the French Wiktionary and add the translation. If you want the English Wiktionary to have your translation as well, you have to edit the French word at the English Wiktionary and add your translation there too.
You'll probably also want to add the same translation in the other direction, i.e. English=>French. This means that two more pages have to be edited. So, to indicate that a French word XX and an English word YY express the same concept, you have to edit 4 pages to be complete. Now, consider that there are 170 Wiktionaries as of today... That is a lot of redundant work.
The same will happen if you want to add relations, such as hypernyms. Even though hypernyms are based on semantics, and are therefore (more or less) independent of languages, you'll have to edit all the Wiktionaries one by one to enter such a relation.
The idea of OmegaWiki is therefore to be based on concepts (what we call DefinedMeaning). You only need to edit that concept once to indicate that a French word XX and an English word YY represent this concept.
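
To make the idea concrete, here is a minimal sketch of a concept-based entry in Javascript. Only the name DefinedMeaning comes from OmegaWiki; the fields, the method names and the language codes are assumptions made for this illustration, not the actual data model.

```javascript
// Hypothetical in-memory model of a concept-based dictionary entry.
// Only the name "DefinedMeaning" comes from the post; everything else
// (fields, methods, language codes) is an assumption for illustration.
class DefinedMeaning {
  constructor(definition) {
    this.definition = definition;
    this.expressions = new Map(); // language code -> Set of words
  }

  // A single edit: attach a word in some language to this concept.
  addExpression(language, word) {
    if (!this.expressions.has(language)) {
      this.expressions.set(language, new Set());
    }
    this.expressions.get(language).add(word);
    return this;
  }

  // Translations in either direction are derived from the concept,
  // they are never entered separately.
  translate(word, fromLanguage, toLanguage) {
    const source = this.expressions.get(fromLanguage);
    if (!source || !source.has(word)) return [];
    return [...(this.expressions.get(toLanguage) || [])];
  }
}

// XX and YY are the placeholder words used in the text above.
const concept = new DefinedMeaning("placeholder definition of the concept");
concept.addExpression("fra", "XX").addExpression("eng", "YY");

concept.translate("XX", "fra", "eng"); // ["YY"]
concept.translate("YY", "eng", "fra"); // ["XX"]
```

Both directions come out of the same two additions, instead of four separate page edits.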

So, OmegaWiki was created

Around 2007, the first prototype version of OmegaWiki came online (at the time, it was called WiktionaryZ, but it was renamed later on because it is not a WMF project).
Concerns were expressed such as:
  • I contributed so much data to Wiktionary, I want this data to be automatically imported before I come to OmegaWiki
  • it lacks a lot of features that I'd like to see before I come over from Wiktionary (e.g. inflection tables, which we still don't have)
  • discussions on such a multilingual project will obviously be held mainly in English, and my English is not good enough, so I prefer to stay on my language version of Wiktionary
Then, we no longer had a programmer to implement new features.

OmegaWiki is not a WMF project

OmegaWiki was created with the idea of becoming a project of the Wikimedia Foundation.
Its software is an extension of MediaWiki called Wikidata. The source code is stored on the WMF servers.
Its community is mostly made of people who came from Wiktionary and still contribute to other WMF projects (Wikipedia, Wikisource, etc.). So we are not so far away from the WMF...
However, it has been said that OmegaWiki can become a WMF project only if it already has all the features of the Wiktionaries, and only once it has been decided whether OmegaWiki will replace the Wiktionaries or coexist with them.
This is problematic, because replacing the Wiktionaries means performing an automatic import, which is not possible. Coexisting with the Wiktionaries means that there are two projects with similar scopes supported by the WMF.
Requiring more features before becoming a WMF project is also a bit of a catch-22: becoming a WMF project would give us dedicated programmers who would add the needed features... In that respect, it would be nice to become a WMF project without first having to implement the missing features. We would also be happy to be hosted on the WMF servers, for a faster website.
The discussion on Meta is still open for comments and waiting for a decision. Cf. Requests for comment/Adopt OmegaWiki

What about an automatic import of Wiktionary?

So you want to import the Wiktionary(ies). Which Wiktionary?
  1. All!
    Then, you need to merge the data representing the same concepts from the different Wiktionaries. There is no easy way to do that with 170 Wiktionaries, each having its own structure.
  2. Only the English Wiktionary
    Because it is the biggest! Then, what do you do with the other Wiktionaries?
Also, importing the English Wiktionary alone, which would be a good start, is already not straightforward at all.
For example, one of the definitions of the word "medicine" at Wiktionary is "A treatment or cure". Defining a word by using synonyms is not allowed at OmegaWiki, because all synonyms share the same definition, and you cannot define "cure" as "a treatment or cure". So, the definition needs to be rewritten.
Furthermore, the page "medicine" at Wiktionary shows a list of translations in other languages for that concept. Other lists of other translations for the same concept will/could be found at the pages "treatment" and "cure". These lists need to be merged in order to be imported to OmegaWiki.
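
The mechanical part of that merge could look like the sketch below. It assumes that someone has already decided that the three pages describe the same concept, which is the hard part, and that each page's translations have been extracted into a simple object. The function name, the data layout and the placeholder words are all made up for this example.

```javascript
// Merge several per-page translation lists into one list per language,
// dropping duplicates. Each input is assumed to look like
// { languageCode: [word, word, ...] }; this layout is hypothetical.
function mergeTranslationLists(lists) {
  const merged = new Map(); // language code -> Set of words
  for (const list of lists) {
    for (const [language, words] of Object.entries(list)) {
      if (!merged.has(language)) merged.set(language, new Set());
      for (const word of words) merged.get(language).add(word);
    }
  }
  // Convert the Sets back to plain arrays for easier handling.
  const result = {};
  for (const [language, words] of merged) result[language] = [...words];
  return result;
}

// Placeholder words, not real Wiktionary content.
const fromMedicine = { fra: ["fr_A", "fr_B"] };
const fromTreatment = { fra: ["fr_A"], deu: ["de_A"] };
const fromCure = { fra: ["fr_C", "fr_B"] };

const mergedTranslations = mergeTranslationLists([fromMedicine, fromTreatment, fromCure]);
// { fra: ["fr_A", "fr_B", "fr_C"], deu: ["de_A"] }
```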

Doing it automatically, if feasible, is a lot of work. Doing it manually, i.e. checking whether the concept is already there and rewriting the definition where necessary, is totally possible: we have already achieved much with very few contributors.
Furthermore, some of the data, such as IPA, gender, pinyin, and declension and conjugation tables, could be imported with a bot in the future.

My two cents,
Kipcool.

Monday, October 17, 2011

A better table sorting

Some languages, such as French, German, Spanish and many others, use diacritics: é, í, ü, etc.

The normal sort function available in Javascript compares strings by character code, so it sorts any letter with a diacritic after z. This was a bit annoying when using the French interface because the language names were not sorted as expected: "gaélique écossais" after "gallois", "géorgien" after "grec", and "hébreu" after "hongrois".
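
For the curious, here is a small sketch that reproduces the problem with the six names above and shows one possible way around it, using the standard Intl.Collator. This is only an illustration of the technique, not necessarily the way it is done in the OmegaWiki code.

```javascript
const languageNames = [
  "gallois", "gaélique écossais", "grec", "géorgien", "hongrois", "hébreu",
];

// Default sort: strings are compared by UTF-16 code unit,
// so "é" (U+00E9) comes after "z" (U+007A).
const sortedByCode = [...languageNames].sort();
// ["gallois", "gaélique écossais", "grec", "géorgien", "hongrois", "hébreu"]

// Locale-aware sort: the collator follows French ordering rules,
// where "é" is sorted together with "e".
const frenchCollator = new Intl.Collator("fr");
const sortedForFrench = [...languageNames].sort(frenchCollator.compare);
// ["gaélique écossais", "gallois", "géorgien", "grec", "hébreu", "hongrois"]
```

The locale ("fr" here) would normally be the language of the interface.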

As can be seen on the following screenshot, this has been corrected (red dots identify languages which are now correctly sorted).

Friday, October 14, 2011

The combobox does not ignore you

If you have edited OmegaWiki, you probably know about this annoying little bug where you want to select a language, type in some letters, but the list of languages that appears ignores the letters that you typed. This happened particularly when the server was busy. The "solution" was then either to add an extra letter, or to remove one and put it back, to make the system aware that you had typed something.

Well, this bug has just been solved!

Now, when the combobox has finished displaying its list, it checks what the list was generated for against what is now typed in the field, and if something new has been typed, it refreshes the list accordingly; and you get less annoyed.
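
The check described above can be sketched as follows. This is a simplified model, not the actual OmegaWiki code: createCombobox, fetchSuggestions and the fake slow server are hypothetical names invented for the example.

```javascript
// Hypothetical combobox model. `fetchSuggestions(query, callback)` stands in
// for the request to the server; its name and shape are assumptions.
function createCombobox(fetchSuggestions) {
  const box = { value: "", shown: [], shownFor: null };
  box.refresh = () => {
    const query = box.value; // remember what this request was made for
    fetchSuggestions(query, (suggestions) => {
      box.shown = suggestions;
      box.shownFor = query;
      // If the field changed while the server was answering,
      // the list is stale: ask again with the new text.
      if (box.value !== query) box.refresh();
    });
  };
  return box;
}

// Simulate a busy server: answers are queued and delivered by hand.
const pendingAnswers = [];
const knownLanguages = ["French", "Frisian", "German"];
const slowServer = (query, callback) =>
  pendingAnswers.push(() =>
    callback(knownLanguages.filter((name) => name.toLowerCase().startsWith(query)))
  );

const box = createCombobox(slowServer);
box.value = "f";
box.refresh();            // the request for "f" is now waiting
box.value = "fri";        // the user keeps typing while the server is busy
pendingAnswers.shift()(); // the answer for "f" arrives; the box notices "fri" and asks again
pendingAnswers.shift()(); // the answer for "fri" arrives
// box.shown is now ["Frisian"]
```

Without the comparison inside the callback, the list for "f" would stay on screen until the next keystroke.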

As an experiment, I have also changed the delay between when you type and when the list is generated from 500ms to 100ms.
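
Such a delay is commonly implemented as a "debounce": each keystroke cancels the previous timer and starts a new one, so the list is only requested once typing has paused. The sketch below assumes that this is how the delay works; the names debounce, LIST_DELAY_MS and requestListAfterPause are made up for the example.

```javascript
// Generic debounce: run `action` only after `delayMs` have passed
// without another call. Assumed technique, not the actual OmegaWiki code.
function debounce(action, delayMs) {
  let handle = null;
  return function (...args) {
    if (handle !== null) clearTimeout(handle); // a newer keystroke cancels the older timer
    handle = setTimeout(() => {
      handle = null;
      action(...args);
    }, delayMs);
  };
}

const LIST_DELAY_MS = 100; // the new value from the post (it was 500 before)

// To be called on every keystroke with the current text of the field.
const requestListAfterPause = debounce((text) => {
  console.log("requesting the list for:", text);
}, LIST_DELAY_MS);
```

A shorter delay makes the list appear sooner, at the price of more requests to the server when typing slowly.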

Feedback welcome.