Wednesday, December 14, 2011

More than 1000 images

In February 2011, it became possible to add images to OmegaWiki. A few months later, we already have more than 1000 images. Images help readers understand definitions and are an important feature of (online) dictionaries.

In OmegaWiki, an image is attached to a particular concept and is then displayed for all words representing that concept. In Wiktionary, by comparison, you would need to add the image to the page of each synonym and to the pages for each language, so OmegaWiki avoids a lot of redundant work.

As of today, there are about 47,000 concepts in OmegaWiki. This means that more than 2% of them have an image. It is not clear what percentage can be reached. Several concepts are abstract (feelings, reasonings), and others are verbs or names of languages and are hard to illustrate.

I find it particularly useful when a word with different meanings has different images, such as vampire, bluebottle, Mars, Venus, etc. It is also interesting to find illustrations for abstract concepts. This is sometimes possible: season, wake up or cheapskate.

P.S.: we also have almost 15,000 links to Wikipedia articles :-)

Sunday, November 27, 2011

Selecting the interface language

OmegaWiki tries to be as multilingual as possible in that its data is multilingual, and its interface can be displayed in any language supported by MediaWiki (more than 300 languages).

I've just added a combobox on the left to change the language of the interface. Previously, you had to create an account and change it in your preferences. Now, anonymous users can also change the language.

A nice feature is that when you set the interface language to your language, the definitions are also shown in your language whenever they are available.

I do not speak all languages, so I cannot test everything. Therefore I ask the question:
Is your language displayed correctly?


Sunday, November 06, 2011

The Glosbe dictionary uses our data

Glosbe is a multilingual dictionary where many bilingual translation dictionaries are available. It started earlier this year, and already has an impressive number of translations.

Their data come from several freely available online resources, including OmegaWiki. For example, the many words I have added in Bavarian are there:

It also shows definitions from OmegaWiki, and inflexion tables from Wiktionary.
They have managed somehow to merge all these sources, and present them in an organized manner. For an example of a page with many sources involved, see:

Furthermore, Glosbe is also a translation memory, and they claim to have gathered more than 1 billion sentences already. I think this is a great resource for translators. New features will come in the future, they told me.

It is always nice to learn that someone uses our data. It makes our work meaningful!

Tuesday, October 25, 2011

OmegaWiki, Wiktionary and the Wikimedia Foundation

Many people still ask me:
  • What is the point of OmegaWiki if we already have Wiktionary?
  • Why is OmegaWiki not a Wikimedia Foundation project?
  • When will OmegaWiki replace the Wiktionaries?
So I'll write my answers in this blog post, so that I can easily refer people to it when they ask ;-)

Why OmegaWiki was created

The idea of OmegaWiki came to several multilingual contributors to Wiktionary, somewhere around 2004.
In Wiktionary, when you want to add a French=>English translation, you edit the French word at the French Wiktionary and add the translation. If you want the English Wiktionary to have your translation as well, you have to edit the French word at the English Wiktionary and add your translation there too.
You'll probably also want to add the same translation in the other direction, i.e. English=>French. This means that two more pages have to be edited. So, basically to indicate that a French word XX and an English word YY express the same concept, you'll have to edit 4 pages to be complete. Now, consider that there are 170 Wiktionaries as of today... There is much redundant work involved.
The same will happen if you want to add relations, such as hypernyms. Even though hypernyms are based on semantics, and are therefore (more or less) independent of languages, you'll have to edit all the Wiktionaries one by one to enter such a relation.
The idea of OmegaWiki is therefore to be based on concepts (what we call DefinedMeaning). You only need to edit a concept once to indicate that a French word XX and an English word YY both represent it.
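
The concept-based idea can be sketched as a small data structure. This is only an illustration in JavaScript: the field names (definition, expressions, language, spelling) and the helper expressionsIn are made up for this example and are not the actual OmegaWiki database schema.

```javascript
// Hypothetical sketch of a concept ("DefinedMeaning") shared by all languages.
// Field names are illustrative assumptions, not the real OmegaWiki schema.
const definedMeaning = {
  definition: "placeholder definition of the concept",
  expressions: [
    { language: "fr", spelling: "XX" },
    { language: "en", spelling: "YY" },
  ],
};

// Return every word that expresses the concept in the given language.
function expressionsIn(meaning, language) {
  return meaning.expressions
    .filter((expression) => expression.language === language)
    .map((expression) => expression.spelling);
}

// Adding a German word ZZ is a single edit of the concept...
definedMeaning.expressions.push({ language: "de", spelling: "ZZ" });

// ...and ZZ is then a translation of both XX and YY without further edits.
console.log(expressionsIn(definedMeaning, "de"));
```

In this sketch, a translation is never stored per language pair: it is just another word attached to the same concept, which is why one edit is enough.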

So, OmegaWiki was created

Around 2007, the first prototype version of OmegaWiki came online (at the time, it was called WiktionaryZ, but it was renamed later on because it is not a WMF project).
Concerns were expressed such as:
  • I contributed so much data to Wiktionary, I want this data to be automatically imported before I come to OmegaWiki
  • it lacks a lot of features that I'd like to see before I come from Wiktionary (e.g. inflexion tables, which we still don't have)
  • discussions on such a multilingual project will obviously be held mainly in English, and my English is not good enough, so I prefer to stay on my language version of Wiktionary
Then we no longer had a programmer to implement new features.

OmegaWiki is not a WMF project

OmegaWiki was created with the idea of becoming a project of the Wikimedia Foundation.
Its software is an extension of MediaWiki called Wikidata. The source code is stored on the WMF servers.
Its community is mostly made of people who came from Wiktionary and still contribute to other WMF projects (Wikipedia, Wikisource, etc.). So we are not so far away from the WMF...
However, it has been said that OmegaWiki can become a WMF project only if it already has all the features of the Wiktionaries, and if it is decided whether OmegaWiki will replace the Wiktionaries or coexist with them.
This is problematic, because replacing the Wiktionaries means performing an automatic import, which is not possible. Coexisting with the Wiktionaries means that there are two projects with similar scopes supported by the WMF.
Having more features before becoming a WMF project is also a bit of a catch-22: becoming a WMF project would give us dedicated programmers who would add the needed features... In that respect, it would be nice to become a WMF project without first having to implement the missing features. We would also be happy to be hosted on the WMF servers, for a faster website.
The discussion on Meta is still open for comments and waiting for a decision. Cf. Requests for comment/Adopt OmegaWiki

What about an automatic import of Wiktionary?

So you want to import the Wiktionary(ies). Which Wiktionary?
  1. All!
    Then, you need to merge the data representing the same concepts from the different Wiktionaries. There is no easy way to do that with 170 Wiktionaries, each having its own structure.
  2. Only the English Wiktionary
    Because it is the biggest! Then, what do you do with the other Wiktionaries?
Also, importing the English Wiktionary alone, which would be a good start, is not straightforward either.
For example, one of the definitions of the word "medicine" at Wiktionary is "A treatment or cure". Defining a word by using synonyms is not allowed at OmegaWiki, because all synonyms share the same definition, and you cannot define "cure" as "a treatment or cure". So, the definition needs to be changed.
Furthermore, the page "medicine" at Wiktionary shows a list of translations in other languages for that concept. Other lists of other translations for the same concept will/could be found at the pages "treatment" and "cure". These lists need to be merged in order to be imported to OmegaWiki.
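
To give an idea of the merging step, here is a minimal sketch in JavaScript. It assumes the translation lists have already been extracted from the three pages into a common shape (language, word), which is the hard part and is not shown. The function mergeTranslations and the sample entries are invented for illustration.

```javascript
// Merge translation lists found on several pages that describe the same
// concept, dropping exact duplicates (same language and same word).
function mergeTranslations(lists) {
  const seen = new Set();
  const merged = [];
  for (const list of lists) {
    for (const translation of list) {
      const key = JSON.stringify([translation.language, translation.word]);
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(translation);
      }
    }
  }
  return merged;
}

// Placeholder data standing in for the lists on the three pages.
const fromMedicine = [
  { language: "fr", word: "fr-word-1" },
  { language: "de", word: "de-word-1" },
];
const fromTreatment = [
  { language: "fr", word: "fr-word-2" },
  { language: "fr", word: "fr-word-1" }, // already listed at "medicine"
];
const fromCure = [
  { language: "de", word: "de-word-1" }, // already listed at "medicine"
  { language: "nl", word: "nl-word-1" },
];

const mergedTranslations = mergeTranslations([fromMedicine, fromTreatment, fromCure]);
console.log(mergedTranslations);
```

Removing exact duplicates is the easy part. Deciding whether two lists really describe the same concept still needs a human or a much smarter program.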

Doing it automatically, if feasible, is a lot of work. Doing it manually, i.e. checking whether the concept is already there and rewriting the definition where necessary, is totally possible: we have already achieved much with very few contributors.
Furthermore, some of the data such as IPA, gender, pinyin, declensions and conjugation tables, could be imported with a bot in the future.

My two cents,

Monday, October 17, 2011

A better table sorting

Some languages, such as French, German, Spanish, and many others use diacritics: é, í, ü, etc.

The normal sort function available in JavaScript compares character codes, so any letter with a diacritic is sorted after z. It was a bit annoying when using the French interface because the language names were not sorted as expected: "gaélique écossais" after "gallois", "géorgien" after "grec", and "hébreu" after "hongrois".
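
For the curious, the difference can be reproduced with a few lines of JavaScript. This sketch uses the standard Intl.Collator object to get a locale-aware order. It is one possible way to sort such names and not necessarily the way it was fixed in OmegaWiki.

```javascript
const languageNames = [
  "gallois",
  "gaélique écossais",
  "grec",
  "géorgien",
  "hongrois",
  "hébreu",
];

// The default sort compares UTF-16 code units, so "é" lands after "z".
const byCodeUnit = [...languageNames].sort();

// Intl.Collator compares strings according to the rules of a locale,
// here French, where "é" is ordered next to "e".
const frenchCollator = new Intl.Collator("fr");
const byFrenchRules = [...languageNames].sort(frenchCollator.compare);

console.log(byCodeUnit);
console.log(byFrenchRules);
```

The arrays are copied with the spread syntax before sorting because sort() modifies the array it is called on.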

As can be seen on the following screenshot, this has been corrected (red dots identify languages which are now correctly sorted).

Friday, October 14, 2011

The combobox does not ignore you

If you have edited OmegaWiki, you probably know about this annoying little bug: you want to select a language and type in some letters, but the list of languages that appears ignores the letters that you typed. This happened particularly when the server was busy. The "solution" was then to either add an extra letter, or remove one and put it back, to make the system aware that you had typed something.

Well, this bug has just been solved!

Now, when the combobox has finished displaying its list, it compares the list with what is currently typed in the field, and if something new has been typed, it refreshes the list accordingly; and you get less annoyed.

As an experiment, I have also changed the delay between when you type and when the list is generated to 100ms instead of 500ms.
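
The principle of the fix can be sketched as follows. This is a simplified JavaScript illustration and not the actual OmegaWiki code: createCombobox, fetchList, getFieldValue and showList are hypothetical names, and the typing delay is left out.

```javascript
// Hypothetical sketch. fetchList(query, callback) asks the server for the
// languages matching `query`, getFieldValue() returns what is typed right
// now, and showList(items) displays the suggestions.
function createCombobox(fetchList, getFieldValue, showList) {
  let requestInProgress = false;

  function requestList() {
    const query = getFieldValue();
    requestInProgress = true;
    fetchList(query, (items) => {
      requestInProgress = false;
      showList(items);
      // The fix: once the list is displayed, compare the query it was built
      // for with the current content of the field, and refresh if they differ.
      if (getFieldValue() !== query) {
        requestList();
      }
    });
  }

  return {
    // Called on each keystroke. Keystrokes typed while a request is running
    // do not start a new request; they are caught by the check above.
    onInput() {
      if (!requestInProgress) {
        requestList();
      }
    },
  };
}
```

Without the comparison in the callback, letters typed while the server was still answering would be ignored until the next keystroke, which matches the behaviour of the old bug.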

Feedback welcome.

Wednesday, August 03, 2011

The Check Spambots extension

We have recently opened editing to anonymous users. As a consequence, we are spammed a lot more...

In the last few days in particular, we have had a lot of spam from different IPs on a single page (but it is not so much of a problem because spambots are not capable of editing Wikidata pages, due to their different structure).

The Stopforumspam website keeps an updated list of IPs that are spamming various websites, and all the IPs that spammed us on that day were already in their database. Therefore, it makes sense to use their data.

I knew that it was possible to import Stopforumspam's IP list into the blacklist of a MediaWiki installation. However, it is stated that
[... you should] allocate a few more megabytes of APC cache... as, with this 1300000-byte file added to your configuration, you'll need it.
Since we already have problems with memory on the OmegaWiki server (only 1GB), it could have been problematic. Another problem is that the list is not refreshed automatically.

By clicking around, I then discovered the Check Spambots extension. This extension sends a query directly to some online IP-spam databases, among them Stopforumspam. An advantage is that you don't need more memory in your wiki, and the list stays up to date. A drawback is the additional time needed to query the external website (although it is very quick from what I observed), but I made a small modification so that it does not affect logged-in users anyway.
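
The logic of the modification looks roughly like the following JavaScript sketch. It only illustrates the decision: the real extension is part of MediaWiki and is not written like this, and isEditAllowed and lookupIp are hypothetical names. In particular, lookupIp stands in for the query to an online database and does not reproduce the API of any real service.

```javascript
// Decide whether an edit may go through.
// lookupIp(ip) is a hypothetical helper that returns true when the
// address is listed as a spammer by an external database.
function isEditAllowed(user, lookupIp) {
  if (user.loggedIn) {
    // Logged-in users skip the external query, so they never wait for it.
    return true;
  }
  return !lookupIp(user.ip);
}

// Example with a fake database (addresses from the documentation ranges).
const knownSpamIps = new Set(["192.0.2.10"]);
const fakeLookup = (ip) => knownSpamIps.has(ip);

console.log(isEditAllowed({ loggedIn: false, ip: "192.0.2.10" }, fakeLookup));
console.log(isEditAllowed({ loggedIn: false, ip: "203.0.113.5" }, fakeLookup));
console.log(isEditAllowed({ loggedIn: true, ip: "192.0.2.10" }, fakeLookup));
```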

Now we should have less spam, I hope :-)

Thursday, July 28, 2011

Random expression

The link to show a random expression is now working as expected.

You can try it on the website (left panel) or with the link below:

I got bugenwilla, it even has an image :-)

Wednesday, July 13, 2011

SQL dump with only lexical data

For people who are only interested in our lexical data and do not want to create their own copy of the OmegaWiki website (including the wiki pages), there is now a dump called "lexical data dump".


Until now, the only option was to download a full SQL dump containing all the data (except for private user data). The full dump is large (more than 1GB uncompressed) and it takes a long time to process. The lexical data dump is much smaller (around 175MB uncompressed).

Sunday, July 10, 2011

OmegaWiki database explained

This weekend, I wrote the documentation of the tables used by WikiData and OmegaWiki. As far as I know, we were missing this kind of documentation.

It can be consulted at:

For anybody interested in the technical aspect: have a look, correct the mistakes, ask for clarifications or sample MySQL queries, etc. :-)

I hope it helps.


Saturday, July 09, 2011

More than 300 languages

At OmegaWiki, it is now possible to add definitions and translations in more than 300 languages.

The following languages were added today:
- Assyrian Neo-Aramaic
- Bikol Central
- Central Aymara
- Chamorro
- Filipino
- Jingpho
- Northern Sami

Among all our languages:
- 227 have more than 10 words
- 122 have more than 100 words
- 55 have more than 1,000 words
- 11 have more than 10,000 words
and English is getting close to 50,000.

We also have definitions in more than 100 languages (though not for all concepts... all help is welcome)

For more statistics, visit the statistics page.

Thursday, July 07, 2011

Subscribing to the Word of the Day

Since the beginning of OmegaWiki, in 2005, we have had a Word of the Day, almost every day.

The Word of the Day is displayed on the main page and is a nice invitation to contribute, either by adding translations, definitions, an image, or any other information.

From now on, it is possible to subscribe to the Word of the Day, either via Facebook, or with an RSS feed (generated by Facebook). The word of the day is also displayed on the right of this blog. Of course you are all welcome to subscribe and contribute ;-)

Today, the word of the day is mac 'n' cheese. It is something I ate recently. Very good when you don't care about your weight...

Tuesday, July 05, 2011

Main Page design changed

MediaWiki 1.17 came with the pretty Vector skin. This skin has the search field on top, and consequently I removed the pre-existing search area on top of the main page.

Then I centered the title, played a bit with the CSS border-radius property, removed the boxes that nobody uses, and resized the looong list of language portals so that it takes more width (and less height).

Here is how it should look on modern browsers:

You can visit and have a look! If you see squares where it should be round, you need to upgrade your browser...

The main pages for the other languages need to be updated as well. You can help! For the few languages where it was already up to date with the last version, it will not be too much of an effort. For the other ones, it requires a bit more time, but it is not complicated.

We are also planning on adding a section "Did you know?", updated e.g. daily, with interesting facts on words of all languages and interdependencies between languages. More on that later!

Thursday, June 30, 2011

OmegaWiki upgraded to MediaWiki 1.17

OmegaWiki has been successfully upgraded to the brand new MediaWiki 1.17.

Most of the changes are internal things (the database structure has changed, and several things had to be rewritten here and there to be compatible with the new version).

As a user, you might notice that everything still works like before :)

You will also experience the new interface (Vector) which is now default, like in the Wikipedias. The old interface (Monobook) can be activated in your preferences.

The upgrade also made it possible to get the latest translations for the interface and to upgrade the extensions. Some of them were 1-2 years old.

Feedback on any bugs that you may find is more than welcome.

Monday, June 20, 2011

Automatic search suggestions

The automatic search suggestion, which has been available at Wikipedia for some time now, has just been enabled at OmegaWiki.

Now, when you type text in the search box, it suggests results according to the words that are present in our database.

At the moment, the "Expression:" prefix will unfortunately appear, as well as data from UMLS and Swissprot that we would need to filter out. In fact, the functionality should be rewritten specifically for OmegaWiki, with also the possibility to specify a language (a nice mix of PHP, Java and SQL... anybody interested in programming it?)
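
To make the wish a bit more concrete, here is a rough JavaScript sketch of what a language-aware suggestion could do. Everything in it is an assumption for illustration (the entry shape, the function names stripNamespace and suggest, and the in-memory list); a real implementation would query the database on the server.

```javascript
// Remove the "Expression:" namespace prefix from a page title, if present.
function stripNamespace(title) {
  const prefix = "Expression:";
  return title.startsWith(prefix) ? title.slice(prefix.length) : title;
}

// Return the spellings that start with what was typed, optionally
// restricted to one language. The comparison ignores case.
function suggest(typed, entries, language) {
  const wanted = typed.toLowerCase();
  return entries
    .filter((entry) => !language || entry.language === language)
    .map((entry) => stripNamespace(entry.title))
    .filter((spelling) => spelling.toLowerCase().startsWith(wanted));
}

// Hypothetical entries standing in for the contents of the database.
const sampleEntries = [
  { title: "Expression:mouse", language: "en" },
  { title: "Expression:Maus", language: "de" },
  { title: "Expression:medicine", language: "en" },
  { title: "Expression:souris", language: "fr" },
];

console.log(suggest("m", sampleEntries, "en"));
console.log(suggest("m", sampleEntries));
```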

However, I think it is already useful as it is now.

You are all invited to try it out (no need to be logged in).

Note that if you don't like it, you can disable it in your user preferences under "Search Options" => "Disable AJAX suggestions".

Monday, May 09, 2011

Example sentences can be long

For some reason, the text annotations - contrary to definitions - were limited to 256 characters. This was too limiting for example sentences, in particular when these examples are citations where you need to extract a long sentence and add the title of the book, the author, etc.

This limit has just been extended. From now on, text annotations can be as long as definitions.

Other text annotations include IPA, hyphenation and other language specific annotations such as pinyin for Mandarin. At the moment, you can access the annotations by clicking on the link "Annotation »" in the last column of the translation table. We are still working on a way of making the annotations more visible for the word that is being consulted.

Thursday, March 10, 2011

E-Mail functionality now working again

I have just noticed today that OmegaWiki was not sending any e-mail. For example, when you forgot your password, you couldn't have it e-mailed to you.

This is probably due to the server change in March last year.

I've just repaired the functionality.


Monday, February 21, 2011

Images to illustrate the definitions

It is now possible to add images from Commons in order to illustrate the definitions and make OmegaWiki look prettier.

The following is a screenshot of Expression:mouse.

The image annotation is given at the DefinedMeaning level, which means that when the image is added once to the concept, it becomes available for all the languages: souris, Maus, etc.

Tuesday, February 15, 2011

Anyone can edit!

As of today, anonymous users can also contribute to OmegaWiki, without needing to create an account and wait to be given edit rights by a bureaucrat.

However, since we do not have rollback functionality at the moment, IP users can add new data, but not modify existing data.

Let's see what comes out of it (more spam or more contributors?).

Tuesday, February 08, 2011

Taboo subjects


There is an ongoing discussion at OmegaWiki about why OmegaWiki is struggling (the number of contributors stays low), and what to do about it, such as finding money and hiring a developer.

Please take part in the discussion and submit your ideas.