Thursday, June 20, 2013

SQLite compatibility

Thanks to Hiong3.eng5, the WikiLexicalData extension that runs OmegaWiki is now compatible with SQLite - and still works fine with MySQL :-)

SQLite is a small SQL engine that takes almost no space on disk and is very easy to set up (actually, there is nothing to set up: just install it and it works). It is perfect for a test server.
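To give an idea of how little setup is involved, here is a minimal sketch using Python's built-in sqlite3 module. The table and its columns are invented for the illustration; they are not the actual WikiLexicalData schema.

```python
import sqlite3

# An in-memory database: no server to start, no account to create.
# Passing a file path instead of ":memory:" would store everything
# in a single ordinary file.
conn = sqlite3.connect(":memory:")

# Hypothetical table, for illustration only (not the real schema).
conn.execute(
    "CREATE TABLE expression ("
    "id INTEGER PRIMARY KEY, spelling TEXT, language TEXT)"
)
conn.executemany(
    "INSERT INTO expression (spelling, language) VALUES (?, ?)",
    [("round", "eng"), ("rond", "fra")],
)

rows = conn.execute(
    "SELECT spelling, language FROM expression ORDER BY id"
).fetchall()
print(rows)
conn.close()
```

The same SQL statements would run against a MySQL server, but only after installing it and creating a database and a user first.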

So now you have no excuse not to contribute some code!

Wednesday, June 19, 2013

Follow us on Google+

I don't know if modern people are still supposed to use RSS (I do...) or if blogs are now followed on social networks.

In case you want to follow us on Google+, we now have a page:

https://plus.google.com/111583165631926912668/posts

Blog posts are automatically linked from Google+ when added.

The Wikidata proposal

Denny, one of the programmers behind Wikidata, has written a proposal for how Wikidata could be extended to support lexical data, with the ultimate goal of integrating it with the various Wiktionaries.

What do you think?

Some notes:
- it is only a proposal; as far as I know, there is not yet a budget or a person to implement those changes;
- it conflicts with the Adopt OmegaWiki proposal, which also proposed to integrate with the Wiktionaries, and would therefore probably be opposed by several contributors of the biggest Wiktionaries for the same reasons;
- Wikidata being an official WMF project, any extension of it would also be a WMF project (I guess?);
- if Wikidata ever supports lexical data, a migration of data from OmegaWiki to Wikidata (or in the other direction) would be very easy, since both would be relational databases and concept-oriented, unlike a migration of Wiktionary data to OW or Wikidata, which is more complex. But that will be discussed in due time if necessary;
- in the meantime, OmegaWiki is already there, and the Wikidata proposal is not ;-)

Tuesday, June 04, 2013

When parts of speech matter

For linguists, translators, et al., the part of speech (adjective, noun, verb) is very important. In dictionaries, it is the first piece of information given about a word. The different meanings of a word are usually sorted first by part of speech.

From a relational database-ical point of view, however, a part of speech is just one annotation of a syntrans (the association of a word and a meaning) and looks just as important as any other syntrans annotation, such as "gender", "international phonetic alphabet" or "area" (indicating if a word is used only in a specific area).
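A rough sketch of that point of view, in Python. The class and field names are made up for the illustration and do not reflect the real OmegaWiki tables; the point is only that "part of speech" sits in the same bag as every other annotation.

```python
from dataclasses import dataclass, field


@dataclass
class Syntrans:
    """Association of a word with a meaning (hypothetical model)."""
    expression: str
    language: str
    meaning_id: int  # invented identifier for the meaning
    # Every annotation is just a name/value pair; none is special.
    annotations: dict = field(default_factory=dict)


# Example values are illustrative only.
round_adj = Syntrans(
    expression="round",
    language="English",
    meaning_id=1,
    annotations={
        "part of speech": "adjective",
        "international phonetic alphabet": "/raʊnd/",
    },
)

# From the data's point of view, all annotation names are equal.
annotation_names = sorted(round_adj.annotations)
print(annotation_names)
```

Nothing in such a structure says that one annotation should be shown before the others; that decision has to be made in the interface.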

In Wiktionary, the interface came first, and it is clear from viewing a page whether a word is a noun, a verb, etc. For a computer it is less clear. In OmegaWiki, it is quite the contrary. The relational database came first, so the data is very computer-friendly, but then we have to build an interface on top of it and tell it what is important for a human. If parts of speech matter, we have to display them more visibly and sort meanings by part of speech.

This feature was often requested, and this is exactly what OmegaWiki does now :-)

Definitions and translations of the various meanings of the English word "round" in French.

As usual in OmegaWiki, the part of speech information is translated into the user's language. When the part of speech of a definition is not known, "??" is displayed.

Definitions and translations in Spanish of the English word "square".

When none of the meanings has a part of speech, "??" is not displayed, because having only "??" at the top of the page would just be confusing.
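The display rule can be sketched like this in Python. It is only an illustration of the behaviour described above, not the actual extension code; the function name, the input format, and the choice to sort headings alphabetically with "??" last are all assumptions.

```python
UNKNOWN = "??"


def group_by_part_of_speech(meanings):
    """Group definitions under part-of-speech headings.

    `meanings` is a list of (definition, part_of_speech) pairs, where
    part_of_speech is None when unknown. Returns a list of
    (heading, definitions) pairs; heading is None when no heading
    should be displayed at all.
    """
    if not meanings:
        return []
    # If nothing is known, show no heading rather than a lone "??".
    if all(pos is None for _, pos in meanings):
        return [(None, [definition for definition, _ in meanings])]

    groups = {}
    for definition, pos in meanings:
        heading = pos if pos is not None else UNKNOWN
        groups.setdefault(heading, []).append(definition)

    # Assumed ordering: known headings alphabetically, "??" last.
    headings = sorted(h for h in groups if h != UNKNOWN)
    if UNKNOWN in groups:
        headings.append(UNKNOWN)
    return [(h, groups[h]) for h in headings]


mixed = group_by_part_of_speech([
    ("a circular object", "noun"),
    ("shaped like a circle", "adjective"),
    ("meaning without annotation", None),
])
print(mixed)
```

With a mixed list like the one above, the unannotated meaning ends up under "??"; if every part of speech were None, the result would be a single group with no heading.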

Definition and translation in Breton of the Cantonese word for France.
Yes, we do a Breton-Cantonese dictionary :)

Most of the missing data about the parts of speech of words could be imported by bots. We already have the API to add annotations easily. We just need someone who would like to run such a bot. Any programming language will do.

Thanks,
Kipcool