Tuesday, November 12, 2013

Using OmegaWiki for professional translation services

I just received a short job from Italian to German, subject tourism. This is the moment to test how easy it is to work with OW data and to update it. For now I have downloaded the General Dictionary and the OFWB (Online Technical Dictionary) data in TSV format and imported them into my CAT tool, which in this case is going to be one of my favourites: DéjàVu X2 (there it is a bit more complicated to get the glossary in than with OmegaT). ... Well, let's see ... I'll tell you about my experience afterwards.

Monday, October 28, 2013

Open Source Dictionaries for Smartphones, Mobile Phones and Computers

DictionaryForMIDs and OmegaWiki are now partnering to provide software and dictionaries for smartphones, mobile phones, tablets and PCs. Download the software for your device and add one or more dictionaries to it. In the first release you will find monolingual, bilingual and multilingual dictionaries. If your language or combination of languages is missing, don't hesitate to contact us (s.cretella [at] localcontent.eu).

Besides OmegaWiki dictionaries there are also plenty of other dictionaries from other sources online. Over time we hope to also find a good co-operation with them.

Over time, more language combinations and additional educational projects will be added.

Thursday, October 24, 2013

And ... we are back online

So here we go again. OmegaWiki is back and working. Have fun editing :-)

Tuesday, October 22, 2013

Server is down ...

... just a very short note: the OmegaWiki server is down. Probably a hardware failure. We are working to get it back online as soon as possible.

Tuesday, October 15, 2013

Starting with a weekly "English word of the day"

OmegaWiki has many specific topics, though most people are interested in general terminology. Many people learn English or are just generally interested in knowing how things are done. The first series of "Words of the Day" was chosen from the general terminology. We wish to make similar efforts for other languages too. If you would like a similar service for your language, don't hesitate to contact us.

The English Word of the Day can be followed on various social networks:

Friday, October 11, 2013

Work once, use multiple times

For many of us who deal with languages where there is often not even a handful of people working, this is kind of a "must reach" situation. Those languages need all the support they can get, and often hours and hours of research are necessary to find out how things should go. OmegaWiki, like many other database applications, can be the basis for such a situation. But many others do not make their contents freely available.
Besides dictionaries (for computer and smartphone and eventually even printed) we do need spell checkers, and in almost all cases an automated translation that then only needs proofreading would be "the thing".
Years ago I already wanted to be able to automatically "feed" contents to Apertium, and we are halfway along the road from OmegaWiki to Apertium. The image shows that we can annotate our words to show that a noun is masculine or feminine in German (just to take an obvious example for me living in Germany). What is missing is the exact definition of the paradigm to be used. This should be possible via the same feature we have for annotation.
Then there comes the second caveat: OmegaWiki people would like to build automatic inflection, based on rules. Apertium already does exactly this. This means we need the same "grammar rules" to be defined and used.
IMHO it is important not to reinvent the wheel, but to create interfaces to already existing projects so that one can help the other.
Like I already said: it does not make sense to me to work on many different projects to protect my language, because, as for all people, there is only so much time.

Tuesday, October 08, 2013

Back to OmegaWiki

It has been quite some time since I last edited on OmegaWiki, and during that time we, mainly Outi and I, organised translations for a basic dictionary for children. Many people around the world contributed to these translations. I hope we still have all the names, since they will be on the "Children's Dictionary" project page of OmegaWiki. Right now I am adapting IDs so that the entries of my table are associated with the OmegaWiki ID; once we have that, the collected data can be uploaded. This means right now I do have quite some work and it will probably take some months, BUT once done: things will become much easier.
Of course there are also other data sources I have in tables here, but all need to be checked and compared to what is already online. One step at a time :-)
So if you feel like contributing: please do so online or get in touch with me and I can provide you with the data export to be edited for the Children's dictionary.

Tuesday, August 20, 2013

Some dynamic editing for annotations

There are words like "water" that have plenty of translations at OmegaWiki. Each translation might have annotations (part of speech, gender, pronunciation, etc.) that can be directly viewed from the page "water" without having to click the translation and open a new page. This is not new.

What is new is that if you click on edit on such a page with a huge list of translations, it will load in an acceptable time and will not crash your browser.

The reason is that the annotations for the translations are not loaded on page load anymore, but only when the user clicks on "show/hide". It only loads what you need.
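The load-on-demand idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the actual WikiLexicalData code (which works with Ajax in the browser); the class `AnnotationPanel` and the loader `fake_fetch` are made-up names.

```python
from typing import Callable, Dict, List, Optional


class AnnotationPanel:
    """Holds the annotations of one translation, loaded on first "show"."""

    def __init__(self, syntrans_id: int, fetch: Callable[[int], Dict[str, str]]):
        self.syntrans_id = syntrans_id
        self._fetch = fetch  # hypothetical loader, stands in for the Ajax call
        self._annotations: Optional[Dict[str, str]] = None  # nothing at page load
        self.visible = False

    def toggle(self) -> Dict[str, str]:
        """Simulate a click on "show/hide"; fetch only on the first "show"."""
        self.visible = not self.visible
        if self.visible and self._annotations is None:
            self._annotations = self._fetch(self.syntrans_id)
        return self._annotations if self.visible else {}


calls: List[int] = []  # records which translations were actually fetched


def fake_fetch(syntrans_id: int) -> Dict[str, str]:
    calls.append(syntrans_id)
    return {"part of speech": "noun"}


# A page with 1000 translations: creating the panels fetches nothing.
panels = [AnnotationPanel(i, fake_fetch) for i in range(1000)]
shown = panels[42].toggle()  # only this one panel triggers a fetch
```

After the click, `calls` contains a single entry, which is the whole point: the cost depends on what the user opens, not on how many translations the page has.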

You can also now edit the annotations directly from the annotation panel itself, without editing and saving the entire page. This is Ajax magic :)

A few screenshots to explain the situation:

The annotation panel is loaded only when you click on "show". There are no annotations yet... Click "edit" to add some.

Here adding IPA, gender and part of speech. There are buttons for "save" and "cancel" on top and bottom of the annotation panel.
After clicking save, the annotations appear.

Thursday, July 04, 2013

View and download a list of words in any language

I was asked how it is possible to see the list of words that we have for a given language.

In the Data-search special page, it is possible to browse through the list of words in any language, using the "next 100" and "previous 100" buttons, similarly to how category pages work at Wikipedia. You can also filter your search by spellings.
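For the curious, browsing by pages of 100 with a spelling filter boils down to something like the following Python sketch. The function `browse` and the assumption that the filter matches the beginning of the spelling are mine; the special page itself may work differently.

```python
from typing import List, Tuple

PAGE_SIZE = 100  # matches the "next 100" / "previous 100" buttons


def browse(words: List[str], offset: int = 0, prefix: str = "") -> Tuple[List[str], int, int]:
    """Return one page of words plus the offsets for "previous" and "next".

    Assumption: the spelling filter keeps words starting with `prefix`.
    """
    matching = sorted(w for w in words if w.startswith(prefix))
    page = matching[offset:offset + PAGE_SIZE]
    prev_offset = max(offset - PAGE_SIZE, 0)
    has_next = offset + PAGE_SIZE < len(matching)
    next_offset = offset + PAGE_SIZE if has_next else offset
    return page, prev_offset, next_offset


# Made-up word list, just to exercise the paging.
words = [f"word{i:03d}" for i in range(250)]
page, prev_off, next_off = browse(words, offset=100)
filtered, _, _ = browse(words, prefix="word24")
```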

For example the list of words in Micmac - an indigenous language of North America - can be consulted here (thanks, Amqui, for contributing :) ) and will look like this with definitions in French - because I have my interface in French:

Furthermore, a list of words can now also be downloaded in CSV format from the new Ow_downloads special page, thanks to the work of Hiong3.eng5.

This page shows the lists that can be downloaded, and the date when each was generated. It is possible to click on "regenerate" to obtain a new list. The up-to-date list is then generated by the server when it has time - in order not to slow down the normal operations on the server - and is usually ready after a few minutes, as one can see by visiting the page again.

If you are interested in a language that is not in the list, please request it, thanks!

It is planned to make it possible to download translation lists for any pair of languages from that page in the future.

Thursday, June 20, 2013

SQLite compatibility

Thanks to Hiong3.eng5, the WikiLexicalData extension that runs OmegaWiki is now compatible with SQLite - and still works fine with MySQL :-)

SQLite is a small SQL engine that takes almost no space on the disk, and which is very easy to set up (there is nothing to set up actually, just install it and it works). It is perfect for a test server.
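To give an idea of how little setup SQLite needs, here is a small sketch using Python's built-in `sqlite3` module. It has nothing to do with the MediaWiki installation itself, and the table `expression` is a made-up example, not the real WikiLexicalData schema.

```python
import sqlite3

# No server, no user accounts: an in-memory database is ready immediately.
conn = sqlite3.connect(":memory:")

# Hypothetical table, for illustration only.
conn.execute(
    "CREATE TABLE expression (id INTEGER PRIMARY KEY, spelling TEXT, language TEXT)"
)
conn.executemany(
    "INSERT INTO expression (spelling, language) VALUES (?, ?)",
    [("water", "English"), ("Wasser", "German"), ("eau", "French")],
)

rows = conn.execute(
    "SELECT spelling FROM expression WHERE language = ?", ("German",)
).fetchall()
conn.close()
```

Replacing `":memory:"` with a file name gives a database stored in a single file on disk.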

So now you have no excuse not to contribute some code!

Wednesday, June 19, 2013

Follow us on Google+

I don't know if modern people are still supposed to use RSS (I do...) or if blogs are now followed on social networks.

In case you want to follow us on Google+, we now have a page:


Blog posts are automatically linked from Google+ when added.

The Wikidata proposal

Denny, one of the programmers behind Wikidata, has written a proposal of how Wikidata could be extended to support lexical data, with the ultimate goal to be integrated with the various Wiktionaries.

What do you think?

Some notes:
- it is only a proposal, as far as I know, there is not yet a budget or a person to implement those changes;
- it conflicts with the Adopt OmegaWiki proposal, which also proposed to integrate with the Wiktionaries, and therefore would probably be opposed by several contributors of the biggest Wiktionaries for the same reasons;
- Wikidata being an official WMF project, any extension of it would be also a WMF project (I guess?);
- if ever Wikidata supports lexical data, a migration of data from OmegaWiki to Wikidata (or in the other direction) would be very easy, since both would be relational databases and concept oriented - contrary to a migration of Wiktionary data to OW or Wikidata, which is more complex. But that will be discussed in due time if necessary.
- In the meantime, OmegaWiki is already there, and the Wikidata proposal is not ;-)

Tuesday, June 04, 2013

When parts of speech matter

For linguists, translators, et al. the part of speech (adjective, noun, verb) is quite important. In dictionaries, it is the first piece of information given about a word. The different meanings of a word are usually sorted first by part of speech.

From a relational database-ical point of view, however, a part of speech is just one annotation of a syntrans (the association of a word and a meaning) and looks equally important as any other syntrans annotation, such as "gender", "international phonetic alphabet", "area" (indicating if a word is spoken only in a specific area).
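To make the "just one annotation among others" point concrete, here is a tiny sketch with Python's `sqlite3` module. The tables `syntrans` and `annotation` and their columns are invented for the example and are not the real OmegaWiki schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified tables for illustration only.
    CREATE TABLE syntrans (id INTEGER PRIMARY KEY, spelling TEXT, meaning_id INTEGER);
    CREATE TABLE annotation (
        syntrans_id INTEGER REFERENCES syntrans(id),
        attribute   TEXT,
        value       TEXT
    );
    """
)
conn.execute("INSERT INTO syntrans VALUES (1, 'Haus', 100)")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [(1, "part of speech", "noun"), (1, "gender", "neuter")],
)

# For the database, "part of speech" is a row like any other annotation.
rows = conn.execute(
    "SELECT attribute, value FROM annotation WHERE syntrans_id = 1 ORDER BY attribute"
).fetchall()
conn.close()
```

Nothing in such a table says that one attribute deserves more visibility than another; that decision has to be made in the interface.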

In Wiktionary, the interface came first, and it is clear from viewing a page that a word is either a noun, a verb, etc. For a computer it is less clear. In OmegaWiki, it is quite the contrary. The relational database came first, so that the data is very computer-friendly, but then we have to build an interface on top of it and tell it what is important for a human. If parts of speech matter, we have to display them more visibly and sort meanings by part of speech.

This feature was often requested, and this is exactly what OmegaWiki does now :-)

Definitions and translations of the various meanings of the English  word "round" in French.

As usual in OmegaWiki, the part of speech information is translated in the user language. When the part of speech of a definition is not known, it will display "??".
Definitions and translations in Spanish of the English word "square".

When none of the meanings have a part of speech, it will not display "??" because having only "??" on top of the page would only be confusing.

Definition and translation in Breton of the Cantonese word for France.
Yes, we do Breton-Cantonese dictionary :)
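The display rule described above (group the meanings by part of speech, show "??" for unknown ones, but show no heading at all when nothing is known) could be sketched like this in Python. The function name and the choice to put "??" last are my assumptions, not a copy of the extension code.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

UNKNOWN = "??"


def group_by_part_of_speech(
    meanings: List[Tuple[str, Optional[str]]]
) -> Dict[str, List[str]]:
    """Group (definition, part of speech) pairs for display.

    If no meaning has a part of speech, return one unlabeled group (key "")
    instead of a lone "??" heading.
    """
    if all(pos is None for _, pos in meanings):
        return {"": [definition for definition, _ in meanings]}

    groups: Dict[str, List[str]] = defaultdict(list)
    for definition, pos in meanings:
        groups[pos or UNKNOWN].append(definition)

    # Assumption: known parts of speech alphabetically, "??" at the end.
    ordered = sorted(groups, key=lambda pos: (pos == UNKNOWN, pos))
    return {pos: groups[pos] for pos in ordered}


# Made-up definitions, for illustration only.
mixed = group_by_part_of_speech(
    [
        ("a stage of a competition", "noun"),
        ("meaning without annotation", None),
        ("circular in shape", "adjective"),
    ]
)
unlabeled = group_by_part_of_speech([("a country in Europe", None)])
```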
Most of the missing part-of-speech data could be imported by bots. We already have the API to add annotations easily. We just need someone who would like to run such a bot. Any programming language will do.


Monday, May 27, 2013

Lexical annotations displayed more prominently

In OmegaWiki, we make the distinction between lexical annotations and semantic annotations.
* semantic annotations are information about a concept. They do not depend on the language, but rather on the meaning. For example an image illustration is the same for all languages. Also, if "dictionary" is a hyponym of "book", it is also true that in French "dictionnaire" is a hyponym of "livre".
* lexical annotations, on the contrary, are information about a specific word of a specific language. This includes grammatical data, pronunciations, etymology, etc.
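The distinction between the two kinds of annotations can be pictured as a small data structure. This Python sketch uses invented class and field names (`DefinedMeaning.semantic`, `Expression.lexical`) and an invented ID; it only shows where each kind of annotation lives, not how OmegaWiki stores it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Expression:
    """A word in one language; lexical annotations belong here."""
    spelling: str
    language: str
    lexical: Dict[str, str] = field(default_factory=dict)


@dataclass
class DefinedMeaning:
    """A concept; semantic annotations are shared by all its translations."""
    dm_id: int  # made-up ID for the example
    semantic: Dict[str, str] = field(default_factory=dict)
    expressions: List[Expression] = field(default_factory=list)


dictionary = DefinedMeaning(
    dm_id=1,
    semantic={"hypernym": "book"},  # holds whatever the language
    expressions=[
        Expression("dictionary", "English", {"part of speech": "noun"}),
        Expression(
            "dictionnaire",
            "French",
            {"part of speech": "noun", "gender": "masculine"},
        ),
    ],
)

# Gender is recorded for the French word only; the hypernym is stored once.
genders = {e.language: e.lexical.get("gender") for e in dictionary.expressions}
```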

An extensive List of annotations can be consulted on the website.

In a dictionary, it is usually expected that lexical annotations are displayed on top. In OmegaWiki, these annotations used to appear only when you clicked on the annotation column in the translation lists. They were so well hidden that several people concluded that we just did not include such information.

This was improved a few months ago. Now, here is how the definition of "dictionary" looks with the interface in English:


For a word in Mandarin, it would display its pinyin (when available). For example, 键盘 with the interface in French:

Thursday, May 09, 2013

We are back online!

http://www.omegawiki.org is back online.

I was able to repair whatever I did wrong, so that a fresh install was not necessary. No data was lost.

Thanks to Erik for helping :)

Sunday, May 05, 2013

OmegaWiki is down

The server update (new Debian version) didn't go as planned and the server is down.


PS: I did a database dump just before it went down, so nothing is lost.

Tuesday, February 19, 2013

Give us your support for merging with Wikimedia

So... regarding having OmegaWiki become one of the Wikimedia projects...

The following pages have been redrafted:

Only with sufficient support will the idea actually be considered by the Wikimedia Foundation.

As you might know, OmegaWiki is currently privately hosted. There are two problems with this:
- the server is slow, because faster servers cost more money which we don't have
- and without an organization supporting us, we cannot receive donations.

The WMF would be a perfect choice to host us.

However, if you know about another non-profit organization which shares the noble ideal of free-knowledge-for-all and which might be interested in supporting OmegaWiki (which means basically hosting the website and providing a structure for receiving donations), please mention it.


Saturday, February 16, 2013

Using Wikidata to display links to Wikipedia

Wikidata is a new wiki with a structured database (a bit like OmegaWiki), which is used, among other things, as a central repository for maintaining the interwiki links between the various Wikipedias.

In Wikidata, concepts are identified with a unique ID, like "Q41055". This ID gives you the links to the various Wikipedia articles (see: http://www.wikidata.org/wiki/Q208440 ). In OmegaWiki, concepts are called "DefinedMeaning", and also have a unique ID, like "159079" (see: http://www.omegawiki.org/DefinedMeaning:Schachfigur_(159079) ). They can be linked one-to-one with Wikidata IDs. Using the Wikidata ID, a link to Wikipedia is automatically retrieved and displayed on the side.

As usual, the user language is used to redirect the user to the Wikipedia that is in his language. See for example how the link changes when you view the same page in English, in French, in German or in Persian.

Before that, we already had links to Wikipedia; however, one link had to be added for each language. The advantages of the new system over the previous one are:
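For those who want to try the lookup themselves, here is a rough Python sketch of how a Wikipedia link can be derived from a Wikidata ID and the user language. It builds a query for the Wikidata `wbgetentities` API and reads the sitelink from the answer. The helper names are mine, the sample answer is written by hand (not real API output), and mapping a language code to `<code>wiki` is a simplification; this is not the code that runs on OmegaWiki.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API = "https://www.wikidata.org/w/api.php"


def sitelink_query_url(wikidata_id: str, language: str) -> str:
    """Build the API request asking for one Wikipedia sitelink of an item."""
    params = {
        "action": "wbgetentities",
        "ids": wikidata_id,
        "props": "sitelinks",
        "sitefilter": f"{language}wiki",  # simplification: code + "wiki"
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def wikipedia_url(response: dict, wikidata_id: str, language: str) -> Optional[str]:
    """Extract the article link for `language`, or None if there is none."""
    entity = response.get("entities", {}).get(wikidata_id, {})
    link = entity.get("sitelinks", {}).get(f"{language}wiki")
    if link is None:
        return None
    title = link["title"].replace(" ", "_")
    return f"https://{language}.wikipedia.org/wiki/{quote(title)}"


# Hand-written sample in the shape of an API answer (illustrative title).
sample = {
    "entities": {
        "Q208440": {
            "sitelinks": {"enwiki": {"site": "enwiki", "title": "Chess piece"}}
        }
    }
}

query = sitelink_query_url("Q208440", "en")
english = wikipedia_url(sample, "Q208440", "en")
persian = wikipedia_url(sample, "Q208440", "fa")  # not in the sample
```

Only the ID is stored on our side; everything language-specific comes from Wikidata at display time.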

  • we don't have to add a link for each language, as we did before, but just put one Wikidata ID.
  • the links are automatically updated when they are changed on Wikidata.
  • it works even when we don't have a translation in that language. For example, if we have a species, like Spalerosophis diadema that has no translation in French ("Spalerosophis diadema" is language "international") we can still have a link to a Wikipedia article.
  • it is an annotation at the DefinedMeaning level. This is more consistent with the other annotations, because it is about a concept (DM), whereas it was previously classified with the lexical annotations (like part of speech, gender, phonetics, etc.)

With the old system, we already created more than 15,000 links to Wikipedia. They will be converted automatically to Wikidata IDs.

Note that the link can work in the other direction too. When we have one-to-one correspondences between concepts in Wikidata and OmegaWiki, it can be used in Wikidata to get, for example, multilingual definitions for their concepts.

Wednesday, February 06, 2013

See only the languages you want to see

OmegaWiki supports more than 400 languages. While we are happy that some words, such as "head" and "apple" have more than 100 translations, we can imagine that not everybody is interested in seeing all translations. Many people only want to see the few languages that they actually speak, and do not care about the rest.

And this is now possible :-)

There is an option in the user preferences (see image below) where the user can select the languages he is interested in, so that all the other languages are hidden.

This is done by:
- selecting the languages from the list of all languages available at OmegaWiki
- checking the checkbox on top, in bold "Show only the selected languages".
This second checkbox acts as a switch, letting you quickly enable or disable language filtering without going through the list of languages and selecting/unselecting each language every time.
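The logic of the two settings (a stored selection plus an on/off switch) is simple enough to sketch in Python. The function below is an illustration with invented names, not the preference code itself.

```python
from typing import Dict, Set


def visible_translations(
    translations: Dict[str, str], selected: Set[str], filter_enabled: bool
) -> Dict[str, str]:
    """Apply the "Show only the selected languages" switch.

    The selection is kept even when the switch is off, so turning the
    filter back on does not require selecting the languages again.
    """
    if not filter_enabled:
        return dict(translations)
    return {lang: word for lang, word in translations.items() if lang in selected}


apple = {"English": "apple", "German": "Apfel", "French": "pomme", "Spanish": "manzana"}
my_languages = {"English", "German"}

filtered = visible_translations(apple, my_languages, filter_enabled=True)
everything = visible_translations(apple, my_languages, filter_enabled=False)
```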

Also, if you restrict the number of languages, the pages will load faster (in particular when editing).

To avoid blank pages, and since many words are at least translated into English, it is best to always have English selected. You could also try Spanish instead of English; it is the second best candidate in terms of number of translations.

Friday, January 04, 2013

Wikidata extension renamed to WikiLexicalData

OmegaWiki used to be based on the "Wikidata" extension of MediaWiki.

However, recently, a new project called the Wikidata project, which is based not on the Wikidata extension, but the Wikibase extension, was started. This confused a lot of people.

At the same time, extensions are being moved from Subversion to Git. Therefore, during that move I have asked that the Wikidata extension be renamed to WikiLexicalData extension.

Why lexical? The Wikidata extension was initially developed for OmegaWiki, storing words, definitions and translations. It was believed however that the extension could be easily adapted for something else, with few code changes, and that the lexical part of Wikidata would be only one of the possible uses. After several years, there was obviously no interest in using that extension for something else. After having played with the code myself, I now believe that the languages are too central to that extension and that a major rewrite would be needed to code for something else, and probably starting from scratch would be as fast as a major rewrite. This is also why the Wikidata project started from scratch instead of starting from the Wikidata extension.

It is even now considered that in the future, the OmegaWiki database will be kept as is, but the extension code will be rewritten (completely?) to depend on the Wikibase extension (Wikidata project). This, of course, depends on the outcome of the Wikidata project, but it seems to give promising results at the moment :-)

For the user of OmegaWiki, all of this changes nothing.
For the developers: move your checkout from Svn to Git, and change the extension name.