CiviCRM: Multi-language Support
Multi-language editing
Written by Piotr Szotkowski   

The last part of my Summer of Code project was multi-language editing of the internationalised fields. The aim of this task was to be able to edit all language versions of a given field (say, a given contact’s first name) in one central place.

After some initial head-scratching I decided the most useful way of implementing this would be to add a small icon next to the internationalised fields in their respective editing screens; once the user clicks this icon, a small dialog with this field’s values in all the enabled languages pops up and the user is able to adjust this field’s value in all the languages in one go.

After struggling a bit with various issues, this feature is implemented using the Dojo toolkit’s Dialog widget; as I didn’t want to submit the ‘underlying’ form (nor reload it after the user edits the multi-language field), the small dialog form is submitted with Ajax, this time using Dojo’s xhrPost.

All of these were my first times with JavaScript, Ajax and Firebug-powered script debugging, and I must say I’m quite impressed with the ease of use and straightforwardness of the Dojo library – a very nice finishing touch to all the great experiences accompanying Google Summer of Code.

 

CiviCRM 2.1.alpha1 released with initial multi-language support
Written by Piotr Szotkowski   

Long time no blog – partly because I was traveling in the USA most of July, partly because we were busy finishing the last CiviCRM 2.1 features. That said, I’m more than happy to announce that the core multi-language support developed during the Summer of Code project made it into the abovementioned CiviCRM version, and is part of the freshly-released CiviCRM 2.1.alpha1 (alpha1 actually has a stupid bug that makes it more of a proof-of-concept release with regard to multi-language support, but I fixed the bug yesterday and alpha2 should be fully working). :)

As sketched previously, I ended up introducing the internationalization support on a very low level of abstraction, thus making it almost invisible to the rest of the core development team (and any third-party coders); this way, there’s a much bigger chance everything will ‘just work’ and not break when using multiple languages, and that any new contributions won’t break on multi-language sites either.

Basically, the whole stack works as follows:

1. A new XML element was added to our database schema structure, <localizable>true</localizable> – if any database field has this property, it will be represented by multiple, per-language columns in the final database.

2. For every supported language and every internationalized table, a $tablename_$locale view is created that exposes the localized columns under their ‘original’ names.

3. Our DAO files (the auto-generated backbone of our ORM) ‘know’ whether to operate on a table directly or through a view, so if a given piece of code operates on the database through DAO it just works as before.

4. Any hand-crafted SQL query sooner or later passes through CRM_Core_DAO::query(), which uses regular expressions to rewrite the query and replace any internationalized table names with view names related to the currently-used language.

5. For every localized table, an ON INSERT trigger is created, which populates all localized columns with the contents of the column that is not NULL; thus, if you create a new organization in English, its name will be populated in all languages – but you can then alter the Russian version to be spelt in Cyrillic.
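
Step 5 above can be sketched in miniature. This is only an illustration, not the actual CiviCRM code: it uses Python with SQLite in place of PHP with MySQL, and the table, column and trigger names (civicrm_org, name_en_US, name_ru_RU, civicrm_org_after_insert) are hypothetical. The trigger copies whichever localized column was filled in on insert into the ones that were left NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One physical column per enabled language (hypothetical names).
CREATE TABLE civicrm_org (
    id INTEGER PRIMARY KEY,
    name_en_US TEXT,
    name_ru_RU TEXT
);

-- After an insert, fill every NULL localized column from the non-NULL one.
CREATE TRIGGER civicrm_org_after_insert AFTER INSERT ON civicrm_org
BEGIN
    UPDATE civicrm_org
       SET name_en_US = COALESCE(name_en_US, name_ru_RU),
           name_ru_RU = COALESCE(name_ru_RU, name_en_US)
     WHERE id = NEW.id;
END;
""")


def names(org_id):
    """Return the (English, Russian) names stored for one organization."""
    return conn.execute(
        "SELECT name_en_US, name_ru_RU FROM civicrm_org WHERE id = ?", (org_id,)
    ).fetchone()


# Creating the organization in English populates the Russian column too.
conn.execute("INSERT INTO civicrm_org (id, name_en_US) VALUES (1, 'Red Cross')")
after_insert = names(1)

# The Russian version can then be altered on its own.
conn.execute("UPDATE civicrm_org SET name_ru_RU = 'Красный Крест' WHERE id = 1")
after_edit = names(1)
```

With only two languages a pair of COALESCE calls is enough; with more languages the same idea would presumably need one COALESCE over all localized columns per assignment.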

All in all, I’m pretty happy with how this part of the project ended up – everything seems to ‘just work’, and doesn’t introduce any additional maintenance burden on the other developers.

 

Adjusting the code to use the view-based approach
Written by Piotr Szotkowski   

One of the issues to solve when implementing the view-based approach described in my previous post on the topic is how to make the current codebase aware of when to use a view and when to keep operating on a table.

The problem is twofold. First, the set of tables which hold localisable data will change from release to release (we might want to localise more data in the future, or schema changes might move a localisable column to some other table), and we don’t want to have to track which tables are localisable by hand; at the same time, localising all tables wouldn’t be practical (as we really want to do that to just a couple of them). Second, it would be most useful if there was a way to handle this automagically, without other CiviCRM coders having to remember to glue the $dbLocale variable to the end of any table name in hand-crafted SQL.

Most of CiviCRM’s database operations are done through an ORM (PEAR’s DB_DataObject), and are handled by auto-generated DAO classes; it was enough to change the template used for generating these classes to make the getTableName() method return the proper view’s name.
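
As a rough illustration of the idea (in Python rather than PHP, and not the actual generated code), a DAO class can decide on its own whether to hand out the table name or the per-language view name. The class names, the `localizable` flag and the `db_locale` suffix format below are assumptions made for this sketch.

```python
class BaseDAO:
    """Hypothetical stand-in for an auto-generated DAO class."""

    table_name = ""
    # In the real code generator this would come from the schema definition.
    localizable = False

    def __init__(self, db_locale=""):
        # e.g. "_pl_PL" on a multi-language site, "" on a single-language one
        self.db_locale = db_locale

    def get_table_name(self):
        # Localizable tables are accessed through the per-language view.
        if self.localizable:
            return self.table_name + self.db_locale
        return self.table_name


class OptionValueDAO(BaseDAO):
    table_name = "civicrm_option_value"
    localizable = True


class LogDAO(BaseDAO):
    table_name = "civicrm_log"
    localizable = False


view_name = OptionValueDAO("_pl_PL").get_table_name()
plain_name = LogDAO("_pl_PL").get_table_name()
```

Code that asks the DAO for its table name never needs to know which of the two cases it is dealing with.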

Unfortunately, quite a bit of custom data functionality is done using hand-crafted SQL queries. There are often references in the same query to both localisable and non-localisable tables; the references are in all of the SELECT, FROM and WHERE parts of the queries; also, quite often the resulting column names incorporate table names as prefixes, under which they are subsequently visible as object properties. Finally, the table names used in the queries are often not literal strings, but keys in hashes which are also used elsewhere.

All of the above made me scratch my head quite a bit; even if I could track down all the references to table_x (and turn them into references to table_x_$dbLocale), I would still wonder whether I had missed anything. The result would also be hard for any other developer to grasp (‘why does this table’s name have something appended to it, while that one doesn’t?’) and hard to maintain, and everyone who wrote third-party extensions would have to update their code as well.

All of the custom SQL queries pass through one central place, namely CRM_Core_DAO::executeQuery(); unfortunately, this function gets the query as a single string, not as a set of parameters that could be operated upon (the assumption being that it’s used for those of the queries that DataObject can’t handle, most often more sophisticated ones). Fortunately, after looking a bit at the queries passed through it, I managed to come up with a set of regular expressions that match the localisable table names and can replace them with view references, while at the same time skipping any of those references used in output labeling (and, thus, maintaining the PHP-side property references).

This approach results in very small changes (basically, a couple of lines in one method call), while at the same time being ‘invisible’ to other developers, who don’t have to remember (or even know) they’re now using localised views; unless they debug the actual SQL, it still looks (and feels) like direct table access to any code outside executeQuery() and the DAOs.
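
The query rewriting described above might look roughly like the following. This is a simplified sketch in Python, not the regular expressions actually used in CiviCRM: the list of localisable tables and the sample query are made up, and a single word-boundary pattern stands in for the real set of expressions. Because an underscore counts as a word character, an output label such as civicrm_option_value_label is left alone, while references to the table itself are redirected to the view.

```python
import re

# Hypothetical list; in practice it would be derived from the schema.
LOCALISABLE_TABLES = ["civicrm_option_value", "civicrm_option_group"]


def rewrite_query(query, db_locale):
    """Replace localisable table names with the matching view names."""
    names = "|".join(re.escape(name) for name in LOCALISABLE_TABLES)
    # \b keeps 'civicrm_option_value_label' (an output label) from matching.
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: match.group(1) + db_locale, query)


original = (
    "SELECT civicrm_option_value.label AS civicrm_option_value_label "
    "FROM civicrm_option_value "
    "JOIN civicrm_contact ON civicrm_contact.id = civicrm_option_value.value "
    "WHERE civicrm_option_value.id = 69"
)
rewritten = rewrite_query(original, "_pl_PL")
```

One limitation of a sketch this simple is that a table name appearing inside a quoted string literal would be rewritten as well.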

 

A view-based approach to multi-language CiviCRM
Written by Piotr Szotkowski   

Long time no blog – mostly because my initial concept of bringing the multi-language features to CiviCRM was replaced with a brand new approach, which should be much more developer-friendly.

Having the contents of a CiviCRM site in multiple languages means that certain columns in the database (the user-visible ones) must be localisable – but how to implement this from the database point of view is far from obvious.

My initial approach was to create a single civicrm_l10n table, with columns of entity_table, entity_column, entity_id, locale and translation. This approach has the advantage of being space-efficient; if only a handful of the database’s contents is localised in a given language, this table would hold just a couple of rows. This is also the least disruptive approach from the database’s point of view: only one new table is introduced.

Unfortunately, this approach has the drawback of being much more code-disruptive – any query that retrieves database values for display must be changed to check whether there isn’t a localised version in civicrm_l10n; any save operation would have to save to civicrm_l10n if the language is not the default one. This kind of disruption affects all the other CiviCRM developers, including the core team; from the moment this code landed in the main repository, the developers would have to cater for the multilingual stuff when maintaining the codebase, and would have to consider internationalisation issues when writing new code.

The will to ease the future development of CiviCRM led my train of thought onto new tracks. What if instead of having civicrm_l10n table entries like ('civicrm_option_value', 'label', 69, 'pl_PL', 'Tłumaczenie') – which would mean that if you’re using Polish and are trying to display the contents of the row 69 and column ‘label’ from the ‘civicrm_option_value’ table, then you should display ‘Tłumaczenie’ instead of the original – we could stick to the current queries of simply displaying column ‘label’ for row 69 of X (where X is currently civicrm_option_value)?

Then it hit me – what if I used MySQL views for this? It turns out this seems like a sane idea. Instead of a separate civicrm_l10n table, every column that needs to be localisable in tablename is multiplied as columnname_locale, and a new view, tablename_locale, is created that makes this column appear as columnname inside of it. For the above example, instead of having civicrm_l10n table entries with entity_table of ‘civicrm_option_value’ and entity_column of ‘label’, the civicrm_option_value table would simply gain a label_pl_PL column and a civicrm_option_value_pl_PL view would be created that would work just like the original civicrm_option_value table, but with the label_pl_PL column visible as ‘label’.

This way, any code that currently operates on the civicrm_option_value table (and reads from or writes to the label column) would still work if it was only changed to operate on the civicrm_option_value_pl_PL view instead.
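
A small, self-contained sketch of the view idea follows. It is an assumption-laden illustration rather than the project’s code: it uses Python with SQLite instead of PHP with MySQL, and the label_en_US column, the English value and the get_label() helper are invented for the example. Only reads are shown, because SQLite views are read-only, whereas simple single-table views in MySQL can also be written to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The localisable 'label' column is multiplied per language.
CREATE TABLE civicrm_option_value (
    id INTEGER PRIMARY KEY,
    label_en_US TEXT,
    label_pl_PL TEXT
);

-- Each view exposes its language's column under the original name.
CREATE VIEW civicrm_option_value_en_US AS
    SELECT id, label_en_US AS label FROM civicrm_option_value;
CREATE VIEW civicrm_option_value_pl_PL AS
    SELECT id, label_pl_PL AS label FROM civicrm_option_value;

INSERT INTO civicrm_option_value (id, label_en_US, label_pl_PL)
    VALUES (69, 'Translation', 'Tłumaczenie');
""")

KNOWN_LOCALES = {"en_US", "pl_PL"}


def get_label(row_id, locale):
    """Run the unchanged 'SELECT label ...' query against a language's view."""
    if locale not in KNOWN_LOCALES:
        raise ValueError("unknown locale: " + locale)
    view = "civicrm_option_value_" + locale
    row = conn.execute(
        "SELECT label FROM " + view + " WHERE id = ?", (row_id,)
    ).fetchone()
    return row[0]


polish = get_label(69, "pl_PL")
english = get_label(69, "en_US")
```

The query text is identical for both languages apart from the name of the relation it reads from, which is the property the approach relies on.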

I believe this is a viable approach; when using our DAO classes, the code should refer to the _tableName property (which can be built dynamically depending on the currently-used locale), and when creating SQL by hand, it should simply refer to civicrm_table_$locale instead of civicrm_table (where $locale holds the current locale).

The coming days should see the implementation of this on the gsoc-i18n branch. Stay tuned for further blog posts on how it turned out.

 

Localisation of the new menu structure
Written by Piotr Szotkowski   

Due to the introduction of a new menu system in CiviCRM 2.1, my summer project got one more item on its list – the localisation of the menu entries.

Until now, all of the localisable strings in CiviCRM were enclosed either in PHP ts(…) function calls or in Smarty’s {ts}…{/ts} blocks. This approach was most convenient: first – the function/block call was short to type (and did not introduce a lot of noise into the code); second – making sure CiviCRM is internationalised for translation (i.e., all of its strings are localisable) was as easy as looking through the code/templates and wrapping the visible strings in the above calls; third – once we’d written a simple PHP/Smarty parser, creating the POT files for translators’ use was simple: just scan the code and the templates and pull out anything that’s inside those calls.
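
The string extraction described above can be approximated with a couple of regular expressions. The sketch below is written in Python and is not the parser CiviCRM actually uses; the patterns, function names and sample sources are assumptions, and corner cases such as escaped quotes or ts() parameters are deliberately ignored.

```python
import re

# ts('...') or ts("...") in PHP code (no support for escaped quotes).
PHP_TS = re.compile(r"""\bts\(\s*(['"])(.*?)\1""")
# {ts}...{/ts} blocks in Smarty templates, with optional block parameters.
SMARTY_TS = re.compile(r"\{ts[^}]*\}(.*?)\{/ts\}", re.DOTALL)


def extract_strings(php_source, smarty_source):
    """Collect translatable strings, de-duplicated, in order of appearance."""
    found = [match.group(2) for match in PHP_TS.finditer(php_source)]
    found += [match.group(1) for match in SMARTY_TS.finditer(smarty_source)]
    return list(dict.fromkeys(found))


def to_pot(strings):
    """Render the strings as bare-bones POT entries (no header, no escaping)."""
    return "\n".join('msgid "%s"\nmsgstr ""\n' % s for s in strings)


php = "$title = ts('Find Contacts'); $button = ts(\"Save\");"
smarty = "<h1>{ts}Find Contacts{/ts}</h1><p>{ts}New Group{/ts}</p>"

strings = extract_strings(php, smarty)
pot = to_pot(strings)
```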

This approach worked quite ok until the new menu system appeared on the scene; from CiviCRM 2.1 on, the entries in the menu are not defined in the code, but instead in the database (in the civicrm_menu table), which itself is generated from a set of XML files.

Fortunately, after some digging around I found out that the two methods that create the final structure for menu and page title display – CRM_Core_Menu::get() and getNavigation() – happen to keep the translatable strings in array fields keyed with title, so a method localising them in place and a new POT file with the menu structure (in the future, generated automatically from the abovementioned XML files) was all that was needed to make the menu localised to the currently selected language.
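
Localising such a structure in place can be sketched as a short recursive walk. The example below is in Python rather than PHP, and the menu layout, the paths and the tiny translation table are all made up; only the idea of translating every value stored under a title key comes from the description above.

```python
def localise_titles(menu, translate):
    """Recursively translate every string stored under a 'title' key, in place."""
    for key, value in menu.items():
        if key == "title" and isinstance(value, str):
            menu[key] = translate(value)
        elif isinstance(value, dict):
            localise_titles(value, translate)


# Hypothetical translations; real ones would come from the menu POT/PO files.
POLISH = {"Find Contacts": "Znajdź kontakty"}


def translate(text):
    # Fall back to the original string when no translation exists.
    return POLISH.get(text, text)


# Hypothetical menu structure keyed by path.
menu = {
    "civicrm/contact/search": {"title": "Find Contacts", "weight": 10},
    "civicrm/admin": {
        "title": "Administer CiviCRM",
        "children": {"civicrm/admin/setting": {"title": "Global Settings"}},
    },
}

localise_titles(menu, translate)
```

Untranslated entries simply keep their original titles, so a partially translated menu still renders completely.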

 

Language switching and choice persistence
Written by Piotr Szotkowski   

As the Summer of Code coding phase got rolling, I started looking at the tasks I listed in my original description of the project.

The first task on the list was to make language switching in CiviCRM available to the users. This was introduced as a list of links on the user’s dashboard. As CiviCRM currently has 32 localisations (granted, some more advanced than others), it was essential that this list can be managed by the administrator of the site; this is now a part of the Administer CiviCRM → Global Settings → Localization admin screen.

The second task was to make the language selection permanent for a given user. Initially, I chose to use a session variable to hold this information; unfortunately, this didn’t work so well, as the language got reset to the installation’s default on every logout.

As every CiviCRM user is also a contact in the given installation’s database, the ideal solution would be to have the language selection be a setting associated with the contact; unfortunately, per-contact settings are not yet implemented in CiviCRM. Once they’re implemented, the language chosen by the user will be stored there; for now, a simple hack was implemented to store the preference as a year-long-valid cookie in the user’s browser.
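
The cookie hack can be pictured with the following sketch, which uses Python’s standard http.cookies module instead of PHP. The cookie name civicrm_locale, the en_US default and both helper functions are assumptions made for illustration; only the one-year lifetime comes from the text above.

```python
from http.cookies import SimpleCookie

ONE_YEAR = 365 * 24 * 60 * 60  # cookie lifetime in seconds


def language_cookie_header(locale):
    """Build a Set-Cookie header that remembers the chosen locale for a year."""
    cookie = SimpleCookie()
    cookie["civicrm_locale"] = locale  # hypothetical cookie name
    cookie["civicrm_locale"]["max-age"] = ONE_YEAR
    cookie["civicrm_locale"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def locale_from_cookie(cookie_header, default="en_US"):
    """Read the locale back from a request's Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("civicrm_locale")
    return morsel.value if morsel is not None else default


header = language_cookie_header("pl_PL")
remembered = locale_from_cookie("civicrm_locale=pl_PL")
fallback = locale_from_cookie("")
```

Unlike a session variable, the cookie survives a logout, which is the behaviour the session-based attempt was missing.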

 

Setting up the technical infrastructure and planning ahead
Written by Piotr Szotkowski   

As I’ve been one of the CiviCRM developers for the past three years, the community bonding period went pretty nicely in my case. :) As part of my regular CiviCRM activities, through most of the past month I’ve been working on the new dedupe engine, and I’m really happy with the results – but it’s high time now to concentrate on my Summer of Code activities.

As my set of CiviCRM hats includes being the project’s Subversion repository administrator, it was my pleasure to set up the repositories for both Jon’s and my GSoC projects – the branches are gsoc-ui and gsoc-i18n, respectively (for those interested, browsing the activities in our repository is most efficient with our FishEye install).

Given CiviCRM’s fast release cycle (three or four releases a year), the code from the GSoC projects won’t be a part of the upcoming CiviCRM 2.1 release, and will either be included in CiviCRM 2.2 or CiviCRM 2.3. To that end, both projects were branched from the main development ‘trunk’, and as our main development on trunk continues, the changes happening on trunk will be merged to both GSoC branches on a roughly weekly basis.

The decision when to include both (or either) projects’ code into CiviCRM ‘proper’ will be made after we branch for CiviCRM 2.1, and will depend on the shape and the scope of the changes introduced in both projects; once we believe they’re ‘good enough’, we’ll start merging the changes happening on the gsoc-* branches to trunk.

For me personally, this week started with taking part in a very interesting conference, and was sweetened even more by winning an award for the best paper in my field there. :) Now that the conference is over and the technical infrastructure is all set up, I can finally concentrate on bringing first-class multi-language support to CiviCRM – a topic which is very close to my heart, and which takes me back to the time when I was taking my baby steps in CiviCRM development: when I joined the team in May 2005, I joined it as a localisation/internationalisation expert. At that time, the goal was to make CiviCRM usable for people outside the English-speaking part of humanity; now it’s high time to make it usable for multi-language communities and organisations.

 

Team: CiviCRM Multi-language Support
Written by Amy Stephen   
Piotr Szotkowski

Piotr Szotkowski is a PhD student at Warsaw University of Technology, where he researches the symbolic functional decomposition method for implementing finite state machines in FPGA devices. He has developed for various non-governmental organizations, including the Stefan Batory Foundation and the Open Source Culture Foundation. Piotr is well known in the CiviCRM community given his development experience, including work with internationalization and localization. He has advanced knowledge of PHP, MySQL, Ruby, PostgreSQL, XHTML and CSS.

Wes Morgan (Mentor)

Wes Morgan is an online organizer and software developer for Environment America. He works to get people involved in environmental advocacy in their backyard and across the US. He is also a user of and contributor to open source software like WebGUI and CiviCRM. This summer, he is working with the Joomla! GSoC team to mentor projects relating to CiviCRM (CiviCRM integrates with Joomla!). When not working or coding, Wes enjoys spending time in the Colorado Rocky Mountains hiking or skiing, and/or sampling the many delicious microbrews of the Front Range. You'll also often find him running through the park (and only sometimes being chased).

 

Abstract: CiviCRM Multi-Language Support
Written by Piotr Szotkowski   

CiviCRM is an open source constituent relationship management system used by NGOs and advocacy groups (like Amnesty International, Wikimedia Foundation or the Joomla! and Drupal projects) all over the world. Judging by the number of community-contributed and -maintained translations and civicrm.org statistics, CiviCRM installations exist in over twenty languages using various alphabets (Latin, Cyrillic, Arabic, Devanagari, Chinese). Multi-language support is essential in multilingual countries (like Canada or India), as well as in cross-border (e.g., Central and East European) and worldwide organizations.

Currently, the CiviCRM internationalization and localization features are limited to one language per installation. Extending CiviCRM with multi-language support will allow on-the-fly language switching for both static and custom (specific to a given installation) user interface elements, as well as entering and storing multiple language versions of the managed data. The implementation will utilize a gettext-like translation mechanism with separate textual domains for every set of localized data (thus evading the issue of gettext not supporting translations of homonyms) and a separate table for storing the translations. This approach ensures that the internationalization layer is mostly independent from the core CiviCRM schema, that its existence doesn’t hamper the (relatively fast) speed of CiviCRM development and that it’s easily adaptable to future CiviCRM features. Another benefit of such an approach is that the database/disk space for the translated strings doesn’t have to be pre-allocated (otherwise, a ten-language site has to support a database an order of magnitude larger than a one-language install, even when most of the content is not localized to all of the languages).

 