WordCamp Europe 2015 – Multilingual Discussion

Discussers:

Drupal Multilingual Overview

Christian López Espínola, who contributes to Drupal's multilingual features, gave us this overview of those features in Drupal 7 and the upcoming Drupal 8:

Drupal bundles several modules with core. The language module, added in Drupal 7, supports assigning a language to content. Drupal already had interface translation support, as WordPress does.

Drupal 8 adds content translation, with a UI.

At first, the problem was that there were two different strategies for doing this. Drupal 7 supported translating a node by creating copies, so one piece of content was held in multiple posts. This made it hard to select the right content for display when other modules were unaware of, or not interested in, multilingual support. Drupal 8 moves translations to the field level, so they are stored in the equivalent of post meta. One content node now contains all of its translations: the fields attached to the node hold the title, content, etc. in every language, provided those fields are translatable. Fields that are not translatable are stored once rather than being duplicated between nodes.

Q: How does Drupal connect the translations of one piece of content into a group?
A: If one post is a translation of another, there is a field which links them. Drupal 7 uses two nodes. Drupal 8 has one node, and the translated content is stored in fields (each with a language attribute).
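The two storage strategies described above can be contrasted with a minimal sketch. This is written in Python purely for illustration; the class and field names are assumptions made for this example and are not Drupal's actual schema or API.

```python
from dataclasses import dataclass, field


# Copy-per-language strategy (as described for Drupal 7): one node per
# language, linked by a shared translation-set id. Names are illustrative.
@dataclass
class NodePerLanguage:
    nid: int
    language: str
    title: str
    translation_set: int  # same value for every translation of one piece of content


def translations_of(nodes, node):
    """Return all nodes in the same translation set, keyed by language."""
    return {n.language: n for n in nodes if n.translation_set == node.translation_set}


# Field-level strategy (as described for Drupal 8): one node, and each
# translatable field holds one value per language.
@dataclass
class FieldTranslatedNode:
    nid: int
    default_language: str
    title: dict = field(default_factory=dict)  # language code -> translated title
    author_id: int = 0                         # not translatable, so stored once

    def get_title(self, language):
        # Fall back to the default language when no translation exists.
        return self.title.get(language, self.title[self.default_language])
```

In the first model, code that is unaware of `translation_set` sees each translation as unrelated content. In the second, there is only ever one node to select, and the language choice happens when a field is read.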

Q: What issues did adding a content language cause?
A: At some level we had two different posts, one for each language. If your plugin doesn’t consider internationalisation, this causes issues: the plugin treats translations as different content and mixes languages in the UI. For example, if we rate a post using a rating plugin, we may want the rating to be “shared” between all translations of that content.
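As a rough sketch of the rating example, a plugin could key its data by translation group rather than by individual post. The `translation_group` mapping below is hypothetical; in practice a multilingual plugin would have to supply that lookup.

```python
from collections import defaultdict

# Hypothetical lookup from post id to the id of its translation group.
# Posts 10 and 11 are translations of each other; post 12 stands alone.
translation_group = {10: 10, 11: 10, 12: 12}

ratings = defaultdict(list)  # group id -> list of scores


def rate(post_id, score):
    # Store the rating against the group, not the individual post,
    # so that every translation shares the same scores.
    ratings[translation_group.get(post_id, post_id)].append(score)


def average_rating(post_id):
    scores = ratings[translation_group.get(post_id, post_id)]
    return sum(scores) / len(scores) if scores else None
```

A plugin that keys ratings by `post_id` alone would instead show a separate rating for each translation.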

Some Approaches to Attributing Language to Content

We looked at just a few multi-lingual plugins, to see how they addressed the issue of storing language content.

<caveat>This is by no means an exhaustive list, and only reflects the solutions that the people in the discussion have had experience with 🙂 Please feel free to add other examples or corrections in the comments.</caveat>

  • WPML table structures – to summarise the linked page: each translated content object is stored as a post, and a WPML database table links the content into groups and specifies the language of each content object.
  • Babble puts the translations of each content object into a custom post type or custom taxonomy, e.g. `post`, `post_fr_fr`, `post_pt_pt`, and uses a taxonomy to group translated content objects together, so you can say “this post is a translation of this other post”. A disadvantage of this approach is that translated content does not have the expected post type, e.g. `post`, but instead a Babble translation post type, e.g. `post_pt_pt`.
  • Multilingual Press stores content from each language in a separate site in a WordPress multisite. There is a database table which links translated content across sites.
  • Polylang does not create any additional database tables at all. It creates four taxonomies: `language` (holds all the languages you have configured), `term_language` (the terms in each language, e.g. EN: Uncategorized, DE: Allgemein), `term_translations` (connects the translated terms in each language) and `post_translations` (connects translated posts). The plugin seems to be fairly lightweight compared to others, and works well with many additional plugins too.
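Several of the approaches above share one idea: the content objects are left alone, and a separate structure records which objects are translations of each other. A minimal sketch of that idea follows. It is not the schema of any plugin listed here, and all names are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class TranslationLink(NamedTuple):
    object_id: int  # id of the post or term
    group_id: int   # shared by all translations of one piece of content
    language: str   # e.g. "en", "fr"


def find_translation(links, object_id, language) -> Optional[int]:
    """Return the id of the translation of object_id in language, if any."""
    group = next((l.group_id for l in links if l.object_id == object_id), None)
    if group is None:
        return None  # this object is not part of any translation group
    return next(
        (l.object_id for l in links if l.group_id == group and l.language == language),
        None,
    )
```

Whether the links live in a custom table or in a taxonomy, the question a theme or plugin needs answered is the same: "which object is the translation of this one in that language?"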

Proposals

There are lots of varied issues which multi-language, translation, and/or localisation plugins and projects seek to solve. WordPress core should not provide a translation or localisation UI and/or workflow; we should continue to rely on the plugin space to address different user scenarios.

We do believe that core could provide some things which would make translation easier for this type of plugin across the ecosystem.

Proposal one: core could provide a minimal way to mark content (e.g. posts, terms) as a particular language.

  • In the simplest case, a single language site, all posts would be implicitly assumed to be in the selected front end language for that site.
  • When a translation/localisation plugin is added, the plugin has the duty to set the language for each piece of content (post, term, etc).
  • If this shipped, it would be, by design, “invisible functionality”, and an example plugin would be useful.
  • How would this affect the WordPress exporter and the importer? The translation/localisation plugin would have the duty to add any UI to the importer/exporter, and core would need to provide the necessary hooks, etc.
  • Should we consider special locales like “no language” and “unknown language” (Drupal does this)? Perhaps core specifies these “locales” as a standard, but doesn’t use them itself.
  • This might be implemented as an additional column on the `wp_posts` and `wp_terms` tables, with associated post and taxonomy API additions and enhancements, which would be available for plugins to use.
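The behaviour proposal one describes can be sketched as follows. This is a Python illustration only: `Post`, `effective_language` and the `SITE_LOCALE` value are hypothetical, not a proposed WordPress API. The codes `zxx` and `und` are the ISO 639 codes for "no linguistic content" and "undetermined", used here as one possible choice for the special locales.

```python
from dataclasses import dataclass
from typing import Optional

SITE_LOCALE = "en_GB"     # hypothetical front end language selected for the site
NO_LANGUAGE = "zxx"       # ISO 639 code: no linguistic content
UNKNOWN_LANGUAGE = "und"  # ISO 639 code: undetermined


@dataclass
class Post:
    post_id: int
    title: str
    language: Optional[str] = None  # None means no plugin has set a language


def effective_language(post, site_locale=SITE_LOCALE):
    # On a single language site nothing sets the column, so the post is
    # implicitly assumed to be in the site's front end language.
    return post.language if post.language is not None else site_locale
```

For example, `effective_language(Post(1, "Hello"))` falls back to the site locale, while a post that a plugin has marked as `"fr_FR"` or `NO_LANGUAGE` keeps that value.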

Proposal two: core could provide a method or standard for translating strings stored in content objects like widgets.

In some contexts it is hard for a translation or localisation plugin to know what requires translation, e.g. in widgets when the data is stored in a blob in the database. It would be useful if core provided a pattern for others to follow to mark particular strings of text as translatable.

Taking the example of a plugin providing a text widget, with user editable title and body fields, this plugin could follow the same standard to make these strings available to translation plugins. A possible implementation might be a filter or set of filters to pass the string for translation, and perhaps also the nature of the string to give a hint for the translation UI required, e.g. “rich text” (perhaps the translation plugin would provide a TinyMCE instance), “plain text” (perhaps a simple text area), etc.
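A minimal sketch of the filter idea, in Python for illustration. The `add_filter` and `apply_filters` helpers here only mimic the general shape of WordPress's hook functions of the same names; the filter name `translatable_string`, the hint values and the string keys are all hypothetical.

```python
_filters = {}


def add_filter(name, callback):
    # Register a callback to run whenever the named filter is applied.
    _filters.setdefault(name, []).append(callback)


def apply_filters(name, value, *args):
    # Pass the value through each registered callback in turn.
    for callback in _filters.get(name, []):
        value = callback(value, *args)
    return value


# Widget plugin side: announce each user-editable string, with a hint
# about its nature so a translation UI can pick a suitable editor.
def render_text_widget(settings, widget_id):
    title = apply_filters(
        "translatable_string", settings["title"], "plain_text", f"{widget_id}:title"
    )
    body = apply_filters(
        "translatable_string", settings["body"], "rich_text", f"{widget_id}:body"
    )
    return title, body


# Translation plugin side: swap in a stored translation when one exists.
TRANSLATIONS = {"text-2:title": "À propos"}


def translate(value, kind, key):
    return TRANSLATIONS.get(key, value)


add_filter("translatable_string", translate)
```

With no translation plugin active, no callbacks are registered and the widget renders its stored strings unchanged, which fits the aim of keeping this invisible on single language sites.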

Other things discussed:

  • Setting the admin area language differently to the front end language, including showing the admin bar in the admin area language – being addressed in #26511
  • Supporting variants on locale, e.g. Portuguese informal, as these cannot be defined within the ISO standard currently – being addressed in #28303

#l10n, #multilingual, #translation