Internationalization goals for 4.0

Every few releases we’ve made a major push for improved internationalization. In 3.7, that was laying the groundwork for pluginPlugin A plugin is a piece of software containing a group of functions that can be added to a WordPress website. They can extend functionality or add new features to your WordPress websites. WordPress plugins are written in the PHP programming language and integrate seamlessly with WordPress. These can be free in the WordPress.org Plugin Directory https://wordpress.org/plugins/ or can be cost-based plugin from a third-party and theme language packs (more on that soon). In 3.4, it was reducing all of the customizations that many localization teams needed to make. (There’s a page on that here.) In 4.0, I’d like to close the loopLoop The Loop is PHP code used by WordPress to display posts. Using The Loop, WordPress processes each post to be displayed on the current page, and formats it according to how it matches specified criteria within The Loop tags. Any HTML or PHP code in the Loop will be processed on each post. https://codex.wordpress.org/The_Loop. on a lot of this. Here’s what I’d like to accomplish, and I’ll need a lot of help.

  1. The first step installing WordPress should be to choose a language. The rest of the install process would then be in that language.
  2. You should be able to choose/switch a language from the general settings screen, after which the language pack should be downloaded.
  3. You should be able to search from the dashboard for plugins and themes that are available in your language.
  4. All localized packages should be able to be automatically generated and made available immediately as part of the coreCore Core is the set of software required to run WordPress. The Core Development Team builds WordPress. release process.
  5. Localized packages should only be used for initial downloads from WordPress.orgWordPress.org The community site where WordPress code is created and shared by the users. This is where you can download the source code for WordPress core, plugins and themes as well as the central location for community conversations and organization. https://wordpress.org/. Instead, language packs should be transparently used for updates.

That’s the what. The how poses extensive challenges.

1) Choosing a language when installing WordPress. The rest of the install process would then be in that language.

osx-language

Want.

Split workflows: One issue that came up in #26879 (“friendlier welcome when installing WordPress”) is that setup-config.php is often not the initial entry point. A user installing WordPress through a hosting control panel is likely to be dumped onto the install screen (if not the dashboard directly).

Thus, we need to keep in mind that either screen might be the “intro” screen. This introduces technical challenges: If the user’s first step is setup-config.php, we don’t actually have WordPress fully loaded at this point, which makes actually installing a language pack more difficult. The install step has WordPress loaded in full, just without database tables. We should look into making setup-config.php load “all” of WordPress to make these environments easier to code.

Different approaches: There are two different ways to approach this: download language files immediately upon selection, or bundle barebones language files of the install screens for all supported languages. The latter is not the easiest thing to do (due to error messages and such, strings can come from all over) and also adds a delay to the adoption of new languages. The former would mean that we would want to pull a languages list from WordPress.org. (If we cannot reach WordPress.org, we would have no way of downloading the complete language pack, so this is not a big deal.) It’s still a bit of a challenge due to the split workflows problem.

Recommending a particular language: WordPress.org already recommends a language based on the browser’s settings (Accept-Language headerHeader The header of your site is typically the first thing people will experience. The masthead or header art located across the top of your page is part of the look and feel of your website. It can influence a visitor’s opinion about your content and you/ your organization’s brand. It may also look different on different screen sizes.) as well as using a more rough lookup based on IP address blocks. (HTML5 location could be used, but unless we’re also going to use that to set the user’s timezone, it seems excessive for a “choose your language” for the moment, especially since location is not as preferred anyway.)

Both would require the client to hit WordPress.org via JavaScriptJavaScript JavaScript or JS is an object-oriented computer programming language commonly used to create interactive effects within web browsers. WordPress makes extensive use of JS for a better user experience. While PHP is executed on the server, JS executes within a user’s browser. https://www.javascript.com/., versus a server HTTPHTTP HTTP is an acronym for Hyper Text Transfer Protocol. HTTP is the underlying protocol used by the World Wide Web and this protocol defines how messages are formatted and transmitted, and what actions Web servers and browsers should take in response to various commands. request. We can do a server-side HTTP request to generate the list, then a client-side request to float recommended ones to the top. It’s possible to do it in two steps: the Accept-Language mapping can be done locally, while the IP-to-location table on WordPress.org has 2.9 million entries and would require a round-trip. Of course, if a user has downloaded a localized package directly from a local WordPress.org site, that would be the top recommendation.

Note that by language or localeLocale A locale is a combination of language and regional dialect. Usually locales correspond to countries, as is the case with Portuguese (Portugal) and Portuguese (Brazil). Other examples of locales include Canadian English and U.S. English. we also must consider other translationtranslation The process (or result) of changing text, words, and display formatting to support another language. Also see localization, internationalization. variants, as there may be a language like Portuguese available in multiple locales (Brazil, Portugal) and further broken down by formal and informal variants. Each of these would be listed as their own “language.” c.f. #28303

2) You should be able to choose/switch a language from the general settings screen, after which the language pack should be downloaded.

WordPress MU had this feature and it still exists in multisitemultisite Used to describe a WordPress installation with a network of multiple blogs, grouped by sites. This installation type has shared users tables, and creates separate database tables for each blog (wp_posts becomes wp_0_posts). See also network, blog, site (though it’s pretty broken in terms of how it handles locales, #13069, #15677, #19760). This however only worked for already installed locales.

For single-site, the available languages should be fetched from WordPress.org and cached in a transient. (#13069 suggests using the GlotPress locale list, but as indicated above, we’d want to handle newly added or updated languages. This can then be displayed in a dropdown on the General Settings page. Upon selection and “Save Changes,” the language pack would be downloaded and enabled. We would likely use the oddly named “WPLANG” option since it’s what MU adopted and uses at both the site and networknetwork (versus site, blog) levels.

If we can’t reach WordPress.org, or if a filterFilter Filters are one of the two types of Hooks https://codex.wordpress.org/Plugin_API/Hooks. They provide a way for functions to modify data of other functions. They are the counterpart to Actions. Unlike Actions, filters are meant to work in an isolated manner, and should never have side effects such as affecting global variables and output. is applied, then only installed languages will be listed. (An untrusted multisite network will likely have a default filter in place, to match existing behavior.) We should try to cache failures to .org (generally — not just here) so things aren’t terribly slow when developing a site without internet. We could possibly lazy-fetch the list only when the user expands the dropdown (or clicks a “Change” link or something).

Note this does not address two situations in particular: user-specific languages, or using the dashboard in English. These will be easier to do thanks to improvements in 4.0, but there still exist a number of edge cases where persistent translations can “bleed” into other situations. Examples include a comment moderation email being emailed to an English speaker but sent in the language of the logged-in commenter; or a string stored in the database like image metadata. When using a plugin, these are “edge cases.” When core adopts them, these become “bugs.” We will need to introduce a locale-switching method not unlike switch_to_blog() and also identify and work around any “persistence” issues. Not for 4.0, though we can get started on the groundwork.

3) You should be able to search from the dashboard for plugins and themes that are available in your language.

With language packs (more on that soon), plugins and themes will be able to be translated into any language. The goal would then be to have “localized” plugin and theme directories on the translated WordPress.org sites, listing just the plugins or themes available in their language. Note that even a plugin or theme without strings have a name and description that can be translated.

The plugin and theme install screens in the dashboard should similarly gain this ability. In all cases, there would be an option to expand the search to plugins/themes available in any language, of course — along with the invitation to translate them.

We will need to figure out how to make readme files translatable, which get parsed for the plugin directory (and eventually themes will gain these as well, possibly as part of these initiatives).

4) All localized packages should be able to be automatically generated and made available immediately as part of the core release process.

A localized package should merely consist of core WordPress plus a few translated files (the readme, the license, and the sample config file) and the PO/MO files. No other alterations should occur, which means translation teams will only need to translate, and builds can be automatically generated and made available immediately as part of the core release process.

Local modifications. Before 3.4, every install needed to make modifications to WordPress as part of their localized package, whether by actually replacing core files or using a {$locale}.php file. By my audit, this now applies to only 14 locales.

At the very least, we can allow the current system to work as-is for these locales. Ideally, though, we address the specific concerns of each locale and bring as many as possible into the fold. They include:

  • Some CSSCSS Cascading Style Sheets. modifications we can incorporate into core.
  • Workarounds for declension issues in a few locales, specifically regarding the names of months. #11226, also #21139, #24899, #22225
  • Adding major locale-specific oEmbed providers. #19980
  • Transliteration such as Romanization for Serbian. We can handle this by having two variants of Serbian, the way we have formal and informal European Portuguese. (But we’d automate these language packs, rather than forcing translations to happen twice.)
  • Significant changes in Uyghur (#19950), Farsi, Chinese, and Japanese (including the multibyte patchpatch A special text file that describes changes to code, by identifying the files and lines which are added, removed, and altered. It may also be referred to as a diff. A patch can be applied to a codebase for testing. plugin).

Additional concerns:

License. Many locales also have an unofficial translation of the license. We should be able to collect any of these in a repository and include it automatically in a package. (Note that link is an explainer; no GPLGPL GNU General Public License. Also see copyright license. version 2 translations are available from that page.)

Readme. Some locales directly translate readme.htmlHTML HyperText Markup Language. The semantic scripting language primarily used for outputting content in web browsers.. Others add a second file and use one of two forms: the translated word “readme” (example: leame.html would be Spanish) or using a suffix (“readme-ja.html” for Japanese). We should standardize this somehow. Technically the readme changes with each version due to the version bump, but we can automate that. Bigger readme changes are fairly rare, but we’ll still need to figure out how to have these translated and tracked.

wp-config-sample.php. This one is especially interesting. If a user has chosen a language on install, we don’t need the readme or license but we do need this file, as setup-config.php will display its contents in a textarea if we’re unable to write the file. We can store this file as a single translated string in a PO/MO file (in some automated fashion) or as part of the APIAPI An API or Application Programming Interface is a software intermediary that allows programs to interact with each other and share data in limited, clearly defined ways. response. We will also need some kind of way to translate and track this file to be included in localized packages that are downloaded.

5) Localized packages should only be used for initial downloads from WordPress.org. Instead, language packs should be transparently used for updates.

The WordPress.org API is set up to return a series of update “offers” — one of each type (“latest” a.k.a. reinstall, “update” and “autoupdate”), in both the site’s language and in English. This results in awkward double-upgrades even when no strings have been changed (updating first to English, then to the locale when it is available for that version) and is a lame experience. While auto-generating localized packages immediately will help, there’s no reason to actually use localized packages for updates. Any locale without local modifications can be set up as a language pack now.

As of 3.7, WordPress core supports receiving instructions for language packs. We’d simply stop issuing localized package offers, start issuing language pack offers, and rip out some code on update-core.php that has been handling the multiple-offer dance. This is the easiest of the five 4.0 tasks but is dependent on the very complex fourth task.

Next steps.

Here’s an overall outline of the things we need to do. I’ll be creating relevant core and metaMeta Meta is a term that refers to the inside workings of a group. For us, this is the team that works on internal WordPress sites like WordCamp Central and Make WordPress. TracTrac An open source project by Edgewall Software that serves as a bug tracker and project management tool for WordPress. tickets in the coming days, and I’ve also referenced a few related tickets I was able to find.

Biggest problems to solve:

  • Figure out how to handle readmes, licenses, and wp-config-sample.php. This includes plugin (and theme) readmes as well. Remember we don’t want plugin developers to need to be involved in any way; and also that updates to these files will need to be handled elegantly (such that we have both ‘stable’ and ‘development’ tracks).
  • Eliminate all code modifications across all locales (as much as possible). Related, #20974.

WordPress.org/API work:

  • Create an API for core to use to fetch available locales for download.
  • Create an API for core to use to recommend a locale based on browser settings/location.
  • Auto-generate localized packages and core language packs.
  • Serve core language packs for updates, not localized packages. Related: #23113, #27164, #26914, #27752.
  • Mirror the plugin and theme directories to locale sites, with full translations (including readme data) and with selective searching.

Core work:

  • Add the ability to install a new language pack on demand.
  • Add a “switch to language” method, for languages that are installed. #26511
  • Add a screen that allows a locale to be chosen. May need to change how setup-config.php bootstraps WordPress.
  • Add a language chooser to the General Settings screen. #15677
  • Support searching for plugins/themes in a particular language, as well as leveraging translated data fields that come through from the API response. This mostly just means sending back the site’s language to WordPress.org. It also requires UIUI User interface to reveal results from any languages. We could possibly filter/hide/show on the client side, if results were not paginated.
  • Take a machete to update-core.php’s localized build handling once everything else is done. Related: #25712, #28199.

#4-0, #i18n