I18N Performance Analysis

A recent in-depth performance analysis of WordPress core showed that loading translations had a significant hit on a site’s server response time. Given that more than half of all WordPress sites use a language other than English (US), the performance team identified this as an area worth looking into more closely. The team spent the last couple of months exploring this in more detail and the results are now shared in this blog post.

This is merely an analysis of the current i18n system in WordPress with some proposed under-the-hood performance improvements. No decisions have been made on any of these proposals.

Context

Initial benchmarks showed that the median loading time for a localized site can be up to 50% slower than for non-localized sites, depending on which themes and plugins are being used. This was measured using both the wpp-research CLI tool and also a dedicated benchmark environment (as elaborated in the Comparison section towards the end).

The WordPress i18n system is based on gettext, which uses source .po (Portable Object) files and binary .mo (Machine Object) files for storing and loading translations. It does not use the C gettext API itself but a custom userland implementation that works without any external dependencies.

In addition to core itself, each plugin and theme has its own translation file, which has to be loaded and parsed on every request. Loading and parsing all these translation files is an expensive task.

In the past, various solutions have been discussed and explored to improve the i18n performance of WordPress. A non-exhaustive list:

  • Use a more lightweight MO parser
  • Improve translation lookups by using the hash map in MO files (e.g. with DynaMo)
  • Caching translations in the object cache
  • Caching translations in APCu (an in-memory key-value store for PHP)
  • Other more elaborate forms of caching (e.g. per request)
  • Using the native PHP gettext extension
  • Using a custom PHP extension to handle the MO file parsing
  • Using lazily evaluated translation calls (see #41305 for details)
  • Using a different file format than .mo files, e.g. plain .php files

A more recent discussion touching on all of these solutions can be found over at the wordpress/performance repository. It’s a great way to get some context on this topic.

For this analysis, many of these solutions were looked at, focusing on their advantages and disadvantages. At the end of this post there is a comparison table with some much needed numbers as well, based on custom-built benchmarks.

Solutions

Solution A: Use different file format

Use a different file format for translations instead of .mo files to avoid the overhead of loading and parsing binary files.

Design considerations

With this solution, translations will be stored in plain .php files returning an associative array of translation strings. Whenever a .php file is available, it will be preferred over the .mo file, which is still used as a fallback. The rest of the architecture remains the same.
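The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code from the proofs of concept: the helper name `load_translations_with_fallback()`, the file naming, and the flat array shape are assumptions made for the example.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): prefer a .php translation
 * file and fall back to a caller-supplied .mo loader.
 */
function load_translations_with_fallback( string $base_path, callable $mo_loader ): array {
	$php_file = $base_path . '.php';
	$mo_file  = $base_path . '.mo';

	if ( is_readable( $php_file ) ) {
		// The file simply returns an array, so OPcache can keep it compiled.
		$translations = include $php_file;
		if ( is_array( $translations ) ) {
			return $translations;
		}
	}

	if ( is_readable( $mo_file ) ) {
		return $mo_loader( $mo_file );
	}

	return array(); // No translations available for this base path.
}

// Demo: write an illustrative translation file to a temporary location.
$base = sys_get_temp_dir() . '/example-de_DE-' . uniqid();
file_put_contents(
	$base . '.php',
	"<?php\nreturn array( 'Hello' => 'Hallo', 'Settings' => 'Einstellungen' );\n"
);

$translations = load_translations_with_fallback(
	$base,
	static function ( string $mo_file ): array {
		return array(); // Placeholder for a real .mo parser.
	}
);

echo $translations['Hello'] ?? 'Hello', "\n";
unlink( $base . '.php' );
```

Because the .mo loader is only invoked when no .php file is readable, sites that never receive the new files keep working as before.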

When a localized WordPress site downloads language packs from the translate.wordpress.org translation platform, it downloads .po and .mo files containing all the translations. This will be modified to include .php files. GlotPress, which the platform is built on, will be updated to support this new output format. Additionally, WordPress core itself could be modified to generate PHP files whenever they are missing.

In theory, nothing is faster in PHP than loading and executing another PHP file. .json, .ini, or .xml would all be much slower.

Proofs of concept using PHP files can be found at swissspidy/wp-php-translation-files and swissspidy/ginger-mo.

Benefits

  • Initial benchmarks show consistent significant performance improvements
  • Relatively trivial to implement
  • Maintains backward compatibility thanks to graceful fallback
  • Makes it easier for users to inspect and change translations (no need to compile .po to .mo)
  • Avoids loading and parsing binary .mo files, which is the main bottleneck
  • Lets PHP store translations in OPcache for an additional performance benefit
  • Battle-tested approach in the PHP ecosystem (for example in Laravel)

Caveats and risks

  • Requires not only changes to WordPress core, but also tools like GlotPress and WP-CLI
  • Adds maintenance overhead by introducing a new file format on top of the existing one
    • As shown by the proof of concept, the overhead is minimal
    • In the long term, .mo support could be deprecated
  • Security considerations due to essentially executing remotely fetched PHP files
    • Not really different from downloading plugins/themes from WordPress.org
    • WordPress considers translations to be trusted
    • Hosting providers could be blocking PHP execution in wp-content/languages
    • Could potentially use checksum verifications or static analysis at install time to detect anomalies
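As an illustration of the checksum idea in the last bullet, a verification step at install time could look roughly like this. It is only a sketch: the function name is hypothetical, and it assumes that a SHA-256 checksum would be published alongside each language pack, which is not something the current platform is stated to do.

```php
<?php
/**
 * Hypothetical install-time check: compare a downloaded translation file
 * against a checksum assumed to be published with the language pack.
 */
function translation_file_matches_checksum( string $file, string $expected_sha256 ): bool {
	if ( ! is_readable( $file ) ) {
		return false;
	}

	$actual = hash_file( 'sha256', $file );

	// hash_equals() compares in constant time.
	return is_string( $actual ) && hash_equals( strtolower( $expected_sha256 ), $actual );
}

// Demo with a temporary file standing in for a downloaded translation file.
$file = tempnam( sys_get_temp_dir(), 'i18n' );
file_put_contents( $file, "<?php\nreturn array( 'Hello' => 'Hallo' );\n" );
$published = hash( 'sha256', file_get_contents( $file ) );

var_dump( translation_file_matches_checksum( $file, $published ) ); // bool(true)

file_put_contents( $file, "<?php\n// modified after download\n" );
var_dump( translation_file_matches_checksum( $file, $published ) ); // bool(false)

unlink( $file );
```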

Effort and timeline

The proof of concept using PHP files is in a very solid state already. There are also examples for changes to WP-CLI (PR) and GlotPress (PR). This makes it suitable for a feature project to expand testing with very little effort required. Even a core merge would be very straightforward in a relatively short time, potentially already in Q4 2023. The security aspect when using PHP files could be a potential blocker, so it’s important to loop in the WordPress security team and hosting providers early on.

More time is required to test other file formats and compare results.

Solution B: Native gettext extension

Use the native gettext PHP extension written in C when available, instead of the custom built-in parser in WordPress.

Design considerations

WordPress has always used a custom MO file parser, because the native gettext extension is not necessarily available on the server. With this solution, the existing system is adapted to use the extension whenever available and falls back to the custom implementation if not.

This has been previously explored in #17268 and implemented in WP Performance Pack and Native Gettext. These implementations can serve as inspiration for the initial design. They all work similarly in that they symlink or copy the translation files to a new directory structure that is compatible with the gettext extension.

As of July 2023, around 66% of all localized WordPress sites have the gettext extension installed, according to information from the WordPress update requests.

Benefits

  • Significant performance improvements for eligible sites
    • Initial benchmarks show that loading time and memory usage basically do not differ from non-localized sites

Caveats and risks

  • The gettext extension is not universally available (roughly a third of localized sites lack it)
    • Smaller incentive to implement and lower impact overall
  • Requires locales to be installed on the server
    • Servers rarely have many installed locales
      • Locales often need to be compiled first and take up a lot of space
      • WordPress on the other hand supports over 200 locales
    • Potential clashes with the custom locales WordPress supports
      • For example, locales like pt_PT_ao90, de_DE_formal or roh might not even be supported
    • Outreach to hosting providers would be necessary
  • Adds maintenance overhead by essentially adding a second gettext implementation
  • Poor API
    • Requires setting environment variables (such as LC_MESSAGES and LANGUAGE), which might not be possible or cause conflicts on certain servers/sites
  • Requires symlinks or hard file copies
    • Symlinks might not be possible on the server; copying all translation files means doubling disk usage
  • Translation files are cached by PHP, thus any translation change requires restarting the web server
    • There are workarounds such as cache busting using random file names or fstat, however they might not work on all environments
  • Has not been tested on a wider scale, despite being discussed for years

Check out the code of WP Performance Pack and Native Gettext to get a better idea of the extension’s poor API.
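For readers who have not used the extension, the following sketch shows the kind of call sequence involved. The wrapper function, the text domain and the directory are placeholders invented for this example; the plugins mentioned above contain considerably more handling for edge cases.

```php
<?php
/**
 * Hypothetical wrapper: translate via the native extension when possible,
 * otherwise return the text unchanged (where WordPress would use its own parser).
 */
function native_translate( string $text, string $domain, string $dir, string $locale ): string {
	if ( ! extension_loaded( 'gettext' ) ) {
		return $text;
	}

	// The extension is driven by process-wide state rather than by arguments.
	putenv( 'LANGUAGE=' . $locale );
	putenv( 'LC_MESSAGES=' . $locale );
	if ( defined( 'LC_MESSAGES' ) ) {
		setlocale( LC_MESSAGES, $locale ); // Returns false if the locale is not installed.
	}

	// Expects <dir>/<locale>/LC_MESSAGES/<domain>.mo, hence the symlinks or copies.
	bindtextdomain( $domain, $dir );
	bind_textdomain_codeset( $domain, 'UTF-8' );

	return dgettext( $domain, $text );
}

// Prints the untranslated string unless a matching .mo file exists in that layout.
echo native_translate( 'Hello', 'example-plugin', sys_get_temp_dir(), 'de_DE.UTF-8' ), "\n";
```

Note how the locale has to be communicated through environment variables and `setlocale()`, which affect the whole PHP process rather than a single call.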

Effort and timeline

While there are existing implementations that could be leveraged for this solution, further field testing is required to assess whether the extension actually works under all circumstances. Given the limitations around the poor API and requirements for installing locales, it does not seem like a viable solution at all.

Solution C: Cache translations

Cache translations somehow to avoid expensive .mo parsing.

Design considerations

Cache translations either on disk, in the database, or the object cache to avoid expensive .mo file parsing on subsequent requests. This can be done in a generalized manner or also on a per-request basis to only load translations required for the current URL.

Many different caching strategies have been explored in various forms in the past, each with their own pros and cons. Some could even be combined. Defining the exact implementation requires further exploration and testing, which warrants its own exploration post.
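To make the general pattern concrete, here is one possible cache-aside sketch. It is not one of the explored implementations: the helper name and cache key scheme are assumptions, APCu is only used when the extension is present and enabled, and `$parse` stands in for the .mo parser.

```php
<?php
/**
 * Hypothetical cache-aside helper. Uses APCu when available and a
 * per-request static array otherwise.
 */
function get_cached_translations( string $mo_file, callable $parse ): array {
	static $request_cache = array();

	// Include the modification time so an updated file gets a new cache entry.
	$mtime = is_file( $mo_file ) ? filemtime( $mo_file ) : 0;
	$key   = 'i18n:' . md5( $mo_file . ':' . $mtime );

	if ( isset( $request_cache[ $key ] ) ) {
		return $request_cache[ $key ];
	}

	$use_apcu = function_exists( 'apcu_enabled' ) && apcu_enabled();
	if ( $use_apcu ) {
		$hit    = false;
		$cached = apcu_fetch( $key, $hit );
		if ( $hit && is_array( $cached ) ) {
			return $request_cache[ $key ] = $cached;
		}
	}

	$translations = $parse( $mo_file ); // The expensive step on a cold cache.
	if ( $use_apcu ) {
		apcu_store( $key, $translations );
	}

	return $request_cache[ $key ] = $translations;
}

// Demo: the parser callback only runs once for repeated lookups.
$calls  = 0;
$parser = static function ( string $file ) use ( &$calls ): array {
	$calls++;
	return array( 'Hello' => 'Hallo' );
};

get_cached_translations( '/path/to/example-de_DE.mo', $parser );
$result = get_cached_translations( '/path/to/example-de_DE.mo', $parser );

echo $result['Hello'], ' (parser calls: ', $calls, ")\n";
```

The sketch also shows two of the trade-offs listed below: a cold cache still pays the full parsing cost, and the cached arrays take up memory in addition to whatever the request itself holds.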

Benefits

  • Caching translations after one-time .mo parsing potentially improves performance for future requests

Caveats and risks

  • Caching using persistent object cache (e.g. Memcached, Redis) or APCu:
    • Not available on most sites, making this not an ideal solution
      • Availability according to data from WordPress update requests:
        • Memcached: ~25%
        • Redis: ~25%
        • APCu: ~6%
    • Can potentially significantly increase cache size or exceed cache key limits
  • Database caching:
    • Moves the problem from disk reads to database reads
    • Can potentially significantly increase database size
    • Alternatively, use sqlite as a cache backend
      • Untested approach
      • Available on around 90% of sites
  • Disk caching:
    • Not always possible, depending on server environment
    • Still causes file reads, only with fewer or other files
  • Multiple cache groups (e.g. per-request or frontend/admin split)
    • Smarter cache logic to only load translations that are needed for the majority of requests
    • Can potentially significantly increase cache size
    • Unlikely that different requests use very different translations
  • Cache retrieval adds overhead
    • Exact performance gains depend on implementation method and need to be measured first
    • No performance gains with cold cache
    • Cache invalidation logic TBD

Effort and timeline

Given the existing solutions in the ecosystem, the engineering effort itself would not be too big, but the right caching implementation (e.g. disk cache or object cache) needs to be evaluated first.

However, the right caching strategy probably does not exist because of all the different hosting environments. Since it’s unrealistic for core to support multiple types of caching, this solution seems better suited for plugins rather than core.

Solution D: Lazily evaluated translation calls

Use lazily evaluated translation calls to reduce the number of function calls in certain cases, leading to improved performance.

Design considerations

The idea of lazily evaluated translation calls was first discussed in #41305. It avoids expensive string-specific translation lookups until the translations are actually needed, by passing around proxy objects.

In other words: beyond just-in-time loading of translation files (which WordPress already does), this would add just-in-time lookup of individual strings in the translations. Check out this proof of concept to get a better picture.
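A proxy object of this kind can be sketched in a few lines. The class below is purely illustrative and is not the code from the linked proof of concept; the class name and the plain array used as a translation source are assumptions.

```php
<?php
/** Hypothetical proxy: defers the lookup until the string is actually used. */
final class Lazy_Translation {
	/** Number of lookups performed so far, for demonstration purposes. */
	public static int $lookups = 0;

	private string $text;
	private array $translations;
	private ?string $resolved = null;

	public function __construct( string $text, array $translations ) {
		$this->text         = $text;
		$this->translations = $translations;
	}

	public function __toString(): string {
		if ( null === $this->resolved ) {
			self::$lookups++;
			$this->resolved = $this->translations[ $this->text ] ?? $this->text;
		}
		return $this->resolved;
	}
}

$translations = array(
	'Title'       => 'Titel',
	'Description' => 'Beschreibung',
);

// Two translation calls are made, e.g. while building a schema...
$schema = array(
	'title'       => new Lazy_Translation( 'Title', $translations ),
	'description' => new Lazy_Translation( 'Description', $translations ),
);

// ...but only the string that is actually output gets looked up.
echo $schema['title'], "\n";
echo Lazy_Translation::$lookups, "\n";
```

The example also hints at the cost side: every call now allocates an object, which matters when thousands of strings are involved.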

It can be integrated essentially in two ways, both of which are explained on the core ticket:

  1. Change all translation calls to be lazily evaluated by default
  2. Make this opt-in, either with new function arguments or new functions altogether

Benefits

  • Reduces the number of translation lookups, in some scenarios drastically
    • On a regular home page request there are ~60% fewer translation calls, saving around 10 ms (as measured by XHProf)
  • As a side effect, solves UX issues such as #38643

Caveats and risks

  • Depending on implementation this either breaks backward compatibility or risks not gaining enough adoption
    • Documentation, tooling, and developer education can help mitigate this to a certain extent
    • Adoption could be done gradually, e.g. starting with an opt-in approach and eventually making it the default
  • Likely will not have a significant impact on typical frontend page loads, as it’s mostly useful for areas like the REST API schema output, where a lot of translation calls are made without actually using the translations
    • Needs analysis in more scenarios to measure impact
    • The REST API schema already has a workaround by using a cache in a static variable
  • Does not improve situation for actually loading translation files
  • Initial testing shows that this actually hurts performance due to the additional thousands of proxy objects being created

Effort and timeline

Gradual adoption would mean a multi-year effort to establish lazily evaluated translation calls, while enabling this by default is a significant backward compatibility break that could affect thousands of plugins and themes in the ecosystem. And since it does actually slow down performance in some cases, this solution is not a great candidate for implementation.

Solution E: Optimize/Rewrite existing MO parser

Refactor the existing MO parser in WordPress to be more performant.

Design considerations

Completely overhaul the existing MO translation file parser in WordPress with performance in mind. For example by using Ginger MO, WP Performance Pack, or other existing solutions as a base.

While for instance Altis DXP (Human Made) have actually replaced the existing MO parser with a custom-made PHP extension written in Rust, such an approach is obviously not feasible for core. The new solution needs to be written in userland PHP.

Initial tests with an updated fork of Ginger MO show some noticeable speedups and lower memory usage. It also supports multiple translation files per text domain and multiple locales loaded at once, which could prove beneficial for improving the locale switching functionality in WordPress core.

Besides that, plugins like WP Performance Pack and DynaMo have implemented partial lookups using the MO hash table or binary search, avoiding reading the whole file and storing it in memory. That reduces both memory usage and loading time.
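The binary search variant relies on the fact that the original strings in an .mo file are stored in sorted order. The sketch below is not taken from either plugin: it builds a tiny little-endian .mo file in memory and looks up single strings by seeking, and it ignores contexts, plural forms, big-endian files and the optional hash table. The function names are made up, and a 64-bit PHP build is assumed.

```php
<?php
/** Builds a minimal little-endian .mo file from original => translation pairs. */
function build_mo( array $pairs ): string {
	ksort( $pairs, SORT_STRING ); // Originals must be sorted for binary search.
	$n       = count( $pairs );
	$o_table = 28;                 // The header is 7 x 4 bytes.
	$t_table = $o_table + 8 * $n;
	$offset  = $t_table + 8 * $n;
	$o_index = $t_index = $strings = '';

	foreach ( array_keys( $pairs ) as $original ) {
		$original = (string) $original;
		$o_index .= pack( 'VV', strlen( $original ), $offset );
		$strings .= $original . "\0";
		$offset  += strlen( $original ) + 1;
	}
	foreach ( $pairs as $translation ) {
		$t_index .= pack( 'VV', strlen( $translation ), $offset );
		$strings .= $translation . "\0";
		$offset  += strlen( $translation ) + 1;
	}

	// Magic number, revision, string count, table offsets, empty hash table.
	return pack( 'V7', 0x950412de, 0, $n, $o_table, $t_table, 0, $offset ) . $o_index . $t_index . $strings;
}

/** Looks up one string by binary search without loading the whole file. */
function mo_lookup( $handle, string $original ): ?string {
	fseek( $handle, 0 );
	$header = unpack( 'Vmagic/Vrevision/Vcount/Voriginals/Vtranslations', fread( $handle, 20 ) );
	if ( 0x950412de !== $header['magic'] ) {
		return null; // Only little-endian files are handled in this sketch.
	}

	$read_entry = static function ( int $table, int $index ) use ( $handle ): string {
		fseek( $handle, $table + 8 * $index );
		$entry = unpack( 'Vlength/Voffset', fread( $handle, 8 ) );
		if ( 0 === $entry['length'] ) {
			return '';
		}
		fseek( $handle, $entry['offset'] );
		return fread( $handle, $entry['length'] );
	};

	$low  = 0;
	$high = $header['count'] - 1;
	while ( $low <= $high ) {
		$mid = intdiv( $low + $high, 2 );
		$cmp = strcmp( $original, $read_entry( $header['originals'], $mid ) );
		if ( 0 === $cmp ) {
			return $read_entry( $header['translations'], $mid );
		}
		if ( $cmp < 0 ) {
			$high = $mid - 1;
		} else {
			$low = $mid + 1;
		}
	}

	return null; // Not found.
}

$handle = fopen( 'php://memory', 'w+b' );
fwrite( $handle, build_mo( array( 'Settings' => 'Einstellungen', 'Hello' => 'Hallo', 'Save' => 'Speichern' ) ) );

echo mo_lookup( $handle, 'Save' ), "\n"; // Speichern
var_dump( mo_lookup( $handle, 'Missing' ) ); // NULL
fclose( $handle );
```

Each lookup touches only a handful of index entries instead of the whole file, which is where the memory savings come from. The price is additional seeks and reads per string.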

Benefits

  • Can be used without necessarily introducing another file format
  • Opens up potential performance enhancements in other areas, i.e. locale switching
  • Mostly maintains backward compatibility

Caveats and risks

  • Still a risk of breaking backward compatibility

Effort and timeline

There already is a working proof of concept for this solution, but more testing is required to further refine it and improve its backward compatibility layer. With such an effort being an ideal candidate for a feature plugin, this could be achieved relatively quickly in a few months.

Solution F: Splitting up translation files

Split translation files from plugins and themes into smaller chunks to make loading them more efficient.

Design considerations

Depending on the project’s size, translation files can be quite big. That’s why WordPress itself uses separate translation files for the admin and everything else, so that not too many strings are unnecessarily loaded.

This strategy could be applied to plugins and themes as well, either by allowing them to use multiple text domains (which would require developer education and changes to tooling), or by somehow doing this automatically (exact method TBD).

Benefits

  • Faster loading times due to loading smaller files

Caveats and risks

  • Risk of breaking backward compatibility
  • Opt-in approach requires tooling and distribution changes and risks slow adoption

Effort and timeline

Further research is required to evaluate this properly.

Comparison

At first glance, solution A (PHP translation files) is a relatively straightforward enhancement that maintains backward compatibility and shows promising improvements. However, it requires changes not only to core itself, but also to the translation platform. The security aspect remains a risk, although discussing it early on with stakeholders and gathering more testers would help mitigate it.

Leveraging the native gettext extension as in solution B shows stunning results, but the lack of availability and the non-ideal API are a concern. Still, it’s a progressive enhancement that cannot be ignored. Especially since it could pretty much eliminate the need for additional caching as in solution C.

Caching already loaded translations as in solution C does not eliminate the root cause of the i18n slowness, but can speed up subsequent requests. Unfortunately, persistent object caches or APCu are rather uncommon (though we do not have exact data on the former yet, see #58808), and implementing more complex types of caching (e.g. per-request caching) would require significant exploration effort before becoming a viable option.

Lazily evaluated translation calls (solution D) can shave time off translation calls in some situations, but overall actually decrease performance. While it could help solve some actual UX issues in core, the backward compatibility and adoption concerns make it even less of a suitable solution.

Existing plugins like Ginger MO and WP Performance Pack show that the existing MO parser in WordPress can be further improved (solution E).

Benchmarks

Now to the most interesting part: the hard numbers!

These benchmarks are powered by a custom-built performance testing environment using @wordpress/env and Playwright. The environment has been configured with some additional plugins and the PHP extensions required for some of the solutions. Tests have been performed against the 6.3 RC (release candidate) by visiting the home page and the dashboard 30 times each and then using the median values.

You can find the exact setup in this wp-i18n-benchmarks GitHub repository.

Block Theme

Locale | Scenario              | Object Cache | Memory Usage | Total Load Time
en_US  | Default               | No           | 15.60 MB     | 133.58 ms
de_DE  | Default               | No           | 29.14 MB     | 181.95 ms
de_DE  | Ginger MO (MO)        | No           | 19.24 MB     | 159.18 ms
de_DE  | Ginger MO (PHP)       | No           | 16.98 MB     | 138.14 ms
de_DE  | Ginger MO (JSON)      | No           | 19.24 MB     | 153.39 ms
de_DE  | Native Gettext        | No           | 15.99 MB     | 142.12 ms
de_DE  | DynaMo                | No           | 19.62 MB     | 157.93 ms
de_DE  | Cache in APCu         | No           | 50.37 MB     | 181.51 ms
en_US  | Default               | Yes          | 15.67 MB     | 121.53 ms
de_DE  | Default               | Yes          | 29.01 MB     | 167.67 ms
de_DE  | Ginger MO (MO)        | Yes          | 19.11 MB     | 147.19 ms
de_DE  | Ginger MO (PHP)       | Yes          | 16.85 MB     | 127.97 ms
de_DE  | Ginger MO (JSON)      | Yes          | 19.11 MB     | 144.43 ms
de_DE  | Native Gettext        | Yes          | 15.86 MB     | 129.19 ms
de_DE  | DynaMo                | Yes          | 18.57 MB     | 133.46 ms
de_DE  | Cache in APCu         | Yes          | 50.30 MB     | 170.19 ms
de_DE  | Cache in object cache | Yes          | 29.07 MB     | 173.19 ms
Benchmarks using the Twenty Twenty-Three block theme
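To relate such numbers to the relative slowdown mentioned in the Context section, the arithmetic can be spelled out as below. The two helper functions are illustrative and not part of the benchmark tooling; the sample timings passed to `median()` are made up, while the two values passed to `overhead_percent()` are the en_US and de_DE "Default" load times from the first rows of the block theme table.

```php
<?php
/** Median of a non-empty list of measurements. */
function median( array $values ): float {
	if ( array() === $values ) {
		throw new InvalidArgumentException( 'At least one value is required.' );
	}
	sort( $values );
	$count = count( $values );
	$mid   = intdiv( $count, 2 );

	return 0 === $count % 2
		? ( $values[ $mid - 1 ] + $values[ $mid ] ) / 2
		: (float) $values[ $mid ];
}

/** Relative slowdown of $localized compared to $baseline, in percent. */
function overhead_percent( float $baseline, float $localized ): float {
	return ( $localized - $baseline ) / $baseline * 100;
}

// Made-up sample timings in ms, only to show the median step.
echo median( array( 130.2, 135.9, 133.5, 140.1 ) ), "\n"; // 134.7

// en_US Default vs. de_DE Default from the first rows of the block theme table.
printf( "%.1f%%\n", overhead_percent( 133.58, 181.95 ) ); // 36.2%
```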

Classic Theme

Locale | Scenario              | Object Cache | Memory Usage | Total Load Time
en_US  | Default               | No           | 15.35 MB     | 120.79 ms
de_DE  | Default               | No           | 28.79 MB     | 172.10 ms
de_DE  | Ginger MO (MO)        | No           | 18.85 MB     | 145.68 ms
de_DE  | Ginger MO (PHP)       | No           | 16.56 MB     | 124.73 ms
de_DE  | Ginger MO (JSON)      | No           | 18.84 MB     | 140.78 ms
de_DE  | Native Gettext        | No           | 15.58 MB     | 128.26 ms
de_DE  | DynaMo                | No           | 19.24 MB     | 146.09 ms
de_DE  | Cache in APCu         | No           | 50.13 MB     | 167.28 ms
en_US  | Default               | Yes          | 15.19 MB     | 107.26 ms
de_DE  | Default               | Yes          | 28.59 MB     | 154.30 ms
de_DE  | Ginger MO (MO)        | Yes          | 18.64 MB     | 133.21 ms
de_DE  | Ginger MO (PHP)       | Yes          | 16.37 MB     | 112.94 ms
de_DE  | Ginger MO (JSON)      | Yes          | 18.64 MB     | 128.94 ms
de_DE  | Native Gettext        | Yes          | 15.38 MB     | 115.11 ms
de_DE  | DynaMo                | Yes          | 18.10 MB     | 120.72 ms
de_DE  | Cache in APCu         | Yes          | 49.99 MB     | 151.82 ms
de_DE  | Cache in object cache | Yes          | 28.65 MB     | 156.36 ms
Benchmarks using the Twenty Twenty-One classic theme

Admin

Locale | Scenario              | Object Cache | Memory Usage | Total Load Time
en_US  | Default               | No           | 15.42 MB     | 139.83 ms
de_DE  | Default               | No           | 31.92 MB     | 187.76 ms
de_DE  | Ginger MO (MO)        | No           | 20.07 MB     | 164.94 ms
de_DE  | Ginger MO (PHP)       | No           | 17.09 MB     | 139.66 ms
de_DE  | Ginger MO (JSON)      | No           | 20.06 MB     | 160.87 ms
de_DE  | Native Gettext        | No           | 15.95 MB     | 143.43 ms
de_DE  | DynaMo                | No           | 20.58 MB     | 166.79 ms
de_DE  | Cache in APCu         | No           | 58.13 MB     | 190.38 ms
en_US  | Default               | Yes          | 15.66 MB     | 112.69 ms
de_DE  | Default               | Yes          | 31.84 MB     | 164.26 ms
de_DE  | Ginger MO (MO)        | Yes          | 19.99 MB     | 140.70 ms
de_DE  | Ginger MO (PHP)       | Yes          | 17.01 MB     | 118.52 ms
de_DE  | Ginger MO (JSON)      | Yes          | 19.98 MB     | 138.49 ms
de_DE  | Native Gettext        | Yes          | 15.87 MB     | 120.01 ms
de_DE  | DynaMo                | Yes          | 19.73 MB     | 120.26 ms
de_DE  | Cache in APCu         | Yes          | 58.07 MB     | 162.41 ms
de_DE  | Cache in object cache | Yes          | 31.86 MB     | 164.28 ms
Benchmarks visiting the WordPress admin

Conclusion

Finding the right path forward means weighing all the pros and cons of each solution and looking at both horizontal and vertical impact, i.e. how much faster can i18n be made for how many sites.

When looking at all these factors, it appears that a revamped translations parser (solution E) could bring the most significant improvements to all localized WordPress sites. Especially when combined with a new PHP translation file format (solution A), which Ginger MO supports, the i18n overhead becomes negligible. Of course the same risks associated with introducing a new format apply.

On top of that, a revamped i18n library like Ginger MO could also be combined with other solutions such as caching or dynamic MO loading to potentially gain further improvements. However, those routes have yet to be explored.

Next steps

The WordPress performance team wants to further dive into this topic and test some of the above solutions (and combinations thereof) on a wider scale through efforts like the Performance Lab feature project. We are looking forward to hearing your feedback on this analysis and welcome any additional comments, insights, and tinkering.

Deadline August 6, 2023

After the deadline passes, the performance team will discuss the received feedback and determine next steps.


Thank you to @flixos90, @westonruter, @joemcgill, @spacedmonkey, and @adamsilverstein for reviewing and helping with this post. Thank you to @nbachiyski, @ocean90, @akirk, @rmccue, @dd32 for providing valuable insights and context.

#core, #i18n, #performance

I18N Improvements in 6.3

Various internationalization (i18n) improvements are in WordPress 6.3, and this developers note will focus on these.

Allow to short-circuit load_textdomain()

In #58035 / [55928], a new pre_load_textdomain filter was introduced. This is useful for plugins to develop and test alternative loading/caching strategies for translations. This brings consistency with the existing pre_load_script_translations filter for JavaScript translations.
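As a rough illustration of how a plugin might use the filter for experiments, the following sketch skips loading translations for a single text domain. The callback name and the `example-plugin` text domain are placeholders; the sketch assumes that the filter passes the text domain, the .mo file path and the locale after the initial `null` value, and that returning a non-null value short-circuits `load_textdomain()`.

```php
<?php
/**
 * Hypothetical callback: skip loading translations for selected text domains,
 * e.g. to measure how much time their .mo files cost.
 */
function example_skip_textdomain( $loaded, $domain, $mofile, $locale ) {
	$skipped = array( 'example-plugin' ); // Placeholder text domain.

	if ( in_array( $domain, $skipped, true ) ) {
		return false; // Short-circuit: report "not loaded" without reading $mofile.
	}

	return $loaded; // Null by default, so normal loading continues.
}

// Only register the callback when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'pre_load_textdomain', 'example_skip_textdomain', 10, 4 );
}
```

A plugin testing an alternative file format or cache would follow the same shape, but load the translations itself before returning `true`.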

Improvements to just-in-time translation loading

In #58321, it was reported that _load_textdomain_just_in_time() was firing too often if no translations were found for a given text domain, which typically is the case on sites running English (US).

[55865] addresses this issue, which resulted in some minor performance improvements.

Props to @spacedmonkey for technical review, to @stevenlinx for proofreading.

#6-3, #dev-notes, #dev-notes6-3, #i18n

Preferred Languages: Help test the latest version

Since the last update on the Preferred Languages feature plugin, a lot of work has been accomplished both on the plugin side and in core to make the solution more robust in a variety of ways. Today, I want to provide a bit more details on these accomplishments, which resulted in the recent release of Preferred Languages 2.0, advancing the project a huge step closer towards a core merge proposal.

But first, make sure to check out the previous update:

Improved Stability, Fully Rewritten

Over the last year, a lot of work has gone into making the plugin more stable by adding more tests and fixing bugs. This includes improving compatibility with other plugins and making translation merging and locale switching more robust. As a result, pure unit test code coverage is near 100%, with end-to-end tests adding another layer of confidence.

With WordPress adding several i18n improvements in WordPress 6.1 and 6.2, the Preferred Languages plugin is now fully compatible with WP_Textdomain_Registry and switch_to_user_locale(). The minimum required WordPress version has been bumped to 6.1 as a result.

Certainly the biggest change, however, was the full refactoring of the UI itself. The whole JavaScript portion of the code base was over 6 years old and using jQuery and jQuery UI. But not anymore! The UI has been completely refactored to use React, with the same components that also power the block editor. In the process, drag & drop sorting functionality was removed to simplify the UI, and accessibility has improved, but otherwise everything looks mostly the same.

How to help

So, what’s next? The latest version of the Preferred Languages feature plugin needs more eyes testing it and providing feedback.

One big remaining question mark is the concept of translation merging. By default, if there are only some missing strings in a selected locale, these would be displayed in English. But with translation merging, the missing strings will be taken from the locale next in line instead. While this works great, it could be a tad slow due to the way translations are loaded in WordPress. Any help addressing this potential performance concern would be greatly appreciated.

Note: The merging feature can be enabled with add_filter( 'preferred_languages_merge_translations', '__return_true' );.

Active development is taking place on GitHub. If you want to get involved, check out the open issues and join the #core-i18n channel on Slack.

Props to @ocean90 for reviewing this post.

#feature-plugins, #feature-projects, #i18n, #polyglots, #preferred-languages

I18N Improvements in 6.2

Various internationalization (i18n) improvements are in WordPress 6.2, and this developers note will focus on these.

Make it easier to switch to a user’s locale

A while back, WordPress 4.7 introduced user admin languages and locale switching. With every user being able to set their preferred locale, it’s crucial to use locale switching to ensure things like emails are sent in that locale. That’s why you would see a lot of code like switch_to_locale( get_user_locale( $user ) ) in core and in plugins.

Not only is this very repetitive, it also causes limitations when used in combination with things like the Preferred Languages feature plugin, where one would like to fall back to another locale if the desired one is not available.

To improve this, WordPress 6.2 provides a new switch_to_user_locale() function that takes a user ID, grabs the user’s locale and stores the ID in the stack, so that at each moment in time you know whose locale is supposed to be used.
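As a rough illustration, a plugin that emails a user could use the new function as sketched below. The helper name myplugin_notify_user(), the myplugin text domain, and the message strings are hypothetical; switch_to_user_locale(), restore_previous_locale(), get_userdata(), and wp_mail() are core functions.

```php
<?php
/**
 * Hypothetical helper: send a notification in the recipient's own locale.
 * Requires WordPress 6.2 or later for switch_to_user_locale().
 */
function myplugin_notify_user( $user_id ) {
	$user = get_userdata( $user_id );
	if ( ! $user ) {
		return false;
	}

	// Switches to the user's locale and records the user ID on the stack.
	$switched = switch_to_user_locale( $user_id );

	// Strings translated from here on use the user's locale.
	$sent = wp_mail(
		$user->user_email,
		__( 'Your export is ready', 'myplugin' ),
		__( 'You can now download your export.', 'myplugin' )
	);

	// Only restore if the switch actually happened.
	if ( $switched ) {
		restore_previous_locale();
	}

	return $sent;
}
```

Checking the return value before restoring keeps the locale stack balanced when no switch took place.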

Together with this enhancement, the WP_Locale_Switcher class has been updated to filter both locale and determine_locale with the switched locale. This way, anyone using the determine_locale() function will get the correct locale information when switching is in effect.

Core already makes use of this new function, and plugins and themes are of course encouraged to do so as well.

See #57123 for more information.

wp_get_word_count_type()

In #56698, the locale’s word count type (i.e. whether it counts words or characters) has been made part of the WP_Locale class.

Previously, to get that information, plugins and themes had to do something similar to core and use code like _x( 'words', 'Word count type. Do not translate!' ). All such translation strings in core have already been replaced with the new wp_get_word_count_type() function (which is a wrapper around WP_Locale::get_word_count_type()). So if you have been using those translation strings in your code, you can now switch to this new function too!
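The following is a minimal sketch of how a plugin might act on the returned type. The helper myplugin_count_text() is hypothetical and deliberately simplistic (it is not how core counts words); the three type values are the ones the word count type can take.

```php
<?php
/**
 * Hypothetical helper: count text according to a word count type.
 * $type is one of 'words', 'characters_excluding_spaces',
 * or 'characters_including_spaces'.
 */
function myplugin_count_text( $text, $type ) {
	$text = trim( strip_tags( $text ) );

	if ( 'characters_including_spaces' === $type ) {
		return mb_strlen( $text );
	}

	if ( 'characters_excluding_spaces' === $type ) {
		return mb_strlen( preg_replace( '/\s+/u', '', $text ) );
	}

	// Default: count whitespace-separated words.
	if ( '' === $text ) {
		return 0;
	}

	return count( preg_split( '/\s+/u', $text ) );
}

// Inside WordPress 6.2 or later, the type comes from the current locale.
if ( function_exists( 'wp_get_word_count_type' ) ) {
	echo myplugin_count_text( '<p>Hello wide world</p>', wp_get_word_count_type() );
}
```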

Install new translations when editing your profile

Ever since the aforementioned user admin language feature was introduced, users have been able to change their preferred language in the user profile by choosing from the list of already installed languages. New languages could only be installed via the General Settings page.

Starting with WordPress 6.2, you don’t have to go to the settings page anymore if you quickly want to change your user language to a new one—if you have the necessary capabilities to install languages of course, which by default only admins have.

See #38664 for full context.

Screenshot of the profile edit screen, showing the language chooser dropdown where users can now also choose languages that have not yet been installed.
Users with the necessary capabilities can now install new languages via the profile edit screen.

Translator comments for screen reader strings

In r55276 / #29748, all translatable strings intended for screen readers have been marked as such via translator comments.

This aims to provide better context for translators and make it easier to determine that some strings contain hidden accessibility text and are not displayed in the UI.

Props @ocean90 and @webcommsat for reviewing this post.

#6-2, #dev-notes, #dev-notes-6-2, #i18n

An Update on Preferred Languages

5 years after announcing the Preferred Languages feature project and implementing the first prototype, it’s time for a long overdue update on where things currently stand.

More than half of all WordPress sites in the world use a language other than US English. For these sites and users, the options to change the site and user language are great. But when there’s no translation for a given plugin or theme, WordPress falls back to US English. That’s a poor user experience for many non-English speakers.

The Preferred Languages plugin solves this issue by doing the same thing operating systems, browsers, and other types of software have been doing for years. It lets you select multiple preferred languages in your settings. WordPress then tries to load the translations for the first language that’s available, falling back to the next language in your list.

The Preferred Languages UI

Recent New Features

After stabilizing the initial prototype, the feature plugin has lived a mostly dormant life, with only irregular updates here and there. Adding support for JavaScript i18n introduced in WordPress 5.0 was the biggest enhancement. With the plugin being stable and used on thousands of sites without issues, there was simply no need for any other change. Yet, the plugin was far from feature complete.

Over the past year, I blew the dust off and made significant improvements to the plugin:

  • Bringing UI and code up-to-date with latest WordPress version
  • Improved Multisite support, bringing Preferred Languages to Network settings
  • Site Health integration
  • Increased test coverage
  • Improved compatibility with other plugins, especially those accessing the locale user meta
  • Added an option to merge incomplete translations to avoid fallbacks to US English

The latter is actually a pretty cool enhancement and can be enabled using a filter. Here’s what it does:

Let’s say your preferred languages are es_CR, es_MX, es_ES, en_US (in this order). With this feature, if some of the translations are incomplete, your site will be displayed in es_CR, with missing strings taken from es_MX, es_ES etc. Without this feature, missing strings would simply be displayed in US English straight away. Merging translations this way makes for a much better user experience.
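The merging idea can be sketched in a few lines of plain PHP. This is only an illustration of the concept, not the plugin's actual implementation; the function names and the simple string-to-string arrays are assumptions made for the example.

```php
<?php
/**
 * Hypothetical sketch: merge translations in order of preference.
 *
 * @param string[] $preferred_locales      Locales, most preferred first.
 * @param array[]  $translations_by_locale Locale => ( original => translation ).
 * @return string[] Merged original => translation map.
 */
function myplugin_merge_translations( array $preferred_locales, array $translations_by_locale ) {
	$merged = array();

	foreach ( $preferred_locales as $locale ) {
		if ( isset( $translations_by_locale[ $locale ] ) ) {
			// Array union keeps existing keys, so earlier locales win.
			$merged += $translations_by_locale[ $locale ];
		}
	}

	return $merged;
}

/**
 * Hypothetical lookup: fall back to the original (English) string.
 */
function myplugin_translate( $text, array $merged ) {
	return isset( $merged[ $text ] ) ? $merged[ $text ] : $text;
}

$merged = myplugin_merge_translations(
	array( 'es_CR', 'es_MX', 'es_ES' ),
	array(
		'es_CR' => array( 'Hello' => 'Hola (CR)' ),
		'es_MX' => array( 'Hello' => 'Hola (MX)', 'Save' => 'Guardar' ),
	)
);
```

Here "Hello" comes from es_CR, "Save" is filled in from es_MX, and any string missing from every listed locale is returned unchanged.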

What’s Still Missing / Open Questions

Textdomain Registry

Since the Preferred Languages feature plugin also needs to work well when switching locales, #39210 has been a missing feature for a long time. While the plugin has its own implementation of a textdomain registry originally created (but then reverted) in that ticket, it is required for this change to finally land in core.

My hope is that this can be implemented in WordPress 6.1+.

JavaScript Code Base

The initial version of the Preferred Languages plugin was built in a pre-Gutenberg era, using jQuery and jQuery UI Sortable. It doesn’t make much sense to potentially merge such a new UI component into core that is built with those technologies.

Rather, some time should be spent to rebuild the client-side code. There are two possible options here:

  1. Rewrite in vanilla JavaScript, replacing jQuery with modern Web APIs. This is most feasible when removing the capability to sort languages using drag & drop, for which jQuery UI Sortable is currently used.

    If we’re okay with dropping drag & drop functionality, then this would be a straightforward change.
  2. Rewrite everything in React. A prototype of this actually exists, so it’s mostly a matter of finishing it up & perhaps submitting some upstream PRs to Gutenberg for any missing features.
    Using React would be more in line with current best practices and expansion of Gutenberg throughout WordPress admin. Such a rewrite would also facilitate potential use of the component directly within a Gutenberg context.
    On the other hand, it would significantly increase overall code size and thus maintenance burden, for little immediate benefit.

While I am currently heavily leaning towards the first option, input is always welcome!

Of course, if we are okay with keeping jQuery & jQuery UI Sortable, then no change is needed at all.

The Next Steps

The Preferred Languages feature plugin can always use help with development and testing. Right now resolving the two open questions is the main priority before proposing merging this functionality into core.

Active development is taking place on GitHub. If you want to get involved, check out open issues and join the #core-i18n channel on Slack.

#feature-plugins, #feature-projects, #i18n, #polyglots, #preferred-languages

New Capability Queries in WordPress 5.9

WordPress 5.9 adds support for capability queries in WP_User_Query. Similar to the existing role/role__in/role__not_in query arguments, this adds support for three new query arguments in WP_User_Query:

  • capability
  • capability__in
  • capability__not_in

These can be used to fetch users with (or without) a specific set of capabilities, for example to get all users with the capability to edit a certain post type.
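As a small sketch of how the new arguments combine, the helper below fetches the IDs of users who hold at least one of two capabilities while excluding anyone who can manage options. The function name and the edit_products / edit_others_products capabilities are hypothetical; get_users() passes its arguments through to WP_User_Query.

```php
<?php
/**
 * Hypothetical helper: IDs of users who can edit "products"
 * but are not administrators. Requires WordPress 5.9 or later.
 */
function myplugin_get_product_editor_ids() {
	return get_users(
		array(
			// Match users holding at least one of these capabilities.
			'capability__in'     => array( 'edit_products', 'edit_others_products' ),
			// Leave out users holding any of these capabilities.
			'capability__not_in' => array( 'manage_options' ),
			// Return user IDs only instead of full user objects.
			'fields'             => 'ID',
		)
	);
}
```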

A new capabilities parameter (mapping to capability__in in WP_User_Query) was added to the REST API users controller so you can also perform these queries via the REST API.

Under the hood, this will check all existing roles on the site and perform a LIKE query against the capabilities user meta field to find:

  • all users with a role that has this capability
  • all users with the capability being assigned directly

Important note: In WordPress, not all capabilities are stored in the database. Capabilities can also be modified using filters like map_meta_cap. These new query arguments do not work for such capabilities.

The prime use case for capability queries is to get all “authors”, i.e. users with the capability to edit a certain post type. This is needed for the post author dropdown in the post editor, for instance.

Until now, 'who' => 'authors' was used for this, which relies on user levels. However, user levels were deprecated a long time ago and thus never added to custom roles. This led to constant frustration due to users with custom roles missing from author dropdowns.

Thanks to this new feature, any usage of 'who' => 'authors' in core was updated to use capability queries instead.

Subsequently, 'who' => 'authors' queries were deprecated in favor of these new query arguments. The same goes for ?who=authors queries for the wp/v2/users REST API endpoint.

As part of the same change, the twentyfourteen_list_authors() function in the Twenty Fourteen theme was updated to make use of this new functionality, adding a new twentyfourteen_list_authors_query_args filter to make it easier to override this behavior.

To use capability queries while retaining compatibility with older WordPress versions, you could use code like the following:

$args = array(
	'orderby'    => 'post_count',
	'order'      => 'DESC',
	'capability' => array( 'edit_posts' ),
);

// Capability queries were only introduced in WP 5.9.
if ( version_compare( $GLOBALS['wp_version'], '5.9-alpha', '<' ) ) {
	$args['who'] = 'authors';
	unset( $args['capability'] );
}

$authors = get_users( $args );

To learn more about this change and the 11-year-old bug it fixed, check out #16841, [51943], and [52290].

#5-9, #dev-notes

New XML Sitemaps Functionality in WordPress 5.5

In WordPress 5.5, a new feature is being introduced that adds basic, extensible XML sitemaps functionality into WordPress core.

While web crawlers are able to discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.

For more background information on this new feature, check out the merge announcement, or the corresponding Trac ticket #50117.

This article explains in detail the various ways in which this new feature can be customized by developers. For example, if you are developing a plugin with some similar functionality, this post will show you how you can integrate it with core’s new sitemaps feature.

Key Takeaways

With version 5.5, WordPress will expose a sitemap index at /wp-sitemap.xml. This is the main XML file that contains the listing of all the sitemap pages exposed by a WordPress site.

The sitemap index can hold a maximum of 50000 sitemaps, and a single sitemap can hold a (filterable) maximum of 2000 entries.
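For instance, the per-sitemap limit can be adjusted through the wp_sitemaps_max_urls filter (listed among the available hooks further down). The sketch below lowers it for post sitemaps only; the callback name and the value 500 are arbitrary choices for illustration, and the registration is guarded so the file can also be loaded outside WordPress.

```php
<?php
/**
 * Hypothetical callback: limit post sitemaps to 500 entries each.
 *
 * @param int    $max_urls    Current maximum (2000 by default).
 * @param string $object_type Object type, e.g. 'post', 'term', or 'user'.
 * @return int
 */
function myplugin_sitemaps_max_urls( $max_urls, $object_type ) {
	if ( 'post' === $object_type ) {
		return 500;
	}

	return $max_urls;
}

if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_sitemaps_max_urls', 'myplugin_sitemaps_max_urls', 10, 2 );
}
```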

By default, sitemaps are created for all public and publicly queryable post types and taxonomies, as well as for author archives and of course the homepage of the site.

The robots.txt file exposed by WordPress will reference the sitemap index so that it can be easily discovered by search engines.

Technical Requirements

Rendering sitemaps on the frontend requires the SimpleXML PHP extension. If this extension is not available, an error message will be displayed instead of the sitemap, and the HTTP status code 501 (“Not implemented”) will be sent accordingly.

Configuring Sitemaps Behavior

Adding Custom Sitemaps

WordPress provides sitemaps for built-in content types like pages and author archives out of the box. If you are developing a plugin that adds custom features beyond those standard ones, or just want to include some custom URLs on your site, it might make sense to add a custom sitemap provider.

To do so, all you need to do is create a custom PHP class that extends the abstract WP_Sitemaps_Provider class in core. Then, you can use the wp_register_sitemap_provider() function to register it. Here’s an example:

// 'init' is an action, so it is hooked with add_action().
add_action(
	'init',
	function() {
		$provider = new Awesome_Plugin_Sitemaps_Provider();
		wp_register_sitemap_provider( 'awesome-plugin', $provider );
	}
);

The provider will be responsible for getting all sitemaps and sitemap entries, as well as determining pagination.
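A provider class along these lines might look like the sketch below. It is a minimal example, not a reference implementation: the object type, the single hard-coded URL, and the fixed page count are assumptions, while get_url_list() and get_max_num_pages() are the abstract methods that WP_Sitemaps_Provider requires.

```php
<?php
// Guard so the file also loads on WordPress versions before 5.5,
// where the WP_Sitemaps_Provider base class does not exist.
if ( class_exists( 'WP_Sitemaps_Provider' ) ) {
	class Awesome_Plugin_Sitemaps_Provider extends WP_Sitemaps_Provider {
		public function __construct() {
			// Assumption: same name as passed to wp_register_sitemap_provider().
			$this->name        = 'awesome-plugin';
			// Hypothetical object type handled by this provider.
			$this->object_type = 'awesome';
		}

		/**
		 * Returns the sitemap entries for one page of the sitemap.
		 */
		public function get_url_list( $page_num, $object_subtype = '' ) {
			// A real provider would query its own data here.
			return array(
				array( 'loc' => home_url( '/awesome/first-item/' ) ),
			);
		}

		/**
		 * Returns how many sitemap pages this provider needs.
		 */
		public function get_max_num_pages( $object_subtype = '' ) {
			return 1;
		}
	}
}
```

A provider with many entries would typically derive the page count from wp_sitemaps_get_max_urls() rather than returning a constant.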

Removing Certain Sitemaps

There are three existing sitemaps providers for WordPress object types like posts, taxonomies, and users. If you want to remove one of them, let’s say the “users” provider, you can leverage the wp_sitemaps_add_provider filter to do so. Here’s an example:

add_filter(
	'wp_sitemaps_add_provider',
	function( $provider, $name ) {
		if ( 'users' === $name ) {
			return false;
		}

		return $provider;
	},
	10,
	2
);

If instead you want to disable sitemap generation for a specific post type or taxonomy, use the wp_sitemaps_post_types or wp_sitemaps_taxonomies filter, respectively.

Example: Disabling sitemaps for the page post type

add_filter(
	'wp_sitemaps_post_types',
	function( $post_types ) {
		unset( $post_types['page'] );
		return $post_types;
	}
);

Example: Disabling sitemaps for the post_tag taxonomy

add_filter(
	'wp_sitemaps_taxonomies',
	function( $taxonomies ) {
		unset( $taxonomies['post_tag'] );
		return $taxonomies;
	}
);

Adding Additional Tags to Sitemap Entries

The sitemaps protocol specifies a certain set of supported attributes for sitemap entries. Of those, only the URL (loc) tag is required. All others (e.g. changefreq and priority) are optional tags in the sitemaps protocol and not typically consumed by search engines, which is why WordPress only lists the URL itself. Developers can still add those tags if they really want to.

You can use the wp_sitemaps_posts_entry / wp_sitemaps_users_entry / wp_sitemaps_taxonomies_entry filters to add additional tags like changefreq, priority, or lastmod to single items in the sitemap.

Example: Adding the last modified date for posts

add_filter(
    'wp_sitemaps_posts_entry',
    function( $entry, $post ) {
        $entry['lastmod'] = gmdate( DATE_W3C, strtotime( $post->post_modified_gmt ) );
        return $entry;
    },
    10,
    2
);

Similarly, you can use the wp_sitemaps_index_entry filter to add lastmod on the sitemap index. Note: the sitemaps protocol only supports loc and lastmod on the sitemap index, so tags such as changefreq and priority cannot be used there.

Trying to add any unsupported tags will result in a _doing_it_wrong notice.

Excluding a Single Post from the Sitemap

If you are developing a plugin that allows setting specific posts or pages to noindex, it’s a good idea to exclude those from the sitemap too.

The wp_sitemaps_posts_query_args filter can be used to exclude specific posts from the sitemap. Here’s an example:

add_filter(
	'wp_sitemaps_posts_query_args',
	function( $args, $post_type ) {
		if ( 'post' !== $post_type ) {
			return $args;
		}

		$args['post__not_in'] = isset( $args['post__not_in'] ) ? $args['post__not_in'] : array();
		$args['post__not_in'][] = 123; // 123 is the ID of the post to exclude.
		return $args;
	},
	10,
	2
);

Disabling Sitemaps Functionality Completely

If you update the Site Visibility settings in WordPress admin to discourage search engines from indexing your site, sitemaps will be disabled. You can use the wp_sitemaps_enabled filter to override the default behavior.

Here’s an example of how to disable sitemaps completely, no matter what:

add_filter( 'wp_sitemaps_enabled', '__return_false' );

Note: Doing that will not remove the rewrite rules used for the sitemaps, as they are needed in order to send appropriate responses when sitemaps are disabled.

Want to know whether sitemaps are currently enabled or not? Use wp_sitemaps_get_server()->sitemaps_enabled().

Image/Video/News Sitemaps

WordPress currently implements and supports the core sitemaps format as defined on sitemaps.org. Sitemap extensions like image, video, and news sitemaps are not covered by this feature, as these are usually only useful for a small number of websites. In future versions of WordPress, filters and hooks may be added to enable adding such functionality. For now this will still be left to plugins to implement.

New Classes and Functions

As of this writing, this is the full list of new classes and functions introduced with this feature.

Functions:

  • wp_sitemaps_get_server – Retrieves the current Sitemaps server instance.
  • wp_get_sitemap_providers – Gets an array of sitemap providers.
  • wp_register_sitemap_provider – Registers a new sitemap provider.
  • wp_sitemaps_get_max_urls – Gets the maximum number of URLs for a sitemap.

Classes:

  • WP_Sitemaps – Main class responsible for setting up rewrites and registering all providers.
  • WP_Sitemaps_Index – Builds the sitemap index page that lists the links to all of the sitemaps.
  • WP_Sitemaps_Provider – Base class for other sitemap providers to extend and contains shared functionality.
  • WP_Sitemaps_Registry – Handles registering sitemap providers.
  • WP_Sitemaps_Renderer – Responsible for rendering Sitemaps data to XML in accordance with sitemap protocol.
  • WP_Sitemaps_Stylesheet – This class provides the XSL stylesheets to style all sitemaps.
  • WP_Sitemaps_Posts – Builds the sitemaps for the ‘post’ object type and its sub types (custom post types).
  • WP_Sitemaps_Taxonomies – Builds the sitemaps for the ‘taxonomy’ object type and its sub types (custom taxonomies).
  • WP_Sitemaps_Users – Builds the sitemaps for the ‘user’ object type.

Available Hooks and Filters

As of this writing, this is the full list of available hooks and filters.

General:

  • wp_sitemaps_enabled – Filters whether XML Sitemaps are enabled or not.
  • wp_sitemaps_max_urls – Filters the maximum number of URLs displayed on a sitemap.
  • wp_sitemaps_init – Fires when initializing sitemaps.
  • wp_sitemaps_index_entry – Filters the sitemap entry for the sitemap index.

Providers:

  • wp_sitemaps_add_provider – Filters the sitemap provider before it is added.
  • wp_sitemaps_post_types – Filters the list of post types to include in the sitemaps.
  • wp_sitemaps_posts_entry – Filters the sitemap entry for an individual post.
  • wp_sitemaps_posts_show_on_front_entry – Filters the sitemap entry for the home page when the ‘show_on_front’ option equals ‘posts’.
  • wp_sitemaps_posts_query_args – Filters the query arguments for post type sitemap queries.
  • wp_sitemaps_posts_pre_url_list – Filters the posts URL list before it is generated (short-circuit).
  • wp_sitemaps_posts_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).
  • wp_sitemaps_taxonomies – Filters the list of taxonomies to include in the sitemaps.
  • wp_sitemaps_taxonomies_entry – Filters the sitemap entry for an individual term.
  • wp_sitemaps_taxonomies_query_args – Filters the query arguments for taxonomy terms sitemap queries.
  • wp_sitemaps_taxonomies_pre_url_list – Filters the taxonomies URL list before it is generated (short-circuit).
  • wp_sitemaps_taxonomies_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).
  • wp_sitemaps_users_entry – Filters the sitemap entry for an individual user.
  • wp_sitemaps_users_query_args – Filters the query arguments for user sitemap queries.
  • wp_sitemaps_users_pre_url_list – Filters the users URL list before it is generated (short-circuit).
  • wp_sitemaps_users_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).

Stylesheets:

  • wp_sitemaps_stylesheet_css – Filters the CSS for the sitemap stylesheet.
  • wp_sitemaps_stylesheet_url – Filters the URL for the sitemap stylesheet.
  • wp_sitemaps_stylesheet_content – Filters the content of the sitemap stylesheet.
  • wp_sitemaps_stylesheet_index_url – Filters the URL for the sitemap index stylesheet.
  • wp_sitemaps_stylesheet_index_content – Filters the content of the sitemap index stylesheet.

#5-5, #dev-notes, #sitemaps, #xml-sitemaps

New esc_xml() function in WordPress 5.5

As part of the development for the new XML Sitemaps feature in WordPress 5.5, a new esc_xml() function has been added to core that escapes a string for output in XML. This joins the existing set of functions like esc_html() and esc_js().
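Usage mirrors the other escaping helpers. The sketch below wraps a value in an XML element; the helper name myplugin_xml_element() is hypothetical, and the tag name is assumed to come from trusted code rather than user input.

```php
<?php
/**
 * Hypothetical helper: build a simple XML element with escaped text content.
 * Requires WordPress 5.5 or later for esc_xml().
 *
 * @param string $tag   Element name (assumed to be a trusted, valid XML name).
 * @param string $value Text content to escape.
 * @return string
 */
function myplugin_xml_element( $tag, $value ) {
	return sprintf( '<%1$s>%2$s</%1$s>', $tag, esc_xml( $value ) );
}
```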

While all contents in XML sitemaps are already escaped using this new function, existing code in WordPress core can be updated to leverage it in future releases.

wp_kses_normalize_entities() has been updated accordingly to support this, and now can distinguish between HTML and XML context.

Note: l10n helpers like esc_xml__() and esc_xml_e() are being proposed separately in #50551, and are not part of this release.

#5-5, #dev-notes, #sitemaps, #xml-sitemaps

Merge Announcement: Extensible Core Sitemaps

This proposal seeks to integrate basic, extensible XML sitemaps functionality into WordPress Core.

While web crawlers are able to discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.

Purpose & Goals

Sitemaps help WordPress sites become more discoverable by providing search engines with a map of content that should be indexed. The Sitemaps protocol is a URL inclusion protocol and complements robots.txt, a URL exclusion protocol.

A Sitemap is an XML file that lists the URLs for a site. Sitemaps can optionally include information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs of the site. This allows search engines to crawl the site more effectively and to discover every public URL the site has made available. 

This core sitemaps feature aims to provide the base required functionality for the Sitemaps protocol for core WordPress objects, then enables developers to extend this functionality with a robust and consistent set of filters. For example, developers can control which object types (posts, taxonomies, authors) or object subtypes (post types, taxonomies) are included, exclude specific entries, or extend sitemaps to add optional fields. See below for the full list.

Project Background

The idea of adding sitemaps to core was originally proposed in June 2019. Since then, development has been ongoing in GitHub, and weekly meetings in the #core-sitemaps channel started this year to push development forward. Several versions of the feature plugin have been released on the WordPress.org plugin repository, with the latest 0.4.1 representing the state that is considered ready to merge into core. The team is currently working on preparing the final patch to include on the Trac ticket.

Implementation Details

XML Sitemaps will be enabled by default, making the following object types indexable:

  • Homepage
  • Posts page
  • Core post types (i.e. pages and posts)
  • Custom post types
  • Core taxonomies (i.e. tags and categories)
  • Custom taxonomies
  • Author archives

Additionally, the robots.txt file exposed by WordPress will reference the sitemap index.

A crucial feature of the sitemap plugin is the sitemap index. This is the main XML file that contains the listing of all the sitemap pages exposed by a WordPress site. By default, the plugin creates a sitemap index at /wp-sitemap.xml which includes sitemaps for all supported content, separated into groups by type. Each sitemap file contains a maximum of 2,000 URLs; when that threshold is reached, a new sitemap file is added.
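The paging rule above comes down to simple arithmetic, sketched here in Python. The 2,000 URL limit is taken from the text; the function names and the file-name pattern in `build_index` are hypothetical and chosen only for illustration, so they should not be read as the plugin's actual URL scheme.

```python
from math import ceil

# Maximum number of URLs per sitemap file, as stated in the post.
MAX_URLS_PER_SITEMAP = 2000


def sitemap_page_count(total_urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Return how many sitemap files are needed for `total_urls` entries."""
    if total_urls <= 0:
        return 0
    return ceil(total_urls / max_urls)


def build_index(counts_by_type, base_url="https://example.com"):
    """List the sitemap files an index would reference, grouped by type.

    The "wp-sitemap-<type>-<page>.xml" pattern is an assumption made
    for this sketch, not a documented format.
    """
    urls = []
    for object_type, total in counts_by_type.items():
        for page in range(1, sitemap_page_count(total) + 1):
            urls.append(f"{base_url}/wp-sitemap-{object_type}-{page}.xml")
    return urls


if __name__ == "__main__":
    # 4,500 posts need three files; 10 users fit in one.
    for url in build_index({"posts": 4500, "users": 10}):
        print(url)
```

A site with exactly 2,000 entries of one type still fits in a single file; the 2,001st entry is what triggers a second one.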

By default, sitemaps are created for all public post types and taxonomies, as well as for author archives. Several filters exist to tweak this behavior, for example to include or exclude certain entries. Also, there are plenty of available hooks for plugins to integrate with this feature if they want to, or to disable it completely if they wish to roll their own version.

Contributors and Feedback

The following people have contributed to this project in some form or another:

Adrian McShane, @afragen, @adamsilverstein, @casiepa, @flixos90, @garrett-eclipse, @joemcgill, @kburgoine, @kraftbj, @milana_cap, @pacifika, @pbiron, @pfefferle, Ruxandra Gradina, @swissspidy, @szepeviktor, @tangrufus, @tweetythierry

With special thanks to the docs, polyglots, and security teams for their thorough reviews.

Available Hooks and Filters

Check out the feature plugin page for a full list of filters and also a few usage examples.

Frequently Asked Questions

How can I disable sitemaps?

If you update the WordPress settings to discourage search engines from indexing your site, sitemaps will be disabled. Alternatively, use the wp_sitemaps_is_enabled filter, or use remove_action( 'init', 'wp_sitemaps_get_server' ); to disable initialization of any sitemap functionality.

How can I disable sitemaps for a certain object type or exclude a certain item?

Using the filters referred to above – check out the feature plugin page for examples.

Does this support lastmod, changefreq, or priority attributes for sitemaps?

By default, no. Those are optional fields in the sitemaps protocol and not typically consumed by search engines. Developers can still add those fields if they want to using the filters referred to above.

lastmod in particular has not been implemented due to the added complexity of calculating the last modified dates for all object types and sitemaps with reasonable performance. For a common website with less frequent updates, lastmod does not offer additional benefits. For sites that are updated very frequently and want to use lastmod, it is recommended to use a plugin to add this functionality.

What about image/video/news sitemaps?

These sitemap extensions were declared a non-goal when the project was initially proposed, and as such are not covered by this feature. In future versions of WordPress, filters and hooks may be added to enable plugins to add such functionality.

Are there any UI controls to exclude posts or pages from sitemaps?

No. User-facing changes were declared a non-goal when the project was initially proposed, since simply omitting a given post from a sitemap is not a guarantee that it won’t get crawled or indexed by search engines. In the spirit of “Decisions, not options”, any logic to exclude posts from sitemaps is better handled by dedicated plugins (e.g. SEO plugins). Plugins that implement a UI for relevant areas can use the new filters to enforce their settings, for example to only query content that has not been flagged with a “noindex” option.

Are there any privacy implications of listing users in sitemaps?

The sitemaps only surface the site’s author archives, and do not include any information that isn’t already publicly available on a site.

Are there any performance implications of adding this feature?

The addition of this feature does not impact regular website visitors, but only users who access the sitemap directly. Benchmarks during development of this feature showed that sitemap generation is generally very fast even for sites with thousands of posts. Thus, no additional caching for sitemaps was put in place.

If you want to optimize the sitemap generation, for example by optimizing queries or even short-circuiting any database queries, use the filters mentioned above.

What about sites with existing sitemap plugins?

Many sites already have a plugin active that implements sitemaps. For most of them, that will no longer be necessary, as the feature in WordPress core suffices. However, there is no harm in keeping them. The core sitemaps feature was built in a robust and easily extensible way. If for some reason two sitemaps are exposed on a website (one by core, one by a plugin), this does not result in any negative consequences for the site’s discoverability.

#5-5, #feature-plugins, #feature-projects, #merge-proposals, #sitemaps, #xml-sitemaps

XML Sitemaps Meeting: June 9th, 2020

Since the last blog post about the XML feature project we have seen many fruitful discussions and great progress towards WordPress core inclusion.

This post aims to give an overview of the things currently in progress, and the items that should be discussed in the upcoming meeting on Slack.

Updates

  • Version 0.4.0
    This release was published last week in an effort to add the last remaining features before
  • Merge proposal post
    Work continued on the draft, and contributors will be pinged for review before publishing.
  • Core patch
    A pull request has been started on GitHub that aims to serve as the basis for this.

Agenda: June 9th

The next meeting will be held on Tuesday, June 9th at 16:00 CEST.

Items on the agenda so far:

  • 0.4.1 release
  • Core patch
  • Open floor

Want to add anything to the above? Please leave a comment here or reach out on Slack.

This meeting is held in the #core-sitemaps channel. To join the meeting, you’ll need an account on the Making WordPress Slack.

#agenda, #feature-plugins, #feature-projects, #sitemaps, #xml-sitemaps