Rewrites Next Generation: Next Meeting

There was a bit of confusion with the last Rewrites meeting, so we never really got to kick off the conversation. Let’s give it another shot this week. The revised kick-off meeting for the Rewrites Next Generation project will hence be on May 18th 23:00 UTC.

The agenda for this meeting is to discuss problems you see with the rewrites system, any particular use-cases you think should be considered, as well as agreeing on a broad approach for how we want to tackle the problems.

(In case you missed it, you may want to read the introductory post for the project.)

Hope to see you all there!

#rewrite-rules

Proposal: Next Generation Rewrites

Hi everyone! Today I’d like to propose a new feature project for WordPress: Next Generation Rewrites. After proposing this at the last feature projects meeting, it’s time to flesh out the details of what this project would be.

The aim of the project is to modernise the rewriting and routing system for the modern CMS and platform era of WordPress.

(This project was previously proposed in a ticket on Trac, however the project is larger than a single TracTrac An open source project by Edgewall Software that serves as a bug tracker and project management tool for WordPress. ticketticket Created for both bug reports and feature development on the bug tracker. and needs a larger discussion.)

Overview

If you’ve worked with custom development on WordPress, you’ve probably registered at least one rewrite rule. Rewrites are the key to what WordPress calls “pretty permalinks”, which are URLs designed for humans rather than the server. They look like /2016/04/19/, or /about/requirements/, rather than index.php?p=42.

As a developer, you can register your own rewrite rules as well. For example, bbPressbbPress Free, open source software built on top of WordPress for easily creating forums on sites. https://bbpress.org. registers URLs like /forums/topic/jjj-is-awesome/ and WooCommerce has URLs like /product/gs3/. WordPress then “re-writes” these to the internal query variables (“query vars”), which are eventually passed into the main WP Query.

The rewrite system was initially designed to implement pretty permalinks, but has been used and abused for so much more since then. With modern plugins and web apps built on top of WordPress, it’s not uncommon to have pages that don’t need a main query, or need a specialised system of querying. As an example of this, the REST APIREST API The REST API is an acronym for the RESTful Application Program Interface (API) that uses HTTP requests to GET, PUT, POST and DELETE data. It is how the front end of an application (think “phone app” or “website”) can communicate with the data store (think “database” or “file system”) https://developer.wordpress.org/rest-api/. infrastructure added in WordPress 4.4 doesn’t use the main query, as many of the endpoints don’t require querying posts at all.

While the usage of rewrites has developed, the internals of how URLs are handled internally hasn’t changed. The system is fundamentally still designed for the blogblog (versus network, site) era of WordPress. It also predates a lot of the pieces of WordPress we take for granted, such as transients and object caching.

Project Proposal

It’s time to change the rewrite system. Rewrite rules should be changed from a layer on top of WP Query to a fully-fledged system all of their own. To do this, I’m proposing three initial steps:

  • Decouple rewrites from the query system
  • Refactor and rework the internals to make them testable
  • Rethink rewrite storage and remove flushing

These steps would take place under the umbrella of a new feature project, Next Generation Rewrites. This feature project would coordinate the project and people working on it across the various parts of coreCore Core is the set of software required to run WordPress. The Core Development Team builds WordPress. it touches (primarily the Rewrite Rules component, but also potentially involving UIUI User interface and UXUX User experience discussion and changes; fixing the “visit the Permalinks page and save to fix your site” issue, for example). This would also coordinate the team across multiple release cycles as we slowly but surely make progress. It’s important to make progress while keeping rewrites stable, as they’re a critical part of WordPress that we cannot break.

Decoupling Rewrites from Querying

The first step I’m proposing is to decouple rewrites from querying. Right now, rewrites for anything except querying posts involves jumping through a few hoops. This includes registering temporary query vars, or using global state to track route matching. Separating these two components will make it easier to add non-query-based rewrites, as well as allowing more powerful query rewrites.

Currently, rewrites are registered using code like the following:

add_rewrite_rule( '/regex/', 'index.php?var=value' );

These rewrite rules map a regular expression to a query string, which is later parsed into an array internally. You can achieve more complex routing via regular expressions by using a special syntax in the rewrite string:

add_rewrite_rule( '/topic/(.+)', 'index.php?topic=$matches[1]' );

Note that while this looks like a PHPPHP The web scripting language in which WordPress is primarily architected. WordPress requires PHP 5.6.20 or higher variable, it’s actually a static string that gets replaced after the regular expression. This can lead to confusion with developers who are new to the syntax, and it also can make doing more complex rewrites hard. These typically involve temporary query vars, as well as quite a bit of processing.

Instead, I want to introduce a new WP_Rewrite_Rule object to replace the current rewrite string. This object would contain a method called get_query_vars that receives the $matches variable and returns the query vars to set. This would look something like this:

class MyRule extends WP_Rewrite_Rule {
    public function get_query_vars( $matches ) {
        return array(
            'post_type' => 'topic',
            'meta_query' => array(
                array(
                    'key' => '_topic_name',
                    'value' => $matches[1],
                ),
            ),
        );
    }
}
add_rewrite_rule( '/topic/(.+)', new MyRule() );

(The exact format of the object is yet to be decided; we’ve also started discussing the possibility of passing a request object into this callback as well.)

Using an object for this allows us to have multiple callbacks for different stages of the routing process, and starts to simplify some of the internal code in WP_Rewrite. For example, the routing code around pages that allows URLs like /parent/child/sub-child/ for child pages can be simplified and moved out of the main routing code. This also helps make the code more testable, which dovetails nicely with the third goal.

Refactor rewrite internals

Step 2 of changing rewrites is to make the rewrite system fully testable. Currently, a bunch of global state is mixed into the internals of rewrite matching, and the rewrite methods tend to be monolithic. @ericlewis has previously begun work on this in #18877, and this can be continued and rolled into the Next Generation Rewrites project.

Ideally, this should happen at the same time or before the first step to allow easy checking of regressions. This is relatively boring work that won’t affect many developers, but it’s important we do it. The rewrite system is a critical part of WordPress, and it’s crucially important that it’s testable and verifiable.

Rethink rewrite storage

Once the first two steps are in place, we should begin reconsidering how rewrites are stored. Currently, rewrites are stored in the database in the rewrite_rules option. The option is less of an option (it’s not really configuration), and more of a caching technique. This is somewhat of a relic, as rewrites predate object caching and transients in WordPress, both of which are better techniques to handle caching in modern WordPress.

Removing this option and changing it to use a proper caching subsystem in WordPress should fix multiple problems, including the need to flush rewrites. This should improve the UX for end users, as we should be able to remove the “links on your site don’t work, so go to the Permalinks page and save” trick (which only really requires visiting the page, not saving).

The need to cache rewrites at all can also be reconsidered; many of the rewrite rules are simply entries in an array, and are not expensive to generate on-the-fly, while removing the caching so can make for an easier to use and more dynamic system. More expensive rules can be cached by the code generating the rule, rather than via the monolithic rewrite cache.

This has the potential to improve the user experience and fix a class of issues, but it also has the potential to ruin performance, depending on how other code is using it. It’s important to tread carefully as we attempt to improve this, and ensure we remain compatible and performant. Introducing a two-tiered system is one approach we can consider, with the ability for plugins to opt-in to a newer, dynamic system to avoid problems with flushing.

Why a Feature Project?

I’m proposing Rewrites Next Generation as a feature project for several reasons. The concept and implementation are both large and complex, moreso than simply operating in an informal matter on Trac tickets. The impact of this project is also large, affecting many pluginPlugin A plugin is a piece of software containing a group of functions that can be added to a WordPress website. They can extend functionality or add new features to your WordPress websites. WordPress plugins are written in the PHP programming language and integrate seamlessly with WordPress. These can be free in the WordPress.org Plugin Directory https://wordpress.org/plugins/ or can be cost-based plugin from a third-party and theme developers, as well as potential improvements to UX. Gaining feedback from all interested stakeholders is important, as this is a crucial part of WordPress.

Getting Started

The steps I’ve listed here are only a selection of the issues with rewrites. The rewrite system hasn’t been massively changed since its inception, so there are plenty of parts that could be changed to better suit the modern era of WordPress development. The issues here are simply the three most important that I’ve found, but I want to hear your feedback too.

Let’s talk rewrites. I’m proposing a weekly meeting for Rewrites Next Generation, at Wednesday, 23:00 UTC with the first meeting at May 4th 23:00 UTC. If you’re interested in rewrites, or have had problems with them in the past, let’s talk and work out what we need to do to improve rewrites. (This is a non-permanent meeting to set the scope of the project; we’ll refine meeting frequency and timing at a later date.)

After this initial discussion, we can settle on concrete goals and a timeline for the initial stages of the project. A proof-of-concept patchpatch A special text file that describes changes to code, by identifying the files and lines which are added, removed, and altered. It may also be referred to as a diff. A patch can be applied to a codebase for testing. and ticket are available on Trac, however alternative approaches should be considered after this discussion. The short-term goal is to begin landing these improvements in trunktrunk A directory in Subversion containing the latest development code in preparation for the next major release cycle. If you are running "trunk", then you are on the latest revision., with the goal of having our first changes in WordPress 4.6, which is a quick but achievable timeline.

This project has the potential to make a large impact on developers and users, and it’s important that everyone has their say. I hope you’ll join me for the first meeting, and join in the fun of contributing to a key part of WordPress!

Thanks for reading. <3

#rewrite-rules