An update on the taxonomy roadmap

In 2013, @nacin laid out a potential roadmap for the WordPress taxonomy component. Progress on this roadmap has necessarily been deliberate and, as a result, somewhat slow. But some important steps have been taken in the last few months.

Improvements in 4.1

Recall the problem. Since 2.3 – when WP’s taxonomyTaxonomy A taxonomy is a way to group things together. In WordPress, some common taxonomies are category, link, tag, or post format. schema was first introduced – terms have been “shared” between taxonomies when they happen to share the same slug. This architecture was designed to enable serendipitous connections between bloggers writing about similar subjects, especially when combined with “global” terms, which are shared across a multisitemultisite Used to describe a WordPress installation with a network of multiple blogs, grouped by sites. This installation type has shared users tables, and creates separate database tables for each blog (wp_posts becomes wp_0_posts). See also network, blog, site networknetwork (versus site, blog). However, as WP has developed as a CMS and taxonomies have been put to much broader use than originally envisioned, this feature has ended up causing more problem than it solved. Shared slugs in particular have caused a good deal of frustration for users and developers. A notable example is the infamous #5809: changing a term in one taxonomy (say, ‘Chicago’ in the taxonomy ‘cool_cities’) would update all terms that happened to share the same slug (say, ‘Chicago’ in the taxonomy ‘cool_bands’).

Fixing this behavior has required a series of carefully staged maneuvers. Here’s what’s happened during the 4.1 cycle:

  • Unit testunit test Code written to test a small piece of code or functionality within a larger application. Everything from themes to WordPress core have a series of unit tests. Also see regression. coverage has been improved for wp_insert_term(), wp_update_term(), and term_exists() ([29798], [29830], [29875]).
  • We fixed a few edge-case bugs in term_exists() #29589 #29851. These fixes, plus the increased test coverage, have made us confident in our ability to prevent the creation of unwanted duplicate terms in most cases, a prerequisite for the next item.
  • We removed the UNIQUE index from the ‘slug’ column in wp_terms #22023, and added some failsafe logic to avoid duplicate term creation in certain edge cases [30238]. This step is small but critical. “Splitting” existing shared terms means creating a new row in the wp_terms table for each item in wp_term_taxonomy that currently shares its ‘term_id’ with another item. But when we split these terms, we cannot change their slugs, or we’ll break backward compatibility with things like taxonomy archive permalinks. That is: and must continue to work. So the ‘slug’ column of the terms table must not enforce uniqueness.
  • We stopped creating new shared terms #21950. This means that each new item inserted into wp_term_taxonomy will correspond to a new row in wp_terms, even if the new item shares its slug with an existing term.
  • Updating a shared term will cause it to be split into two separate terms #5809. This addresses the primary pain point of shared terms: you can now change ‘Chicago’ to ‘The Windy City’ without your readers thinking that you’re confused about the artists behind this hard-rockin’ hit. (Edit – term splitting was removed for WP 4.1, and is slated for 4.2.)

In short, while 4.1 won’t completely remove shared terms from WP – see below – it should alleviate the pains they inflict.

Looking ahead

The continued existence of shared terms in the database means that we still have to deal with the translationtranslation The process (or result) of changing text, words, and display formatting to support another language. Also see localization, internationalization. between ‘term_id’+’taxonomy’ and ‘term_taxonomy_id’, a requirement that limits the cool things we can do with terms. As such, cleaning up existing shared terms is a top priority. #30261 lays out some suggestions for how to expunge them from the database. This step isn’t practical for 4.1: we can only split existing terms when the 4.1 database schema change has been applied #22023, and schema changes are not immediate in multisite enviroments. But we can consider it for 4.2.

Once term_ids and term_taxonomy_ids are in one-to-one correspondence, we can start on some of the more exciting items that Nacin outlined. Termmeta #10142 becomes a viable possibility once we can rely on unique term_ids. And collapsing the wp_terms and wp_term_taxonomy tables #30262 will allow us to simplify and streamline the internals of our taxonomy functions. These are tasks that we can start exploring for 4.3 and beyond, at least one version after we’ve split existing terms on all installations. These tickets are a good starting place for people who are interested in following (and contributing to) the future of taxonomies in WordPress.

#4-1, #dev-notes, #roadmaps, #taxonomy