New XML Sitemaps Functionality in WordPress 5.5

In WordPress 5.5, a new feature is being introduced that adds basic, extensible XML sitemaps functionality into WordPress core.

While web crawlers are able to discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.

For more background information on this new feature, check out the merge announcement, or the corresponding Trac ticket #50117.

This article explains in detail the various ways in which this new feature can be customized by developers. For example, if you are developing a plugin with some similar functionality, this post will show you how you can integrate it with core’s new sitemaps feature.

Key Takeaways

With version 5.5, WordPress will expose a sitemap index at /wp-sitemap.xml. This is the main XML file that lists all the sitemap pages exposed by a WordPress site.

The sitemap index can hold a maximum of 50000 sitemaps, and a single sitemap can hold a (filterable) maximum of 2000 entries.

By default, sitemaps are created for all public and publicly queryable post types and taxonomies, as well as for author archives and of course the homepage of the site.

The robots.txt file exposed by WordPress will reference the sitemap index so that it can be easily discovered by search engines.
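
The 2000-entry limit mentioned above can be changed through the wp_sitemaps_max_urls filter. As a minimal sketch, the following lowers the limit for post sitemaps only. The function name awesome_plugin_sitemaps_max_urls and the value 500 are made up for illustration, and the function_exists() guard is only there so that the snippet can also be loaded outside WordPress, for example in a unit test.

```php
<?php
/**
 * Lowers the per-sitemap URL limit for the "post" object type only.
 *
 * @param int    $max_urls    Maximum number of URLs per sitemap (2000 by default).
 * @param string $object_type Object type of the sitemap, e.g. 'post', 'term' or 'user'.
 * @return int Possibly adjusted maximum.
 */
function awesome_plugin_sitemaps_max_urls( $max_urls, $object_type ) {
	if ( 'post' === $object_type ) {
		return 500; // Arbitrary illustrative value.
	}

	return $max_urls;
}

// Only hook in when WordPress is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_sitemaps_max_urls', 'awesome_plugin_sitemaps_max_urls', 10, 2 );
}
```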

Technical Requirements

Rendering sitemaps on the frontend requires the SimpleXML PHP extension. If this extension is not available, an error message will be displayed instead of the sitemap. The HTTP status code 501 (“Not implemented”) will be sent accordingly.

Configuring Sitemaps Behavior

Adding Custom Sitemaps

WordPress provides sitemaps for built-in content types like pages and author archives out of the box. If you are developing a plugin that adds custom features beyond those standard ones, or just want to include some custom URLs on your site, it might make sense to add a custom sitemap provider.

To do so, all you need to do is create a custom PHP class that extends the abstract WP_Sitemaps_Provider class in core. Then, you can use the wp_register_sitemap_provider() function to register it. Here’s an example:

add_action(
	'init',
	function() {
		$provider = new Awesome_Plugin_Sitemaps_Provider();
		// The name becomes part of the sitemap URL, so lowercase letters only is the safe choice.
		wp_register_sitemap_provider( 'awesomeplugin', $provider );
	}
);

The provider will be responsible for getting all sitemaps and sitemap entries, as well as determining pagination.
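
A provider class could look roughly like the following. This is a sketch under stated assumptions, not the implementation of any real plugin: the URLs returned by get_all_urls() are hypothetical placeholders, and the provider name is kept to lowercase letters because it becomes part of the sitemap URL. The two methods a provider has to implement are get_url_list() and get_max_num_pages(). The name set in the constructor should be the same one that is passed to wp_register_sitemap_provider().

```php
<?php
// Only declare the class when core's base class is available (WordPress 5.5+).
if ( class_exists( 'WP_Sitemaps_Provider' ) ) {
	class Awesome_Plugin_Sitemaps_Provider extends WP_Sitemaps_Provider {
		public function __construct() {
			// Assumption: same name as used when registering the provider.
			$this->name        = 'awesomeplugin';
			$this->object_type = 'awesomeplugin';
		}

		/**
		 * Hypothetical data source; a real plugin would query its own content here.
		 */
		private function get_all_urls() {
			return array(
				home_url( '/awesome/one/' ),
				home_url( '/awesome/two/' ),
			);
		}

		/**
		 * Returns the entries for one page of the sitemap.
		 */
		public function get_url_list( $page_num, $object_subtype = '' ) {
			$per_page = wp_sitemaps_get_max_urls( $this->object_type );
			$offset   = ( max( 1, (int) $page_num ) - 1 ) * $per_page;
			$urls     = array_slice( $this->get_all_urls(), $offset, $per_page );

			$url_list = array();
			foreach ( $urls as $url ) {
				$url_list[] = array( 'loc' => $url );
			}

			return $url_list;
		}

		/**
		 * Returns how many sitemap pages are needed for all URLs.
		 */
		public function get_max_num_pages( $object_subtype = '' ) {
			$per_page = wp_sitemaps_get_max_urls( $this->object_type );

			return (int) ceil( count( $this->get_all_urls() ) / $per_page );
		}
	}
}
```

Because this sketch does not define any object subtypes, a single sitemap type is exposed and only paginated once the number of URLs exceeds the per-sitemap limit.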

Removing Certain Sitemaps

There are three existing sitemaps providers for WordPress object types like posts, taxonomies, and users. If you want to remove one of them, let’s say the “users” provider, you can leverage the wp_sitemaps_add_provider filter to do so. Here’s an example:

add_filter(
	'wp_sitemaps_add_provider',
	function( $provider, $name ) {
		if ( 'users' === $name ) {
			return false;
		}

		return $provider;
	},
	10,
	2
);

If instead you want to disable sitemap generation for a specific post type or taxonomy, use the wp_sitemaps_post_types or wp_sitemaps_taxonomies filter, respectively.

Example: Disabling sitemaps for the page post type

add_filter(
	'wp_sitemaps_post_types',
	function( $post_types ) {
		unset( $post_types['page'] );
		return $post_types;
	}
);

Example: Disabling sitemaps for the post_tag taxonomy

add_filter(
	'wp_sitemaps_taxonomies',
	function( $taxonomies ) {
		unset( $taxonomies['post_tag'] );
		return $taxonomies;
	}
);

Adding Additional Tags to Sitemap Entries

The sitemaps protocol specifies a certain set of supported attributes for sitemap entries. Of those, only the URL (loc) tag is required. All others (e.g. changefreq and priority) are optional tags in the sitemaps protocol and not typically consumed by search engines, which is why WordPress only lists the URL itself. Developers can still add those tags if they really want to.

You can use the wp_sitemaps_posts_entry / wp_sitemaps_users_entry / wp_sitemaps_taxonomies_entry filters to add additional tags like changefreq, priority, or lastmod to single items in the sitemap.

Example: Adding the last modified date for posts

add_filter(
    'wp_sitemaps_posts_entry',
    function( $entry, $post ) {
        $entry['lastmod'] = gmdate( DATE_W3C, strtotime( $post->post_modified_gmt ) );
        return $entry;
    },
    10,
    2
);

Similarly, you can use the wp_sitemaps_index_entry filter to add lastmod on the sitemap index. Note: the sitemaps protocol does not support changefreq or priority on the sitemap index.

Trying to add any unsupported tags will result in a _doing_it_wrong notice.
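
The same pattern works for the optional changefreq and priority tags. The sketch below adds both to every taxonomy term entry. The function name awesome_plugin_term_entry_tags and the chosen values are purely illustrative, and search engines may well ignore them.

```php
<?php
/**
 * Adds changefreq and priority to a taxonomy term sitemap entry.
 *
 * @param array $entry Sitemap entry, containing at least the 'loc' key.
 * @return array Entry with the two optional tags added.
 */
function awesome_plugin_term_entry_tags( $entry ) {
	$entry['changefreq'] = 'weekly'; // Illustrative value.
	$entry['priority']   = '0.5';    // Illustrative value, passed as a string.

	return $entry;
}

// Only hook in when WordPress is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_sitemaps_taxonomies_entry', 'awesome_plugin_term_entry_tags' );
}
```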

Excluding a Single Post from the Sitemap

If you are developing a plugin that allows setting specific posts or pages to noindex, it’s a good idea to exclude those from the sitemap too.

The wp_sitemaps_posts_query_args filter can be used to exclude specific posts from the sitemap. Here’s an example:

add_filter(
	'wp_sitemaps_posts_query_args',
	function( $args, $post_type ) {
		if ( 'post' !== $post_type ) {
			return $args;
		}

		$args['post__not_in'] = isset( $args['post__not_in'] ) ? $args['post__not_in'] : array();
		$args['post__not_in'][] = 123; // 123 is the ID of the post to exclude.
		return $args;
	},
	10,
	2
);
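
For a plugin that stores its noindex setting as post meta, the exclusion does not have to be a hard-coded list of IDs. As a sketch, assuming a hypothetical meta key _awesome_plugin_noindex that is only present on posts marked as noindex, a meta query can be added to the same filter:

```php
<?php
/**
 * Excludes posts that carry the (hypothetical) noindex meta key from all post type sitemaps.
 *
 * @param array $args WP_Query arguments used to build the sitemap.
 * @return array Arguments with an additional meta query clause.
 */
function awesome_plugin_exclude_noindex_posts( $args ) {
	// Keep any meta query clauses that other code has already added.
	$meta_query = isset( $args['meta_query'] ) ? (array) $args['meta_query'] : array();

	$meta_query[] = array(
		'key'     => '_awesome_plugin_noindex', // Assumption: set only on noindex posts.
		'compare' => 'NOT EXISTS',
	);

	$args['meta_query'] = $meta_query;

	return $args;
}

// Only hook in when WordPress is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_sitemaps_posts_query_args', 'awesome_plugin_exclude_noindex_posts' );
}
```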

Disabling Sitemaps Functionality Completely

If you update the Site Visibility settings in WordPress admin to discourage search engines from indexing your site, sitemaps will be disabled. You can use the wp_sitemaps_enabled filter to override the default behavior.

Here’s an example of how to disable sitemaps completely, no matter what:

add_filter( 'wp_sitemaps_enabled', '__return_false' );

Note: Doing that will not remove the rewrite rules used for the sitemaps, as they are needed in order to send appropriate responses when sitemaps are disabled.

Want to know whether sitemaps are currently enabled or not? Use wp_sitemaps_get_server()->sitemaps_enabled().
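
A plugin that also has to run on older WordPress versions may want to wrap that check. The helper below is a small sketch; its name awesome_plugin_sitemaps_are_enabled is made up, and treating a missing wp_sitemaps_get_server() function as “disabled” is an assumption about what such a plugin would want.

```php
<?php
/**
 * Tells whether core sitemaps are available and enabled.
 *
 * @return bool False when the sitemaps API does not exist or sitemaps are disabled.
 */
function awesome_plugin_sitemaps_are_enabled() {
	// The function is missing before WordPress 5.5 (or when WordPress is not loaded).
	if ( ! function_exists( 'wp_sitemaps_get_server' ) ) {
		return false;
	}

	return (bool) wp_sitemaps_get_server()->sitemaps_enabled();
}
```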

Image/Video/News Sitemaps

WordPress currently implements and supports the core sitemaps format as defined on sitemaps.org. Sitemap extensions like image, video, and news sitemaps are not covered by this feature, as these are usually only useful for a small number of websites. In future versions of WordPress, filters and hooks may be added to enable adding such functionality. For now this will still be left to plugins to implement.

New Classes and Functions

As of this writing, this is the full list of new classes and functions introduced with this feature.

Functions:

  • wp_sitemaps_get_server – Retrieves the current Sitemaps server instance.
  • wp_get_sitemap_providers – Gets an array of sitemap providers.
  • wp_register_sitemap_provider – Registers a new sitemap provider.
  • wp_sitemaps_get_max_urls – Gets the maximum number of URLs for a sitemap.

Classes:

  • WP_Sitemaps – Main class responsible for setting up rewrites and registering all providers.
  • WP_Sitemaps_Index – Builds the sitemap index page that lists the links to all of the sitemaps.
  • WP_Sitemaps_Provider – Base class for other sitemap providers to extend and contains shared functionality.
  • WP_Sitemaps_Registry – Handles registering sitemap providers.
  • WP_Sitemaps_Renderer – Responsible for rendering Sitemaps data to XML in accordance with sitemap protocol.
  • WP_Sitemaps_Stylesheet – This class provides the XSL stylesheets to style all sitemaps.
  • WP_Sitemaps_Posts – Builds the sitemaps for the ‘post’ object type and its sub types (custom post types).
  • WP_Sitemaps_Taxonomies – Builds the sitemaps for the ‘taxonomy’ object type and its sub types (custom taxonomies).
  • WP_Sitemaps_Users – Builds the sitemaps for the ‘user’ object type.

Available Hooks and Filters

As of this writing, this is the full list of available hooks and filters.

General:

  • wp_sitemaps_enabled – Filters whether XML Sitemaps are enabled or not.
  • wp_sitemaps_max_urls – Filters the maximum number of URLs displayed on a sitemap.
  • wp_sitemaps_init – Fires when initializing sitemaps.
  • wp_sitemaps_index_entry – Filters the sitemap entry for the sitemap index.

Providers:

  • wp_sitemaps_add_provider – Filters the sitemap provider before it is added.
  • wp_sitemaps_post_types – Filters the list of post types to include in the sitemaps.
  • wp_sitemaps_posts_entry – Filters the sitemap entry for an individual post.
  • wp_sitemaps_posts_show_on_front_entry – Filters the sitemap entry for the home page when the ‘show_on_front’ option equals ‘posts’.
  • wp_sitemaps_posts_query_args – Filters the query arguments for post type sitemap queries.
  • wp_sitemaps_posts_pre_url_list – Filters the posts URL list before it is generated (short-circuit).
  • wp_sitemaps_posts_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).
  • wp_sitemaps_taxonomies – Filters the list of taxonomies to include in the sitemaps.
  • wp_sitemaps_taxonomies_entry – Filters the sitemap entry for an individual term.
  • wp_sitemaps_taxonomies_query_args – Filters the query arguments for taxonomy terms sitemap queries.
  • wp_sitemaps_taxonomies_pre_url_list – Filters the taxonomies URL list before it is generated (short-circuit).
  • wp_sitemaps_taxonomies_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).
  • wp_sitemaps_users_entry – Filters the sitemap entry for an individual user.
  • wp_sitemaps_users_query_args – Filters the query arguments for user sitemap queries.
  • wp_sitemaps_users_pre_url_list – Filters the users URL list before it is generated (short-circuit).
  • wp_sitemaps_users_pre_max_num_pages – Filters the max number of pages before it is generated (short-circuit).

Stylesheets:

  • wp_sitemaps_stylesheet_css – Filters the CSS for the sitemap stylesheet.
  • wp_sitemaps_stylesheet_url – Filters the URL for the sitemap stylesheet.
  • wp_sitemaps_stylesheet_content – Filters the content of the sitemap stylesheet.
  • wp_sitemaps_stylesheet_index_url – Filters the URL for the sitemap index stylesheet.
  • wp_sitemaps_stylesheet_index_content – Filters the content of the sitemap index stylesheet.
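
As one example from the stylesheet group, the CSS used by the sitemap stylesheet can be extended via wp_sitemaps_stylesheet_css. A minimal sketch, with a made-up function name and an arbitrary CSS rule:

```php
<?php
/**
 * Appends a custom rule to the CSS used by the sitemap stylesheet.
 *
 * @param string $css Default CSS provided by core.
 * @return string CSS with the extra rule appended.
 */
function awesome_plugin_sitemap_css( $css ) {
	// Arbitrary illustrative rule.
	return $css . "\nbody { font-family: Georgia, serif; }";
}

// Only hook in when WordPress is actually loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_sitemaps_stylesheet_css', 'awesome_plugin_sitemap_css' );
}
```

Appending to the existing CSS rather than replacing it keeps core’s default styling in place.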
