Merge Announcement: Extensible Core Sitemaps

This proposal seeks to integrate basic, extensibleExtensible This is the ability to add additional functionality to the code. Plugins extend the WordPress core software. XML sitemaps functionality into WordPress CoreCore Core is the set of software required to run WordPress. The Core Development Team builds WordPress..

While web crawlers are able to discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.

Purpose & Goals

Sitemaps help WordPress sites become more discoverable by providing search engines with a map of content that should be indexed. The Sitemaps protocol is a URLURL A specific web address of a website or web page on the Internet, such as a website’s URL www.wordpress.org inclusion protocol and complements robots.txt, a URL exclusion protocol.

A Sitemap is an XML file that lists the URLs for a site. Sitemaps can optionally include information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs of the site. This allows search engines to crawl the site more effectively and to discover every public URL the site has made available. 

This core sitemaps feature aims to provide the base required functionality for the Sitemaps protocol for core WordPress objects, then enables developers to extend this functionality with a robust and consistent set of filters. For example, developers can control which object types (posts, taxonomies, authors) or object subtypes (post types, taxonomies) are included, exclude specific entries, or extend sitemaps to add optional fields. See below for the full list.

Project Background

The idea of adding sitemaps to core was originally proposed in June 2019.  Since then, development has been ongoing in GitHubGitHub GitHub is a website that offers online implementation of git repositories that can easily be shared, copied and modified by other developers. Public repositories are free to host, private repositories require a paid subscription. GitHub introduced the concept of the ‘pull request’ where code changes done in branches by contributors can be reviewed and discussed before being merged be the repository owner. https://github.com/, and weekly meetings in the #core-sitemaps channel started this year to push development forward. Several versions of the feature plugin have been released on the WordPress.orgWordPress.org The community site where WordPress code is created and shared by the users. This is where you can download the source code for WordPress core, plugins and themes as well as the central location for community conversations and organization. https://wordpress.org/ pluginPlugin A plugin is a piece of software containing a group of functions that can be added to a WordPress website. They can extend functionality or add new features to your WordPress websites. WordPress plugins are written in the PHP programming language and integrate seamlessly with WordPress. These can be free in the WordPress.org Plugin Directory https://wordpress.org/plugins/ or can be cost-based plugin from a third-party repository, with the latest 0.4.1 representing the state that is considered ready to merge into core. The team is currently working on preparing the final patch to include on the Trac ticket.

Implementation Details

XML Sitemaps will be enabled by default making the following object types indexable:

  • Homepage
  • Posts page
  • Core post types (i.e. pages and posts)
  • Custom post types
  • Core taxonomies (i.e. tags and categories)
  • Custom taxonomies
  • Author archives

Additionally, the robots.txt file exposed by WordPress will reference the sitemap index.

A crucial feature of the sitemap plugin is the sitemap index. This is the main XML file that contains the listing of all the sitemap pages exposed by a WordPress site. By default, the plugin creates a sitemap index at /wp-sitemap.xml which includes sitemaps for all supported content, separated into groups by type. Each sitemap file contains a maximum of 2,000 URLs per sitemap, when that threshold is reached a new sitemap file is added.

By default, sitemaps are created for all public post types and taxonomies, as well as for author archives. Several filters exist to tweak this behavior, for example to include or exclude certain entries. Also, there are plenty of available hooksHooks In WordPress theme and development, hooks are functions that can be applied to an action or a Filter in WordPress. Actions are functions performed when a certain event occurs in WordPress. Filters allow you to modify certain functions. Arguments used to hook both filters and actions look the same. for plugins to integrate with this feature if they want to, or to disable it completely if they wish to roll their own version.

Contributors and Feedback

The following people have contributed to this project in some form or another:

Adrian McShane, @afragen, @adamsilverstein, @casiepa, @flixos90, @garrett-eclipse, @joemcgill, @kburgoine, @kraftbj, @milana_cap, @pacifika, @pbiron, @pfefferle, Ruxandra Gradina, @swissspidy, @szepeviktor, @tangrufus, @tweetythierry

With special thanks to the docs, polyglots, and security teams for their thorough reviews.

Available Hooks and Filters

Check out the feature plugin page for a full list of filters and also a few usage examples.

Frequently Asked Questions

How can I disable sitemaps?

If you update the WordPress settings to discourage search engines from indexing your site, sitemaps will be disabled. Alternatively, use the wp_sitemaps_is_enabled filterFilter Filters are one of the two types of Hooks https://codex.wordpress.org/Plugin_API/Hooks. They provide a way for functions to modify data of other functions. They are the counterpart to Actions. Unlike Actions, filters are meant to work in an isolated manner, and should never have side effects such as affecting global variables and output., or use remove_action( 'init', 'wp_sitemaps_get_server' ); to disable initialization of any sitemap functionality.

How can I disable sitemaps for a certain object type or exclude a certain item?

Using the filters referred to above – check out the feature plugin page for examples.

Does this support lastmod, changefreq, or priority attributes for sitemaps?

By default, no. Those are optional fields in the sitemaps protocol and not typically consumed by search engines. Developers can still add those fields if they want to using the filters referred to above.

lastmod in particular has not been implemented due to the added complexity of calculating the last modified dates for all object types and sitemaps with reasonable performance. For a common website with less frequent updates, lastmod does not offer additional benefits. For sites that are updated very frequently and want to use lastmod, it is recommended to use a plugin to add this functionality.

What about image/video/news sitemaps?

These sitemap extensions were declared a non-goal when the project was initially proposed, and as such are not covered by this feature. In future versions of WordPress, filters and hooks may be added to enable plugins to add such functionality.

Are there any UIUI User interface controls to exclude posts or pages from sitemaps?

No. User-facing changes were declared a non-goal when the project was initially proposed, since simply omitting a given post from a sitemap is not a guarantee that it won’t get crawled or indexed by search engines. In the spirit of “Decisions, not options”, any logic to exclude posts from sitemaps is better handled by dedicated plugins (i.e. SEO plugins). Plugins that implement a UI for relevant areas can use the new filters to enforce their settings, for example to only query content that has not been flagged with a “noindex” option.

Are there any privacy implications of listing users in sitemaps?

The sitemaps only surface the site’s author archives, and do not include any information that isn’t already publicly available on a site.

Are there any performance implications by adding this feature?

The addition of this feature does not impact regular website visitors, but only users who access the sitemap directly. Benchmarks during development of this feature showed that sitemap generation is generally very fast even for sites with thousands of posts. Thus, no additional caching for sitemaps was put in place.

If you want to optimize the sitemap generation, for example by optimizing queries or even short-circuiting any database queries, use the filters mentioned above.

What about sites with existing sitemap plugins?

Many sites already have a plugin active that implements sitemaps. For most of them, that will no longer be necessary, as the feature in WordPress core suffices. However, there is no harm in keeping them. The core sitemaps feature was built in a robust and easily extensible way. If for some reason two sitemaps are exposed on a website (one by core, one by a plugin), this does not result in any negative consequences for the site’s discoverability.

#5-5, #feature-plugins, #feature-projects, #merge-proposals, #sitemaps, #xml-sitemaps

REST API Merge Proposal, Part 2: Content API

Hi everyone, it’s your friendly REST APIREST API The REST API is an acronym for the RESTful Application Program Interface (API) that uses HTTP requests to GET, PUT, POST and DELETE data. It is how the front end of an application (think “phone app” or “website”) can communicate with the data store (think “database” or “file system”) https://developer.wordpress.org/rest-api/. team here with our second merge proposal for WordPress coreCore Core is the set of software required to run WordPress. The Core Development Team builds WordPress.. (WordPress 4.4 included the REST API Infrastructure, if you’d like to check out our previous merge proposal.) Even if you’re familiar with the REST API right now, we’ve made some changes to how the project is organised, so it’s worth reading everything here.

(If you haven’t done so already, now would be a great time to install the REST API and OAuth plugins from WordPress.orgWordPress.org The community site where WordPress code is created and shared by the users. This is where you can download the source code for WordPress core, plugins and themes as well as the central location for community conversations and organization. https://wordpress.org/.)

A brief history of the REST API

The REST API was created as a proof-of-concept by Ryan McCue (hey, that’s me!) at the WordPress Contributor Summit in 2012, but the project kicked off during the 2013 Google Summer of Code. The end result was Version 1.0, which grew into a community supported initiative that saw adoption and provided for a solid learning platform. The team used Version 1 to test out the fundamental ideas behind the APIAPI An API or Application Programming Interface is a software intermediary that allows programs to interact with each other and share data in limited, clearly defined ways., and then iterated with Version 2, which made some major breaking changes, including explicit versioning, the introduction of namespacing for forwards compatibility, and a restructure of the internals. Version 2 also led to the infrastructure of the REST API being committed to WordPress core in 4.4.

This infrastructure is the core of the REST API, and provides the external interface to send and receive RESTful HTTPHTTP HTTP is an acronym for Hyper Text Transfer Protocol. HTTP is the underlying protocol used by the World Wide Web and this protocol defines how messages are formatted and transmitted, and what actions Web servers and browsers should take in response to various commands. requests. Since shipping in 4.4, the infrastructure is now used by WordPress Core for oEmbed responses, and by plugins like WooCommerce and Jetpack, enabling anyone to create their own REST API endpoints.

The team has also been hard at work on the API endpoints. This has included core changes to WordPress to support the API, including deeper changes to both settings and meta.

Today the REST API team is proposing the inclusion of a collection of endpoints that we term the “Content API” into WordPress Core.

Proposals for Merge

Content Endpoints

For WordPress 4.7 the API team proposes to merge API endpoints for WordPress content types. These endpoints provide machine-readable external access to your WordPress site with a clear, standards-driven interface, allowing new and innovative apps for interacting with your site. These endpoints support all of the following:

  • Content:
    • Posts: Read and write access to all post data, for all types of post-based data, including pages and media.
    • Comments: Read and write access to all comment data. This includes pingbacks and trackbacks.
    • Terms: Read and write access to all term data.
    • Users: Read and write access to all user data. This includes public access to some data for post authors.
    • MetaMeta Meta is a term that refers to the inside workings of a group. For us, this is the team that works on internal WordPress sites like WordCamp Central and Make WordPress.: Read and write access to metadata for posts, comments, terms, and users, on an opt-in basis from plugins.
  • Management:
    • Settings: Read and write access to settings, on an opt-in basis from plugins and core. This enables API management of key site content values that are technically stored in options, such as site title and byline.

This merge proposal represents a complete and functional Content API, providing the necessary endpoints for mobile apps and frontends, and lays the groundwork for future releases focused on providing a Management API interface for full site administration.

Content API endpoints support both public and authenticated access. Authenticated access allows both read and write access to anything your user has access to, including post meta and settings. Public access is available for any already-public data, such as posts, terms, and limited user data for published post authors. To avoid potential privacy issues we’ve taken pains to ensure that everything we’re exposing is already public, and the API uses WordPress’ capability system extensively to ensure that all data is properly secured.

Just like the rest of WordPress, the Content API is fully extensibleExtensible This is the ability to add additional functionality to the code. Plugins extend the WordPress core software., supporting custom post meta, as well as allowing more complex data to be added via register_rest_field. The API is built around standard parts of WordPress, including the capability system and filters, so extending the API in plugins should feel as familiar to developers as extending any other part of WordPress.

This Content API is targeted at a few primary use cases, including enhancing themes with interactivity, creating powerful pluginPlugin A plugin is a piece of software containing a group of functions that can be added to a WordPress website. They can extend functionality or add new features to your WordPress websites. WordPress plugins are written in the PHP programming language and integrate seamlessly with WordPress. These can be free in the WordPress.org Plugin Directory https://wordpress.org/plugins/ or can be cost-based plugin from a third-party interfaces, building mobile and desktop applications, and providing alternative authoring experiences. We’ve been working on first-party examples of these, including a mobile app using React Native and a liveblogging web app, as well as getting feedback from others, including WIRED, the New York Times, and The Times of London. Based on experience building on the API, we’ve polished the endpoints and expanded to support settings endpoints, which are included as the first part of the Management API.

Authentication

The API Infrastructure already in WordPress core includes support for regular cookie-based authentication. This is useful for plugins and themes that want to use the API, but requires access to cookies and nonces, and is hence only useful for internal usage.

To complement the Content Endpoints, for WordPress 4.7 the API team also proposes merging the REST API OAuth 1 server plugin into WordPress Core. This plugin provides remote authentication via the OAuth 1 protocol, allowing remote servers and applications to interact securely with the WordPress API.

OAuth is a standardised system for delegated authorisation. With OAuth, rather than providing your password to a third-party app, you can authorise it to operate on your behalf. Apps are also required to be registered with the site beforehand, which gives site administrators control over third-party access. Access to these apps can be revoked by the user if they are no longer using the app, or by a site administrator. This also allows apps with known vulnerabilities to have compromised credentials revoked to protect users.

We’ve chosen OAuth 1 over the newer OAuth 2 protocol because OAuth 1 includes a complex system for request signing to ensure credentials remain secure even over unsecured HTTP, while OAuth 2 requires HTTPSHTTPS HTTPS is an acronym for Hyper Text Transfer Protocol Secure. HTTPS is the secure version of HTTP, the protocol over which data is sent between your browser and the website that you are connected to. The 'S' at the end of HTTPS stands for 'Secure'. It means all communications between your browser and the website are encrypted. This is especially helpful for protecting sensitive data like banking information. with a modern version of TLS. While it is strongly encouraged for sites to use HTTPS whenever possible (Let’s Encrypt makes it easier than ever to do so), WordPress itself does not require HTTPS and we do not believe WordPress should make HTTPS a requirement for using the API. The additional complexity that OAuth 1 adds can be easily supported by a library, and many such libraries already exist in most programming languages. OAuth 1 remains supported around the web, including for the Twitter API, and we also provide extensive documentation on using it.

Authentication Beyond 4.7

One issue with OAuth over direct username and password authentication is that it requires applications to be registered on the site. For centralized OAuth servers this wouldn’t be a problem, but the distributed nature of WordPress installations makes this tough to handle: your application must be independently registered with every WordPress site it connects to. If you’ve ever had to create a Twitter or Facebook app just to use an existing plugin on your site, you’ll know this can be a less-than-optimal experience for users.

To solve this distribution problem, we’ve created a solution called brokered authentication. This allows a centralised server (called the “broker”) to handle app registration and to vouch for these apps to individual sites. It simplifies app registration by allowing app developers to register once for all sites, and improves security by allowing the broker to vet applications and revoke them across the entire networknetwork (versus site, blog). The system is designed to allow multiple brokers; while the main broker is run at apps.wp-api.org, organisations can run their own broker for internal usage, and developers can run a broker locally for testing.

While the broker system has been running live at apps.wp-api.org for months, we want to stay conservative in our approach to the API, especially where security is concerned. We are therefore proposing brokered authentication for WordPress 4.8 to ensure we have further time to continue testing and refining the broker system. In addition, this will require an installation of the broker on a centralised server to act as the canonical broker for out-of-the-box WordPress. While apps.wp-api.org is currently acting in this role, this is currently hosted by a third-party (Human Made) on behalf of the API team. For long-term usage the broker should instead be hosted on WordPress.org, alongside the existing plugin and theme repositories. This migrationMigration Moving the code, database and media files for a website site from one server to another. Most typically done when changing hosting companies. will take time but we remain committed to continuing to develop and support the broker.

After Merge

After merging the REST API, the team plans to continue developing the API as before. We expect that integrating the REST API into WordPress core will bring additional feedback, and we plan on incorporating this feedback through the rest of the 4.7 cycle.

During the remaining parts of this release cycle and through into the 4.8 cycle, additional work will go into other parts of the API. This includes further work and refinement on the broker authentication system, including work on WordPress.org infrastructure. Additionally, we plan to continue working on the Management API endpoints, including theme and appearance endpoints to support the Customiser team. Both of these components will be maintained as separate feature projects on GitHubGitHub GitHub is a website that offers online implementation of git repositories that can easily be shared, copied and modified by other developers. Public repositories are free to host, private repositories require a paid subscription. GitHub introduced the concept of the ‘pull request’ where code changes done in branches by contributors can be reviewed and discussed before being merged be the repository owner. https://github.com/ until they’re ready for merge into core.

The team remains committed to supporting the API in core, and the Content API will switch from GitHub to TracTrac An open source project by Edgewall Software that serves as a bug tracker and project management tool for WordPress. for project management and contributions. This same process occurred for the API Infrastructure in WordPress 4.4.

Reviews and Feedback

With this merge proposal, we’re looking for feedback and review of the project. In particular, we’re focussing on feedback on the security of the API and OAuth projects, and are also reaching out to specific people for reviews. (We take the security of the API seriously, and bugbug A bug is an error or unexpected result. Performance improvements, code optimization, and are considered enhancements, not defects. After feature freeze, only bugs are dealt with, with regressions (adverse changes from the previous version) being the highest priority. reports are welcomed on HackerOne at any time.) Design and accessibilityAccessibility Accessibility (commonly shortened to a11y) refers to the design of products, devices, services, or environments for people with disabilities. The concept of accessible design ensures both “direct access” (i.e. unassisted) and “indirect access” meaning compatibility with a person’s assistive technology (for example, computer screen readers). (https://en.wikipedia.org/wiki/Accessibility) reviews for the OAuth authorisation UIUI User interface are also welcomed to ensure we maintain the high standards of WordPress core.

Both the REST API plugin and the OAuth plugin are available on WordPress.org, and issues can be reported to the GitHub tracker for the API and the OAuth plugin respectively. We have released a final betaBeta A pre-release of software that is given out to a large group of users to trial under real conditions. Beta versions have gone through alpha testing in-house and are generally fairly close in look, feel and function to the final product; however, design changes often occur as part of the process. (Beta 15 “International Drainage Commission”) which includes the meta and settings endpoints.

With Love from Us

As always, this is a merge proposal, and is not final until 4.7 is released. We’re eager to hear your thoughts and feedback; the comments below are a perfect place for that, or you can pop along to one of our regular meetings. We’re also always available in the #core-restapi room on SlackSlack Slack is a Collaborative Group Chat Platform https://slack.com/. The WordPress community has its own Slack Channel at https://make.wordpress.org/chat/..

We’d like to thank every single one of our contributors, including 88 contributors to the main repository and 23 contributors to the OAuth repository. Particular thanks goes to my (@rmccue) wonderful co-lead Rachel Baker (@rachelbaker), our 2.0 release leads Daniel Bachhuber (@danielbachuber) and Joe Hoyle (@joehoyle), and our key contributors for the 4.7 cycle: Adam Silverstein (@adamsilverstein), Brian Krogsgard (@krogsgard), David Remer (@websupporter), Edwin Cromley (@chopinbach), and K. Adam White (@kadamwhite). Thanks also to the core committers helping us out through the 4.7 cycle, including Aaron D. Campbell (@aaroncampbell) and Aaron Jorbin (@aaronjorbin), and to the fantastic release leadRelease Lead The community member ultimately responsible for the Release., Helen Hou-Sandí (@helen).

Thanks also to everyone who has used the REST API, and to you for reading this. We built the REST API for you, and we hope you like it.

With love, The REST API Team

#feature-plugins, #json-api, #merge-proposals, #rest-api