Enhancing image preview: core proposal

For many years, search engine results have shown images in the various sizes that site owners make publicly available. Last year (September 2019), Google Search introduced some extra controls over content previews, followed by Bing, which announced similar capabilities for its search engine earlier this year (April 2020).

In practice, this means that many sites do not get the benefit of large image previews, and may be losing out on traffic. Today, WordPress sites do not opt in to large image previews by default, even when the “Search Engine Visibility” setting is turned on.

Below is an example comparison of Discover content for small image preview vs large image preview:

Proposed Solution 

This proposal is to opt in to large image previews by default when the “Search Engine Visibility” setting is turned on. This allows search engines to display large images, which should result in an enhanced user experience and a higher CTR (click-through rate).

Theoretically, this is as simple as conditionally injecting <meta name="robots" content="max-image-preview:large"> in the HTML head of all pages.

WordPress may already inject a “robots” meta tag into a page, for example when a site is set to disallow search engines from indexing it. To facilitate large image previews as well as exposing a central management layer for the “robots” meta tag, a new function wp_robots() should be introduced. The function would include a filterable list of values to include in the “robots” meta tag and render the meta tag only if necessary. Having this centralized layer will streamline robots management and interoperability between plugins.

By default, the list would cover the following values:

  • noindex, to be included when search engines are disallowed from indexing the page or due to certain other circumstances
  • nofollow, to be included alongside noindex when search engines are disallowed from indexing the page
  • follow, to be included alongside noindex when noindex is provided due to certain other circumstances
  • max-image-preview:large, to be included when search engines are allowed to index the page, and when large preview images may be used for the page

This would only be the default behavior and could be expanded or modified by plugins, for example to add additional robots tag directives.

The function would be hooked into wp_head and other relevant actions, and it would essentially supersede the existing noindex() and wp_no_robots() functions.
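The default behavior described above can be sketched as follows. This is a language-neutral illustration in Python rather than WordPress code: the function names, parameters, and the shape of the filter callbacks are assumptions made for the example, and the real wp_robots() may end up looking quite different.

```python
def robots_directives(indexing_allowed, noindex_for_other_reason=False,
                      allow_large_previews=True, filters=()):
    """Return the default "robots" directives as an ordered list.

    All parameter names are hypothetical; they only mirror the cases
    listed in the proposal.
    """
    directives = []
    if not indexing_allowed:
        # Search engines are disallowed: noindex together with nofollow.
        directives += ["noindex", "nofollow"]
    elif noindex_for_other_reason:
        # noindex due to other circumstances: links may still be followed.
        directives += ["noindex", "follow"]
    elif allow_large_previews:
        # Page is indexable, so opt in to large image previews.
        directives.append("max-image-preview:large")

    # Let "plugins" expand or modify the list (stand-in for a filter hook).
    for callback in filters:
        directives = callback(directives)
    return directives


def render_robots_meta(directives):
    """Render the meta tag, or an empty string when no directive applies."""
    if not directives:
        return ""
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


print(render_robots_meta(robots_directives(indexing_allowed=True)))
print(render_robots_meta(robots_directives(indexing_allowed=False)))
```

Rendering nothing for an empty list matches the idea that the tag is output only if necessary.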

What’s next?

Your thoughts on this proposal would be greatly valued. Please share your feedback, questions or interest in collaboration by commenting on this post. After that we can create a Trac ticket and kick start development.

#images, #proposal, #seo

XML Sitemaps Meeting: February 25th, 2020

Last week we held the first of many weekly meetings for the XML Sitemaps feature project on Slack.

Meeting Recap: February 18th

We had quite a few people attending, not all of whom were familiar with the project. Thus, we started off with a small recap of the project’s scope and goals. After that we discussed various different topics:

  • How to modify the sitemaps to include/exclude certain URLs
    A pull request has been opened to add a FAQ section to the readme that aims to answer these kinds of questions.
    Also, a new way to filter WP_Query instances used for sitemaps has been proposed.
  • Why are there no changefreq and priority fields?
    Those are optional fields in the sitemaps protocol and not typically consumed by search engines. The feature plugin follows other solutions like Yoast SEO, which also don’t include those fields.
    Developers can still add those fields if they really want to.
  • Will there be UI controls to include/exclude content from sitemaps?
    Adding UI controls is currently a non-goal for the project.
  • Calculating the last modified date for URLs
    This is rather difficult and computationally expensive in WordPress. Given that sitemaps are first and foremost a discovery mechanism for content, having this data is not necessarily required. We will explore omitting this functionality (GitHub issue).
  • The default limit of 2000 URLs per sitemap is considered high and might need to be re-evaluated.
  • Potential compatibility issues with other XML Sitemaps plugins have been discussed.
    If a site ends up having two sitemaps by accident, that wouldn’t be bad. However, the current /sitemap.xml URL might clash with other plugins. A GitHub issue has been opened suggesting /wp-sitemap.xml as the base, which would avoid conflicts in this regard.

Agenda: February 25th

The next meeting will be held on Tuesday, February 25 at 16:00 CET.

For tomorrow’s meeting, the agenda is rather brief:

  • Updates since last week (merged changes, new issues)
  • Next steps for proposed lastmod changes
  • Next steps for URL naming change
  • Planning release of version 0.2.0

This meeting is held in the #core-sitemaps channel. To join the meeting, you’ll need an account on the Making WordPress Slack.

#agenda, #feature-plugins, #feature-projects, #seo, #sitemaps, #xml-sitemaps

XML Sitemaps Kickoff Meeting Announcement

A few weeks ago an update was posted for the XML Sitemaps feature project to give everyone an idea of where it is heading.

Now, we want to gather more contributors around the feature plugin and get your feedback on the project. For this, we’re kicking off regular meetings in the brand new #core-sitemaps Slack channel.

The first meeting will be held on Tuesday, February 18 at 16.00 CET and will serve as an introduction to the project and an opportunity to discuss the next steps. As such, there is currently no formal agenda for this inaugural meeting.

However, if you have anything specific that you’d like to propose being discussed in this meeting, feel free to leave a comment below.

This meeting is held in the #core-sitemaps channel. To join the meeting, you’ll need an account on the Making WordPress Slack.

#feature-plugins, #feature-projects, #seo, #sitemaps, #xml-sitemaps

Feature Plugin: XML Sitemaps

Last year, a group of contributors posted a proposal to implement native XML Sitemaps in WordPress Core, which received lots of interest and feedback from the community. Since then, we have been working on an XML Sitemap feature plugin (MVP), which is now available for testing and feedback.

Props to the contributors working on this plugin and co-authoring the content of this post: Sander van Dragt, Kirsty Burgoine, Adrian McShane, Ruxandra Gradina, Joe McGill, Thierry Muller, Pascal Birchler

Feature overview

As a quick reminder of what this project is trying to achieve, here are the main features as described in the initial project proposal, which we would encourage you to read in its entirety.

XML Sitemaps will be enabled by default, making the following content types indexable:

– Homepage
– Posts page
– Core Post Types (Pages and Posts)
– Custom Post Types
– Core Taxonomies (Tags and Categories)
– Custom Taxonomies
– Users (Authors)

Additionally, the robots.txt file exposed by WordPress will reference the sitemap index.

An XML Sitemaps API also ships with the plugin so that developers can build on top of it.

The approach

In order to fulfil these initial requirements, we researched the way several existing popular plugins implement this functionality, and came up with an approach that we believe combines many of the best ideas from each.

The sitemap index

A crucial feature of the sitemap plugin is the sitemap index. This is the main XML file that contains the listing of all the sitemap pages exposed by your WordPress site and the time each was last modified. By default, the plugin creates a sitemap index at /sitemap.xml which includes sitemaps for all supported content, separated into groups by post types, taxonomies, and users.
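As a rough illustration of what such an index contains, the sketch below builds a sitemap index document from a list of sitemap URLs and last modified times. It follows the element names of the public sitemaps protocol (sitemapindex, sitemap, loc, lastmod); the function name, the example URLs, and the timestamps are made up for the example and are not taken from the plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Build a sitemap index from (url, lastmod) pairs.

    lastmod is a W3C datetime string, or None to omit the element.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with one sitemap page per group.
print(build_sitemap_index([
    ("https://example.com/sitemap-posts-post-1.xml", "2020-02-18T16:00:00+00:00"),
    ("https://example.com/sitemap-taxonomies-category-1.xml", "2020-02-17T09:30:00+00:00"),
    ("https://example.com/sitemap-users-1.xml", None),
]))
```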

Sitemap pages

Each sitemap page will be available at a URL using the following structure: sitemap-{object-type}-{object-subtype}-{page}.xml. Some examples of this structure applied to real content include:

  • Post type – posts: sitemap-posts-post-1.xml 
  • Post type – pages: sitemap-posts-page-1.xml
  • Taxonomy – categories: sitemap-taxonomies-category-1.xml
  • Users – sitemap-users-1.xml (note that the WP_User object doesn’t support sub-types)
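The naming scheme above can be expressed as a small helper. This is only a sketch of the pattern, assuming that the sub-type segment is simply left out for object types without sub-types (as in the users example); the function name is hypothetical.

```python
def sitemap_filename(object_type, page, object_subtype=None):
    """Build a name following sitemap-{object-type}-{object-subtype}-{page}.xml."""
    if page < 1:
        raise ValueError("sitemap pages are numbered from 1")
    parts = ["sitemap", object_type]
    if object_subtype:
        # Users have no sub-type, so this segment is optional.
        parts.append(object_subtype)
    parts.append(str(page))
    return "-".join(parts) + ".xml"


print(sitemap_filename("posts", 1, "post"))
print(sitemap_filename("taxonomies", 1, "category"))
print(sitemap_filename("users", 1))
```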

The official sitemaps protocol asserts that each sitemap can contain a maximum of 50,000 URLs and must be no larger than 50MB (52,428,800 bytes). However, in practice, we found that performance begins to degrade when trying to generate a query that returns more than a few thousand URLs. For that reason, we’ve decided to limit the default implementation to a maximum of 2,000 URLs per sitemap, which can be modified by using a filter on the core_sitemaps_max_urls hook.
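To make the effect of the per-sitemap limit concrete, here is a minimal sketch of how a list of URLs splits into sitemap pages. The constants come from the numbers quoted above; the function names are hypothetical, and the plugin itself paginates its database queries rather than slicing an in-memory list.

```python
PROTOCOL_MAX_URLS = 50000   # hard limit from the sitemaps protocol
DEFAULT_MAX_URLS = 2000     # default chosen by the feature plugin


def sitemap_page_count(total_urls, max_urls=DEFAULT_MAX_URLS):
    """Number of sitemap pages needed for total_urls entries."""
    if not 1 <= max_urls <= PROTOCOL_MAX_URLS:
        raise ValueError("max_urls must be between 1 and 50,000")
    # Integer ceiling division.
    return (total_urls + max_urls - 1) // max_urls


def sitemap_page(urls, page, max_urls=DEFAULT_MAX_URLS):
    """Return the slice of urls that belongs on the given 1-based page."""
    start = (page - 1) * max_urls
    return urls[start:start + max_urls]


# Hypothetical site with 4,500 posts.
urls = ["https://example.com/?p={}".format(i) for i in range(1, 4501)]
print(sitemap_page_count(len(urls)))   # 3
print(len(sitemap_page(urls, 3)))      # 500
```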

Sitemap pages for each public post type (except attachments) will be generated, which include URLs to individual post pages. Likewise, sitemaps will be generated for all public taxonomies, which include URLs to taxonomy archive pages, and sitemaps will be generated for all users with published public posts, which includes the URL for each user’s author archive page. The list of supported sub-types for posts and taxonomies can be filtered using the core_sitemaps_post_types and core_sitemaps_taxonomies filters, respectively. Additionally, URLs for any object type can be added or removed using the following filters:

  • Post types: core_sitemaps_posts_url_list
  • Taxonomies: core_sitemaps_taxonomies_url_list
  • Users: core_sitemaps_users_url_list
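For readers less familiar with filters, the sketch below mimics the mechanism in Python: callbacks registered on a hook name each receive the current value and return a modified one. The add_filter and apply_filters helpers here are simplified stand-ins written for this example (no priorities, for instance), and the arguments passed to the callback (a list of entries with a loc key, plus the post type) are assumptions, since the post does not spell out the real filter signatures.

```python
from collections import defaultdict

# Hook name -> list of registered callbacks (simplified: no priorities).
_filters = defaultdict(list)


def add_filter(hook, callback):
    """Register a callback on a hook."""
    _filters[hook].append(callback)


def apply_filters(hook, value, *args):
    """Pass value through every callback registered on the hook."""
    for callback in _filters[hook]:
        value = callback(value, *args)
    return value


def exclude_landing_page(url_list, post_type):
    """Hypothetical callback: drop one page from the posts sitemap."""
    return [entry for entry in url_list
            if entry["loc"] != "https://example.com/landing/"]


add_filter("core_sitemaps_posts_url_list", exclude_landing_page)

url_list = [
    {"loc": "https://example.com/about/"},
    {"loc": "https://example.com/landing/"},
]
print(apply_filters("core_sitemaps_posts_url_list", url_list, "page"))
```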

Performance and scalability

Adding an XML Sitemaps caching mechanism was specifically listed as a non-goal of the project, so we have not included one. However, we did want to ensure that the initial version of the plugin took scalability into consideration, so we spent time researching the major scalability issues present in current popular implementations and ways of solving those problems.

By using best practices for making our main queries performant, we were able to eliminate most scalability problems from individual sitemap pages. However, the main performance problem is generating last modified times for each page in the sitemap index. It’s not scalable to calculate these values dynamically, so instead, we’ve started with an implementation that updates these values using a WP_Cron task that runs twice daily, and saves these values in the options table.

We’ve also begun researching and writing up an implementation for a more robust sitemap page caching mechanism, using custom post types to store and update sitemap data, which can be further explored if the current approach proves to be insufficient as an initial implementation for core (see #1 and #39 for more details).

Next steps

Announcing the first version of this feature plugin is a major milestone, but is only an early step in the process of having this functionality included in WordPress Core. Now is when we need your help to test, validate, and improve what we have built to ensure that we meet the needs of the broad WordPress community. We are also encouraging sitemap plugin authors to integrate with the plugin, specifically leveraging the sitemaps API to extend its core functionality.

We will kick start weekly meetings on WordPress Slack in the very near future. In the meantime, we would encourage anyone interested to join now and begin discussion about this feature. Additionally, you can leave questions and feedback in the comments section of this post or as new issues on the GitHub repo.

Thanks for reading!

#feature-plugins, #feature-projects, #seo, #sitemaps, #xml-sitemaps

XML Sitemaps Feature Project Proposal

Note: a follow-up post was published with more recent information about this project.

While web crawlers usually discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to pick up all URLs included in the sitemap and learn about those URLs using the associated metadata.

Today, WordPress core does not generate XML Sitemaps by default, which affects the search engine discoverability of a large number of WordPress websites. 4 out of the top 15 plugins in the WordPress plugin repository currently ship with their own implementation of XML sitemaps, pointing to a universal need for this feature and a great potential to join forces.

This post proposes integration of XML Sitemaps to WordPress Core as a feature project. The proposal was created as a collaboration between Yoast*, Google** and various contributors.

Proposed Solution

In a nutshell, the goal of the proposal is to integrate basic XML Sitemaps in WordPress Core and introduce an XML Sitemaps API to make it fully extendable. Below is a diagram of the proposed XML Sitemaps structure:

XML Sitemaps will be enabled by default, making the following content types indexable:

  • Homepage
  • Posts page
  • Core Post Types (Pages and Posts)
  • Custom Post Types
  • Core Taxonomies (Tags and Categories)
  • Custom Taxonomies
  • Users (Authors)

Additionally, the robots.txt file exposed by WordPress will reference the sitemap index.
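Referencing a sitemap from robots.txt is done with a Sitemap: line containing the full URL of the index. The sketch below appends such a line to existing robots.txt content. The function name is hypothetical, and the /sitemap.xml path is an assumption borrowed from the default index location described in the follow-up feature plugin post; this proposal does not fix a path.

```python
def append_sitemap_reference(robots_txt, site_url, index_path="/sitemap.xml"):
    """Return robots.txt content with a Sitemap: line for the index.

    index_path is an assumed default, not something this proposal defines.
    """
    sitemap_line = "Sitemap: {}{}".format(site_url.rstrip("/"), index_path)
    if sitemap_line in robots_txt.splitlines():
        # Already referenced; leave the content untouched.
        return robots_txt
    body = robots_txt.rstrip("\n")
    separator = "\n\n" if body else ""
    return body + separator + sitemap_line + "\n"


existing = "User-agent: *\nDisallow: /wp-admin/\n"
print(append_sitemap_reference(existing, "https://example.com/"))
```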

Developers

An XML Sitemaps API will be introduced as part of the integration allowing extensibility. At a high level, below is a list of the ways the XML Sitemaps may be manipulated via the API:

  • Add extra sitemaps and sitemap entries
  • Add extra attributes to sitemap entries
  • Provide a custom XML Stylesheet
  • Exclude a specific post type from the sitemap
  • Exclude a specific post from the sitemap
  • Exclude a specific taxonomy from the sitemap
  • Exclude a specific term from the sitemap
  • Exclude a specific author from the sitemap
  • Exclude authors with a specific role from the sitemap

Non Goals

While the initial XML Sitemaps integration will fulfill search engines’ minimum requirements and cover most WordPress content types, below is a list of features which will not be included in the initial integration:

  • Image sitemaps
  • Video sitemaps
  • News sitemaps
  • User-facing changes like UI controls to exclude individual posts or pages from the sitemap
  • XML Sitemaps caching mechanisms

i18n

The XML Sitemaps will leverage standard internationalization functionality provided by WordPress core.

Since there are plans by WordPress leadership to officially support multilingual websites in WordPress, the XML Sitemaps will be flexible enough to list localized content in the future as per web development best practices.

What’s next?

Your thoughts on this proposal would be greatly valued. Please share your feedback, questions or interest in collaboration by commenting on this post. After that we can decide on how to best proceed with this proposed project and set up a meeting on Slack to kick things off.

* @joostdevalk, @omarreiss, @jonoalderson, @herregroen

** @swissspidy @albertomedina @westonruter @flixos90 @tweetythierry

#feature-plugins, #feature-projects, #proposal, #seo, #sitemaps, #xml-sitemaps

wp_title() now consistently handles reve …

wp_title() now consistently handles reverse order title breadcrumbs. So in “right” mode, you get “July (sep) 2008 (sep) Blogname.” SEO plugin authors may want to take note.

#seo, #title, #wp_title