Felix Arntz 7:35 pm on April 5, 2024
Tags: analysis, core, core web vitals, performance, plugin, theme

Conducting WordPress performance research in the field

Over the past few years, several Make Core posts, for example the 2023 performance retrospective, have referenced field data based on how real users experience millions of real WordPress sites. Such field data can help gather metrics of many different kinds, such as adoption of a feature or even its performance impact. As such, it can be instrumental in demonstrating the success of, or potential concerns about, a feature or enhancement.

Gathering this data can be accomplished using public datasets like those from HTTP Archive and the Chrome User Experience Report (CrUX). However, doing so requires writing BigQuery queries, which may not be trivial because BigQuery is a separate technology that is not otherwise relevant to WordPress core, plugin, or theme development.

To provide a better starting point for those new to BigQuery, HTTP Archive, and CrUX, members of the WordPress Performance Team and HTTP Archive have collaborated on a tutorial and reference Colab.

Whether you are new to those technologies or have already written a few BigQuery queries, the Colab provides an introduction and can help build more familiarity. It only assumes some familiarity with SQL in general, such as from writing custom database queries in WordPress. The Colab comes with several queries, alongside their results, which can be used as a reference, and covers use-cases relevant to WordPress core development as well as plugin and theme development. It can be considered a “living resource”, i.e. expect it to be updated and expanded in the future.

Other than this post, you can also find the Colab linked from a new Make Performance Handbook article on gathering WordPress performance data in the field.

If you are interested in field research around WordPress sites, you may want to take a look and work through the Colab. As it contains a lot of content, please feel free to work through it in multiple sessions.


Felix Arntz 1:59 pm on December 19, 2023
Tags: analysis, core web vitals, core-performance, performance

WordPress performance impact on Core Web Vitals in 2023

This post summarizes and highlights the impact that WordPress has had on Core Web Vitals (CWV) in the field in 2023, providing a metric-based retrospective at the end of the year.

TL;DR: The WordPress performance team and all WordPress contributors can be very proud of the accomplishments: The overall CWV passing rate across all WordPress sites has improved from 28.31% to 36.44% (+8.13%) on mobile devices and from 32.55% to 40.80% (+8.25%) on desktop devices. 🎉

These improvements led to a visible increase of CWV passing rates even for the entire web. The performance team is currently discussing additional findings to define the focus for 2024 and is looking for further proposals and contributors for next year.

Note: This post is based on the slide deck used in a presentation for the WordPress performance year-end hallway hangout (also see recording). Feel free to review the deck as well as an alternative way to consume the numbers.

A few notes on CWV field metrics

Before looking at the 2023 metrics in more depth, a few things should be clarified.

The metrics shared in this post are exclusively field metrics. They are distinctly different from lab metrics, which are the metrics that have typically been shared in the WordPress release performance summary posts this year:

Lab metrics are benchmarks conducted on demand, typically as a synthetic A/B performance comparison. They provide an indication of whether/how a specific change is anticipated to impact performance. For example, for the lab metrics shared for the WordPress releases in the aforementioned posts, the load time performance of the respective new WordPress version was compared to that of the previous version, on the exact same setup, so that the only different variable is the WordPress version.

Field metrics on the other hand are analytics data collected from site usage and provide an indication of how performance is actually experienced by real users on real sites. For example, the Chrome User Experience Report (CrUX) provides a public dataset of performance field data from opted-in Chrome users, aggregated at the site level. The WordPress 6.3 field impact summary post shared metrics queried from that dataset, granularly broken down for just that release compared to the previous 6.2 release.

When reviewing field metrics, it is important to consider the following:

Field metrics are influenced by a myriad of factors, for example the active plugins and theme, the WordPress core version, the hosting stack, the browser(s) used by end users, their network connection, and more. So when comparing field data between two months, for example, it is impossible to limit the comparison to just one specific aspect of those.
Field metrics do not allow direct correlation between specific enhancements and their concrete metric impact. This is closely related to the previous point: Even if you compare the performance of a dataset of sites that enabled a specific feature before and after the change, for example, there may have been numerous other changes during that time that influence the data. Effectively, it is impossible to conduct clean A/B tests in the field.
Field metrics are more meaningful the larger the dataset is. Because they are only indicators rather than “proof”, the larger the relevant dataset, the more a specific metric observation from it can be trusted. It’s also worth considering that the web’s user base is incredibly diverse, so even the audience that a specific site has matters significantly for its field performance.

And yet, despite all these caveats, field metrics are the only way to validate how beneficial performance changes really are. If a performance “improvement” doesn’t impact real users, it effectively doesn’t matter even if it looked great in the lab.

CWV breakdown and assessment

For a brief recap on Core Web Vitals (CWV), they are a set of three specific metrics:

Largest Contentful Paint (LCP) measures how fast the page loads (load time performance).
First Input Delay (FID) measures how quickly interactions work (interactivity).
Cumulative Layout Shift (CLS) measures how stable page elements are (layout stability).

Each of these metrics has thresholds for whether a value is considered “good”, whether it “needs improvement” or whether it is “poor”.

Figure: The three CWV metrics and their “good” thresholds: LCP (2.5s or lower), FID (100ms or lower), and CLS (0.1 or lower).

These metrics can be captured for every single navigation / page load. A page load is then only considered to have “good” CWV if all three metrics show a “good” value. In other words, for a single navigation the CWV assessment is simply a binary metric of “good” or “not good”.
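The per-navigation assessment described above can be sketched in a few lines of Python. This is only an illustration: the `Navigation` record and the `has_good_cwv` helper are hypothetical names, and the thresholds are the “good” values quoted in this post.

```python
from dataclasses import dataclass

# "Good" thresholds as quoted in this post.
GOOD_LCP_MS = 2500  # Largest Contentful Paint: 2.5s or lower
GOOD_FID_MS = 100   # First Input Delay: 100ms or lower
GOOD_CLS = 0.1      # Cumulative Layout Shift: 0.1 or lower


@dataclass
class Navigation:
    """Hypothetical record holding the three CWV values of one page load."""
    lcp_ms: float
    fid_ms: float
    cls: float


def has_good_cwv(nav: Navigation) -> bool:
    """Return True only if all three metrics are within their "good" thresholds."""
    return (
        nav.lcp_ms <= GOOD_LCP_MS
        and nav.fid_ms <= GOOD_FID_MS
        and nav.cls <= GOOD_CLS
    )


fast = Navigation(lcp_ms=1800, fid_ms=40, cls=0.02)
slow_lcp = Navigation(lcp_ms=3100, fid_ms=40, cls=0.02)
print(has_good_cwv(fast))      # True
print(has_good_cwv(slow_lcp))  # False: LCP alone is enough to fail
```

A single metric outside its threshold makes the whole navigation “not good”, which is why the assessment is binary.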

When looking at actual CWV datasets in the field, metrics are always aggregated, typically at the site / origin level. In other words, such an aggregation encompasses all navigations that happened on that site in a given time frame. In that dimension, the CWV assessment is considered “good” if 75% or more of the navigations have a “good” CWV assessment.

Last but not least, CWV can also be assessed at a larger scale, for example across all sites using a specific technology, like WordPress. In that scenario, CWV are typically measured through a “passing rate” which describes the percentage of sites that have a “good” CWV assessment. For example, if in a dataset of 1 million sites 200,000 of them have a “good” CWV assessment, the resulting passing rate is 20%.
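To make the two aggregation levels more concrete, here is a minimal Python sketch that follows the simplified description above. The function names and the sample sites are made up for illustration; each site is represented as a list of per-navigation “good” / “not good” results.

```python
def site_is_good(navigations_good: list[bool], threshold: float = 0.75) -> bool:
    """A site is "good" if at least `threshold` of its navigations are "good"."""
    if not navigations_good:
        return False
    return sum(navigations_good) / len(navigations_good) >= threshold


def passing_rate(sites: dict[str, list[bool]]) -> float:
    """Percentage of sites in the dataset that have a "good" assessment."""
    if not sites:
        return 0.0
    passing = sum(1 for navigations in sites.values() if site_is_good(navigations))
    return 100.0 * passing / len(sites)


# Hypothetical dataset: True means the navigation had "good" CWV.
sites = {
    "a.example": [True, True, True, False],    # 75% good, passes
    "b.example": [True, False, False, False],  # 25% good, fails
    "c.example": [True, True, True, True],     # 100% good, passes
    "d.example": [False, False, False, False], # 0% good, fails
    "e.example": [True, True, False, False],   # 50% good, fails
}
print(passing_rate(sites))  # 40.0
```

With 2 of the 5 sample sites passing, the passing rate is 40%, in the same way that 200,000 of 1 million sites yields 20%.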

Note that not only the overarching CWV passing rate can be measured, but also the passing rates for the individual metrics (LCP, FID, CLS and others). The CWV passing rate is however the most meaningful single metric as it is a summary of all of them.

But now let’s take a look at the data. Note: All metrics highlighted in this post are CrUX passing rates from the field, based on year-over-year comparisons between October 2023 and October 2022 (unless otherwise indicated).

WordPress CWV in 2023

As already mentioned in the “TL;DR” at the beginning of this post, CWV for WordPress have improved significantly this year:

Mobile CWV passing rate improved 8.13% (from 28.31% to 36.44%).
Desktop CWV passing rate improved 8.25% (from 32.55% to 40.80%).

While at first glance an increase of ~8 percentage points may not sound like much, it is a quite substantial boost to the passing rates, particularly when considering their base values. Relatively speaking, the new passing rate is ~29% higher than the old one on mobile and ~25% higher on desktop.
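The difference between the absolute and the relative improvement can be verified with a short calculation. The helper names below are illustrative; the input values are the passing rates quoted above.

```python
def absolute_change(before: float, after: float) -> float:
    """Difference between two passing rates, in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change relative to the starting passing rate, in percent."""
    return (after - before) / before * 100


for device, before, after in [("mobile", 28.31, 36.44), ("desktop", 32.55, 40.80)]:
    print(
        f"{device}: +{absolute_change(before, after):.2f} points, "
        f"+{relative_change(before, after):.1f}% relative"
    )
# mobile: +8.13 points, +28.7% relative
# desktop: +8.25 points, +25.3% relative
```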

For reference: The previous year’s improvement for CWV passing rate was 6.99% on mobile and 6.25% for desktop. So while WordPress already did a great job in the previous year, this year even exceeded those accomplishments.

Figure: Line chart of WordPress’s mobile CWV passing rate throughout the year, gradually improving from below 30% to over 35%, with a small drop between March and April 2023 annotated “LCP algorithm slightly changed”. Note that this decline is not a result of a WordPress-specific problem (or any performance issue at all), but rather of a change in how the LCP metric is calculated, which was rolled out in that month.

Let’s take a closer look at the individual metrics that make up CWV and how they changed this year. For simplicity, only the mobile numbers are shown below. They are slightly more important than desktop results since mobile traffic overall is higher than desktop traffic. More importantly, performance improvements carry more importance for mobile devices as they are typically less powerful and are subject to worse network conditions. It is also worth noting that the corresponding desktop numbers don’t show any notable differences in trend.

Mobile LCP passing rate improved 8.89% (from 34.48% to 43.37%).
Mobile CLS passing rate improved 4.22% (from 74.76% to 78.98%).
Mobile FID passing rate improved 0.87% (from 96.55% to 97.42%).

As you can see, LCP saw by far the largest boost. This is an excellent outcome, as improving LCP was the main focus for the WordPress performance team this year. The rationale behind this may also be obvious when looking at the base values: The LCP passing rate of WordPress sites is the lowest performing metric, so it deserves the most attention. On the flip side, even though FID only improved by less than 1%, that is perfectly fine given its passing rate is already so high.

Based on these metrics, it can furthermore be concluded that LCP was the primary driver behind the overall CWV passing rate improvements, which confirms the focus on this metric has made sense.

Last but not least, the Time to First Byte (TTFB) passing rate should be highlighted as well: While TTFB is not a Core Web Vitals metric, it is a direct part of LCP (specifically denoting its server-side load time performance portion), and it was another partial focus this year both because of its impact on LCP and because its passing rate is very low. Here is how it improved this year:

Mobile TTFB passing rate improved 3.10% (from 18.67% to 21.77%).
Desktop TTFB passing rate improved 3.53% (from 28.44% to 31.97%).

WordPress 2023 releases impact

This section focuses on the load time performance impact of the three new WordPress versions released this year, 6.2, 6.3, and 6.4. Since the focus for all of these releases was load time performance, it is sufficient to focus solely on LCP and TTFB.

While the overarching WordPress metrics from the previous section were based on a broad year-over-year comparison, the metrics to assess the release impact were queried with a more granular approach: For each WordPress version, a dataset was established between two months based on only the intersection of sites that were on the previous WordPress version in the first month and on the newer WordPress version in the second month. The months were then chosen in a way to maximize the size of the dataset (e.g. a WordPress version always sees the highest usage in the month before the subsequent version is released).

While that approach is still by no means an A/B comparison, it eliminates at least a good portion of noise e.g. from sites on other WordPress versions or sites that newly entered or dropped out of the dataset.
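As a rough illustration of that cohort approach, the sketch below uses plain Python sets instead of the actual BigQuery queries. All site names, version values, and pass/fail flags are made up, and it assumes versions have already been normalized to “major.minor”.

```python
def upgraded_cohort(
    first_month: dict[str, str],
    second_month: dict[str, str],
    old_version: str,
    new_version: str,
) -> set[str]:
    """Sites on `old_version` in the first month AND on `new_version` in the second."""
    on_old = {site for site, version in first_month.items() if version == old_version}
    on_new = {site for site, version in second_month.items() if version == new_version}
    return on_old & on_new


def cohort_passing_rate(cohort: set[str], is_good: dict[str, bool]) -> float:
    """Percentage of cohort sites whose metric is "good" in a given month."""
    if not cohort:
        return 0.0
    return 100.0 * sum(is_good.get(site, False) for site in cohort) / len(cohort)


# Hypothetical detected WordPress versions per site and month.
july = {"a.example": "6.2", "b.example": "6.2", "c.example": "6.1", "d.example": "6.2"}
october = {"a.example": "6.3", "b.example": "6.3", "c.example": "6.3",
           "d.example": "6.2", "e.example": "6.3"}

cohort = upgraded_cohort(july, october, old_version="6.2", new_version="6.3")
print(sorted(cohort))  # ['a.example', 'b.example']

# Hypothetical "good LCP" flags for the same two months.
lcp_good_july = {"a.example": False, "b.example": True}
lcp_good_october = {"a.example": True, "b.example": True}
print(cohort_passing_rate(cohort, lcp_good_july))     # 50.0
print(cohort_passing_rate(cohort, lcp_good_october))  # 100.0
```

Sites that skipped a version, did not upgrade, or only appear in one of the two months fall out of the cohort, which is where the noise reduction comes from.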

WordPress 6.2 LCP and TTFB

All metric comparisons are based on the intersection of WordPress 6.1 sites in March and WordPress 6.2 sites in July. See relevant WordPress 6.2 slide.

Mobile LCP passing rate improved 0.01% (from 34.98% to 34.99%).
Mobile TTFB passing rate improved 0.65% (from 18.47% to 19.12%).
Desktop LCP passing rate improved 2.13% (from 46.85% to 48.98%).
Desktop TTFB passing rate improved 3.89% (from 25.79% to 29.68%).

WordPress 6.3 LCP and TTFB

All metric comparisons are based on the intersection of WordPress 6.2 sites in July and WordPress 6.3 sites in October. See relevant WordPress 6.3 slide.

Mobile LCP passing rate improved 4.72% (from 34.46% to 39.18%).
Mobile TTFB passing rate improved 0.78% (from 18.78% to 19.56%).
Desktop LCP passing rate improved 1.96% (from 48.55% to 50.51%).
Desktop TTFB passing rate decreased 2.15% (from 29.28% to 27.13%).

WordPress 6.4 LCP and TTFB

All metric comparisons are based on the intersection of WordPress 6.3 sites in October and WordPress 6.4 sites in November (since newer data is not available yet). See relevant WordPress 6.4 slide.

Mobile LCP passing rate improved 0.30% (from 37.40% to 37.70%).
Mobile TTFB passing rate improved 0.11% (from 18.21% to 18.32%).
Desktop LCP passing rate improved 0.13% (from 49.46% to 49.59%).
Desktop TTFB passing rate decreased 0.31% (from 25.88% to 25.57%).

Remember that all of the above metrics are just indicators and an approximation of the field impact of those releases, influenced by several factors. It’s also great to keep in mind that the adoption of those versions will continue to grow. For example, as of November ~68% of all WordPress sites were using version 6.2 or newer based on the dataset. As the adoption increases further, the performance wins from those releases will continue to scale horizontally and benefit more sites.

WordPress 2023 impact on the web

WordPress, with its high usage, has a large footprint on the web, making a significant impact on the entire internet. When WordPress performance improves, the web’s performance improves. Therefore, last but not least, let’s look at the performance impact that WordPress has had on the web overall.

A good starting point for that is to compare the 2023 CWV passing rate improvement of all WordPress sites with that of all sites not using WordPress:

As mentioned before, WordPress’s mobile CWV passing rate improved 8.13%.
The non-WordPress mobile CWV passing rate improved by 3.68%.
Similarly as mentioned before, WordPress’s desktop CWV passing rate improved 8.25%.
The non-WordPress desktop CWV passing rate improved by 5.29%.

From those numbers alone, it is clear that WordPress has made notably more progress in performance than the rest of the web, which is an amazing achievement.

It is furthermore possible to draw some conclusions on how large WordPress’s impact on the overall web’s improvement is. For reference, the overall web’s CWV passing rate improved 5.35% on mobile and 6.26% on desktop. Based on that, a simple calculation can be used, subtracting the non-WordPress CWV passing rate improvement from that of the entire web. This leads to the following results:

1.67 percentage points of the overall web’s mobile CWV improvement of 5.35% come directly from WordPress.
0.97 percentage points of the overall web’s desktop CWV improvement of 6.26% come directly from WordPress.
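The simple calculation behind those two numbers, using the improvements quoted above, looks like this (the function name is just for illustration):

```python
def wordpress_contribution(overall_improvement: float, non_wp_improvement: float) -> float:
    """Overall web improvement minus the non-WordPress improvement."""
    return overall_improvement - non_wp_improvement


print(f"mobile: {wordpress_contribution(5.35, 3.68):.2f}")   # mobile: 1.67
print(f"desktop: {wordpress_contribution(6.26, 5.29):.2f}")  # desktop: 0.97
```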

While those numbers may seem small, this is literally WordPress’s impact on CWV for the entire web! It is an excellent reminder of how important WordPress’s role is for the web and how contributing to WordPress not only improves the WordPress ecosystem but also the entire web.

Considerations for 2024

With this recap of the 2023 CWV impact of WordPress, it is time to look ahead and plan for 2024. In particular there are two important considerations for the performance focus next year.

INP replaces FID as a Core Web Vitals metric

Earlier this year, a notable change to the CWV metrics was announced: The more recently introduced metric Interaction to Next Paint (INP) will replace First Input Delay (FID) in early 2024. Please refer to the linked articles for additional information on the metric and that change. The very short summary is that INP measures interactivity more accurately than FID, and due to its limitations FID ended up having very high passing rates throughout the web ecosystem – almost perfect as seen in the WordPress numbers before.

Figure: The new INP metric and its “good” threshold of 200ms or lower.

Since INP measures interactivity more comprehensively than FID, it implies that passing the INP assessment is more difficult than passing the FID assessment. In other words, the INP passing rate is expected to be lower than the FID passing rate. Once INP is taken into account for CWV instead of FID, this will therefore lead to a decline in the overall CWV passing rate as well. In reality, the difference is particularly relevant on mobile devices. Here is a comparison of the FID passing rate and INP passing rate of WordPress sites, based on October 2023:

Mobile FID passing rate is 97.42%, while mobile INP passing rate is 71.48%.
Desktop FID passing rate is 99.97%, while desktop INP passing rate is 98.47%.

While there is a small decline on desktop, it isn’t really worth focusing on as the value is still extremely good. But for mobile, the decline is significant. In other words, it will be important in 2024 to find ways in which WordPress can improve INP passing rates on mobile, in order to make up for the loss compared to FID.

Of course this is not only relevant for WordPress, but for the entire web. Here is how the mobile CWV passing rate would change if INP was already replacing FID today (data based on October 2023):

WordPress’s mobile CWV passing rate would be 3.64% lower (32.80% instead of 36.44%).
The non-WordPress mobile CWV passing rate would be 7.39% lower (40.58% instead of 48.02%).

Based on these numbers, it is at least a little comforting that WordPress will struggle with this change less than the rest of the web. Nevertheless, 3.64% CWV passing rate decline is a significant drop, so part of next year’s performance goals will be to make up for that loss, likely through a combination of INP focused improvements and a continuation of the LCP focused efforts.

TTFB focus was less impactful than (client-side) LCP focus

The second consideration for 2024 is not related to a metric change, but rather a potential takeaway from the 2023 metrics highlighted in this post: When looking at the WordPress release specific impact, it can be noted that the mobile LCP improvement was by far the highest in the 6.3 release (4.72%, compared to <1% for 6.2 and 6.4).

The WordPress 6.3 release included a number of client-side LCP enhancements, which likely led to a large chunk of those improvements, as more granularly indicated by the findings from the related field analysis from a few months ago. WordPress 6.4 and especially WordPress 6.2 were more focused on server-side LCP enhancements (i.e. TTFB enhancements), and based on the field results those releases did not lead to as large LCP wins. Even for TTFB, the picture isn’t entirely clear: As expected, WordPress 6.2 has a much better TTFB win than 6.3 on desktop, but on mobile devices the two releases show almost equal TTFB wins, with 6.3 even slightly higher.

While those numbers are by no means evidence, they raise the question whether the TTFB focus in WordPress core is the right way to move the needle for load time performance. Potentially the influence of other factors outside of WordPress core on TTFB, such as plugins, themes, hosting stack, or network connection, are just too large for the core-specific server-side improvements to make an impact in the field.

This does not imply the TTFB efforts should be dropped – but likely a shift is needed. Potentially, the TTFB issues need to be addressed through other means than directly enhancing the server response time of WordPress core itself. Further research should likely be conducted to get a better understanding of how much WordPress’s low TTFB passing rate stems from the plugin ecosystem’s performance or other aspects not directly code-related such as hosting.

2024 planning is underway

The WordPress performance team is currently working on their roadmap for 2024. A GitHub issue for performance focused project proposals has been opened and is awaiting input. You’re invited to contribute any WordPress performance related proposals or ideas you may have on that issue.

While the 2024 roadmap is currently being planned, let’s once again circle back to 2023: The amazing metrics shared here speak for themselves. Thank you to everyone who contributed to WordPress’s incredible performance impact in 2023!

Props to @annezazu @westonruter for review and proofreading.


Felix Arntz 7:56 pm on September 19, 2023
Tags: 6-3, analysis, performance, summary

Analyzing the Core Web Vitals performance impact of WordPress 6.3 in the field

As highlighted in the WordPress 6.3 performance summary post, the 6.3 release included numerous performance enhancements. Based on the lab benchmarks cited in that post, the test sites used with WordPress core were loading 27% faster for block themes and 18% faster for classic themes based on the Largest Contentful Paint (LCP) metric.

While lab benchmarks are great to estimate the projected performance impact of a release, the tests are not representative of the average WordPress site and real-world traffic. Therefore, it is crucial to further review and attempt to validate the impact in the field, i.e. on actual production sites using WordPress, at scale. Last week, three analyses were conducted to assess the performance impact of WordPress 6.3, using the public data sets from HTTP Archive and the Chrome User Experience Report.

Highlights of the WordPress 6.3 performance analysis findings

Before diving into the results, the term “passing rate” should be briefly explained here. It denotes the percentage of sites in a dataset for which a specific Web Vitals metric performs better than the threshold value that is considered “good”. For LCP, that encompasses all sites in the dataset that load within 2.5 seconds in total per the LCP metric. For example, if 600,000 out of 1,000,000 URLs have an LCP faster than or equal to 2.5 seconds, the LCP passing rate is 60%.

The results from the analyses indicate that WordPress 6.3 is indeed a great success from a performance perspective, as indicated by the lab benchmarks. Some notable findings to highlight include:

Looking at all applicable sites in the dataset, the Largest Contentful Paint (LCP) passing rate has improved by 5.6% for classic theme sites and by 2.7% for block theme sites for mobile viewports. In terms of the absolute LCP passing rate, for classic theme sites this means a bump from 31.3% to 33%, while for block theme sites it means a bump from 42.8% to 44%. For desktop viewports, the improvements are not as pronounced, yet they are still positive. See the source for overall LCP passing rate changes.
When segmenting between sites that use the emoji loader script and the sites that have disabled it, the impact of the improvements to the emoji loader script is clearly visible. The Largest Contentful Paint (LCP) boost for classic theme sites using the emoji loader script is 3.4% to 7% higher than for those that don’t use it, and for block themes it’s 0.7% to 4.5% better as well. To outline the numbers behind that more clearly, classic theme sites using the emoji loader script see a relative LCP boost of 8.4% on phone and 2.4% on desktop, compared to only 1.4% and -0.8% for those that don’t use the emoji loader script. Similarly, for block theme sites using the emoji loader script the relative LCP boost amounts to 4.2% on phone and 0.8% on desktop, compared to only -0.3% and 0.1% for those that don’t use the emoji loader script. See the source for LCP passing rate differences between sites using vs not using the emoji loader script.
When looking at the impact of more accurate lazy-loading heuristics and support for fetchpriority="high", segmentation is especially important, since the enhancements themselves have a varying degree of accuracy. As a reminder, the LCP image of a URL should not be lazy-loaded, but it should have fetchpriority="high". When looking at only the sites where that is the case and which were still lazy-loading the LCP image with WordPress 6.2, the LCP performance impact amounts to a massive 16% to 21% improvement for mobile viewports and 6% to 9% on desktop. Even in absolute LCP passing rate numbers, this is a jump of 4.3% for classic theme sites and 8% for block theme sites, which is nothing short of amazing. See the source for LCP passing rate changes for sites that no longer lazy-load LCP image and use fetchpriority correctly.
Of course this only applies to a subset of sites, however the accuracy of the lazy-loading heuristics has notably improved as well: In WordPress 6.3, only 9–10% of sites still lazy-load their LCP image for classic theme sites (down from 27–28% in 6.2) while for block theme sites it’s 5–8% (down from 17–29% in 6.2), so this multiplies the above LCP improvements horizontally. See the source for the accuracy comparison of how many sites (correctly) no longer lazy-load their LCP image.
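The rule mentioned above for the LCP image (not lazy-loaded, and marked with fetchpriority="high") can be checked for a given image tag with a small sketch like the following. It uses Python’s standard `html.parser` module; the class and function names are hypothetical, and this is neither how WordPress core nor the analyses implement the check. It also assumes you already know which image is the LCP image.

```python
from html.parser import HTMLParser


class ImgAttributeParser(HTMLParser):
    """Collects the attributes of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


def lcp_image_is_optimized(img_html: str) -> bool:
    """Check the first <img> in the fragment: not lazy-loaded, fetchpriority="high"."""
    parser = ImgAttributeParser()
    parser.feed(img_html)
    parser.close()
    if not parser.images:
        return False
    attrs = parser.images[0]
    # Attribute values can be None (e.g. <img loading>), so fall back to "".
    not_lazy = (attrs.get("loading") or "").lower() != "lazy"
    high_priority = (attrs.get("fetchpriority") or "").lower() == "high"
    return not_lazy and high_priority


print(lcp_image_is_optimized('<img src="hero.jpg" fetchpriority="high">'))  # True
print(lcp_image_is_optimized('<img src="hero.jpg" loading="lazy">'))        # False
```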

Explaining the metrics

Tooling used

HTTP Archive is an open-source project that runs a pipeline across millions of URLs every month to monitor the state of the web, recording aspects like which technologies are used, how specific web features are being leveraged, how many HTML tags or attributes of a specific kind are present on pages, and much more. The Core Performance Team has been heavily relying on this tool to measure success of specific features or enhancements in WordPress core releases. In fact, HTTP Archive even monitors a few metrics that are specific to WordPress.

The Chrome User Experience Report (short “CrUX”) exposes Core Web Vitals (CWV) performance data for millions of URLs, based on how real-world Chrome users experience visiting those URLs. While the tool can be used for individual sites to monitor their Web Vitals (e.g. via PageSpeed Insights), the data can also be aggregated at a larger scale. While CrUX does not contain much data other than the actual Web Vitals metrics, intersecting its dataset with that of HTTP Archive allows gathering valuable insights. For example, it becomes possible to group sites into specific segments (such as all sites that use WordPress) and measure their CWV passing rates.
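Conceptually, intersecting the two datasets is a join on the site. The sketch below imitates that with in-memory Python dictionaries rather than the real BigQuery tables: `technologies` stands in for HTTP Archive-style technology detection and `good_cwv` for CrUX-style assessments. Both structures, the function name, and all values are made up for illustration.

```python
def segment_passing_rates(
    technologies: dict[str, set[str]],
    good_cwv: dict[str, bool],
    technology: str,
) -> dict[str, float]:
    """CWV passing rate for sites using `technology` vs. all other sites."""
    # Only sites present in both datasets can be segmented and measured.
    shared = technologies.keys() & good_cwv.keys()
    using = [site for site in shared if technology in technologies[site]]
    not_using = [site for site in shared if technology not in technologies[site]]

    def rate(sites: list[str]) -> float:
        if not sites:
            return 0.0
        return 100.0 * sum(good_cwv[site] for site in sites) / len(sites)

    return {technology: rate(using), f"non-{technology}": rate(not_using)}


# Hypothetical technology detection (HTTP Archive-like).
technologies = {
    "a.example": {"WordPress"},
    "b.example": {"WordPress", "WooCommerce"},
    "c.example": {"Drupal"},
    "d.example": set(),
    "e.example": {"WordPress"},  # no field data below, so it is ignored
    "g.example": {"Shopify"},
    "h.example": {"Joomla"},
}
# Hypothetical "good CWV" assessments (CrUX-like).
good_cwv = {
    "a.example": True,
    "b.example": False,
    "c.example": True,
    "d.example": False,
    "f.example": True,  # no technology data above, so it is ignored
    "g.example": True,
    "h.example": True,
}

print(segment_passing_rates(technologies, good_cwv, "WordPress"))
# {'WordPress': 50.0, 'non-WordPress': 75.0}
```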

Both HTTP Archive and CrUX expose data aggregated on a monthly basis.

Joining data from HTTP Archive with data from CrUX is the foundation for tools like the Core Web Vitals Technology Report, which displays CWV passing rates for numerous technologies over time. The dashboard also includes WordPress-specific passing rates, which can be helpful to look at for a quick overview of how WordPress sites are performing on the web at a glance. However, it should be noted that those numbers are quite broad, since the passing rates are based on all WordPress sites in the dataset, regardless of the version used or any other factors. Therefore, in order to assess the impact of a specific WordPress release such as 6.3, a more granular approach is needed.

Methodology

The WordPress 6.3 performance summary post highlighted two client-side performance enhancements as the main sources for the improved LCP performance, which are the optimizations of the emoji loader script (see #58472) and the lazy-loading fixes plus the newly added support for the fetchpriority attribute, which are closely related (see the WordPress 6.3 image performance enhancements post). To assess whether those enhancements resulted in the anticipated LCP improvement, two analyses were conducted specific to those efforts.

Additionally, a broader analysis was conducted to compare the LCP performance of WordPress 6.3 and WordPress 6.2 sites overall, as well as their Time to First Byte (TTFB) performance, which directly impacts LCP as well. While with broader analyses like this one it is impossible to directly connect it to specific enhancements or fixes that launched as part of that release, it is crucial to look at the performance impact as a whole as well to get an idea how successful the release is at scale, regardless of how a specific feature is being used.

The analyses were conducted by running various BigQuery queries against the intersection of HTTP Archive and CrUX datasets, specifically zooming in on only the sites that were using WordPress 6.2 in July 2023 and WordPress 6.3 in August 2023. To present the approach, queries, and results transparently, the research tool Colab was used.

The links below point to the three Colabs with the analyses. They are quite detailed, so for a quick summary you may want to continue reading this post first. Please feel free to dive into the individual Colabs and their details, which you can also use to validate the summary below. You may find other notable metrics to highlight, or additional conclusions to draw.

It should be noted that any field metrics need to be interpreted carefully, as they always contain some degree of noise. Websites change over time in many ways, and it is impossible to eliminate external factors from the data. For example, a WordPress site may be slower with WordPress 6.3 than it was with 6.2 simply because it activated a new plugin in the meantime that impacts performance. Such scenarios cannot be reliably detected and are therefore part of the metrics as well. Fortunately, the number of WordPress sites in the dataset is quite large: the sites that match the aforementioned criteria amount to more than 500,000 WordPress home page URLs. This means that such side effects of individual sites usually have only a negligible impact on the overall data. Still, this is something to keep in mind: while field data is the closest thing available for assessing the actual performance impact of a change, it cannot be used to confidently claim that something is true or false; it has to be interpreted.
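To make the point about sample size concrete, here is a small Python sketch of the underlying arithmetic. It uses 500,000 as a lower bound for the cohort size mentioned above; the function name and the count of 5,000 affected sites are hypothetical and chosen only for illustration.

```python
def max_rate_shift(total_sites: int, affected_sites: int) -> float:
    """Upper bound, in percentage points, on how far the overall passing rate
    can move if every affected site flips between passing and failing."""
    return 100.0 * affected_sites / total_sites


COHORT_SIZE = 500_000  # lower bound taken from the text ("more than 500,000")

one_site = max_rate_shift(COHORT_SIZE, 1)
many_sites = max_rate_shift(COHORT_SIZE, 5_000)  # hypothetical count

print(f"1 site:      at most {one_site:.4f} percentage points")
print(f"5,000 sites: at most {many_sites:.1f} percentage point")
```

A single site changing for unrelated reasons can barely move the aggregate, but an external factor that affects thousands of sites at once, such as a widely used plugin shipping an update, can still be visible. This is why the numbers need interpretation rather than being read as proof.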

Conclusion

The large positive LCP impact confirms that the 6.3 release is an important milestone for WordPress performance. The numbers are particularly impressive on the sites for which the lazy-loading behavior was fixed and where fetchpriority support was correctly added. This shows the potential vertical impact that a few specific changes like that can have. Of course the overall LCP improvements are not as high, but it confirms this is a large opportunity: By further improving the heuristics so that they apply correctly to more WordPress sites, the horizontal impact of the change can be increased so that in the future the large LCP benefits may scale to even more sites.

Another meta observation worth noting is that the LCP passing rate improvement in WordPress 6.3 compared to 6.2 for the correct behavior above (16-21% higher LCP passing rate) is actually not too far off from the lab benchmarks measured for 6.3 a few months ago (18-27% faster LCP). This makes sense, given that for the lab benchmarks the test site was a simulated scenario where lazy-loading and fetchpriority were behaving correctly. It is great to know that the lab benchmarks carry some weight even when compared to the field impact.

Last but not least, there are also two points to be highlighted which show that there is still room for improvement:

The accuracy with which fetchpriority="high" is applied to the LCP image is only around 50% across all scenarios. While this is acceptable for newly added support of the attribute, it is clearly something to follow up on. Getting the heuristics for applying fetchpriority right is even more challenging than not lazy-loading the LCP image, especially since the LCP image may differ between viewports, but it's safe to say there should be more that WordPress core can do in that area. At least it is reassuring to see that the negative LCP impact of adding fetchpriority="high" to the wrong image is fairly low compared to the negative LCP impact of lazy-loading the LCP image. See the source for fetchpriority accuracy against the LCP image and the source for LCP passing rate changes for sites that no longer lazy-load the LCP image but use fetchpriority incorrectly.
At a higher level, the Time to First Byte (TTFB) passing rate is not seeing much of an improvement and in parts is even regressing: for mobile viewports, the TTFB passing rate is improving by 1.6-1.7%, while for desktop viewports it is regressing by ~4.9% for classic theme sites and ~9% for block theme sites. It's impossible to connect that to specific changes that landed in WordPress 6.3, and as mentioned before it could be affected by external factors, but it makes clear that server-side performance needs to remain a point of focus. See the source for overall TTFB passing rate changes.

Please feel free to take a closer look at the analyses and leave your feedback as comments on this post. Additional thoughts, observations and questions are much appreciated.

Props @joemcgill @westonruter for proofreading.

#6-3, #analysis, #performance, #summary
