Why are we working to improve the automation of the theme review?

During the last few years, the WordPress Theme Review Team has had a hard time keeping up with the volume of newly submitted themes.

A high number of submissions is a good problem to have. But it means that developers have to wait a long time before their theme is published.

A theme review includes a lot of routine tasks. Examples are checking for the use of deprecated functions, or lack of support for core WordPress features. A lot of these tasks do not need the intervention of a human reviewer, and could be done by software.

The benefits of automation

The Theme Check plugin was created in 2011, and is used to scan every theme uploaded to WordPress.org. It runs checks on the submitted theme, and blocks themes that fail checks for required items.

The number of checks, and the range of issues they detect, have grown over the years. But not all the requirements in the review guidelines are verified by the plugin. This leaves the task of verifying these requirements to the reviewers.

In June 2016, the admins started a project to improve the automation of theme review tasks. The goal of this project is to automate as many routine tasks as possible.

This automation has the following benefits:

  • By detecting more required items during the upload process, themes that fail elementary guidelines will not make it into the review queue. This will reduce the number of theme review tickets that get closed after a quick review due to fundamental flaws in the theme code.
  • By using better analysis tools, the number of issues missed by a reviewer will diminish. Some themes have large code bases, making it difficult to catch all issues. This is important in the context of security, as one flaw is often all that is needed to put an entire install at risk.
  • By using the new tool during the scans of automated theme updates, the quality of the themes will be kept at a high level.
  • By having software handle the menial work, reviewers can focus on areas where their expertise is needed, and provide a real benefit to theme authors.

The flaws of the Theme Check plugin

The Theme Check plugin has a fundamental flaw: it relies on text parsing to detect errors.

A text parser only understands text. It has no notion of what valid PHP code is, and what isn’t. This can mean that a check will pass, although the theme does not respect the guideline. A tool on which the reviewer cannot rely is a lot less useful, since double checking is needed.

Additionally, text parsing relies on regular expressions, which are difficult to write. This can lead to bugs, which in the worst case can block a valid theme from being uploaded. This leads to a lack of trust in the tool by developers, who see it as a nuisance, rather than a useful tool.
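To make the problem concrete, here is a minimal sketch of a text-based check. The function name `naive_has_eval()` and the regular expression are made up for this example, and are not taken from the Theme Check plugin.

```php
<?php
// Hypothetical text-based check: report any source that appears to call eval().
function naive_has_eval( string $source ): bool {
	return preg_match( '/\beval\s*\(/i', $source ) === 1;
}

// eval() is only mentioned in a comment, so this theme file is fine.
$harmless = "<?php\n// Never use eval() in a theme.\necho 'Hello';\n";

// A real eval() call, with a comment placed between the keyword and the parenthesis.
$obfuscated = "<?php\neval /* hidden */ ( \$code );\n";

var_dump( naive_has_eval( $harmless ) );   // bool(true)
var_dump( naive_has_eval( $obfuscated ) ); // bool(false)
```

The first result is a false positive that would block a valid theme. The second is a false negative that would let a forbidden call through.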

The unreliability, coupled with the absence of unit tests, makes the Theme Check plugin difficult to maintain. The risk of unintentionally introducing regressions is too high.

There have been attempts to use the PHP tokeniser API in checks to avoid text parsing. This API is a set of functions that provide an interface into the PHP tokeniser in the Zend Engine, which is the standard PHP interpreter.

A tokeniser breaks text up into elements called tokens, which get passed to a lexer, that attaches meaning to the tokens. This means that after this operation, you can determine what a token represents, based on its internal type as used by the PHP interpreter.
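As a rough illustration, the sketch below passes a small snippet to `token_get_all()`, the main function of the tokeniser API, and lists the type of each token. The helper `token_names()` and the sample source are assumptions made for this example.

```php
<?php
// Sample source: one comment that mentions eval(), and one real eval() call.
$source = "<?php\n// Never use eval() here.\neval /* hidden */ ( \$code );\n";

// Hypothetical helper: return the type of every token found in $source.
function token_names( string $source ): array {
	$names = array();
	foreach ( token_get_all( $source ) as $token ) {
		// Named tokens are arrays of [ id, text, line ].
		// Single characters such as "(" or ";" are returned as plain strings.
		$names[] = is_array( $token ) ? token_name( $token[0] ) : $token;
	}
	return $names;
}

print_r( token_names( $source ) );
```

In the resulting list, the comment shows up as a `T_COMMENT` token, while the real call is reported as `T_EVAL`. A check based on token types can therefore tell the two apart, which a regular expression cannot do reliably.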

The problem with the current use of the PHP tokeniser in checks is that the API is too low level to be useful. Additionally, transforming source code into tokens is an expensive, and therefore slow, operation.

The current architecture of the Theme Check plugin does not offer a high level API to use the tokeniser in checks in a performant way. It needs to be rewritten from scratch, using better tools. After a discussion at WordCamp Europe between the theme review admins and other developers, using PHP_CodeSniffer seemed to be the best solution.

A better approach: PHP_CodeSniffer

PHP_CodeSniffer (PHPCS) is a static code analysis tool, meaning that it can analyse code without running it. PHPCS tokenises code, and runs sniffs on the resulting tokens. These sniffs serve to detect violations of a defined coding standard.

PHPCS coding standards exist for many major PHP projects, and WordPress is one of them, with a standard called WPCS.

Using WPCS has four major advantages:

  1. The existing sniffs for the different WordPress coding standards give us a head start on detecting essential issues.
  2. PHPCS offers a higher-level API for interacting with the PHP tokeniser, making sniffs easier to write.
  3. With the WPTRT participating in the development of WPCS, there will be more contributors to the project. WPCS is a crucial tool for the WordPress ecosystem. More developers means a bigger positive impact on WordPress as a whole.
  4. WPCS can be integrated with most editors and integrated development environments (IDEs). PhpStorm is an example of an IDE with great support for PHPCS checks. This allows the tool to provide feedback while the developer writes code.

The idea is to add an extra coding standard, WordPress-Theme, to the WPCS project. A list of sniffs that would need to be implemented as part of this project can be found on GitHub.

As part of this project, @jrf has done a great job working on the base WPCS project. The long list of improvements in version 0.10.0 speaks for itself.

Limits of PHP_CodeSniffer

The theme review guidelines can broadly be divided into two categories:

  1. Guidelines that cover technical aspects of theme development. An example would be the rule that a theme must not use the `eval()` function. PHPCS is great for detecting issues like this.
  2. Policy guidelines that are specific to a theme distributed on WordPress.org. An example would be a theme tagged with rtl that nonetheless lacks support for RTL languages. PHPCS is unfortunately not the right tool to detect these issues.

This is due to the way that PHPCS works. The sniff process goes through the files one at a time, and runs all the sniffs on the current file. Once all files are processed, the sniff is considered complete. As such, the tool has no knowledge of what the object of the sniff is. It just deals with files.

Additionally, PHPCS sniffs detect errors by looking for certain combinations of tokens. So it’s up to the person writing the sniff to know which token pattern represents a function call for example.
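The sketch below shows the kind of token pattern a sniff author has to spell out in order to recognise a function call. It uses PHP's built-in tokeniser directly rather than the PHPCS sniff API, and the helper `find_function_calls()` is hypothetical. `query_posts()` was chosen only for illustration. The sketch makes no attempt to handle namespaced names or the nullsafe operator.

```php
<?php
/**
 * Return the line numbers on which $function_name appears to be called.
 * Hypothetical helper, written for this example only.
 */
function find_function_calls( string $source, string $function_name ): array {
	// Drop whitespace and comments so that neighbouring tokens are meaningful.
	$tokens = array();
	foreach ( token_get_all( $source ) as $token ) {
		if ( is_array( $token ) && in_array( $token[0], array( T_WHITESPACE, T_COMMENT, T_DOC_COMMENT ), true ) ) {
			continue;
		}
		$tokens[] = $token;
	}

	// If the previous token has one of these types, the name belongs to a
	// declaration, a method call or an instantiation, not a function call.
	$not_a_call = array( T_FUNCTION, T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_NEW );

	$lines = array();
	foreach ( $tokens as $i => $token ) {
		// Only a T_STRING token whose text is the function name can start a call.
		if ( ! is_array( $token ) || T_STRING !== $token[0] || 0 !== strcasecmp( $token[1], $function_name ) ) {
			continue;
		}
		$previous = $tokens[ $i - 1 ] ?? null;
		$next     = $tokens[ $i + 1 ] ?? null;
		// A call must be followed by an opening parenthesis.
		if ( '(' !== $next ) {
			continue;
		}
		if ( is_array( $previous ) && in_array( $previous[0], $not_a_call, true ) ) {
			continue;
		}
		$lines[] = $token[2];
	}
	return $lines;
}

$source = <<<'THEME'
<?php
// query_posts() is mentioned in this comment only.
$label = 'query_posts( "in a string" )';
query_posts( 'cat=1' );
$object->query_posts();
function query_posts_wrapper() {}
THEME;

print_r( find_function_calls( $source, 'query_posts' ) );
```

Only the call on line 4 of the sample is reported. The comment, the string, the method call and the function declaration are all skipped, but every one of those exclusions had to be written by hand.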

To effectively check the policy guidelines, we would need a tool specifically designed for the task. A theme is a collection of PHP, JavaScript, and CSS files, but we would need a tool that goes beyond this basic level.

There is an existing PR on the WordPress-Theme standards repository that extracts a set of data points from a theme. While the implementation itself is not a final solution, the approach has merit. Rather than dealing with individual files, the relevant information is extracted, and serves as an abstract representation of the theme.

We are currently working on a test project that uses a PHP Parser to extract this information. A parser is one level above a lexer, as it turns the tokens into an abstract syntax tree. This is an advantage, because a parser knows how the tokens fit together.

The library used in this project is phpDocumentor/Reflection. This library was recommended by @rmccue, since the PHPDoc parser powering the WordPress Code Reference is based on it.

The project is still in an early phase. It will be made available for contributions and testing as soon as a first stable version exists.

How can you help?

If you are a theme developer, start using the WordPress-Theme coding standard as part of your development process.

The WPCS project in general, and the WordPress-Theme coding standard in particular, could benefit from the help of proficient developers.

If you want to follow the advancement of the project, you can attend the automation meetings.

As the guidelines are reviewed and adjusted regularly, make sure to attend the WPTRT meetings.
