Why are we working to improve the automation of the theme review?

Over the last few years, the WordPress Theme Review Team has had a hard time keeping up with the volume of newly submitted themes.

A high number of submissions is a good problem to have. But it means that developers have to wait a long time before their theme is published.

A theme review includes a lot of routine tasks. Examples are checking for the use of deprecated functions, or for a lack of support for core WordPress features. Many of these tasks do not need the intervention of a human reviewer, and could be done by software.

The benefits of automation

The Theme Check plugin was created in 2011, and is used to scan every theme uploaded to WordPress.org. It runs checks on the submitted theme, and blocks themes that fail checks for required items.

Both the number of checks and the range of issues they detect have grown over the years. But not all the requirements in the review guidelines are verified by the plugin. This leaves the task of verifying these requirements to the reviewers.

In June 2016, the admins started a project to improve the automation of theme review tasks. The goal of this project is to automate as many routine tasks as possible.

This automation has the following benefits:

  • By detecting more required items during the upload process, themes that fail elementary guidelines will not make it into the review queue. This will reduce the number of theme review tickets that get closed after a quick review due to fundamental flaws in the theme code.
  • By using better analysis tools, the number of issues missed by a reviewer will diminish. Some themes have large code bases, making it difficult to catch all issues. This is important in the context of security, as one flaw is often all that is needed to put an entire install at risk.
  • By using the new tool during the scans of automated theme updates, the quality of the themes will be kept at a high level.
  • By having software handle the menial work, reviewers can focus on areas where their expertise is needed, and provide a real benefit to theme authors.

The flaws of the Theme Check plugin

The Theme Check plugin has a fundamental flaw, and that is the reliance on text parsing for detecting errors.

A text parser only understands text. It has no notion of what is valid PHP code, and what isn’t. This can mean that a check will pass, although the theme does not respect the guideline. A tool on which the reviewer cannot rely is a lot less useful, since double checking is needed.

Additionally, text parsing relies on regular expressions, which are difficult to write. This can lead to bugs, which in the worst case can block a valid theme from being uploaded. This leads to a lack of trust in the tool among developers, who see it as a nuisance rather than a useful tool.

The unreliability, coupled with the absence of unit tests, makes the Theme Check plugin difficult to maintain. The risk of unintentionally introducing regressions is too high.

There have been attempts to use the PHP tokeniser API in checks to avoid text parsing. This API is a set of functions that provide an interface into the PHP tokeniser in the Zend Engine, which is the standard PHP interpreter.

A tokeniser breaks text up into elements called tokens, which get passed to a lexer, that attaches meaning to the tokens. This means that after this operation, you can determine what a token represents, based on its internal type as used by the PHP interpreter.
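To make this more concrete, here is a minimal sketch of the difference between searching the text and looking at tokens. It uses PHP’s built-in token_get_all() function and the T_EVAL token type. The wptrt_uses_eval() helper and the sample code are made up for this illustration, and are not part of the Theme Check plugin.

```php
<?php
/**
 * Hypothetical helper: returns true if the source contains a real eval construct.
 */
function wptrt_uses_eval( $source ) {
    foreach ( token_get_all( $source ) as $token ) {
        // Tokens with a type are arrays: array( type, text, line ).
        if ( is_array( $token ) && T_EVAL === $token[0] ) {
            return true;
        }
    }

    return false;
}

// Sample theme code that mentions eval() only in a comment and in a string.
$sample = <<<'CODE'
<?php
// Never call eval( $code ) in a theme.
echo 'eval() is not used here';
CODE;

$text_match  = false !== strpos( $sample, 'eval(' ); // true: a false positive.
$token_match = wptrt_uses_eval( $sample );           // false: no T_EVAL token.

var_dump( $text_match, $token_match );
```

The plain text search reports a match because of the comment, while the token based check does not, since the tokeniser knows that comments and strings are not code.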

The problem with the current use of the PHP tokeniser in checks is that the API is too low level to be useful. Additionally, transforming source code into tokens is an expensive, and therefore slow, operation.

The current architecture of the Theme Check plugin does not offer a high level API to use the tokeniser in checks in a performant way. It needs to be rewritten from scratch, using better tools. After a discussion at WordCamp Europe between the theme review admins and other developers, using PHP_CodeSniffer seemed to be the best solution.

A better approach: PHP_CodeSniffer

PHP_CodeSniffer (PHPCS) is a static code analysis tool, meaning that it can analyse code without running it. PHPCS tokenises code, and runs sniffs on the tokens. These sniffs serve to detect violations of a defined coding standard.

PHPCS has coding standards for all major PHP projects, and WordPress is one of them, with a standard called WPCS.

Using WPCS has four major advantages:

  1. The existing sniffs for the different WordPress coding standards give us a head start on detecting essential issues.
  2. PHPCS offers a higher level API for interacting with the PHP tokeniser, making sniffs easier to write.
  3. With the WPTRT participating in the development of WPCS, there will be more contributors to the project. This tool is a crucial tool for the WordPress ecosystem. More developers means a bigger positive impact on WordPress as a whole.
  4. WPCS can be integrated with most editors and integrated development environments (IDEs). PhpStorm is an example of an IDE with great support for PHPCS checks. This allows the tool to provide feedback while the developer writes code.

The idea is to add an extra coding standard, WordPress-Theme, to the WPCS project. A list of sniffs that would need to be implemented as part of this project can be found on GitHub.

As part of this project, @jrf has done a great job working on the base WPCS project. The long list of improvements in version 0.10.0 speaks for itself.

Limits of PHP_CodeSniffer

The theme review guidelines can broadly be divided into two categories:

  1. Guidelines that cover technical aspects of theme development. An example would be the requirement not to use the `eval()` function. PHPCS is great for detecting issues like this.
  2. Policy guidelines that are specific to a theme distributed on WordPress.org. An example would be a theme tagged with rtl that nonetheless lacks support for RTL languages. PHPCS is unfortunately not the right tool to detect these issues.

This is due to the way that PHPCS works. The sniff process goes through the files one at a time, and runs all the sniffs on the current file. Once all files are processed, the sniff is considered complete. As such, the tool has no knowledge of what the object of the sniff is. It just deals with files.

Additionally, PHPCS sniffs detect errors by looking for certain combinations of tokens. So it’s up to the person writing the sniff to know which token pattern represents a function call for example.
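As a rough illustration of what such a token pattern looks like, the sketch below finds plain function calls with PHP’s raw tokeniser: a T_STRING token followed by an opening parenthesis, and not preceded by function, ->, :: or new. The wptrt_find_function_calls() helper is hypothetical and deliberately simplified. It does not use the PHPCS API, and a real sniff would have to cover more cases (namespaced names, for example).

```php
<?php
/**
 * Hypothetical helper: returns the line numbers on which $function_name
 * is called as a plain function.
 */
function wptrt_find_function_calls( $source, $function_name ) {
    // Drop whitespace and comments, so that neighbouring tokens are easy to inspect.
    $tokens = array_values(
        array_filter(
            token_get_all( $source ),
            function ( $token ) {
                return ! is_array( $token )
                    || ! in_array( $token[0], array( T_WHITESPACE, T_COMMENT, T_DOC_COMMENT ), true );
            }
        )
    );

    $lines = array();

    foreach ( $tokens as $i => $token ) {
        // We are only interested in identifiers that match the function name.
        if ( ! is_array( $token ) || T_STRING !== $token[0] || 0 !== strcasecmp( $token[1], $function_name ) ) {
            continue;
        }

        $next          = isset( $tokens[ $i + 1 ] ) ? $tokens[ $i + 1 ] : null;
        $previous      = $i > 0 ? $tokens[ $i - 1 ] : null;
        $previous_type = is_array( $previous ) ? $previous[0] : null;

        // A call needs an opening parenthesis right after the name.
        if ( '(' !== $next ) {
            continue;
        }

        // Skip declarations, method calls, static calls and instantiations.
        if ( in_array( $previous_type, array( T_FUNCTION, T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_NEW ), true ) ) {
            continue;
        }

        $lines[] = $token[2];
    }

    return $lines;
}

$sample = <<<'CODE'
<?php
// get_theme_data( $file ) in a comment.
$text = 'get_theme_data()';
$data = get_theme_data( $file );
$object->get_theme_data();
CODE;

print_r( wptrt_find_function_calls( $sample, 'get_theme_data' ) );
```

For this sample, only the call on line 4 is reported. The comment, the string, and the method call are ignored.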

To effectively check the policy guidelines, we would need a tool specifically designed for the task. A theme is a collection of PHP, JavaScript, and CSS files, but we would need a tool that goes beyond this basic level.

There is an existing PR on the WordPress-Theme standards repository, that extracts a set of data points from a theme. While the implementation itself is not a final solution, the approach has merit. Rather than dealing with individual files, the relevant information is extracted, and serves as an abstract representation of the theme.

We are currently working on a test project that uses a PHP Parser to extract this information. A parser is one level above a lexer, as it turns the tokens into an abstract syntax tree. This is an advantage, because a parser knows how the tokens fit together.

The library used in this project is phpDocumentor/Reflection. This library was recommended by @rmccue, since the PHPDoc parser powering the WordPress Code Reference is based on it.

The project is still in an early phase. It will be made available for contributions and testing as soon as a first stable version exists.

How can you help?

If you are a theme developer, start using the WordPress-Theme coding standard as part of your development process.

The WPCS project in general, and the WordPress-Theme coding standard in particular, could benefit from the help of proficient developers.

If you want to follow the advancement of the project, you can attend the automation meetings.

As the guidelines are reviewed and adjusted regularly, make sure to attend the WPTRT meetings.

#review-automation

A Guide to Writing Secure Themes – Part 4: Securing Post Meta

In the previous two parts, we got a high level overview of how to validate and sanitize data.

In the next few parts of this series, we’ll look at how we can apply these concepts in specific contexts, starting with post meta.

Post meta database structure

When dealing with any kind of data, we need to first understand how it is stored in the database, and what the schema is.

Post meta is stored in the wp_postmeta table. It acts as a key/value storage for additional information about particular posts.

I encourage you to look at the structure of the table, because it provides us with several important pieces of information:

  • The post_id is the fastest way to look up data.
  • Queries against the meta_key are possible, but slower.
  • The meta keys need to be unique for the theme, which means we will need to use a prefix.
  • Meta values that are not scalars will be stored in serialized form.
  • Queries against meta values should be avoided. You cannot query against serialized data, and queries against LONGTEXT columns are super slow.
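To see why serialized data is hard to query, it helps to look at what actually ends up in the meta_value column. The sketch below uses PHP’s own serialize() function, which is the format WordPress relies on for non-scalar meta values. The array content is just an example.

```php
<?php
// An example of a non-scalar meta value.
$hidden_elements = array( 'date', 'author' );

// This string is what would be written to the meta_value column.
$stored = serialize( $hidden_elements );
echo $stored, PHP_EOL; // a:2:{i:0;s:4:"date";i:1;s:6:"author";}

// Reading the value back turns the string into an array again.
$restored = unserialize( $stored );
var_dump( $restored === $hidden_elements ); // bool(true)
```

The individual values are buried inside one string, together with type and length information, which is why the database cannot match or index them reliably.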

So post meta is meant for storing little additional bits of information related to individual posts. It shouldn’t be used to store custom CSS snippets, as a taxonomy replacement, or as a bucket for storing content from (rich) text editors.

So remember: just because the flexibility provided by WordPress allows you to do something, that doesn’t mean that it is a good idea to do so.

Now that we have seen how the underlying data storage works, let’s look at the user interface for entering post meta.

Post meta user interface

Post meta is entered via elements in the interface called meta boxes. You are probably familiar with the code for adding a meta box to the post editing interface:

<?php
function wptrt_add_meta_box() {
    add_meta_box( 'wptrt-sample-meta-box',  esc_html__( 'WPTRT Sample Meta Box', 'wptrt' ), 'wptrt_print_meta_box', 'post' );
}
add_action( 'add_meta_boxes', 'wptrt_add_meta_box' );
?>

When you look at the code of the add_meta_box() function, you’ll see that it stores the passed arguments in the global $wp_meta_boxes array.

So our custom meta box gets added to the same array in which the WordPress Core meta boxes are stored. This array is then used by the do_meta_boxes() function. It calls our custom callback function, wptrt_print_meta_box(), to print the meta box onto the screen.

To save the data entered via the meta box, you have to write a custom saving function, and hook it to save_post.

<?php
function wptrt_save_meta_box_data( $post_id ) {
    // Handle saving here
}
add_action( 'save_post', 'wptrt_save_meta_box_data' );
?>

The save_post action gets fired after the post has been saved. At this point WordPress has done all the work it needs to save the post data that Core handles.

The hook passes in three variables: the ID of the saved post, the post object (an instance of WP_Post), and a boolean, indicating whether it’s an update or a new post.

By looking at the underlying architecture, we can deduce a number of things:

  1. In order to avoid conflicts with other meta boxes or meta data, we need to prefix everything.
  2. WordPress has no specific API for handling data submitted through custom meta boxes. We have to write our own data handling function.
  3. The save_post action is a generic action, that passes none of the data that was entered into the post edit form. This means that we’ll have to grab our data directly from the $_POST superglobal.
  4. No origin or capability checks have been performed for the HTTP request from which we are retrieving the data.

We’ll now see how we reflect these points in our code.

Dealing with autosaves

Autosaves are automatically triggered by WordPress at set intervals. These autosaves are done via an asynchronous request (AJAX request). For the duration of the processing of the request, the interface is “frozen”, to keep the user from entering more data.

The more work the database has to do, the longer this saving process is going to take. In order to reduce the time that users are stuck waiting on the autosave to finish, we’re going to put a check in place that avoids saving the post meta during this event.

<?php
if ( defined( 'DOING_AUTOSAVE' ) && DOING_AUTOSAVE ) {
    return;
}
?>

The DOING_AUTOSAVE constant is defined by WordPress in the wp_autosave() function. When it is defined, and when its value evaluates to true, we return.

Protecting against unwanted requests

As we have seen before, we retrieve the data to save directly from a $_POST request. In the current stage of our code, we cannot distinguish between valid requests, made intentionally by a user via the admin interface, and a forged request by a malicious third party.

To protect against this, we use a nonce, a number used once. We add this information to the form, so that it gets sent along with all the other data. We can then verify in the received $_POST data that the nonce is the same as the one we added to the form, and as such validate the request.

We will look at nonces in detail in a later part of this series. For now let’s add a nonce to the form with the wp_nonce_field() function.

<?php wp_nonce_field( 'wptrt-post-meta-box-save', 'wptrt-post-meta-box-nonce' ); ?>

The wp_nonce_field() function accepts four arguments, but we’re only using the first two: $action and $name.

$action refers to the context in which the data is generated. We’re going to use wptrt-post-meta-box-save here, to show that this data originates from the action of saving a meta box on the post screen.

The $name is the name of the hidden input field in the form that contains the nonce. We’re going to use this to retrieve the nonce later from the $_POST data.

As you can see, both the action and the name are prefixed, and unique to the situation. We’re going to see later why this is important, but please take note of this as a best practice.
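To get a feeling for why the action matters, here is a deliberately simplified sketch of an action-bound token, written in plain PHP. This is not how WordPress implements nonces: real WordPress nonces also depend on the current user and on a time window, among other things. The wptrt_demo_*() functions and the secret are made up for this illustration.

```php
<?php
/**
 * Hypothetical helper: creates a token that is tied to one specific action.
 */
function wptrt_demo_create_token( $action, $secret ) {
    return hash_hmac( 'sha256', $action, $secret );
}

/**
 * Hypothetical helper: checks a received token against the expected action.
 */
function wptrt_demo_verify_token( $token, $action, $secret ) {
    // hash_equals() compares the two strings in constant time.
    return hash_equals( wptrt_demo_create_token( $action, $secret ), (string) $token );
}

$secret = 'demo-secret-known-only-to-the-site';
$token  = wptrt_demo_create_token( 'wptrt-post-meta-box-save', $secret );

var_dump( wptrt_demo_verify_token( $token, 'wptrt-post-meta-box-save', $secret ) ); // bool(true)
var_dump( wptrt_demo_verify_token( $token, 'some-other-action', $secret ) );        // bool(false)
```

A token created for one action is worthless for any other action, which is the reason for choosing an action name that is specific to our meta box.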

Now let’s add a nonce check into our post meta saving function.

<?php
if ( ! isset( $_POST['wptrt-post-meta-box-nonce'] ) || ! wp_verify_nonce( $_POST['wptrt-post-meta-box-nonce'], 'wptrt-post-meta-box-save' ) ) {
    return;
}
?>

First we check whether the nonce data has been transmitted as part of the $_POST request data. Then we use wp_verify_nonce() to verify it.

If the nonce isn’t set, or if the wp_verify_nonce() function returns false, we return.

Verifying access rights

So far, we have completed two tasks: avoiding saving data on autosaves, and ensuring that the data we work with was issued via the admin of the site.

But we haven’t verified whether the user that has entered the data has the right to modify post meta data. To check this, we’re going to use the built-in roles and capabilities provided by WordPress.

We’re going to look at this in more detail later, but for now, let’s try and find out what capability we would need to check.

Our post meta box is located on the Edit Post screen of the admin. So users should only be able to see our meta box when they can edit posts on the site.

That’s a good start, but we want to make sure that the user can edit the specific post for which we are saving the data. This is because there are many instances in which users have different access rights to different posts. Therefore we need to check for the rights on this particular post.

<?php
if ( ! current_user_can( 'edit_post', $post_id ) ) {
        return;
}
?>

For checking the user’s capabilities, we use the current_user_can() function, and the edit_post capability. Since the capability refers to a single post, we pass in the $post_id variable, which we passed into our custom saving function, and which is provided by the save_post hook.

With this code in place, we can start the saving process. But before we get to that, let’s first look at how to work efficiently with post meta.
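Before moving on, here is one possible way to bundle the three checks we have discussed into a single function. The name wptrt_can_save_meta_box() is an assumption made for this sketch. It relies on the WordPress functions wp_verify_nonce() and current_user_can(), so it is meant to run inside WordPress.

```php
<?php
/**
 * Hypothetical helper: returns true only if it is safe to save our post meta.
 */
function wptrt_can_save_meta_box( $post_id ) {
    // 1. Skip autosaves.
    if ( defined( 'DOING_AUTOSAVE' ) && DOING_AUTOSAVE ) {
        return false;
    }

    // 2. The nonce must be present, and valid for our action.
    if ( ! isset( $_POST['wptrt-post-meta-box-nonce'] )
        || ! wp_verify_nonce( $_POST['wptrt-post-meta-box-nonce'], 'wptrt-post-meta-box-save' )
    ) {
        return false;
    }

    // 3. The current user must be allowed to edit this particular post.
    if ( ! current_user_can( 'edit_post', $post_id ) ) {
        return false;
    }

    return true;
}
```

The saving function hooked to save_post could then start by returning early whenever wptrt_can_save_meta_box( $post_id ) is false.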

Working efficiently with post meta

There are a couple of things to keep in mind when working with post meta:

  1. Don’t save default values. Default values should be handled in the code, not via the database.
  2. Combine related data, separate individual data. When you repeatedly retrieve an array from post meta, only to use a single element, you are better off using an individual key. On the other hand, when your code is littered with calls to get_post_meta() for little bits, you’d better consolidate those calls by using a single key.
  3. Separate related data by type. When storing related data, make sure not to mix different data types in the same key. Doing so would make things unnecessarily difficult to manage.
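The first point could be handled with a small helper like the one sketched below. It assumes that the stored value was retrieved with get_post_meta() and its third argument set to true, in which case an empty string means that nothing was saved. The function name and the example values are hypothetical.

```php
<?php
/**
 * Hypothetical helper: falls back to a default defined in code
 * when no post meta value has been stored.
 */
function wptrt_meta_or_default( $stored_value, $default ) {
    return '' === $stored_value ? $default : $stored_value;
}

echo wptrt_meta_or_default( '', 'left' ), PHP_EOL;      // left
echo wptrt_meta_or_default( 'right', 'left' ), PHP_EOL; // right
```

This way the default lives in the theme code, and changing it later does not require touching the database.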

After lots of preparatory work, let’s get down to business: saving meta data.

Individual checkboxes

Let’s start off by looking at the simplest case: individual checkboxes. These are checkboxes used for handling features that are not related. Each checkbox is therefore saved with an individual key, to make the data easier to retrieve and use in the theme.

Here is the code for adding a checkbox. We will place this code inside our wptrt_print_meta_box() function.

<p>
    <input type="checkbox" id="wptrt-individual-checkbox" name="wptrt-individual-checkbox" value="1" <?php checked( get_post_meta( get_the_ID(), 'wptrt-individual-checkbox', true ) ); ?> />
    <label for="wptrt-individual-checkbox"><?php echo esc_html__( 'Individual Checkbox', 'wptrt' ); ?></label>
</p>

Note the use of the checked() function, which is part of a series of functions for working with inputs.

The code for saving the checkbox is straightforward:

<?php
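if ( ! isset( $_POST['wptrt-individual-checkbox'] ) && get_post_meta( $post_id, 'wptrt-individual-checkbox', true ) ) {
        // The checkbox was unchecked, and a value is stored: remove it.
        delete_post_meta( $post_id, 'wptrt-individual-checkbox' );
} elseif ( isset( $_POST['wptrt-individual-checkbox'] ) ) {
        // The checkbox was checked: store a hardcoded 1.
        update_post_meta( $post_id, 'wptrt-individual-checkbox', 1 );
}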
?>

Here are the steps we take:

  • We verify first whether the wptrt-individual-checkbox key exists in the $_POST data. This is because checkboxes don’t submit any data if they are not checked.
  • If the checkbox is not checked, and we have a saved post meta value that evaluates to true, we delete the post meta data.
  • If the key exists, we save a 1 to the database with update_post_meta(). The 1 is hardcoded, so no data from the request will be saved.
  • The 1 is used because it evaluates to true under PHP’s loose type comparison. We use an integer, because the data gets stored as a string, which is not a good fit for booleans.

If we want to know in our theme whether the user has checked the checkbox or not, we verify whether the retrieved data evaluates to true.

<?php
if ( get_post_meta( get_the_ID(), 'wptrt-individual-checkbox', true ) ) {
    // Checkbox was checked
} else {
    // Checkbox was not checked
}
?>

So the saved 1 evaluates to true. If no post meta was saved, get_post_meta() will return an empty string, which evaluates to false.
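Since post meta values come back from the database as strings, it is worth knowing how PHP converts strings to booleans. The following lines are a small, standalone illustration in plain PHP:

```php
<?php
$checked     = (bool) '1';     // The saved 1, read back as a string.
$not_saved   = (bool) '';      // What we get when no post meta exists.
$zero_string = (bool) '0';     // The string '0' is also treated as false.
$word_false  = (bool) 'false'; // Any other non-empty string is true.

var_dump( $checked );     // bool(true)
var_dump( $not_saved );   // bool(false)
var_dump( $zero_string ); // bool(false)
var_dump( $word_false );  // bool(true)
```

The last line shows why storing words like “true” or “false” is a bad idea: both would evaluate to true.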

Individual text fields

Other individual fields, like for example a single text field, can be dealt with in a similar manner.

Let’s add a text input to our post meta box:

<p>
        <label for="wptrt-individual-text-field"><?php echo esc_html__( 'Individual Text Field', 'wptrt' ); ?></label>
        <input type="text" id="wptrt-individual-text-field" name="wptrt-individual-text-field" value="<?php echo esc_attr( get_post_meta( get_the_ID(), 'wptrt-individual-text-field', true ) ); ?>" />
</p>

This is how we would save this data:

<?php
if ( empty( $_POST['wptrt-individual-text-field'] ) ) {
        if ( get_post_meta( $post_id, 'wptrt-individual-text-field', true ) ) {
            delete_post_meta( $post_id, 'wptrt-individual-text-field' );
        }
} else {
        update_post_meta( $post_id, 'wptrt-individual-text-field', sanitize_text_field( $_POST['wptrt-individual-text-field'] ) );
}
?>

As you can see, the code is similar to the checkbox example, with a few key differences:

  • empty() is used instead of isset(). This is because text fields submit an empty string when they don’t contain any data.
  • We separated the checks for the existence of submitted data and saved data into two different conditionals. This is because the text field might be empty while no data has been saved yet, in which case there is nothing to delete.
  • We use update_post_meta() rather than add_post_meta(), because update_post_meta() either adds post meta or updates existing post meta depending on the case.
  • Since we are using data from the request, we are using sanitization to make sure that the data is secure before saving.

Checkbox groups

Checkbox groups are checkboxes that are related to each other. A common example for such a group would be options that allow users to hide certain elements related to individual posts, such as the date, author, and the categories.

First, let’s print the markup for these three checkboxes:

<?php $hide_elements = (array) get_post_meta( get_the_ID(), 'wptrt-hide-post-element', true ); ?>
<p>
        <input type="checkbox" id="wptrt-hide-post-date" name="wptrt-hide-post-element[]" value="date" <?php checked( in_array( 'date', $hide_elements, true ) ); ?> />
        <label for="wptrt-hide-post-date"><?php echo esc_html__( 'Hide Date', 'wptrt' ); ?></label>
</p>
<p>
        <input type="checkbox" id="wptrt-hide-post-author" name="wptrt-hide-post-element[]" value="author" <?php checked( in_array( 'author', $hide_elements, true ) ); ?> />
        <label for="wptrt-hide-post-author"><?php echo esc_html__( 'Hide Author', 'wptrt' ); ?></label>
</p>
<p>
        <input type="checkbox" id="wptrt-hide-post-categories" name="wptrt-hide-post-element[]" value="categories" <?php checked( in_array( 'categories', $hide_elements, true ) ); ?> />
        <label for="wptrt-hide-post-categories"><?php echo esc_html__( 'Hide Categories', 'wptrt' ); ?></label>
</p>

We will save the data from these checkboxes under a single key, in the form of an array.

As we have seen, get_post_meta() returns an empty string when no post meta data exists. We therefore use type casting to ensure that we are always dealing with an array.

As we are dealing with an array, we use in_array() to determine whether a checkbox needs to be checked or not. It will return true or false, which the checked() function then will use to print the correct markup.

Speaking of markup, all the input elements have the same name: wptrt-hide-post-element[]. This ensures that all the submitted data for these checkboxes will be provided as an array stored under the wptrt-hide-post-element key in the $_POST data.
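You can observe this behaviour outside of a form with PHP’s parse_str() function, which parses a query string using the same bracket notation. The query string below is a made-up example of what a browser might send when the date and author checkboxes are checked:

```php
<?php
// Two checked checkboxes that share the same name with [] appended.
$query = 'wptrt-hide-post-element[]=date&wptrt-hide-post-element[]=author';

// parse_str() fills $data in the same way PHP fills $_POST.
parse_str( $query, $data );

// $data['wptrt-hide-post-element'] is now array( 'date', 'author' ).
print_r( $data['wptrt-hide-post-element'] );
```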

Since the user can only select from the options we provide in the interface, we will use validation to secure the data.

<?php
if ( ! isset( $_POST['wptrt-hide-post-element'] ) ) {
        if ( get_post_meta( $post_id, 'wptrt-hide-post-element', true ) ) {
            delete_post_meta( $post_id, 'wptrt-hide-post-element' );
        }
} else {
        $safe_hide_post_element = array();

        // Cast to an array, in case a forged request submits a single value.
        foreach ( (array) $_POST['wptrt-hide-post-element'] as $element ) {
            if ( in_array( $element, array( 'date', 'author', 'categories' ), true ) ) {
                $safe_hide_post_element[] = $element;
            }
        }

        if ( ! empty( $safe_hide_post_element ) ) {
            update_post_meta( $post_id, 'wptrt-hide-post-element', $safe_hide_post_element );
        }
}
?>

We will not talk about the first conditional in the code, since it is the same as we used in the previous example.

Instead let’s look at what happens when $_POST['wptrt-hide-post-element'] is set, meaning the user has checked at least one checkbox:

  • First we create a temporary array called $safe_hide_post_element. This is where we will save all valid data. The naming is very explicit, to clarify that this variable only holds secure data.
  • Next we loop over the array contained in $_POST['wptrt-hide-post-element'] and compare each entry against a list of possible values. Valid entries are then stored inside the $safe_hide_post_element array.
  • Finally we check whether the $safe_hide_post_element array contains any entries. This is to avoid saving an empty array in case the $_POST data did not contain any valid options. This array is then saved.
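If you need this kind of validation in several places, the loop could be moved into a small reusable function. The sketch below is one way to do it in plain PHP. The name wptrt_filter_allowed_values() is hypothetical, and the submitted values are examples.

```php
<?php
/**
 * Hypothetical helper: keeps only the submitted values that are
 * part of the list of allowed values.
 */
function wptrt_filter_allowed_values( $submitted, array $allowed ) {
    $safe = array();

    // Cast to an array, in case a single value was submitted.
    foreach ( (array) $submitted as $value ) {
        // Strict comparison: only exact string matches are kept.
        if ( in_array( $value, $allowed, true ) ) {
            $safe[] = $value;
        }
    }

    return $safe;
}

$safe = wptrt_filter_allowed_values(
    array( 'date', '<script>', 'author' ),
    array( 'date', 'author', 'categories' )
);

print_r( $safe ); // Contains 'date' and 'author' only.
```

Anything that is not in the list of allowed values is simply dropped, so unexpected input never reaches the database.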

In the theme, you can then retrieve the data, and use in_array() to determine whether to display an element of the post or not:

<?php
$hide_elements = (array) get_post_meta( get_the_ID(), 'wptrt-hide-post-element', true );

if ( ! in_array( 'date', $hide_elements ) ) {
        echo '<p>' . esc_html__( 'Date:', 'wptrt' ) . esc_html( get_the_date() ) . '</p>';
}

if ( ! in_array( 'author', $hide_elements ) ) {
        echo '<p>' . esc_html__( 'Author:', 'wptrt' ) . esc_html( get_the_author() ) . '</p>';
}

if ( ! in_array( 'categories', $hide_elements ) ) {
        echo '<p>' . esc_html__( 'Categories:', 'wptrt' ) . '</p>';
        the_category();
}
?>

Don’t repeat yourself: checkbox groups

The code we have so far works well, but it is a bit repetitive in some places. If you have more complex requirements, a little bit more abstraction would be helpful to reduce repetitive code.

Let’s implement a feature that allows users to choose their favorite colors. We will need a key and a text with the color name for each option. Let’s implement a function that returns the options:

<?php
function wptrt_get_favorite_color_options() {
    return array(
        'blue'   => __( 'Blue', 'wptrt' ),
        'red'    => __( 'Red', 'wptrt' ),
        'yellow' => __( 'Yellow', 'wptrt' ),
    );
}
?>

We can then use this function inside a loop to print the checkboxes.

<?php
$favorite_colors = (array) get_post_meta( get_the_ID(), 'wptrt-favorite-color', true );
foreach ( wptrt_get_favorite_color_options() as $option => $text ) :
?>
        <p>
            <input type="checkbox" id="wptrt-favorite-color-<?php echo esc_attr( $option ); ?>" name="wptrt-favorite-color[]" value="<?php echo esc_attr( $option ); ?>" <?php checked( in_array( $option, $favorite_colors, true ) ); ?> />
            <label for="wptrt-favorite-color-<?php echo esc_attr( $option ); ?>"><?php echo esc_html( $text ); ?></label>
        </p>
<?php endforeach; ?>

Next, we need to modify our saving code:

<?php
if ( ! isset( $_POST['wptrt-favorite-color'] ) ) {
        if ( get_post_meta( $post_id, 'wptrt-favorite-color', true ) ) {
            delete_post_meta( $post_id, 'wptrt-favorite-color' );
        }
} else {
        $safe_favorite_color = array();

        // Cast to an array, in case a forged request submits a single value.
        foreach ( (array) $_POST['wptrt-favorite-color'] as $color ) {
            if ( array_key_exists( $color, wptrt_get_favorite_color_options() ) ) {
                $safe_favorite_color[] = $color;
            }
        }

        if ( ! empty( $safe_favorite_color ) ) {
            update_post_meta( $post_id, 'wptrt-favorite-color', $safe_favorite_color );
        }
}
?>

With this setup, every time you add or remove an entry in the array returned by wptrt_get_favorite_color_options(), the display code and saving code will take this change into account.
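The filtering step of the saving code can also be moved into a small reusable helper. The following is a minimal sketch and not part of the original tutorial code: wptrt_filter_allowed_choices() is a hypothetical name, and the function only relies on plain PHP so it can be tried outside WordPress. It additionally skips submitted values that are not strings, which a crafted request could contain.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): keeps only the submitted
 * values that exist as keys in the list of allowed options.
 *
 * @param mixed $submitted Raw submitted data, for example $_POST['wptrt-favorite-color'].
 * @param array $options   Allowed options as key => label pairs.
 * @return array List of valid option keys, possibly empty.
 */
function wptrt_filter_allowed_choices( $submitted, array $options ) {
    $safe = array();

    // Casting to array also covers the case where a single value was submitted.
    foreach ( (array) $submitted as $value ) {
        // Skip nested arrays and other non-string values before the lookup.
        if ( is_string( $value ) && array_key_exists( $value, $options ) ) {
            $safe[] = $value;
        }
    }

    return $safe;
}

$options = array(
    'blue' => 'Blue',
    'red'  => 'Red',
);

// 'green' is not an allowed key, and the nested array is not a string.
$result = wptrt_filter_allowed_choices( array( 'blue', 'green', array( 'red' ) ), $options );
```

In the saving code, the result of such a helper could be passed to update_post_meta() when it is not empty.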

Conclusion

By now, you should have a pretty good idea of how to safely store post meta. Feel free to copy the code samples and play around with them to gain more experience. You can find all the code from this tutorial on GitHub.

In the next part, we’ll look at how to deal with custom widget settings.

#writing-secure-themes

A Guide to Writing Secure Themes – Part 3: Sanitization

In this part, we’re going to look at another technique to ensure that input is secure before using it in your code.

The difference between validation and sanitization

In the second part of this series, we talked about validation. When validating data, you are looking for certain criteria in the data. Or simply put, you’re saying “I want the data to have this, this, and this”.

Sanitization is different, because it is about removing all the harmful elements from the data. In essence you’re saying “I don’t want the data to have this, this, and this”.

But the difference is more than just conceptual. With validation, we store the data once we have verified it’s valid. If not, we discard it.

With sanitization, we take the data, and remove everything we don’t want. This means that we might change the data during the sanitization process. So in the case of user input, it is not guaranteed that all the input is kept.

So it’s important that you choose the right sanitization functions, to keep the data intact.
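To make the difference concrete, here is a small sketch that runs the same input through a validation filter and a sanitization filter. It uses PHP's filter_var() function rather than a WordPress function, and the input value is made up for the example.

```php
<?php
$input = '12abc';

// Validation: the value either meets the criteria or is rejected as a whole.
$validated = filter_var( $input, FILTER_VALIDATE_INT );

// Sanitization: unwanted characters are removed and the rest is kept.
$sanitized = filter_var( $input, FILTER_SANITIZE_NUMBER_INT );

var_dump( $validated ); // bool(false)
var_dump( $sanitized ); // string(2) "12"
```

Validation discards the input, while sanitization returns a modified value that is no longer what the user entered.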

We’re going to look at seven often used sanitization functions provided by WordPress. For each function we’ll look at what it removes, as well as its use cases.

WordPress sanitization functions

sanitize_text_field()

The main use of the sanitize_text_field() function is to sanitize the data provided by text input fields in forms. But it’s useful for sanitizing any kind of data that you want to be plain text.

sanitize_text_field() applies the following modifications to the data:

  • Removes all tags.
  • Removes whitespace from the start and end of the string.
  • Removes extra whitespace (more than a single space) between words.
  • Removes tabs and line breaks.
  • Converts single < characters into an HTML entity.
  • Removes any invalid UTF-8 characters.
  • Removes % encoded octets.

Data passed through sanitize_text_field() is safe for storage in the database. You can use it with any of the high-level functions in WordPress for saving data to the database, such as update_post_meta().

<?php
if ( ! empty( $_POST['wptrt-meta-box-data'] ) ) {
    update_post_meta( $post_id, 'wptrt-meta-box-data', sanitize_text_field( $_POST['wptrt-meta-box-data'] ) );
}
?>

sanitize_text_field() can also clean arguments passed to WordPress or custom functions that expect plain text input. In this context, other validation steps might be needed, but making sure the data is valid plain text is a good first step.

absint()

absint() is a useful function for sanitizing IDs.

WordPress uses IDs to identify posts, terms, comments, users, etc. An ID needs to be an absolute integer, meaning a whole number that’s positive.

absint() is a wrapper function for two PHP functions: intval() turns the data into an integer, and abs() makes sure that it is an absolute value.

<?php
$post_id = abs( intval( $_POST['id'] ) ); // PHP functions.
$post_id = absint( $_POST['id'] );        // WordPress function that acts as a shortcut.
?>

Integers are safe to use in any context. When you pass invalid data, such as a text string, to absint(), the return value is most likely 0. As the function internally converts the data into an integer, the rules of integer casting apply.
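The following sketch shows what these casting rules mean in practice. wptrt_absint_sketch() is a hypothetical stand-in that mirrors the abs( intval() ) combination described above, so the example can run without WordPress. In a theme, you would call absint() directly.

```php
<?php
/**
 * Hypothetical stand-in that mirrors the abs( intval() ) combination
 * described above. In a theme, call absint() directly.
 */
function wptrt_absint_sketch( $value ) {
    return abs( intval( $value ) );
}

$results = array(
    'numeric string'      => wptrt_absint_sketch( '42' ),    // 42
    'negative number'     => wptrt_absint_sketch( '-5' ),    // 5
    'leading digits only' => wptrt_absint_sketch( '42abc' ), // 42
    'decimal string'      => wptrt_absint_sketch( '3.9' ),   // 3
    'plain text'          => wptrt_absint_sketch( 'abc' ),   // 0
);
```

Note that a string with leading digits is not rejected: the digits are kept and the rest is dropped.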

In MySQL, IDs start at 1. If your code relies on the sanitized ID, you can check that the sanitized data does not equal 0 before proceeding.

<?php
$post_id = absint( $_POST['id'] );

if ( 0 === $post_id ) {
    return;
}

// Use $post_id to retrieve a post, or do something else.
// […]
?>

If you need to sanitize an integer that can be negative or positive, use the PHP function intval(). It will only cast the data to an integer.

esc_url_raw()

esc_url_raw() sanitizes URLs for safe storage in a database by stripping undesired characters and verifying the URL protocol.

The function accepts two arguments: the URL to clean, as well as an optional array of allowed protocols. URLs that don’t use the whitelisted protocol(s) will be discarded.

So if you only want to save URLs that start with https://, you can call the function like this:

<?php $clean_url = esc_url_raw( $url, array( 'https' ) ); ?>

Keep in mind that relative URLs starting with a /, #, or ?, as well as file names ending with .php will not be discarded by esc_url_raw(). So if you need an absolute URL to a website, you need to put additional checks into place.
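One possible form of such an additional check is sketched below. wptrt_is_absolute_https_url() is a hypothetical helper, and it is meant to run on a value that has already gone through esc_url_raw(). It uses PHP's parse_url() so that the example runs without WordPress.

```php
<?php
/**
 * Hypothetical helper: checks that a URL is absolute and uses https.
 * Meant to run on a value that already went through esc_url_raw().
 */
function wptrt_is_absolute_https_url( $url ) {
    if ( ! is_string( $url ) || '' === $url ) {
        return false;
    }

    $scheme = parse_url( $url, PHP_URL_SCHEME );
    $host   = parse_url( $url, PHP_URL_HOST );

    // Relative URLs such as /page, #top or ?s=test have no scheme and no host.
    return is_string( $scheme ) && 'https' === strtolower( $scheme ) && ! empty( $host );
}

$is_absolute = wptrt_is_absolute_https_url( 'https://example.org/page' ); // true
$is_relative = wptrt_is_absolute_https_url( '/page' );                    // false
```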

sanitize_email()

The sanitize_email() function performs a number of checks to detect invalid email address formats, and strips undesired characters.

It returns an empty string when the basic validity checks fail. If the email address has the right format, the sanitized address is returned.

sanitize_file_name()

The sanitize_file_name() function applies the following modifications to the data:

  • Removes special characters that are illegal in filenames on various operating systems.
  • Removes special characters that would require escaping when interacting with the file through the command line.
  • Replaces spaces and consecutive dashes with a single dash.
  • Removes periods, dashes, and underscores from the beginning and the end of the file name.
  • Adds an underscore to intermediate extensions that are not whitelisted.

sanitize_file_name() only handles sanitizing the name of the file.

It doesn’t make sure that the name is unique; you would need to use wp_unique_filename() for that.

While it handles intermediate extensions, it is not concerned with the main extension of the file. As an example, file.exe.exe will be transformed into file.exe_.exe, because .exe is not an allowed extension. file.exe will not be modified though.

You would need to use wp_check_filetype() to verify that the extension of the file is allowed on the system. The function returns an array with two keys: ext and type. Both will be set to false if the filetype is not part of the allowed MIME types.
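To illustrate the idea of checking the main extension, here is a plain PHP sketch. wptrt_has_allowed_extension() is a hypothetical helper and the list of extensions is an assumption made for this example. In WordPress, wp_check_filetype() does this job against the allowed MIME types of the site.

```php
<?php
/**
 * Hypothetical helper: checks the main extension of a file name against a
 * list of allowed extensions. The list passed in is an assumption for this
 * example, not the list WordPress uses.
 */
function wptrt_has_allowed_extension( $file_name, array $allowed_extensions ) {
    $extension = strtolower( pathinfo( $file_name, PATHINFO_EXTENSION ) );

    return '' !== $extension && in_array( $extension, $allowed_extensions, true );
}

$allowed = array( 'jpg', 'png' );

$image_ok   = wptrt_has_allowed_extension( 'photo.JPG', $allowed );     // true
$program_ok = wptrt_has_allowed_extension( 'file.exe_.exe', $allowed ); // false
```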

sanitize_key()

The sanitize_key() function is useful to deal with data that needs to be in slug form.

Slugs can only be composed of lowercase alphanumeric characters (characters from a to z and numbers from 0 to 9), dashes (-) and underscores (_). Slugs are safe to use in any context.
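The slug rule can be expressed in a few lines of plain PHP. The sketch below uses a hypothetical function name and is only meant to show which characters survive. In a theme, use sanitize_key() itself.

```php
<?php
/**
 * Hypothetical helper that applies the slug rule described above:
 * lowercase the string, then drop every character that is not
 * a-z, 0-9, a dash, or an underscore. In a theme, use sanitize_key().
 */
function wptrt_slug_sketch( $key ) {
    if ( ! is_scalar( $key ) ) {
        return '';
    }

    return preg_replace( '/[^a-z0-9_\-]/', '', strtolower( (string) $key ) );
}

$from_title = wptrt_slug_sketch( 'Featured Posts!' ); // 'featuredposts'
$unchanged  = wptrt_slug_sketch( 'my-tag_1' );        // 'my-tag_1'
```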

Imagine that you have a theme option that allows users to enter a tag used to display featured posts in a slider.

<?php
$tag = sanitize_key( $_POST['featured-tag'] );
?>

Users will enter the tag into a text field, so using sanitize_text_field() would also be correct. In this case, using sanitize_key() is preferred, because it removes more unwanted data.

In addition, you will most likely query for the posts displayed in the slider using the tag slug. With the right sanitization function, you ensure that the data is a valid argument to pass to WP_Query.

sanitize_title()

The sanitize_title() function turns post titles into their slug form.

To do this, sanitize_title():

  • Removes PHP and HTML tags.
  • Removes accents.
  • Replaces spaces and periods with dashes.

This function is useful when you need to query for a post by name. You can safely pass the sanitized data to the name argument in WP_Query.

This concludes our look at some of WordPress’ sanitization functions. In the next section, we’re briefly going to touch on the sanitization functions provided by the PHP language itself.

PHP sanitization functions

For sanitization, you can use the same functions that we have discussed in the second part on validation. They are:

  • filter_input(): Retrieves an external variable (from $_GET, $_POST, $_SERVER,…) and applies the specified filter.
  • filter_input_array(): Works the same as filter_input(), but allows multiple values to be retrieved with one call.
  • filter_var(): Filters the variable passed as an argument.

When using these functions, you need to indicate a filter to use. The sanitization filters can be combined with flags to achieve a specific behavior.

As always, make sure to read the documentation carefully, because only the right filter, combined with the right flags and the right options, makes sure that all invalid data is removed.
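As a small illustration of how much a flag can change the result, the sketch below sanitizes the same made-up input with and without FILTER_FLAG_ALLOW_FRACTION.

```php
<?php
$input = '12.5px';

// Without a flag, the filter also strips the decimal point.
$without_flag = filter_var( $input, FILTER_SANITIZE_NUMBER_FLOAT );

// FILTER_FLAG_ALLOW_FRACTION keeps the decimal point.
$with_flag = filter_var( $input, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION );

var_dump( $without_flag ); // string(3) "125"
var_dump( $with_flag );    // string(4) "12.5"
```

Without the flag, the sanitized value is a valid number, but it is ten times larger than what the user entered.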

Conclusion

In this part of our series, we have seen what sanitization is, and what WordPress and PHP functions you can use.

In the next part, we are going to see how to put all the things we have seen so far into practice when dealing with post meta data, custom widget settings, as well as the Customizer and the Settings API.

If you have any specific use cases you’re wondering about, please let me know in the comments, and we may look at this during the next part.

#writing-secure-themes

A Guide to Writing Secure Themes – Part 2: Validation

Validation is a technique to ensure that input is secure before using it in your code.

When validating data, you are verifying that it corresponds to what the program needs. This only works if you have a list of criteria that you can check to determine that the data is valid.

Whitelisting

The simplest validation method is whitelisting. This only works when there is a precise set of possible values that the data can have.

Let’s look at how whitelisting can be used for validating a theme option controlling the position of the sidebar.

Here is the code that we used to create the setting and the control:

$wp_customize->add_setting( 'sidebar-position', array(
    'default'           => 'left',
    'sanitize_callback' => 'wptrt_validate_sidebar_position',
) );

$wp_customize->add_control( 'sidebar-position-control', array(
    'label'    => esc_html__( 'Sidebar Position', 'wptrt' ),
    'section'  => 'theme',
    'settings' => 'sidebar-position',
    'type'     => 'radio',
    'choices'  => array(
        'left'  => esc_html__( 'Left', 'wptrt' ),
        'right' => esc_html__( 'Right', 'wptrt' ),
) ) );

The user only has two choices: left or right. This means that in the wptrt_validate_sidebar_position() function, we can determine whether the submitted option is one of the two possible values.

function wptrt_validate_sidebar_position( $sidebar_position ) {
    if ( in_array( $sidebar_position, array( 'left', 'right' ), true ) ) {
        return $sidebar_position;
    }
}

To do this, we use the in_array() PHP function. This function returns true when the needle, the submitted value for the position of the sidebar, is in the haystack, the list of possible positions.

The third parameter of the in_array() function is to enable strict type comparison. We pass true as an argument, to enable the strict checking. This is important, because in PHP loose type comparison can lead to unexpected results.
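Here is a short sketch of such an unexpected result. The list of allowed values is made up for the example.

```php
<?php
$allowed = array( '10', '20' );

// Loose comparison: both strings are numeric, so PHP compares them as numbers.
$loose = in_array( '1e1', $allowed );

// Strict comparison: the strings must be identical.
$strict = in_array( '1e1', $allowed, true );

var_dump( $loose );  // bool(true)
var_dump( $strict ); // bool(false)
```

With loose comparison, a value that is not in the list passes the check, because 1e1 and 10 are the same number.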

So whitelisting simply means that we compare the submitted data against a list of acceptable values. This works well for controls such as checkboxes, radio buttons, selects, and dropdowns.

But how can we validate data for which we don’t know the possible values? Let’s have a look at validating data according to a set of qualifications.

Qualifying data

When qualifying data, we try to find out whether it meets a precise set of criteria. Let’s look at an example of validating data.

Imagine that you have a meta box that allows users to enter a value for the width (in pixels) of the content area of a particular post. While not being a super useful feature in a theme, this example allows us to demonstrate the use of filter_input().

The filter_input() PHP function gets a variable and validates it. The function accepts four arguments: the type of input, the name of the variable to get, the filter (validation) to apply, and an optional array of options.

<?php
$content_area_width = filter_input(
                INPUT_POST,
                'content_area_width',
                FILTER_VALIDATE_INT,
                array( 'options' => array(
                    'default'   => 500,
                    'min_range' => 100,
                    'max_range' => 1000,
                ) )
            );
?>

Although the code for this function might seem verbose, it’s much shorter and clearer than writing it all out:

<?php
// Warning: This code does not work correctly.
if ( isset( $_POST['content_area_width'] ) && is_int( $_POST['content_area_width'] ) && $_POST['content_area_width'] >= 100 && $_POST['content_area_width'] <= 1000 ) {
    $content_area_width = $_POST['content_area_width'];
} else {
    $content_area_width = 500;
}
?>

You might wonder why there is a warning about this code not working. It looks good, right? The problem is that is_int( $_POST['content_area_width'] ) will always return false, so this code will always set the width to 500.

This is because data retrieved from the $_GET and $_POST super globals is always of the type string. Using the filter_input() function allows us to get around this limitation of the PHP language.
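The sketch below shows the same validation applied to string values, the way they arrive from a form. It uses filter_var() instead of filter_input() so that it can run without a submitted request. wptrt_validate_width() is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: validates a width submitted as a string, using the
 * same filter and options as the filter_input() call above.
 */
function wptrt_validate_width( $raw_value ) {
    return filter_var(
        $raw_value,
        FILTER_VALIDATE_INT,
        array(
            'options' => array(
                'default'   => 500,
                'min_range' => 100,
                'max_range' => 1000,
            ),
        )
    );
}

$type_check = is_int( '250' );                // false, the value is a string
$in_range   = wptrt_validate_width( '250' );  // 250, returned as an integer
$too_small  = wptrt_validate_width( '50' );   // 500, the default
$not_number = wptrt_validate_width( 'abc' );  // 500, the default
```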

Choosing the right qualifications

When validating data, it’s crucial that you choose the right set of qualifications, and express this correctly in the code.

Imagine that you have a Customizer setting in your theme for entering a link to a Twitter profile. You want to have a valid URL for this setting, so you use the filter_var() PHP function with the FILTER_VALIDATE_URL filter.

<?php
// Warning: Insecure code!
function wptrt_validate_twitter_profile_url( $url ) {
    return filter_var( $url, FILTER_VALIDATE_URL );
}
?>

The next thing you do is output the validated URL in your theme:

<?php
// Warning: Insecure code!
echo '<a href="' . $twitter_url . '">' . esc_html__( 'Twitter', 'wptrt' ) . '</a>';
?>

In the four lines of code that we have seen so far, we have made two crucial mistakes:

  1. We trusted the filter_var() function to validate the URL to the Twitter profile.
  2. We didn’t escape the URL on output.

We are going to look at escaping in a later part of this series. For now let’s look at why the validation was too weak to be secure.

The problem is that javascript://test%0Aalert(321) is also considered a valid URL. As soon as a user clicks on the Twitter link on the front end of the site, a JavaScript dialog appears.

We need to add additional checks to our function:

function wptrt_validate_twitter_profile_url( $url ) {
    if ( 0 !== strpos( $url, 'https://twitter.com/' ) ) {
        return;
    }

    return filter_var( $url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED );
}

This function now verifies that the data meets three qualifications:

  1. The URL starts with https://twitter.com/.
  2. The URL is valid according to the RFC 2396 standard.
  3. The URL has a path component (as in http://example.org/path).
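The following sketch compares the filter on its own with the stricter function. The function body repeats the checks described above so that the example runs on its own. The is_string() guard is an addition made for this sketch, and the profile URL is only an example value.

```php
<?php
/**
 * Same checks as described above, repeated so the example is self-contained.
 * The is_string() guard is an addition for this sketch.
 */
function wptrt_validate_twitter_profile_url( $url ) {
    if ( ! is_string( $url ) || 0 !== strpos( $url, 'https://twitter.com/' ) ) {
        return null;
    }

    return filter_var( $url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED );
}

$malicious = 'javascript://test%0Aalert(321)';

// The filter alone accepts the value and returns it unchanged.
$filter_only = filter_var( $malicious, FILTER_VALIDATE_URL );

// The stricter function rejects it, and accepts a profile URL.
$rejected = wptrt_validate_twitter_profile_url( $malicious );
$accepted = wptrt_validate_twitter_profile_url( 'https://twitter.com/wordpress' );
```

Even with this validation in place, the URL still needs to be escaped on output.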

Validation functions

WordPress validation functions

WordPress only has a couple of validation functions.

  • is_email(): Checks whether the data is a valid email address. The validation done by the function does not comply with the RFC 822 standard, and does not work with internationalized domain names.
  • wp_validate_boolean(): Despite the name, this function not only validates, but also sanitizes the data passed to it. So the return value will always be a boolean. You can use filter_var( $var, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE ) as an alternative, as it returns NULL when the passed data is not valid.
  • sanitize_hex_color(): This is actually a validation function, as it returns null if the color code isn’t valid. It is only available in the Customizer context, but it’s a small function so you can copy the code to your own validation function if needed.
  • sanitize_hex_color_no_hash(): The same as sanitize_hex_color() but for values without a leading #.
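The filter_var() alternative mentioned for wp_validate_boolean() can be sketched as follows. The input values are made up for the example.

```php
<?php
$flags = FILTER_NULL_ON_FAILURE;

$yes     = filter_var( 'yes', FILTER_VALIDATE_BOOLEAN, $flags );   // true
$off     = filter_var( 'off', FILTER_VALIDATE_BOOLEAN, $flags );   // false
$invalid = filter_var( 'maybe', FILTER_VALIDATE_BOOLEAN, $flags ); // null
```

The null return value makes it possible to tell an invalid value apart from a valid false.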

PHP validation functions

PHP offers a number of validation functions. As we have seen previously, using them can be a bit tricky. So make sure to read the documentation carefully, including the notes.

  • is_bool(): Returns true if the passed variable is of the type boolean.
  • is_float(): Returns true if the passed variable is of the type float.
  • is_int(): Returns true if the passed variable is of the type integer.
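  • is_numeric(): Returns true if the passed variable is a number or a numeric string. Keep in mind that this covers more than plain integers: signed values, decimals, and exponential notation such as 1e5 are all considered numeric. Hexadecimal strings such as 0x1A were also accepted before PHP 7.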
  • strtotime(): Not a validation function strictly speaking, but can be used as such to validate dates. The function returns false if the passed data cannot be converted into a timestamp.
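The difference between the type checks and is_numeric() matters for form data, which arrives as strings. A short sketch with made-up values:

```php
<?php
$int_check      = is_int( 5 );          // true
$string_check   = is_int( '5' );        // false, the value is a string
$numeric_string = is_numeric( '5' );    // true
$signed_decimal = is_numeric( '-1.5' ); // true
$exponential    = is_numeric( '1e5' );  // true
$plain_text     = is_numeric( 'abc' );  // false
```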

Next we have a family of functions that have been specifically designed to validate data.

  • filter_input(): Retrieves an external variable (from $_GET, $_POST, $_SERVER,…) and applies the specified filter.
  • filter_input_array(): Works the same as filter_input(), but allows multiple values to be retrieved with one call.
  • filter_var(): Filters the variable passed as an argument.

When using these functions, you need to indicate a filter to use. The validation filters can be combined with flags to achieve a specific behavior. Some filters also accept additional options.

It’s the combination of the right filter, with the right flags, and the right options that makes these functions do their work correctly.

Conclusion

Now that we have a solid grasp on how validating data works, we’ll look at sanitization in the next part of this series.

#writing-secure-themes

A Guide to Writing Secure Themes – Part 1: Introduction

As a developer, keeping your users secure should be your most important priority.

Having a theme available on WordPress.org is a huge responsibility, because security issues make every site running the theme potentially vulnerable.

This guide will give you an introduction to the techniques you can apply to write secure code.

The guide is broken up into parts to make it easier to read and apply. It contains everything I learned over the past three years while reviewing themes for WordPress.org, premium themes for WordPress.com, as well as themes and plugins for WordPress.com VIP.

Before we get to the techniques, let’s have a look at the principles of secure code.

Principles of secure code

Writing secure code is not about using a particular function, tool, or workflow. Those things change over time, with new development techniques emerging and new security issues arising.

The common element that connects all these things together is the state of mind of the developer. This mindset is based on three principles:

  1. Don’t assume anything. Only act on what you know for sure.
  2. Don’t trust any data. Consider data invalid and insecure until proven valid and secure.
  3. Don’t become complacent. Web technologies evolve, and so do best practices.

With these principles in mind, let’s clarify the meaning of a few terms we’re going to use in this series.

Commonly used terms

Input and output

When we talk about input, this designates all the data that is given to our code.

The most prevalent use case is information entered by the user, for example into a form field or the browser address bar. But it also encompasses data retrieved from stored cookies or from external services, like the Twitter API.

Themes deal with this data in various ways. They might store it into the database, use it to retrieve data from the database, or display it to the user.

When we talk about displaying information, we use the word output. But output is not just what we see on the screen, it’s all the data provided by our code.

Imagine a PHP script that passes data to a JavaScript script, such as data used to initialize a slider. In this case, the PHP script outputs the data that is then used as input by the JavaScript.

If your code connects to a REST API, the JSON data returned by the API is its output, which your code then uses as input.

Dynamic and static data

When we talk about dynamic or static data, this is not to be confused with the static keyword in PHP.

When we talk about static data, we designate data that cannot be changed except by changing the code. Here is an example:

<?php echo 'Hello World'; ?>

So when you read static, think of static HTML pages. These documents cannot return information that is not present in their source code.

Dynamic data on the other hand can be modified through different ways. For example:

<?php echo __( 'Hello World', 'wptrt' ); ?>

In this code sample, the __() translation function returns data. This data can be filtered, or modified by loading a translation.

What we are outputting is the return value of the function. Let’s look at this in more detail.

Return values

Return values are the data provided by a function. In PHP you often see return statements like this one in functions:

<?php
function wptrt_add_numbers( $a, $b ) {
    return $a + $b;
}
?>

Functions can return all kinds of data. PHP 7.0 added return type declarations, but code that has to run on older PHP versions cannot use them, so many functions give no guarantee about the type of data they return.

This is important to keep in mind, because a lot of WordPress functions contain filters. So you can never be sure about the data that a certain function returns.

Now that we have seen the vocabulary, we’ll look at common attacks.

Common attacks

In order for you to secure your code, you need to understand how attacks work.

A good starting point is to read through the list of the Top 10 attacks in 2013, published by the Open Web Application Security Project (OWASP).

Google Application Security also has a very good introduction to Cross-Site Scripting (XSS) attacks. You can actually test out these attacks in the browser.

If you are interested in specific attacks for WordPress, I recommend reading the Sucuri Blog.

Conclusion

This part should have provided you with a good overview of what security is, the related terminology, and the type of attacks encountered.

In the next part, we’re going to look at how you can protect against some of these attacks by validating data before use.

#writing-secure-themes