OMG EMOJI 😎

One of the fun bits of the utf8mb4 upgrade is that we can now store emoji! Once your site is upgraded to utf8mb4, it can natively store any emoji character. There are a couple of problems with that, though:

  • Some browsers don’t know how to render emoji 👎, or they have bugs in their implementation 😢. Notably, Chrome either doesn’t work or has bugs, older versions of IE don’t work, and Firefox has bugs.
  • Not all sites will be able to upgrade to utf8mb4, which means they’ll be unable to save emoji characters that they enter.

Not being able to use emoji makes everyone a sad panda (😭🐼), so we need to fix this. There are a few moving parts to our emoji support, so let’s go through them.

utf8 backwards compatibility

If a site can’t be upgraded to utf8mb4, we convert emoji to their HTML-encoded equivalent and store that instead. From a UI perspective, post editing works as expected 🎉. Because fields need to be whitelisted to support this, we don’t allow it everywhere – just post_title, post_content and post_excerpt. We also allow it in the site title and the site description.
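To make the conversion concrete, here is a minimal Python sketch of the idea. It is not the WordPress implementation: the function name `encode_supplementary` is made up for illustration, and it simply assumes that every character above U+FFFF (the ones MySQL's 3-byte utf8 can't store) should become an HTML numeric character reference.

```python
def encode_supplementary(text: str) -> str:
    """Replace each character above U+FFFF with an HTML hex character reference.

    Hypothetical helper for illustration only; characters in the Basic
    Multilingual Plane are left untouched because 3-byte utf8 can store them.
    """
    return "".join(
        f"&#x{ord(ch):x};" if ord(ch) > 0xFFFF else ch
        for ch in text
    )


if __name__ == "__main__":
    # U+1F389 is the party popper emoji used in the paragraph above.
    print(encode_supplementary("Party \U0001F389 time"))  # Party &#x1f389; time
```

A browser decodes the reference back into the original character when it renders the page, which is why post editing still looks the same from the UI.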

Browser support

There’s a small compatibility check included on every page, both on the front end, and in the Dashboard. For those interested, this adds 1-4ms (⚡️-fast!) to the page render time – the aim was to keep this as low as possible, to avoid affecting your UX. If you notice a little chunk of compressed JS at the top of your HTML, that’s probably it. If you’d like to check out how it works in a more readable format, have a look through wp-emoji-loader.js.

Email ✉️ and RSS 📯 (There’s no RSS emoji, give me a break.)

Of course, the JS shim won’t work in email and RSS. So, we replace all emoji with a static PNG version in those cases.
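As a rough illustration of that static replacement, the Python sketch below swaps each character above U+FFFF for an image tag. Everything specific here is an assumption: the base URL is a placeholder, the `emoji_to_img` name is hypothetical, and naming each file after the lowercase hex code point is just a convenient convention for the example.

```python
import html

# Placeholder location for the PNG files; not a real CDN address.
EMOJI_BASE_URL = "https://example.com/emoji/"


def emoji_to_img(text: str) -> str:
    """Replace each character above U+FFFF with an <img> tag (illustrative only)."""
    parts = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            src = f"{EMOJI_BASE_URL}{ord(ch):x}.png"
            # Keep the original character as alt text so readers that
            # ignore images still get the emoji itself.
            parts.append(f'<img src="{html.escape(src)}" alt="{ch}" class="emoji" />')
        else:
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    print(emoji_to_img("Feed item \U0001F4A9"))
```

Because the markup is generated on the server, it works in clients that never run JavaScript, which is the situation email and RSS readers are in.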

TinyMCE 📝

In addition to the browser support JS, there’s a TinyMCE plugin that handles keeping emoji looking good, while you type. It’s pretty magical.

Taxonomies and URL slugs

You can totally make taxonomies and URL slugs with emoji in them. Because we love you, and want you to be happy. 😀

So, that’s about it. If you have any questions about the implementation, drop them in the comments below.

💩

#4-2, #dev-notes, #%f0%9f%91%bb, #emoji

Emoji Chat Agenda, February 25, 2015

Here’s the agenda for Wednesday’s Emoji Chat in the #core channel on Slack.

Time/Date: Immediately after the Dev Chat.

  1. Emoji Helper – On platforms that don’t have an emoji keyboard, should we provide one?
  2. Performance – When we’re falling back to Twemoji on large pages, we have to consider how it will perform. Are there faster ways of replacing the emoji characters with images?
  3. Open Floor – There’s still time to rant about the evils of emoji! Let us know how you really feel.

#agenda, #emoji, #x1f4a9

Emoji Chat Meeting Notes, February 12, 2015

The full meeting archive is available here.

1. Why we’re doing this

So, here’s a bit of back story.

As of r31349, WordPress partially supports emoji. ~60% of WordPress sites are running MySQL 5.5 or later (so can be upgraded to store emoji), and ~40% of browsers natively support emoji. Emoji are a wildly popular method of communication, so we can expect them to be heavily used as soon as they’re available. The problem is, 60%/40% means a really bad experience for a huge number of our users, who’ll try to use emoji, and fail.

This is where the emoji feature plugin comes into play. In order to help the 40% of WordPress sites that can’t be upgraded to store emoji natively, the wp_encode_emoji() function will turn them into HTML entities. Due to the unimaginable joy that character sets bring me, this will only be applied to sites using the utf8 character set, which accounts for the vast majority of WordPress sites – utf8 has been the default character set since r4860.

To help the 60% of browsers that don’t display emoji natively, we’re using the Twemoji image set as a fallback. This lets us show emoji everywhere, without causing extra load where emoji are already supported, mobile browsers being the important example here.

Now, there have been some concerns brought up previously that I’d like to address.

“Is this really appropriate for core?”

Yes. (Obviously that’s my answer, or we wouldn’t be here.) WordPress is in the business of making communication simple and accessible for all. Tech users everywhere have clearly chosen emoji as a means of communication, so it’s up to us to make sure they can do that within WordPress as easily as possible, or risk being left behind.

“Should we be concerned about changing the images in the future? Wouldn’t we be altering users’ content?”

No. By using Twemoji only when we can’t provide native support for emoji, it’s a pretty clear message that while the general appearance of emoji stays the same, the actual sprite used can differ significantly between platforms. (For example, every emoji set except Android uses a left hand for :thumbsup:.) As more browsers add native support for emoji, Twemoji usage will drop, reducing even further any impact we can have on users.

And so, that brings us to today.

2. The current state of the plugin

The plugin is very close to done. The editor plugin needs some attention, which @azaozz will be providing soon. There are a few bugs to discuss, which are mostly around fallback behaviour in browsers that don’t support emoji natively. Apart from that, the basic functionality is pretty much how I would expect it to appear in core. It’s had a brief review from the accessibility team, with only some minor alterations needed. The Twemoji images won’t be included in wordpress.zip, as it’s a total of 3.4MB of images. They’re currently hosted on WP.com’s CDN, but we’re investigating other options for where to host them, probably the W.org CDN. Given that the wp-admin Dashboard also loads things from Google, I have no problem with hosting them on an external CDN. There will naturally be a filter on the URL, to allow local hosting for sites that don’t want to use the CDN.

One of the major concerns at the moment is that we’re going to be splitting data formats, depending on whether the site uses the utf8mb4 or the utf8 character set. utf8mb4 stores emoji natively, while utf8 requires us to HTML encode the emoji characters. In the future, we’ll look at upgrading sites to utf8mb4 if they’ve upgraded their MySQL since WordPress 4.2, but that leaves the potential for mixed encoding – old posts having HTML encoding, new posts having the native characters. A post will be automatically updated to native upon saving, but do we need to consider upgrade routines, to go through all old posts and convert them?

Export/import also needs thorough testing, particularly when importing and exporting between sites having different character sets.

3. Unicode 8.0: the future of emoji

To talk about the future of emoji, you need to know a little bit of history. At the basic level, emoji are all single characters defined within the Unicode standard. However, they also support modifiers. Modifiers are a second character following the first, which usually causes the two characters to be merged into a single character when rendered. A good example of this is flag emoji.

The character G is U+1F1EC. The character B is U+1F1E7 (these characters are different to their ASCII equivalent). When used individually, they’ll display as that letter. When combined next to each other, they’ll display as the British flag.
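The pairing can be checked with a few lines of Python. This is only a sketch: the `flag` helper is a made-up name, and it relies on the regional indicator letters being laid out in alphabetical order starting at U+1F1E6 for A, which is consistent with the two code points quoted above.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # regional indicator symbol letter A


def flag(country_code: str) -> str:
    """Build a flag sequence from a two-letter code (illustrative helper)."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter code such as 'GB'")
    # Shift each ASCII letter into the regional indicator block.
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


if __name__ == "__main__":
    gb = flag("GB")
    print([f"U+{ord(c):X}" for c in gb])  # ['U+1F1EC', 'U+1F1E7']
    print(len(gb))                        # 2 code points, shown as one flag
```

Whether the result shows up as a flag or as two boxed letters depends entirely on the font and platform doing the rendering; the stored data is the same two characters either way.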

So, Unicode 8.0 will add two interesting things: a set of 37 new emoji, and skin tone modifiers. When a skin tone modifier character is placed after any face or person emoji, the emoji will show with that skin tone. Unicode 8.0 is due to be finalised in August 2015, so we (and Twemoji) will be looking at adding support for these then.

From a technical perspective, it just means we need to be aware that emoji are not always one character, and the methods for detecting multi-character emoji are about to get more complex.
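To show what "not always one character" means for detection, here is a deliberately rough Python sketch. The code point ranges are assumptions picked for the example rather than a complete emoji definition: U+1F1E6 to U+1F1FF covers the regional indicator letters, U+1F3FB to U+1F3FF covers the five skin tone modifiers, and the base range is a broad guess. The `find_emoji` name is hypothetical.

```python
import re

# Rough pattern: a flag is two regional indicators, and any other emoji
# may be followed by one skin tone modifier. Not an exhaustive definition.
EMOJI_SEQUENCE = re.compile(
    "[\U0001F1E6-\U0001F1FF]{2}"
    "|[\U0001F300-\U0001FAFF][\U0001F3FB-\U0001F3FF]?"
)


def find_emoji(text: str) -> list[str]:
    """Return emoji sequences, keeping multi-character ones together."""
    return EMOJI_SEQUENCE.findall(text)


if __name__ == "__main__":
    sample = "ok \U0001F44D\U0001F3FD from \U0001F1EC\U0001F1E7"
    for seq in find_emoji(sample):
        print(len(seq), [f"U+{ord(c):X}" for c in seq])
```

Matching the longer sequences first matters: if each code point were replaced on its own, a flag would turn into two letter images and a skin-toned thumbs up into a yellow hand followed by a colour swatch.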

We’ll also be able to detect if a browser is able to render the new emoji and skin tones, and fall back to Twemoji if it can’t. I don’t have a timeline for when browsers will support the new emoji, so I think it’d be good for us to get ahead of the curve here.

utf8mb4 stores anything in the Unicode address space, including unallocated characters, so I don’t expect any problems with storage of new emoji.
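The storage difference comes down to bytes per character: MySQL's utf8 holds at most three bytes per character, while utf8mb4 allows four, which is enough for every Unicode code point. A quick Python check, with `fits_in_mysql_utf8` being a made-up helper name for this sketch:

```python
def fits_in_mysql_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes when encoded as UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    for sample in ("caf\u00e9", "\u2709", "\U0001F4A9", "\U0010FFFF"):
        sizes = [len(ch.encode("utf-8")) for ch in sample]
        print(repr(sample), sizes, fits_in_mysql_utf8(sample))
```

The last sample is the highest code point Unicode defines, and it still encodes to four bytes, so new emoji allocated anywhere in that space need no schema change.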

#emoji, #x1f4a9

Emoji Chat Agenda, February 12, 2015

Here’s the agenda for Thursday’s Emoji Chat in the #core channel on Slack.

Time/Date: February 12 2015 23:00 UTC

  1. Why we’re doing this – @pento
  2. The current state of the plugin – @pento
  3. Unicode 8.0, and future plans – @pento
  4. Open Floor/ranting about the evils of emoji – everyone

See you tomorrow!

#emoji, #x1f4a9

Emoji Feature Plugin for 4.2

It’s time for a weekend fun feature! Now that #21212 is complete, WordPress kind of supports Emoji (for the 60% of WordPress sites using MySQL 5.5+, and the 30-40% (by usage) of browsers that natively display Emoji – including when Chrome for OS X adds support in the next month or so).

In order to complete this support, I’ve created a feature plugin called x1f4a9, which makes use of Twitter’s Open Source twemoji icon set, the same as WordPress.com recently added.

I’ve added a few tickets to the Github project, feel free to add any others you think of, and pull requests are always welcome! If you’d like to test the plugin, daily builds are available from the plugin repo.

(And if you’re using MySQL older than 5.5, please pay special attention to this ticket.)

#emoji, #feature-plugins, #kickoff, #updates, #x1f4a9