Make WordPress Core

Tagged: stats

  • Jen 11:45 am on December 28, 2012 Permalink
    Tags: stats   

    Contributor Stats 

    Going to be working on a project around stats for the contributor community, something at which we currently suck (even creating the list of people with props each release is still a fairly manual process). For the sake of this exercise, ignore the voice in your head that thinks, “There’s no way to gather that information,” or “We’d need a new API for that,” and just brainstorm. What stats would it be cool for us to have about the activity of core contributors? Leave your ideas in the comments, and they’ll be cobbled together into a big list that I take to Otto to see what’s possible (at which point Nacin can start daydreaming about APIs, but not until then).

    • Jane Wells 11:52 am on December 28, 2012 Permalink | Log in to Reply

      Some of my ideas:

      • Automated props list
      • Number of props per person per release (plus high/low/average for most of these)
      • Number of contributors with props in prior releases (43 1st time props, 26 2-releases, 66 3-releases, 12 4-releases, etc)
      • Lines of code contributed, also per patch, also per person, also per person per patch
      • Number of screens affected by UI changes
      • Number of strings being translated (this one will be more of a polyglots thing)
      • Number of hours spent on creating patches/reviewing and testing patches/running unit tests/etc by contributors
      • Average time from ticket open to ticket close
      • Average number of people commenting on a ticket
      • Average number of comments per person on a ticket
      • How many tickets each person created, how many commented on, how many uploaded a patch to

      • Helen Hou-Sandi 2:24 pm on December 28, 2012 Permalink | Log in to Reply

        I’m not a fan of the lines of code metric – removal is just as valuable, and sometimes more so. I probably come out close to zero 🙂

        • Jane Wells 2:28 pm on December 28, 2012 Permalink | Log in to Reply

          Sorry, imprecise language on my part. Because I’ve been indoctrinated so well, my “lines of code” is shorthand for “lines of code changed,” not “lines of code added,” so removal/cleanup patches would still count based on lines affected.
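To make the "lines of code changed" reading concrete, here is a minimal sketch that counts added and removed lines in a unified diff and reports their sum. The function name and the sample patch (including its file path) are hypothetical, and the header check is a simplification, not how any WordPress tooling actually measures patches.

```python
def lines_changed(diff_text):
    """Return (added, removed, changed) line counts for a unified diff.

    Simplification: any line starting with '+++' or '---' is treated as a
    file header, so a removed line whose content begins with '--' would be
    missed.
    """
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue  # file header, not a content change
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed, added + removed


# Hypothetical cleanup patch: removes two lines, adds one.
SAMPLE = """\
--- a/wp-includes/example.php
+++ b/wp-includes/example.php
@@ -1,4 +1,3 @@
 <?php
-$unused = true;
-echo 'old';
+echo 'new';
 // end
"""

added, removed, changed = lines_changed(SAMPLE)
print(added, removed, changed)
```

Under this measure a cleanup patch that mostly deletes code still scores by lines affected, which is the point made above.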

    • Benjamin J. Balter 1:58 pm on December 28, 2012 Permalink | Log in to Reply

      • Number and Percentage of first time contributors
      • Number / Percentage of contributions made on personal time versus for work
      • Number / Percentage of Automattician contributions
      • Number / Percentage of core team vs. community contributions
      • Number of people who opened their first ticket ever
      • Ratio of tickets opened to tickets closed
      • Velocity
      • Number of plugins created
      • Number of commits to plugins
      • Number of contributors added to plugins (imagine this is super-low)
      • First time plugin creators
      • Number of plugins that haven’t had a commit in X months
      • Number of plugin support forum tickets opened
      • Number of forum comments
      • Number of tickets closed
      • Open / closed support forum ticket ratio
      • Number of posts on dev blogs
      • Number of comments in dev blogs
      • Lines in IRC
      • Number of first timers in dev chat
      • Number of flame wars on twitter
      • Number of posts on nacin.org
      • Running average of all of the above to baseline (e.g., this release had N fewer contributors than average)

      • Jane Wells 2:29 pm on December 28, 2012 Permalink | Log in to Reply

        Hey Ben. This one is just for core. Stats on plugins etc are being brainstormed on those make/teamname sites. 🙂

    • Scott Taylor 4:20 pm on December 28, 2012 Permalink | Log in to Reply

      I like the idea of a historical leaderboard. Number of patches committed historically plus last commit date would separate “was active” and “is active.” I ran some perl / grep / sed earlier in the year to see how many patches Sergey had committed and it blew my mind. Also, I like the idea of distinguishing who works for Automattic-like companies and who doesn’t. If your job is *core*, it’s a lot easier to crush it every release than the people who are using 100% of their spare time to do it.
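A historical leaderboard along these lines could be sketched as below. The commit data is invented, and the regular expression assumes credits appear in commit messages as "props name, name." ending in a period; real messages may be less regular, so treat this as a rough illustration only.

```python
import re
from datetime import date

# Hypothetical commit log: (commit date, commit message).
COMMITS = [
    (date(2012, 3, 1), "Fix notice in the editor. props alice. fixes #1."),
    (date(2012, 9, 12), "Clean up menu CSS. props bob, alice. see #2."),
    (date(2012, 12, 1), "Docs typo. props carol."),
]

# Assumed convention: "props" followed by comma-separated names and a period.
PROPS_RE = re.compile(r"\bprops\s+([^.]+)\.", re.IGNORECASE)


def leaderboard(commits):
    """Return [(name, (credited_commits, last_credit_date))], most credits first."""
    board = {}
    for day, message in commits:
        for match in PROPS_RE.finditer(message):
            for name in (n.strip() for n in match.group(1).split(",")):
                if not name:
                    continue
                count, last = board.get(name, (0, day))
                board[name] = (count + 1, max(last, day))
    return sorted(board.items(), key=lambda item: (-item[1][0], item[0]))


def is_active(last_credit, today, max_days=180):
    """'Is active' versus 'was active', using an arbitrary 180-day cutoff."""
    return (today - last_credit).days <= max_days


board = leaderboard(COMMITS)
for name, (count, last) in board:
    print(name, count, last, is_active(last, date(2012, 12, 28)))
```

Pairing the count with the last credit date is what separates the two groups: a high count with an old date reads as "was active".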

    • Matt Mullenweg 5:18 pm on December 28, 2012 Permalink | Log in to Reply

      In terms of what can be extracted from SVN, Ohloh has some interesting stats for WP:


      They also have some interesting historical data: in Dec 2007 our codebase was 34.5% JavaScript, and in Dec 2012 we’re at about 19.6% (lines of PHP have increased way faster than lines of JS).

    • Aaron Jorbin 8:41 pm on December 28, 2012 Permalink | Log in to Reply

      Number of people reporting bugs
      Mean, median and mode of number of bugs reported by a person
      Companies that contributed employee time towards WordPress core (and the number of hours per time period; not sure if per month or per release is a better metric here)
      Number of patches submitted per ticket
      Days between posts on nacin.com
      Number of core contributors who don’t run their own blogs on WordPress

    • Andrea Rennick 10:28 pm on December 28, 2012 Permalink | Log in to Reply

      Contributor status over releases. Like, if someone was around for 3.3, but not for 3.4 and back again for 3.5

      And attrition rate – do people drop off or ramp up? Why or why not?

      How many tickets or patches does one file, on average, before they get props that go in?
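The "contributor status over releases" and attrition ideas can be sketched from nothing more than a per-release set of credited names. The release data below is made up, and the four categories (new, returning, retained, dropped) are one possible way to slice it, not an agreed definition.

```python
# Hypothetical data: who received props in each release.
RELEASES = ["3.3", "3.4", "3.5"]
PROPS = {
    "3.3": {"alice", "bob", "carol"},
    "3.4": {"alice", "dave"},
    "3.5": {"alice", "bob", "dave", "erin"},
}


def presence(props, releases):
    """Map each contributor to a list of booleans, one per release."""
    people = set().union(*props.values())
    return {p: [p in props[r] for r in releases] for p in sorted(people)}


def churn(props, releases):
    """Per release: first-timers, people back after a gap, carry-overs, drop-offs."""
    seen = set()       # everyone credited in any earlier release
    previous = set()   # everyone credited in the immediately preceding release
    result = {}
    for release in releases:
        current = props[release]
        result[release] = {
            "new": len(current - seen),
            "returning": len((current & seen) - previous),
            "retained": len(current & previous),
            "dropped": len(previous - current),
        }
        seen |= current
        previous = current
    return result


print(presence(PROPS, RELEASES)["bob"])
print(churn(PROPS, RELEASES)["3.5"])
```

Here "bob" is the around-for-3.3, gone-for-3.4, back-for-3.5 case, and the "dropped" count per release is a crude attrition figure. The "why or why not" still needs a human to ask.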

  • Joseph Scott 10:32 pm on August 31, 2010 Permalink
    Tags: stats

    Previously we’d talked about putting up a stats page on WordPress.org (WPORG) so that more people could see what was happening. While working on some of the new stats processing code on WPORG I realized that people would likely end up scraping this data for their own uses. That seemed like a waste, so instead as a first run the stats numbers are available in JSON format via:


    A few notes about these numbers. First, they are summary percentages for the previous day (where day is based on GMT). You’ll also notice that these numbers don’t really line up with each other; this is because the system normalizes the version numbers and throws out odd/invalid versions (I was surprised by how many odd version strings there are out there). As a result each category is best compared to itself, instead of trying to compare PHP with MySQL numbers.
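As an illustration of why the categories do not line up, here is a rough sketch of normalizing version strings, discarding the ones that do not parse, and computing percentages over what is left. The normalization rule (keep only major.minor) and the sample strings are assumptions made for the example; the actual WPORG processing is not described here.

```python
import re
from collections import Counter

# Assumed rule: a valid version starts with "major.minor".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)")


def normalize(version):
    """Reduce e.g. '5.2.6-1ubuntu' to '5.2'; return None for odd/invalid strings."""
    match = VERSION_RE.match(version.strip())
    return f"{match.group(1)}.{match.group(2)}" if match else None


def summary_percentages(raw_versions):
    """Percentage share of each normalized version among the valid ones only."""
    counts = Counter(v for v in map(normalize, raw_versions) if v is not None)
    total = sum(counts.values())
    return {version: round(100 * n / total, 1) for version, n in counts.items()}


# Hypothetical reported PHP versions for one day.
reported = ["5.2.14", "5.2.6-1ubuntu", "5.3.3", "unknown", ""]
print(summary_percentages(reported))
```

Because each category throws out a different number of invalid strings, each set of percentages has its own denominator, which is why comparing within a category is safer than comparing across them.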

    The content type returned for this data is ‘application/json’, so your browser may or may not display it correctly.

    This is a start; there are more things to be added to this in the future. One obvious item is support for getting numbers for previous days and date ranges. Another would be to add some pretty graphs to WPORG to display this data.

    • Andrew Nacin 7:48 am on September 1, 2010 Permalink | Log in to Reply

      Very cool! I’ll try to build a nice page for dotorg that uses this snapshot data.

      Along with change over time, I think minor versions would potentially be useful. I know it was helpful when we needed to choose a MySQL version to move to, and also when identifying the number of installs actually affected by bugs like #14160. And it’s more data for people to play with.

    • Ben Forchhammer 11:56 am on September 1, 2010 Permalink | Log in to Reply

      Wow, this is great 🙂 Should be very useful when trying to decide which versions to support.

    • Denis 10:28 pm on September 2, 2010 Permalink | Log in to Reply

      Could it be possible to have PHP and MySQL by WP version too? As well as WP and MySQL by PHP, and WP and PHP by MySQL? It seems the latter three would be more interesting.

      • Joseph Scott 4:41 pm on September 3, 2010 Permalink | Log in to Reply

        Certainly the data is there for that to be possible; we’d need to look at the queries involved to make sure it could be done in a reasonable way.

        • Denis 3:54 pm on September 4, 2010 Permalink | Log in to Reply

          I’ve no idea of your schema’s specifics, or whether you keep duplicate records related to each site in your stats, but I was thinking something like this:

          SELECT wp_version, mysql_version, php_version, COUNT(DISTINCT site_key)
          FROM stats
          WHERE stat_date > NOW() - interval '1 day'
          AND wp_version IN ( $valid_versions )
          GROUP BY wp_version, mysql_version, php_version;

          The raw output of the above as /stats/raw/1.0/ would, I think, be the most interesting for plugin devs. It doesn’t necessarily need to be normalized as percentages, either: having the actual number of sites is useful to get an idea of how many users one is potentially targeting exactly.
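To show what that raw output might look like, here is a small self-contained sketch that runs the same kind of grouped query against an in-memory SQLite table. The table layout, sample rows, and function name are all hypothetical, and the `NOW() - interval '1 day'` condition is replaced by an explicit cutoff date parameter because SQLite has no interval syntax.

```python
import sqlite3


def version_combinations(conn, since, valid_versions):
    """Count distinct sites per (WP, MySQL, PHP) combination newer than `since`."""
    placeholders = ",".join("?" for _ in valid_versions)
    query = f"""
        SELECT wp_version, mysql_version, php_version,
               COUNT(DISTINCT site_key) AS sites
        FROM stats
        WHERE stat_date > ?
          AND wp_version IN ({placeholders})
        GROUP BY wp_version, mysql_version, php_version
        ORDER BY wp_version, mysql_version, php_version
    """
    return conn.execute(query, [since, *valid_versions]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats (site_key TEXT, stat_date TEXT, "
    "wp_version TEXT, mysql_version TEXT, php_version TEXT)"
)
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?, ?, ?)",
    [
        ("site-a", "2010-09-04", "3.0", "5.0", "5.2"),
        ("site-a", "2010-09-04", "3.0", "5.0", "5.2"),    # duplicate record, counted once
        ("site-b", "2010-09-04", "3.0", "5.0", "5.2"),
        ("site-c", "2010-09-04", "2.9", "5.1", "5.2"),
        ("site-d", "2010-09-01", "3.0", "5.0", "5.2"),    # older than the cutoff
        ("site-e", "2010-09-04", "bogus", "5.0", "5.2"),  # not in the valid list
    ],
)

rows = version_combinations(conn, "2010-09-03", ["2.9", "3.0"])
for row in rows:
    print(row)
```

The rows come back as actual site counts, not percentages, which matches the suggestion that raw numbers are more useful for judging how many users a plugin is targeting.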

    • Ryan McCue 11:39 am on September 3, 2010 Permalink | Log in to Reply

      Whipped up a quick Google Visualisation of these: http://ryanmccue.info/wp/stats/

    • filosofo 6:05 pm on September 3, 2010 Permalink | Log in to Reply

      Thanks for doing this, Joseph, Otto, and Ryan! A great combination of data and visualization.

    • Sergey Biryukov 6:33 pm on September 12, 2010 Permalink | Log in to Reply

      Is there any chance of making stats for localized versions available too?

    • Mike Schinkel 5:59 pm on July 8, 2011 Permalink | Log in to Reply

      Just found this thanks to Ryan C. Duff (thanks Ryan!) Curious, how is this data collected? From API requests to list plugins and themes?

  • Peter Westwood 9:25 pm on May 12, 2010 Permalink
    Tags: stats   

    Just ran some stats on where WordPress 3.0 development is compared to some of the previous larger releases:

    • Closed 896 tickets so far compared to the previous peak of 786 for 2.8 and only 448 in 2.9
    • Been coding for ~5 months so far compared to 7 months for 2.5 and 9 months for 1.5 (Those were the days!)

  • Matt Mullenweg 5:05 pm on May 6, 2010 Permalink
    Tags: stats   

    Michael Adams ran some Trac stats last week for my keynote at WCSF. I only mentioned one of them, but here are the rest:

    We had 1,440 unique people participate in tickets, attachments, comments, commits, or get props in the past year.

    There were 3,506 new tickets from 1,093 people, and they had 3,979 attachments from 491 people.

    TWENTY SEVEN THOUSAND comments. (And change.) From 1,032 people, or about 27 per person.

    13 people committed 3,176 revisions, over two-fifths of them (1,387) with “props.”

    It was a very good year.
