Subscribe to this blog and receive notifications of new posts by email.
Join 2,294 other subscribers
Suggest Agenda Items for Dec 17th Dev Chat
Eric Marden, Denis de Bernardy, Pulse, and 13 others are discussing. Toggle Comments
Disclosure of api.wordpress.org data transfer/retention policy: consider adding in some conspicuous location (e.g. options-privacy.php) the following disclosures:
1) State that core update notification requires sending WordPress version number to api.wordpress
2) State that plugin update notification requires sending list of plugins with version numbers to api.wordpress
3) State that theme update notification requires sending list of themes with version numbers to api.wordpress
4) State how often api.wordpress is queried, what additional data are sent to api.wordpress, what data are retained by api.wordpress, how long those data are retained, and what is done with those retained data.
What I’d like to see instead, personally, is a page on wp.org that actually lists part or all of the information (minus the urls, for obvious security reasons).
Agreed, the intention was always to expose as much aggregate data as possible. It could also be useful for plugin authors.
Well, in case you haven’t written such a page yet, you’re welcome to reveal the schema used, as well as the existing scripts that take advantage of it. Surely someone will pick it up and write part or all whatever is missing to distribute it in the wild.
The data is retained forever. In addition to every theme and plugin, we’ve developed an algorithm to figure out what blood type you are by the PHP and MySQL version. Plugin data has proved surprisingly useful when paired with spidering your blog posts because by the TV shows you talk about we guess your age and then whether you say “soda” or “pop” tells us your location and bing we have your social security number.
That makes the data super easy and profitable to sell, which we do. Or rather, did. The main customers buying the plugin/SSN information were some insurance company AIG and Lehman Brothers (sp?) who used it to tweak credit tranches but… apparently that didn’t work or something? Anyway congrats to all the Akismet and All-in-one-SEO users who bought a house last year!
The not-so-secret plan (I talked about it at WordCamp SF!) is to use the data to predict which plugins are going to become popular before they become popular so we can replace the core team including myself with a script that just auto-merges plugins into core (and removes whitespace from the end of lines).
Oh, and don’t worry about the organ donor checkbox in 3.0, my liver is probably going to last at least until I’m 28.
I don’t think it’s a good idea to opt-out of any updates. Rather, I would hope that opting out just provides a way to send less data such as blog url and such but still be able to do upgrades.
I hope the other WordPress enthusiast sites don’t pick up on this or there is going to be another negativity ****storm.
I’m beginning to wonder when WordPress.org will hire a PR firm lol.
I’m afraid I have to agree with Jeffro. Although personally I have no problem with either the data sent or data retention (I really don’t care either way) Chip and others have brought up a valid point and responses such as this leave them feeling as if their concerns/opinions are stupid and worthless.
Lol. Amusing response.
It might pay to add a disclaimer saying you still take their suggestion seriously on the end of stuff like that in future as it could come back and bite you in the rear at some point otherwise. Those speaking up about this topic seem to be quite aggressive about it and they may not see the funny side.
Yes, it was meant as a satire to lighten up the discussion a bit, people seem to be getting a bit serious.
While I don’t really care what info you farm, store, and use, responding in this way to people who have a valid concern may provoke them to fork WordPress into something Automattic doesn’t use as a data-farming operation. And perhaps this is the solution: a branch of WordPress without any connection to Automattic, by api call, ownership, direction or input. Would the project survive that? And all this over an issue that has such an easy solution.
This type of spiteful rebuttal lends credibility to those who have a concern for how this data is used. Who does own that data, and the rights to that data? Is WordPress.org equal to Automattic? Does Automattic have the right to access, without permission or reservation, non-published database information from WordPress run blogs too? How about credit-card and social security numbers, and form emails processed by plugins? Are you worried that how you are really using the data is in fact harmful if publicly known? Is making a disclosure a liability for Automattic? Does some other 3rd party pay a lot for that information?
Why provoke people just for emotional gratification? The solution isn’t heinous. Disclose what information, how its gathered, how its used, and why its necessary. It’d be nice to have an option for not participating in non-essential data, but not absolutely required, as long as its disclosed what this info is, why its collected, and how it would be used, etc. Besides, next to no one would opt-out I believe.
Actually Ken, that project wouldn’t survive. Threats of forking isn’t going to change anything and neuters any momentum behind the privacy argument you’re trying to make here.
For those of you who haven’t read it already and don’t agree with Chip’s stance, I suggest reading through the following trawl of posts:
That topic is overly long, but there are some good points raised in there which have solid arguments for and against Chip’s suggestion.
This is paranoia by those worried about it for their own site. But I’m guessing there is probably a legitimate legal concern too since WordPress.org is storing potentially private data without permission (and yes, some of it is private – read through the forum topic for a better explanation).
In my case, there’s no paranoia in regards to my own site. There is concern about people disabling update checks and also concern about possible legal implications if any clients try to make consultants responsible for not disclosing the risk to private information. We can’t even give reassurances because nobody is giving them to us.
Please see my response at http://www.wptavern.com/forum/general-wordpress/1085-wordpress-phone-home-19.html#post9628 and see whether some of that “paranoia” isn’t justified.
Disclosure somewhere is enough to address the issue. Opting out can be achieved by not using WordPress, or finding/making a plugin. An option in the core is a bad/unnecessary thing. As long as the info is provided, and the user makes an informed decision, there is no problem, no controversy.
First, not disclosure “somewhere”, but rather, disclosure *within WordPress* (thus, my suggestion of options-privacy.php).
Second, how can a user make a decision, when that decision (opt-in) is made for the user, with no ability to *change* (opt-out) that decision?
One can “opt-out” of data transmission by “not using WordPress”? Seriously? How asinine.
Or, finding/making a plugin? No. The data transmission is built into WordPress, therefore so must be the mechanism to opt-out (again, I am referring to data transmitted for the purpose of *retention*, not for the purpose of update-checking).
With respect to data *retention*, to say that “an option in the core is a bad/unnecessary thing” demonstrates capricious disregard for the users’ right to control their own data.
Again, I direct you to stopbadware.org guidelines for transmission of user data to a third party.
Why should WordPress be any different?
Oh, I agree, it’s pretty convoluted. I would prefer a more conspicuous location, but I think the implementation is fine. Something along the lines of:
p.s. I’m working on a draft. What should I do with it? Put it up on Trac?
What I’m saying is enabling a feature in the core to disable updates is “asinine”. Whatever data is necessary for updates should NOT be optional, to be enable/disabled as an option in the core. It should be explain what and why this is. My comment was in support of your numbered list (the above one about disclosure, not the below one about opt-out).
Non-essential data should be optional. What I was attempting to say, is that disclosing what info is involved here solves the problem. The problem is gathering info without knowledge, and therefor consent. There is no requirement to be able to turn off data transmission because that is technically optional by use or not use, as long as it is made known that use includes such communication. This observation is not “asinine”, its a reality, not matter how distasteful it seems. Your tone is unappreciated. This is the area that seems related to “PR”; that Automattic, a private company, uses data collected from an open-source community project. I believe it is a valid concern. And yet the primary issue can be solved separately with disclosure.
“First” “Somewhere” does not preclude options-privacy.php, and if you heart is so set on that, then I wish you the best.
“Second”, if informed of what’s involved, users have a choice in whether or not to participate, and that is the concern here. Not using WordPress is a valid option, not “asinine”. Stating otherwise severely weakens what position your argument has. As far as your numbered list, the only optional information is in number 4 (retention). My statement is that it’s merely unnecessary to solve the problem. I’ve said else where I support a choice on participating in optional data, and agree about data retention, but upgrading should not be optional. It’s still irrelevant to the main concern here.
I was primarily responding to the thread this comment is in. The one with “disclosures” as the qualifier for the 1-4 list. Your list below is the opt-out list 1-4 of which I am vehemently opposed to 1, 2 and 3. I can only agree with the 4th position, while I am for 1, 2, 3, and 4 above (again, I’m for disclosure 1-4, and against opt-out for points 1-3, while 4 is fine with me).
So, in summary, opting out of “api.wordpress data retention” would be a desirable option, disclosure in my opinion is all that is absolutely required here.
“Somewhere” could be taken as http://www.wordpress.org, or essentially, anywhere. My point was that the disclosure needs to be made *to the user*, and therefore, *within WordPress*. Whether that’s during installation, or on the Privacy Options page, or wherever is fine, provided that it’s *within WordPress*.
And, yes, I do find the assertion that “if you want to use WordPress, your data *will* be transmitted to and retained by api.wordpress; if you don’t like it, don’t use WordPress” to be asinine. It’s along the same user-unfriendly lines of, “if you don’t like it, fork.”
However, it seems that we’re in agreement on data retention being optional, and I’m not wanting to engage the battle on opting-out of core/theme/plugin updates, having seen the comments here. I already conceded those points. So, I think we’re in agreement on the 4 opt-out points below.
(And for the record, even if an option to opt-out of update checks were put in, I wouldn’t set it. I like update checks.)
I pretty much agree that, for the most part, disclosure will solve the problem.
“So, in summary, opting out of “api.wordpress data retention” would be a desirable option, disclosure in my opinion is all that is absolutely required here.”
If we had those two: full disclosure and a mechanism to opt-out of data retention, I think the entire matter would go away.
Opt-out option for api.wordpress data transfer/retention. On the Privacy Options page (options-privacy):
1) Allow users to opt-out of core updates
2) Allow users to opt-out of plugin updates
3) Allow users to opt-out of theme updates
4) Allow users to opt-out of api.wordpress data retention
I think you’re beating a dead horse. Users should not be allowed to opt out of updates without a plugin. Period. Those that do can and usually do get hacked, just like they deserve.
Re the data that is send, I think you’re misunderstanding how the data is used in the first place. Some plugins and themes have identical names. At least the file, author, and url are needed. Working around this issue would require all themes and plugins stored on wp.org to add a unique identifier. That would be a carnage.
Seriously, the only point that makes any sense in all of this is removing the site url. But even for this, I can see compelling arguments to keep it — in particular if it’s used to weed out spammy plugins used by spammy blogs that then send ping notifications to pingomatic.
Hey, I’m just throwing it out there for consideration.
As for allowing or not allowing opting-out of core update checks in core, I’m open to that argument. That’s why I split the four out into separate points. I don’t buy it quite so much for plugins and themes, however.
On the other hand, an opt-out option for data *retention* (as opposed to sending data needed for update checks) simply *must* be included in core. I find no compelling reason presented thus far, against such an option.
And, if opting-out of data retention were entirely separated from sending data necessary for update checks, I imagine that very few people would have reason not to let WordPress send data for update checks – thus eliminating the security risk associated with disabling update checks.
I see no reason why it “must” be included. The data is stored and used to determine what minimum version of PHP we should support and the URL is used as the unique identifier.
It’s not like it’s your SSN being sent. It’s just a URL and some server info.
It *must* be included because it is the right thing to do. WordPress – just like every other application (be it from Microsoft, Google, Apple, Ubuntu, Sony, Nintendo, or anyone else) must respect users’ right to control of and privacy with respect to their data.
The data being sent by WordPress does not belong to api.wordpress; it belongs to the user. Even short of a legal requirement (which, in some cases, there may be), it is the ethical, respectful thing to do.
“Users should not be allowed to opt out of updates without a plugin. Period.”
@Chip: Plugins and themes are just as capable of introducing security risks that need to be closed with updates.
Again, I am most concerned about users’ right to opt-out of data *retention*.
I am fundamentally opposed to not having opt-out options for all of the above (because, as the owner of the application, the user should have this control), but I’m not wanting to engage in that fight. I suggested them for discussion, but can see where that will go.
So, conceding the update-check opt-out option in core, what of the data-retention opt-out option in core?
@Jane – +1 BUT private, (ie. custom code that is not released to the public or not hosted by wordpress.org) plugin and theme data is being collected. This is nobody’s business except the site owners and should not be reported back to WordPress.
At the moment, if anyone wants to keep private plugin and theme data private, or if they don’t want their URL associated with the data being collected the only option is to disable update checks completely. If privacy concerns are resulting in deactivating a core feature and WordPress wants users to keep update checks then perhaps it is wise to remove the obstacle.
Giving users a reason, regardless of whether anyone else feels its a valid reason or not, to disable update checks defeats the purpose of providing them.
I agree that users should not be able to opt out of update checks without a plugin. But they should be able to control what information is sent and restrict the sending of personal information if they wish or if laws in their own countries require this.
If WordPress is all about freedom then it should also be about users having freedom to control their own personal information.
That’s UI bloat for sure.
I think a constant in wp-config.php would be much better. Can’t be changed by a hacker who compromises the login and allows blocking before the site is even installed. Then again I’ve never understood these paranoid people who want this stuff blocked.
Is it “UI bloat” for every other application in existence, that must disclose to users when data are sent to a third-party server, and that must provide an opt-out mechanism?
Why is WordPress exempt?
And every time I hear someone call people with legitimate concerns “paranoid”, I am going to assume that the speaker is acting arrogant and condescending.
There is no need to start throwing around insults. Please remember that your opinion is just that — an opinion. Someone is not wrong just because they disagree with you. I for example don’t consider it a legitimate concern.
Anyway, I’m stepping out of this conversation.
And incessantly referring to people who express concerns as “paranoid” is no less insulting than calling those who dismiss those concerns “arrogant” or “condescending”. It’s not directed specifically at your comment. I’ve seen that term thrown around here, and on wp-hackers, and at WPTavern.
Mutual respect gos a lot farther.
To clarify, Chip, I think any and all privacy concerns are completely valid to the person making them. I’m passionate about privacy myself, for example I Truecrypt encrypt all my drives.
Fortunately WordPress is completely open source and you can read every line of code that sends every bit of information, everywhere. (Not possible with most applications!) You can modify the code directly, or use or create a plugin to modify or prevent anything from being sent.
If you consider your blog URL to be sensitive information, which is exceedingly rare but certainly conceivable, but still want updates I would recommend hashing the URL before sending it. (And also disabling Ping-O-Matic updates.)
If you consider server IP to be sensitive information, which is exceedingly rare but certainly conceivable, I would recommend either disabling updates with a plugin or by modifying core per above or putting all external requests through some sort of proxy which can be done at the server level or perhaps through extending WP’s HTTP class.
For the truly cautious, many of these precautions are best handled at the server level, for example using an iptables firewall to block all outgoing connections from a server or anything on it, including WordPress.
We’re lucky that WordPress is software you can download and configure yourself because it gives you 100% control over every aspect of its operation, and you can run it under any environment you can imagine.
Matt, I greatly appreciate that WordPress is open source, and that everything “under the hood” is available.
That said, my own PHP skills are marginal at best – and, I would imagine, far above that of the average user. I do not believe it is sufficient to expect an average user to figure out what data are sent to api.wordpress only by looking through the PHP files.
Further, knowing what is *sent* to api.wordpress in no way provides insight into what is *retained* by api.wordpress.
And, again, the nature or sensitivity of the data being sent is irrelevant. What is relevant is that the *user* owns the data, and therefore WordPress must still respect that ownership. I have linked (several times) the StopBadware.org guidelines for applications that transmit data to third parties – guidelines that require up-front disclosure, and an option to opt-out. This really isn’t about paranoia; it is about adopting industry standards for regard of user data.
(Thinking farther ahead, it is also about building trust and goodwill from users, so that you might in the future be able to have users opt-in to having even more data collected, that would prove to be even more useful.)
I honestly believe that almost everyone would be placated with up-front disclosure of the data transmission (both what is sent, and what is retained), and the ability to opt-out of data *retention*.
As for the other examples you have given (e.g. Ping-O-Matic): a) the user has complete control over sending data to POM, and b) POM doesn’t *retain* those data (AFAIK).
Besides, it is not the URL *alone* that would normally be considered sensitive or especially valuable, but rather the cross-reference of other, far more sensitive/valuable data that can be tied together by the URL.
I don’t particularly understand the push-back on the issue of disclosure. It would be incredibly simple. I’ll even offer to write something up (though I’m sure it would need to be quality-checked, as I’m sure I wouldn’t get everything right wrt the data retention, and it would probably need legal review). Regardless: we write something up, drop it on options-privacy.php (or wherever is most appropriate), and it’s done. Maybe point the user to the location at the end of installation. But that’s it. That’s all that’s needed.
I do understand, though, that it would require some work to separate update check data from historical/aggregated data, since much of the data are the same, and would have to be sent differently if the user is allowed to opt out of data retention.
I still think the goodwill/PR gains will far outweigh whatever developer time is required to make such a change.
Please don’t get me wrong. I LOVE WordPress. I want to be able to praise WordPress as being a leader on the issue of respecting user data, while also being able to use those data for the betterment of both WordPress and the WordPress community.
The argument that WordPress users concerned about privacy should seek out a plug-in is flawed in my opinion. That makes the assumption WordPress users are aware that private information is being sent in the first place which is unlikely the case. Second, in order to ensure privacy on future versions of WordPress, the plug-in would likely have to be maintained to ensure compatibility, would it not? If a compatible plug-in was not available say after a significant upgrade then privacy is once again not assured.
A notification screen viewed at time of WP install and a privacy document with the download package would be a good start.
People concerned about privacy are just fine today — there is nothing we send or do that compromises anyone’s privacy. The information we send is public and regardless we don’t publish anything potentially personally identifiable, like URL, we only publish and use the numbers in aggregate for purposes of improving the software and user experience. This is like how browsers send your IP address with every request, it’s appropriate for 99.99% of the audience and if you’re a political dissident or government employee where that is considered sensitive then you should use a proxy or anonymizer plugin.
[...] to see where it goes, I added two suggestions for the upcoming Dev Chat: [...]
[...] the upcoming dev chat, requesting disclosure of the api.wordpress data collection/retention policy, this was Matt's response: [...]
Thank you Ryan.
Please discuss contacting the Software Freedom Law Center for advice (which is free) on what data is legal to collect, how, whether opt-in is necessary given the cross-jurisdictional nature of WordPress distribution, the nature and length of time for retention, whether data matching is permitted, and the type of disclosure that needs to be made.
Discussion about these issues is now impossible. It was impossible in 2007 and every privacy concern that has been raised in the past two years has met with similar responses. The only practical way to put this issue to bed is for the team to obtain legal advice and act on it. The SFLC is free to projects like WordPress.
Agreed with Elpie on this.
This entire argument comes down to what you feel is private information and what you feel is not. As an IT guy, I don’t see any of the information that they send as “private”. IP addresses are public domain, so no point in hiding those.
Plus, as far as users opting out, why should the option even be allowed? I can’t opt out of the Federal government accessing my records of telephone calls, e-mail conversations, or anything to that regard? I can’t opt-out of Verizon Wireless knowing my location at all times? I can’t opt out of publix tracking my spending habits to know when to stock things?
These are realities one must realize in this digital age, and in all honesty, if you don’t like it, Don’t use WordPress. That’s not asinine to suggest, considering that if you don’t like it, find another cms that doesn’t.
Other ideas that might be worth discussing while on the topic of call home:
Yet another one on the same topic:
The update that sped up the opacity fade in/out on alerts has made it into wordpress.com. Wow, is it annoying. It’s so fast it’s like a flicker and I’m almost worried it will induce a seizure. Okay, that was hyperbolic, but it *is* annoying and eye-catching, feeling very frenetic in comparison to the overall feel of the WordPress admin otherwise. I’d vote for getting rid of the fade in/out. The placement and color differentiation (which is visible even with color blindness) act as ‘this is an alert’ visual cue. I would like us to add an icon or something, maybe, to bring it home, but would suggest waiting until 3.0 for that so we have time to work out proper iconography (an error message should get a different graphic than a simple confirmation message, for example).
This one seems closed, so no point in discussing it.
← Summary of Dec 10th Dev Chat
2.9 Release Video Feedback →