The Code Review Flaw in the Workflow

After intensive patch testing and reviewing 200+ tickets in the past few months from the Test Team perspective, I’ve found there is a gap between Testing and Committer delivery, and I’ve also found a possible solution. Here is the story:

Context

Let’s put a bug as an example for the current simplified flowFlow Flow is the path of screens and interactions taken to accomplish a task. It’s an experience vector. Flow is also a feeling. It’s being unselfconscious and in the zone. Flow is what happens when difficulties are removed and you are freed to pursue an activity without forming intentions. You just do it.-chart:

Current Working Workflow

Currently, there is a keyword dev-feedback that could be assigned after “Patch Testing Report”. Generally, dev-feedback is a great keyword to “pingPing The act of sending a very small amount of data to an end point. Ping is used in computer science to illicit a response from a target server to test it’s connection. Ping is also a term used by Slack users to @ someone or send them a direct message (DM). Users might say something along the lines of “Ping me when the meeting starts.”” any committer that the report is ready for review and potentially “Ready to commit”.

But with this strategy, there is the following serious problem, that renders most Patch Testing Reports almost useless:

  • Most Patch Testers are not code reviewers. They only get some patch reproduction instructions, and they simply test the patch; they might not review the patch.
  • Most patches are not always the suitable solution, and they often need plenty of changes (if not a complete rework). For this reason, most committers simply ignore testing unless they know that the report comes from “trusted sources”. Trusted sources are members who have a reputation for good code reviewing.

For this reason, ultimately this establishes a full trust dynamic, rendering most Patch Testing Reports useless.

Moreover, there is another secondary problem: Reputed users cannot simply “Test Patches”: They should test and review the code to maintain a reputation.

Why simple Patch Testing (without Code Reviewing) is very useful?

Patch testing is just a negative-conditional process. This means that the purpose of Patch Testing is simply finding faulty patches and asking for changes (or refreshes). It helps triage and helps to filterFilter Filters are one of the two types of Hooks https://codex.wordpress.org/Plugin_API/Hooks. They provide a way for functions to modify data of other functions. They are the counterpart to Actions. Unlike Actions, filters are meant to work in an isolated manner, and should never have side effects such as affecting global variables and output. down the queue of patches. Simple patch testing can be done efficiently and even by non-technical users.

Contrarily, Code Reviewing is very complex, and some degree of seniority is required.

After a negative Testing Report, new Action WF Keywords can be introduced like: changes-requested, or needs-refresh.

After a positive Testing Report as it’s currently designed, no Action WF Keyword could be introduced except for dev-feedback.

The problem

The concern is that currently there is no clear way for pinging “CoreCore Core is the set of software required to run WordPress. The Core Development Team builds WordPress. Developers” after a code-review apart from the dev-feedback keyword. In the handbook currently, it says:

A response is wanted from a core developer or trusted members of the development community. For example, use this keyword when double sign-off is required to backport changes during RCRelease Candidate A beta version of software with the potential to be a final product, which is ready to release unless significant bugs emerge.

Once a patch has been code-reviewed, the only way to proceed here is to use dev-feedback as a way to ping Committers like: “I attest that this report is ready to move forward”. But dev-feedback was not even meant for that. In fact, there is no specific keyword meant for this.

The current practice after code reviewing is not pinging at all or directly pinging your trusted committer. 

But if you don’t have a trusted committer, and you would like that your code-review work doesn’t fade into oblivion, using dev-feedback this way (as it currently is), can potentially generate a “Trust-based dynamic” as commented in the beginning.

This will lead into 3 scenarios:

  1. Non-technical Patch tester: Adds dev-feedback. The committer comes, sees that the patch tester is non-technical, and enters in the mood of “Time to review everything from scratch, I cannot trust anything of what I see”.
  2. Technical Patch Tester: Add dev-feedback after a full Code Review. The committer comes with confidence and gives a second peer review to the patch. This patch can be committed with ease. Perfect scenario.
  3. Technical Patch Tester: Add dev-feedback without a full Code Reviewer. The committer comes with some expectations (after all, it was written by a trusted reviewer), to discover that the patch is utterly useless. This will lower the reputation of the Technical Patch Tester because the Committer was expecting that a review should have been done.

This often leads committers to approach tickets with modest expectations, as they’re responsible for triaging these 3 scenarios, and this task can be quite demanding. As a result, many choose to start working from scratch to ensure accuracy and clarity. This behaviour is well understood, as it’s not uncommon to see committers echoing observations already noted in earlier reports. For example, comments like: “I have observed that this issue is happening because…” when there is a comment by another user saying exactly the same. This is an obvious symptom that some committers can ignore Test Reports because they know they have been historically untrustworthy.

The solution

There are two solutions:

  1. The feeble: Commenting within the report that no review has been done. This can easily be overlooked if there is a lot of information within a report (lowering trust) plus this cannot be filtered in report listings.
  2. The ideal: Creating a new Workflow Keyword called: needs-code-review.
Ideal Scenario for a Workflow

Following the 3 previous scenarios:

  1. Non-technical Patch Tester: After testing, the needs-code-review will be set. Any Code reviewer will be able to filter this keyword and improve the quality of the report for the Committer.
  2. Technical Patch Tester who does a full Code Review: They can directly jump into dev-feedback. Nothing changes here.
  3. Technical Patch Tester who doesn’t want to do a full Code Review: Similar to the first scenario, needs-code-review will be set. Code review can be done afterwards, or leave the code review for another reviewer.

Advantages

  1. Committers can prioritize, focusing on dev-feedback reports. During a bug-scrub, for example, it will be wiser to pick those first, and then continue with the rest. This will accelerate patch processing and reduce the burden of Committers of having to fully commit to reports from scratch.
  2. Code Reviewers can prioritize needs-code-review tickets, to help committers.
  3. This ensures double sign-offs on many tickets because non-committers can assist in the code review process, with this keyword serving as a useful entry point for reports.
  4. Code Reviewing could be promoted as a very interesting activity during Contribution days for Core Team.
  5. Committing and Patch Testing can be a completely detached process, as it should be (for the reasons stated above).
  6. This approach can make it easier to promote and provide training on patch testing for non-technical users, allowing them to participate without needing to worry about code quality or best practices. Eventually, this could help facilitate batch patch testing more effectively.
  7. Last but not least, Technical Patch Testers will not always be expected to perform a full code review. While this might seem like a minor benefit, it can significantly help the Core Test team by streamlining the triaging process.

Conclusion

The conclusion is that dev-feedback workflow keyword will be consistently used for what it was meant for:

  1. Real Trust-based keyword for helping push commits (main use)
  2. Backporting purposes like the commit double sign-offs during RC
  3. Sometimes after a fixed-* backport (when backporter is the same as the committer)
  4. Return to pinging the dev for administrative purposes

End note: This is part of the series “Improving the Keyword Workflow” which will be published soon. If you are willing to join the review, please contact me in #core-test or #core at SlackSlack Slack is a Collaborative Group Chat Platform https://slack.com/. The WordPress community has its own Slack Channel at https://make.wordpress.org/chat/. @ SirLouen

Props to @mosescursor, @jbaudras, @oglekler and @krupajnanda for helping review this article and offering feedback.

#core-meeting, #core-test

Test Team meetings schedule – poll

Test Team is meeting twice a week:

  • Tuesdays 13:00 UTC bi-weekly for Triage Sessions
  • Tuesdays 13:00 UTC bi-weekly for Team meetings
  • Fridays 13:15 UTC for Test Scrubs

This schedule was discussed while ago, when the team wasn’t crystalized as it’s right now. This is why it’s a good time to ask members of this Team whether the time of meetings is okay, or shall we reschedule it.

Please post in the comments your opinion about that, we’ll be collecting votes until 30th of August and then make a decision.

Thank you for reading!

#build-test-tools, #core-meeting

Gutenberg Usability Testing Plan – Feedback Needed

We are getting ready to roll out a new round of usability tests for GutenbergGutenberg The Gutenberg project is the new Editor Interface for WordPress. The editor improves the process and experience of creating new content, making writing rich content much simpler. It uses ‘blocks’ to add richness rather than shortcodes, custom HTML etc. https://wordpress.org/gutenberg/ mid-November. In this next round, we will focus on testing writing flowFlow Flow is the path of screens and interactions taken to accomplish a task. It’s an experience vector. Flow is also a feeling. It’s being unselfconscious and in the zone. Flow is what happens when difficulties are removed and you are freed to pursue an activity without forming intentions. You just do it.. We are also very keen to widen the net for participant feedback, including testing with participants who are not current WordPress users.

In order to make this happen we have drafted a usability testing plan – and we need your feedback and suggestions (in the comments below) before the next meeting in #core-flow on Thursday 9 Nov at 18:00 UTC. The next steps will be to test the user test on a small sample set (week of Nov 10th), refine at the #core-meeting on  Tuesday 14 Nov 17:00 UTC, and roll out to a wider audience starting from 15th Nov.

We will also have a usability testing section at WCUS, so if you are attending please drop by the and join in!

Proposed Usability Tests

To test the flow of writing, we propose to construct a three part usability test:

  1. General demographics:  including prior WordPress experience, age and device used. This information will help us to segment findings
  2. The main task: participants will be asked to re-create the post shown in an image. There will be three images to select from, mapping roughly to a beginner, intermediate and advanced level
  3. Follow up questions: a few questions about the experience of re-creating the post

Participants will be optionally invited to upload their screen recording, and answer a few questions about their video footage.

Testing Script

Question 1: Do you currently use WordPress?

  • Yes
  • No

Question 2: Would you describe yourself mostly as a…

  • Developer
  • Designer
  • Blogger
  • Business Owner
  • Other: ________

Question 3 (optional): How old are you?

  • Under 18
  • 18 – 30
  • 31 – 40
  • 41 – 50
  • 50 – 60
  • Over 60

Question 4: What device are you using?

  • Mobile phone
  • Tablet
  • Laptop
  • Desktop

Question 5: Let’s get set up!

Check that you have the following items ready as you will need them to complete the task

  • Open Gutenberg editor in a new browser window
  • Ensure that you have the Twenty Seventeen theme selected
  • Open the task image in a new window [ beginner | intermediate | advanced ]
  • Start a screen recording
  • Remember to talk out loud as you complete the task

Your task is to re-create the page that you see in the image using the Gutenberg editor. Remember to start your screen recorder, and talk out loud as you complete the task! When you are finished, continue on to answer a few questions about your editing experience.

Question 6:  Did the task take long or shorter than you expected?

  • It took longer than I expected
  • It took about the amount of time that I expected
  • It took less time than I expected

Question 7: Can you tell us why?

Question 8: Was the task easier or harder than you expected?

  • It was harder than I expected
  • It was about what I expected
  • It was easier than I expected

Question 9: Can you tell us why?

Question 10: Are you more or less likely to use the Gutenberg editor in the future?

  • I am not likely to use Gutenberg in the future
  • I am unsure
  • I am likely to use Gutenberg in the future

Optional section: screen recording analysis

It would help us out a lot if you could upload your screen recording and answer a few questions about your recording

Question 11:  Save your screen recording, and upload your file here…

Question 12: How long did it take to complete the task?

Question 13: Was the title added correctly?

  • Yes
  • No

Question 14: Was the quote added correctly?

  • Yes
  • No

Question 15: Was the image added correctly?

  • Yes
  • No

Question 16: Where were the main sources of friction?

Thank you so much for your help!

If you would like to be kept in the loopLoop The Loop is PHP code used by WordPress to display posts. Using The Loop, WordPress processes each post to be displayed on the current page, and formats it according to how it matches specified criteria within The Loop tags. Any HTML or PHP code in the Loop will be processed on each post. https://codex.wordpress.org/The_Loop. with the progress of Gutenberg testing, please leave us your email below and we will add you to the Make.WordPress user testing mailing list

Test Setup

The test can be completed by a participant, or used as a run sheet for an observational research session.

In order to complete the test, participants will need to:

  1. Get their hands on a device (laptop, tablet, desktop or mobile device)
  2. Ensure that they know how to do a screen recording on the device
  3. Load up the user test
  4. Follow the instructions in the test
  5. Upload the screen recording to the cloud
  6. Optionally, code the results from the screen recording
  7. Optionally, write up a blog post and tell us what you found

Reporting usability test results

There are three ways in which you can report back your user test results:

  1. You can simply answer all the questions in the test instructions. Remember to upload your screen recording at the end
  2. You can optionally analyse your screen recording footage by answering the optional questions in the final section of the test instructions
  3. In addition, you are welcome to write up your test results in a blog post

Get Involved

Have an idea on how to improve the usability testing plan? Have your say in the comments below before the next meeting in #core-flow on Thursday 9 Nov at 18:00 UTC. Once we have collected all feedback we will post a link to the test script and open the call for user testing!

 

#gutenberg, #usability-testing