
[Engineering Workflow] Remove ineffective process that is not supported by data

Michael Becker requested to merge wandering_person-master-patch-e288 into master

Why is this change being made?

Several groups at GitLab are introducing prescriptive workflow::verification processes into their engineering workflows. Examples include group::threat insights and group::security policies (handbook page). group::compliance has a similar process, and the process is starting to be rolled out to more engineering groups.

There are two issues I would like to address with this MR. One is with this process in particular, the other is with the meta-process on how prescriptive processes are introduced into the GitLab engineering workflow.

This specific workflow::verification process

  1. This process was introduced without any data justifying it
  2. This process was introduced without any trackable success metric to determine whether it is having the desired effect
    • When this happens, it is a strong signal a process is "process for process-sake"
  3. I independently asked our in-house engineering analytics team whether they could retrospectively (as, again, no success metric was included as part of this process) determine if this process has had any of its purported effects. The results show "insufficient evidence from the data to confirm the verification requirement has had the expected impact."
    • Another strong signal this is "process for process-sake"

The meta-issue of introducing prescriptive process

gitlab-com/content-sites/handbook!1478 (merged)

This process is an example of "Results Oriented Thinking". To quote this talk from RubyConf:

It means you are measuring the soundness of a decision based on its outcomes, instead of based on the information available when the decision was made. It stems from an illusion of control. It's the idea that the choices we make have a one-to-one mapping to consequences, instead of a probabilistic mapping. It tries to extrapolate lessons from a sample of too few trials. [...] Results Oriented Thinking will lead to a mis-allocation of resources.

Shoving more and more process into the workflow::verification stage is a bad response to incidents. It makes us feel like we are being proactive and reducing future risk, while it is actually making us overly risk-averse and precluding better solutions to incidents.

CREDIT values for this process

  • Collaboration:
    • This is a mixed result. Involving more people in workflow::verification seems inherently more collaborative.
    • However, how the process was introduced violates this value. In the absence of data justifying the process or a metric to gauge effectiveness, the driving force behind this process is "rank pulling"
  • Results:
    • It does not agree in writing on measurable goals
    • Effects cannot be measured and are actively not being measured
    • Removes agency from engineers. Not only is it removing agency from engineers to do engineering, it is management-driven "engineering"
    • Removes ownership from the DRI
    • It discourages a bias for action
    • It does not accept uncertainty and seems to be of the philosophy that uncertainty can be removed with enough process
  • Efficiency:
    • Again, this process is being churned on and enforced with no metric measuring efficiency
    • The retrospective data indicates this is "process for process-sake" and a waste of time
    • "Most companies regress to the mean and slow down over time." This process encourages that
    • The Boring solution here would be to leave the workflow::verification process up to the DRI. If they think they need another set of eyes on a change, they can request a set of eyes. If not, they can mark the issue as ~verified by author and accept the responsibility that entails
    • This process is not self-service and discourages self-learning
    • It is not respectful of anyone's time
    • It obviously violates the value of freedom and responsibility over rigidity
    • It encourages a culture of shifting blame rather than accepting mistakes
    • "Not every problem should lead to a new process to prevent them. Additional processes make all actions more inefficient"
    • This process encourages moving slowly and bundling changes, rather than moving fast and shipping the minimal viable change
  • Diversity, Inclusion & Belonging:
    • This process has a bias towards synchronous communication as the second engineer often needs context from the DRI
    • This process does not "Seek diverse perspectives" as it is up to the DRI to write the verification steps, and a second engineer to simply follow them.
    • Requiring everyone to work in the same workflow, especially when that workflow is not supported by any data or metrics, caters to one personal preference over others.
  • Iteration:
    • "People that join GitLab all say they already practice iteration. But this is the value that is the hardest to understand and adopt."
    • As every issue requires verification steps, this encourages larger "complete" MRs rather than broken down work
    • Involves a lot of waiting
    • Encourages sign-off over cleanup, the opposite of this value
    • Increases cycle time
    • Encourages bundling
    • We should be making changes that are "two-way door decisions". This process treats all changes as a risky one-way door
    • The process itself is frequently changing. I would argue this is an example of "churn" rather than "iteration"; however, that is a larger topic, perhaps to be discussed in a future handbook MR
  • Transparency:
    • The process was added without an MR or Issue discussing it
    • Changes to the process, which governs every engineer in these groups, continue to be made in a reactive manner amongst a small group (recent example)
    • The imbalance of effort, data, and discussion needed to remove this process vs add it is an indicator of non-transparent decision making

Related MRs/commits

  1. process added (cannot find related MR/issue):
  2. Add PTO guidance to Verification section
  3. Add verification triage policy link & new example
  4. Consider using Assignee in addition to Author for Staging verification report issue
  5. Add link to verification steps in implementation issue template
  6. Remove note about MR authors caveat for verification
  7. Threat Insights: Update verification steps for clarity
  8. Add workflow complete to Govern > Verification steps
  9. Clarify verification workflow for Security Policies and Threat Insights planning

Author and Reviewer Checklist

Please verify the checklist and ensure all items are ticked off before the MR is merged.

  • Provided a concise title for this Merge Request (MR)
  • Added a description to this MR explaining the reasons for the proposed change, per say why, not just what
  • Assign reviewers for this MR to the correct Directly Responsible Individual/s (DRI)
    • If the DRI for the page/s being updated isn’t immediately clear, then assign it to one of the people listed in the Maintained by section on the page being edited
    • If your manager does not have merge rights, please ask someone to merge it AFTER it has been approved by your manager in #mr-buddies
    • The when to get approval handbook section explains the workflow in more detail
  • For transparency, share this MR with the audience that will be impacted.
    • Team: For changes that affect your direct team, share in your group Slack channel
    • Department: If the update affects your department, share the MR in your department Slack channel
    • Company: If the update affects all (or the majority of) GitLab team members, post an update in #whats-happening-at-gitlab linking to this MR

Edited by Michael Becker
