KR: Make the SUS more targeted and less of a lagging indicator through improved methodology => 100%, 7/7

As the next step in evolving our SUS measurement process, some changes need to occur to target the right respondents, have a quicker read on the impact of the usability fixes in place, and to sustainably measure the SUS OKR. This KR outlines the details of those changes.

The new SUS handbook page

As a result of this OKR, a handbook page specific to SUS was created, which outlines the details of the approach along with its justification. That handbook page can be found here: 👉 https://about.gitlab.com/handbook/engineering/ux/performance-indicators/system-usability-scale/

Targeting the right respondents

Historically, when we surveyed for SUS, we surveyed the First Look panel. While we value our First Look panelists, we know very little about them ahead of time and cannot screen by certain criteria (e.g., plan type, usage traits). Instead, we asked respondents to self-report, which is an inefficient use of time and raises accuracy concerns. Going forward, we're going to use the Data Warehouse for the majority of the sampling, while still leveraging the First Look panel for self-managed users only.

A quicker read on the impact of the usability fixes

By sampling more from our SaaS users, we believe we can see the impact of our changes more quickly than we could by sampling self-managed users. The reasoning: SaaS users see the usability fixes immediately. We cannot say the same for self-managed users, and we have limited understanding of which version they are running and whether they are experiencing the usability fixes.

Continue OKR quarterly SUS measuring

We have ~5 mil Self-managed users vs. ~1 mil SaaS users. Because of that, we still think it's important to continue to sample from both the SaaS population (quarterly) and the self-managed population (every other quarter), in the context of our SUS OKR measure. Eventually, if we continue to see little difference between the two populations, we may decide to abandon this approach and sample only SaaS users.

Screening criteria & details for GUS:

Regular participant criteria

SaaS user: SaaS users start experiencing product improvements as soon as we ship them. By focusing on them, we can gather sentiment as quickly as possible. Also, we currently do not have a method for contacting Self-Managed end users for research purposes beyond those that have volunteered.
Recently active: We use a minimum threshold of 10 product events across at least 2 stages in the previous 30 days. An ‘event’ is an indicator that users are doing something in a certain area of GitLab. This approach has two goals: we’re targeting people who have used multiple stages, and eliminating people with limited exposure to our features and the usability of our experience. It also ensures respondents have recently used GitLab and have a higher likelihood of experiencing recent improvements.
N = 200 for each cohort: All regular cohorts will have a minimum sample size of 200 users. This allows us to calculate a score with a high degree of confidence.
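
The "recently active" threshold above can be sketched as a simple filter. This is only an illustration, not the actual Data Warehouse query: the `Event` record, its field names, and the choice to treat day 30 as inside the window are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_EVENTS = 10   # minimum product events in the window (from the criteria above)
MIN_STAGES = 2    # minimum distinct stages touched (from the criteria above)
WINDOW_DAYS = 30  # look-back window in days (from the criteria above)


@dataclass
class Event:
    """One product event. The field names are hypothetical."""
    user_id: int
    stage: str
    days_ago: int  # days between the event and the sampling date


def recently_active(events):
    """Return the ids of users who meet the 'recently active' threshold."""
    counts = defaultdict(int)
    stages = defaultdict(set)
    for e in events:
        # Assumption: the window is inclusive of day 30.
        if 0 <= e.days_ago <= WINDOW_DAYS:
            counts[e.user_id] += 1
            stages[e.user_id].add(e.stage)
    return {
        user
        for user, n in counts.items()
        if n >= MIN_EVENTS and len(stages[user]) >= MIN_STAGES
    }
```

A user with 12 events in a single stage is excluded, as is a user whose events mostly fall outside the 30-day window.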

Regular cohorts

We have defined the following cohorts that we will track over time:

Paid users: Users that are associated with a paid subscription (whether that subscription was purchased or gifted by GitLab)
Free users: Users that are not currently associated with any paid subscriptions and are using the Free plan.
Mature users: Users that have a tenure of 180 days or more.
New users: Users with a tenure of less than 180 days.

Each of these cohorts will have a quota of 200 responses, and we will calculate individual SUS scores for each of them. Note that these cohorts will overlap, so we won't necessarily be gathering 200 unique responses for each one. For example, a mature free user would be considered part of both the Free user cohort and the Mature user cohort, and would be counted for both of those quotas.

Paid users that are targeted as part of that cohort will be included in their respective Mature or New user cohort, but we only explicitly target Free users to fulfill our quotas for the Mature and New cohorts. This is to ensure our collection method is sustainable. Explicitly targeting paid users for these cohorts runs the risk of oversampling them, which could lead to negative sentiment. Since we have far more Free users, we can contact more of them without having to worry about oversampling.
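
The cohort overlap can be illustrated with a short sketch. The function names and the `(is_paid, tenure_days)` input shape are hypothetical; only the 180-day tenure cut-off and the four cohort names come from the definitions above.

```python
from collections import Counter

MATURE_TENURE_DAYS = 180  # tenure cut-off from the cohort definitions above


def cohorts_for(is_paid, tenure_days):
    """Return the two cohorts a respondent counts toward (plan and tenure)."""
    plan = "Paid" if is_paid else "Free"
    tenure = "Mature" if tenure_days >= MATURE_TENURE_DAYS else "New"
    return {plan, tenure}


def count_toward_quotas(respondents):
    """Tally responses per cohort; each respondent is counted twice."""
    counts = Counter()
    for is_paid, tenure_days in respondents:
        counts.update(cohorts_for(is_paid, tenure_days))
    return counts
```

Because every respondent lands in exactly one plan cohort and one tenure cohort, the per-cohort counts always sum to twice the number of unique respondents.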

Self-Managed cohort

In order to understand how the experiences of our Self-Managed users compare to those of our SaaS users, we conduct a limited SUS measurement of Self-Managed users every other quarter. Due to sample size concerns, this cohort will be smaller than our regular cohorts. The majority of these users are recruited via our First Look user panel, but we can recruit using other means if necessary to fulfill our sample size requirement.

The Self-Managed cohort has the following criteria:

Self-Managed user: Users self-report that they are a user of a self-managed instance of GitLab.
Recently active: Users self-report that they have been active on a self-managed instance in the last 30 days.
N = 100: Given that the majority of these users will be recruited from First Look, we want to lower our sample size so as not to exhaust Self-Managed users in the panel. This cohort will have a higher margin of error compared to our regular cohorts.
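
To give a rough sense of how much wider the margin of error is at N = 100 than at N = 200, here is a minimal sketch using the normal-approximation formula for the margin of error of a mean. The standard deviation of 20 is a placeholder, not a measured value, and the sketch assumes both cohorts have the same spread.

```python
import math


def margin_of_error(std_dev, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a mean.

    z=1.96 corresponds to a 95% confidence level.
    """
    return z * std_dev / math.sqrt(n)


ASSUMED_STD_DEV = 20.0  # hypothetical; replace with the observed value

regular = margin_of_error(ASSUMED_STD_DEV, 200)       # regular cohorts
self_managed = margin_of_error(ASSUMED_STD_DEV, 100)  # Self-Managed cohort

# The ratio depends only on the sample sizes: sqrt(200 / 100), about 1.41.
ratio = self_managed / regular
```

Whatever the true standard deviation turns out to be, halving the sample size widens the margin of error by a factor of about 1.41 under these assumptions.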

Survey cadence

To achieve a regular cadence of responses throughout a quarter, we aim to send email distributions every two weeks, starting at the beginning of the quarter, until we reach our desired sample size.
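
As an illustration of this cadence, the sketch below lists candidate send dates for a quarter. The function name and the quarter boundaries in the usage line are hypothetical, and in practice sending would stop early once the sample size is reached.

```python
from datetime import date, timedelta


def send_dates(quarter_start, quarter_end, interval_days=14):
    """List candidate send dates, every `interval_days`, within a quarter."""
    dates = []
    current = quarter_start
    while current <= quarter_end:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates


# Hypothetical quarter boundaries, used only for illustration.
schedule = send_dates(date(2021, 2, 1), date(2021, 4, 30))
```

For these example boundaries the schedule contains seven dates, from 2021-02-01 through 2021-04-26.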

Calculation process

A transparent and formalized calculation process will be developed in the form of a Google Sheet LINK HERE. The purpose of this is to be more transparent with the data and to maintain a high degree of accuracy when calculating the scores.
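
For reference, the standard SUS scoring procedure can be sketched as follows. This assumes the survey uses the standard ten-item questionnaire with 1 to 5 responses and alternating positively and negatively worded items; it is not a copy of the Google Sheet's formulas, and the function names are made up for the example.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1 to 5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each response must be between 1 and 5")
        # Odd items are positively worded: contribution is answer - 1.
        # Even items are negatively worded: contribution is 5 - answer.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


def cohort_sus(all_responses):
    """Average the individual scores to get a cohort-level SUS score."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)
```

A respondent who answers 3 to every item scores 50, while the most favorable answer pattern (5 on odd items, 1 on even items) scores 100.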

Checklist

To get to the above, we need to land on the following:

Define the key aspects we want to target (ex: usage type, % plan types, etc)
Determine the exact cadences for when the surveys are launched
Create a formalized calculation process that's more transparent
Determine, and justify, the % free vs. paid users to include in the surveys
Obtain sign-off on the above
Clearly document and communicate the shifts
- Rename Perception of system usability KPI to "System usability scale (SUS) score"
- Create a new (temporary) KPI for "GitLab usability scale (GUS) score" that we will retire when we are confident that this new methodology is sound
Pilot the GUS

Edited Jan 20, 2021 by Adam Smolinski