Frictionless Runners
Problem Statement
Some prospective and current customers have reported that GitLab Runners are too difficult to use. In at least one case, Runners created so much friction in the customer's infrastructure that the customer chose not to use GitLab for CI/CD.
Customer quote: “Runners are currently our biggest problem with GitLab and almost certainly the one thing that will drive us to a different CI/CD platform within the next 12 months. The shared runners are not suitable for the requirements of a medium-large business and setting up private runners is a huge ball ache that we didn't expect and simply don't need.”
Goals:
- Determine the core issues with installing, managing, and using GitLab Runners that may prevent medium to large customers from adopting GitLab for CI/CD.
- Determine the magnitude of the problem.
Customer
User Persona
- Devon (DevOps Engineer): For this opportunity canvas, we believe that the primary persona is best represented by Devon, that is, the individuals in an organization who are responsible for, or have taken the lead on, managing and supporting the organization’s GitLab installation.
Pain
- Customers who have to set up and configure specific Runners must perform a number of manual, error-prone steps.
- Another major pain point is handling the Runner tokens that are required to register a Runner with the GitLab instance. This process is very fragile because any change in the configuration can break the registration of those runners.
- It is very difficult to test or debug the Runner configuration during the setup process, especially when using multi runners.
- Testing more advanced functions like AWS EC2 caching is next to impossible.
- When working with multi runners, there is almost no information about why a Runner failed to start. The only option is to connect via SSH to the Multi Runner Dispatcher.
- There does not seem to be any option or workaround for the customer to run a specific job on a specific runner when multiple runners are available, neither in a multi runner setup nor in a setup with multiple single runners.
Workflow examples
Use case example 1: Installing GitLab on a local development system
Use case example 2: MLReef's per-branch-environment cloud deployment workflow
Business Case
Additional details are captured in the Opportunity canvas in Google Docs.
Reach
This solution will take several weeks of planning, a significant amount of design time, and at least three months of at least one backend engineer's time.
Solutions and Solution Validation
- In parallel with continued interviews, proposed solutions are being developed as a result of the focus on efficiency improvements. The related solution discussions are captured in the Make CI easy for Self-Managed GitLab users epic.
- The Make CI easy for Self-Managed GitLab users epic can be considered a meta epic that will encapsulate multiple incremental solutions delivered over time. As such, we will perform solution validation as needed for individual features.
- Solution validation iteration 1: This validated the concept of providing an automated installation option for Runners. The concept was covered in interviews conducted during the problem validation phase, and the vast majority of respondents reacted favorably to the proposal. In future validation cycles we will validate individual features as needed. For additional context, review the Solutions Roadmap table in the Make CI easy for Self-Managed GitLab users epic.