
Invalid Service Name Results in Runner System Failure

Summary

We are seeing some jobs fail with an invalid service name. This is currently classified as a system failure (our fault), but it should be classified as a user error (user mistake).

Job failed (system failure): invalid service name:

Screenshot_2024-10-09_at_14.32.23

source

The "Job failed (system failure): invalid service name" message comes from the code block below:

func (e *executor) createService(
	serviceIndex int,
	service, version, image string,
	definition common.Image,
	linkNames []string,
) (*types.Container, error) {
	if service == "" {
		return nil, fmt.Errorf("invalid service name: %s", definition.Name)
	}

source

Impact

This causes false-positive alerts: the SRE on-call gets paged for a high error ratio:

Screenshot_2024-10-09_at_14.31.38

source

This has already paged the SRE on-call twice, with no action for us to take.

Recommendation

An invalid service name shouldn't be a system failure. It should be a normal job failure, so that it doesn't count toward our error ratio.

Verification

A job using the following .gitlab-ci.yml should not be considered a system failure:

services:
  - name: $INVALID_NAME
    alias: gitlab

job:
  script:
    - echo "hello world"

Screenshot_2024-10-09_at_14.35.35

source
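
A plausible explanation for why the configuration above triggers the check, assuming the runner expands variables in the service name and INVALID_NAME is not defined for the job: the name expands to an empty string. The sketch below illustrates that assumption with the standard library's os.Expand. The expandName helper and the variable map are hypothetical and are not the runner's own expansion code.

```go
package main

import (
	"fmt"
	"os"
)

// expandName is a hypothetical helper that replaces $VAR and ${VAR}
// references using the given job variables. Undefined variables expand to
// the empty string, because a missing map key yields "".
func expandName(name string, vars map[string]string) string {
	return os.Expand(name, func(key string) string {
		return vars[key]
	})
}

func main() {
	vars := map[string]string{} // INVALID_NAME is deliberately not defined

	service := expandName("$INVALID_NAME", vars)
	fmt.Printf("expanded service name: %q\n", service)
	fmt.Println("would hit the empty-name check:", service == "")
}
```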