Monday 2025-02-03 13:12 UTC - nodejs (for http router) missing in build-gdk-image

gitlab-org/gitlab pipeline #1653161156 failed

  • Pipeline ID: 1653161156
  • Branch: master
  • Commit: Merge branch '514745-applicationsettings-analysis-script-does-not-support-activerecord-encrypted' into 'master'
  • Merge request: Support encrypts :field calls in scripts/cells/application-settings-analysis.rb
  • Source: push
  • Duration: 59.91 minutes
  • Triggered by: Matthias Käppler

Failed jobs (1):

  • build-gdk-image Job ID: 9024679174 (retry with @gitlab-bot retry_job 9024679174)

Attribution:

This incident is unattributed and posted in #master-broken.

Incident review

  1. Impact:
    • At least 70 pipelines on merge requests failed unnecessarily.
    • Delays in the development process due to failing pipelines.
    • Potential confusion and time loss for developers working on affected merge requests.
  2. Root Cause: Make evaluates wildcards once, at startup. As a result, the gitlab-http-router-asdf-install target in GDK did not execute properly on the first run, which left nodejs uninstalled.
  3. Resolution: We fixed the underlying bug in GDK via gitlab-org/gitlab-development-kit!4461 (merged) and updated the version in the monolith repository via gitlab-org/gitlab!180043 (merged), which resolved the incident.
  4. Timeline:
    • 13:12 UTC: Incident raised
    • 13:59 UTC: Acknowledged
    • 14:19 UTC: MR to debug pipelines: gitlab-org/gitlab-development-kit!4459 (merged)
    • 15:06 UTC: MR for potential revert (later closed): gitlab-org/cells/http-router!488 (closed)
    • 15:33 UTC: Merged MR to fix root cause in GDK: gitlab-org/gitlab-development-kit!4461 (merged)
    • 15:50 UTC: Merged GDK version upgrade in gitlab-org/gitlab: gitlab-org/gitlab!180043 (merged)
    • 16:17 UTC: Verified that incident is resolved through passing CI job: https://gitlab.com/gitlab-org/gitlab/-/jobs/9026962720
  5. Corrective action: Pin service dependencies (gitlab-org/gitlab-development-kit#2398 - closed)
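
The root cause in item 2 can be illustrated with a small analogy. The sketch below is Python, not the GDK Makefile, and every name in it (run_once, tool-versions.txt, the checkout directory) is hypothetical. It only assumes what the review states: the wildcard is evaluated at startup, before the step that creates the files it is meant to match.

```python
import glob
import os
import tempfile


def run_once(checkout_dir):
    """Simulate one build run; return the files the install step acted on.

    Hypothetical stand-in for a Make run, not GDK's actual logic.
    """
    # Startup phase: the wildcard is expanded once, against whatever
    # exists on disk right now.
    matched = glob.glob(os.path.join(checkout_dir, "*.txt"))

    # Run phase: a "clone" step creates the checkout and its version file.
    os.makedirs(checkout_dir, exist_ok=True)
    with open(os.path.join(checkout_dir, "tool-versions.txt"), "w") as f:
        f.write("nodejs\n")

    # The install step only sees the list computed at startup, so files
    # created during this same run are invisible to it.
    return [os.path.basename(path) for path in matched]


def demo():
    """Run the simulated build twice against a fresh directory."""
    with tempfile.TemporaryDirectory() as tmp:
        checkout = os.path.join(tmp, "http-router")
        first = run_once(checkout)   # checkout does not exist yet
        second = run_once(checkout)  # checkout exists from the first run
        return first, second


if __name__ == "__main__":
    first, second = demo()
    print("first run installs for:", first)
    print("second run installs for:", second)
```

On a fresh directory the first run matches nothing and the second run matches the version file. That mirrors the reported symptom of a target that only misbehaves on the first run, which is the only kind of run a clean image build like build-gdk-image ever performs.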

How to close this incident

  • Follow the steps in the Broken master handbook guide to
    • escalate
    • triage, and
    • resolve
  • Reminder: apply the appropriate ~master-broken:* label to document root cause before closing the incident.

Quick Tips:

  • you can retry all failing jobs with @gitlab-bot retry_pipeline 1653161156.
  • a message can be posted in #backend_maintainers or #frontend_maintainers to get a maintainer to take a look at the fix ASAP.
  • add the pipeline::expedited label, and master:broken or master:foss-broken label, to speed up the master-fixing pipelines.
Edited Feb 05, 2025 by Kev Kloss