Outdated macOS service setup instructions

As I'm working on building a macOS AMI for use as an internal runner, I had some issues with setting up gitlab-runner as a service, specifically on Apple Silicon (I have not tested with Intel).

I believe those issues are a combination of the runner documentation being out of date and my use case being new (running headless on macOS AWS instances, rather than on a server where I can easily log in manually through the GUI). Hopefully this troubleshooting log can help folks who run into the same issues until we are able to update our documentation.

After running gitlab-runner install, gitlab-runner start didn't work and I had the following issues:

  1. It seems that load and unload are legacy launchctl subcommands, and we should be using bootstrap and bootout (together with enable) instead.

  2. The /usr/local/var/log directory does not exist, and this location is not writable by default. I updated the configuration to log to the home directory instead: sed -i.bak 's|/usr/local/var/log/|/Users/ec2-user/|' /Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist
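
To preview that substitution before touching the real file, it can be tried on a scratch copy first. This is only a minimal sketch: the helper name rewrite_log_dir is hypothetical, and the plist fragment is made up for illustration (it assumes the log paths sit under the standard StandardOutPath and StandardErrorPath keys).

```shell
# Rewrite every occurrence of one log directory prefix in a plist,
# keeping the untouched original next to it as <file>.bak.
# Hypothetical helper wrapping the sed command above.
rewrite_log_dir() {
  plist="$1"; old_dir="$2"; new_dir="$3"
  sed -i.bak "s|${old_dir}|${new_dir}|g" "$plist"
}

# Scratch file with an illustrative fragment (not the real plist).
scratch="$(mktemp -d)/gitlab-runner.plist"
cat > "$scratch" <<'EOF'
<key>StandardOutPath</key>
<string>/usr/local/var/log/gitlab-runner.out.log</string>
<key>StandardErrorPath</key>
<string>/usr/local/var/log/gitlab-runner.err.log</string>
EOF

rewrite_log_dir "$scratch" /usr/local/var/log/ /Users/ec2-user/
grep '<string>' "$scratch"
```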

  3. Got an error trying to launch the service:

launchctl load /Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist
Load failed: 5: Input/output error
Try running `launchctl bootstrap` as root for richer errors.
  4. Using launchctl bootstrap, I tried to launch the service into my user's domain. However, macOS instances in AWS do not auto-login to a graphical session on boot, so the gui/501 domain is not available. I also could not get user/501 to work, even though the manual says of it: "A user domain may exist independently of a logged-in user."
sudo launchctl bootstrap user/501 /Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist
/Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist: Service cannot load in requested session
Bootstrap failed: 134: Unknown error: 134
  5. I found this thread, which explains that the "background session", and things like the home directory, are not available before the user has logged in once: https://launchd-dev.macosforge.narkive.com/7s3ELd8z/cause-of-service-cannot-load-in-requested-session (I'm not sure this is the cause, but it seemed close enough, given that it mentions the same error.)
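
A quick way to see which launchd domains exist on a headless instance is to ask launchctl to print them. This is a rough sketch: domain_available is a hypothetical helper, and it assumes that launchctl print exits non-zero when the requested domain does not exist.

```shell
# Report whether a launchd domain (for example gui/501, user/501 or system)
# can currently be inspected. Hypothetical helper; it relies only on
# `launchctl print <domain-target>` and its exit status.
domain_available() {
  if launchctl print "$1" >/dev/null 2>&1; then
    echo "$1: available"
  else
    echo "$1: not available"
  fi
}

uid="$(id -u)"
for domain in "gui/$uid" "user/$uid" system; do
  domain_available "$domain"
done
```

Running it over SSH right after boot, before any graphical login, should show whether gui/<uid> is present at all.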

  6. So I thought I would move this to the system domain. This meant moving the service file to /Library/LaunchAgents and making it owned by root:wheel. I also had to add the UserName key so that the process is not executed as root:

# Add the following to the configuration:
#   <key>UserName</key>
#   <string>ec2-user</string>
#
vim /Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist

# move plist to system directory
sudo mv /Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist /Library/LaunchAgents
sudo chown root:wheel /Library/LaunchAgents/gitlab-runner.plist

# enable the service at boot
sudo launchctl enable system/gitlab-runner

# launch the service
sudo launchctl bootstrap system /Library/LaunchAgents/gitlab-runner.plist

# it works!
$ ps aux | grep gitlab
ec2-user          1634   0.0  0.0 408628368   1648 s000  S+    1:56pm   0:00.00 grep gitlab
ec2-user          1632   0.0  0.2 409260992  39312   ??  Ss    1:56pm   0:00.06 /usr/local/bin/gitlab-runner run --working-directory /Users/ec2-user --config /Users/ec2-user/.gitlab-runner/config.toml --service gitlab-runner --syslog
  7. Unfortunately, while this worked to launch the service manually, it didn't start at boot. I found that the plist also needed to be readable only by root: sudo chmod 0600 /Library/LaunchAgents/gitlab-runner.plist

  8. Finally, the following warning prompted me to move the file to /Library/LaunchDaemons instead of /Library/LaunchAgents: "Warning: Expecting a LaunchDaemons path since the command was ran as root. Got LaunchAgents instead." Indeed, this matches the use case better. From https://apple.stackexchange.com/questions/290945/what-are-the-differences-between-launchagents-and-launchdaemons:

LaunchAgents are only invoked when the user logs into a graphical session.

LaunchDaemons are typically launched when the system boots and are run outside of a specific user session.

The launchctl manual page lists these folders with short descriptions:

Files

    ~/Library/LaunchAgents Per-user agents provided by the user.
    /Library/LaunchAgents Per-user agents provided by the administrator.
    /Library/LaunchDaemons System-wide daemons provided by the administrator.
    /System/Library/LaunchAgents Per-user agents provided by Mac OS X.
    /System/Library/LaunchDaemons System-wide daemons provided by Mac OS X.
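
Taken together, the steps so far can be sketched as a single sequence. This is not an official procedure, just a consolidation of the commands above under a few assumptions: the plist is still in the user's LaunchAgents directory, the UserName key has already been added, and the service label is gitlab-runner. The run helper and the DRY_RUN switch are hypothetical; with DRY_RUN=1 (the default here) the commands are only printed.

```shell
PLIST_SRC="${PLIST_SRC:-/Users/ec2-user/Library/LaunchAgents/gitlab-runner.plist}"
PLIST_DST="/Library/LaunchDaemons/gitlab-runner.plist"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_daemon() {
  run sudo mv "$PLIST_SRC" "$PLIST_DST"            # daemons live in LaunchDaemons
  run sudo chown root:wheel "$PLIST_DST"           # owned by root
  run sudo chmod 0600 "$PLIST_DST"                 # readable only by root
  run sudo launchctl enable system/gitlab-runner   # start at boot
  run sudo launchctl bootstrap system "$PLIST_DST" # start now
}

install_daemon
```

Set DRY_RUN=0 to actually execute the commands. Afterwards, sudo launchctl print system/gitlab-runner can be used to inspect the service state.
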
  9. I also had to add the following key, in order for Keychain access to work:
    <key>SessionCreate</key>
    <true />
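
When building an AMI non-interactively, the UserName and SessionCreate keys can probably be added without an editor. The sketch below uses PlistBuddy and plutil, which ship with macOS; the add_plist_keys helper and the assumed plist location are illustrative, not something the runner provides.

```shell
PLIST="${PLIST:-/Library/LaunchDaemons/gitlab-runner.plist}"
PLISTBUDDY=/usr/libexec/PlistBuddy

# Add both keys, then check that the file is still a valid property list.
# Note: PlistBuddy's Add command fails if the key already exists.
add_plist_keys() {
  if [ ! -x "$PLISTBUDDY" ] || [ ! -f "$1" ]; then
    echo "skipped: PlistBuddy or $1 not found"
    return 0
  fi
  sudo "$PLISTBUDDY" -c "Add :UserName string ec2-user" "$1" &&
  sudo "$PLISTBUDDY" -c "Add :SessionCreate bool true" "$1" &&
  sudo plutil -lint "$1"
}

add_plist_keys "$PLIST"
```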

Setting up the keychain within the job is easy. Since the login keychain does not exist, you can use a custom one:

security create-keychain -p gitlab gitlab || true
security default-keychain -s gitlab
security unlock-keychain -p gitlab gitlab
  10. In order for graceful shutdown to work, we need to use dumb-init to rewrite SIGTERM to SIGQUIT before it reaches the runner. Indeed, launchd terminates processes with SIGTERM, but gitlab-runner expects SIGQUIT for a graceful shutdown.
  • Install dumb-init with pip install dumb-init (this builds the binary for darwin arm64, which is not provided upstream)
  • Prefix the launch of gitlab-runner with dumb-init in the plist:
    ....
    <key>ProgramArguments</key>
    <array>
      <string>/opt/homebrew/bin/dumb-init</string>
      <string>--rewrite</string>
      <string>15:3</string>
      <string>/usr/local/bin/gitlab-runner</string>
    .....

Note that by default, AWS only allows around 10 minutes for an instance to shut down. So if you need graceful termination on shutdown and you have jobs longer than 10 minutes, you might need to either use AutoScaling Lifecycle Hooks (if in an ASG), or first stop the gitlab-runner process with sudo launchctl bootout system/gitlab-runner and wait for it to quit before initiating the termination.
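
The "wait for it to quit" part could look roughly like the following. This is a sketch under assumptions: wait_for_exit is a hypothetical helper, the process is matched by its exact name with pgrep, and the 5-second polling interval and timeout values are arbitrary.

```shell
# Wait until no process with the given exact name is left, or until the
# timeout (in seconds) expires. Returns 0 if the process is gone, 1 on timeout.
wait_for_exit() {
  name="$1"; timeout="${2:-600}"; waited=0
  while pgrep -x "$name" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
  return 0
}

# Usage on the instance, before requesting termination:
#   sudo launchctl bootout system/gitlab-runner
#   wait_for_exit gitlab-runner 3600 && echo "runner stopped"
```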

Edited by Adrien Kohlbecker