
Fix perms on /var/opt/gitlab/gitaly/gitaly.pid during startup

Justin Farmiloe requested to merge jfarmiloe-fix-gitaly-perms into master

What does this MR do?

When the persistent files belonging to a GitLab Docker container are transferred from one Docker server to another, the user and group IDs of the transferred files can change, or may no longer match the values expected on the new server.

Several customers have reported issues with the gitaly service not starting after performing such a transfer, due to the /var/opt/gitlab/gitaly/gitaly.pid file having the wrong permissions.
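
One way to confirm this kind of mismatch is to compare the numeric owner of the pid file with the IDs of the user the service runs as. The following is a minimal sketch, not part of this MR: owner_ids and ids_match are hypothetical helper names, the stat -c option assumes GNU coreutils, and the "git" user in the example comment is an assumption about the account Gitaly runs under in the container.

```shell
#!/bin/sh
# Illustrative helpers only; owner_ids and ids_match are hypothetical names.

# Print "uid:gid" for the given path (uses GNU stat's -c option).
owner_ids() {
    stat -c '%u:%g' "$1"
}

# Succeed only if the path is owned by the expected numeric uid and gid.
ids_match() {
    [ "$(owner_ids "$1")" = "$2:$3" ]
}

# Example (inside the container, assuming Gitaly runs as the "git" user):
#   ids_match /var/opt/gitlab/gitaly/gitaly.pid "$(id -u git)" "$(id -g git)" \
#       || echo "gitaly.pid ownership does not match the git user"
```

Comparing numeric IDs rather than names matters here, because the same name can map to different IDs on the old and new servers.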

Both the container startup wrapper script and the troubleshooting docs recommend running the update-permissions script if startup issues are encountered:

If this container fails to start due to permission problems try to fix it by executing:

  docker exec -it gitlab update-permissions
  docker restart gitlab

The update-permissions script resets the ownership of several files and folders but does not modify anything under /var/opt/gitlab/gitaly, and so does not help fix this issue.

There is also a clean_stale_pids function in the wrapper script, run on container startup, which is specifically intended to remove any old *.pid files left on the system, but it does not clean up /var/opt/gitlab/gitaly/gitaly.pid.

Running find /var/opt/gitlab -name "*.pid" in a running container shows the following pid files under /var/opt/gitlab:

  • ./postgresql/data/postmaster.pid
  • ./nginx/nginx.pid
  • ./gitaly/gitaly.pid

The postgresql and nginx pid files are removed when their services are stopped (as they most likely are before the container files are copied to a new server), so they typically don't exist when files are transferred. The /var/opt/gitlab/gitaly/gitaly.pid file, by contrast, is not removed when the Gitaly service is stopped (which is something that should probably be changed in Gitaly itself).

For completeness, though, the other pid files should also be covered by the clean_stale_pids function.

So this MR updates the clean_stale_pids function to include the /var/opt/gitlab folder to ensure Gitaly can start after a container file migration.
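
The idea can be sketched as follows. This is a hypothetical illustration, not the actual clean_stale_pids function from the wrapper script: remove_stale_pids is an invented name, and it takes the directories to scan as arguments so that /var/opt/gitlab can be passed alongside any locations already covered.

```shell
#!/bin/sh
# Hypothetical sketch; not the real wrapper script from this repository.
# remove_stale_pids deletes every regular file named *.pid found under
# each directory passed as an argument.
remove_stale_pids() {
    for dir in "$@"; do
        # Skip directories that do not exist (e.g. on a fresh install).
        [ -d "$dir" ] || continue
        find "$dir" -type f -name '*.pid' -exec rm -f {} +
    done
}

# At container startup, before any services are launched:
#   remove_stale_pids /var/opt/gitlab
```

Deleting the stale file, rather than changing its ownership, lets the service recreate it with the correct user and group on the new server.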

Related issues

#6926

Checklist

See Definition of done.

For anything in this list which will not be completed, please provide a reason in the MR discussion.

Required

  • MR title and description are up to date, accurate, and descriptive.
  • MR targeting the appropriate branch.
  • Latest Merge Result pipeline is green.
  • When ready for review, MR is labeled "~workflow::ready for review" per the Distribution MR workflow.

For GitLab team members

If you don't have access to this, the reviewer should trigger these jobs for you during the review process.

  • The manual Trigger:ee-package jobs have a green pipeline running against latest commit.
  • If config/software or config/patches directories are changed, make sure the build-package-on-all-os job within the Trigger:ee-package downstream pipeline succeeded.
  • If you are changing anything SSL related, then the Trigger:package:fips manual job within the Trigger:ee-package downstream pipeline must succeed.
  • If CI configuration is changed, the branch must be pushed to dev.gitlab.org to confirm regular branch builds aren't broken.

Expected (please provide an explanation if not completing)

  • Test plan indicating conditions for success has been posted and passes.
  • Documentation created/updated.
  • Tests added.
  • Integration tests added to GitLab QA.
  • Equivalent MR/issue for the GitLab Chart opened.
  • Validate potential values for new configuration settings. Formats such as integer 10, duration 10s, URI scheme://user:passwd@host:port may require quotation or other special handling when rendered in a template and written to a configuration file.
Edited by Justin Farmiloe
