Add more NFS Mount health checks
Add more health checks for NFS mounts.
Various front-end nodes have been experiencing silent outages of NFS mounts. These outages need to become observable quickly by means other than end-user complaints.
This change could be a modification of https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/gitlab/health_checks/fs_shards_check.rb, which is currently repository-specific, or a new health check file responsible for checking all NFS mounts.
The basic check to add for each mount involves a stat of a file within the mounted tree and a stat of the directory above the mount point. The two must report different device IDs if the mount is currently working.
Example:
stat --format="%d" /var/opt/gitlab/gitlab-rails/shared/lfs-objects/foo
stat --format="%d" /var/opt/gitlab/gitlab-rails/shared
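Since the health checks live in Ruby, the same comparison could be sketched roughly as below. This is only an illustration of the device ID idea, not existing GitLab code: the method name nfs_mount_healthy? and its two parameters are hypothetical, and treating a failed stat as unhealthy is an assumption about the desired behavior.

```ruby
# Hypothetical helper, not part of GitLab's code base.
# probe_path: a file expected to exist inside the mounted tree.
# parent_dir: the directory above the mount point.
# Returns true only when both paths can be stat'ed and they report
# different device IDs, i.e. the mount appears to be in place.
def nfs_mount_healthy?(probe_path, parent_dir)
  File.stat(probe_path).dev != File.stat(parent_dir).dev
rescue SystemCallError
  # Assumption: a missing file or any other stat failure counts as unhealthy.
  false
end
```

With the paths from the example above, a call might look like nfs_mount_healthy?("/var/opt/gitlab/gitlab-rails/shared/lfs-objects/foo", "/var/opt/gitlab/gitlab-rails/shared").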
for additional context, see:
Edited by Paul Charlton