Check NFS mounts in a separate process (!15426)

What does this MR do?

Moving the check out of the general requests ensures that regular requests are not slowed down by it.

To keep the process that performs these checks small, the check itself still runs inside a unicorn, but it is triggered by a separate process running on the same server.

Because the checks are now done outside normal requests, we can have a simpler failure strategy:

The check is now performed in the background every circuitbreaker_check_interval. Failures are logged in Redis. The failures are reset when the check succeeds. Per check we will try circuitbreaker_access_retries times within circuitbreaker_storage_timeout seconds.

When the number of failures exceeds circuitbreaker_failure_count_threshold, we will block access to the storage.

After failure_reset_time of no checks, we will clear the stored failures. This could happen when the process that performs the checks is not running.
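The failure strategy above can be sketched roughly as follows. This is an illustrative Python model, not the code in this MR: the class name StorageFailureTracker, its methods, and the in-memory counters (standing in for the record kept in Redis) are all assumptions made for the example. Only the threshold and reset-time behaviour come from the description.

```python
import time


class StorageFailureTracker:
    """Hypothetical in-memory stand-in for the per-storage failure record kept in Redis."""

    def __init__(self, failure_count_threshold, failure_reset_time, clock=time.monotonic):
        self.failure_count_threshold = failure_count_threshold
        self.failure_reset_time = failure_reset_time  # seconds
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self.failure_count = 0
        self.last_check_at = None

    def record_check(self, success):
        """Store the outcome of one background check."""
        self._expire_stale_failures()
        if success:
            self.failure_count = 0  # a successful check resets the failures
        else:
            self.failure_count += 1
        self.last_check_at = self.clock()

    def blocked(self):
        """Access is blocked once the failure count exceeds the threshold."""
        self._expire_stale_failures()
        return self.failure_count > self.failure_count_threshold

    def _expire_stale_failures(self):
        # If no check ran for failure_reset_time seconds (for example because
        # the checking process is not running), forget the stored failures.
        if self.last_check_at is None:
            return
        if self.clock() - self.last_check_at >= self.failure_reset_time:
            self.failure_count = 0
            self.last_check_at = None
```

With a threshold of 2, the third consecutive failed check blocks access; one successful check, or failure_reset_time seconds without any check, unblocks it again.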

The background process can be started using bin/storage_check, which takes a socket path or a host. It will periodically make requests to the provided unicorn. HealthController#storage_check will handle the request and report the status of the shards of that particular host.
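A minimal sketch of such a polling loop, written in Python purely for illustration (bin/storage_check itself is not shown here). The function names, the JSON response format, and the URL passed to fetch_storage_status are assumptions; the sketch covers only the host case, since urllib does not speak to Unix sockets.

```python
import json
import time
import urllib.request


def fetch_storage_status(url, timeout=5.0):
    """Request the (assumed) storage check endpoint and decode a JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll(fetch, interval, iterations, sleep=time.sleep):
    """Call `fetch` `iterations` times, waiting `interval` seconds between calls."""
    results = []
    for i in range(iterations):
        try:
            results.append(fetch())
        except OSError as error:
            # URLError and socket timeouts are OSError subclasses: a failed
            # request is recorded and polling continues.
            results.append({"error": str(error)})
        if i < iterations - 1:
            sleep(interval)
    return results
```

A caller would pass something like `lambda: fetch_storage_status(url)` as `fetch`, with `url` pointing at the unicorn on the same server. Keeping the request function injectable keeps the loop itself small and easy to test without a running server.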

Why was this MR needed?

This simplifies much of the circuitbreaker implementation and moves the check out of requests, so requests no longer get bogged down by it.

Does this MR meet the acceptance criteria?

Changelog entry added, if necessary
Documentation created/updated
API support added
Tests added for this feature/bug
Review
- Has been reviewed by Backend
Internationalization required/considered

TODO:

Add support for the separate process in GDK: gitlab-development-kit!404 (merged)
Add support for the separate process in omnibus-gitlab

What are the relevant issue numbers?

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/39847

Edited Dec 07, 2017 by Bob Van Landuyt
