Check NFS mounts in a separate process
What does this MR do?
Moving the check out of the general requests ensures that regular requests are not slowed down.
To keep the process that performs these checks small, the check itself is still performed inside a unicorn, but it is triggered by a separate process running on the same server.
Because the checks are now done outside of normal requests, we can have a simpler failure strategy:
The check is now performed in the background every `circuitbreaker_check_interval`. Failures are logged in Redis and are reset when the check succeeds. Per check, we will try `circuitbreaker_access_retries` times within `circuitbreaker_storage_timeout` seconds.
When the number of failures exceeds `circuitbreaker_failure_count_threshold`, we will block access to the storage.
After `failure_reset_time` of no checks, we will clear the stored failures. This could happen when the process that performs the checks is not running.
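The failure strategy above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this MR: the class `StorageFailureTracker` and its methods are hypothetical, a plain Hash stands in for Redis, and the clock is injectable so the reset behaviour can be exercised without waiting.

```ruby
# Hypothetical sketch of the failure strategy; all names are invented for
# illustration and a Hash stands in for Redis.
class StorageFailureTracker
  def initialize(failure_count_threshold:, failure_reset_time:, clock: -> { Time.now.to_f })
    @threshold = failure_count_threshold
    @reset_time = failure_reset_time
    @clock = clock
    @store = {} # storage name => { failures: Integer, last_check: Float }
  end

  # Record the outcome of one background check for a storage.
  # A success resets the failure count, a failure increments it.
  def record(storage, success:)
    entry = fresh_entry(storage)
    entry[:failures] = success ? 0 : entry[:failures] + 1
    entry[:last_check] = @clock.call
    @store[storage] = entry
  end

  # Access is blocked once the failure count exceeds the threshold.
  def blocked?(storage)
    fresh_entry(storage)[:failures] > @threshold
  end

  private

  # Returns the stored entry, or a clean one when nothing is stored or when
  # no check has run for longer than the reset time.
  def fresh_entry(storage)
    entry = @store[storage]
    return { failures: 0, last_check: nil } if entry.nil?
    return { failures: 0, last_check: nil } if @clock.call - entry[:last_check] > @reset_time

    entry
  end
end
```

With a threshold of 2, the third consecutive failure blocks the storage, a single successful check unblocks it, and a stale record is ignored once the reset time has passed without any check.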
The background process can be started using `bin/storage_check`; it takes a socket path or a host. It will periodically make requests to the provided unicorn. `HealthController#storage_check` will handle the request and report the status of the shards of that particular host.
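As a rough picture of what "periodically make requests" means, a polling loop could look like the sketch below. It is not the implementation of `bin/storage_check`: the class `PeriodicChecker` is hypothetical, and the request, sleep, and logger are injected callables (assumptions made so the sketch runs without a unicorn or a network).

```ruby
# Hypothetical polling loop; not the actual bin/storage_check implementation.
# `request` is any callable that performs one check and returns a description
# of the result. `sleeper` and `logger` are injectable for testing.
class PeriodicChecker
  def initialize(interval:, request:, sleeper: ->(seconds) { sleep(seconds) }, logger: $stdout)
    @interval = interval
    @request = request
    @sleeper = sleeper
    @logger = logger
  end

  # Perform `times` checks, waiting `interval` seconds after each one.
  # A failing request is logged and does not stop the loop.
  def run(times)
    times.times do
      begin
        @logger.puts("storage check: #{@request.call}")
      rescue StandardError => e
        @logger.puts("storage check failed: #{e.message}")
      end
      @sleeper.call(@interval)
    end
  end
end
```

Rescuing errors inside the loop matters here: if one request to the unicorn fails, the process keeps running and tries again on the next interval instead of exiting.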
Why was this MR needed?
This simplifies much of the circuit breaker implementation and moves the check out of requests, so that requests don't get bogged down.
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Documentation created/updated
- API support added
- Tests added for this feature/bug
- Review
  - Has been reviewed by Backend
- Internationalization required/considered
TODO:
- Add support for the separate process in GDK: gitlab-development-kit!404 (merged)
- Add support for the separate process in omnibus-gitlab