ctdb/ceph: add option to not register mutex rados helper as a service
Refiled version of !3998 (closed)
Add a new -R option (no-register) that will skip the step of
registering the lock helper as a ceph service. Ceph will treat the lock
helper more like a typical rados client. The ceph -s output will not
have ctdb listed under the services section (previous output):
  cluster:
    id:     5b81295a-fdec-11ef-a18f-525400220000
    health: HEALTH_WARN
            1 stray daemon(s) not managed by cephadm

  services:
    mon:  3 daemons, quorum ceph0,ceph1,ceph2 (age 6m)
    mgr:  ceph0.mkodry(active, since 85s)
    mds:  1/1 daemons up
    osd:  6 osds: 6 up (since 52m), 6 in (since 52m)
    ctdb: 1 daemon active (1 hosts)
Most importantly, this avoids triggering health warnings from ceph when cephadm discovers services listed in the cluster that it did not create or directly manage. We looked into hiding these on the cephadm side, but that proved quite tricky, so it is better not to attempt this registration on cephadm-managed clusters in the first place.
In addition, the "1 daemon active" bit is somewhat confusing when you
have an N-node (N>1) ctdb cluster managed by cephadm. The fact that the
mutex helper only runs on one of those nodes at a time is a low-level
implementation detail that most users do not need to know about and
that I assume could confuse them.
The -R flag was chosen to fit the practice of naming "negative" options in upper case. If we were ever to change the default behavior, we'd add -r (register) to enable registration and keep -R to disable it. Since we currently only need to disable registration, we add only -R.
Checklist
- Commits have Signed-off-by: with name/author being identical to the commit author
- (optional) This MR is just one part towards a larger feature.
- (optional, if backport required) Bugzilla bug filed and BUG: tag added
- Test suite updated with functionality tests
- Test suite updated with negative tests
- Documentation updated
- CI timeout is 3h or higher (see Settings/CICD/General pipelines/ Timeout)
Reviewer's checklist:
- There is a test suite reasonably covering new functionality or modifications
- Function naming, parameters, return values, types, etc., are consistent and according to README.Coding.md
- This feature/change has adequate documentation added
- No obvious mistakes in the code