Develop pod-local StackGres Controller
Currently, StackGres works as an operator that runs as a separate pod (separate from the Postgres cluster pods). From there, it interacts with the Kubernetes API to perform StackGres actions, including creating and destroying pods, among many others.
However, it does not have a clear way to interact with the clusters once they are created. In particular, it has no way to explicitly run commands or take actions inside the containers of the StackGres pods, and some maintenance operations (for example, reloading configuration files) require running commands in those containers.
Rather than creating custom scripts or executing commands via the Kubernetes API, the team has decided to continue leveraging the sidecar pattern and implement this via a sidecar container that will act as a pod-local StackGres controller.
This controller will expose an HTTP API that will be called by the "main/central" controller. This also helps abstract how those actions are executed within the pod's containers.
Similarly to other HTTP APIs used by StackGres, this HTTP API will follow RFC 7807 for communicating errors back. The pod-local controller will run as a separate container, executing software developed in Java and compiled to native code via GraalVM.
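Per RFC 7807, an error returned by this API would be an `application/problem+json` body along these lines; the `type` URI and field values here are purely illustrative, not a decided format:

```json
{
  "type": "https://stackgres.io/problems/config-version-mismatch",
  "title": "Configuration version mismatch",
  "status": 409,
  "detail": "The requested configuration version is not yet materialized on the pod filesystem."
}
```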
For a first implementation, this pod-local controller must implement the following operations. Suggested implementation details are also proposed as part of this issue:
- Reload Postgres configuration. Postgres configuration (both `postgresql.conf` and `pg_hba.conf`) is currently handled via Patroni. Patroni stores this configuration as JSON within an Endpoint annotation (which is backed by K8s etcd). Whenever it is changed, Patroni propagates these changes to all nodes of the cluster automatically. However, they are not reloaded into the running processes. Implementation: ideally, Patroni's own HTTP API should be called. It exposes a `POST /reload` method that handles precisely this use case. Executing the command `patronictl reload` may also be an option, but it is less desirable.
- Reload PgBouncer configuration. PgBouncer provides a command to reload the configuration (see `RELOAD`) that will perform this task. This requires access to the administrative database of PgBouncer (typically called `pgbouncer`, but this depends on the configuration). Ideally, this would happen over localhost using the Postgres protocol, but other means are possible. Another possible, although less desirable, implementation is to `kill -HUP` PgBouncer's process, getting the pid from the `pgbouncer.pid` or similar file.
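The suggested Patroni call could be sketched with the JDK's built-in HTTP client as follows. This is a minimal sketch, not the actual implementation; the host and port are assumptions (Patroni's REST API listens on port 8008 by default):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class PatroniReload {

    // Build the POST /reload request against Patroni's REST API.
    // Host and port are assumptions; 8008 is Patroni's default REST API port.
    static HttpRequest reloadRequest(String host, int port) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://" + host + ":" + port + "/reload"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = reloadRequest("localhost", 8008);
        System.out.println(request.method() + " " + request.uri());
        // Inside the pod, where Patroni is reachable, the request would be sent with:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```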
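The fallback `kill -HUP` route for PgBouncer could look like the sketch below. The pid file path is an assumption (it depends on PgBouncer's configuration), and as noted above this approach is less desirable than issuing `RELOAD` over the Postgres protocol:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class PgbouncerReload {

    // Parse the pid from a pgbouncer.pid-style file: a single integer,
    // possibly surrounded by whitespace or a trailing newline.
    static long readPid(String pidFileContent) {
        return Long.parseLong(pidFileContent.trim());
    }

    // SIGHUP makes PgBouncer re-read its configuration files.
    static void sighup(long pid) throws Exception {
        new ProcessBuilder("kill", "-HUP", Long.toString(pid))
                .inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical path; the actual location depends on PgBouncer's configuration.
        long pid = readPid(Files.readString(Path.of("/var/run/pgbouncer/pgbouncer.pid")));
        sighup(pid);
    }
}
```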
One issue common to both methods is ensuring that the configuration change has been adequately propagated to the pod before the reload method is called. There may be a race condition here, since propagation of changes via the Downward API or annotations on objects is asynchronous with respect to the change operation. One way to avoid this problem is to introduce a version field as part of every configuration, monotonically increased whenever any change is performed. The reload operations must then also specify which version they expect the configuration to be at. The reload mechanism (executed by the pod-local controller) must check that the version of the configuration materialized on the filesystem matches the one requested via the web service, or wait with some retries until it gets updated. An error must be returned if, after several retries, that version is never updated.
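The check-and-retry mechanism above might be sketched as follows. The method name, retry count, and delay are illustrative, not part of any existing API; since versions are monotonically increasing, reaching a version at or beyond the expected one is treated as up to date:

```java
import java.util.function.Supplier;

public class ConfigVersionGate {

    // Wait until the configuration version materialized on the filesystem
    // reaches the version the caller requested, retrying a bounded number
    // of times before giving up.
    static boolean awaitVersion(Supplier<Long> materializedVersion, long expected,
                                int maxRetries, long delayMillis) throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (materializedVersion.get() >= expected) {
                return true;  // configuration is up to date; safe to reload
            }
            Thread.sleep(delayMillis);
        }
        return false;  // caller should answer with an RFC 7807 error response
    }
}
```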