
Integrate Opni as an optional observability unit in Sylva

Ivo Petrov requested to merge ipetrov117/sylva-core:opni-sylva-integration into main

What does this MR do and why?

This MR integrates the Opni application into the Sylva project. Opni is open-source software for multi-cluster and multi-tenant observability.

A description of what this MR covers, and why it was implemented this way, can be found in the TL;DR section below.

Related reference(s)

Information on Opni and how to use it in the Sylva project can be found in the documentation that comes with this MR.

Test coverage

The suggested changes have been tested using Sylva's kubeadm-capd environment.
The following manual tests were performed there:

  1. Opni Monitoring backend functionality validation
  2. Opni Logging backend functionality validation
  3. Capability to add additional opni-agents
  4. Alerting tests
  5. AIOps Log anomaly detection tests using Opni's pretrained models

TL;DR

This MR introduces the following Sylva units:

  1. opni-ns - sets up the namespace where Opni will be deployed
  2. opni-crd - deploys CRDs that are a prerequisite for the Opni application
  3. opni - the Opni application itself
  4. opni-keycloak-eso - external secrets that are used for Opni's Keycloak authentication automation
  5. opni-keycloak-resources - automates Keycloak client and user creation for Opni's Keycloak authentication setup
  6. opni-ingress - exposes the relevant Opni URLs, such as those for monitoring, logging, and the dashboard

Once the opni unit is enabled, all units relevant to the specific Opni configuration are enabled as well. Once it is disabled, all Opni-related resources are removed, leaving the cluster clean of any leftover Opni resources.
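
As a minimal sketch of what enabling the unit could look like, assuming opni follows the `units.<name>.enabled` pattern used for Sylva units (the exact key is not stated in this description, so treat it as an assumption):

```yaml
# Hypothetical values override; the key layout is assumed, not confirmed by this MR text.
units:
  opni:
    enabled: true   # dependent opni-* units are expected to follow automatically
```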

Changes that deviate from the usual Sylva setup

opni-ns has not been added to the namespace-defs unit because:

  1. Opni creates additional resources while it runs. If the namespace is not deleted when the opni unit is deleted, any later enablement of the opni unit will fail because resources from the previous setup are still present. Making this a unit bound to the "opni unit chain" ensures that the namespace is always fresh on every opni unit enable/disable cycle.
  2. I did not want the opni unit to leave leftover resources when disabled.

The opni-keycloak-eso and opni-keycloak-resources resources were not added to the keycloak-oidc-external-secrets and keycloak-resources units respectively, because:

  1. I did not want the opni unit to leave leftover resources when disabled, or to deploy resources that go unused in certain Opni deployment configurations.
  2. Since opni is an optional component, it made little sense to put them there.

Resources related to Opni -> Keycloak authentication automation:

Opni uses an OpenID provider to authenticate Grafana users. This MR comes with an automation (triggered by a configuration flag) that prepares Keycloak resources so that Opni can use Sylva's Keycloak as this authentication mechanism.

  1. keycloak-resources - covered by explanation above
  2. eso - covered by explanation above
  3. configtemplate-opni-keycloak.tpl - configurations specific to the Opni -> Keycloak automation, which are added to the opni unit only if the .Values.cluster.opni.gateway.auth.useInternalKeycloak flag is set to true. It is done this way because if clauses cannot be written in the root values.yaml file of the Sylva chart.
  4. kubernetes-keycloak-ns-secretstore.yaml - added the default namespace to the secret store conditions to ensure that the opni-oidc-auth secret can be created in the default namespace and picked up by the opni unit's HelmRelease resource. Since a lookup from one unit to another (while both units are being deployed) is not possible, this was the only other way I found to get values from a secret and use them in my unit.
  5. kubernetes-cert-manager-ns-secretstore.yaml - added the opni namespace here in order to ensure that the eso-sylva-ca resource could deploy the sylva-ca.crt secret in the opni namespace.
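
For illustration, the flag mentioned in item 3 corresponds to the values fragment below. The key path is taken directly from the flag name above; which values file it belongs in is an assumption and depends on the environment being deployed.

```yaml
# Sketch only: turns on the Opni -> Keycloak automation described above.
cluster:
  opni:
    gateway:
      auth:
        useInternalKeycloak: true   # when true, configtemplate-opni-keycloak.tpl is added to the opni unit
```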

Other changes that require an explanation:

  1. workload-cluster.values.yaml - added the cert-manager unit as an optional component in the workload-cluster deployment. cert-manager is a prerequisite for a successful deployment of an opni-agent.
