Collect node CPU and memory utilization in usage ping

Qingyu Zhao requested to merge topology_collect_node_cpu_memory_utilization into master

What does this MR do?

To track the group::memory NSM more accurately, we need to track not only the total number of cores and total available memory per node, but also the extent to which they are used. This will help us understand whether self-managed customers are over- or underprovisioned in terms of hardware and relative to our reference architectures.

MR omnibus-gitlab!4455 (merged) created Prometheus rules that provide the metrics gitlab_usage_ping:node_cpu_utilization:avg and gitlab_usage_ping:node_memory_utilization:avg. This client-side MR queries these metrics to collect node CPU and memory utilization as part of the topology usage ping.
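As a rough illustration of the query side (not the code in this MR), the sketch below reads both recording rules through the standard Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and groups the samples per node. It assumes each sample carries an `instance` label; the helper names (`parse_vector`, `query_metric`, `merge_by_node`) and the output field names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Recording rules named in the MR description.
CPU_METRIC = "gitlab_usage_ping:node_cpu_utilization:avg"
MEMORY_METRIC = "gitlab_usage_ping:node_memory_utilization:avg"


def parse_vector(payload):
    """Map each sample's `instance` label to its float value.

    `payload` is the decoded JSON body of a Prometheus instant query.
    Assumption: samples are labelled with `instance`.
    """
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector")
    # Each sample looks like {"metric": {...}, "value": [timestamp, "0.42"]}.
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in data["result"]
    }


def query_metric(base_url, metric, timeout=5.0):
    """Run an instant query against a Prometheus server (hypothetical helper)."""
    params = urllib.parse.urlencode({"query": metric})
    url = f"{base_url}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_vector(json.load(response))


def merge_by_node(cpu, memory):
    """Combine per-instance CPU and memory utilization into one record per node.

    A node missing from one of the inputs gets None for that field.
    The field names are illustrative, not the actual usage ping schema.
    """
    return {
        instance: {
            "node_cpu_utilization": cpu.get(instance),
            "node_memory_utilization": memory.get(instance),
        }
        for instance in sorted(set(cpu) | set(memory))
    }
```

With a reachable Prometheus server, `merge_by_node(query_metric(url, CPU_METRIC), query_metric(url, MEMORY_METRIC))` would yield one utilization record per node; keeping the parsing separate from the HTTP call makes the mapping easy to test without a server.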

See also #230898 (closed)

Conformity

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team