Add periodic worker for collecting network policy usage

Arthur Evstifeev requested to merge network-policy-metrics into master

What does this MR do?

Container network policy statistics were introduced in %12.10 (#32365, closed). They work by querying the metrics that Prometheus stores for Cilium installed on a user's cluster. This MR follows similar logic to introduce North Star metrics for these statistics across all defined environments.

This MR adds a background job that collects network-policy-related metrics into a Redis-based counter. The job runs once a week, on Sunday. The related usage data counter is also included in the usage ping data.
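To illustrate the flow described above, here is a minimal, self-contained sketch of a periodic job that sums per-environment statistics into counters. It is not the code from this MR: the class names, the counter keys (`network_policy_forwards`, `network_policy_drops`), the cron expression, and the in-memory store standing in for Redis are all assumptions made for illustration.

```ruby
# Hypothetical sketch; none of these names are taken from the MR itself.

# In-memory stand-in for a Redis-backed counter (INCRBY / GET semantics).
class InMemoryCounterStore
  def initialize
    @values = Hash.new(0)
  end

  def increment(key, by)
    @values[key] += by
  end

  def read(key)
    @values[key]
  end
end

# Hypothetical worker: fetches stats per environment and adds them to counters.
class NetworkPolicyMetricsWorker
  # Assumed cron expression for "once a week on Sunday" (at midnight).
  CRON = '0 0 * * 0'

  # fetcher is any callable returning { forwards:, drops: } or nil.
  def initialize(store:, fetcher:)
    @store = store
    @fetcher = fetcher
  end

  def perform(environments)
    environments.each do |environment|
      stats = @fetcher.call(environment)
      next if stats.nil? # skip environments with no data available

      @store.increment('network_policy_forwards', stats.fetch(:forwards, 0))
      @store.increment('network_policy_drops', stats.fetch(:drops, 0))
    end
  end
end

# Usage with fake per-environment stats in place of real Prometheus queries.
store = InMemoryCounterStore.new
fake_stats = {
  'production' => { forwards: 10, drops: 2 },
  'staging' => { forwards: 5, drops: 1 },
  'review' => nil
}
worker = NetworkPolicyMetricsWorker.new(store: store, fetcher: ->(env) { fake_stats[env] })
worker.perform(fake_stats.keys)

puts store.read('network_policy_forwards')
puts store.read('network_policy_drops')
```

Passing the store and the fetcher in as arguments keeps the sketch testable without a running Redis or Prometheus instance; a real worker would more likely rely on the application's own counter and client classes.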

The Prometheus query was introduced in: !30006 (merged)

The usage data counter was added in: !30005 (merged)

Related to #214029 (closed)

Worker queries:

Clusters::Applications::Prometheus.preload_cluster_platform.with_clusters_with_cilium:

explain SELECT "clusters_applications_prometheus".* FROM "clusters_applications_prometheus" INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_prometheus"."cluster_id" INNER JOIN "clusters_applications_cilium" ON "clusters_applications_cilium"."cluster_id" = "clusters"."id" WHERE "clusters_applications_cilium"."status" IN (3, 5);

Nested Loop  (cost=0.72..49.96 rows=2 width=213) (actual time=0.010..0.010 rows=0 loops=1)
   Buffers: shared hit=1
   ->  Nested Loop  (cost=0.44..47.49 rows=8 width=12) (actual time=0.010..0.010 rows=0 loops=1)
         Buffers: shared hit=1
         ->  Index Scan using index_clusters_applications_cilium_on_cluster_id on public.clusters_applications_cilium  (cost=0.15..31.52 rows=8 width=8) (actual time=0.010..0.010 rows=0 loops=1)
               Filter: (clusters_applications_cilium.status = ANY ('{3,5}'::integer[]))
               Rows Removed by Filter: 0
               Buffers: shared hit=1
         ->  Index Only Scan using clusters_pkey on public.clusters  (cost=0.29..2.00 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
               Index Cond: (clusters.id = clusters_applications_cilium.cluster_id)
               Heap Fetches: 0
   ->  Index Scan using index_clusters_applications_prometheus_on_cluster_id on public.clusters_applications_prometheus  (cost=0.29..0.31 rows=1 width=213) (actual time=0.000..0.000 rows=0 loops=0)
         Index Cond: (clusters_applications_prometheus.cluster_id = clusters.id)

PrometheusService.preload_project.with_clusters_with_cilium:

explain SELECT "services".* FROM "services" INNER JOIN "projects" ON "projects"."id" = "services"."project_id" INNER JOIN "cluster_projects" ON "cluster_projects"."project_id" = "projects"."id" INNER JOIN "clusters" ON "clusters"."id" = "cluster_projects"."cluster_id" INNER JOIN "clusters_applications_cilium" ON "clusters_applications_cilium"."cluster_id" = "clusters"."id" WHERE "services"."type" = 'PrometheusService' AND "clusters_applications_cilium"."status" IN (3, 5);

Nested Loop  (cost=1.72..62.18 rows=1 width=219) (actual time=0.010..0.010 rows=0 loops=1)
   Buffers: shared hit=1
   ->  Nested Loop  (cost=1.16..59.13 rows=5 width=8) (actual time=0.009..0.009 rows=0 loops=1)
         Buffers: shared hit=1
         ->  Nested Loop  (cost=0.73..50.10 rows=5 width=4) (actual time=0.009..0.009 rows=0 loops=1)
               Buffers: shared hit=1
               ->  Nested Loop  (cost=0.44..47.49 rows=8 width=12) (actual time=0.009..0.009 rows=0 loops=1)
                     Buffers: shared hit=1
                     ->  Index Scan using index_clusters_applications_cilium_on_cluster_id on public.clusters_applications_cilium  (cost=0.15..31.52 rows=8 width=8) (actual time=0.009..0.009 rows=0 loops=1)
                           Filter: (clusters_applications_cilium.status = ANY ('{3,5}'::integer[]))
                           Rows Removed by Filter: 0
                           Buffers: shared hit=1
                     ->  Index Only Scan using clusters_pkey on public.clusters  (cost=0.29..2.00 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                           Index Cond: (clusters.id = clusters_applications_cilium.cluster_id)
                           Heap Fetches: 0
               ->  Index Scan using index_cluster_projects_on_cluster_id on public.cluster_projects  (cost=0.29..0.32 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=0)
                     Index Cond: (cluster_projects.cluster_id = clusters.id)
         ->  Index Only Scan using projects_pkey on public.projects  (cost=0.43..1.80 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
               Index Cond: (projects.id = cluster_projects.project_id)
               Heap Fetches: 0
   ->  Index Scan using index_services_on_project_id_and_type on public.services  (cost=0.56..0.60 rows=1 width=219) (actual time=0.000..0.000 rows=0 loops=0)
         Index Cond: ((services.project_id = projects.id) AND ((services.type)::text = 'PrometheusService'::text))
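
Both queries above share the same filter: keep only records whose cluster has a Cilium application with status 3 or 5. The plain-Ruby sketch below mimics that filter in memory for the first query. The struct names and the helper method are hypothetical, and the status codes are copied from the WHERE clause; reading them as "Cilium is installed and usable" is an assumption, not something the plans state.

```ruby
require 'set'

# Status codes copied from the WHERE clause; their meaning is an assumption.
CILIUM_AVAILABLE_STATUSES = [3, 5].freeze

# Hypothetical stand-ins for rows of the two application tables.
CiliumApp = Struct.new(:cluster_id, :status)
PrometheusApp = Struct.new(:id, :cluster_id)

# Approximates the INNER JOIN plus status filter, assuming at most one
# Cilium row per cluster.
def prometheus_with_cilium(prometheus_apps, cilium_apps)
  cluster_ids = cilium_apps
    .select { |app| CILIUM_AVAILABLE_STATUSES.include?(app.status) }
    .map(&:cluster_id)
    .to_set

  prometheus_apps.select { |app| cluster_ids.include?(app.cluster_id) }
end

cilium_apps = [
  CiliumApp.new(1, 3),
  CiliumApp.new(2, 0),
  CiliumApp.new(3, 5)
]
prometheus_apps = [
  PrometheusApp.new(10, 1),
  PrometheusApp.new(11, 2),
  PrometheusApp.new(12, 3),
  PrometheusApp.new(13, 4)
]

matching = prometheus_with_cilium(prometheus_apps, cilium_apps)
puts matching.map(&:id).inspect # prints [10, 12]
```

In the database the same selection is done by the joins, which is why both plans start from the index scan on `clusters_applications_cilium` and only then look up the matching clusters.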

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Arthur Evstifeev