Make Zoekt storage buffer factor dynamic
What does this MR do and why?
Replaces the hardcoded buffer_factor: 3 in PlanningService with an instance-wide dynamic ratio computed from aggregate observed Zoekt disk usage vs source repository size across all ready indices.
On GitLab.com the observed ratio is ~0.5x the source repository size, meaning nodes are significantly over-reserved with the current 3x hardcoded value. This change makes the planner converge toward actual usage patterns over time.
Implementation
- Search::Zoekt::Index.global_buffer_factor aggregates SUM(used_storage_bytes) / SUM(repository_size) across all ready indices joined to root storage statistics, applies a 1.2x safety margin, and falls back to 3.0x when no data is available.
- Result is cached via Rails.cache for 1 hour to avoid expensive repeated aggregation.
- Gated behind zoekt_dynamic_buffer_factor feature flag (gitlab_com_derisk, default off).
- Factor is now visible in gitlab:zoekt:info rake task output.
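
The behaviour described in the list above can be sketched in plain Ruby. This is a minimal illustration, not the code in this MR: TinyCache is a hypothetical stand-in for Rails.cache, and the constant names, the flag_enabled argument, and the load_totals block are assumptions made for the example. Only the 1.2x margin, the 3.0x fallback, and the 1 hour expiry come from the description.

```ruby
# Hypothetical constants; the values are taken from the MR description.
FALLBACK_BUFFER_FACTOR = 3.0 # used when no data is available (and when the flag is off)
SAFETY_MARGIN = 1.2          # multiplier applied to the observed ratio
CACHE_TTL_SECONDS = 3600     # 1 hour

# Tiny in-memory cache with expiry, standing in for Rails.cache.fetch.
class TinyCache
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @store = {}
  end

  # Returns the cached value if it has not expired, otherwise runs the block
  # and stores its result.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @store[key]
    return entry[:value] if entry && now < entry[:expires_at]

    value = yield
    @store[key] = { value: value, expires_at: now + expires_in }
    value
  end
end

# The two arguments correspond to SUM(used_storage_bytes) and SUM(repository_size).
def compute_buffer_factor(total_used_bytes, total_repository_bytes)
  return FALLBACK_BUFFER_FACTOR if total_used_bytes.nil? || total_repository_bytes.nil?
  return FALLBACK_BUFFER_FACTOR unless total_repository_bytes.positive?

  (total_used_bytes.to_f / total_repository_bytes) * SAFETY_MARGIN
end

# load_totals is a block returning [total_used_bytes, total_repository_bytes];
# it is only called on a cache miss.
def buffer_factor(cache:, flag_enabled:, &load_totals)
  return FALLBACK_BUFFER_FACTOR unless flag_enabled

  cache.fetch(:zoekt_global_buffer_factor, expires_in: CACHE_TTL_SECONDS) do
    compute_buffer_factor(*load_totals.call)
  end
end
```

Under these assumptions, an observed ratio of 0.5 gives a factor of 0.5 × 1.2 = 0.6, so a 100 GiB source repository would reserve roughly 60 GiB instead of 300 GiB.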
Database query
https://console.postgres.ai/shared/714b111e-62b0-44e9-87e0-5e56b1c23cec
SELECT
    SUM(per_namespace_stats.total_used)::float / NULLIF (SUM(per_namespace_stats.repository_size), 0) AS ratio
FROM (
    SELECT
        zoekt_enabled_namespaces.root_namespace_id,
        SUM(zoekt_indices.used_storage_bytes) AS total_used,
        MAX(namespace_root_storage_statistics.repository_size) AS repository_size
    FROM
        zoekt_indices
        INNER JOIN zoekt_enabled_namespaces ON zoekt_enabled_namespaces.id = zoekt_indices.zoekt_enabled_namespace_id
        INNER JOIN namespace_root_storage_statistics ON namespace_root_storage_statistics.namespace_id = zoekt_enabled_namespaces.root_namespace_id
    WHERE
        zoekt_indices.state = 10
        AND zoekt_indices.used_storage_bytes > 1024
        AND namespace_root_storage_statistics.repository_size > 0
    GROUP BY
        zoekt_enabled_namespaces.root_namespace_id) per_namespace_stats
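
To illustrate the two-level aggregation in the query above, here is a small in-memory Ruby equivalent. The Row struct, the constant names, and the sample numbers are hypothetical. The sketch assumes each joined row carries the namespace-level repository_size, so a namespace with several indices repeats that value on every row, which is why the inner query takes MAX rather than SUM.

```ruby
# Hypothetical shape of one joined row (one row per Zoekt index).
Row = Struct.new(:root_namespace_id, :state, :used_storage_bytes, :repository_size)

READY_STATE = 10      # state value the query filters on
MIN_USED_BYTES = 1024 # indices at or below this size are ignored

def observed_ratio(rows)
  # WHERE clause
  eligible = rows.select do |r|
    r.state == READY_STATE &&
      r.used_storage_bytes > MIN_USED_BYTES &&
      r.repository_size > 0
  end

  # Inner query: GROUP BY root_namespace_id
  per_namespace = eligible.group_by(&:root_namespace_id).map do |_id, group|
    {
      total_used: group.sum(&:used_storage_bytes),       # SUM across the namespace's indices
      repository_size: group.map(&:repository_size).max  # MAX collapses the repeated value
    }
  end

  # Outer query: ratio of the two sums, nil when the denominator is zero
  total_repository_size = per_namespace.sum { |s| s[:repository_size] }
  return nil if total_repository_size.zero?

  per_namespace.sum { |s| s[:total_used] }.to_f / total_repository_size
end

rows = [
  Row.new(1, 10, 30_000, 100_000), # namespace 1, first index
  Row.new(1, 10, 20_000, 100_000), # namespace 1, second index (same repository_size)
  Row.new(2, 10, 60_000, 100_000), # namespace 2
  Row.new(3, 0, 90_000, 100_000)   # not ready, excluded
]

ratio = observed_ratio(rows)
```

In this sample the used storage sums to 110,000 bytes against 200,000 bytes of repository size. Summing repository_size directly over the joined rows would count namespace 1 twice and understate the ratio.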
MR acceptance checklist
- Tests added
- Feature flag added
- Documentation updated
Closes #592180 (closed)
Edited by Ravi Kumar