Change Request :: Autovacuum tuning
C1
Production Change - Criticality 1
Change Objective | Autovacuum tuning |
---|---|
Change Type | ConfigurationChange, Operation |
Services Impacted | ~"Service:Postgres" |
Change Team Members | @gerardo.herzig |
Change Severity | C1 |
Change Reviewer | @gerardo.herzig |
Tested in staging | |
Dry-run output | |
Due Date | |
Time tracking | 30 mins |
Downtime Component | 2-3 minutes |
Issue https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/5024 concluded that autovacuum is not working properly at peak times. A study of autovacuum effectiveness can be seen here, and the document with the formal proposal can be seen here.
The following changes can be applied via Chef.
This is a three-phase CR. Phase 1 changes the autovacuum-related settings, phase 2 resets individual table-level autovacuum settings, and phase 3 implements a scheduled mechanism (like cron) to execute regular vacuums, to balance
Phase 1: PostgreSQL configuration settings
The following changes should be applied in the postgresql.conf file:
autovacuum_vacuum_scale_factor = 0.010  # previously 0.005
autovacuum_vacuum_cost_limit = 3000     # previously 6000
autovacuum_max_workers = 10             # previously 6
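As a rough illustration of what the scale factor change means in practice: PostgreSQL triggers an autovacuum on a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of rows). The sketch below assumes the default autovacuum_vacuum_threshold of 50 and a hypothetical table of 100 million rows; neither figure comes from this CR, and the function name is made up for the example.

```shell
#!/bin/sh
# Dead-tuple count at which autovacuum triggers for one table:
#   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
# $1 = estimated row count, $2 = scale factor, $3 = base threshold
# (assumed to be the PostgreSQL default of 50 when omitted).
autovacuum_trigger() {
    awk -v n="$1" -v sf="$2" -v base="${3:-50}" \
        'BEGIN { printf "%.0f\n", base + sf * n }'
}

# Hypothetical 100-million-row table, old vs. new scale factor.
autovacuum_trigger 100000000 0.005   # previous setting
autovacuum_trigger 100000000 0.010   # proposed setting
```

For a table of that assumed size, the trigger point moves from roughly 500,000 to roughly 1,000,000 dead tuples, so each autovacuum run starts later but covers more garbage.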
The configuration changes above need a PostgreSQL restart to be applied. To avoid a failover, we can pause Patroni before the restart.
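The pause, restart and resume sequence could look roughly like the sketch below. It assumes that patronictl is on the PATH and can find its configuration without extra flags; the cluster and member names are hypothetical placeholders, and whether the restart goes through patronictl or the service manager is left to the operator. By default the function only prints the commands (DRY_RUN=1), so it can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch only: prints commands unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = Patroni cluster name, $2 = member to restart (both placeholders).
restart_with_patroni_paused() {
    cluster="$1"
    member="$2"
    # Put the cluster into maintenance mode so Patroni does not fail over.
    run patronictl pause "$cluster" || return 1
    # Restart the member so the new postgresql.conf values take effect.
    run patronictl restart --force "$cluster" "$member"
    status=$?
    # Resume cluster management even if the restart failed.
    run patronictl resume "$cluster"
    return "$status"
}

# Example (dry run): restart_with_patroni_paused my-cluster my-member
```

Resuming unconditionally is a deliberate choice here: leaving the cluster paused after a failed restart would silently disable automatic failover.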
Phase 2: Revert specific table-level autovacuum settings
To revert the table-specific settings (so that these tables take their values from the system configuration), apply the following SQL statements:
ALTER TABLE project_mirror_data RESET (autovacuum_vacuum_cost_limit);
ALTER TABLE projects RESET (autovacuum_vacuum_cost_limit);
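To confirm that no per-table overrides remain after the reset, the storage options recorded in pg_class.reloptions can be inspected. The helper below is a minimal sketch that only builds the query text; its name is hypothetical, and piping the result to psql against gitlabhq_production is an assumption about how it would be run.

```shell
#!/bin/sh
# Print a query that lists the storage options still set on the given tables.
# An empty (NULL) reloptions column means the table uses the global settings.
reloptions_query() {
    list=""
    for t in "$@"; do
        # Build a comma-separated list of quoted table names.
        list="${list:+$list, }'$t'"
    done
    printf "SELECT relname, reloptions FROM pg_class WHERE relkind = 'r' AND relname IN (%s);\n" "$list"
}

# Example: reloptions_query project_mirror_data projects | psql -X -d gitlabhq_production
```

Running the same query before the change also records the current overrides, which is useful if Phase 2 ever needs to be rolled back.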
Phase 3: Regular vacuum runs
vacuum_cost_limit = 1500  # previously 200
vacuum_cost_delay = 5ms   # previously 0
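To get a feel for how much these two settings throttle a manual vacuum: the process sleeps for vacuum_cost_delay every time its accumulated cost reaches vacuum_cost_limit, so the cost budget per second is at most vacuum_cost_limit * (1000 / delay in ms). The sketch below turns that into an upper bound on pages read from disk. It assumes vacuum_cost_page_miss = 10 (the default before PostgreSQL 14) and 8 kB pages, and it ignores the time spent doing the actual work, so real throughput will be lower.

```shell
#!/bin/sh
# Upper bound on pages read from disk per second under cost-based delay.
# $1 = vacuum_cost_limit, $2 = vacuum_cost_delay in ms,
# $3 = vacuum_cost_page_miss (assumed to be 10 when omitted).
vacuum_max_miss_pages_per_sec() {
    limit="$1"
    delay_ms="$2"
    miss_cost="${3:-10}"
    if [ "$delay_ms" -eq 0 ]; then
        # A delay of 0 disables cost-based throttling altogether.
        echo "unlimited"
        return 0
    fi
    echo $(( limit * 1000 / delay_ms / miss_cost ))
}

pages=$(vacuum_max_miss_pages_per_sec 1500 5)
echo "$pages pages/s, about $(( pages * 8 / 1024 )) MiB/s with 8 kB pages"
```

With the previous vacuum_cost_delay of 0 there was no throttling at all, so the new values introduce a ceiling rather than raise an existing one.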
The following schedule (shown here as crontab entries) should be implemented via Chef:
crontab -l
0 1 * * * /usr/bin/vacuumdb -z -t ci_stages gitlabhq_production
10 1 * * * /usr/bin/vacuumdb -z -t ci_builds_metadata gitlabhq_production
20 1 * * * /usr/bin/vacuumdb -z -t notes gitlabhq_production
30 1 * * * /usr/bin/vacuumdb -z -t ci_stages gitlabhq_production
40 1 * * * /usr/bin/vacuumdb -z -t merge_request_diffs gitlabhq_production
50 1 * * * /usr/bin/vacuumdb -z -t merge_request_diff_commits gitlabhq_production
0 2 * * * /usr/bin/vacuumdb -z -t merge_request_diff_files gitlabhq_production
10 2 * * * /usr/bin/vacuumdb -z -t project_mirror_data gitlabhq_production
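Since the schedule follows a fixed pattern (one table every 10 minutes starting at 01:00), the entries can be generated from a table list instead of being maintained by hand. This is a minimal sketch with a hypothetical function name; the start time and spacing simply mirror the entries above, and the Chef integration is not shown.

```shell
#!/bin/sh
# Emit one crontab line per table, staggered by 10 minutes from 01:00.
# $1 = database name, remaining arguments = tables to vacuum and analyze.
gen_vacuum_cron() {
    db="$1"
    shift
    minutes=60   # 01:00 expressed as minutes after midnight
    for table in "$@"; do
        printf '%d %d * * * /usr/bin/vacuumdb -z -t %s %s\n' \
            $(( minutes % 60 )) $(( minutes / 60 )) "$table" "$db"
        minutes=$(( minutes + 10 ))
    done
}

gen_vacuum_cron gitlabhq_production ci_stages ci_builds_metadata notes
```

Staggering the start times keeps the runs from beginning simultaneously, although a long vacuum can still overlap with the next one.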
Important metrics to track: The following graphs should be tracked to detect unexpected behaviour after this implementation:
Rollback steps
- Go back to the previous configuration.
- Remove the cron (or similar) entries.