upgrade nonprod logging cluster to 7.8
Production Change - Criticality 3 C3
| Change Component | Description |
|---|---|
| Change Objective | We will perform a rolling upgrade of the nonprod logging cluster from Elasticsearch v7.7.1 to v7.8.0; see https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10628. |
| Change Type | Operation |
| Services Impacted | Elasticsearch |
| Change Team Members | @igorwwwwwwwwwwwwwwwwwwww @mwasilewski-gitlab |
| Change Criticality | C3 |
| Change Reviewer or tested in staging | @hphilipps |
| Dry-run output | n/a |
| Due Date | 13:30 UTC (engineer @ 15:30) |
| Time tracking | 4h |
Detailed steps for the change
- Kick off the upgrade on the gitlab-logs-nonprod deployment in Elastic Cloud UI for Elasticsearch.
- Upgrade APM node.
- Provision additional Kibana node (from 1 => 2).
- Upgrade Kibana node (no downtime is expected here).
- De-provision extra Kibana node (from 2 => 1).
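
Progress of the rolling upgrade can be checked by comparing the version each node reports. The following is a minimal Python sketch, assuming the rows are the parsed JSON from `GET _cat/nodes?h=name,version&format=json`. The helper name `nodes_pending_upgrade` and the sample node names are hypothetical, and fetching the rows (cluster URL, credentials) is deliberately left out.

```python
from typing import Dict, List

TARGET_VERSION = "7.8.0"  # target version taken from the change objective


def nodes_pending_upgrade(
    rows: List[Dict[str, str]], target: str = TARGET_VERSION
) -> List[str]:
    """Return the names of nodes whose reported version differs from the target.

    `rows` is assumed to be the parsed JSON body of
    GET _cat/nodes?h=name,version&format=json
    """
    return sorted(row["name"] for row in rows if row.get("version") != target)


# Hypothetical sample: one node already upgraded, one still on the old version.
sample = [
    {"name": "instance-0000000001", "version": "7.8.0"},
    {"name": "instance-0000000002", "version": "7.7.1"},
]
print(nodes_pending_upgrade(sample))
```

An empty list would indicate that every node reports the target version.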
Rollback steps
Elastic will handle rollbacks during the upgrade itself.
Unfortunately, once the upgrade is complete, rollback is not trivially possible; we will likely need to roll forward.
Monitoring
Key metrics to observe
- Metric: Indexing rate
- ES cluster logs
- Recovery progress
- Location: https://nonprod-log.gitlab.net/app/kibana#/dev_tools/console
- Endpoints:
  - `GET _cat/recovery?v&active_only`
  - `GET _cat/tasks?v`
  - `GET _cat/pending_tasks?v`
  - `GET _cat/thread_pool`
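
To follow recovery progress outside the Kibana console, the `_cat/recovery` output can be summarized per stage. This is a minimal sketch, assuming the rows are parsed from `GET _cat/recovery?active_only&format=json` and carry the `index`, `shard`, `stage` and `bytes_percent` columns. The function name `summarize_recovery` and the sample index names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_recovery(rows: List[Dict[str, str]]) -> Tuple[Dict[str, int], List[str]]:
    """Count shards per recovery stage and list the shards that are not done yet.

    `rows` is assumed to be the parsed JSON body of
    GET _cat/recovery?active_only&format=json
    """
    stages = Counter(row["stage"] for row in rows)
    in_flight = [
        f'{row["index"]}[{row["shard"]}] {row.get("bytes_percent", "?")}'
        for row in rows
        if row["stage"] != "done"
    ]
    return dict(stages), in_flight


# Hypothetical sample rows, shaped like the JSON form of _cat/recovery.
sample_rows = [
    {"index": "logs-a", "shard": "0", "stage": "done", "bytes_percent": "100.0%"},
    {"index": "logs-b", "shard": "1", "stage": "index", "bytes_percent": "42.5%"},
]
stage_counts, pending = summarize_recovery(sample_rows)
print(stage_counts)
print(pending)
```

Running this periodically during the upgrade gives a quick view of how many shards are still recovering after each node restart.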
Summary of infrastructure changes
- Does this change introduce new compute instances?
- Does this change re-size any existing compute instances?
- Does this change introduce any additional usage of tooling like Elasticsearch, CDNs, Cloudflare, etc?
Changes checklist
- Detailed steps and rollback steps have been filled prior to commencing work
- SRE on-call has been informed prior to change being rolled out
- There are currently no active incidents
Edited by Igor