upgrade prod logging cluster to 7.8
Production Change - Criticality 2 C2
| Change Component | Description |
|---|---|
| Change Objective | We will perform a rolling upgrade of the prod logging cluster from Elasticsearch v7.5.1 to v7.8.0; see https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10227. |
| Change Type | Operation |
| Services Impacted | Elasticsearch |
| Change Team Members | @igorwwwwwwwwwwwwwwwwwwww @mwasilewski-gitlab |
| Change Criticality | C2 |
| Change Reviewer | @T4cC0re |
| Tested in staging | #2311 (closed) |
| Dry-run output | n/a |
| Due Date | 2020-06-30 13:00 (engineer @ 15:00) |
| Time tracking | 8h |
Detailed steps for the change
- Kick off the upgrade on the gitlab-logs-prod deployment in Elastic Cloud UI for Elasticsearch.
- Update version number in APM index templates.
- Upgrade APM node.
- Upgrade Kibana node (very short downtime is expected here).
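One way to confirm that the rolling upgrade has reached every Elasticsearch node is to compare the versions reported by `_cat/nodes` against the target. The sketch below is illustrative only and not part of the agreed procedure: the helper name `nodes_not_on_target` is hypothetical, and it assumes the rows come from `GET _cat/nodes?h=name,version&format=json` (run, for example, from the Kibana Dev Tools console).

```python
# Hypothetical helper: list nodes that have not yet reached the target version.
# Input is assumed to be the parsed JSON from
#   GET _cat/nodes?h=name,version&format=json
# which returns one object per node with string fields "name" and "version".

TARGET_VERSION = "7.8.0"  # target version from the change objective


def nodes_not_on_target(rows, target=TARGET_VERSION):
    """Return the names of nodes whose reported version differs from target."""
    return [row["name"] for row in rows if row.get("version") != target]


def upgrade_complete(rows, target=TARGET_VERSION):
    """True only when at least one node is listed and all are on target."""
    return bool(rows) and not nodes_not_on_target(rows, target)
```

While the upgrade is in progress, the list shrinks as nodes are cycled; an empty list suggests the Elasticsearch part of the upgrade has finished and the APM and Kibana steps can follow.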
Rollback steps
Elastic will handle rollbacks during the upgrade itself.
Unfortunately, once the upgrade is complete, rollback is not trivially possible; we will likely need to roll forward.
Monitoring
Key metrics to observe
- Metric: Indexing rate
- ES cluster logs
- Recovery progress
- Location: https://log.gprd.gitlab.net/app/kibana#/dev_tools/console
- Endpoints:
  - `GET _cat/recovery?v&active_only`
  - `GET _cat/tasks?v`
  - `GET _cat/pending_tasks?v`
  - `GET _cat/thread_pool`
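The endpoints above can also be polled from a script rather than the Dev Tools console. The following is a minimal sketch under stated assumptions: `base_url` and `auth_header` are placeholders for an Elasticsearch endpoint and credentials that this issue does not specify, the function names are hypothetical, and `format=json` is added to the `_cat` calls so the output can be parsed.

```python
import json
import urllib.request
from collections import Counter

# The monitoring endpoints from this issue, with format=json added so the
# responses are JSON arrays instead of text tables.
ENDPOINTS = {
    "recovery": "_cat/recovery?active_only=true&format=json",
    "tasks": "_cat/tasks?format=json",
    "pending_tasks": "_cat/pending_tasks?format=json",
    "thread_pool": "_cat/thread_pool?format=json",
}


def build_url(base_url, name):
    """Join the (assumed) Elasticsearch base URL with a named endpoint."""
    return f"{base_url.rstrip('/')}/{ENDPOINTS[name]}"


def fetch_json(base_url, name, auth_header=None):
    """Fetch one endpoint and parse the JSON body. Performs a network call."""
    request = urllib.request.Request(build_url(base_url, name))
    if auth_header:  # e.g. "ApiKey ..." or "Basic ..."; value is an assumption
        request.add_header("Authorization", auth_header)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize_recovery(rows):
    """Count active shard recoveries per stage (index, translog, ...)."""
    return Counter(row.get("stage", "unknown") for row in rows)
```

For example, `summarize_recovery(fetch_json(base_url, "recovery"))` gives a per-stage count of in-flight shard recoveries; an empty result while nodes are still being cycled, or a count that stops changing, would be worth a closer look.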
Summary of infrastructure changes
- Does this change introduce new compute instances?
- Does this change re-size any existing compute instances?
- Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc?
^ It will keep the number of ES nodes the same, but will cycle them as part of the upgrade.
Changes checklist
- Detailed steps and rollback steps have been filled prior to commencing work
- SRE on-call has been informed prior to change being rolled out
- There are currently no active incidents
Edited by Igor