[Meta] Infrastructure stability + scalability

We can’t fix what we can’t see.

Continue to improve monitoring and logging.

Monitoring: track not only per-host metrics but also an overall service response time metric (git/api/web).
- Fix git timings in GitLab git status dashboard (Black box monitoring) (#1154)
- Availability monitoring for the DB
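As a rough illustration of the service-level (black box) measurement described above, here is a minimal Python sketch that times one check per service and records whether it succeeded. The names `probe`, `probe_services`, and `http_check` are hypothetical and not taken from any existing GitLab tooling; the injectable `clock` is only there to make the sketch testable.

```python
import time
import urllib.request
from typing import Callable, Dict


def probe(check: Callable[[], None],
          clock: Callable[[], float] = time.monotonic) -> Dict[str, object]:
    """Run one check and report whether it succeeded and how long it took."""
    start = clock()
    try:
        check()
        ok = True
    except Exception:
        # Any failure counts as "service unavailable" from the outside.
        ok = False
    return {"ok": ok, "seconds": clock() - start}


def probe_services(checks: Dict[str, Callable[[], None]],
                   clock: Callable[[], float] = time.monotonic) -> Dict[str, dict]:
    """Probe every named service (for example git, api, web) in turn."""
    return {name: probe(check, clock) for name, check in checks.items()}


def http_check(url: str, timeout: float = 10.0) -> Callable[[], None]:
    """Build a check that fetches a URL (hypothetical helper for this sketch)."""
    def check() -> None:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read(1)
    return check
```

A caller might pass something like `{"web": http_check("https://gitlab.com")}` and ship the resulting timings to whatever dashboard is in use; the point is that the measurement is taken from outside the hosts rather than from per-host metrics.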
Logging: centralized logging for all services and hosts, including Chef/Rails consoles, so that nobody needs to log into production/staging environments.
- META Logging at GitLab.com (#2225)

Staging: https://gitlab.com/gitlab-com/infrastructure/issues/2751
- Staging environment that mirrors production, except with anonymized database data.
- Quick spin-up/tear-down of staging environments, allowing multiple staging environments to run at once.
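One possible way to anonymize database data for staging is to replace sensitive values with stable placeholders derived from a keyed hash, so that the same input always maps to the same fake value and relations between rows survive. The sketch below assumes this approach; the function names, the `email` column, and the `example.com` placeholder domain are illustrative assumptions, not the method chosen in the linked issue.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List


def anonymize_email(email: str, secret: bytes) -> str:
    """Map an email to a stable placeholder using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret, email.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user-{digest[:12]}@example.com"


def anonymize_rows(rows: Iterable[Dict[str, str]], secret: bytes,
                   column: str = "email") -> List[Dict[str, str]]:
    """Return copies of the rows with the sensitive column replaced."""
    return [{**row, column: anonymize_email(row[column], secret)} for row in rows]
```

Because the mapping is deterministic for a given secret, the same user gets the same placeholder in every table, which keeps joins on that column working in the staging copy.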
Canary deployments:
- Meta - Getting to canary deployment (https://gitlab.com/gitlab-com/infrastructure/issues/1504)
Feature Flags:
- https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/11747
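To make the idea of a feature flag concrete, here is a generic sketch of a percentage-based gate in Python. It is not what the linked merge request implements; `bucket` and `is_enabled` are hypothetical names, and hashing the flag name together with an actor ID is just one common way to get a stable rollout bucket.

```python
import hashlib


def bucket(flag: str, actor_id: str) -> int:
    """Place an actor into a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag}:{actor_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag: str, actor_id: str, percentage: int) -> bool:
    """Enable the flag for roughly `percentage` percent of actors."""
    return bucket(flag, actor_id) < percentage
```

Since an actor's bucket never changes, raising the percentage only adds actors to the enabled group and never removes any, which is also the property a gradual canary-style rollout needs.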

Move CI artifacts to object storage: https://gitlab.com/gitlab-com/infrastructure/issues/2387
- Move the rest of the non-repo storage to object storage.
Separated services: https://gitlab.com/gitlab-com/infrastructure/issues/2458
Geo: multiple regions.

Edited Sep 12, 2017 by Ghost User