Slackline V2
Slackline is a tool that the infrastructure team use for generating useful alertmanager notifications in Slack. The project can be found here: https://gitlab.com/gitlab-com/gl-infra/slackline.
History
The initial version of slackline was developed when:
- GitLab did not run a production GKE cluster
- Google Cloud Functions was beta and only supported the nodejs runtime
These two constraints pushed the implementation towards a nodejs serverless cloud function running in Google Cloud Functions, written in Javascript/nodejs.
Problems with this approach
Javascript is not a widely used language in the infrastructure team. SREs are not familiar with the stack.
Proposal
For the next iteration of this problem, I propose the following changes:
- Replace the nodejs/GCF implementation with a standard Go service
- Replace GCF as the runtime with the existing kubernetes cluster we run in production (ops)
- Switch the Elasticsearch persistence (chosen for ease of access over anything else) with Consul KV Store for atomic updates (for deduplication)
This approach will bring several benefits:
- Improve the ability of SREs within the team to contribute towards the project
- Bring the project in line with other projects within the infrastructure team
- Improve reliability, particularly of duplicates
Edited by AnthonySandoval