GitLab CI/CD service daemon
Description
GitLab CI/CD is currently implemented mostly on the Rails side, with the exception of GitLab Runner, which is necessarily a separate service.
The more features we add to the Rails app, the more difficult it becomes to foresee future performance and availability implications. The implementation is already quite complex (long polling, fair usage, etc.).
One direction for improving future performance, availability, and scalability is to extract most of the complex, domain-specific CI/CD implementation into a separate microservice: the GitLab CI/CD daemon.
The first iteration would move all CI/CD build queuing algorithms to the CI/CD Service.
Later, the GitLab CI/CD Service could also be responsible for other things that are not a priority right now:
- Implementing a more efficient Runner -> GitLab communication protocol
- Implementing GitLab <- Runner communication methods other than long polling, such as WebSockets or another pub/sub mechanism
- Building a bulk build API to offload Rails
- Evenly distributing jobs across the Runner fleet
- Receiving and storing traces using different protocols, such as binary protocols or gRPC
- Implementing various performance improvement techniques
- Offloading the PostgreSQL database by moving computations to concurrent goroutines
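
To make the "evenly distributing jobs" item more concrete, here is a minimal sketch of how an in-memory service might pick the least-loaded runner. Everything in it is hypothetical: the `Runner` struct, its `Running` and `Limit` fields, and the `pickLeastLoaded` and `assign` helpers are illustrative names, not existing GitLab APIs, and a real implementation would need locking and persistence.

```go
package main

import "fmt"

// Runner is a hypothetical in-memory view of a registered runner.
type Runner struct {
	ID      string
	Running int // jobs currently assigned to this runner
	Limit   int // maximum number of concurrent jobs
}

// pickLeastLoaded returns the index of the runner with the lowest
// Running/Limit ratio that still has free capacity, or -1 if every
// runner is full. Ties go to the runner that appears first.
func pickLeastLoaded(runners []Runner) int {
	best := -1
	for i, r := range runners {
		if r.Limit <= 0 || r.Running >= r.Limit {
			continue // no free capacity
		}
		// Compare ratios without floating point:
		// r.Running/r.Limit < b.Running/b.Limit
		if best == -1 || r.Running*runners[best].Limit < runners[best].Running*r.Limit {
			best = i
		}
	}
	return best
}

// assign hands out up to `jobs` jobs one at a time, updating the
// runners' Running counters in place, and returns the runner IDs in
// assignment order. Jobs that do not fit stay unassigned.
func assign(runners []Runner, jobs int) []string {
	var assigned []string
	for n := 0; n < jobs; n++ {
		i := pickLeastLoaded(runners)
		if i == -1 {
			break // fleet is full; remaining jobs stay pending
		}
		runners[i].Running++
		assigned = append(assigned, runners[i].ID)
	}
	return assigned
}

func main() {
	fleet := []Runner{{ID: "a", Limit: 2}, {ID: "b", Limit: 2}}
	fmt.Println(assign(fleet, 5))
}
```

Keeping this state in memory rather than in SQL is the design choice the list above hints at; whether it is the right trade-off would depend on how the service recovers its state after a restart.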
There are numerous benefits that could stem from this work:
- It could considerably improve the performance of build queuing. We currently execute a large, complex SQL query that fills an entire A4 page when pasted into LibreOffice.
- It could vastly reduce the number of pending builds on GitLab.com. At almost any given moment there are around 1000 of them.
- We are currently unable to build some new features because of performance and complexity constraints. Moving this complexity out of Rails and into a Go-based service could make it easier to build features like priority runners and custom queuing rules.
- Our current fair usage rules are very rigid and difficult to customize or configure. A CI/CD queue service could help with that too.
- We have a growing number of planned features and feature requests related to assigning builds to runners in ways other than tags alone, for example per-runner scheduling rules and sticky runners.
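
As an illustration of what configurable queuing rules could look like in such a service, the sketch below combines pluggable matching rules with a simple fair usage ordering. It assumes that a job matches a runner when every tag on the job is present on the runner, and that "fair usage" means preferring the project with the fewest running builds. The `Job`, `Runner`, `Rule`, `tagsMatch`, and `nextJob` names are all hypothetical and do not describe the current Rails implementation.

```go
package main

import "fmt"

// Job is a hypothetical pending build.
type Job struct {
	ID      int
	Project string
	Tags    []string
}

// Runner is a hypothetical runner with a set of tags.
type Runner struct {
	ID   string
	Tags map[string]bool
}

// Rule reports whether a job may run on a runner. Custom scheduling
// rules (sticky runners, per-runner rules, ...) could be added as
// further Rule values without touching the queue logic.
type Rule func(j Job, r Runner) bool

// tagsMatch accepts a job only if the runner has every tag the job asks for.
func tagsMatch(j Job, r Runner) bool {
	for _, t := range j.Tags {
		if !r.Tags[t] {
			return false
		}
	}
	return true
}

// nextJob returns the index of the pending job to hand to runner r,
// or -1 if none qualifies. Among jobs that pass all rules it prefers
// the project with the fewest running builds; ties keep queue order.
func nextJob(pending []Job, running map[string]int, r Runner, rules ...Rule) int {
	best := -1
	for i, j := range pending {
		allowed := true
		for _, rule := range rules {
			if !rule(j, r) {
				allowed = false
				break
			}
		}
		if !allowed {
			continue
		}
		if best == -1 || running[j.Project] < running[pending[best].Project] {
			best = i
		}
	}
	return best
}

func main() {
	pending := []Job{
		{ID: 1, Project: "alpha", Tags: []string{"docker"}},
		{ID: 2, Project: "beta", Tags: []string{"docker"}},
	}
	running := map[string]int{"alpha": 3, "beta": 1}
	r := Runner{ID: "r1", Tags: map[string]bool{"docker": true}}
	if i := nextJob(pending, running, r, tagsMatch); i >= 0 {
		fmt.Println("next job:", pending[i].ID)
	}
}
```

Because rules are plain functions, the fair usage ordering and the matching rules can evolve independently, which is the kind of flexibility the points above ask for.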