Workhorse Route Regex Labels in Prometheus are Awful

At present, Workhorse emits regular expressions as route identifiers in its observability outputs: as OpenMetrics text from its Prometheus scrape endpoint, and as the route field in its JSON logs.

This is awful to work with, since these labels often need to be matched with a regular expression themselves. Matching a regular expression with a regular expression requires every metacharacter in the matched expression to be escaped, leading to some nightmare scenarios.

For example, this monstrosity is an actual PromQL expression used in GitLab's own monitoring:

rate(gitlab_workhorse_http_requests_total{route!="^/-/health$",route!="^/-/(readiness|liveness)$",route!~"\\^/\\.\\+\\\\\\.git/git-receive-pack\\\\z|\\^/\\.\\+\\\\\\.git/git-upload-pack\\\\z|\\^/\\.\\+\\\\\\.git/gitlab-lfs/objects/\\(\\[0-9a-f\\]\\{64\\}\\)/\\(\\[0-9\\]\\+\\)\\\\z|\\^/\\.\\+\\\\\\.git/info/refs\\\\z",route!~"^\\^/api/.*"}[5m])

See gitlab-com/runbooks!7664 (merged) for more details.

It goes without saying: this expression is completely unreadable.

What's more, the current approach leaks the internal abstraction used to match routes (i.e., regular expressions). Slightly modifying a regular expression changes the emitted metrics too, forcing downstream monitoring systems to be adapted. This is a leaky abstraction, which would be better avoided.

Alternatives

For backwards compatibility, we will need to continue to emit the regexp route identifiers for some time, but alongside these, we could add route_id and backend_id fields. The backend_id would indicate the primary backend service for each route and would further simplify monitoring.

route (regexp)                                           route_id          backend_id
---------------------------------------------------------------------------------------
^/-/health$                                              health            self
^/-/(readiness|liveness)$                                liveness          self
^/([^/]+/){1,}[^/]+/uploads\z                            project_uploads   rails
^/-/cable\z                                              cable             rails
^/.+\.git/git-receive-pack\z                             git_receive_pack  gitaly
^/.+\.git/git-upload-pack\z                              git_upload_pack   gitaly
^/.+\.git/gitlab-lfs/objects/([0-9a-f]{64})/([0-9]+)\z   git_lfs_objects   gitaly
^/.+\.git/info/refs\z                                    git_info_refs     gitaly
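
As a rough sketch of how a mapping like the one in the table could be represented, the Go snippet below attaches a route_id and backend_id to each route regexp and emits all three as labels. The type, variable, and function names are hypothetical and do not reflect Workhorse's actual routing code; only a few rows of the table are included.

```go
package main

import (
	"fmt"
	"regexp"
)

// routeEntry pairs a route regexp with the proposed stable identifiers.
// Hypothetical type: names are illustrative, not Workhorse's real code.
type routeEntry struct {
	pattern   *regexp.Regexp
	routeID   string
	backendID string
}

// routeTable holds a subset of the rows from the table above.
var routeTable = []routeEntry{
	{regexp.MustCompile(`^/-/health$`), "health", "self"},
	{regexp.MustCompile(`^/-/(readiness|liveness)$`), "liveness", "self"},
	{regexp.MustCompile(`^/-/cable\z`), "cable", "rails"},
	{regexp.MustCompile(`^/.+\.git/git-receive-pack\z`), "git_receive_pack", "gitaly"},
	{regexp.MustCompile(`^/.+\.git/git-upload-pack\z`), "git_upload_pack", "gitaly"},
}

// labelsFor returns the labels for the first route that matches path.
// The existing regexp label is kept alongside the new identifiers.
func labelsFor(path string) (map[string]string, bool) {
	for _, r := range routeTable {
		if r.pattern.MatchString(path) {
			return map[string]string{
				"route":      r.pattern.String(),
				"route_id":   r.routeID,
				"backend_id": r.backendID,
			}, true
		}
	}
	return nil, false
}

func main() {
	labels, ok := labelsFor("/group/project.git/git-upload-pack")
	fmt.Println(labels["route_id"], labels["backend_id"], ok)
}
```

Because the identifiers live next to the regexp, a later change to the regexp would leave route_id and backend_id untouched, so dashboards and alerts keyed on them would not need to change.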

In Prometheus, cardinality of labels is always a concern, but in this case each new label value would be fully determined by the existing route label, so adding route_id and backend_id would not increase the number of series.

With the new route_id and backend_id, monitoring of Workhorse could be greatly simplified.
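
To give a feel for the simplification, the Go sketch below builds a rate() expression that excludes requests by plain identifier instead of by escaped regexp. It assumes the proposed backend_id label exists on gitlab_workhorse_http_requests_total and that identifier values contain no regexp metacharacters, so no escaping is needed. The helper name is hypothetical, and the /api exclusion from the original query is left out because the table does not list identifiers for those routes.

```go
package main

import (
	"fmt"
	"strings"
)

// excludeRate builds a PromQL rate() expression that drops every series
// whose label equals one of the given values. Values are assumed to be
// plain identifiers, so they can be joined with "|" without escaping.
func excludeRate(metric, label string, values []string, window string) string {
	return fmt.Sprintf(`rate(%s{%s!~"%s"}[%s])`,
		metric, label, strings.Join(values, "|"), window)
}

func main() {
	// Excludes health/liveness checks ("self") and all Git routes ("gitaly").
	fmt.Println(excludeRate(
		"gitlab_workhorse_http_requests_total",
		"backend_id",
		[]string{"self", "gitaly"},
		"5m",
	))
	// rate(gitlab_workhorse_http_requests_total{backend_id!~"self|gitaly"}[5m])
}
```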

cc @cmiskell @reprazent

Edited by Andrew Newdigate