New client_id context metadata
Context metadata helps us understand what's happening in a request.
One blind spot that we still have, though, is around anonymous users.
For authenticated users, we have meta.user. For anonymous users, this field is missing.
Having more insight into these anonymous requests would be extremely helpful, particularly for abuse situations and for user-induced incidents such as https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/9047.
With this in mind, I propose an additional context metadata field, named something like: client_id.
The value of this field should be treated as opaque (i.e., not something that should be parsed, but rather a unique identifier for a client). This identifier would be relatively transient, not long-lived, giving us the flexibility to change it in the future without migration paths.
The (initial) value of the field would be:
- user/userid for authenticated users.
- ip/remote_ip for anonymous users.
Some examples:
- meta.client_id: "user/1"
- meta.client_id: "ip/196.7.0.138"
  - This is an anonymous user, coming in from 196.7.0.138
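The rule above could be sketched roughly as follows. This is only an illustration: the ClientId module and its build method are hypothetical names, not something that exists in labkit-ruby or GitLab today, and it assumes the caller already knows the user ID (if any) and the remote IP.

```ruby
# Hypothetical helper for illustration; not part of labkit-ruby or GitLab.
module ClientId
  module_function

  # user_id:   Integer (or nil for anonymous requests)
  # remote_ip: String (or nil if unknown)
  # Returns an opaque identifier string, or nil if neither input is available.
  def build(user_id:, remote_ip:)
    return "user/#{user_id}" if user_id
    return "ip/#{remote_ip}" if remote_ip

    nil
  end
end

puts ClientId.build(user_id: 1, remote_ip: "196.7.0.138")   # user/1
puts ClientId.build(user_id: nil, remote_ip: "196.7.0.138") # ip/196.7.0.138
```

Consumers would compare or group by the resulting string as a whole, without splitting it on the slash, so the format can change later.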
Why have this?
- Abuse becomes much easier to diagnose
- Anonymous usage becomes easier to understand and track
- We can introduce better rate limiting for all classes of users, without having to add special cases for anonymous and non-anonymous users. In both cases, we can use meta.client_id as our rate limiting key. For an example of where this would be useful, see https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/9047#note_278412462
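To illustrate the rate limiting point, here is a minimal sketch of a limiter keyed purely on the client_id string. ClientRateLimiter is a hypothetical, in-memory, fixed-window example, not an existing GitLab class; a real implementation would presumably use a shared store rather than a Ruby hash.

```ruby
# Hypothetical in-memory fixed-window limiter; illustration only.
class ClientRateLimiter
  def initialize(limit:, period:)
    @limit = limit    # maximum requests allowed per window
    @period = period  # window length in seconds
    @windows = Hash.new { |hash, key| hash[key] = { started_at: nil, count: 0 } }
  end

  # Returns true if the request is allowed for this client_id.
  # `now` is injectable so the behaviour can be tested deterministically.
  def allow?(client_id, now: Time.now.to_f)
    window = @windows[client_id]

    # Start a fresh window on first use or once the current one has expired.
    if window[:started_at].nil? || now - window[:started_at] >= @period
      window[:started_at] = now
      window[:count] = 0
    end

    window[:count] += 1
    window[:count] <= @limit
  end
end

limiter = ClientRateLimiter.new(limit: 2, period: 60)
limiter.allow?("ip/196.7.0.138") # anonymous client
limiter.allow?("user/1")         # authenticated client, same code path
```

The limiter never inspects whether the key starts with user/ or ip/, which is the point: both classes of client share one code path.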
Solution
- Either add client_id into labkit-ruby or lift the barrier to make Labkit context generic.
- Remove all access to labkit context directly.
- Modify Gitlab::ApplicationContext to add client_id there.
(cc @cdewit, @rostrander )