Remove "token" attribute in Runners API

Isn't the token field in the API result the only way to truly correlate with the data available locally on a Runner? The runner config.toml does not contain the integer id, the token is the only truly unique identifier present that can be used to guarantee identification, which is my main complaint that the local config.toml file does NOT include the registered Runner Id

@brett_jacobson Thank you for your comment!

@tmaczukin What do you think about this?

@brett_jacobson On the Runner side you have the gitlab-runner verify command, that allows you to check if the runner entity is still present where it was registered. It uses the token (which is stored in config.toml) and works by authenticating the API request that the verify command is making. config.toml doesn't have the ID from GitLab's database, because it's not needed at all - in all Runner -> GitLab communication Runner is properly and uniquely identified by the API.

Could you elaborate more about the reason why you need the Runner ID on the Runner side? What use cases it helps you resolve?

Maybe there is a valid need that I don't see. But still, in that case we should work on allowing to request such ID through the API using the token authentication (maybe by extending the verify command output to print such ID additionally to the check status) and maybe even storing it in config.toml - as totally unused value, useful only for the user.

Because of how the authentication works right now, not showing the token in the API, after it was created and passed to the Runner during registration, is the only proper and secure way to go.

BTW we've started a discussion about changing how the Runner's API authentication is handled, and one of the ideas also proposes the need for using the Runner ID. But in this case the value would be fully usable and required to be stored in the config.toml

Read the runner api auth discussion; I'd agree the runner token being stored in config.toml is like storing a password in a text file.

As for the "use case", right now its the only way to correlate data; For example, we can query the gitlab api to find out things about runners that the server knows about, and id is their key. However, Prometheus metrics for the runner the only correlatable identifier IS the runner token: gitlab_runner_jobs_total{instance="10.164.57.28:9252",job="windows-gitlab-runners",runner="hyufGzbg"}

Ideally, there would be an id label in the prometheus metrics (but, the runner does not KNOW its id). For now, the "instance" label is uncorrellatable as no gitlab runner API data has that matching value.

I totally agree runner token should be removed from config.toml, for security reasons. But unfortunately, there is a deficiency in prometheus metrics, that will become all but useless if there is no correlateable identifier label. I assume this should spawn an issue.

@brett_jacobson To make it clear - the credentials for Runner's authentication in the GitLab API are and for now will be stored in config.toml. Whether it will be the token, a private key or any other credentials type. Runner must store it somehow to be able to use them. And if someone have access to the Runner host, then this is way bigger security issue (or trust for such person of course!) than only knowing the API.

The problem that we have here is that this token is intended to be used by the Runner only. But it's stored not only in the Runner's configuration file. It's stored also in the DB and with current API version, you can even read it from the API response. The current token is also not easily rotateable - if it was revealed the only way to secure the credentials is to re-register the Runner. And whoever will obtain this token is able to act "as the runner". In many cases it's OK-ish, because access to the API is already limited to people that have access to the project/group/GitLab instance. But what about a simple case when someone notes the value of such token obtained from the API when he is the member of the team and then, for example, leaves the company? Having such token can do anything "impersonating" the Runner, while neither is allowed to anymore nor may have good intentions.

This is why I propose to use an asymmetric keys pair for this - with this only the person who have access to Runner's host (which usually is way more limited than "who is the project maintainer") would know this value. And reading the public key value from GitLab's database (or API response like we have now with the token) would be not a problem - you can't authenticate with a public key. You must have the private one, which is stored only in the config.toml. But, this is the future and it will not happen probably within next few releases. While we need to secure the communication before then.

So, to summary, I propose to go forward with such steps:

We continue to merge this MR. And I don't think we will change our minds about removing the token field from the API.
We should update Runner to store an - unused for now - ID of the Runner in the configuration file during registration. We should also extend the verify command to allow already registered Runners to obtain and save this ID.
Proceed with working on https://gitlab.com/gitlab-org/gitlab-runner/-/issues/25351 in the 13.x series of the Runner, so that we will get both more secured API and better identification of the Runner on the Runner side.

Thanks @tmaczukin. I created gitlab-runner#25441 to request this id be added to the Prometheus metrics, since they'll become uncorrelatable without the runner token label.

mentioned in issue gitlab-runner#25441

@tmaczukin I have another question.

This was suggested from previous MR:

Should we remove it with a feature flag before deleting the code entirely?

It makes sense to have a FF just in case, we may rollback instantly if we need it. However, here we are introducing a breaking change , so I am still not sure about this.

WDYT?

edit: another info about breaking changes: gitlab-com/Product#1127 (comment 333994512)

@tmaczukin This is already stated in the blog post. So I think we are ready to go for this :)

I have a mixed feelings. The goal is to remove this field entirely and as a breaking change in the API I'd say it should be done in the major release. Which means either now (without a FF) or in 14.0 (which is another year of this field exist in some form with all the consequences that it have).

We've already mentioned this incoming breaking change in the previous release blog post, and the 13.0 release is just a finalization. I'd say that with the major release we expect some things to break (that is why we accept this only on major releases, which is done only once each few months). I'm not sure what purpose FF would have in this case. We want people to stop using this field starting with 13.0 and further.

Reading the discussion at gitlab-com/Product#1127 (comment 333994512) I'd say that FF is the way to handle a breaking change that changes the behavior. Not the one that removes something completely

@marin @joshlambert any advises how to proceed with "code removal" breaking change on a major release?

If we can put removals behind a FF that is ideal, as we can communicate as common date to our users on when things will happen. Right now it's hard, as it's based on when the change lands. This means users start to see breaking changes streaming in at various points in the 13.0 dev cycle, which is not ideal.

So if we can go with something like: gitlab-com/Product#1127 (closed) that would be best. Understand that it may be difficult to do now within the timeframe of 13.0, or at all in some cases depending on the change.

For next year though, I think we need to have a much clearer breaking change messaging to our users. We will have far higher paying customers on GitLab.com.

Thank you @tmaczukin and @joshlambert

As I understand, our first option is to put this change behind a FF enabled by default.

So I'll go with this way.

Also, does this issue need to be in the breaking change list of that issue?

mentioned in merge request !31448 (merged)

Remove "token" attribute in Runners API

Designs

Child items ...

Activity

Remove "token" attribute in Runners API

Blocks

Is blocked by

Relates to

Activity