Inconsistency in '*_get_sources' hooks usage between 'none' and 'empty' Git Strategies.
With Add GIT_STRATEGY of "empty" to support clean bu... (!4889 - merged) • Nathan Cain • 17.3 we've added a new Git strategy that Runner now supports - empty
.
The difference between the new empty
and already existing none
is that while none
skips Git sources handling, it keeps the local working directory untouched. That means if someone uses runner that re-uses the job environment, there may still be files and Git repository present from a previous job that was executed with a clone
or fetch
strategy. The empty
strategy, however, removes the local working directory entirely and recreates it without any Git repository nor any other files present at all.
There is, however, another unexpected difference that causes an inconsistency in the execution flow. We have two hooks that can be executed in context of the get_sources
step: pre_get_sources
and post_get_sources
.
For fetch
and clone
strategies the expectation is that if these hooks are specified, the code is injected into the get_sources
script before and/or after the "internal" code that handles git update.
For none
strategy we've didn't include them, however. That's a decision made years ago, and I don't remember if that was a conscious decision (which kind of make sense - with none
strategy there is no sources update so also should be no pre/post execution) or whether it just came like that. But the fact is that for the none
strategy these two hooks are not executed.
That's different with the empty
strategy. As we found through GIT_STRATEGY: empty fails with "fatal: not a gi... (gitlab-org/ci-cd/shared-runners/infrastructure#281 - closed), empty
strategy executes at least the post_get_sources
script. And I'm pretty sure it also executes the pre_get_sources
one.
And that's an inconsistency that may be confusing users. The nature of none
and empty
strategies is very similar. Both assume that no interaction with remote Git repository is done. empty
just enforces a working copy cleanup before the job is started, while none
uses the working copy as-is. But the outcome is the same - we have no Git updates and if the job lands on the totally fresh environment (like in many autoscaling setups) - there are no local files at all.
And at least post_get_sources
script may do something that expects the repository to exists, which was the cause of failures reported at GIT_STRATEGY: empty fails with "fatal: not a gi... (gitlab-org/ci-cd/shared-runners/infrastructure#281 - closed).
I think we should remove that inconsistency. Hooks should be either executed always or should be not executed at all for both none
and empty
. And if we will choose the first approach we should write a clear warning that writing a script for the post_get_sources
hook one should check whether the repository exists (if they want to interact with it), mentioning that the repository may not be present when none
or empty
strategy is used.
I'm marking this as typebug bugfunctional because the inconsistency in this behavior is and will be causing confusion that may lead to problems affecting users. Just like it did in gitlab-org/ci-cd/shared-runners/infrastructure#281 (closed). It was done post factum as the scripting on our Hosted Runners was introduced way before empty
was created... however it was designed with expectation that when no repository exists - because of none
- the hook will be not executed.