Native Step Runner Integration for Docker Executor
What does this MR do?
This MR adds native step-runner integration to the docker executor. "Native integration" here means the docker executor will run steps via the step-runner gRPC service.
The bulk of this code is the Dialer
implementation that tunnels gRPC request through a text-based docker exec
call. This code is absolutely heinous, and mostly involves reconciling the type required by grpc.WithContextDialer()
and the type returned by ContainerExecAttach()
. This code will eventually be moved to a new step-runner-client
project, but it lived here for now.
The rest of this code is plumbing; detecting when native steps integration is enables and adjusting the execution path accordingly to use the new execution code.
Enabling this feature is done via a single feature flag, FF_USE_NATIVE_STEPS
. This flag enables/disables the feature for all executors. We could also do it on a per-executor basis via executor config. I'm amenable to changing the flag name if other's don't like it.
There's also an integration test to run steps jobs. There are a handful of cases there.
Notes:
- I've deliberately kept the changes to the execution path to a minimum to avoid breaking other executors.
- This MR assumes the step-runner binary is in the execution environment
- This MR is best reviewed commit-at-a-time; each commits has lots of explanations.
Why was this MR needed?
To implement
- Create RunRequest payload from CI job (#37413 - closed)
- Execute CI job via Step Runner service - Docker... (#37414 - closed)
What's the best way to test this MR?
There's an integration test in this MR. ALternatively one could run a pipeline like so (see https://gitlab.com/avonbertoldi/test-project/-/tree/step-runner-docker-integration).
Click to expand
stages:
- test
variables:
FF_USE_NATIVE_STEPS: true
script steps:
stage: test
image:
name: registry.gitlab.com/gitlab-org/step-runner:v0
run:
- name: script1
script: pwd
- name: script2
script: env
- name: script3
script: ls -Rlah --ignore .git ../
local steps:
stage: test
image:
name: registry.gitlab.com/gitlab-org/step-runner:v0
run:
- name: ls
step: ./steps/ls
inputs:
opts: -lah
- name: local_echo
step: ./steps/echo
inputs:
message: Oh hello jimmy!?
# This jobs requires `privileged = true` on the executor
# I haven't got this job to pass yet
action step:
stage: test
services:
- docker:24.0.5-dind
variables:
DOCKER_HOST: tcp://docker:2376
image:
name: registry.gitlab.com/gitlab-org/step-runner:v0
run:
- name: action
step: ./steps/action
remote step:
stage: test
image:
name: registry.gitlab.com/gitlab-org/step-runner:v0
run:
- name: remote_hello_world
step: "https://gitlab.com/gitlab-org/ci-cd/runner-tools/echo-step@91141a6e"
inputs:
echo: hello world
What are the relevant issue numbers?
Merge request reports
Activity
changed milestone to %17.5
requested review from @ajwalker, @cam_swords, and @josephburnett
assigned to @avonbertoldi
added sectionci label and removed sectionops label
2 Warnings This merge request is quite big (1016 lines changed), please consider splitting it into multiple merge requests. This merge request includes more than 10 commits. Each commit should meet the following criteria: - Have a well-written commit message.
- Has all tests passing when used on its own (e.g. when using git checkout SHA).
- Can be reverted on its own without also requiring the revert of commit that came before it.
- Is small enough that it can be reviewed in isolation in under 30 minutes or so.
If this merge request contains commits that do not meet this criteria and/or contains intermediate work, please rebase these commits into a smaller number of commits or split this merge request into multiple smaller merge requests.
Reviewer roulette
Changes that require review have been detected! A merge request is normally reviewed by both a reviewer and a maintainer in its primary category and by a maintainer in all other categories.
To spread load more evenly across eligible reviewers, Danger has picked a candidate for each review slot. Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.
To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.
Once you've decided who will review this merge request, mention them as you normally would! Danger does not automatically notify them for you.
Reviewer Maintainer @ajwalker
(UTC+0, 7 hours ahead of author)
@josephburnett
(UTC-8, 1 hour behind author)
If needed, you can retry the
danger-review
job that generated this comment.Generated by
DangerEdited by ****added 50 commits
-
513f3ef4...5b06439e - 37 commits from branch
main
- 5b06439e...08f93821 - 3 earlier commits
- 2c568e12 - Adjust Build.Steps when steps integration enabled
- bdf5ede9 - Dialer implementation that tunnels though docker exec API
- 8712ba2c - Steps.Docker implementation that uses docker tunneling dialer
- 3281c6e5 - Function to create steps RunRequest from Build
- 6e0e2268 - Use alternate exec.Docker implementation on the build container
- eea0403c - Change build container entrypoint/cmd when steps-integration is enabled
- 8d32796f - Test utils to create JobResponse object that uses run keyword
- 47be6c8a - Docker executor integration tests
- 119aedf6 - Go module additions
- 40fc133c - Add and set step-integration feature info
Toggle commit list-
513f3ef4...5b06439e - 37 commits from branch
added 15 commits
-
40fc133c...746cac55 - 2 commits from branch
main
- 746cac55...f4d5ad25 - 3 earlier commits
- 7ea239a0 - Adjust Build.Steps when steps integration enabled
- 8f46f944 - Dialer implementation that tunnels though docker exec API
- 6d5d8d56 - Steps.Docker implementation that uses docker tunneling dialer
- afb9985d - Function to create steps RunRequest from Build
- dde90005 - Use alternate exec.Docker implementation on the build container
- 5eac57c8 - Change build container entrypoint/cmd when steps-integration is enabled
- 6cecffbd - Test utils to create JobResponse object that uses run keyword
- 2bb7e851 - Docker executor integration tests
- 29e14650 - Go module additions
- 22dd2f8a - Add and set step-integration feature info
Toggle commit list-
40fc133c...746cac55 - 2 commits from branch
added 19 commits
-
08793dae...550e383d - 6 commits from branch
main
- 550e383d...d7384f6b - 3 earlier commits
- 5827c60c - Adjust Build.Steps when steps integration enabled
- daf9d547 - Dialer implementation that tunnels though docker exec API
- 1f9fe042 - Steps.Docker implementation that uses docker tunneling dialer
- dd9ed985 - Function to create steps RunRequest from Build
- 271afbc3 - Use alternate exec.Docker implementation on the build container
- 650d7774 - Change build container entrypoint/cmd when steps-integration is enabled
- a168c3c9 - Test utils to create JobResponse object that uses run keyword
- bde517c8 - Docker executor integration tests
- 2130493a - Go module additions
- e92fec06 - Add and set step-integration feature info
Toggle commit list-
08793dae...550e383d - 6 commits from branch
- Resolved by Cameron Swords
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Cameron Swords
@avonbertoldi, during testing, I noticed job variables are merged into environment variables. I haven't looked into whether this is the step-runner or this MR, but I think we should fix this before merging this MR. Thoughts?
The following job:
job: variables: FF_USE_NATIVE_STEPS: true run: - name: echo_env step: ./steps/env
And
./steps/env/step.yml
step:# don't do this on a public machine else secrets will be leaked spec: --- exec: command: [env]
- Resolved by Cameron Swords
- Resolved by Cameron Swords
This is what an error looks like when the step fails to execute:
ERROR: Job failed (system failure): container exec on "d71da9e129c31dc37d3451a6eb3711177170326501b89a74e67887dfa6440720" finished with: executing step request: following job "8081479833": following logs: rpc error: code = Unknown desc = failed to run lazily-evaluated step "run_echo": failed to load: loading file "/builds/cam_swords/test-steps2/step.yml": validating step: jsonschema: '' does not validate with https://gitlab.com/gitlab-org/step-runner/schema/v1/step#/$ref/oneOf/0/required: missing properties: 'name', 'step'
Is there a way we could error without saying
system failure
?
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Cameron Swords
This is a start @avonbertoldi, I'll look at it some more over the coming days.
Edited by Cameron Swords- Resolved by Axel von Bertoldi
Enabling this feature is done via a single feature flag,
FF_USE_NATIVE_STEPS
. This flag enables/disables the feature for all executors. We could also do it on a per-executor basis via executor config. I'm amenable to changing the flag name if other's don't like it.The alternative is feature negotiation with GitLab, which is a mechanism we already have and is per executor. I think the behaviour is that GitLab won't schedule a job to a runner if the necessary feature is not negotiated. We've probably complicated matters by supporting
run
no matter the executor elsewhere though.In general, I don't really like
build
knowing about the executor, because it's tightly coupling. We can probably let it slide because it's not like we're decoupled, but, we are adding to the problem. If we are going to do that though, we might want to adddocker
,docker-windows
anddocker-autoscaler
: they all use the docker executor.
- Resolved by Cameron Swords
- Resolved by Axel von Bertoldi
- Resolved by Cameron Swords
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
Exciting stuff @avonbertoldi! I've done some testing, looking pretty good. The biggest concern is the job vars being available in env, other than that, a better error message would be appreciated. We can create an issue for the error message.
added 21 commits
-
e92fec06...cc9ee87b - 8 commits from branch
main
- cc9ee87b...e8596b5b - 3 earlier commits
- fc2dd0ef - Adjust Build.Steps when steps integration enabled
- 87009432 - Dialer implementation that tunnels though docker exec API
- 04b4c85e - Steps.Docker implementation that uses docker tunneling dialer
- e66eb32e - Function to create steps RunRequest from Build
- 40c38883 - Use alternate exec.Docker implementation on the build container
- 7b341448 - Change build container entrypoint/cmd when steps-integration is enabled
- fd1a0193 - Test utils to create JobResponse object that uses run keyword
- 9c6c427a - Docker executor integration tests
- 10b0bae7 - Go module additions
- b07f061c - Add and set step-integration feature info
Toggle commit list-
e92fec06...cc9ee87b - 8 commits from branch
changed milestone to %17.6
mentioned in merge request step-runner!123 (merged)
added 32 commits
-
b07f061c...3c49d66c - 18 commits from branch
main
- 3c49d66c...84175544 - 4 earlier commits
- 79269f2b - Dialer implementation that tunnels though docker exec API
- df6ae976 - Steps.Docker implementation that uses docker tunneling dialer
- 3fab814b - Function to create steps RunRequest from Build
- 10ca3b02 - Use alternate exec.Docker implementation on the build container
- 3a6d2b77 - Change build container entrypoint/cmd when steps-integration is enabled
- d620bb93 - Test utils to create JobResponse object that uses run keyword
- ee306187 - Docker executor integration tests
- 3816ed14 - Go module additions
- dc624d00 - Add and set step-integration feature info
- fc25f168 - Do no inject job variables into the build container environment
Toggle commit list-
b07f061c...3c49d66c - 18 commits from branch
added 21 commits
-
fc25f168...feebc783 - 4 commits from branch
main
- feebc783...81f29616 - 7 earlier commits
- 54e99df5 - Use alternate exec.Docker implementation on the build container
- 524efd86 - Change build container entrypoint/cmd when steps-integration is enabled
- 7e1173cc - Test utils to create JobResponse object that uses run keyword
- 4cb4e64a - Docker executor integration tests
- e6fc0b09 - Go module additions
- 498f845e - Add and set step-integration feature info
- 2a3cc813 - Do no inject job variables into the build container environment
- 802b6f13 - Add stderrOmitWriter type
- bf8a85bb - Rename stdErrOmitWriter to omitWriter
- 75f7e9f7 - Use omitWriter to capture container exec stderr
Toggle commit list-
fc25f168...feebc783 - 4 commits from branch
added 16 commits
- 75f7e9f7...80cde740 - 6 earlier commits
- 44cc5c4c - Use alternate exec.Docker implementation on the build container
- 9b0a5f88 - Change build container entrypoint/cmd when steps-integration is enabled
- 8330be17 - Test utils to create JobResponse object that uses run keyword
- c4a0a091 - Docker executor integration tests
- 9d810248 - Go module additions
- 040edabd - Add and set step-integration feature info
- 0433ab9f - Do no inject job variables into the build container environment
- 684a0ef8 - Add stderrOmitWriter type
- ff15ecb0 - Rename stdErrOmitWriter to omitWriter
- d0390376 - Use omitWriter to capture container exec stderr
Toggle commit listadded 5 commits
- b1e0063c - Methods to determine if an executor supports native steps execution
- c2803682 - Method to determine if native steps execution was requested
- c2df507d - Method to validate steps request, and determine how to execute it
- edcdcf00 - Use ValidateStepsJobRequest method instead of StepsShim
- fe300f39 - Remove now obsolete code
Toggle commit listadded 8 commits
- 8f79d47d - Add stderrOmitWriter type
- 12f681b6 - Rename stdErrOmitWriter to omitWriter
- 06749dac - Use omitWriter to capture container exec stderr
- 77c64197 - Methods to determine if an executor supports native steps execution
- 8c655920 - Method to determine if native steps execution was requested
- e7faf929 - Method to validate steps request, and determine how to execute it
- f79bb3f8 - Use ValidateStepsJobRequest method instead of StepsShim
- 131d61f5 - Remove now obsolete code
Toggle commit listadded 35 commits
-
131d61f5...f9e2d14d - 13 commits from branch
main
- f9e2d14d...22d9c3ad - 12 earlier commits
- 4ec60c0b - Add and set step-integration feature info
- 223bd929 - Do no inject job variables into the build container environment
- d69afcd2 - Add stderrOmitWriter type
- c29cac8b - Rename stdErrOmitWriter to omitWriter
- 608bce32 - Use omitWriter to capture container exec stderr
- adaec922 - Methods to determine if an executor supports native steps execution
- 34fa8ba1 - Method to determine if native steps execution was requested
- 6b9ea3a2 - Method to validate steps request, and determine how to execute it
- 53663f64 - Use ValidateStepsJobRequest method instead of StepsShim
- 7950f951 - Remove now obsolete code
Toggle commit list-
131d61f5...f9e2d14d - 13 commits from branch
added 12 commits
- fa3703cf...3c86edbc - 2 earlier commits
- 32347808 - Add and set step-integration feature info
- 06a5867f - Do no inject job variables into the build container environment
- d36082c1 - Add stderrOmitWriter type
- a16db829 - Rename stdErrOmitWriter to omitWriter
- 4d8e2c8f - Use omitWriter to capture container exec stderr
- ae9e8f8c - Methods to determine if an executor supports native steps execution
- 4e0e0744 - Method to determine if native steps execution was requested
- 89f1ea7b - Method to validate steps request, and determine how to execute it
- bbb9fbf9 - Use ValidateStepsJobRequest method instead of StepsShim
- fcf24754 - Remove now obsolete code
Toggle commit list- Resolved by Cameron Swords
- Resolved by Cameron Swords
- Resolved by Axel von Bertoldi
- Resolved by Cameron Swords
- Resolved by Axel von Bertoldi
added 27 commits
-
fcf24754...8c8f546e - 10 commits from branch
main
- 8c8f546e...81414597 - 7 earlier commits
- f4426ecb - Steps.Docker implementation that uses docker tunneling dialer
- c1ac3d26 - Function to create steps RunRequest from Build
- 284163eb - Method to validate steps request, and determine how to execute it
- b7fad0fb - Wire it all up
- c45239cf - Remove now obsolete StepShim
- fab1e8e4 - Test utils to create JobResponse object that uses run keyword
- 83583402 - Docker executor integration tests
- 1cbbdc47 - Go module additions
- 6d12e21d - Fix test-compile jobs
- 12a9d8aa - Add job to run steps integration tests
Toggle commit list-
fcf24754...8c8f546e - 10 commits from branch
added 14 commits
- 94b933aa...e514a2fa - 4 earlier commits
- d18d5c3c - Steps.Docker implementation that uses docker tunneling dialer
- 51bc9425 - Function to create steps RunRequest from Build
- a2236137 - Method to validate steps request, and determine how to execute it
- dfc989b0 - Wire it all up
- 4de9767e - Remove now obsolete StepShim
- 3c45e34e - Test utils to create JobResponse object that uses run keyword
- 86a247e0 - Function to skip tests if job variable set
- a59a595b - Docker executor integration tests
- 990e94bc - Go module additions
- d8cf6d1b - Fix test-compile jobs
Toggle commit listadded 29 commits
-
d8cf6d1b...34a4d6ed - 12 commits from branch
main
- 34a4d6ed...2adac980 - 7 earlier commits
- 88845f03 - Steps.Docker implementation that uses docker tunneling dialer
- bac5f83b - Function to create steps RunRequest from Build
- f9c5e6af - Method to validate steps request, and determine how to execute it
- 40510baa - Wire it all up
- a1bfb4c4 - Remove now obsolete StepShim
- c0de782a - Test utils to create JobResponse object that uses run keyword
- 2cc97d4f - Function to skip tests if job variable set
- d7a01220 - Docker executor integration tests
- 5b43b917 - Go module additions
- 2ad95b59 - Fix test-compile jobs
Toggle commit list-
d8cf6d1b...34a4d6ed - 12 commits from branch
aborted automatic add to merge train because the source branch was updated. Learn more.
added 35 commits
-
2ad95b59...118db751 - 17 commits from branch
main
- 118db751...70c790b9 - 8 earlier commits
- 68891153 - Function to create steps RunRequest from Build
- 803d3fd5 - Method to validate steps request, and determine how to execute it
- 6b187ad8 - Wire it all up
- 5da15811 - Remove now obsolete StepShim
- 9b531c9f - Test utils to create JobResponse object that uses run keyword
- a2e82459 - Function to skip tests if job variable set
- 4dc0a26a - Docker executor integration tests
- 6876e5f1 - Go module additions
- ee4df918 - Fix test-compile jobs
- 4b69d66e - Add documentation for the docker executor
Toggle commit list-
2ad95b59...118db751 - 17 commits from branch
@cam_swords I have added one more commit with executor-level documentation.
@rsarangadharan could you review the documentation changes too please
Two suggestions, other than that LGTM @avonbertoldi
Thanks, @avonbertoldi!
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
mentioned in commit step-runner@4b11b136
mentioned in merge request step-runner!138 (merged)
mentioned in merge request !4555 (closed)
added 12 commits
- d99b3cec...8874b7e2 - 2 earlier commits
- fc43f1fa - Function to create steps RunRequest from Build
- 12073c19 - Method to validate steps request, and determine how to execute it
- 386af07f - Wire it all up
- 47dc6a98 - Remove now obsolete StepShim
- 2f472403 - Test utils to create JobResponse object that uses run keyword
- ab820e7c - Function to skip tests if job variable set
- e3319d17 - Docker executor integration tests
- a2a86543 - Go module additions
- 794c76bf - Fix test-compile jobs
- 244d75fb - Add documentation for the docker executor
Toggle commit list- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
@avonbertoldi I've added some suggestions. Thanks!
added 21 commits
-
244d75fb...13b970f2 - 3 commits from branch
main
- 13b970f2...70e18ce8 - 8 earlier commits
- 4192eee6 - Function to create steps RunRequest from Build
- 3d2b4d28 - Method to validate steps request, and determine how to execute it
- 45f3bf18 - Wire it all up
- 429459cc - Remove now obsolete StepShim
- 6abf718f - Test utils to create JobResponse object that uses run keyword
- 37eb581e - Function to skip tests if job variable set
- 30b64e96 - Docker executor integration tests
- 12dfd00e - Go module additions
- 9bc6c36f - Fix test-compile jobs
- e03b3770 - Add documentation for the docker executor
Toggle commit list-
244d75fb...13b970f2 - 3 commits from branch
The integration tests are failing because we have (elsewhere) broken a step used by the integration test (http://gitlab.com/gitlab-org/ci-cd/runner-tools/echo-step)
added 20 commits
-
dbff81c0...f86f5af4 - 2 commits from branch
main
- f86f5af4...f79ee7a3 - 8 earlier commits
- 368d5924 - Function to create steps RunRequest from Build
- f90c363a - Method to validate steps request, and determine how to execute it
- 50bb95a9 - Wire it all up
- 8991ab23 - Remove now obsolete StepShim
- 9ee5a3cc - Test utils to create JobResponse object that uses run keyword
- 805f37eb - Function to skip tests if job variable set
- a0143a56 - Docker executor integration tests
- ce23e5e2 - Go module additions
- c83f0980 - Fix test-compile jobs
- 50ad7704 - Add documentation for the docker executor
Toggle commit list-
dbff81c0...f86f5af4 - 2 commits from branch
added 17 commits
- 50ad7704...34fdd328 - 7 earlier commits
- 1c79dd83 - Function to create steps RunRequest from Build
- a6319b6e - Method to validate steps request, and determine how to execute it
- 7f87cb42 - Wire it all up
- 054243c1 - Remove now obsolete StepShim
- 4f2ee8c5 - Test utils to create JobResponse object that uses run keyword
- 5432f630 - Function to skip tests if job variable set
- 022da934 - Docker executor integration tests
- 88f98baf - Go module additions
- 50ba2f0b - Fix test-compile jobs
- 4a4245bf - Add documentation for the docker executor
Toggle commit listmentioned in commit 6a770dd2
mentioned in merge request !5140 (merged)
mentioned in commit 1307cda7
- Resolved by Axel von Bertoldi
- Resolved by Axel von Bertoldi
@avonbertoldi I left two more suggestions to remove the term currently. I've also suggested adding a blank line at the end of the topic to avoid lint errors. Please fix them when you get some time.
Thanks @rsarangadharan , suggestions incorporated.
added Technical Writing documentation labels
added 39 commits
-
4a4245bf...bfcec06a - 21 commits from branch
main
- bfcec06a...96573820 - 8 earlier commits
- a7f1d29d - Function to create steps RunRequest from Build
- 653e40bf - Method to validate steps request, and determine how to execute it
- 0efe0978 - Wire it all up
- cb6fab9b - Remove now obsolete StepShim
- 4f0d872c - Test utils to create JobResponse object that uses run keyword
- f94b055b - Function to skip tests if job variable set
- 147bfdc5 - Docker executor integration tests
- b02824e4 - Go module additions
- cf19ad61 - Fix test-compile jobs
- 3a9288e6 - Add documentation for the docker executor
Toggle commit list-
4a4245bf...bfcec06a - 21 commits from branch
requested review from @rsarangadharan
aborted automatic add to merge train because the source branch was updated. Learn more.
added 32 commits
-
b6b248fe...c931b6c4 - 14 commits from branch
main
- c931b6c4...2480bdb5 - 8 earlier commits
- 356ffad2 - Function to create steps RunRequest from Build
- 210afbaf - Method to validate steps request, and determine how to execute it
- b17cbfa5 - Wire it all up
- 0b4af8c8 - Remove now obsolete StepShim
- da15326a - Test utils to create JobResponse object that uses run keyword
- 1ecbc35b - Function to skip tests if job variable set
- 261ebe4f - Docker executor integration tests
- 294a589b - Go module additions
- 6101931a - Fix test-compile jobs
- 4f75f6f3 - Add documentation for the docker executor
Toggle commit list-
b6b248fe...c931b6c4 - 14 commits from branch
added this merge request to the merge train at position 2
mentioned in commit bb5fd4f1
mentioned in commit ab47be56
mentioned in commit 25518873
mentioned in commit 974d26c3
mentioned in merge request jointeg/infrastructure/compose-services!113 (merged)