Skip to content
Snippets Groups Projects

Native Step Runner Integration for Docker Executor

Merged Axel von Bertoldi requested to merge avonbertoldi/47414/steps-integration-docker into main
2 unresolved threads

What does this MR do?

This MR adds native step-runner integration to the docker executor. "Native integration" here means the docker executor will run steps via the step-runner gRPC service.

The bulk of this code is the Dialer implementation that tunnels gRPC request through a text-based docker exec call. This code is absolutely heinous, and mostly involves reconciling the type required by grpc.WithContextDialer() and the type returned by ContainerExecAttach(). This code will eventually be moved to a new step-runner-client project, but it lived here for now.

The rest of this code is plumbing; detecting when native steps integration is enables and adjusting the execution path accordingly to use the new execution code.

Enabling this feature is done via a single feature flag, FF_USE_NATIVE_STEPS. This flag enables/disables the feature for all executors. We could also do it on a per-executor basis via executor config. I'm amenable to changing the flag name if other's don't like it.

There's also an integration test to run steps jobs. There are a handful of cases there.

Notes:

  • I've deliberately kept the changes to the execution path to a minimum to avoid breaking other executors.
  • This MR assumes the step-runner binary is in the execution environment
  • This MR is best reviewed commit-at-a-time; each commits has lots of explanations.

Why was this MR needed?

To implement

What's the best way to test this MR?

There's an integration test in this MR. ALternatively one could run a pipeline like so (see https://gitlab.com/avonbertoldi/test-project/-/tree/step-runner-docker-integration).

Click to expand
stages:
  - test

variables:
  FF_USE_NATIVE_STEPS: true

script steps:
  stage: test
  image:
    name: registry.gitlab.com/gitlab-org/step-runner:v0
  run:
    - name: script1
      script: pwd
    - name: script2
      script: env
    - name: script3
      script: ls -Rlah --ignore .git ../

local steps:
  stage: test
  image:
    name: registry.gitlab.com/gitlab-org/step-runner:v0
  run:
    - name: ls
      step: ./steps/ls
      inputs:
        opts: -lah
    - name: local_echo
      step: ./steps/echo
      inputs:
        message: Oh hello jimmy!?

# This jobs requires `privileged = true` on the executor
# I haven't got this job to pass yet
action step:
  stage: test
  services:
    - docker:24.0.5-dind
  variables:
    DOCKER_HOST: tcp://docker:2376
  image:
    name: registry.gitlab.com/gitlab-org/step-runner:v0
  run:
    - name: action
      step: ./steps/action

remote step:
  stage: test
  image:
    name: registry.gitlab.com/gitlab-org/step-runner:v0
  run:
    - name: remote_hello_world
      step: "https://gitlab.com/gitlab-org/ci-cd/runner-tools/echo-step@91141a6e"
      inputs:
        echo: hello world

What are the relevant issue numbers?

Edited by Axel von Bertoldi

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Cameron Swords
    • Resolved by Cameron Swords

      @avonbertoldi, during testing, I noticed job variables are merged into environment variables. I haven't looked into whether this is the step-runner or this MR, but I think we should fix this before merging this MR. Thoughts?

      The following job:

      job:
        variables:
          FF_USE_NATIVE_STEPS: true
        run:
          - name: echo_env
            step: ./steps/env

      And ./steps/env/step.yml step:

      # don't do this on a public machine else secrets will be leaked
      spec:
      ---
      exec:
        command: [env]
    • Resolved by Cameron Swords

      This is what an error looks like when the step fails to execute:

      ERROR: Job failed (system failure): container exec on "d71da9e129c31dc37d3451a6eb3711177170326501b89a74e67887dfa6440720" 
      finished with: executing step request: following job "8081479833": 
      following logs: rpc error: code = Unknown desc = failed to run lazily-evaluated 
      step "run_echo": failed to load: loading file "/builds/cam_swords/test-steps2/step.yml": 
      validating step: jsonschema: '' does not validate with 
      https://gitlab.com/gitlab-org/step-runner/schema/v1/step#/$ref/oneOf/0/required: 
      missing properties: 'name', 'step'

      Is there a way we could error without saying system failure?

  • Cameron Swords
  • This is a start @avonbertoldi, I'll look at it some more over the coming days.

    Edited by Cameron Swords
  • Cameron Swords requested changes

    requested changes

    • Resolved by Axel von Bertoldi

      Enabling this feature is done via a single feature flag, FF_USE_NATIVE_STEPS. This flag enables/disables the feature for all executors. We could also do it on a per-executor basis via executor config. I'm amenable to changing the flag name if other's don't like it.

      The alternative is feature negotiation with GitLab, which is a mechanism we already have and is per executor. I think the behaviour is that GitLab won't schedule a job to a runner if the necessary feature is not negotiated. We've probably complicated matters by supporting run no matter the executor elsewhere though.

      In general, I don't really like build knowing about the executor, because it's tightly coupling. We can probably let it slide because it's not like we're decoupled, but, we are adding to the problem. If we are going to do that though, we might want to add docker, docker-windows and docker-autoscaler: they all use the docker executor.

  • Cameron Swords
  • Cameron Swords
  • added 21 commits

    • e92fec06...cc9ee87b - 8 commits from branch main
    • cc9ee87b...e8596b5b - 3 earlier commits
    • fc2dd0ef - Adjust Build.Steps when steps integration enabled
    • 87009432 - Dialer implementation that tunnels though docker exec API
    • 04b4c85e - Steps.Docker implementation that uses docker tunneling dialer
    • e66eb32e - Function to create steps RunRequest from Build
    • 40c38883 - Use alternate exec.Docker implementation on the build container
    • 7b341448 - Change build container entrypoint/cmd when steps-integration is enabled
    • fd1a0193 - Test utils to create JobResponse object that uses run keyword
    • 9c6c427a - Docker executor integration tests
    • 10b0bae7 - Go module additions
    • b07f061c - Add and set step-integration feature info

    Compare with previous version

  • Axel von Bertoldi changed milestone to %17.6

    changed milestone to %17.6

  • mentioned in merge request step-runner!123 (merged)

  • Cameron Swords requested changes

    requested changes

  • added 32 commits

    • b07f061c...3c49d66c - 18 commits from branch main
    • 3c49d66c...84175544 - 4 earlier commits
    • 79269f2b - Dialer implementation that tunnels though docker exec API
    • df6ae976 - Steps.Docker implementation that uses docker tunneling dialer
    • 3fab814b - Function to create steps RunRequest from Build
    • 10ca3b02 - Use alternate exec.Docker implementation on the build container
    • 3a6d2b77 - Change build container entrypoint/cmd when steps-integration is enabled
    • d620bb93 - Test utils to create JobResponse object that uses run keyword
    • ee306187 - Docker executor integration tests
    • 3816ed14 - Go module additions
    • dc624d00 - Add and set step-integration feature info
    • fc25f168 - Do no inject job variables into the build container environment

    Compare with previous version

  • added 21 commits

    • fc25f168...feebc783 - 4 commits from branch main
    • feebc783...81f29616 - 7 earlier commits
    • 54e99df5 - Use alternate exec.Docker implementation on the build container
    • 524efd86 - Change build container entrypoint/cmd when steps-integration is enabled
    • 7e1173cc - Test utils to create JobResponse object that uses run keyword
    • 4cb4e64a - Docker executor integration tests
    • e6fc0b09 - Go module additions
    • 498f845e - Add and set step-integration feature info
    • 2a3cc813 - Do no inject job variables into the build container environment
    • 802b6f13 - Add stderrOmitWriter type
    • bf8a85bb - Rename stdErrOmitWriter to omitWriter
    • 75f7e9f7 - Use omitWriter to capture container exec stderr

    Compare with previous version

  • added 16 commits

    • 75f7e9f7...80cde740 - 6 earlier commits
    • 44cc5c4c - Use alternate exec.Docker implementation on the build container
    • 9b0a5f88 - Change build container entrypoint/cmd when steps-integration is enabled
    • 8330be17 - Test utils to create JobResponse object that uses run keyword
    • c4a0a091 - Docker executor integration tests
    • 9d810248 - Go module additions
    • 040edabd - Add and set step-integration feature info
    • 0433ab9f - Do no inject job variables into the build container environment
    • 684a0ef8 - Add stderrOmitWriter type
    • ff15ecb0 - Rename stdErrOmitWriter to omitWriter
    • d0390376 - Use omitWriter to capture container exec stderr

    Compare with previous version

  • added 5 commits

    • b1e0063c - Methods to determine if an executor supports native steps execution
    • c2803682 - Method to determine if native steps execution was requested
    • c2df507d - Method to validate steps request, and determine how to execute it
    • edcdcf00 - Use ValidateStepsJobRequest method instead of StepsShim
    • fe300f39 - Remove now obsolete code

    Compare with previous version

  • added 8 commits

    • 8f79d47d - Add stderrOmitWriter type
    • 12f681b6 - Rename stdErrOmitWriter to omitWriter
    • 06749dac - Use omitWriter to capture container exec stderr
    • 77c64197 - Methods to determine if an executor supports native steps execution
    • 8c655920 - Method to determine if native steps execution was requested
    • e7faf929 - Method to validate steps request, and determine how to execute it
    • f79bb3f8 - Use ValidateStepsJobRequest method instead of StepsShim
    • 131d61f5 - Remove now obsolete code

    Compare with previous version

  • added 35 commits

    • 131d61f5...f9e2d14d - 13 commits from branch main
    • f9e2d14d...22d9c3ad - 12 earlier commits
    • 4ec60c0b - Add and set step-integration feature info
    • 223bd929 - Do no inject job variables into the build container environment
    • d69afcd2 - Add stderrOmitWriter type
    • c29cac8b - Rename stdErrOmitWriter to omitWriter
    • 608bce32 - Use omitWriter to capture container exec stderr
    • adaec922 - Methods to determine if an executor supports native steps execution
    • 34fa8ba1 - Method to determine if native steps execution was requested
    • 6b9ea3a2 - Method to validate steps request, and determine how to execute it
    • 53663f64 - Use ValidateStepsJobRequest method instead of StepsShim
    • 7950f951 - Remove now obsolete code

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • added 12 commits

    • fa3703cf...3c86edbc - 2 earlier commits
    • 32347808 - Add and set step-integration feature info
    • 06a5867f - Do no inject job variables into the build container environment
    • d36082c1 - Add stderrOmitWriter type
    • a16db829 - Rename stdErrOmitWriter to omitWriter
    • 4d8e2c8f - Use omitWriter to capture container exec stderr
    • ae9e8f8c - Methods to determine if an executor supports native steps execution
    • 4e0e0744 - Method to determine if native steps execution was requested
    • 89f1ea7b - Method to validate steps request, and determine how to execute it
    • bbb9fbf9 - Use ValidateStepsJobRequest method instead of StepsShim
    • fcf24754 - Remove now obsolete code

    Compare with previous version

  • Cameron Swords
  • Cameron Swords
  • Cameron Swords
  • Cameron Swords approved this merge request

    approved this merge request

  • added 27 commits

    • fcf24754...8c8f546e - 10 commits from branch main
    • 8c8f546e...81414597 - 7 earlier commits
    • f4426ecb - Steps.Docker implementation that uses docker tunneling dialer
    • c1ac3d26 - Function to create steps RunRequest from Build
    • 284163eb - Method to validate steps request, and determine how to execute it
    • b7fad0fb - Wire it all up
    • c45239cf - Remove now obsolete StepShim
    • fab1e8e4 - Test utils to create JobResponse object that uses run keyword
    • 83583402 - Docker executor integration tests
    • 1cbbdc47 - Go module additions
    • 6d12e21d - Fix test-compile jobs
    • 12a9d8aa - Add job to run steps integration tests

    Compare with previous version

  • added 1 commit

    • 2fedc316 - Add job to run steps integration tests

    Compare with previous version

  • added 4 commits

    • 1dd0cab5 - Function to skip tests if job variable set
    • d00f88bf - Docker executor integration tests
    • 1e1b729d - Go module additions
    • 94b933aa - Fix test-compile jobs

    Compare with previous version

  • Axel von Bertoldi resolved all threads

    resolved all threads

  • added 14 commits

    • 94b933aa...e514a2fa - 4 earlier commits
    • d18d5c3c - Steps.Docker implementation that uses docker tunneling dialer
    • 51bc9425 - Function to create steps RunRequest from Build
    • a2236137 - Method to validate steps request, and determine how to execute it
    • dfc989b0 - Wire it all up
    • 4de9767e - Remove now obsolete StepShim
    • 3c45e34e - Test utils to create JobResponse object that uses run keyword
    • 86a247e0 - Function to skip tests if job variable set
    • a59a595b - Docker executor integration tests
    • 990e94bc - Go module additions
    • d8cf6d1b - Fix test-compile jobs

    Compare with previous version

  • added 29 commits

    • d8cf6d1b...34a4d6ed - 12 commits from branch main
    • 34a4d6ed...2adac980 - 7 earlier commits
    • 88845f03 - Steps.Docker implementation that uses docker tunneling dialer
    • bac5f83b - Function to create steps RunRequest from Build
    • f9c5e6af - Method to validate steps request, and determine how to execute it
    • 40510baa - Wire it all up
    • a1bfb4c4 - Remove now obsolete StepShim
    • c0de782a - Test utils to create JobResponse object that uses run keyword
    • 2cc97d4f - Function to skip tests if job variable set
    • d7a01220 - Docker executor integration tests
    • 5b43b917 - Go module additions
    • 2ad95b59 - Fix test-compile jobs

    Compare with previous version

  • Axel von Bertoldi resolved all threads

    resolved all threads

  • Axel von Bertoldi enabled automatic add to merge train when checks pass

    enabled automatic add to merge train when checks pass

  • Axel von Bertoldi aborted automatic add to merge train because the source branch was updated. Learn more.

    aborted automatic add to merge train because the source branch was updated. Learn more.

  • added 35 commits

    Compare with previous version

  • Cameron Swords
  • Cameron Swords
  • added 1 commit

    • d99b3cec - Add documentation for the docker executor

    Compare with previous version

  • mentioned in merge request step-runner!138 (merged)

  • Axel von Bertoldi mentioned in merge request !4555 (closed)

    mentioned in merge request !4555 (closed)

  • added 12 commits

    • d99b3cec...8874b7e2 - 2 earlier commits
    • fc43f1fa - Function to create steps RunRequest from Build
    • 12073c19 - Method to validate steps request, and determine how to execute it
    • 386af07f - Wire it all up
    • 47dc6a98 - Remove now obsolete StepShim
    • 2f472403 - Test utils to create JobResponse object that uses run keyword
    • ab820e7c - Function to skip tests if job variable set
    • e3319d17 - Docker executor integration tests
    • a2a86543 - Go module additions
    • 794c76bf - Fix test-compile jobs
    • 244d75fb - Add documentation for the docker executor

    Compare with previous version

  • @avonbertoldi I've added some suggestions. Thanks!

  • added 21 commits

    Compare with previous version

  • The integration tests are failing because we have (elsewhere) broken a step used by the integration test (http://gitlab.com/gitlab-org/ci-cd/runner-tools/echo-step)

  • added 4 commits

    • b2eee7a6 - Docker executor integration tests
    • abd07bc6 - Go module additions
    • 778e6e40 - Fix test-compile jobs
    • dbff81c0 - Add documentation for the docker executor

    Compare with previous version

  • added 20 commits

    Compare with previous version

  • added 17 commits

    • 50ad7704...34fdd328 - 7 earlier commits
    • 1c79dd83 - Function to create steps RunRequest from Build
    • a6319b6e - Method to validate steps request, and determine how to execute it
    • 7f87cb42 - Wire it all up
    • 054243c1 - Remove now obsolete StepShim
    • 4f2ee8c5 - Test utils to create JobResponse object that uses run keyword
    • 5432f630 - Function to skip tests if job variable set
    • 022da934 - Docker executor integration tests
    • 88f98baf - Go module additions
    • 50ba2f0b - Fix test-compile jobs
    • 4a4245bf - Add documentation for the docker executor

    Compare with previous version

  • mentioned in commit 6a770dd2

  • Axel von Bertoldi mentioned in merge request !5140 (merged)

    mentioned in merge request !5140 (merged)

  • mentioned in commit 1307cda7

  • requested changes

  • added 39 commits

    Compare with previous version

  • Axel von Bertoldi enabled automatic add to merge train when checks pass

    enabled automatic add to merge train when checks pass

  • requested review from @rsarangadharan

  • Roshni Sarangadharan approved this merge request

    approved this merge request

  • Axel von Bertoldi aborted automatic add to merge train because the source branch was updated. Learn more.

    aborted automatic add to merge train because the source branch was updated. Learn more.

  • added 1 commit

    • b6b248fe - Add documentation for the docker executor

    Compare with previous version

  • added 32 commits

    Compare with previous version

  • Romuald Atchadé approved this merge request

    approved this merge request

  • Axel von Bertoldi added this merge request to the merge train at position 2

    added this merge request to the merge train at position 2

  • mentioned in commit bb5fd4f1

  • mentioned in commit ab47be56

  • mentioned in commit 25518873

  • mentioned in commit 974d26c3

  • Please register or sign in to reply
    Loading