Make kubernetes API retries configurable

What does this MR do?

It makes the retry limit for Kubernetes API calls configurable, replacing the limit that was previously hardcoded in a constant.

Why was this MR needed?

We run our jobs in EKS Kubernetes and use Karpenter to scale nodes.

With the default limit, our jobs often fail with the following error: prepare environment: setting up trapping scripts on emptyDir: error dialing backend: remote error: tls: internal error

When we bumped defaultTries to 35, the issue disappeared. We therefore made the limit configurable, keeping the old value as the default so that a higher limit is not forced on everyone.

We have been running this code in production since 6.12.2023 and have successfully processed more than 6000 jobs since then.
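
The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the `RetryLimit` field, the `GetRetryLimit` and `retry` helpers, and the default value of 5 are assumptions made for the example. Only `defaultTries`, `KubernetesConfig`, the nil check, and the value 35 are mentioned in this MR.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultTries stands in for the previously hardcoded limit.
// The value 5 is an assumption for illustration only.
const defaultTries = 5

// errTransient imitates the intermittent failure quoted in the description.
var errTransient = errors.New("remote error: tls: internal error")

// KubernetesConfig is a simplified, hypothetical stand-in for the
// runner's Kubernetes settings.
type KubernetesConfig struct {
	RetryLimit int // hypothetical field name; 0 means "not set"
}

// GetRetryLimit returns the configured limit and falls back to
// defaultTries when the config is nil or the value is not positive.
func (c *KubernetesConfig) GetRetryLimit() int {
	if c == nil || c.RetryLimit <= 0 {
		return defaultTries
	}
	return c.RetryLimit
}

// retry calls fn until it succeeds or limit attempts have been made.
// It returns the number of attempts used and the last error, if any.
func retry(limit int, fn func() error) (int, error) {
	var err error
	for attempt := 1; attempt <= limit; attempt++ {
		if err = fn(); err == nil {
			return attempt, nil
		}
	}
	return limit, fmt.Errorf("giving up after %d attempts: %w", limit, err)
}

func main() {
	cfg := &KubernetesConfig{RetryLimit: 35}

	// Simulate an API call that fails nine times before succeeding.
	calls := 0
	attempts, err := retry(cfg.GetRetryLimit(), func() error {
		calls++
		if calls < 10 {
			return errTransient
		}
		return nil
	})
	fmt.Println(attempts, err) // prints "10 <nil>"
}
```

With the assumed default of 5, the same simulated call would give up before succeeding, which mirrors the failure mode we saw. A nil or unset config falls back to the default, so existing setups keep their current behavior.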

What's the best way to test this MR?

What are the relevant issue numbers?

Activity

  • added 1 commit

    • 8597f6c6 - Change configuration option name

    Compare with previous version

  • added 1 commit

    • 3ad2dc3c - Apply changes suggested from ggeorgiev

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • Georgi N. Georgiev mentioned in merge request !4529 (closed)

  • added 1 commit

    • dad27beb - Format code and add regenerate mockery

    Compare with previous version

  • @m.skibicki, it seems we're waiting on an action from you for approximately two weeks.

    This message was generated automatically. You're welcome to improve it.

  • added 1 commit

    • ac937c55 - Check if the KubernetesConfig instance is nil

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • Michał Skibicki added 202 commits

    • 1ba30613...9000c3c0 - 195 commits from branch gitlab-org:main
    • 2b82c015 - Make kubernetes API retries configurable
    • 023d3914 - Change configuration option name
    • 47f9589a - Apply changes suggested from ggeorgiev
    • f6dec965 - Format code and add regenerate mockery
    • d32825e9 - Check if the KubernetesConfig instance is nil
    • 7e4a55e9 - Fix whitespaces
    • 5af44ad3 - Correct integration tests

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • added 1 commit

    • 39549eb7 - Add new configuration option to docs

    Compare with previous version

  • Michał Skibicki resolved all threads

  • Hi @aqualls! Please review this documentation merge request. This message was generated automatically. You're welcome to improve it.

  • requested review from @aqualls

  • @ggeorgiev_gitlab @aqualls, this Community contribution is ready for review.

    • Do you have capacity and domain expertise to review this? If not, find one or more reviewers and assign to them.
    • If you've reviewed it, add the workflow::in dev label if these changes need more work before the next review.

    This message was generated automatically. You're welcome to improve it.

    • Resolved by Fiona Neill

      @fneill I've left some minor changes here, but I'm not knowledgeable enough about Runner to be a truly helpful reviewer. Sending your way. (Aside: I see multiple style improvements in common/config.go to lines that already exist. Lots of small polish.)

  • Amy Qualls requested review from @fneill and removed review request for @aqualls

  • added 1 commit

    • 79eeaa60 - Optimize function ; correct typo

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • Michał Skibicki resolved all threads

  • Noting that a large GitLab Ultimate customer is currently facing the "remote error: tls: internal error" error in their Karpenter/EKS runner setup when higher numbers of jobs run at once, and that this MR may help them avoid the disruption they are facing (ZD internal link).

  • requested review from @ggeorgiev_gitlab

  • mentioned in issue #37244 (closed)

  • Fiona Neill approved this merge request

  • Thanks @m.skibicki for this contribution. The changes here LGTM, approving from my side.

  • Fiona Neill removed review request for @fneill

  • Fiona Neill resolved all threads

  • Fiona Neill mentioned in issue #37331

  • Georgi N. Georgiev started a merge train

  • Georgi N. Georgiev changed milestone to %16.9

  • mentioned in issue #37349 (closed)

  • mentioned in commit 25aa6fe5

  • @m.skibicki, how was your code review experience with this merge request? Please tell us how we can continue to iterate and improve:

    1. React with a :thumbsup: or a :thumbsdown: on this comment to describe your experience.
    2. Create a new comment starting with @gitlab-bot feedback below, and leave any additional feedback you have for us in the comment.

    Subscribe to the GitLab Community Newsletter for contributor-focused content and opportunities to level up.

    Thanks for your help! :heart:

    This message was generated automatically. You're welcome to improve it.

  • @m.skibicki, congratulations for getting your first MR merged :tada:

    If this is your first MR against a GitLab project, we'd like to invite and encourage you to nominate yourself for the First MR Merged swag prize.

    Thank you again for contributing, what's your next contribution going to be? :thinking:

    This message was generated automatically. You're welcome to improve it.
