Explore xk6-exec POC implementation

📜 Summary

This issue explores the viability of implementing git-ssh performance testing using xk6-exec, based on @john.mcdonnell's recommendation. We have concerns about potential limitations (filesystem I/O under load, scalability constraints, etc.), so this POC focuses on deliberately stress-testing the approach to uncover gotchas and failure modes rather than building a production-ready solution.

🥅 Goal

  • Build a functional POC implementation using xk6-exec
  • Stress-test the approach to identify failure modes and constraints (filesystem bottlenecks, load generator stability, etc.)
  • Document discovered limitations and workarounds to inform next iteration decisions
  • Determine if identified issues are solvable within this approach or require the custom K6 plugin path

🏁 Exit Criteria

  • POC implementation completed
  • Load tests executed specifically to trigger suspected failure modes
  • Findings documented with discovered gotchas, limitations, and potential solutions
  • Clear recommendation on whether to iterate on xk6-exec approach or pivot to custom plugin development

Results

Recommendation

Continue with the xk6-exec approach. The POC successfully demonstrated:

  • Viable integration with GPT testing framework
  • Ability to test both SSH and HTTP protocols
  • No fundamental technical blockers discovered

Analysis

We successfully created a POC script that exercises git clone --depth 1 over both SSH and HTTP using the native git client and compares protocol performance. We ran it at multiple load levels against a GET RA 10k environment.
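A minimal sketch of the kind of script involved, assuming a k6 binary built with the xk6-exec extension; the repository URLs and temp paths below are placeholders, not the actual test targets:

```javascript
// POC sketch: time shallow clones over SSH and HTTP via the native git
// client, using xk6-exec to shell out from k6. Requires a custom k6 build.
import exec from 'k6/x/exec';
import { Trend } from 'k6/metrics';

// Custom trends so min/avg/max clone durations appear in the k6 summary.
const sshCloneTime = new Trend('ssh_clone_duration', true);
const httpCloneTime = new Trend('http_clone_duration', true);

// Placeholder repository URLs -- substitute the real test project.
const SSH_REPO = 'git@gitlab.example.com:group/project.git';
const HTTP_REPO = 'https://gitlab.example.com/group/project.git';

function timedClone(url, dest) {
  const start = Date.now();
  exec.command('git', ['clone', '--depth', '1', url, dest]);
  return Date.now() - start; // duration in ms
}

export default function () {
  // Unique clone target per VU/iteration to avoid path collisions.
  const id = `${__VU}-${__ITER}`;
  sshCloneTime.add(timedClone(SSH_REPO, `/tmp/clone-ssh-${id}`));
  httpCloneTime.add(timedClone(HTTP_REPO, `/tmp/clone-http-${id}`));
  // Clean up so repeated clones do not fill the load generator's disk.
  exec.command('rm', ['-rf', `/tmp/clone-ssh-${id}`, `/tmp/clone-http-${id}`]);
}
```

This requires the k6 runtime (it will not run under Node.js), and each clone blocks its VU for the full duration, which is what drives the VU-ramping behavior discussed below.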

Here is a breakdown of the results:

| run configuration | expected iterations | actual iterations | dropped iterations | maximum VUsers | actual VUsers | run duration | SSH clone min/avg/max (s) | HTTP clone min/avg/max (s) |
|---|---|---|---|---|---|---|---|---|
| 30s_2rps | 30 | 10 | 20 | 10 | 10 | 1.5 min | 12.8 / 31.1 / 37.8 | 23.6 / 30.3 / 35.6 |
| 60s_2rps | 60 | 11 | 49 | 10 | 10 | 1.6 min | 9.3 / 30.0 / 38.5 | 6.0 / 30.1 / 37.6 |
| 60s_10rps | 60 | 28 | 32 | 50 | 28 | 8.2 min | 15.3 / 83.4 / 118.2 | 60.8 / 79.5 / 108.4 |
| 60s_20rps | 60 | 28 | 32 | 100 | 28 | 8.3 min | 13.6 / 84.0 / 111.5 | 63.6 / 84.0 / 111.6 |
| 60s_40rps | 120 | 55 | 65 | 200 | 56 | 18.9 min | 66.3 / 201.4 / 252.4 | 157.9 / 207.0 / 248.0 |
| 60s_80rps | 180 | 83 | 97 | 400 | 84 | 32.6 min | 179.1 / 349.4 / 446.6 | 273.4 / 350.7 / 406.0 |
  • Performance degraded 3x at just 10 RPS (from ~30s to ~80s clone times) and 12x at 80 RPS (to ~350s), far below the infrastructure's rated 200 RPS capacity
  • The number of VUsers never approached the configured maximums because clone operations became so slow (100-400+ seconds) that the 60-second arrival window ended before k6 could ramp additional VUs.
  • None of the runs achieved their expected iteration counts, and therefore none reached their target load levels
  • SSH and HTTP clone performance tracked within 5-10% at all load levels, confirming SSH protocol overhead is NOT the bottleneck
  • Zero SSH connection failures across all tests despite concurrent connections up to 84 VUs, confirming SSH key reuse is not a limiting factor
  • The script ran better than expected and did not hit the SSH connection limits or saturate the test generator disk
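The arrival-window behavior described above comes from k6's arrival-rate executors. A hypothetical scenario config for a run in the style of 60s_10rps (the rate and VU numbers are illustrative, not the exact values used):

```javascript
// Config fragment only: a constant-arrival-rate scenario. k6 drops an
// iteration when no VU is free at its scheduled start, and stops starting
// new iterations once the duration (arrival window) ends -- which is why
// slow clones led to dropped iterations and VU counts below the maximum.
export const options = {
  scenarios: {
    git_clone_load: {
      executor: 'constant-arrival-rate',
      rate: 10,              // iterations started per timeUnit
      timeUnit: '1s',
      duration: '60s',       // the 60-second arrival window
      preAllocatedVUs: 10,   // VUs reserved up front
      maxVUs: 50,            // ceiling k6 ramps toward when iterations run long
    },
  },
};
```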

Tooling findings

  • GPT can be configured to run SSH tests as well as HTTP tests.
  • The xk6-exec module is not included in GPT's k6 implementation, requiring a custom k6 build (one-time setup)
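The one-time custom build uses Grafana's xk6 builder; a sketch of the setup (script filename is a placeholder):

```shell
# Install the xk6 builder, then compile a k6 binary that bundles xk6-exec.
go install go.k6.io/xk6/cmd/xk6@latest
xk6 build --with github.com/grafana/xk6-exec@latest

# xk6 writes a ./k6 binary to the current directory; run scripts with it:
./k6 run git-clone-test.js
```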

Open questions

  • What are our real KPIs for git ssh (response time? byte throughput?)
  • What git commands are of interest?
    • Is a shallow git clone of interest or should we do a full clone?
    • How to conduct a git push with data?
    • How to do a git pull where the data on the server is different than local?
  • Are these clone times (30s baseline, 350s under load) expected for this test environment, or do they indicate a GitLab configuration or infrastructure issue? Is the performance cliff at 10 RPS expected?

Next steps

  • Investigate the root cause of the 10 RPS performance cliff via Grafana metrics (Gitaly queue depth, network bandwidth, Git-specific bottlenecks)
  • Continue iterating on the xk6-exec path
    • Explore how to implement the new k6 build in GPT rather than a one-off implementation
    • Explore how to implement git pull and git push
  • Explore what other metrics we can gather (estimating throughput from clone size and load duration is suspect; can we capture throughput a different way?)
  • Compare the native git results against the API-based testing we do currently