
Spike: Cloudflare Worker POC

We are proposing to create a dedicated edge service layer for Cloud Connector through which all traffic to GitLab-hosted features is routed. The motivation for this is laid out in !132977 (merged).

One promising alternative to writing and deploying a service from scratch is Cloudflare Workers, a serverless platform for deploying application code that:

  • Is auto-scaled through Cloudflare's service infrastructure.
  • Supports any language that compiles to WebAssembly, including Rust.
  • Supports various options for cloud storage, including a key-value store we can use to cache data.
  • Supports a wide range of network protocols, including WebSockets.
  • Provides built-in secrets management.
  • Supports regional deployments.

We should build a POC that demonstrates that all stated goals of the blueprint can be accomplished with this approach.

Key issues to explore

  • What latency overhead should we expect?
  • Can we implement logic such as cryptographic verification?
  • Can we handle requests at the TCP stream level?
  • What is the cost forecast based on current and expected traffic?
  • Which storage options are available to store and enforce request budgets, and what would they cost? (See the sketch after this list.)
  • How does telemetry work, and how would it integrate with our dashboards and alerts?
  • Can we do mTLS between the worker and a GCP load balancer or something similar?
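
As one illustration of the storage question, here is a hypothetical TypeScript sketch of enforcing a per-customer request budget with Workers KV. The BUDGETS binding, header name, key scheme, and limit are assumptions for illustration only, not decisions made in this spike:

// Hypothetical sketch of enforcing a per-customer request budget
// with Workers KV. All names and limits here are illustrative.
interface Env {
  BUDGETS: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Assumed header identifying the customer; "anonymous" as fallback.
    const customer = request.headers.get("X-Gitlab-Customer-Id") ?? "anonymous";
    const key = `budget:${customer}`;

    const used = parseInt((await env.BUDGETS.get(key)) ?? "0", 10);
    if (used >= 1000) {
      return new Response("request budget exhausted", { status: 429 });
    }

    // KV is eventually consistent, so this counter is approximate;
    // strict accounting would require Durable Objects or similar.
    await env.BUDGETS.put(key, String(used + 1), { expirationTtl: 3600 });
    return new Response("ok");
  },
};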

Outcome

We wrote a POC that demonstrates what a Cloud Connector (CC) smart router could look like: https://gitlab.com/mkaeppler/cloud-connector-cloudflare-worker-poc. The POC:

  • Reads a JWT from an HTTP header, decodes and verifies it, and renders a 401 response unless verification succeeds
  • Parses the request URI and routes any request to /ai/* to the AI gateway (both behaviors are sketched below)
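
A minimal TypeScript sketch of that flow, assuming the jose library for JWT verification; the JWKS endpoint and the AI gateway URL are illustrative assumptions, not values taken from the actual POC:

import { createRemoteJWKSet, jwtVerify } from "jose";

// Assumed endpoints for illustration; the real POC may differ.
const AI_GATEWAY_URL = "https://cloud.gitlab.com/ai";
const JWKS = createRemoteJWKSet(new URL("https://gitlab.com/oauth/discovery/keys"));

export default {
  async fetch(request: Request): Promise<Response> {
    // Read the JWT from the request header; reject if absent.
    const token = request.headers.get("X-Gitlab-Token");
    if (!token) return new Response("missing token", { status: 401 });

    try {
      // Decode the JWT and verify its signature against the JWKS.
      await jwtVerify(token, JWKS);
    } catch {
      return new Response("invalid token", { status: 401 });
    }

    // Map /ai/* requests to the AI gateway, preserving the sub-path.
    const url = new URL(request.url);
    if (url.pathname.startsWith("/ai/")) {
      const upstream = AI_GATEWAY_URL + url.pathname.slice("/ai".length);
      return fetch(new Request(upstream, request));
    }
    return new Response("not found", { status: 404 });
  },
};

In the sketch, a 401 is rendered both when the header is missing and when signature verification fails, matching the behavior described above.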

You can invoke this worker as follows:

curl -v -H "X-Gitlab-Token: $(curl -s -H 'Authorization: Bearer <PAT>' -X POST https://gitlab.com/api/v4/code_suggestions/tokens | jq -r '.access_token')" -d '
{
    "prompt_version": 1,
    "current_file": {
      "file_name": "test.py",
      "content_above_cursor": "def is_even(n: int) ->",
      "content_below_cursor": ""
    }
  }' https://cc-gateway.mkaeppler.workers.dev/ai/v2/completions

We found the following pros and cons with this approach:

Pros

  • Very easy and fast to stand something up that works
  • Very easy to run and debug the worker locally (tooling is great)
  • Supports any implementation language that compiles to WASM
  • Provides several cloud storage options that would cover our needs
  • Supports smart placement logic that executes the worker closest to where backends operate (see the configuration sketch after this list)
  • Attractive pricing model; the default pricing model does not charge for wall time
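
Smart Placement is enabled through the worker's wrangler.toml configuration; a minimal sketch, where the worker name and entry point are illustrative assumptions:

# Hypothetical wrangler.toml excerpt; only the [placement] block
# is needed to opt into Smart Placement.
name = "cc-gateway"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[placement]
mode = "smart"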

Cons

  • The V8-based runtime has limitations in the APIs, and hence the libraries, it supports
  • Workers are subject to numerous limits, such as capped memory and CPU time, beyond which requests are discarded
  • Logging and telemetry must either stay within the Cloudflare ecosystem (web dashboard or CLI) or incur additional operational complexity to integrate with our existing stacks (Prometheus/ELK)
  • Opting for a third-party solution means we aren't dogfooding Runway
  • No out-of-the-box solution for staging or canary environments