Make sure that using serverless functions actually scales down to zero

Description

On the https://about.gitlab.com/product/serverless/ page, we claim that:

Functions-as-a-service (FaaS) allows you to write small, discrete units of code with event-based execution. Developers deploy code without worrying about the infrastructure it will run on. Code executes when it's needed so you don't use compute resources while your app is idle. GitLab Serverless allows you to run your own FaaS on any infrastructure without the vendor lock-in of traditional cloud function services.

specifically

Code executes when it's needed so you don't use compute resources while your app is idle.

This is not entirely true.

Knative allows scaling pods down to zero, but you still need a cluster with Knative installed to use it. Users are charged for running a cluster with the Knative components, and the minimum cost on GKE is around $100 per month at the moment.

GKE does not charge users for the "Kubernetes master nodes", but users still need to create at least an n1-standard-4 node to install the Knative components on. From my experiments it appears that the minimal Knative cluster requirements are actually a little bit bigger (at least 2 x n1-standard-4).
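For context, a rough back-of-the-envelope estimate of what such an idle cluster costs (a sketch only: the ~$0.19/hour n1-standard-4 rate and ~30% sustained-use discount are assumptions based on GKE's public pricing at the time and will vary by region):

```python
# Back-of-the-envelope cost of an idle Knative cluster on GKE.
# Assumed prices (approximate, region-dependent, change over time).
HOURLY_RATE_N1_STANDARD_4 = 0.19   # USD per hour, on-demand (assumption)
SUSTAINED_USE_DISCOUNT = 0.30      # ~30% for a full month of usage (assumption)
HOURS_PER_MONTH = 730

def monthly_node_cost(nodes: int) -> float:
    """Monthly cost of keeping `nodes` n1-standard-4 instances running."""
    hourly = HOURLY_RATE_N1_STANDARD_4 * (1 - SUSTAINED_USE_DISCOUNT)
    return nodes * hourly * HOURS_PER_MONTH

print(round(monthly_node_cost(1)))  # ~97 USD/month for a single node
print(round(monthly_node_cost(2)))  # ~194 USD/month for the observed 2-node minimum
```

So even with all function pods scaled to zero, the nodes themselves keep accruing roughly this cost every month.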

Even if we could avoid the problem of users being charged for running bare Knative components while idle, the billing model that GitLab supports is still far from a reasonable one for FaaS.

AWS Lambda example:

Lambda Pricing Details

Lambda counts a request each time it starts executing in response to an event notification or invoke call, including test invokes from the console. You are charged for the total number of requests across all your functions.

Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100ms. The price depends on the amount of memory you allocate to your function.
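For comparison, here is a rough sketch of how that request + duration model adds up. The $0.20 per million requests and $0.0000166667 per GB-second figures are the publicly listed Lambda prices at the time and are used purely for illustration (free tier ignored):

```python
import math

# Illustrative Lambda-style metering: a per-request price plus a per-GB-second
# price, with duration rounded up to the nearest 100 ms.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD (assumed published price)
PRICE_PER_GB_SECOND = 0.0000166667     # USD (assumed published price)

def invocation_cost(duration_ms: float, memory_mb: int) -> float:
    """Cost of a single invocation under duration-based billing."""
    billed_ms = math.ceil(duration_ms / 100) * 100            # round up to 100 ms
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)      # memory x billed time
    return PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 100,000 invocations of a 128 MB function running ~120 ms each:
total = 100_000 * invocation_cost(duration_ms=120, memory_mb=128)
print(f"${total:.2f}")  # ~$0.06 in total
```

A hundred thousand short invocations cost a few cents; that is the gap between a pay-per-execution model and paying $100+/month for an idle cluster.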

We currently do not support that, and it would be really difficult to support something like it unless we manage the Knative cluster on which the functions are executed.

Proposal

Introduce a custom, GitLab-managed Knative cluster and expose it in a similar way to how we expose shared CI/CD Runners.

This way we can:

  1. Make sure that FaaS users get charged for code execution and duration (see the sketch after this list).
  2. Make sure that FaaS is zero cost when not in use.
  3. Increase adoption of this feature and get more users, improving feedback / iteration.
  4. Scale Knative clusters horizontally if needed.
  5. Resolve isolation concerns by creating a node per run (like in the GitLab CI build example).
  6. Eagerly provision more nodes to make sure that the latency is acceptable.
  7. Make sure that it works in the same way on-premises.
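As a rough sketch of item 1, here is what per-invocation metering could look like if we managed the Knative cluster ourselves. The `UsageRecord` fields and the `aggregate_by_namespace` helper are hypothetical names, purely for illustration, and assume we can observe each invocation's duration and memory:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class UsageRecord:
    # Hypothetical per-invocation record emitted by a GitLab-managed
    # Knative cluster; field names are illustrative only.
    namespace: str        # group / namespace the function belongs to
    project: str
    function: str
    duration_ms: int      # wall-clock execution time
    memory_mb: int        # memory requested by the function's pod

def aggregate_by_namespace(records):
    """Sum invocations and GB-seconds per namespace: the numbers needed both
    for billing on GitLab.com and for usage reports on-premises."""
    totals = defaultdict(lambda: {"invocations": 0, "gb_seconds": 0.0})
    for r in records:
        totals[r.namespace]["invocations"] += 1
        totals[r.namespace]["gb_seconds"] += (r.memory_mb / 1024) * (r.duration_ms / 1000)
    return dict(totals)

records = [
    UsageRecord("gitlab-org", "serverless-demo", "hello", 120, 128),
    UsageRecord("gitlab-org", "serverless-demo", "hello", 95, 128),
]
print(aggregate_by_namespace(records))
# {'gitlab-org': {'invocations': 2, 'gb_seconds': 0.026875}}
```

The same aggregation is what would back the per group / namespace / project usage reports mentioned below for on-premises users.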

On-premises users will need to pay for their own cluster, just like they do in the case of GitLab Runners. However, this feature can also bring a lot of value to on-premises users, because we will show usage / metrics per group / namespace / project.

There are some caveats and challenges with this approach, but it might be something we need to do if we want to build something that solves real problems and gets some attention from the community.

Without people using GitLab Serverless we are not going to get feedback, and iteration is going to be difficult. Without iterating properly we might not be able to build a good product. It is hard to justify spending $100+ a month to execute a few functions.

Thoughts @DylanGriffith @danielgruesso @Alexand?
