Image Resizing: [2] Sidecar process-in-WH approach
Idea
We do the resizing in a separate process that stays alive for the whole WH lifecycle and is responsible for image resizing tasks. Workhorse would communicate with this process through some form of IPC (sockets, RPC, ...).
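To make the IPC idea concrete, here is a minimal sketch of what a wire format between workhorse and the sidecar could look like, assuming a hand-rolled length-prefixed framing over a socket. The `ResizeRequest` type and its fields are hypothetical, not an existing workhorse API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ResizeRequest is a hypothetical message workhorse could send to the sidecar.
type ResizeRequest struct {
	Width uint32
	Path  string
}

// encodeRequest frames a request as: width (4 bytes) | path length (4 bytes) | path.
func encodeRequest(r ResizeRequest) []byte {
	buf := make([]byte, 8+len(r.Path))
	binary.BigEndian.PutUint32(buf[0:4], r.Width)
	binary.BigEndian.PutUint32(buf[4:8], uint32(len(r.Path)))
	copy(buf[8:], r.Path)
	return buf
}

// decodeRequest is the sidecar-side counterpart, rejecting malformed frames.
func decodeRequest(buf []byte) (ResizeRequest, error) {
	if len(buf) < 8 {
		return ResizeRequest{}, fmt.Errorf("short message")
	}
	w := binary.BigEndian.Uint32(buf[0:4])
	n := binary.BigEndian.Uint32(buf[4:8])
	if int(n) != len(buf)-8 {
		return ResizeRequest{}, fmt.Errorf("bad length")
	}
	return ResizeRequest{Width: w, Path: string(buf[8:])}, nil
}

func main() {
	msg := encodeRequest(ResizeRequest{Width: 64, Path: "/tmp/avatar.png"})
	req, _ := decodeRequest(msg)
	fmt.Println(req.Width, req.Path)
}
```

In practice we might prefer gRPC or plain HTTP over a UNIX socket rather than a custom framing; the point is only that the contract between the two processes stays small.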
Pros
- Better control over Mem/CPU than the Inbound approach (#230516 (closed))
- There is a chance it may be cheaper in terms of CPU/Mem/overall latency than starting/forking a new process on each request (to be verified)
- Fault isolation from main request serving process
- Can be scaled independently of serving processes
- No new service definition required in Omnibus; it remains an implementation detail of workhorse
- Allows us to abstract away the actual scaling implementation by defining a simple IPC interface; see also Variations (we could decide to start with a simple library-based scaler approach, but later swap it out for a more powerful service like imgproxy)
- Makes it easy to degrade gracefully to the current serving approach if the sidecar falls over, since we could simply serve the original image as before
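The last two pros could be captured in a small abstraction. The sketch below is illustrative only (none of these type names exist in workhorse today): a `Scaler` interface hides whether resizing happens in a library or a sidecar service, and a fallback wrapper implements the graceful degradation to serving the original image.

```go
package main

import (
	"errors"
	"fmt"
)

// Scaler abstracts the resize backend (hypothetical interface).
type Scaler interface {
	Resize(img []byte, width int) ([]byte, error)
}

// libraryScaler stands in for an in-process imaging library (stubbed here).
type libraryScaler struct{}

func (libraryScaler) Resize(img []byte, width int) ([]byte, error) {
	return img, nil // real code would decode, resize, re-encode
}

// withFallback degrades gracefully: if the backend fails, serve the original.
type withFallback struct{ inner Scaler }

func (w withFallback) Resize(img []byte, width int) ([]byte, error) {
	out, err := w.inner.Resize(img, width)
	if err != nil {
		return img, nil // sidecar down: fall back to the original image
	}
	return out, nil
}

// brokenScaler simulates a sidecar that has fallen over.
type brokenScaler struct{}

func (brokenScaler) Resize([]byte, int) ([]byte, error) {
	return nil, errors.New("sidecar unavailable")
}

func main() {
	s := withFallback{inner: brokenScaler{}}
	out, err := s.Resize([]byte("original"), 64)
	fmt.Println(string(out), err) // falls back to the original bytes
}
```

Swapping `libraryScaler` for an imgproxy-backed implementation later would then not touch any calling code.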
Cons
- May be tricky to implement
- Will need health monitoring and strategies to restart if down
- No existing examples in the WH (utils in `/cmd` use another approach)
- Cannot benefit from existing workhorse utilities such as logging, prometheus integration etc. (although most of this can be reinstated via https://gitlab.com/gitlab-org/labkit)
- Higher memory use overall for workhorse nodes
Variations
Variation 1: Sidecar uses custom golang module + imaging library
This is similar to the embedded approach, where a library would resize images in-process, except that it would now run in its own process. Pros: this could be a simple evolution of the embedded approach, since we'd just extract the existing code into a new module and process.
Variation 2: Use stand-alone service
We can use a dedicated scaling service as a sidecar. For instance, imgproxy can bind to a UNIX domain socket so workhorse could talk to it: https://github.com/imgproxy/imgproxy/issues/296
Pros: the image scaling work is already done for us and it should "just work", i.e. faster iteration. Cons: we drag in some baggage we don't need, since workhorse already functions as a proxy. This would also almost certainly mean higher overall memory use for workhorse nodes.
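On the workhorse side, talking HTTP over a UNIX domain socket is straightforward with a custom dialer. The socket path and the request URL below are placeholders, not imgproxy's actual documented interface.

```go
package main

import (
	"context"
	"net"
	"net/http"
)

// unixClient returns an *http.Client that sends every request over the given
// UNIX domain socket instead of TCP.
func unixClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

func main() {
	client := unixClient("/var/run/imgproxy.sock") // hypothetical socket path
	// The host part of the URL is ignored by the custom dialer; only the
	// request path reaches the sidecar. Errors are expected here since no
	// sidecar is actually running.
	_, _ = client.Get("http://sidecar/placeholder")
}
```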
Concerns
For each concern, it is good to have a strategy for how to evolve the solution to address it, along with an estimate of the effort involved.
Concern: Does it fit the WH philosophy? (to ask WH/Infra folks)
Solution: TBD
Concern: Does this fit our Kubernetes roadmap? In terms of scaling out, would it be preferable for services to run in their own pods rather than as a sidecar? If it runs as a sidecar on the same pod, can or should we containerize it?
Security prerequisites
As stated by Jeremy in https://gitlab.com/gitlab-com/gl-security/engineering/-/issues/1043#note_388815325:
- Absolutely avoid processing `svg` files: here is a post mortem of the 3rd party service we are currently using to resize Gitter images. Anyhow, it should not be an issue because `svg` files don't need to be resized.
- Enforce strict limits on inputs:
- file size
- picture size
  - cross-check that the extension matches the file signature
- Sandbox the library that will do the image processing, i.e. don't run it under the same Linux account as Rails/Workhorse/Gitaly.
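The input-limit checks above could be sketched as follows. The allow-list, size cap, and function names are assumptions for illustration; note that `svg` is deliberately absent from the allow-list, and a dimension check would additionally need something like `image.DecodeConfig` before full decoding.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffFormat inspects magic bytes; a hypothetical allow-list covering only
// the formats avatar resizing needs. svg is intentionally not recognised.
func sniffFormat(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte("\x89PNG\r\n\x1a\n")):
		return "png"
	case bytes.HasPrefix(data, []byte{0xFF, 0xD8, 0xFF}):
		return "jpeg"
	default:
		return ""
	}
}

const maxFileSize = 1 << 20 // 1 MiB: hypothetical limit

// validate enforces the limits before any bytes reach the resizer: size cap,
// format allow-list, and extension-vs-signature cross-check.
func validate(data []byte, claimedExt string) error {
	if len(data) > maxFileSize {
		return fmt.Errorf("file too large: %d bytes", len(data))
	}
	format := sniffFormat(data)
	if format == "" {
		return fmt.Errorf("unsupported or unrecognised format")
	}
	if format != claimedExt {
		return fmt.Errorf("extension %q does not match signature %q", claimedExt, format)
	}
	return nil
}

func main() {
	jpeg := []byte{0xFF, 0xD8, 0xFF, 0xE0}
	fmt.Println(validate(jpeg, "jpeg")) // accepted
	fmt.Println(validate(jpeg, "png"))  // rejected: extension mismatch
}
```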