
Image Resizing: [3] Spawn separate process on request approach

Idea

On each resizing request, we spawn/fork a separate process which would be responsible for the resizing.

Pros

  • Probably the most granular control over memory/CPU: we could set a limit for each individual process and control the rate at which we spawn/fork new ones.
    • mk: let's clarify this (don't understand)
  • Would not require a long-lived process like in the Sidecar approach (#230517 (closed))
    • mk: how is that a pro though?
  • We seem to do something similar when executing our /cmd utils (to be verified)
    • mk: you mean for consistency reasons?
  • Failures are isolated from the main serving process
  • Easy to evolve into from the embedded approach
  • Can rely on existing tools to do the heavy lifting (e.g. ImageMagick)

Cons

  • May be expensive in terms of resources (latency/CPU/Mem hit to be measured)
  • Memory thrashing: the same program data has to be paged in and out constantly
  • Zombies: since we would be running hundreds of thousands of these processes every hour, some of them may get stuck or escape the parent PID (e.g. via a double fork), so that we can no longer reap them

Concerns

For each concern, it is good to have a strategy for evolving the solution to address it, together with an estimate of the effort involved.

Concern: Does it fit the Workhorse (WH) philosophy? (to ask WH/Infra folks)
Solution: TBD

Security prerequisites

As stated by Jeremy in https://gitlab.com/gitlab-com/gl-security/engineering/-/issues/1043#note_388815325:

  1. Absolutely avoid processing SVG files: here is a post mortem of the 3rd party service we are currently using to resize Gitter images. In any case, this should not be an issue because SVG files don't need to be resized.
  2. Enforce strict limits on inputs:
    • file size
    • picture size
    • cross-check that the extension matches the file signature
  3. Sandbox the library that will do the image processing, i.e. don't run it under the same Linux account as Rails/Workhorse/Gitaly.
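The prerequisites above could translate into checks along these lines. This is only a sketch under stated assumptions: the limits `maxFileBytes` and `maxSide`, the allowed extension list, and the helper names `validateImage` and `sandboxedCommand` are all hypothetical, and the UID/GID of the dedicated account would come from configuration. Setting a different credential also requires the parent to have the privileges to switch users.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	_ "image/jpeg" // registers the JPEG decoder
	_ "image/png"  // registers the PNG decoder
	"os/exec"
	"path/filepath"
	"strings"
	"syscall"
)

const (
	maxFileBytes = 10 << 20 // hypothetical file size limit
	maxSide      = 8000     // hypothetical limit on width and height in pixels
)

// allowedFormats maps accepted extensions to the format name that
// image.DecodeConfig reports. SVG is intentionally absent.
var allowedFormats = map[string]string{".png": "png", ".jpg": "jpeg", ".jpeg": "jpeg"}

// validateImage enforces the input limits before any resizing happens.
func validateImage(filename string, data []byte) error {
	if len(data) > maxFileBytes {
		return fmt.Errorf("file too large: %d bytes", len(data))
	}
	want, ok := allowedFormats[strings.ToLower(filepath.Ext(filename))]
	if !ok {
		return fmt.Errorf("extension of %q is not allowed", filename)
	}
	// DecodeConfig reads only the header: it detects the format from the
	// file signature and returns the dimensions without decoding pixels.
	cfg, format, err := image.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return fmt.Errorf("unrecognised image data: %w", err)
	}
	if format != want {
		return fmt.Errorf("extension says %s but signature says %s", want, format)
	}
	if cfg.Width > maxSide || cfg.Height > maxSide {
		return fmt.Errorf("picture too large: %dx%d", cfg.Width, cfg.Height)
	}
	return nil
}

// sandboxedCommand prepares (but does not start) a resizer command that
// would run under a dedicated account instead of the caller's account.
func sandboxedCommand(uid, gid uint32, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Credential: &syscall.Credential{Uid: uid, Gid: gid},
	}
	return cmd
}
```

Running the validation in the parent keeps malformed or oversized inputs from ever reaching the resizer process.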

Notes

Edited by Aleksei Lipniagov