Create data flow documentation

Description

There seems to be no single source of truth for when, how, and what data moves into or out of Runner. Without this documentation, it's difficult and time-consuming to:

Evaluate Runner from a compliance and security perspective
Perform risk assessments
Train new employees (particularly security and compliance) on Runner

Having this documentation would help with all of the above. Currently, on the compliance side, evaluating FIPS 140-2 compliance requires understanding the data flow into and out of Runner and the protocol specifics. We need to know exactly what data goes in and out of Runner, and how (e.g., TLS, SSH). Without the proposed documentation, gathering this information is extremely time-consuming and error-prone (gaps, mistakes, etc.).

In the short term, this documentation would help drive the FIPS 140-2 compliance evaluation and make it easier to teach employees how Runner works (for example, myself). In the long term, it would also help the security team understand the movement of data and the protocols used. It would also give the community more insight into how their data is handled, for their own security, compliance, and risk considerations.

Proposal

Create documentation showing the following for every data ingress to and egress from Runner:

Where the data is coming from and going to
What data is being sent
Protocol(s) used

For example (these are made up, not real situations):

Runner sends build information to B via HTTPS (TLS).
B sends build information to Runner via HTTPS (TLS). Runner forwards that information to C via SSH.
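
One possible way to keep such entries consistent is to record each flow as a structured record with the three proposed fields (where from and to, what data, which protocols). The following is only a sketch under assumptions: the `DataFlow` class, the `flows_using` helper, and the entries themselves are hypothetical and reuse the made-up examples above; they do not describe real Runner behavior.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataFlow:
    """One ingress or egress entry (hypothetical structure, not a Runner API)."""
    source: str                  # where the data comes from
    destination: str             # where the data goes to
    data: str                    # what data is being sent
    protocols: Tuple[str, ...]   # protocol(s) used

    def describe(self) -> str:
        # Render the entry as a one-line sentence for the documentation.
        protocols = ", ".join(self.protocols)
        return f"{self.source} sends {self.data} to {self.destination} via {protocols}."


# Made-up entries mirroring the examples above; not real situations.
FLOWS = [
    DataFlow("Runner", "B", "build information", ("HTTPS (TLS)",)),
    DataFlow("B", "Runner", "build information", ("HTTPS (TLS)",)),
    DataFlow("Runner", "C", "build information", ("SSH",)),
]


def flows_using(protocol: str, flows: List[DataFlow]) -> List[DataFlow]:
    """Return the flows whose protocol list mentions the given protocol name."""
    return [f for f in flows if any(protocol in p for p in f.protocols)]


if __name__ == "__main__":
    for flow in FLOWS:
        print(flow.describe())
```

A record like this would let a compliance reviewer filter by protocol, for example `flows_using("SSH", FLOWS)`, instead of reading through free-form prose.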

Links to related issues and merge requests / references

https://gitlab.com/gitlab-org/gitlab-ce/issues/41463