WIP: Use BuildMetadata to store build configuration in OLD/YAML serialized form
What does this MR do?
This is my initial work on introducing the Ci::BuildConfig
model, which should be used for presenting all CI-job-runner-oriented options.
Ideally, we should move everything that is crucial only for CI job processing to this class:
- all variables,
- all helper generation, and so on.
Why re-use serialization?
This is the second, simpler iteration after: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/21450. The other MR uses serialization, but it became too complex. I took another approach that allows us to be more iterative and reduces the risk of such a change. This is pretty much a no-risk change, as it touches the implementation in a very specific and targeted way.
The general outline of the plan:
- let us introduce the base model now,
- let us hide usage of the new model behind a feature flag: we can confidently test it on a large-scale system and reduce the impact,
- let us migrate all logic in the next iterations,
- let us add columns next, so that the new model no longer uses serialization.
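The feature-flag step of this plan could look roughly like the sketch below. It is plain Ruby rather than the real Rails models: `FeatureFlags`, `BuildSketch`, `BuildMetadataSketch`, and the `:build_metadata_config` flag name are all hypothetical stand-ins, used only to illustrate writing to the new model when the flag is on while still reading old data.

```ruby
require 'yaml'

# Hypothetical stand-in for a feature-flag check (not GitLab's Feature API).
module FeatureFlags
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled.delete(name)
  end

  def self.enabled?(name)
    @enabled.key?(name)
  end
end

# Hypothetical stand-in for a metadata row that keeps the job options
# in YAML-serialized form.
class BuildMetadataSketch
  attr_reader :serialized_options

  def options=(value)
    @serialized_options = YAML.dump(value)
  end

  def options
    @serialized_options && YAML.safe_load(@serialized_options)
  end
end

# Hypothetical stand-in for a build that can store options in either place.
class BuildSketch
  def initialize
    @legacy_options = nil
    @metadata = BuildMetadataSketch.new
  end

  # Write to the new model only when the flag is enabled.
  def options=(value)
    if FeatureFlags.enabled?(:build_metadata_config)
      @metadata.options = value
    else
      @legacy_options = value
    end
  end

  # Prefer the new model, fall back to the old storage for existing builds.
  def options
    @metadata.options || @legacy_options
  end

  def stored_in_metadata?
    !@metadata.serialized_options.nil?
  end
end
```

Reading falls back to the old storage, so disabling the flag again does not hide data written by either path.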
I consider this a minimal change, because:
- it allows us to introduce a life-cycle for the data: this model is disposable, and we should be able to remove it aggressively (after 1-3 months),
- it allows us to add additional columns to store new data in a new form,
- it allows us to not migrate data, but rather assume that after "3 months" we no longer care about the existing data,
- it lets us move away from serialization by adding explicit columns next, in a backward-compatible way.
Next steps
The next step will be:
- we are going to add soft-archiving: all builds older than 3 months will no longer be retryable or playable: https://gitlab.com/gitlab-org/gitlab-ce/issues/50939
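As a rough illustration of the soft-archiving rule, the sketch below treats a job as archived once it is older than a cut-off and then refuses retry and play. `JobSketch`, `ARCHIVE_AFTER_SECONDS`, the 90-day approximation of "3 months", and the status names are assumptions made for this example, not the planned implementation.

```ruby
# Hypothetical cut-off: "3 months" approximated as 90 days, in seconds.
ARCHIVE_AFTER_SECONDS = 90 * 24 * 60 * 60

# Hypothetical stand-in for a CI job with a status and a creation time.
JobSketch = Struct.new(:status, :created_at) do
  # A job is soft-archived once it is older than the cut-off.
  def archived?(now = Time.now)
    now - created_at > ARCHIVE_AFTER_SECONDS
  end

  # Archived jobs can never be retried, whatever their status.
  def retryable?(now = Time.now)
    !archived?(now) && %w[success failed canceled].include?(status)
  end

  # Archived manual jobs can no longer be played.
  def playable?(now = Time.now)
    !archived?(now) && status == 'manual'
  end
end
```

The data itself stays in place; only the actions that would need the full build configuration are switched off.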
The state of different models
ci_builds
- we use that for long-term, frequently updated data. Each row in this table can be updated 10-20 times, as it holds various relations and information about who performed an action on the subject and when,
ci_build_metadata
- we use that to store mid-term data that is sometimes updated and contains very specific information for jobs. Right now it is the timeout, but next would be: features available on the runner, runner version, and the executor/shell/platform/system used. Each row of this table can be updated 3-4 times during the lifecycle of a build. Generally: 1. created once, 2. updated when the job is picked by a runner, 3. updated when the job is marked as finished.
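The three-write lifecycle described for ci_build_metadata can be pictured with the following sketch. `MetadataRowSketch` and its `writes` counter are hypothetical, and so is the `finished_at` field: the description does not say what is stored when a job finishes, so that field is only a placeholder.

```ruby
# Hypothetical stand-in for one ci_build_metadata row; counts its writes.
class MetadataRowSketch
  attr_reader :timeout, :runner_version, :executor, :finished_at, :writes

  # 1. Created once, with the data known up front (currently the timeout).
  def initialize(timeout:)
    @timeout = timeout
    @writes = 1
  end

  # 2. Updated when the job is picked by a runner.
  def picked_by_runner(runner_version:, executor:)
    @runner_version = runner_version
    @executor = executor
    @writes += 1
  end

  # 3. Updated when the job is marked as finished (placeholder field).
  def finished(at: Time.now)
    @finished_at = at
    @writes += 1
  end
end
```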
What are the relevant issue numbers?
Relates to https://gitlab.com/gitlab-org/gitlab-ce/issues/50195
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Documentation created/updated
- Tests added for this feature/bug
- Conforms to the code review guidelines
- Conforms to the merge request performance guidelines
- Conforms to the style guides
- Conforms to the database guides