Implement "ManagedSQL" as a new SGScripts CRD referenced by SGCluster
We want more and more to offer a fully-featured GitOps approach to managing Postgres databases.
One such areas that are not fully covered is managing a database DDL, or even more generic, managing a database with SQL. While you'd normally interact with a database with a CLI or GUI client, having the ability to programmatically maintain database DDL with a CRD is a well justified use-case. The current ability to execute operations by a component of the operator make this system very powerful for programmatic creation and update of DDL.
This issue aims to capture this and define a new CRD, called SGScripts
, to be added to StackGres' set of CRDs. This would allow to "define in YAML" commands to execute on a cluster.
Note that it won't be a goal to support DDL migrations, deltas and others --that would be relied on the user or other external tools. If any edit is performed to the SGScripts
, it will be blindly re-executed. If it is deleted, no action will be executed.
The internal spec of the CRD would be very similar to the actual initialData.scripts
spec:
- name: $scriptName
database: $dbname #optional, default to 'postgres'
scriptFrom | script: read the script from either a ConfigMap, Secret or inline.
- name: ...
...
There is some overlap with the spec.initialData.scripts
section of the SGCluster
and for this reason the intermediary CRD called SGScripts
will be created in order to be used as a reference by both SGCluster
(and in the future SGDbOps
). Section spec.initialData.scripts
of the SGCluster
will be deprecated and the new array field spec.managedSQL.sgScripts
will be added in order to reference an array of SGScripts
instances. SGScripts
CRD will have the following model:
apiVersion: stackgres.io/v1beta1
kind: SGScripts
spec:
managedVersions: <boolean> # If `true` the versions will be managed by the operator automatically. The user will still be able to update them. `true` by default.
continueOnError: <boolean> # if true, when any entry fail will not prevent subsequent entries from being executed. `false` by default.
scripts:
- name: <string> # Optional. Does not determine order of execution but creates a label to identify this script externally (like in events and logs).
id: <integer> # The id is immutable and must be unique across all the script entries. It is replaced by the operator and and is used to identify the script for the whole life of the `SGScript` object.
version: <integer> # The version of the script. Set to `1` by default.
database: <string> # Database where the script is executed. Defaults to the `postgres` database
user: <string> # User that will execute the script. Defaults to the `postgres` user
wrapInTransaction: <string> # Wrap the script in a transaction using the specified transaction mode (`NONE` (The default, means that the script will not be wrapped in a transaction), `READ COMMITTED`, `REPEATABLE READ` and `SERIALIZABLE`).
storeStatusInDatabase: <boolean> # When specified `inTransaction` field must not to be set to `null` or `NONE` and the script execution will include storing the status of the execution of this script in the table `managed_sql.status` that will be created in the specified `database`. This will avoid an operation that fails partially to be unrecoverable requiring the intervention from the user. This is `false` by default.
retryOnError: <boolean> # if true it retries to execute the script entry with an exponential back-off algorithm limited to 5 minutes with a variation of 1 minute. Default is false.
script: <string> # Raw SQL script to execute. This field is mutually exclusive with
`scriptFrom` field.
scriptFrom:
configMapKeyRef:
name: <string> # The name of the ConfigMap that contains the SQL script to execute.
key: <string> # The key name within the ConfigMap that contains the SQL script to execute.
secretKeyRef:
name: <string> # The name of the Secret that contains the SQL script to execute.
key: <string> # The key name within the Secret that contains the SQL script to execute.
status:
scripts:
- id: <integer> # Identify the associated script entry with the same value in the `id` field.
hash: <string> # Hash of the value of the ConfigMap or Secret referenced by `scriptFrom.configMapKeyRef` and `scriptFrom.secretKeyRef` respectively.
However, old spec.initialData.scripts
section of the SGCluster
was intended to be run only once, and only if a backup is not restored. We instead allow all referenced SGScripts
from field .spec.managedSQL.sgScripts
to be modified and to be applied "live" to the SGCluster
. The new version
field will allow to detect if script have to be executed and will be managed by the operator (default) or manually if the .spec.managedVersions
is set to false
. When .spec.managedVersions
is set to true
in the SGScripts
following rules will apply:
- Initially set version to 1.
- Secret or ConfigMap values hash is stored in the
SGScripts
.status.scripts[].hash
section - When an inline value is changed the version will be automatically incremented by 1.
- When a Secret or a ConfigMap is changed, the operator reconciliation cycle will update the hash and the version will be automatically incremented by 1.
- User will still be able to increment the value manually if needed.
Here is the proposed change to SGCluster
in order to reference an array of SGScripts
and maintain the status of applied SGScripts
:
apiVersion: stackgres.io/v1
kind: SGCluster
spec:
managedSQL:
continueOnSGScriptsError: <boolean> # if true, when any entry of any `SGScripts` fail will not prevent subsequent `SGScripts` from being executed. `false` by default.
sgScripts:
- id: <integer> # The id is immutable and must be unique across all the `SGScript` entries. It is replaced by the operator and is used to identify the `SGScript` entry.
name: <string> # a reference to an `SGScript`
status:
managedSQL:
sgScripts:
- id: <integer> # Identify the associated `SGScript` entry with the same value in the `id` field.
startedAt: <string> # ISO-8601 datetime of when the script execution has been started. Will be reset each time the referenced `SGScripts` will be re-applied.
failedAt: <string> # ISO-8601 datetime of when the script execution had failed (mutually exclusive with `completedAt`). Will be reset each time the referenced `SGScripts` will be re-applied.
completedAt: <string> # ISO-8601 datetime of when the script execution had completed (mutually exclusive with `failedAt`). Will be reset each time the referenced `SGScripts` will be re-applied.
scripts:
- id: <integer> # Identify the associated script entry with the same value in the `id` field.
version: <integer> # The latest version applied
intents: <integer> # Indicates the number of intents or failures occurred
failureCode: <string> # If failed, the error code of the failure. See also https://www.postgresql.org/docs/current/errcodes-appendix.html
failure: <string> # If failed, a message of the failure
The user will be able to add or replace any entry of the .spec.managedSQL.sgScripts
array in the SGCluster
. When replacing a value the associated status entry will also be emptied as it was never executed.
The order of execution and rules of the execution flow will be as follow:
- Each
SGScripts
will run in sequence one after another following the order of the `.spec.managedSQL. - If any entry of any
SGScripts
fails nextSGScripts
will not be executed unless.spec.managedSQL.continueOnSGScriptsError
is set totrue
inSGCluster
. - An entry of an
SGScripts
that has not been executed will be executed only if there are no other previous entry (following the array order) of the sameSGScripts
that has not been executed. - If execution of an
SGScripts
's entry fail no subsequent entries will be executed. Execution of subsequent entries is performed when previous entry execution failed only if field.spec.continueOnError
is set totrue
inSGScripts
. - If any entry of an
SGScripts
has been already executed it will not be re-executed even if it failed. - An entry is detected as "not being executed" if the entry version is missing from the status or different than the entry version in the status.
This is inspired by the original ideas by @stoetti1 and subsequent discussion on #956.
Acceptance criteria:
-
Implement the feature, TODO raise to epic and create sub epics. Epic measured in 10 days to accomplish. Sub tasks: -
#1545 (closed) The UI part. to be evaluated. -
Creating a CRD to implement sgcripts. -
Adapting webhooks and rest API. -
Tests CRD -
Documentation CRD -
Implement applying the SQL -
Implement migrating from .spec.initialData.scripts
-
Implement the test and documentation.
-