Make SGCluster restart / upgrade controlled directly by the operator as a rollout feature
Currently, to restart a cluster or to perform a minor version upgrade, a security upgrade or a major version upgrade, an SGDbOps of the corresponding type has to be created. The SGCluster prevents users and tools from changing the Postgres version, since this operation is only allowed through an SGDbOps. The Postgres version, the major version of the Postgres configuration, the extension versions (in particular core and contrib) and the backup path (or paths) have to be set on the SGCluster's spec by the SGDbOps operation, based on its own spec, during the upgrade operation.
This approach leads to some issues:
- Prevents DevOps tools and users from changing the Postgres version of an SGCluster.
- Users cannot specify that an SGCluster should always follow the latest Postgres version.
- Prevents composability of an SGCluster in general. For example, it makes it difficult to implement an SGShardedDbOps that can change the Postgres version for an SGCluster generated by an SGShardedCluster following a specific sequence.
- The Postgres version value can indicate the latest version, or the latest minor version of a specific major version, but the actual version value is then applied and changed, so the intent to follow the latest version is lost once the value is stored in Kubernetes.
- Does not allow upgrades to be performed automatically, since an SGDbOps operation has to be created.
Proposal
Have the actual Postgres version, along with the target Postgres configuration (on major version change), the extension versions and the backup path actually used by the SGCluster, stored in its status. The version in the spec will not be changed, so a value like latest or 17 will remain as is. Instead, the resolved latest or specified Postgres version will be set in the status. If the Postgres version is not compatible with the target Postgres configuration or with any of the extensions, it will not be set in the status and a Warning event will be created. The backup path in the spec will be changed when a new major version is resolved and will only be set in the status when the resolved Postgres version can be applied. Extensions will no longer be validated during SGCluster creation or update; instead, a Warning event will be created indicating which extensions do not exist for the resolved Postgres version.
Extensions will not be updated to the latest version when no version is specified and a version has already been resolved in the status. This forces users to set a specific version when they want to upgrade, which avoids upgrading a critical extension by mistake. The exception is extensions that are shipped with a specific Postgres version, like the core and contrib extensions. To update extensions automatically, a value like latest or a channel (specified in the extension metadata) has to be used.
apiVersion: stackgres.io/v1
kind: SGCluster
spec:
  postgres:
    version: latest
    extensions:
    - name: timescaledb
  configurations:
    sgPostgresConfig: postgresconf-17
    backups:
    - path: sgbackups.stackgres.io/stackgres/cluster/2025-07-10-16-00-00/17
status:
  postgresVersion: "17.5"
  extensions:
  - name: timescaledb
    version: 2.20.3
  sgPostgresConfig: postgresconf-17
  backupPaths:
  - sgbackups.stackgres.io/stackgres/cluster/2025-07-10-16-00-00/17
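To illustrate how the extension version rules could play out, here is a minimal sketch of a spec that combines the three cases. The postgis and pg_repack entries and the version number are hypothetical and only serve as illustration. Using latest as an extension version is part of this proposal, not existing behavior.

```yaml
kind: SGCluster
spec:
  postgres:
    version: "17"            # stays "17" in the spec; the resolved version is stored in the status
    extensions:
    - name: timescaledb      # no version: keeps the version already resolved in the status
    - name: postgis          # hypothetical entry
      version: "3.5.2"       # hypothetical pinned version: changes only when the user edits it
    - name: pg_repack        # hypothetical entry
      version: latest        # proposed value: follows the latest available version
```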
When a new Postgres version or a new extension version is specified and there is no validation error, the PendingRestart condition will be true. When a new version of the operator is installed that requires an upgrade of the SGClusters and there is no validation error, both the PendingUpgrade and PendingRestart conditions will be true. In both cases the SGCluster needs to be restarted with a rollout. The rollout will take care of the restart sequence and will apply any operation needed to bring the SGCluster to the desired state specified by the spec. It will be performed by the operator itself, so there will no longer be a Job spawned by an SGDbOps to perform the restart or a similar operation.
By default, the operation will not be performed unless the user creates a restart SGDbOps on the SGCluster. Unlike before, this SGDbOps will not create any Job; the operator will take care of performing the operation.
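As a rough sketch of what this state could look like on the resource after an operator upgrade, assuming the conditions follow the usual Kubernetes layout of a type and a status under status.conditions (the exact layout is an assumption, it is not defined in this proposal):

```yaml
kind: SGCluster
status:
  conditions:
  - type: PendingUpgrade     # set when the new operator version requires an upgrade
    status: "True"
  - type: PendingRestart     # set whenever a rollout is needed
    status: "True"
```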
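A minimal sketch of such a restart SGDbOps, using the existing sgCluster and op fields. The resource name, the namespace and the cluster name are hypothetical.

```yaml
apiVersion: stackgres.io/v1
kind: SGDbOps
metadata:
  name: restart-cluster      # hypothetical name
  namespace: default         # hypothetical namespace
spec:
  sgCluster: cluster         # hypothetical name of the target SGCluster
  op: restart                # under this proposal no Job is spawned; the operator performs the rollout
```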
The SGCluster may also be annotated to change this requirement or to configure the rollout behavior:
- stackgres.io/rollout: "always": The SGCluster is restarted as soon as a restart is needed, without the need to create any other resource.
- stackgres.io/rollout: "never": Disables rollout completely. The SGCluster will never be restarted, even when using an SGDbOps.
- stackgres.io/rollout: "schedule" and stackgres.io/rollout-schedule: "<cron expression>[:<duration>][|...]": The SGCluster will be restarted automatically when needed, but only within the time windows specified by the cron expressions and duration values defined in the stackgres.io/rollout-schedule annotation.
- No stackgres.io/rollout annotation and stackgres.io/rollout-dbops: "<restart/minor version upgrade/security upgrade SGDbOps name>": The SGCluster will be restarted as long as the stackgres.io/rollout-dbops annotation is present. The annotation is meant to be created by an SGDbOps, and the operator will take care of removing it if there is no such SGDbOps or it has completed.
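As an illustration of the schedule variant, the following sketch restricts rollouts to two weekend windows. The cluster name is hypothetical, and the duration notation (an ISO 8601 duration such as PT4H) is an assumption, since the annotation syntax above does not spell out the duration format.

```yaml
kind: SGCluster
metadata:
  name: cluster              # hypothetical name
  annotations:
    stackgres.io/rollout: "schedule"
    # Two windows separated by "|": one starting Saturdays at 02:00 and one
    # starting Sundays at 02:00, each assumed to stay open for 4 hours.
    stackgres.io/rollout-schedule: "0 2 * * 6:PT4H|0 2 * * 0:PT4H"
```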
The SGCluster will also include a specific section to control the rollout behavior as an alternative to the stackgres.io/rollout.* annotations (annotations take precedence over the spec):
kind: SGCluster
spec:
  pods:
    updateStrategy: # updateStrategy indicates the strategy that the SGCluster controller will use to perform updates. It includes any additional parameters necessary to perform the update for the indicated strategy.
      type: <string> # Indicates the type of the update strategy. Default is SGDbOps.
        #
        # * Always: update will be performed as soon as possible.
        # * Schedule: update will be performed as specified in the schedule section, where you can configure the windows of time in which the update can be performed.
        # * SGDbOps: update will be performed as soon as an SGDbOps of type restart, securityUpgrade or minorVersionUpgrade targeting the SGCluster is started, up to its timeout (if configured).
        # * Never: update will never be performed unless the Pods are deleted manually.
      schedule:
      - cron: <string> # A UNIX cron expression indicating the start of the window of time in which the update can be performed.
        duration: <string> # An ISO 8601 duration in the format `PnDTnHnMn.nS` that, together with the cron expression, indicates the end of the window of time in which the update can be performed.
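A filled-in sketch of this section may help to read the schema. The values are illustrative only: it would allow updates in a 4-hour window starting every Saturday at 02:00.

```yaml
kind: SGCluster
spec:
  pods:
    updateStrategy:
      type: Schedule
      schedule:
      - cron: "0 2 * * 6"    # illustrative: window opens Saturdays at 02:00
        duration: PT4H       # illustrative: window closes 4 hours later
```

With type set to Always or Never, the schedule list would presumably be ignored, although the proposal does not state this explicitly.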