Support Docker Compose `deploy` Specification for HA in incus-compose
This issue tracks the implementation of the Docker Compose deploy specification in incus-compose, enabling users to define HA and scaling configurations in a familiar, declarative way. The goal is to align with the Compose spec while leveraging Incus's unique features (e.g., live migration, system containers).
Key Features to Implement
- `deploy.replicas`
  - Basic N-replica scaling is already implemented.
  - Remaining work: auto-distribute replicas across cluster nodes when shared storage is configured.
  - Warn if shared storage is missing.
- `deploy.mode`
  - Support `replicated` (default) and `global` (one container per node).
- `deploy.placement`
  - Support `constraints` and `preferences` to control container placement.
  - Example: `placement.constraints: node.labels.storage == ssd`.
- `deploy.restart_policy`
  - Map `condition` to Incus boot config (`boot.autostart`, `boot.autorestart`).
  - `delay` and `max_attempts` have no Incus equivalent and will be ignored.
- `deploy.update_config`
  - Implement rolling updates (e.g., `parallelism: 1`, `delay: 10s`).
- `deploy.rollback_config` (Future)
  - Implement rollback support for failed updates.
Out of Scope (For Now)
- `deploy.endpoint_mode`: Requires external load balancer integration.
- `deploy.rollback_config`: Lower priority than core HA features.
Proposed UX for incus-compose HA
1. Basic Replica Scaling
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 3
- Behavior:
  - Creates 3 containers named `web-1`, `web-2`, `web-3`.
  - Auto-distributes them across available nodes (if shared storage is configured).
  - Warns if shared storage is missing.
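The naming and distribution behavior can be illustrated with a minimal Python sketch. It is not incus-compose's actual implementation: `plan_replicas` is a hypothetical helper, the node names are made up, and round-robin is only one assumed way to spread replicas across nodes.

```python
from itertools import cycle


def plan_replicas(service: str, replicas: int, nodes: list[str]) -> dict[str, str]:
    """Return {container_name: node}, assigning replicas round-robin.

    Hypothetical helper: names follow the web-1, web-2, ... pattern from the
    example; round-robin is an assumed distribution strategy.
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    if not nodes:
        raise ValueError("at least one node is required")
    node_cycle = cycle(nodes)
    # Containers are numbered from 1, matching web-1, web-2, web-3.
    return {f"{service}-{i}": next(node_cycle) for i in range(1, replicas + 1)}


# Three replicas on a two-node cluster: the third wraps back to the first node.
plan = plan_replicas("web", 3, ["node-a", "node-b"])
```

A real implementation would also need the shared storage check before placing replicas on different nodes; that check is left out here.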
2. Global Mode (One Container per Node)
services:
  agent:
    image: monitoring-agent:latest
    deploy:
      mode: global
- Behavior:
  - Creates one container on each node in the Incus cluster.
  - Automatically starts a container on new nodes as they join the cluster.
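One way to think about global mode is as a reconciliation step: compare the current cluster members with the nodes that already host a container, and create containers on the rest. The sketch below assumes that framing; `nodes_missing_container` and the container and node names are hypothetical, and how incus-compose would learn about newly joined nodes is not addressed.

```python
def nodes_missing_container(nodes: list[str], placed: dict[str, str]) -> list[str]:
    """Return the nodes that do not yet host a container for the service.

    `placed` maps container name -> node name (hypothetical bookkeeping).
    """
    covered = set(placed.values())
    # Preserve cluster order so the result is deterministic.
    return [node for node in nodes if node not in covered]


# node-3 has just joined the cluster and has no agent container yet.
to_create = nodes_missing_container(
    ["node-1", "node-2", "node-3"],
    {"agent-node-1": "node-1", "agent-node-2": "node-2"},
)
```

Running this check whenever cluster membership changes would cover the "new nodes as they join" case.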
3. Placement Constraints and Preferences
services:
  db:
    image: postgres:14
    deploy:
      replicas: 2
      placement:
        constraints:
          - node.labels.storage == ssd
        preferences:
          - spread: instanceId
- Behavior:
  - Only deploys containers on nodes labeled `storage=ssd`.
  - Spreads containers across nodes to avoid co-location.
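Constraint filtering could look roughly like the following Python sketch. It handles only `node.labels.<key>` expressions with the `==` and `!=` operators, and it takes node labels as plain dictionaries; how labels would actually be stored on Incus cluster members is an open design point. `parse_constraint` and `eligible_nodes` are hypothetical names.

```python
import re

# Matches expressions such as "node.labels.storage == ssd".
_CONSTRAINT = re.compile(r"^\s*node\.labels\.([\w.-]+)\s*(==|!=)\s*(\S+)\s*$")


def parse_constraint(expr: str) -> tuple[str, str, str]:
    """Split a constraint into (label_key, operator, value)."""
    match = _CONSTRAINT.match(expr)
    if match is None:
        raise ValueError(f"unsupported constraint: {expr!r}")
    return match.group(1), match.group(2), match.group(3)


def eligible_nodes(
    node_labels: dict[str, dict[str, str]], constraints: list[str]
) -> list[str]:
    """Return the nodes whose labels satisfy every constraint."""
    parsed = [parse_constraint(c) for c in constraints]
    result = []
    for node, labels in node_labels.items():
        # For "==" the label must equal the value; for "!=" it must differ.
        if all((labels.get(key) == value) == (op == "==") for key, op, value in parsed):
            result.append(node)
    return result


cluster = {"n1": {"storage": "ssd"}, "n2": {"storage": "hdd"}, "n3": {}}
ssd_nodes = eligible_nodes(cluster, ["node.labels.storage == ssd"])
```

Preferences such as `spread` would then be applied to order or balance placement within the eligible set.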
4. Restart Policy
services:
  worker:
    image: myworker:latest
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
- Behavior:
  - Restarts the container on unexpected exit (`boot.autorestart=true`).
  - `delay` and `max_attempts` are not supported by Incus and will be ignored.
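A possible shape for the mapping is sketched below in Python. Only the `on-failure` to `boot.autorestart=true` mapping comes from this issue; the entries for `none` and `any` are assumptions made for illustration, and `restart_policy_to_incus` is a hypothetical function name.

```python
import warnings

# Only the on-failure entry is taken from this issue; "none" and "any"
# are assumed mappings and would need to be confirmed.
_CONDITION_TO_BOOT = {
    "none": {"boot.autorestart": "false"},
    "on-failure": {"boot.autorestart": "true"},
    "any": {"boot.autostart": "true", "boot.autorestart": "true"},
}


def restart_policy_to_incus(policy: dict) -> dict[str, str]:
    """Translate a Compose restart_policy mapping into Incus config keys."""
    # The Compose spec defaults `condition` to "any".
    condition = policy.get("condition", "any")
    if condition not in _CONDITION_TO_BOOT:
        raise ValueError(f"unknown restart condition: {condition!r}")
    ignored = [key for key in ("delay", "max_attempts") if key in policy]
    if ignored:
        # These keys have no Incus equivalent, so tell the user they are dropped.
        warnings.warn(f"restart_policy keys ignored by Incus: {', '.join(ignored)}")
    return dict(_CONDITION_TO_BOOT[condition])


config = restart_policy_to_incus({"condition": "on-failure"})
```

Emitting a warning instead of silently dropping `delay` and `max_attempts` keeps the behavior visible to users who port an existing Compose file.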
5. Rolling Updates
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
- Behavior:
  - Updates containers one at a time.
  - Starts the new container before stopping the old one (`start-first`).
  - Waits 10 seconds between updates.
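The update loop can be sketched in Python as below. This is an illustration of the batching, ordering, and delay semantics only: `rolling_update` is a hypothetical function, the start and stop actions are injected callbacks standing in for real Incus operations, and containers within a batch are processed sequentially here, not concurrently.

```python
import time
from typing import Callable


def rolling_update(
    containers: list[str],
    start_new: Callable[[str], None],
    stop_old: Callable[[str], None],
    parallelism: int = 1,
    delay_seconds: float = 0.0,
    order: str = "stop-first",
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Update containers in batches of `parallelism`, pausing between batches."""
    if parallelism < 1:
        raise ValueError("this sketch requires parallelism >= 1")
    if order not in ("start-first", "stop-first"):
        raise ValueError(f"unknown order: {order!r}")
    batches = [
        containers[i : i + parallelism]
        for i in range(0, len(containers), parallelism)
    ]
    for index, batch in enumerate(batches):
        for name in batch:
            if order == "start-first":
                # Bring up the replacement before removing the old container.
                start_new(name)
                stop_old(name)
            else:
                stop_old(name)
                start_new(name)
        # Wait between batches, but not after the final one.
        if index < len(batches) - 1:
            sleep(delay_seconds)


events: list[tuple[str, object]] = []
rolling_update(
    ["web-1", "web-2"],
    start_new=lambda name: events.append(("start", name)),
    stop_old=lambda name: events.append(("stop", name)),
    parallelism=1,
    delay_seconds=10,
    order="start-first",
    sleep=lambda seconds: events.append(("sleep", seconds)),
)
```

Injecting `sleep` keeps the loop testable without real delays; what should happen when a node fails mid-update is deliberately left out of the sketch.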
6. Resource Limits
services:
  app:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
- Behavior:
  - Sets CPU and memory limits for the container (already implemented).
Tasks
- Implement node-aware distribution for `deploy.replicas` across cluster nodes.
- Add shared storage validation when using `replicas` or `global` mode.
- Implement `deploy.mode: global`.
- Implement `deploy.placement.constraints` and `preferences`.
- Map `deploy.restart_policy.condition` to Incus boot config.
- Implement `deploy.update_config` for rolling updates.
- Document HA examples (e.g., 3-node cluster with `replicas: 3`).
Open Questions
- Should `incus-compose` automatically create node labels (e.g., `storage=ssd`) if they don't exist, or require users to label nodes manually?
- How should `incus-compose` handle node failures during rolling updates (e.g., pause, continue, or rollback)?
- Should we support `deploy.rollback_config` in the initial implementation, or defer it to a later release?