feat(project): support kernel-mode NAT for port proxies via x-incus.nat-proxy
Summary
incus-compose currently translates every Compose ports: entry into an Incus proxy device running in userspace mode
(project/project.go:223-234, nat flag never set). Each forwarded connection therefore goes through a host-side Go process: read from
listen socket → context switch → write to connect socket inside the container's netns. For high-RPS or high-throughput services this is
measurably slower than Docker's default iptables DNAT, despite our docs claiming the opposite (docs/compose-compatibility.md:248).
Incus proxy devices already support a kernel-mode nat=true setting that installs nftables DNAT rules instead of spawning a proxy process
— roughly on par with Docker's iptables path. This issue proposes exposing that mode through a Compose extension field, opt-in at the
service level.
Proposed UX
Add a service-level extension x-incus.nat-proxy: true. When set, all of that service's ports get nat=true on their generated proxy
devices, and connect.addr is rewritten from the hard-coded 127.0.0.1 to the container's actual IP.
```yaml
services:
  web:
    image: docker.io/nginx:alpine
    ports:
      - "8080:80"
      - "8443:443"
    x-incus:
      nat-proxy: true
  api:
    image: docker.io/myapi:latest
    ports:
      - "3000:3000"
    # No extension → userspace proxy as today (default unchanged)
```
Service-level granularity (one switch covers every port on that service) is deliberate — it keeps the parser simple. Per-port granularity
can come later if there is demand; it requires custom parsing of long-syntax port entries since compose-go's ports[] does not surface
per-entry extensions cleanly.
Why service-level only
- compose-go exposes service.Extensions as map[string]any directly. Reading x-incus.nat-proxy is one map lookup.
- Per-port extensions would need either a custom YAML decoding pass or a parallel x-incus.ports: block that mirrors ports:. Neither is
worth the complexity for the first iteration.
- A service that needs mixed modes can be split into two services sharing the same image.
Implementation sketch
1. Parse the extension — early in the service loop in project/project.go (before the port loop at line 206), read:
```golang
// Default to userspace mode; only an explicit boolean true opts in.
natProxy := false
if v, ok := service.Extensions["x-incus"].(map[string]any); ok {
	if b, ok := v["nat-proxy"].(bool); ok {
		natProxy = b
	}
}
```
2. Validate eligibility — for each port entry, NAT mode requires:
- Protocol is tcp or udp (not unix/abstract sockets, no protocol conversion)
- Listen address is one the host can bind (0.0.0.0, ::, or a host IP — our default 0.0.0.0 is fine)
- Service has at least one routable NIC (managed bridge, etc.)
If any check fails, log a warning naming the port and the failed check, then fall back to userspace mode for that port. Do not fail the whole up.
3. Switch device generation — at project/project.go:223-234, when natProxy && eligible:
- Set Nat: true on InstanceDeviceProxyConfig
- Move the device from devices (pre-creation) to postDevices (the existing slice at project/project.go:147), because we need the
container's IP before we can fill connect.addr
- Leave connect.addr empty for now; resolve it in the post-attach step
4. Resolve container IP in PostDevices flow — client/resource_instance.go:410 attachPostDevices already runs after the instance is
created/started. Extend it (or add a sibling pass) to:
- Fetch the instance's IPv4 address from the Incus API (GetInstanceState → first managed-NIC inet address)
- Substitute it into any NAT proxy device's connect.addr before attaching
- The existing fix in 796cd08 (re-run attachPostDevices on existing instance) means errors will surface correctly on re-runs
5. Serialization is already correct — client/resource_instance_device.go:152-154 already emits "nat": "true" when cfg.Nat is set. No
changes needed there.
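Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `portSpec`, `natEligible`, and `proxyDevice` are hypothetical names, and the map stands in for whatever `InstanceDeviceProxyConfig` serializes to. It assumes the listen address is already a literal IP and that the caller has determined whether the service has a routable NIC.

```go
package main

import (
	"net"
	"strconv"
)

// portSpec is a hypothetical, simplified view of one Compose port entry.
type portSpec struct {
	ListenProto  string // protocol on the host side, e.g. "tcp"
	ConnectProto string // protocol on the container side
	ListenAddr   string // e.g. "0.0.0.0"
	HostPort     int
	TargetPort   int
}

// natEligible reports whether a port can use nat=true; when it cannot,
// the second value is a reason suitable for the warning log.
func natEligible(p portSpec, hasRoutableNIC bool) (bool, string) {
	if p.ListenProto != "tcp" && p.ListenProto != "udp" {
		return false, "protocol " + p.ListenProto + " is not tcp or udp"
	}
	if p.ListenProto != p.ConnectProto {
		return false, "protocol conversion is not possible in NAT mode"
	}
	if net.ParseIP(p.ListenAddr) == nil {
		return false, "listen address " + p.ListenAddr + " is not an IP"
	}
	if !hasRoutableNIC {
		return false, "service has no routable NIC"
	}
	return true, ""
}

// proxyDevice builds a device config map for one port. In NAT mode the
// connect side targets the container IP; otherwise it stays on 127.0.0.1.
func proxyDevice(p portSpec, nat bool, containerIP string) map[string]string {
	connectHost := "127.0.0.1"
	dev := map[string]string{"type": "proxy"}
	if nat {
		connectHost = containerIP
		dev["nat"] = "true"
	}
	dev["listen"] = p.ListenProto + ":" + net.JoinHostPort(p.ListenAddr, strconv.Itoa(p.HostPort))
	dev["connect"] = p.ConnectProto + ":" + net.JoinHostPort(connectHost, strconv.Itoa(p.TargetPort))
	return dev
}
```

For the `web` service above, `proxyDevice(p, true, "10.0.0.5")` with an `8080:80` tcp entry would yield `listen=tcp:0.0.0.0:8080`, `connect=tcp:10.0.0.5:80`, and `nat=true`. Returning the reason string from `natEligible` keeps the fallback warning specific to the check that failed.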
Edge cases / open questions
- IP changes across restarts. If the container is recreated and gets a new bridge IP, the proxy device's connect.addr becomes stale. up
--recreate will regenerate everything correctly, but a bare incus restart could leave the rule dangling. Options: (a) document that
NAT-mode proxies require up --recreate after manual restarts; (b) extend the daemon side to re-resolve on start. Recommendation: (a) for
v1.
- Multiple NICs. Which IP do we pick if a service joins several networks? Proposal: first managed-bridge NIC in declaration order, with a
warning if there is ambiguity.
- Removing the extension. Deleting x-incus.nat-proxy: true from a compose file and re-running up --recreate should return the service
to userspace mode cleanly. Since devices are regenerated from scratch on each recreate, this should just work, but it needs a test.
- The docs claim at docs/compose-compatibility.md:248 is misleading today. Regardless of how this proposal lands, that line should be
reworded to describe the actual behavior (default = userspace, opt-in = NAT).
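The address selection proposed for step 4 and for the "Multiple NICs" case could look roughly like this. It is a sketch under assumptions: `nicAddress`, `nicState`, and `pickConnectAddr` are hypothetical names that mirror only the family, address, and scope fields of the per-interface instance state, and the caller is assumed to pass the NICs already sorted in Compose declaration order, since the state returned by the Incus API may not preserve that order.

```go
package main

// nicAddress and nicState are hypothetical, trimmed-down mirrors of the
// per-interface data found in the instance state.
type nicAddress struct {
	Family  string // "inet" or "inet6"
	Address string
	Scope   string // "global", "link", or "local"
}

type nicState struct {
	Name      string
	Addresses []nicAddress
}

// pickConnectAddr walks the NICs in the order given and returns the first
// global IPv4 address. ambiguous is true when more than one NIC offered a
// candidate, so the caller can log the proposed warning.
func pickConnectAddr(nics []nicState) (addr string, ambiguous bool, found bool) {
	for _, nic := range nics {
		for _, a := range nic.Addresses {
			if a.Family == "inet" && a.Scope == "global" {
				if found {
					ambiguous = true
				} else {
					addr, found = a.Address, true
				}
				break // count at most one candidate per NIC
			}
		}
	}
	return addr, ambiguous, found
}
```

When `found` is false, the port would fall back to userspace mode as described in step 2. When `ambiguous` is true, the first address is still used and a warning is logged.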
Out of scope
- Per-port x-incus.nat-proxy granularity
- Auto-detecting when NAT mode is "obviously safe" and enabling it by default — keep this opt-in until the rough edges above are
understood
- Changing the userspace default for existing users
Acceptance criteria
- [ ] x-incus.nat-proxy: true on a service produces proxy devices with nat=true and connect.addr = <container-ip>
- [ ] Ineligible ports (e.g. udp+tcp mismatch, no routable NIC) log a warning and fall back, do not break up
- [ ] Existing fixtures without the extension produce byte-identical device configs (snapshot tests cover this)
- [ ] New fixture under test/fixtures/with-nat-proxy/ exercises both eligible and fallback paths
- [ ] docs/compose-compatibility.md section on port publishing is corrected to describe actual default behavior and the new opt-in
- [ ] docs/environment-variables.md or a new doc page describes the extension