NATS: Enforce deploying JetStream cluster nodes into separate availability zones
In combination with the sync interval behaviour discussed in #119 (closed), it is possible for NATS JetStream to lose data in the event of an OS failure or crash, even in a replicated setup.
See https://github.com/nats-io/nats-server/issues/7567 and https://github.com/nats-io/nats-server/issues/7564 for details.
Quoting from the recent doc changes (https://github.com/nats-io/nats.docs/pull/896/files):
> In a replicated setup, a published message is acknowledged after it successfully replicated to at least a quorum of servers. However, replication alone is not enough to guarantee the strongest level of durability against multiple systemic failures.
>
> - If multiple servers fail simultaneously, all due to an OS failure, and before their data has been `fsync`-ed, the cluster may fail to recover the most recently acknowledged messages.
> - If a failed server lost data locally due to an OS failure, although extremely rare, it may rejoin the cluster and form a new majority with nodes that have never received or persisted a given message. The cluster may then proceed with incomplete data causing acknowledged messages to be lost.
>
> Setting a lower `sync_interval` increases the frequency of disk writes, and reduces the window for potential data loss, but at the expense of performance. Additionally, setting `sync_interval: always` will make sure servers `fsync` after every message before it is acknowledged. This setting, combined with replication in different data centers or availability zones, provides the strongest durability guarantees but at the slowest performance.
>
> The default settings have been chosen to balance performance and risk of data loss in what we consider to be a typical production deployment scenario across multiple availability zones.
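For reference, a minimal `nats-server` configuration sketch with fsync-on-every-message enabled would look roughly like this; the `store_dir` path is a placeholder, not our actual deployment value:

```
# Sketch: JetStream block with sync_interval set to "always",
# so the server fsyncs each message before acknowledging it.
jetstream {
  store_dir: /data/jetstream   # placeholder path
  sync_interval: always        # strongest durability, slowest performance
}
```

Tuning the interval itself is the subject of #119; this issue covers the placement half of the guarantee.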
It is recommended that the NATS nodes run in separate availability zones to avoid data loss even in the case of an OS failure.
We should enforce that our deployment configurations only deploy NATS cluster nodes into different AZs, for example via pod anti-affinity rules for our k8s-based deployments.
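A minimal sketch of what that enforcement could look like, assuming the standard `topology.kubernetes.io/zone` node label and an `app.kubernetes.io/name: nats` pod label (the exact labels and how the StatefulSet is templated depend on our charts):

```yaml
# Sketch: required pod anti-affinity so no two NATS pods land in the same zone.
# Names, labels, and the image tag are placeholders, not our actual chart values.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: nats
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: nats
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nats
    spec:
      affinity:
        podAntiAffinity:
          # "required" (not "preferred") so the scheduler refuses to co-locate
          # two NATS pods in one availability zone instead of silently doing so.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: nats
              topologyKey: topology.kubernetes.io/zone
      containers:
        - name: nats
          image: nats:2.10-alpine   # placeholder tag
          args: ["--config", "/etc/nats-config/nats.conf"]
```

An alternative with a similar effect is `topologySpreadConstraints` with `maxSkew: 1` and `whenUnsatisfiable: DoNotSchedule` on the same topology key, which still allows scheduling if the cluster temporarily has fewer zones than replicas.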