raft: Use raw node for better multi-raft support
Note: this is a memo. I haven't looked deeply enough into etcd/raft to evaluate feasibility.
On the surface, etcd/raft provides a Node object for the application to interact with. Node's interface is fairly simple, mostly a handful of functions: Ready(), Propose(), and ProposeConfChange(). In the core Raft loop, the application polls the next ready state from Ready() and processes it accordingly. The application is woken up as soon as there is new state or there are messages to send. This eagerness might be a problem in a crowded cluster with a huge number of Raft groups (the multi-raft problem).
Some systems overcome this over-activity problem by using RawNode together with quiescing and batching techniques; CockroachDB is a typical example. RawNode sits at a lower level than Node and provides extra flexibility, such as batching (health check) messages, quiescing, and full control of async storage.
So it's worth exploring how useful RawNode would be for us. It's also unclear whether RawNode is required for quiescing. In the latest release, the TickQuiesced API was removed (https://github.com/etcd-io/raft/commit/d6c1d644811d7aeb14046655d010773b38e38362), so we might need to implement that feature ourselves.
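
To make the idea concrete, here is a minimal sketch of driving many groups from a single loop with application-level quiescing. Everything in it is hypothetical: `group` and `scheduler` are stand-ins that only mimic the pull-based shape of a RawNode-like object (`Tick`, `HasReady`), not the etcd/raft API, and "processing" is reduced to clearing a counter. It only illustrates that idle groups can be skipped on each tick without any library-level TickQuiesced support.

```go
package main

import "fmt"

// group is a hypothetical stand-in for one Raft group driven through a
// RawNode-like object. It is NOT the etcd/raft API.
type group struct {
	id       uint64
	ticks    int  // logical clock ticks delivered to this group
	pending  int  // outstanding work (proposals, messages) to process
	quiesced bool // true while the group is idle and skipped by the loop
}

// Tick advances the group's logical clock.
func (g *group) Tick() { g.ticks++ }

// HasReady reports whether the group has work to process.
func (g *group) HasReady() bool { return g.pending > 0 }

// Propose queues work and wakes the group if it was quiesced.
func (g *group) Propose() {
	g.pending++
	g.quiesced = false
}

// scheduler drives all groups from one loop instead of one goroutine each.
type scheduler struct {
	groups []*group
}

// tickAll runs one tick interval. It skips quiesced groups, ticks the
// active ones, processes their ready work, and quiesces any active group
// that had nothing to do. It returns the number of groups ticked.
func (s *scheduler) tickAll() int {
	ticked := 0
	for _, g := range s.groups {
		if g.quiesced {
			continue
		}
		g.Tick()
		ticked++
		if g.HasReady() {
			g.pending = 0 // stand-in for handling the ready state
		} else {
			g.quiesced = true
		}
	}
	return ticked
}

func main() {
	s := &scheduler{}
	for i := uint64(1); i <= 3; i++ {
		s.groups = append(s.groups, &group{id: i})
	}
	s.groups[0].Propose()

	fmt.Println(s.tickAll()) // 3: every group starts active
	fmt.Println(s.tickAll()) // 1: only group 1 was still active
	fmt.Println(s.tickAll()) // 0: all groups are quiesced
}
```

In a real system the hard part is deciding when it is safe to quiesce: followers that stop receiving heartbeats must not time out and start elections, so quiescing would have to be coordinated across replicas rather than decided locally as in this sketch.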