Implement a registry of nodes and services available on the network
We need a logical structure that lets each service find other running services. This should be done via the Nomad API wrapper, which would maintain a 'registry' of available services, their endpoints and their availability. As usual, we have two implementation options. We will go with the first for this sprint, but I will list both to keep perspective:
-
A central database / table of all running jobs, endpoints and availabilities. This will be hosted somewhere at a public address. The endpoint will be passed to each adapter so that each adapter can query it. This will make Consul querying (link to the code) obsolete, and thus we will be able to remove Consul from the platform. Notes:
-
The centralized implementation of the database/service-discovery has to run somewhere, so it has to be a separate Docker container. The best place for it for now is the nunet-infra/service-discovery directory. This container should be built and deployed (manually or via Nomad) whenever the nunet platform is rebooted or re-launched.
-
Each adapter on the network will be able to query this global container with the required service name (similarly to how it is done now with Consul). The returned payload depends on how we solve the load balancing of calls on the platform level, which is the subject of another issue (#28 (moved)). For now, let's assume that the call will return a list of endpoints for the requested service.
-
The database has to contain up-to-date information about service availability and take care of:
-
registering a new service when a gRPC request is received from the platform;
-
de-registering old services when a gRPC request is received from the platform.
-
Services should be referenced with their nodes and vice versa:
-
services should know which machines they are running on, and machines (i.e. adapters) should know which services are running on them.
-
The unavailability of a node should indicate that the services on it are also unavailable:
-
unavailable nodes should be deleted from the table/database as soon as they deregister or their unavailability is discovered.
-
The list of nodes and services should contain metadata about each service, or at least a reference to where the metadata can be obtained.
-
-
Health checking and up-to-date information about the availability of services are covered by issue #29 (closed).
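To make the requirements for the central registry concrete, here is a minimal in-memory sketch in Python. It is only an illustration under assumptions: the names `Registry`, `ServiceRecord`, `register`, `deregister`, `remove_node`, `lookup` and `services_on` are hypothetical and do not come from the existing code, and the gRPC layer, persistence and health checking (issue #29) are left out. It shows services and nodes referencing each other, and the removal of a node taking its services with it.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One running service instance (all field names are hypothetical)."""
    name: str
    endpoint: str
    node_id: str
    metadata: dict = field(default_factory=dict)


class Registry:
    """In-memory stand-in for the central table of services and nodes."""

    def __init__(self):
        self._services = {}  # service name -> {endpoint: ServiceRecord}
        self._nodes = {}     # node id -> set of (service name, endpoint)

    def register(self, name, endpoint, node_id, metadata=None):
        """Add a service and cross-reference it with its node."""
        record = ServiceRecord(name, endpoint, node_id, metadata or {})
        self._services.setdefault(name, {})[endpoint] = record
        self._nodes.setdefault(node_id, set()).add((name, endpoint))
        return record

    def deregister(self, name, endpoint):
        """Remove one service instance; return False if it was unknown."""
        record = self._services.get(name, {}).pop(endpoint, None)
        if record is None:
            return False
        if not self._services[name]:
            del self._services[name]
        hosted = self._nodes.get(record.node_id)
        if hosted is not None:
            hosted.discard((name, endpoint))
        return True

    def remove_node(self, node_id):
        """Drop a node and every service on it; return how many were dropped."""
        hosted = self._nodes.pop(node_id, set())
        for name, endpoint in hosted:
            endpoints = self._services.get(name, {})
            endpoints.pop(endpoint, None)
            if name in self._services and not endpoints:
                del self._services[name]
        return len(hosted)

    def lookup(self, name):
        """Return the sorted list of endpoints for a service name."""
        return sorted(self._services.get(name, {}))

    def services_on(self, node_id):
        """Return the sorted (service name, endpoint) pairs hosted on a node."""
        return sorted(self._nodes.get(node_id, set()))
```

In this sketch an adapter would call `lookup("some-service")` and receive a plain list of endpoints, matching the assumption made above until the load-balancing question in #28 is settled.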
-
-
Another option is a DHT running on each nunet-adapter, holding the same information as above. We are not going to implement the DHT solution in the current sprint, but if the central database is implemented correctly (mostly the health-checking part), it should be fairly easy to change it into a DHT implementation down the road.
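One way to keep the later switch to a DHT cheap is to have adapters depend on a small interface rather than on the central database directly. The following Python sketch is an assumption about how that could look, not a description of existing code: `ServiceDiscovery`, `CentralDiscovery` and `find_endpoints` are hypothetical names, and the central backend is replaced by an in-memory dictionary so the example stays self-contained.

```python
from abc import ABC, abstractmethod


class ServiceDiscovery(ABC):
    """Hypothetical interface that adapters would code against."""

    @abstractmethod
    def register(self, name, endpoint, node_id):
        """Announce a service instance running on a node."""

    @abstractmethod
    def deregister(self, name, endpoint):
        """Withdraw a service instance."""

    @abstractmethod
    def lookup(self, name):
        """Return a list of endpoints for the requested service."""


class CentralDiscovery(ServiceDiscovery):
    """Stand-in for the central database; a real one would call the remote service."""

    def __init__(self):
        self._table = {}  # service name -> {endpoint: node id}

    def register(self, name, endpoint, node_id):
        self._table.setdefault(name, {})[endpoint] = node_id

    def deregister(self, name, endpoint):
        self._table.get(name, {}).pop(endpoint, None)

    def lookup(self, name):
        return sorted(self._table.get(name, {}))


def find_endpoints(discovery, name):
    """Adapter-side code: depends only on the interface, not on the backend."""
    return discovery.lookup(name)
```

A DHT-backed class implementing the same three methods could then be passed to `find_endpoints` without changing adapter code.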