
Convert platform properties from key-value pairs into digests

Rohit Kothur requested to merge rkothur/platform-property-representation into master

Description

The goal of this MR is to normalize the representation of the platform properties in the database. Rather than using a table as a key-value store for the platform properties, we will hash the platform properties dictionary and store the resulting digest in a column on the Jobs table.
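As a minimal sketch of how such a digest could be computed (the function name, the choice of SHA-256, and the normalization step are illustrative assumptions, not the MR's actual implementation):

```python
import hashlib
import json

def hash_platform_properties(properties):
    """Digest of a platform-properties dict, e.g. {"OSFamily": ["linux"]}.

    Keys and value lists are sorted first, so equivalent dictionaries
    always produce the same digest; this digest is what would be stored
    in the new column on the Jobs table.
    """
    normalized = {key: sorted(values) for key, values in properties.items()}
    encoded = json.dumps(normalized, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

The sorting matters: two jobs that list the same properties in a different order must hash to the same value, or the scheduler would fail to match them.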

When a worker connects, we will expand its capabilities dictionary into all possible partial capabilities dictionaries, then hash each of these. To search for a suitable job, we simply check whether the job's hashed representation matches one of the worker's partial dictionaries. Though figuring out these partial dictionaries is an expensive (exponential-time) operation, it is expected that many workers with similar capabilities will connect to BuildGrid, so we cache the result of this operation, allowing future workers to skip the computation.

This change should allow us to significantly reduce the space usage of the platform properties in SQL-based schedulers, and it allows us to avoid a database join in the worker assignment query.

Limitation introduced by this MR

Due to the powerset expansion, BuildGrid may take an inordinate amount of time processing a CreateBotSession request for a worker with a large, previously unseen capabilities dictionary. A warning is printed when this happens, and caching ameliorates much of the problem, but BuildGrid will be better suited to workers with small capabilities dictionaries than to those with large ones.

Changes proposed in this merge request:

  • Eliminate the platform_properties table and add a platform_properties column to the Jobs table.
  • Change the platform property representation from key-value pairs to a digest of the JSON-serialized dictionary.
  • Change the worker side of the scheduler to expand all partial matches of a worker's capabilities.
  • Move the default platform property logic into a global variable configured in settings.json rather than hardcoding it in the execution instance.
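The default platform property change might look like the following fragment of settings.json. The key name and values here are purely illustrative, not the actual schema:

```json
{
  "default-platform-properties": {
    "OSFamily": ["linux"]
  }
}
```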

TODO:

  • Fix all broken tests
  • Address all TODOs in the code
  • Memoize the hash calculations
  • Compare runtimes
  • Add unit tests for all new helper methods

Validation

The tests should pass, and creating a new database with Alembic should use the updated schema.

This merge request, when merged, will address issue/bug:

#239 (closed)

