Draft: PoC: Transactional Behavior on Topology Service

This implements Transactional Behavior on Topology Service.

Purpose

Validate technical feasibility of implementing atomic transactions on Topology Service using Cloud Spanner.

Note:

This PoC was split into the MRs below. The PoC branch holds the left-over changes that are not essential to support the Happy Path for Phase 5, namely ListRecords and ListLeases.

References

Requirements

  • Bulk updates are executed as atomic operations on the database.
  • Uniqueness constraints are enforced by the database to guard against conflicting concurrent operations.
  • Leases are used to lock objects temporarily.
  • Leases and claims can be enumerated efficiently.
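
The first two requirements can be illustrated with a minimal in-memory sketch. It is not the PoC's code: `claimTable`, `mutation` and `batchWrite` are hypothetical names, and a mutex-guarded map stands in for a database table with a unique key. The point is only the contract: a batch either applies in full or not at all, and a duplicate key rejects the whole batch.

```go
package main

import (
	"fmt"
	"sync"
)

// claimTable is a hypothetical in-memory stand-in for a table with a unique key.
type claimTable struct {
	mu   sync.Mutex
	rows map[string]string // unique claim key -> owner
}

func newClaimTable() *claimTable {
	return &claimTable{rows: map[string]string{}}
}

// mutation is a hypothetical single insert within a batch.
type mutation struct {
	Key   string
	Owner string
}

// batchWrite applies every mutation or none of them. The uniqueness check
// covers both existing rows and duplicates inside the batch itself.
func (t *claimTable) batchWrite(batch []mutation) error {
	t.mu.Lock()
	defer t.mu.Unlock()

	// Validate the whole batch first so that a failure leaves no partial state.
	seen := map[string]bool{}
	for _, m := range batch {
		if _, exists := t.rows[m.Key]; exists || seen[m.Key] {
			return fmt.Errorf("unique constraint violated for %q", m.Key)
		}
		seen[m.Key] = true
	}
	for _, m := range batch {
		t.rows[m.Key] = m.Owner
	}
	return nil
}

func main() {
	tbl := newClaimTable()
	fmt.Println(tbl.batchWrite([]mutation{{"a", "cell-1"}, {"b", "cell-1"}}))
	// The second batch conflicts on "a", so "c" must not be written either.
	fmt.Println(tbl.batchWrite([]mutation{{"c", "cell-2"}, {"a", "cell-2"}}))
	fmt.Println(len(tbl.rows))
}
```

In a real database the check and the write happen inside one transaction, so the constraint is enforced by the database rather than by application code as in this sketch.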

What is covered

  • Proto is adapted to the new data structure.
  • All fields exposed in Proto are persisted in the database and can be queried.
  • Services are extended to provide Get/Update/List operations.
  • Significant rework of configuration is done to expose [claim_store]
  • Multi-queries use positional arguments; queries are written for both Cloud Spanner and PostgreSQL
  • Queries are consistent across database drivers, and their behavior and parameters are documented
  • The database store is covered by functional tests validating the behavior of: Get, BeginUpdate, CommitUpdate, RollbackUpdate, ListClaim, ListLeases
  • ListClaims follows cursor-based pagination based on source_table+cell_id
  • ListLeases follows cursor-based pagination based on created_at
  • All batch updates are sent to the database as a single request and executed atomically
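
The cursor-based pagination on created_at can be sketched as follows. This is an illustration under assumptions, not the PoC's implementation: the `Lease` type, the base64 cursor encoding and the in-memory slice are all hypothetical, and a real store would push the `created_at > cursor` filter and the limit into the SQL query.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"sort"
	"time"
)

// Lease is a hypothetical, trimmed-down lease record used only for this sketch.
type Lease struct {
	ID        string
	CreatedAt time.Time
}

// encodeCursor turns the created_at of the last returned lease into an opaque token.
func encodeCursor(t time.Time) string {
	return base64.URLEncoding.EncodeToString([]byte(t.UTC().Format(time.RFC3339Nano)))
}

// decodeCursor reverses encodeCursor. An empty cursor means "start from the beginning".
func decodeCursor(c string) (time.Time, error) {
	if c == "" {
		return time.Time{}, nil
	}
	raw, err := base64.URLEncoding.DecodeString(c)
	if err != nil {
		return time.Time{}, err
	}
	return time.Parse(time.RFC3339Nano, string(raw))
}

// listLeases returns up to limit leases created strictly after the cursor,
// plus the cursor for the next page ("" when no further rows exist).
func listLeases(all []Lease, cursor string, limit int) ([]Lease, string, error) {
	if limit <= 0 {
		return nil, "", errors.New("limit must be positive")
	}
	after, err := decodeCursor(cursor)
	if err != nil {
		return nil, "", err
	}
	sorted := append([]Lease(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].CreatedAt.Before(sorted[j].CreatedAt) })

	var page []Lease
	for _, l := range sorted {
		if !l.CreatedAt.After(after) {
			continue
		}
		if len(page) == limit {
			// Another row exists beyond the page, so hand out a cursor.
			return page, encodeCursor(page[len(page)-1].CreatedAt), nil
		}
		page = append(page, l)
	}
	return page, "", nil
}

// sampleLeases returns deliberately unsorted demo data.
func sampleLeases() []Lease {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return []Lease{
		{ID: "lease-2", CreatedAt: base.Add(2 * time.Second)},
		{ID: "lease-1", CreatedAt: base.Add(1 * time.Second)},
		{ID: "lease-3", CreatedAt: base.Add(3 * time.Second)},
	}
}

func main() {
	cursor := ""
	for {
		page, next, err := listLeases(sampleLeases(), cursor, 2)
		if err != nil {
			panic(err)
		}
		for _, l := range page {
			fmt.Println(l.ID, l.CreatedAt.Format(time.RFC3339))
		}
		if next == "" {
			break
		}
		cursor = next
	}
}
```

One caveat of a cursor built on created_at alone: rows that share the exact same timestamp across a page boundary would be skipped, so a production cursor would likely need a tiebreaker column.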

What is not covered

  • Not all indexes needed to make operations efficient have been created
  • Cursor pagination is not functional in the Service (cursor deserialisation is missing)
  • Multiple claim stores could be implemented, e.g.:
    • proxying claim store: the store proxies writes to another store
    • dual-write claim store: writes are first sent upstream, and then performed locally
  • The internal/database should expose ReadTx(ctx, func(tx *db.Tx){}) as a way to execute reads against the same epoch of the database, so that they are consistent with each other
  • It is only partially covered by tests. For now, tests are written where it matters: database store, database framework, query parsing. The tests were used to ensure that consistent behavior is observed across all databases.
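
The two claim store variants mentioned above could look roughly like this. Everything here is hypothetical: the `Store` interface, its `Write` method and the in-memory store are invented for illustration and do not reflect the PoC's actual interfaces.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Store is a hypothetical, minimal claim store interface for this sketch.
type Store interface {
	Write(ctx context.Context, claims []string) error
}

// memoryStore is an in-memory store that can be told to fail, for demonstration.
type memoryStore struct {
	name   string
	fail   bool
	claims []string
}

func (m *memoryStore) Write(_ context.Context, claims []string) error {
	if m.fail {
		return errors.New(m.name + ": write failed")
	}
	m.claims = append(m.claims, claims...)
	return nil
}

// proxyStore forwards every write to another store and keeps nothing locally.
type proxyStore struct {
	upstream Store
}

func (p proxyStore) Write(ctx context.Context, claims []string) error {
	return p.upstream.Write(ctx, claims)
}

// dualWriteStore sends the write upstream first and only then applies it locally.
type dualWriteStore struct {
	upstream Store
	local    Store
}

func (d dualWriteStore) Write(ctx context.Context, claims []string) error {
	if err := d.upstream.Write(ctx, claims); err != nil {
		return fmt.Errorf("upstream write: %w", err)
	}
	if err := d.local.Write(ctx, claims); err != nil {
		return fmt.Errorf("local write: %w", err)
	}
	return nil
}

// writeAll is a small helper that writes claims with a background context.
func writeAll(s Store, claims ...string) error {
	return s.Write(context.Background(), claims)
}

func main() {
	up := &memoryStore{name: "upstream"}
	local := &memoryStore{name: "local"}
	dual := dualWriteStore{upstream: up, local: local}

	fmt.Println(writeAll(dual, "a", "b"), len(up.claims), len(local.claims))

	// When upstream rejects the write, nothing is applied locally.
	up.fail = true
	fmt.Println(writeAll(dual, "c"), len(local.claims))
}
```

Writing upstream first means a rejected upstream write never leaves a local-only record. The opposite failure (upstream succeeded, local failed) is not handled in this sketch and would need a retry or reconciliation strategy.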

Choices

  • Do not pass cfg.Config; instead, env.Env is introduced to give services a reduced set of information, such as a helper to find CellProtoById
  • Add [claim_store] and [claim_store.database] as a way to capture configuration specific to claim_store
  • The [claim_store.database] section introduces driver_name and data_source to retain the semantics used by Go's database/sql
  • The internal/database package is introduced to provide a minimal, common interface to the database, exposing only operations that are efficient on all supported database systems: writes can only be performed by BatchWrite, and queries can only be performed by QueryRow and QueryAll. No other interface is exposed.
  • The internal/stores/claim/database/ uses internal/database
  • The queries are stored in internal/stores/claim/database/queries/*.sql; at startup they are converted into an optimised MultiQuery to avoid repeatedly parsing the queries
  • The env.sh script prepares the desired environment for testing and running locally: it creates the DB (postgres/spanner) and runs migrations so that tests or the application can be run
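
The idea of parsing the .sql files once at startup can be sketched as below. This is a guess at the shape, not the PoC's parser: `MultiQuery`, `parseMultiQuery`, the `?` source placeholder, the driver names and the `@pN` naming for Spanner are all assumptions, and the table names in the example are made up. The splitter is deliberately naive and does not handle `;` or `?` inside string literals or comments.

```go
package main

import (
	"fmt"
	"strings"
)

// MultiQuery is a hypothetical pre-parsed set of statements, built once at startup.
type MultiQuery struct {
	Statements []string
}

// placeholder renders the n-th positional argument for a driver.
// PostgreSQL uses $1, $2, ...; the @pN names for Spanner are an assumption.
func placeholder(driver string, n int) (string, error) {
	switch driver {
	case "postgres":
		return fmt.Sprintf("$%d", n), nil
	case "spanner":
		return fmt.Sprintf("@p%d", n), nil
	}
	return "", fmt.Errorf("unsupported driver %q", driver)
}

// parseMultiQuery splits src on ';' and rewrites each '?' into the driver's
// positional placeholder. Numbering restarts for every statement.
func parseMultiQuery(src, driver string) (MultiQuery, error) {
	if _, err := placeholder(driver, 1); err != nil {
		return MultiQuery{}, err
	}
	var mq MultiQuery
	for _, stmt := range strings.Split(src, ";") {
		stmt = strings.TrimSpace(stmt)
		if stmt == "" {
			continue
		}
		var b strings.Builder
		n := 0
		for _, r := range stmt {
			if r != '?' {
				b.WriteRune(r)
				continue
			}
			n++
			p, _ := placeholder(driver, n) // driver was validated above
			b.WriteString(p)
		}
		mq.Statements = append(mq.Statements, b.String())
	}
	return mq, nil
}

// sampleSQL is made-up file content with two statements.
const sampleSQL = `SELECT cell_id FROM claims WHERE source_table = ? AND cell_id > ?;
DELETE FROM leases WHERE uuid = ?;`

func main() {
	for _, driver := range []string{"postgres", "spanner"} {
		mq, err := parseMultiQuery(sampleSQL, driver)
		if err != nil {
			panic(err)
		}
		for _, s := range mq.Statements {
			fmt.Println(driver+":", s)
		}
	}
}
```

Doing this once per file at startup keeps the per-request path free of string parsing: a request only binds arguments to the already rendered statements.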

Outcome

All operations function as designed. Operations are executed in batch.

Edited by Kamil Trzciński
