Commit 5ad35d94 authored by Emily Chui's avatar Emily Chui

Update object pools design

parent 36fce746
@@ -313,6 +313,47 @@ growth of references in object pools.

## Design and implementation details

### Storing pool metadata in embedded databases

To track object pool memberships, Gitaly needs a persistent store. Pool metadata must be retained across restarts of the Gitaly service and must be efficiently queryable. The pool metadata Gitaly needs to keep track of includes:

1. All the members that belong to the pool network
1. Source repo (upstream)
1. Count of how many repositories reference the pool

We looked at different embedded databases and their tradeoffs.
The following table contrasts the performance of BadgerDB (key-value) and SQLite
(relational) for the queries Gitaly will now have to answer.

| Query | BadgerDB | SQLite |
| ------- | ---------- | -------- |
| **"List all members in pool X"** | Prefix scan `pool:{poolID}:member:*` on the forward index with keys `pool:{poolID}:member:{forkID} -> (empty)` — O(n) iteration | `SELECT fork_id FROM pool_members WHERE pool_id = ?` — O(log n) with index |
| **"Which pool is fork Y in?"** | Reverse index `fork:{forkID} -> {poolID}` — must maintain 2 keys per relationship | `SELECT pool_id FROM pool_members WHERE fork_id = ?` — O(log n), single table |
| **"Which upstream does pool X belong to?"** | Lookup `pool:{poolID}:upstream` — O(1) but separate key to maintain | `SELECT upstream FROM pools WHERE pool_id = ?` — O(log n), stored with pool record |
| **"How many members in pool X?"** | Must store/maintain `pool:{poolID}:count` separately, risk of desync | `SELECT COUNT(*) FROM pool_members WHERE pool_id = ?` — always accurate |
| **"Can I delete pool X?" (count = 0)** | Lookup `pool:{poolID}:count`, an extra key that needs to be in sync whenever membership changes in the pool | `SELECT COUNT(*) FROM pool_members WHERE pool_id = ?` — always accurate, real-time |
| **"Add a member"** | Must update 3 keys atomically (member key, reverse lookup, count). Still O(1) for each operation | Single `INSERT` into `pool_members`. O(log n) single INSERT with index maintenance. |
| **"Remove a member"** | Must delete 2 keys atomically, update count | Single `DELETE` from `pool_members`. O(log n) single DELETE with index maintenance |
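To make the key-value side of the comparison concrete, here is a minimal sketch (with a plain Python dict standing in for Badger's key space) of the three keys that must be updated together when a member is added. The key naming mirrors the table above; the helper names are illustrative only.

```python
# Sketch of the Badger key layout from the table above, using a plain
# dict as a stand-in for the key-value store. Adding one member touches
# three keys; if any update is skipped, the count or reverse index desyncs.

def add_member(kv: dict, pool_id: str, fork_id: str) -> None:
    kv[f"pool:{pool_id}:member:{fork_id}"] = b""  # forward index
    kv[f"fork:{fork_id}"] = pool_id               # reverse index
    kv[f"pool:{pool_id}:count"] = kv.get(f"pool:{pool_id}:count", 0) + 1

def members(kv: dict, pool_id: str) -> list:
    # "List all members in pool X": a prefix scan over the forward index.
    prefix = f"pool:{pool_id}:member:"
    return [k[len(prefix):] for k in kv if k.startswith(prefix)]

kv = {"pool:7:upstream": "gitlab-org/gitlab"}
add_member(kv, "7", "42")
add_member(kv, "7", "43")
assert members(kv, "7") == ["42", "43"]
assert kv["fork:42"] == "7"
assert kv["pool:7:count"] == 2
```

Note that nothing ties the three writes together: the count key only stays correct if every code path that touches membership also remembers to update it, which is exactly the desync risk the table calls out.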

SQLite is favorable because:

1. **Relational queries are core to pool management** — membership lookups from
   both directions (pool→members, member→pool) are first-class operations.
2. **Counts are always accurate** — no risk of stale `count` keys going out of sync. With Badger, adding a member requires three separate key updates, and keeping all three consistent places the transactional burden on Gitaly. Deriving counts from membership rows in SQLite is a significant consistency win.
3. **Single source of truth** — no need to maintain bidirectional indexes manually. With Badger we needed both a forward index and a reverse index.
4. **Hot path queries** — answering "List all members in pool X" would require a full prefix scan in Badger.
5. **Scale** — a popular repository like gitlab-org/gitlab has ~12,000 forks. Storing this in Badger would require 12,000 entries for the reverse index, 12,000 for the forward index, 1 for the count, and 1 for the pool's upstream entry, so ~24,002 entries in total. 100 monorepos like this one would need ~2,400,200 entries.
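As a sketch of what this could look like in practice, the snippet below builds an illustrative SQLite schema and runs the queries from the comparison table. Table and column names here are hypothetical, not a final schema.

```python
import sqlite3

# Illustrative pool-metadata schema; names are hypothetical, not final.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pools (
    pool_id  TEXT PRIMARY KEY,
    upstream TEXT NOT NULL            -- source (upstream) repository
);
CREATE TABLE pool_members (
    fork_id TEXT PRIMARY KEY,         -- a fork belongs to at most one pool
    pool_id TEXT NOT NULL REFERENCES pools (pool_id)
);
CREATE INDEX idx_members_pool ON pool_members (pool_id);
""")

conn.execute("INSERT INTO pools VALUES ('7', 'gitlab-org/gitlab')")
conn.executemany("INSERT INTO pool_members VALUES (?, ?)",
                 [("42", "7"), ("43", "7")])

# The queries from the comparison table:
members = [r[0] for r in conn.execute(
    "SELECT fork_id FROM pool_members WHERE pool_id = ? ORDER BY fork_id",
    ("7",))]
(pool_of_42,) = conn.execute(
    "SELECT pool_id FROM pool_members WHERE fork_id = ?", ("42",)).fetchone()
(count,) = conn.execute(
    "SELECT COUNT(*) FROM pool_members WHERE pool_id = ?", ("7",)).fetchone()

assert members == ["42", "43"]
assert pool_of_42 == "7"
assert count == 2   # derived from membership rows, never a stale counter
```

Adding or removing a member is a single `INSERT` or `DELETE`, and both lookup directions plus the count come from the same `pool_members` table, which is the "single source of truth" point above.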

### Epics

**[15741](https://gitlab.com/groups/gitlab-org/-/work_items/15741): Gitaly managed object pool state has parity with GitLab**

This epic encapsulates the work required for Gitaly to hold, on its own, the object pool state that Rails currently maintains. It also covers bugs that would prevent building an accurate state of object pools, and exposes RPCs on the Rails side that Gitaly will require in order to check for parity.

**[15742](https://gitlab.com/groups/gitlab-org/-/work_items/15742): Transition pool lifecycle control to Gitaly**

This work captures the RPC changes required in Gitaly to give it more ownership and control over the object pool lifecycle.

### Moving lifecycle management of object pools into Gitaly

As stated, the goal is to move the ownership of object pools into Gitaly.
@@ -573,6 +614,9 @@ supporting RPCs may be provided:
Migration towards the object deduplication network-based architecture involves
a lot of small steps:

1. Introduce an internal Rails API such as `object_pool_members` to let Gitaly
   query object pool membership, including finding the upstream repository of a
   pool. This also allows us to discover current pool relationships managed by Rails.
1. `CreateFork()` starts automatically linking against preexisting object
   pools. This allows fast forking and removes the notion of object pools for
   callers when creating a fork.
@@ -581,7 +625,9 @@ a lot of small steps:
   `AddRepositoryToObjectPool()` and `DisconnectGitAlternates()` and migrate
   Rails to use the new RPCs. The object deduplication network is identified
   via a repository, so this drops the notion of object pools when handling
   memberships. Additionally, introduce `IsUsingObjectDeduplication()` and
   `DisableObjectDeduplication()`, enabling Rails to interact with object
   pools without understanding the underlying details of pool relationships.
1. Start recording object deduplication network memberships in `CreateFork()`,
   `AddRepositoryToObjectDeduplicationNetwork()`,
   `RemoveRepositoryFromObjectDeduplicationNetwork()` and `RemoveRepository()`.
@@ -602,6 +648,15 @@ a lot of small steps:
   to remove the `CreateObjectPool()` RPC.
1. Remove the `ObjectPoolService` and the notion of object pools from the Gitaly
   public API.
1. Implement orphaned pool cleanup. Add a mechanism to detect and clean up pools that exist on disk but have no members recorded in Gitaly's metadata store. Historically, we have had issues where object pools on disk were out of sync with what was recorded in the Rails database. This would be a final step to rectify such out-of-sync issues.
1. Simplify Gitaly's RPC interface to abstract away object pool complexities from Rails.
   Deprecate RPCs like `CreateFork`, `LinkRepositoryToObjectPool`, `CreateObjectPool`, and `DisconnectGitAlternates` and replace them with:

   - `CreateRepository` — modifying the request to include a hint on whether to handle object pool deduplication
   - `DisableObjectDeduplication` — disconnect from the pool

1. Clean up Rails-side object pool APIs since Gitaly should take over managing this information.

This plan is of course subject to change.
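In outline, the orphaned-pool cleanup step above amounts to comparing the pool directories found on disk against the membership counts recorded in the metadata store. The sketch below is purely illustrative: the directory layout, helper name, and counts are assumptions, not Gitaly's actual storage layout.

```python
import os
import tempfile

# Hypothetical sketch of orphaned-pool detection: pools that exist on
# disk but have no members recorded in the metadata store are cleanup
# candidates. Paths and names are illustrative.

def orphaned_pools(on_disk_pools, member_counts):
    """Return pools present on disk with zero recorded members."""
    return {p for p in on_disk_pools if member_counts.get(p, 0) == 0}

# Simulate a storage directory containing three pool repositories ...
with tempfile.TemporaryDirectory() as storage:
    for pool in ("pool-a", "pool-b", "pool-c"):
        os.makedirs(os.path.join(storage, pool))
    on_disk = {entry.name for entry in os.scandir(storage) if entry.is_dir()}

# ... while the metadata store only records members for two of them.
counts = {"pool-a": 12000, "pool-b": 3}
assert orphaned_pools(on_disk, counts) == {"pool-c"}
```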

@@ -736,6 +791,19 @@ database to ensure they stay up to date.
  and persist the new storage assignment in the `repository_assignments`
  database table.

#### Pool metadata storage

In standalone Gitaly deployments, pool metadata is stored in an embedded SQLite database local to the node. For simplicity and a unified codebase, each node in a Gitaly Cluster also maintains its own local SQLite database.

In Praefect deployments, the voting mechanism is only activated when Git references (branches, tags) are modified. This triggers the reference transaction hook, which routes the RPC to all Gitaly nodes that participated in the vote and provides strong consistency through synchronous updates. If, however, a replica is unhealthy and cannot be updated, the write is applied asynchronously. This can lead to temporary inconsistencies between nodes, and there are other scenarios where SQLite might be out of sync:

- When Praefect voting is not triggered by an RPC. For example, `LinkRepositoryToObjectPool` only modifies the `.git/objects/info/alternates` file (no refs change), and `DisconnectGitAlternates` removes the alternates file (again, no refs change). In these cases we don't get the voting mechanism that guarantees consistency; Praefect just routes the RPC to all Gitaly nodes, and some nodes might succeed while others fail.
- When the replication factor is lower than the total number of nodes, not all Gitaly nodes will store the same object pools or pool metadata. This means we need to aggregate pool metadata across multiple nodes to get a complete picture of all the pool relationships that Gitaly manages.

To bring a node back to a healthy state, our reconciliation strategy would be to extend the scheduled `ReplicateRepository` RPC to also update the pool metadata and to validate that metadata against what exists on disk.
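As a sketch of what that on-disk validation could look like: a recorded membership can be checked against the pool object directory named in a repository's `objects/info/alternates` file. The helper and paths below are purely illustrative, not Gitaly's implementation.

```python
import os
import tempfile

# Hypothetical sketch: validate recorded pool membership against the
# pool path in a repository's objects/info/alternates file.

def pool_from_alternates(repo_path):
    """Return the pool object directory a repository links to, or None."""
    alternates = os.path.join(repo_path, "objects", "info", "alternates")
    if not os.path.exists(alternates):
        return None
    with open(alternates) as f:
        line = f.readline().strip()
    return line or None

with tempfile.TemporaryDirectory() as root:
    repo = os.path.join(root, "fork.git")
    os.makedirs(os.path.join(repo, "objects", "info"))
    with open(os.path.join(repo, "objects", "info", "alternates"), "w") as f:
        f.write("../../pool.git/objects\n")

    recorded = "../../pool.git/objects"  # what the metadata store claims
    assert pool_from_alternates(repo) == recorded  # metadata matches disk
```

A mismatch (or a missing alternates file for a repository the metadata store believes is a pool member) would flag the node's SQLite database for repair during the scheduled reconciliation run.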

Consideration has been given to alternatives, such as using Praefect's PostgreSQL database or a distributed SQLite database, but the drawbacks of additional code paths diverging from standalone Gitaly deployments and the extra complexity have ruled them out.

## Problems with the design

As mentioned before, object pools are not a perfect solution. This section goes