
Draft: PoC: Pods architecture using decomposed cluster-wide tables approach

Kamil Trzciński requested to merge many-pods into observe-queries

TL;DR

This is a proof of concept trying to model some aspects of the Pods Stateless Router Proposal.

What does it do?

Iteration 1 (concluded with a recording on 12 October)

These are the initial changes implemented in this PoC and presented via https://youtu.be/mUcALjn-yqQ.

  1. Application behavior: Focuses on modeling how GitLab would work in something similar to the described architecture.
  2. Focus on unknowns: Chooses to ignore all known problems, since we know how they could be solved with a significant amount of effort.
  3. Classify tables: There is currently manual WIP to classify which tables could be part of gitlab_users. This PoC implements a more systematic approach: it builds a complete reference map between all tables (~500) and adds context (pod or cluster) to identify references (related or external). Based on that, it rebuilds and provides a list of tables that are cluster-wide or pod-local.
  4. Foreign keys: Gets rid of foreign keys between cluster-wide and pod-local tables.
  5. Use PostgreSQL schemas: Based on the table classification, creates PostgreSQL schemas (context, pod_N) to limit data visibility between "virtual Pods". This is needed to model how the application would work if it only had access to data in pod_0 and context (see the sketch after this list).
  6. Implement many Pods: Creates two Pods sharing a set of cluster-wide tables with the help of PostgreSQL schemas.
  7. Switch Pods: Extends the performance bar to dynamically switch between Pods, to quickly test different features.
  8. Passthrough Pod into Sidekiq: Any operation triggering a Sidekiq worker passes the Pod specification through, so that workers execute against the selected Pod.
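
For illustration, here is a rough sketch of what creating the "virtual Pods" could look like. This is not the actual scripts/decomposition/create-pods-database script, and the table list is hypothetical; it only shows the idea of keeping cluster-wide tables in public while moving pod-local tables into per-pod schemas:

# Hypothetical sketch (Ruby, run against the development database);
# cluster-wide tables stay in "public", each pod-local table is moved
# into pod_0 and a structural copy is created in pod_1.
conn = ActiveRecord::Base.connection

conn.execute("CREATE SCHEMA IF NOT EXISTS pod_0")
conn.execute("CREATE SCHEMA IF NOT EXISTS pod_1")

# hypothetical subset of the ~500 classified tables
pod_local_tables = %w[issues merge_requests]

pod_local_tables.each do |table|
  conn.execute("ALTER TABLE public.#{table} SET SCHEMA pod_0")
  # LIKE ... INCLUDING ALL copies columns, defaults, and indexes; the id
  # default of the copy keeps pointing at the one shared sequence
  conn.execute("CREATE TABLE pod_1.#{table} (LIKE pod_0.#{table} INCLUDING ALL)")
end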

This follows the write-sync approach, where all Pods can write to cluster-wide tables:

  1. A pod_N in general uses a DB replica, but can write to cluster-wide tables directly by connecting to the primary database => a region can probably work fine with this approach
  2. The set of cluster-wide tables is under the public schema
  3. Pod-specific tables are under the pod_0 and pod_1 schemas
  4. A Rack/Sidekiq middleware is added to configure connection.schema_search_path = "public,pod_0|pod_1" depending on the selected_pod Cookie, to model switching organizations (sketched just below)
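
A minimal sketch of the middleware idea, assuming the selected Pod is carried in a selected_pod cookie (names are illustrative, not the PoC's exact implementation):

# Illustrative Rack middleware: picks the schema search path per request.
class PodSelectorMiddleware
  PODS = %w[pod_0 pod_1].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    pod = Rack::Request.new(env).cookies['selected_pod']
    pod = 'pod_0' unless PODS.include?(pod) # never trust the raw cookie value

    # "public" holds cluster-wide tables, the pod schema holds pod-local ones
    ActiveRecord::Base.connection.schema_search_path = "public,#{pod}"

    @app.call(env)
  end
end

A real implementation would also reset the search path after the request and apply the same logic on the Sidekiq side via a server middleware.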


Iteration 2 (in progress)

These are additional changes implemented in this PoC to continue solution validation, yet to be presented.

  1. Router: Implement a router to send pre-flight requests and path_info classification on the Rails side: !102553 (merged) (see the sketch after this list).
  2. GitLab-Shell and Gitaly: Fix support for Git Push and make it work with the Router.
  3. PreventClusterWrites: Implement a mechanism to model the write-async approach, where only Pod 0 can write to cluster-wide tables.
  4. QA: Work on fixing as many QA tests as possible.
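
For context, the pre-flight idea in (1.) looks roughly like this. This is an illustrative sketch only (the real implementation lives in !102553); the prefixes and the lookup are hypothetical:

# Hypothetical Rails endpoint: the Router asks which Pod serves a path_info
# before proxying the request.
class PathClassificationController < ApplicationController
  # endpoints that need cluster-wide access are pinned to Pod 0
  CLUSTER_WIDE_PREFIXES = %w[/admin /-/profile].freeze

  def show
    path = params.require(:path_info)

    pod =
      if CLUSTER_WIDE_PREFIXES.any? { |prefix| path.start_with?(prefix) }
        'pod_0'
      else
        pod_for(path) # e.g. resolve the organization owning the path
      end

    render json: { pod: pod }
  end

  private

  def pod_for(path)
    'pod_1' # hypothetical lookup
  end
end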

This follows the write-async approach, where only Pod 0 can write to cluster-wide tables:

  1. A pod_N always uses a DB replica of the cluster-wide tables and is expected to observe latency of up to 500 ms on those tables => a region-first approach
  2. The set of cluster-wide tables is under the public schema
  3. Pod-specific tables are under the pod_0 and pod_1 schemas
  4. A Rack/Sidekiq middleware is added to configure connection.schema_search_path = "public,pod_0|pod_1" depending on the selected_pod Cookie, to model switching organizations
  5. Only pod_0 can write to cluster-wide tables: this is enforced by the PreventClusterWrites query analyzer (sketched after this list)
  6. The pod_N forwards write calls via API to pod_0: this is currently modelled by suppressing PreventClusterWrites at the place where the write happens, to identify the places that need to be changed
  7. Some endpoints, like /admin or /-/profile, require cluster-wide access and are forced to be served by pod_0.
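
A simplified sketch of the PreventClusterWrites idea (this is not the exact query analyzer API used in the PoC): fail on writes to cluster-wide tables from any Pod other than pod_0, unless explicitly suppressed.

# Hypothetical analyzer: inspects SQL before execution.
module PreventClusterWrites
  ClusterWriteError = Class.new(StandardError)

  WRITE_SQL = /\A\s*(INSERT|UPDATE|DELETE)\b/i

  class << self
    attr_accessor :suppressed

    def analyze!(sql, cluster_wide_tables:, current_pod:)
      return if current_pod == 'pod_0' || suppressed
      return unless sql.match?(WRITE_SQL)
      return unless cluster_wide_tables.any? { |table| sql.include?(table) }

      raise ClusterWriteError, "write to cluster-wide table from #{current_pod}"
    end

    # Models "forward the write via API to pod_0" (6.): wrapping a write in
    # suppress marks the call site as one that still needs to be changed.
    def suppress
      previous, self.suppressed = suppressed, true
      yield
    ensure
      self.suppressed = previous
    end
  end
end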


What problems does it ignore, since we know we can solve them?

  1. Decompose cluster-wide tables: We know that we can decompose all cluster-wide tables (as we did for the CI decomposition). The biggest problem there is fixing all cross-joins between schemas. Using a single logical database with separate PostgreSQL schemas (cluster+pod_0 or cluster+pod_1) keeps all existing cross-joins working, while still creating separate visibility between tables.
  2. Monotonic sequences: We know that we can handle ID sequences across all Pods in a non-conflicting way for things like projects.id or issues.id. This PoC makes all PostgreSQL sequences shared across pod_0/pod_1 (see the sketch after this list).
  3. Loose foreign keys: Loose foreign keys need to be updated to allow removal across different Pods.
  4. Partitioning: The partitioning code uses gitlab_partitions_dynamic and gitlab_partitions_static. Since this is not compatible with the context, pod_N approach, all partitioned tables are for now converted into non-partitioned ones.
  5. Sidekiq Cron: Only regular Sidekiq workers are covered. In the future each Pod would have its own Sidekiq Cron executor.
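
As a concrete illustration of the shared-sequence point (2.), with a hypothetical table and sequence placement:

# Both pod copies of a table keep their id default pointing at a single
# sequence (here assumed to live in pod_0 after the SET SCHEMA move), so
# generated ids never conflict across Pods.
ActiveRecord::Base.connection.execute(<<~SQL)
  ALTER TABLE pod_0.issues
    ALTER COLUMN id SET DEFAULT nextval('pod_0.issues_id_seq');
  ALTER TABLE pod_1.issues
    ALTER COLUMN id SET DEFAULT nextval('pod_0.issues_id_seq');
SQL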

Problems to evaluate

  1. Router: The current approach uses a single GitLab, passes a Cookie, and dynamically sets schema_search_path depending on the selected Pod. Ideally the router (Workhorse?) should understand, or have the logic, to route a request to the correct Pod based on information from GitLab Rails.
  2. Cross-Pod talking:
     a. fetch data from another Pod (like a Project)
     b. aggregate data across all Pods
     c. schedule a Sidekiq job in the context of another Pod
     d. route all requests (Controller, GraphQL and API) to the correct Pod automatically
  3. Many versions of GitLab: A true Pods architecture allows running many different versions of GitLab at the same time, allowing some customers to be upgraded less frequently than others, and thus improving resiliency against application bugs. In a model of decomposed, shared cluster-wide tables this might not be possible, since we would require all nodes to run the same latest version of the application whenever cluster-wide tables were updated.
  4. ...

Run it

  1. Configure config/database.yml with schema_search_path:, ideally using a new DB
  2. Run scripts/decomposition/create-pods-database
  3. Run bin/rake -t db:seed_fu to seed the development database
  4. (Optionally) Run scripts/decomposition/classify-pods-database to fetch the test DB and update gitlab_referenced.yml
# config/database.yml

development:
  main:
    database: gitlabhq_development_pods
    schema_search_path: public,pod_0
  ci:
    database: gitlabhq_development_pods
    database_tasks: false
    schema_search_path: public,pod_0