Address database differences between on prem customers and GitLab.com
Problem Statement
Follow up on my conversation with @sytses and @dzaporozhets. We need to ensure that our on-prem customers are also able to leverage the performance tweaks and enhancements we have on GitLab.com.
Right now our customers only deploy what we have version controlled in code. But there is no proper feedback loop which brings the delta of enhancements we do on production (GitLab.com). These differences can be:
- Schema tweaks
- PostgreSQL configurations.
Proposal
We need to find out and assert that what we have running in production is the same as what we have version controlled and available for our on-prem customers.
Recommended PostgreSQL setup/schema for GitLab installations (mirror our production) We will need to collaborate with CS/Support team and production team.
Phase 1
Ask DB team to help come up with a script to document the DB schema that can run on both gitlab.com and custom installation.
- Production: The schemas shouldn’t be different. We may have migrations in gitlab.com is not the same as what is in the rails app. Measure this, if there are few exceptions we should document those.
- On-prem: Enlist the help of CS to run this script on a number of our important customers.
Phase 2
Document everything in code.
- Differences: We should update what we have in our repository to match what is in production.
- Exceptions: If there are exceptions to the norm, migrations that only applies to large-scale deployments or HA, create an optional migration script and document it. In the same way, we can have post-setup for HA only. Those are not needed for common instances. We don’t want to force 1000+ installations to setup weird HA installation that is not applicable to them.
Phase 3
Clean up & partial roll out
- Cleanup mechanism: Come up with a cleanup script with conditional logic to address the differences that we see in our customer's installation. Compare customer schema with the one we have under version control. If not then we should have a conditional migration that will apply the difference.
- Ensure to enlist the help from Support to collect data on the diff from customers, use it to build research data.
- Test and rollout: test the script in-house and slowly roll out to some of our customers. Have this migration be mandatory to clean up the schema for customers.
Phase 4
Complete rollout with documented guidelines
- Roll this out to all of our important customers
- publish a technical write up on the recommended way to address this for all other on-prem installations.
We need to ensure that this new migration script does not mutate or corrupt data on our customer's installation. We also need a way to measure that the migration cleanup results in a better performance for our customers.
Current schema status:
Differences from the previous effort
See: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/4981
-
varchar(255) and varchar(510) columns: - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14381
- Turns out these come from MySQL migrations (https://github.com/gitlabhq/mysql-postgresql-converter/issues/6 and https://github.com/gitlabhq/mysql-postgresql-converter/issues/8)
-
Stray column appearances.updated_by
: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14643 -
OIDs (can't fix without downtime) -
Index names spelled differently (may not be worth fixing) https://gitlab.com/gitlab-org/gitlab-ce/issues/38723 -
Missing indexes in bundled schema https://gitlab.com/gitlab-org/gitlab-ce/issues/38724 -
Missing/extra default values https://gitlab.com/gitlab-org/gitlab-ce/issues/38725 -
Missing/extra not null constraints https://gitlab.com/gitlab-org/gitlab-ce/issues/38726 -
Datetime with timezone vs Datetime #18849 (closed) -
"Unique constraints" vs "unique index" https://gitlab.com/gitlab-org/gitlab-ce/issues/38727
Current differences
Production vs v11.4.3-ee see: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/5147#note_112776595
-
Extensions differences in production which are not checked into source control e.g.
master
:-
btree_gin -
dblink -
pageinspect -
pg_buffercache -
pg_stat_statements -
pgstattuple -
pg_repack
-
-
Schema differences:
-
#25218 On production a lot of columns have a limit of 255 but they're not presented in the source (we should add those limits to the source) -
#25217 Some columns are in different order in tables (especially for application_settings table) (completely expected, and not sure if there's a way to fix it) -
#25216 Some production columns are using datetime_with_timezone
but the source isdatetime
(broadcast_messages table, and others) -
gitlab-foss#53994 (closed) (source) Missing partial index for ci_builds in the source (index_ci_builds_project_id_and_status_for_live_jobs_partial) -
gitlab-com/gl-infra/production#524 (closed) (source) Missing index for ci_pipelines in the source (nik_tmp_ci_pipelines_ce_50763) -
gitlab-foss#53992 (closed) (source) Missing index for events (index_events_on_project_id_and_created_at) -
gitlab-foss#53993 (closed) (source) Missing index for project_mirror_data (index_project_mirror_data_on_next_execution_timestamp) -
gitlab-foss#53991 (closed) (source) Missing partial index for projects (index_projects_on_mirror_and_mirror_last_update_at) -
gitlab-foss#53989 (closed) (source) Missing index for push_rules (index_push_rules_is_sample) -
gitlab-foss#53988 (closed) (source) Extra index for notes (index_notes_on_updated_at) -
#25215 A different index name: index_emails_on_email (source) vs emails_email_key (production) -
#25214 A different index name: index_forked_project_links_on_forked_to_project_id (source) vs forked_project_links_forked_to_project_id_key (production) -
#25211 A different index name: index_projects_on_path (source) vs index_on_projects_path (production) -
#25210 A different foreign key name: Source didn't have any name. Production is fk_a27c483435
-