Split SCIM tables
Relates to #498905 (closed)
Summary
This MR introduces separate tables for Group SCIM-related data. It creates new models GroupScimAuthAccessToken and GroupScimIdentity to replace the existing ScimOauthAccessToken and ScimIdentity models for group-level SCIM operations. This change aims to improve performance and scalability for Group SCIM operations in GitLab.com.
Changes overview
- Created new models:
GroupScimAuthAccessTokenandGroupScimIdentity - Updated Group SCIM-related controllers and API endpoints to use the new models
- Added migrations to create new tables and move existing data
- Updated finders and services to work with the new models
- Implemented feature flag
separate_group_scim_tableto control the rollout
Validation steps
Setup Group SAML and Group SCIM locally
Before checking out this branch
- Start your GDK instance
- Select a group and add SCIM and SAML settings
- Make sure the SCIM is working as expected
After checking out this branch
- Run database migrations
- Enable feature flag
Feature.enable(:separate_group_scim_table, Group.find(<group-id>)) - Make sure that previous SCIM calls are still working with old scim token and URI
- Setup scim in another group
- confirm that everything works as expected
Script for testing scim
require 'bundler/inline'
gemfile do
source 'https://rubygems.org'
gem 'json'
gem 'http'
gem 'faker'
end
require 'http'
require 'json'
require 'faker'
# Replace with your SCIM endpoint and token
SCIM_ENDPOINT = 'https://gdk.test:3000/api/scim/v2/groups/twitter/Users'
SCIM_TOKEN = 'glsoat--N82VshzyPDJ6YFhCxKi' # (GDK safe secret)
scim_user_id = Faker::Internet.uuid
# Function to generate random user data
def generate_random_user_data(scim_user_id)
{
"schemas" => ["urn:ietf:params:scim:schemas:core:2.0.0"],
"externalId" => scim_user_id,
"userName" => Faker::Internet.user_name,
"name" => {
"givenName" => Faker::Name.first_name,
"familyName" => Faker::Name.last_name
},
"emails" => [
{
"value" => Faker::Internet.email,
"type" => "work"
}
],
"active" => true
}
end
# Create a new user
def create_user(user_data)
http = HTTP.auth("Bearer #{SCIM_TOKEN}")
response = http.post(SCIM_ENDPOINT, json: user_data, headers: { 'Content-Type' => 'application/scim+json' })
handle_response(response)
end
# Edit an existing user
def edit_user(user_id, updated_data)
user_uri = "#{SCIM_ENDPOINT}/#{user_id}"
http = HTTP.auth("Bearer #{SCIM_TOKEN}")
response = http.patch(user_uri, json: updated_data, headers: { 'Content-Type' => 'application/scim+json' })
handle_response(response)
end
# Delete a user
def delete_user(user_id)
user_uri = "#{SCIM_ENDPOINT}/#{user_id}"
http = HTTP.auth("Bearer #{SCIM_TOKEN}")
response = http.delete(user_uri)
handle_response(response)
end
# Handle the response
def handle_response(response)
if response.code == 201 || response.code == 200 || response.code == 204
puts "Operation successful!"
puts response.body
else
puts "Error:"
puts response.body
puts response.code
end
end
# Example usage:
# Create a new user with random data
new_user_data = generate_random_user_data(scim_user_id)
create_user(new_user_data)
# Edit an existing user (assuming you have the user ID)
user_id = scim_user_id
updated_data = {
"emails" => [
{
"value" => "updated_email@example.com",
"type" => "work"
}
]
}
edit_user(user_id, updated_data)
# Delete a user
delete_user(user_id)
Queries and SQL plans
- user.groups.top_level.with_scim_identity
Query
SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids", "namespaces"."organization_id" FROM "namespaces" INNER JOIN "members" ON "namespaces"."id" = "members"."source_id" LEFT OUTER JOIN "scim_identities" ON "scim_identities"."group_id" = "namespaces"."id" LEFT OUTER JOIN "group_scim_identities" ON "group_scim_identities"."group_id" = "namespaces"."id" WHERE "members"."type" = 'GroupMember' AND "members"."source_type" = 'Namespace' AND "namespaces"."type" = 'Group' AND "members"."user_id" = 1 AND "members"."requested_at" IS NULL AND (access_level >= 10) AND "namespaces"."parent_id" IS NULL AND (scim_identities.id IS NOT NULL OR group_scim_identities.id IS NOT NULL) GROUP BY "namespaces"."id"
Plan: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/34979/commands/108361
Summary: Time: 5.269 ms
- planning: 4.763 ms
- execution: 0.506 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 132 (~1.00 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
- GroupScimIdentity.for_group
SELECT "scim_identities".* FROM "scim_identities" WHERE "scim_identities"."group_id" = 100081788;
Time: 1.040 ms
- planning: 0.962 ms
- execution: 0.078 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 6 (~48.00 KiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
- GroupScimIdentity.with_extern_uid
SELECT "group_scim_identities".* FROM "group_scim_identities" WHERE (LOWER("group_scim_identities"."extern_uid") = LOWER('abcs'))
plan: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/34979/commands/108363
Time: 0.611 ms
- planning: 0.585 ms
- execution: 0.026 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 0 from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
- user.with_group_scim_identities_by_extern_uid
SELECT "users"."id", "users"."email", "users"."encrypted_password", "users"."reset_password_token", "users"."reset_password_sent_at", "users"."remember_created_at", "users"."sign_in_count", "users"."current_sign_in_at", "users"."last_sign_in_at", "users"."current_sign_in_ip", "users"."last_sign_in_ip", "users"."created_at", "users"."updated_at", "users"."name", "users"."admin", "users"."projects_limit", "users"."failed_attempts", "users"."locked_at", "users"."username", "users"."can_create_group", "users"."can_create_team", "users"."state", "users"."color_scheme_id", "users"."password_expires_at", "users"."created_by_id", "users"."last_credential_check_at", "users"."avatar", "users"."confirmation_token", "users"."confirmed_at", "users"."confirmation_sent_at", "users"."unconfirmed_email", "users"."hide_no_ssh_key", "users"."admin_email_unsubscribed_at", "users"."notification_email", "users"."hide_no_password", "users"."password_automatically_set", "users"."encrypted_otp_secret", "users"."encrypted_otp_secret_iv", "users"."encrypted_otp_secret_salt", "users"."otp_required_for_login", "users"."otp_backup_codes", "users"."public_email", "users"."dashboard", "users"."project_view", "users"."consumed_timestep", "users"."layout", "users"."hide_project_limit", "users"."note", "users"."unlock_token", "users"."otp_grace_period_started_at", "users"."external", "users"."incoming_email_token", "users"."auditor", "users"."require_two_factor_authentication_from_group", "users"."two_factor_grace_period", "users"."last_activity_on", "users"."notified_of_own_activity", "users"."preferred_language", "users"."theme_id", "users"."accepted_term_id", "users"."feed_token", "users"."private_profile", "users"."roadmap_layout", "users"."include_private_contributions", "users"."commit_email", "users"."group_view", "users"."managing_group_id", "users"."first_name", "users"."last_name", "users"."static_object_token", "users"."role", "users"."user_type", "users"."static_object_token_encrypted", "users"."otp_secret_expires_at", "users"."onboarding_in_progress", "users"."color_mode_id", "users"."composite_identity_enforced" FROM "users" INNER JOIN "group_scim_identities" ON "group_scim_identities"."user_id" = "users"."id" WHERE (LOWER("group_scim_identities"."extern_uid") = LOWER('abcd'))
Plan: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/34979/commands/108368 Summary:
Time: 5.133 ms
- planning: 5.053 ms
- execution: 0.080 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 0 from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Fixes #495503 (closed), #498906 (closed) and #463930 (closed)