Draft: Backfill placeholder users that miss an organization

What does this MR do and why?

We found about 50k users of type placeholder that are not related to the Default Organization.

After fixing the root cause, this migration will fix the data.

It retrieves all the users with user_type = placeholder and created_at > '2025-02-01'. For each user, it will check if an organization has been assigned. If not, the migration will assign one.

Database queries

Batch

Batch size = 100 000, sub batch 1000

See query plan (Postgres.ai): 3ms

SELECT
	"users" .*
FROM
	"users"
WHERE
	"users"."id" BETWEEN 100615 AND 1100269
	AND "users"."user_type" = 15
	AND "users"."created_at" >= '2025-02-01'
	AND "users"."id" >= 863123
	AND "users"."id" < 950567

Select Organization User record

Look up by user_id, this is indexed. See Query Plan

SELECT
	"organization_users" .*
FROM
	"organization_users"
WHERE
	"organization_users"."user_id" = 1100166
	AND "organization_users"."organization_id" = 1
LIMIT 1

References

Create a background migration for fixing the pl... (#546659 - closed)

How to set up and validate locally

Tested by creating test users.

  • Generate test users
  • Check number of good and bad records
  • Run migration for scheduling the job
    load 'db/post_migrate/20250602120927_queue_backfill_organization_placeholder_users.rb'
    QueueBackfillOrganizationPlaceholderUsers.new.up
  • Check number of good and bad records again

Test query. Output should be two lines that both have a true value for column has_org

WITH generated_users AS (
	SELECT u.id, u.username, u.user_type, u.created_at, ou.id IS NOT NULL AS has_org
	FROM users u 
	LEFT JOIN organization_users ou ON ou.user_id = u.id AND ou.organization_id  = 1
	WHERE u.username LIKE 'generated-user%'
)
SELECT has_org, created_at > '2025-02-01' AS feb_2025, count(*) FROM generated_users GROUP BY 1, 2 ORDER BY 1,2

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rutger Wessels

Merge request reports

Loading