WIP: Backfill LfsObjectsProject records of forks

Patrick Bajao requested to merge 55487-backfill-lfs-objects-v3 into master

What does this MR do?

To fix the behavior of existing forks, LfsObjectsProject records need to be backfilled, so that looking up a fork's LFS objects no longer depends on the source project.

This is based on the implementation in !25343 (merged), which was reverted. The difference is that the query used to find the projects to migrate is now performant.
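For readers unfamiliar with the tables involved, the idea of the backfill can be sketched in SQL. This is only an illustration under assumptions, not the query the migration runs: it handles a single fork (the hypothetical id 100000011), takes the source from `forked_from_project_id`, and copies only the links the fork does not already have.

```sql
/* Illustrative sketch only, NOT the migration's actual query. */
/* Copies the source project's LFS links to one fork (hypothetical id). */
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at)
SELECT source_links.lfs_object_id, members.project_id, NOW(), NOW()
FROM fork_network_members AS members
JOIN lfs_objects_projects AS source_links
  ON source_links.project_id = members.forked_from_project_id
WHERE members.project_id = 100000011
  /* Skip links the fork already has, so the statement can be re-run safely */
  AND NOT EXISTS (
    SELECT 1
    FROM lfs_objects_projects AS existing
    WHERE existing.project_id = members.project_id
      AND existing.lfs_object_id = source_links.lfs_object_id
  );
```

Running something like this across every fork in a single statement would be far too expensive on a large fork network, which is why the work is scheduled in batches.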

Migration

Output:

== 20200306134708 RescheduleLinkLfsObjects: migrating =========================
== 20200306134708 RescheduleLinkLfsObjects: migrated (37.6010s) ===============

I tested sample queries on #database-lab. Here's the test data that I added on top of the existing data:

/* Create source project */
INSERT INTO projects (id, namespace_id, name, archived, created_at, updated_at) VALUES (100000001, 9970, 'pb-lfs-backfill-source', false, NOW(), NOW());

/* Create forks */
INSERT INTO projects (id, namespace_id, name, archived, created_at, updated_at) SELECT n, 2327904, 'pb-lfs-backfill-fork', false, NOW(), NOW() FROM generate_series(100000002, 100100001) AS n;

/* Create fork networks and members */
INSERT INTO fork_networks (id, root_project_id) VALUES (100000001, 100000001);
INSERT INTO fork_network_members (fork_network_id, project_id) VALUES (100000001, 100000001);
INSERT INTO fork_network_members (fork_network_id, forked_from_project_id, project_id) SELECT 100000001, 100000001, projects.id FROM projects WHERE name = 'pb-lfs-backfill-fork';

/* Create `LfsObjectsProject` records for the source project and the first nine forks */
INSERT INTO lfs_objects_projects (id, lfs_object_id, project_id, created_at, updated_at) SELECT n, n, 100000001, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000002, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000003, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000004, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000005, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000006, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000007, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000008, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000009, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;
INSERT INTO lfs_objects_projects (lfs_object_id, project_id, created_at, updated_at) SELECT n, 100000010, NOW(), NOW() FROM generate_series(100000000, 100100000) AS n;

ANALYZE projects;
ANALYZE fork_network_members;
ANALYZE lfs_objects_projects;

This creates 100k forks and roughly 1M lfs_objects_projects records (ten inserts of 100,001 rows each). The queries and their plans are added to the corresponding lines.
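A quick sanity check of the seeded data could look like the following. It is a sketch that assumes no rows with these names or project ids existed before the inserts above; the expected counts follow from the `generate_series` bounds, which are inclusive.

```sql
/* Forks: generate_series(100000002, 100100001) yields 100,000 rows */
SELECT COUNT(*) AS fork_count
FROM projects
WHERE name = 'pb-lfs-backfill-fork';

/* Fork network members: 1 source + 100,000 forks = 100,001 rows */
SELECT COUNT(*) AS member_count
FROM fork_network_members
WHERE fork_network_id = 100000001;

/* LFS links: 10 projects x 100,001 objects = 1,000,010 rows */
SELECT COUNT(*) AS lfs_link_count
FROM lfs_objects_projects
WHERE project_id BETWEEN 100000001 AND 100000010;
```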

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

#55487 (closed)

Edited by 🤖 GitLab Bot 🤖
