Check for non-eligible active context code repositories
What does this MR do and why?
Finds Ai::ActiveContext::Code::Repository records that are invalid and marks them for deletion by setting the state to :pending_deletion and setting a delete_reason in metadata.
The actual deletion happens in Handle active context code repository deletion (!212099 - merged)
The check runs every day using the SchedulingService.
Because the delete_reason is different for the different cases, we have 3 separate queries:
- `last_activity_before_cutoff`: finds repos that have not been queried within the last 3 months. This indicates the embeddings aren't being used and can be cleaned up.
  Note: We will index again if queried again, see [Index state tracking: Deletion] Trigger ad-hoc... (#580251)
- `duo_features_disabled`: the project has duo features disabled. This comes from a cascading setting that can be set on application settings, namespace settings or project settings. The cascade reaches `project` with eventual consistency through Sidekiq jobs, so we can just check `project.project_setting.duo_features_enabled`.
- `namespace_invalid`: `enabled_namespace` is nil (the group was deleted) or not in `ready` state.
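The three cases above can be sketched in plain Ruby, outside of Rails. This is only an illustration of the eligibility rules as described, not the actual scopes: `Repo`, `Namespace`, `delete_reason_for` and the flattened `duo_features_enabled` field are hypothetical stand-ins, `last_queried_at` is assumed to always be set, and the order in which the reasons are checked is an assumption.

```ruby
require 'date'

# Hypothetical in-memory stand-ins for the ActiveRecord models.
Namespace = Struct.new(:state, keyword_init: true)
Repo = Struct.new(:last_queried_at, :duo_features_enabled, :enabled_namespace,
                  keyword_init: true)

CUTOFF_MONTHS = 3 # the "last 3 months" window from the description

# Returns the delete_reason that applies to a repo, or nil if it stays eligible.
# Assumption: when several reasons apply, the first match wins.
def delete_reason_for(repo, today: Date.today)
  cutoff = today << CUTOFF_MONTHS # Date#<< moves the date back by N months

  return 'last_activity_before_cutoff' if repo.last_queried_at < cutoff
  return 'duo_features_disabled' unless repo.duo_features_enabled

  namespace = repo.enabled_namespace
  return 'namespace_invalid' if namespace.nil? || namespace.state != :ready

  nil
end
```

For example, a repo last queried 5 months ago yields `'last_activity_before_cutoff'`, while a recently queried repo whose `enabled_namespace` is nil yields `'namespace_invalid'`.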
References
- [Code Embeddings] MarkRepositoryAsPendingDeleti... (#577333 - closed)
- FF rollout: [FF] `active_context_code_event_mark_repository... (#579815)
Database queries
Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).last_activity_before_cutoff.mark_as_pending_deletion_with_reason('last_activity_before_cutoff')
20.054 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45417/commands/139233
Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).duo_features_disabled.mark_as_pending_deletion_with_reason('duo_features_disabled')
55.489 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45381/commands/139105
Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).namespace_invalid.mark_as_pending_deletion_with_reason('namespace_invalid')
23.689 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45381/commands/139107
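Each query above follows the same shape: skip records already in a delete state, apply one reason-specific filter, cap the batch at 1000, then mark the matches. A rough plain-Ruby sketch of that shape follows. `Record`, `DELETE_STATES`, the string key `'delete_reason'` and this in-memory version of `mark_as_pending_deletion_with_reason` are assumptions for illustration only; the real method runs as a database update on the relation.

```ruby
BATCH_LIMIT = 1000 # mirrors limit(1000) in the queries above

# Hypothetical in-memory record; :state and :metadata follow the description.
Record = Struct.new(:id, :state, :metadata, keyword_init: true)

# Assumed set of states that not_in_delete_states excludes.
DELETE_STATES = %i[pending_deletion deleted].freeze

# Marks up to `limit` matching records as pending deletion and records the
# reason in metadata. The block plays the role of the reason-specific scope.
# Returns the number of records marked.
def mark_as_pending_deletion_with_reason(records, reason, limit: BATCH_LIMIT)
  candidates = records.reject { |r| DELETE_STATES.include?(r.state) }
  # As in SQL, the filter is applied before the limit.
  batch = candidates.select { |r| yield(r) }.first(limit)

  batch.each do |r|
    r.state = :pending_deletion
    r.metadata = r.metadata.merge('delete_reason' => reason)
  end

  batch.size
end
```

Because records already in a delete state are skipped, a repo marked by one query keeps its original `delete_reason` when a later query would also match it.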
How to set up and validate locally
- Create repository records in various states, including `:delete`
- Set a repo's `last_queried_at` to 5 months prior
- Set a repo's `enabled_namespace_id` to nil
- Set a repo's `enabled_namespace` to state `:pending`
- Set a repo's `project.project_setting` to `duo_features_enabled: false`
- Set a repo's `project.namespace.namespace_settings` to `duo_features_enabled: false`
- Run the `mark_repository_as_pending_deletion` task and note that these repos are marked as `pending_deletion` with a `delete_reason` set
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.