Check for non-eligible active context code repositories

What does this MR do and why?

Finds Ai::ActiveContext::Code::Repository records that are invalid and marks them for deletion by setting the state to :pending_deletion and setting a delete_reason in metadata.

The actual deletion happens in Handle active context code repository deletion (!212099 - merged).

The check runs every day using the SchedulingService.

Because the delete_reason is different for the different cases, we have 3 separate queries:

  • last_activity_before_cutoff: finds repos that haven't been queried within the last 3 months. This indicates the embeddings aren't being used and can be cleaned up.

    Note

    We will index the repo again if it is queried again: [Index state tracking: Deletion] Trigger ad-hoc... (#580251)

  • duo_features_disabled: finds repos whose project has Duo features disabled. This comes from a cascading setting that can be set on application settings, namespace settings, or project settings. Sidekiq jobs propagate the cascaded value down to the project with eventual consistency, so we can just check project.project_setting.duo_features_enabled.
  • namespace_invalid: finds repos whose enabled_namespace is nil (the group was deleted) or is not in the ready state.
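
The three criteria above can be sketched in plain Ruby as a single classifier. This is only an illustration, not the code in this MR: the Repo struct, the delete_reason_for helper, the 90-day approximation of "3 months", the treatment of a nil last_queried_at as "not stale", and the order of the checks are all assumptions made for the sketch. The MR itself runs three separate database queries.

```ruby
# Hypothetical in-memory stand-in for a repository record and its associations.
# The real models are ActiveRecord classes; these field names are assumptions.
Repo = Struct.new(:last_queried_at, :duo_features_enabled, :namespace_state, keyword_init: true)

SECONDS_PER_DAY = 24 * 60 * 60
# Assumption: "3 months" is approximated as 90 days for this sketch.
CUTOFF_DAYS = 90

# Returns the delete_reason that applies to the repo, or nil if the repo
# is still eligible. The check order is an assumption of this sketch.
def delete_reason_for(repo, now: Time.now)
  cutoff = now - (CUTOFF_DAYS * SECONDS_PER_DAY)

  # Assumption: a repo that has never been queried (nil) is not treated as stale.
  return 'last_activity_before_cutoff' if repo.last_queried_at && repo.last_queried_at < cutoff
  return 'duo_features_disabled' unless repo.duo_features_enabled
  # namespace_state is nil when the enabled namespace no longer exists.
  return 'namespace_invalid' unless repo.namespace_state == :ready

  nil
end
```

A repo last queried 5 months ago would be classified as last_activity_before_cutoff here, which matches the local validation step further down.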

References

Database queries

Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).last_activity_before_cutoff.mark_as_pending_deletion_with_reason('last_activity_before_cutoff')

20.054 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45417/commands/139233

Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).duo_features_disabled.mark_as_pending_deletion_with_reason('duo_features_disabled')

55.489 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45381/commands/139105

Ai::ActiveContext::Code::Repository.with_active_connection.not_in_delete_states.limit(1000).namespace_invalid.mark_as_pending_deletion_with_reason('namespace_invalid')

23.689 ms https://postgres.ai/console/gitlab/gitlab-production-main/sessions/45381/commands/139107
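
All three statements share one shape: skip records already in a delete state, filter by one criterion, cap the batch at 1000, and mark the rest. A minimal in-memory sketch of that shape follows. The Record struct, the DELETE_STATES list, and the mark_as_pending_deletion helper are hypothetical stand-ins, not the scopes from this MR. In ActiveRecord the position of limit in the chain does not matter because the scopes merge into one query, so the sketch applies the cap after filtering.

```ruby
# Hypothetical in-memory record; the real model is an ActiveRecord class.
Record = Struct.new(:id, :state, :metadata, keyword_init: true)

# Assumption: the states that not_in_delete_states excludes.
DELETE_STATES = %i[pending_deletion deleted].freeze
BATCH_LIMIT = 1000

# Marks up to `limit` matching records as pending deletion and records
# the reason in metadata. Returns the number of records marked.
def mark_as_pending_deletion(records, reason, limit: BATCH_LIMIT, &matches)
  batch = records
    .reject { |record| DELETE_STATES.include?(record.state) } # not_in_delete_states
    .select(&matches)                                          # one criterion per call
    .first(limit)                                              # cap the batch size

  batch.each do |record|
    record.state = :pending_deletion
    record.metadata = record.metadata.merge('delete_reason' => reason)
  end

  batch.size
end
```

Because marked records drop out of the candidate set, a later run would pick up any records that did not fit in the previous batch, under the assumptions above.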

How to set up and validate locally

  • Create repository records in various states, including :delete
  • Set a repo's last_queried_at to 5 months prior
  • Set a repo's enabled_namespace_id to nil
  • Set a repo's enabled_namespace to state :pending
  • Set a repo's project.project_setting to duo_features_enabled: false
  • Set a repo's project.namespace.namespace_settings to duo_features_enabled: false
  • Run the mark_repository_as_pending_deletion task and note that these repos are marked as pending_deletion with a delete_reason set

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Madelein van Niekerk
