Delete invalid enabled namespaces
What does this MR do and why?
Adds a scheduled event to clean up enabled namespace records that are invalid.
- For non-SaaS instances, we check whether the instance is valid. If it is valid, we return early and do nothing. If it is not, we delete all enabled namespace records in batches of 10_000, up to 100_000 records per run, and then re-emit the event with the last processed ID so that processing continues with the next 100_000 records.
- For SaaS, we check the eligibility of each namespace. We fetch a batch of 10_000 IDs, check which of them are valid, and delete the invalid ones. After 100_000 records have been processed, we re-emit the event to process the next 100_000 records.
- The worker runs once a day.
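The batching and re-emit logic described above can be sketched in plain Ruby. This is a minimal in-memory illustration, not the worker's actual code: the method name `process_invalid_records`, its parameters, and the use of an array of IDs in place of database rows are all assumptions made for the example.

```ruby
require "set"

# Hypothetical constants mirroring the sizes mentioned in the description.
BATCH_SIZE = 10_000
PROCESSING_LIMIT = 100_000

# records: sorted array of integer IDs standing in for enabled namespace rows.
# valid_ids: Set of IDs considered valid (only consulted when saas is true).
# Returns the remaining records, the number of processed records, and the ID
# to re-emit the event with (nil when the limit was not reached).
def process_invalid_records(records, valid_ids:, saas:, last_processed_id: 0,
                            batch_size: BATCH_SIZE, limit: PROCESSING_LIMIT)
  remaining = records.dup
  processed = 0
  reemit_from = nil

  loop do
    # Next batch, ordered by ID, starting after the last processed ID.
    batch = remaining.select { |id| id > last_processed_id }.first(batch_size)
    break if batch.empty?

    # SaaS: delete only the invalid IDs. Non-SaaS: delete the whole batch.
    to_delete = saas ? batch.reject { |id| valid_ids.include?(id) } : batch
    remaining -= to_delete

    processed += batch.size
    last_processed_id = batch.last

    # Stop at the per-run limit and hand the cursor to the next run.
    if processed >= limit
      reemit_from = last_processed_id
      break
    end
  end

  { remaining: remaining, processed: processed, reemit_from: reemit_from }
end
```

In this sketch, a non-nil `reemit_from` stands for re-emitting the event: passing it back in as `last_processed_id` continues with the next slice of records.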
flowchart TD
Start([Event Received]) --> CheckIndexing{Indexing enabled?}
CheckIndexing -->|No| End1([Return false])
CheckIndexing -->|Yes| CheckValidity{GitLab.com OR<br/>instance invalid?}
CheckValidity -->|No| End2([Return false])
CheckValidity -->|Yes| ProcessBatches[Process in batches<br/>BATCH_SIZE: 10,000]
ProcessBatches --> GetBatch[Get next batch<br/>ordered by ID]
GetBatch --> IsGitLabCom{GitLab.com?}
IsGitLabCom -->|Yes| FilterInvalid[Filter invalid namespaces<br/>without valid subscriptions]
FilterInvalid --> DestroyInvalid[Destroy invalid records]
IsGitLabCom -->|No| DestroyAll[Destroy all records<br/>in batch]
DestroyInvalid --> CheckLimit{Processed >= 100,000?}
DestroyAll --> CheckLimit
CheckLimit -->|Yes| Reemit[Re-emit event with<br/>last_processed_id]
Reemit --> LogMetadata[Log metadata]
CheckLimit -->|No| MoreBatches{More batches?}
MoreBatches -->|Yes| GetBatch
MoreBatches -->|No| LogMetadata
LogMetadata --> End3([Complete])
A feature flag controls adding the event to the event store per:
References
- [Index state tracking: Deletion] Find enabled n... (#580233 - closed)
- [FF] `active_context_code_event_invalid_enabled... (#580273)
Query plans
Only on GitLab.com
Getting valid namespaces from 10_000 ids
Ai::ActiveContext::Code::EnabledNamespace.valid_saas_namespaces.id_in(batch_namespace_ids).pluck_primary_key
290.596 ms (25.221 ms warmed)
`batch.destroy_all` issues a DELETE query for every record. This is needed so that the callbacks run and perform the FK cleanups.
DELETE FROM "p_ai_active_context_code_enabled_namespaces" WHERE "p_ai_active_context_code_enabled_namespaces"."id" = 1314
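To illustrate the trade-off, here is a small plain-Ruby sketch that contrasts a per-record destroy, which runs callbacks, with a single bulk delete, which does not. It does not use ActiveRecord: `FakeEnabledNamespace`, the `store` set, and the `log` array are hypothetical stand-ins, and the logged strings are not real SQL.

```ruby
require "set"

# Hypothetical stand-in for a model with a destroy callback.
class FakeEnabledNamespace
  attr_reader :id

  def initialize(id, store:, log:)
    @id = id
    @store = store
    @log = log
  end

  # Mimics a per-record destroy: run the callback, then remove the single row.
  def destroy
    before_destroy
    @store.delete(id)
    @log << "DELETE id=#{id}"
  end

  private

  # Stand-in for callback-driven cleanup of dependent rows.
  def before_destroy
    @log << "cleanup dependents of id=#{id}"
  end
end

# One delete per record, each preceded by its callback.
def destroy_all(ids, store:, log:)
  ids.each { |id| FakeEnabledNamespace.new(id, store: store, log: log).destroy }
end

# One bulk statement; callbacks never run.
def delete_all(ids, store:, log:)
  ids.each { |id| store.delete(id) }
  log << "DELETE id IN (#{ids.join(', ')})"
end
```

The per-record path costs one statement per row, which is why the deletion is batched and capped per run.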
How to set up and validate locally
- Ensure your GDK has a valid license and has beta settings enabled
::Gitlab::CurrentSettings.instance_level_ai_beta_features_enabled? && ::License.ai_features_available?
- Create enabled namespaces by running the `create_enabled_namespace` event
Ai::ActiveContext::Code::SchedulingWorker.new.perform("create_enabled_namespace")
- Run the deletion event
Ai::ActiveContext::Code::SchedulingWorker.new.perform("process_invalid_enabled_namespace")
- [Optional] Simulate SaaS and run the events again
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Madelein van Niekerk
