Worker: adjust iterating for subgroups and projects
What does this MR do and why?
Related to #214601 (closed)
Discovered while rolling out the feature flag: using namespace.all_projects
caused a timeout for root-level namespaces (Kibana log, internal).
While monitoring in Grafana, I also noticed that the worker was running for a long time (more than 10 minutes).
This MR refactors how the worker iterates through a namespace's subgroups and projects. It now iterates recursively through direct subgroups and direct projects. The workers are enqueued with a random delay to avoid flooding the system for a group with many subgroups or projects.
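The traversal described above can be sketched roughly as follows. This is a minimal, self-contained simulation and not the worker's actual code: `Node`, `FakeQueue`, `MAX_DELAY_SECONDS`, `enqueue_children`, and `simulate` are hypothetical names, the 60-second bound is an assumption, and `perform_in` is only loosely modelled on Sidekiq's delayed scheduling.

```ruby
# Hypothetical stand-in for a namespace with direct subgroups and direct projects.
Node = Struct.new(:id, :subgroups, :project_ids)

# Hypothetical in-memory queue; a real worker would schedule background jobs instead.
class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def perform_in(delay, type, id)
    @jobs << { delay: delay, type: type, id: id }
  end
end

MAX_DELAY_SECONDS = 60 # assumed upper bound for the random delay

# Handles one namespace: schedules one job per direct subgroup and per direct
# project, each with a random delay, instead of loading all descendants at once.
def enqueue_children(node, queue, rng)
  node.subgroups.each do |subgroup|
    queue.perform_in(rng.rand(0..MAX_DELAY_SECONDS), :namespace, subgroup.id)
  end

  node.project_ids.each do |project_id|
    queue.perform_in(rng.rand(0..MAX_DELAY_SECONDS), :project, project_id)
  end
end

# Simulates the namespace jobs running: each one repeats the same step for its
# own direct children, so the whole hierarchy is covered level by level.
def simulate(root, rng: Random.new)
  queue = FakeQueue.new
  pending = [root]

  until pending.empty?
    node = pending.shift
    enqueue_children(node, queue, rng)
    pending.concat(node.subgroups)
  end

  queue.jobs
end

leaf = Node.new(3, [], [30, 31])
mid  = Node.new(2, [leaf], [20])
root = Node.new(1, [mid], [10])

jobs = simulate(root)
jobs.each { |job| puts "#{job[:type]} #{job[:id]} in #{job[:delay]}s" }
```

Each job only touches one level of the hierarchy, so no single job has to load every project under a root namespace, and the random delay spreads the resulting jobs over time.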
Database
Explain Plans for namespace batching
Before: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/16835/commands/57301
After: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/16914/commands/57442
Screenshots or screen recordings
N/A
How to set up and validate locally
The project you use for testing must be a nested project (that is, inside a subgroup).
I created a subgroup under the Flightjs group and moved the Flight project into it.
Example: http://gdk.test:3000/flightjs/flight-little/Flight
- set up GDK for Elasticsearch
- create the indexes:
bundle exec rake gitlab:elastic:index
- enable advanced search in the Admin Settings Advanced Search UI
- enable the feature flag:
Feature.enable(:search_index_integrity)
- perform a group-level code search
- verify that results come back
- verify that some blobs exist in the Elasticsearch instance; replace the integer 7 in the JSON below with the project ID from step 5
curl --request POST \
  --url http://localhost:9200/gitlab-development/_count \
  --header 'Content-Type: application/json' \
  --data '{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "type": {
              "value": "blob"
            }
          }
        },
        {
          "term": {
            "project_id": {
              "value": 7
            }
          }
        }
      ]
    }
  }
}' | jq
{
  "count": 43,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  }
}
- delete all of those blobs from the index; replace 7 with the project ID from step 5
curl --request POST \
  --url http://localhost:9200/gitlab-development/_delete_by_query \
  --header 'Content-Type: application/json' \
  --cookie 'perf_bar_enabled=true; experimentation_subject_id=IjQwMjUxOWZlLWIwYWItNDZlNi1hY2VkLTRjMWE0NzZkMjAyNCI%253D--dc985bd87edc1f47a1018fbc26fdc35dbeab34ba; BetterErrors-2.9.1-CSRF-Token=67dca20f-92f6-4685-8085-56fa84085f14' \
  --data '{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "type": {
              "value": "blob"
            }
          }
        },
        {
          "term": {
            "project_id": {
              "value": 7
            }
          }
        }
      ]
    }
  }
}' | jq
{
  "took": 18,
  "timed_out": false,
  "total": 43,
  "deleted": 43,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
- re-run the group search
- verify in log/elasticsearch.log that the index integrity worker runs for the root namespace and enqueues new workers for its child namespaces and projects:
{"severity":"INFO","time":"2023-03-30T16:45:24.631Z","correlation_id":"01GWSPHHGWWEK5YHYHRJRAQTX7","class":"Search::NamespaceIndexIntegrityWorker","message":"enqueueing all children for namespace","namespace_id":85,"meta.caller_id":"Search::NamespaceIndexIntegrityWorker","meta.remote_ip":"127.0.0.1","meta.feature_category":"global_search","meta.user":"root","meta.user_id":1,"meta.root_namespace":"flightjs","meta.client_id":"user/1","meta.root_caller_id":"SearchController#show","job_status":"running","queue":"default","jid":"396147774003244fda9ef976"}
{"severity":"INFO","time":"2023-03-30T16:45:24.633Z","correlation_id":"01GWSPHHGWWEK5YHYHRJRAQTX7","class":"Search::NamespaceIndexIntegrityWorker","message":"enqueueing all projects for namespace","namespace_id":85,"meta.caller_id":"Search::NamespaceIndexIntegrityWorker","meta.remote_ip":"127.0.0.1","meta.feature_category":"global_search","meta.user":"root","meta.user_id":1,"meta.root_namespace":"flightjs","meta.client_id":"user/1","meta.root_caller_id":"SearchController#show","job_status":"running","queue":"default","jid":"396147774003244fda9ef976"}
{"severity":"WARN","time":"2023-03-30T16:45:42.433Z","correlation_id":"e8781f69df585dfc8745b780eb4a0bd1","class":"Search::IndexRepairService","message":"blob documents missing from index for project","namespace_id":85,"root_namespace_id":33,"project_id":7,"project_last_repository_updated_at":"2023-03-28T13:37:45.001Z","index_status_last_commit":"eddf2a972e4a95ee6aba7e2b17969872d63b38e4","index_status_indexed_at":"2023-03-30T16:00:01.916Z","repository_size":975175}
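The log check in the last step can also be done with a short script instead of reading the file by eye. This is a sketch under assumptions: the helper name `integrity_log_entries` and the constant `WORKER_CLASSES` are hypothetical, and it assumes each line of log/elasticsearch.log is a single JSON object with the `class`, `message`, and `namespace_id` keys shown in the sample lines above.

```ruby
require 'json'

# Classes whose log entries are relevant to this validation (taken from the sample log lines).
WORKER_CLASSES = [
  'Search::NamespaceIndexIntegrityWorker',
  'Search::IndexRepairService'
].freeze

# Returns the parsed entries written by the integrity worker or the repair
# service, skipping any line that is not a valid JSON object.
def integrity_log_entries(lines)
  lines.filter_map do |line|
    entry = begin
      JSON.parse(line)
    rescue JSON::ParserError
      nil
    end

    entry if entry.is_a?(Hash) && WORKER_CLASSES.include?(entry['class'])
  end
end

# Example usage, run from the GitLab root directory of the GDK (assumed location).
log_path = 'log/elasticsearch.log'
if File.exist?(log_path)
  integrity_log_entries(File.foreach(log_path)).each do |entry|
    puts "#{entry['severity']} #{entry['class']}: #{entry['message']} " \
         "(namespace_id=#{entry['namespace_id']})"
  end
end
```

After the blobs have been deleted and the group search re-run, the output should include the two "enqueueing" messages and the "blob documents missing from index for project" warning for the test project.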
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.