Be able to correctly time the throughput of Global Search (how long it takes to index items)
We recently added timings for how long it takes to index items in Elasticsearch. This is done in two places: the ElasticCommitIndexerWorker for code and wiki, and the ProcessBookkeepingService for everything else that gets indexed.
The timings for the ProcessBookkeepingService are, however, not always correct. We currently compute them by taking the time the job finished and subtracting the updated_at of the item we're measuring. This works fine for items that are updated and then indexed directly. However, there are cases where we index items without updating them. For example, when a customer imports a project into a paid namespace or upgrades from a free to a paid license, we want to index all items, but we don't update the items themselves. We also index related items. This means that the difference between the item's updated_at and the current time will be much longer than the time it actually takes to index.
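To make the skew concrete, here is a minimal Ruby sketch with made-up timestamps. The method name naive_indexing_duration and all values are hypothetical and are not taken from the actual service; the sketch only mirrors the "finish time minus updated_at" calculation described above.

```ruby
# Hypothetical illustration: the current measurement subtracts the item's
# updated_at from the time the job finished.
def naive_indexing_duration(finished_at, updated_at)
  finished_at - updated_at # Time - Time yields seconds as a Float
end

finished_at = Time.utc(2021, 6, 1, 12, 0, 5)

# Case 1: the item was updated and indexed right away.
recently_updated = Time.utc(2021, 6, 1, 12, 0, 0)
puts naive_indexing_duration(finished_at, recently_updated) # prints 5.0

# Case 2: the item was last updated 30 days earlier and is only indexed now,
# for example because its project moved into a paid namespace.
last_touched = Time.utc(2021, 5, 2, 12, 0, 0)
puts naive_indexing_duration(finished_at, last_touched) # prints 2592005.0
```

In the second case the reported value is dominated by the age of the item, not by the indexing work.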
Instead, we should measure the time elapsed since tracking an item (telling the system we want to (re-)index this item). We can't just add this information to the job, since that would prevent deduplication.
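One possible direction, sketched here under assumptions, is to record the first time an item was tracked next to the queue entry instead of inside the job itself. The class TrackedQueue and its methods track, size, and finish are hypothetical names, and the in-memory hash only stands in for whatever store the real queue uses.

```ruby
# Hypothetical in-memory stand-in for the indexing queue. It records when each
# item was first tracked, separately from any job arguments.
class TrackedQueue
  def initialize
    @tracked_at = {} # item reference => time it was first tracked
  end

  # Tracking an item that is already queued keeps the original timestamp,
  # so repeated tracking still collapses into a single entry.
  def track(ref, now: Time.now)
    @tracked_at[ref] ||= now
  end

  def size
    @tracked_at.size
  end

  # Removes the item from the queue and returns the seconds elapsed since it
  # was first tracked, or nil if the item was never tracked.
  def finish(ref, now: Time.now)
    tracked = @tracked_at.delete(ref)
    tracked && (now - tracked)
  end
end

queue = TrackedQueue.new
t0 = Time.utc(2021, 6, 1, 12, 0, 0)

queue.track("Issue 1", now: t0)
queue.track("Issue 1", now: t0 + 2) # duplicate request, still one entry
puts queue.size                             # prints 1
puts queue.finish("Issue 1", now: t0 + 5)   # prints 5.0
```

Keeping the first timestamp means the measurement covers the full wait from the earliest request to the finished index operation. Whether the earliest or the latest tracking time is the right reference point is a design decision this sketch does not settle.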