Processors - Check for an existing comment last
Context
Some processors take up to 90 seconds just to run their applicable? or comment_cleanup_applicable? method. This happens when:
- The issue has A LOT of comments (e.g. this one)
- We're using UniqueComment to see whether we already posted a comment from that processor (we have to go through ALL the comments in the issue)
Closes Three processors causing 4.5-minute webhook pro... (#1668 - closed).
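To illustrate why the lookup described above gets slow, here is a minimal sketch of a hidden-marker search over paginated comments. It does not use the project's actual UniqueComment class: FakeNotesApi, hidden_marker, already_commented?, the marker text, and the page size of 100 are all assumptions made for the example.

```ruby
# Hypothetical stand-in for a paginated notes API: it serves comment
# bodies one page at a time and counts how many requests were made.
class FakeNotesApi
  attr_reader :requests

  def initialize(comments, per_page: 100)
    @pages = comments.each_slice(per_page).to_a
    @requests = 0
  end

  # Returns the comments for the given 1-based page, or [] past the end.
  def page(number)
    @requests += 1
    @pages[number - 1] || []
  end
end

# Hypothetical marker format; the real hidden comment text may differ.
def hidden_marker(processor_name)
  "<!-- hidden marker for #{processor_name} -->"
end

# Walks the pages until the marker is found or the comments run out.
def already_commented?(api, processor_name)
  marker = hidden_marker(processor_name)
  page_number = 1
  loop do
    comments = api.page(page_number)
    return false if comments.empty?
    return true if comments.any? { |body| body.include?(marker) }

    page_number += 1
  end
end

# Worst case: the processor never commented, so every page is fetched.
api = FakeNotesApi.new(Array.new(2_500) { |i| "comment #{i}" })
found = already_commented?(api, "ExampleProcessor")
puts "found=#{found} requests=#{api.requests}"
```

In this sketch the expensive case is the one where no comment exists yet: the search can only answer "no" after reading every page, and that happens once per processor per webhook.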
What does this MR do and why?
- We now do API calls at the very end of the applicable? and comment_cleanup_applicable? methods, which are called for every processor, for every webhook.
  - This reduces the 4 webhooks I used for testing from 90s for each of the three problematic processors to ~1-2s for each 🚀
- We filter certain kinds of webhooks that were triggered by automation
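The ordering idea in the list above can be sketched roughly as follows. This is not the code from this MR: ExampleProcessor, the event hash keys, and the "example-label" value are hypothetical, and the automation filter is assumed to be a simple bot-author flag.

```ruby
# Hypothetical processor; names are illustrative, not the project's classes.
class ExampleProcessor
  attr_reader :api_calls

  def initialize(event)
    @event = event
    @api_calls = 0
  end

  def applicable?
    # Cheap, in-memory checks come first. `&&` short-circuits, so the
    # expensive lookup only runs when every earlier check has passed.
    !triggered_by_automation? &&
      relevant_event? &&
      !already_commented?
  end

  private

  # Assumed filter: skip webhooks whose author is a bot.
  def triggered_by_automation?
    @event[:author_is_bot]
  end

  # Assumed cheap check on data already present in the webhook payload.
  def relevant_event?
    @event[:label_added] == "example-label"
  end

  # Stands in for the comment lookup that needs API calls.
  def already_commented?
    @api_calls += 1
    @event[:existing_marker]
  end
end

bot_event = { author_is_bot: true, label_added: "example-label" }
processor = ExampleProcessor.new(bot_event)
puts "applicable=#{processor.applicable?} api_calls=#{processor.api_calls}"
```

With this ordering, a webhook rejected by any cheap check never reaches the comment lookup, which is where the time was going.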
Other things I tried
I tried to search via the https://docs.gitlab.com/api/search/ API, but:
- It would return ALL the issues matching the hidden comment we add for a processor (potentially thousands of issues). It seems we cannot search AND filter for a specific issue at the same time.
- I thought of changing the hidden UniqueComment comment from processor_name to also include the issue/MR ID. It is not worth pursuing at this time, as I expect this MR to fix 90+% of the long-webhook-processing cases.
Expected impact & dry-runs
- Faster webhook processing overall when issues have a lot of comments (applicable issues with a lot of comments would still be slow)
- Far fewer API calls
Action items
- If adding environment variables for reactive processors, update config/triage-web.yaml and .gitlab/ci/triage-web.yml
- (If applicable) Add documentation to the handbook pages for Triage Operations =>
- (If applicable) Identify the affected groups and how to communicate to them:
  - /cc @ person_or_group =>
  - Relevant Slack channels =>
  - Engineering week-in-review
Edited by David Dieulivol