Model Selection code using a lot of CPU time in Duo Workflow Service

In some load testing we saw the following:

[Screenshot: Screenshot_2025-10-23_at_2.50.22_pm]

source

Read more at https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/3794#note_2838923664 .

It seems strange that we spend so much CPU time selecting a model in our flows. This may be related to YAML parsing: perhaps we parse the YAML on every request rather than parsing it once upfront and caching the result? High CPU usage is especially problematic in asyncio Python code, because CPU-bound work blocks the event loop and stalls every other in-flight request. Anything we can do to pre-parse YAML or JSON and cache it could substantially improve throughput.
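
As a rough illustration of the parse-once-and-cache idea, here is a minimal sketch. It is not the actual Duo Workflow Service code: `load_model_config`, `warm_cache`, and the file layout are hypothetical names chosen for the example. It uses the standard-library `json` module so it runs without extra dependencies; the same pattern should apply to YAML by swapping `json.load` for PyYAML's `yaml.safe_load`.

```python
import json
from functools import lru_cache
from types import MappingProxyType
from typing import Iterable


@lru_cache(maxsize=None)
def load_model_config(path: str) -> MappingProxyType:
    """Parse a config file once; repeat calls for the same path hit the cache.

    Hypothetical helper. Assumes the file's top level is a mapping.
    """
    with open(path, encoding="utf-8") as f:
        parsed = json.load(f)  # for YAML: yaml.safe_load(f)
    # Wrap in a read-only view so callers cannot mutate the shared
    # cached object at the top level.
    return MappingProxyType(parsed)


def warm_cache(paths: Iterable[str]) -> None:
    """Parse all known config files at startup, before serving requests."""
    for path in paths:
        load_model_config(path)
```

Calling `warm_cache(...)` once during service startup would move the parsing cost out of the request path, so per-request model selection becomes a dictionary lookup. If the files can change at runtime, the cache would need explicit invalidation (for example `load_model_config.cache_clear()`), which this sketch does not handle.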
