Handle non-structured error responses from the ai-gateway
What does this MR do and why?
In some failure cases the AI gateway does not return a structured error, and when that happened the error handling failed to parse the response. As someone who is just learning how GitLab AI works, I found this kept leading me up the garden path and wasting debugging time.
{"severity":"ERROR","time":"2025-11-07T03:20:16.809Z","correlation_id":"01K9E58BTCKBF3WTT3ZYWNE5GX","meta.caller_id":"Llm::CompletionWorker","meta.feature_category":"ai_abstraction_layer","meta.organization_id":1,"meta.remote_ip":"127.0.0.1","meta.http_router_rule_action":"classify","meta.http_router_rule_type":"SESSION_PREFIX","meta.user":"root","meta.gl_user_id":1,"meta.client_id":"user/1","meta.root_caller_id":"GraphqlController#execute","exception.class":"TypeError","exception.message":"String does not have #dig method","exception.backtrace":["ee/lib/gitlab/llm/ai_gateway/client.rb:76:in `dig'","ee/lib/gitlab/llm/ai_gateway/client.rb:76:in `stream'","ee/lib/gitlab/llm/chain/requests/ai_gateway.rb:38:in `request'","ee/lib/gitlab/llm/chain/concerns/ai_dependent.rb:33:in `request'","ee/lib/gitlab/llm/chain/tools/identifier.rb:23:in `block in perform'","ee/lib/gitlab/llm/chain/tools/identifier.rb:22:in `perform'","ee/lib/gitlab/llm/chain/tools/tool.rb:47:in `execute'","ee/lib/gitlab/duo/chat/react_executor.rb:218:in `process_tool_action'","ee/lib/gitlab/duo/chat/react_executor.rb:53:in `block in execute'","ee/lib/gitlab/duo/chat/react_executor.rb:46:in `execute'","ee/lib/gitlab/llm/completions/chat.rb:147:in `agent_or_tool_response'","ee/lib/gitlab/llm/completions/chat.rb:72:in `execute'","ee/app/services/llm/internal/completion_service.rb:44:in `block in execute'","ee/app/services/llm/internal/completion_service.rb:65:in `with_tracking'","ee/app/services/llm/internal/completion_service.rb:22:in `execute'","ee/app/workers/llm/completion_worker.rb:74:in `perform'","ee/lib/gitlab/sidekiq_middleware/set_session/server.rb:18:in `block in call'","lib/gitlab/session.rb:11:in `with_session'","ee/lib/gitlab/sidekiq_middleware/set_session/server.rb:17:in `call'","lib/gitlab/sidekiq_middleware/identity/restore.rb:12:in `call'","lib/gitlab/sidekiq_middleware/resource_usage_limit/middleware.rb:16:in `perform'","lib/gitlab/sidekiq_middleware/resource_usage_limit/server.rb:8:in 
`call'","lib/gitlab/sidekiq_middleware/skip_jobs.rb:51:in `call'","lib/gitlab/sidekiq_middleware/concurrency_limit/middleware.rb:37:in `perform'","lib/gitlab/sidekiq_middleware/concurrency_limit/server.rb:11:in `call'","lib/gitlab/sidekiq_middleware/throttling/middleware.rb:18:in `perform'","lib/gitlab/sidekiq_middleware/throttling/server.rb:8:in `call'","lib/gitlab/sidekiq_middleware/pause_control/strategies/base.rb:31:in `perform'","lib/gitlab/sidekiq_middleware/pause_control/strategy_handler.rb:22:in `perform'","lib/gitlab/sidekiq_middleware/pause_control/server.rb:8:in `call'","lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executed.rb:17:in `perform'","lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:44:in `perform'","lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'","lib/click_house/migration_support/sidekiq_middleware.rb:7:in `call'","lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'","lib/gitlab/sidekiq_middleware/worker_context/server.rb:19:in `block in call'","lib/gitlab/application_context.rb:177:in `block in use'","lib/gitlab/application_context.rb:177:in `use'","lib/gitlab/application_context.rb:99:in `with_context'","lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `call'","lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'","lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'","lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `block in call'","lib/gitlab/database/query_analyzer.rb:83:in `within'","lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `call'","lib/gitlab/sidekiq_middleware/admin_mode/server.rb:14:in `call'","lib/gitlab/sidekiq_middleware/set_ip_address.rb:10:in `block in call'","lib/gitlab/ip_address_state.rb:11:in `with'","lib/gitlab/sidekiq_middleware/set_ip_address.rb:9:in `call'","lib/gitlab/sidekiq_middleware/instrumentation_logger.rb:9:in `call'","lib/gitlab/sidekiq_middleware/batch_loader.rb:7:in 
`call'","lib/gitlab/sidekiq_middleware/extra_done_log_metadata.rb:7:in `call'","lib/gitlab/sidekiq_middleware/server_metrics.rb:111:in `block in call'","lib/gitlab/sidekiq_middleware/server_metrics.rb:139:in `block in instrument'","lib/gitlab/metrics/background_transaction.rb:33:in `run'","lib/gitlab/sidekiq_middleware/server_metrics.rb:139:in `instrument'","lib/gitlab/sidekiq_middleware/server_metrics.rb:110:in `call'","lib/gitlab/query_limiting/sidekiq_middleware.rb:12:in `block in call'","lib/gitlab/query_limiting/transaction.rb:48:in `run'","lib/gitlab/query_limiting/sidekiq_middleware.rb:11:in `call'","lib/gitlab/sidekiq_middleware/request_store_middleware.rb:8:in `block in call'","lib/gitlab/sidekiq_middleware/request_store_middleware.rb:7:in `call'","lib/gitlab/sidekiq_middleware/monitor.rb:10:in `block in call'","lib/gitlab/sidekiq_daemon/monitor.rb:46:in `within_job'","lib/gitlab/sidekiq_middleware/monitor.rb:9:in `call'","lib/gitlab/sidekiq_middleware/shard_awareness_validator.rb:10:in `block in call'","lib/gitlab/sidekiq_sharding/validator.rb:42:in `enabled'","lib/gitlab/sidekiq_middleware/shard_awareness_validator.rb:9:in `call'","lib/gitlab/sidekiq_middleware/size_limiter/server.rb:13:in `call'","lib/gitlab/sidekiq_logging/structured_logger.rb:21:in `call'"],"user.username":"root","tags.queue":"default","tags.jid":"4e6c958342767dba839b30ac","tags.program":"sidekiq","tags.locale":"en","tags.feature_category":"ai_abstraction_layer","tags.correlation_id":"01K9E58BTCKBF3WTT3ZYWNE5GX"}
The issue suggests that the response from the ai-gateway is nil, but the exception is "String does not have #dig method". I found that the only way to get this exception is if the response is a JSON string.
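For reference, Ruby raises this `TypeError` when `dig` reaches a String partway through the key path. The sketch below reproduces the exception and shows one defensive way to pull a message out of a response body. It is an illustration only: `extract_error_message` is a hypothetical helper, not the code in `ee/lib/gitlab/llm/ai_gateway/client.rb`, and the `{"detail": [{"msg": ...}]}` shape for structured errors is an assumption.

```ruby
require 'json'

# Reproduce the exception from the log: dig walks into a String and
# cannot continue with the remaining keys.
begin
  { 'detail' => 'gateway down' }.dig('detail', 0, 'msg')
rescue TypeError => e
  puts e.message # String does not have #dig method
end

# Hypothetical helper (not the actual client code). Assumes a structured
# error looks like {"detail" => [{"msg" => "..."}]} and falls back to the
# raw body for anything else.
def extract_error_message(body)
  parsed = JSON.parse(body)
  detail = parsed.is_a?(Hash) ? parsed['detail'] : parsed

  case detail
  when Array
    # Structured case: take the message of the first entry.
    first = detail.first
    first.is_a?(Hash) ? first['msg'].to_s : first.to_s
  when String
    # Non-structured case: the error is just a JSON string.
    detail
  else
    body.to_s
  end
rescue JSON::ParserError, TypeError
  # Body is not JSON at all (for example an HTML error page) or is nil.
  body.to_s
end
```

Checking the type before indexing means a structured error, a bare JSON string, and a non-JSON body all produce a readable message instead of a secondary exception that hides the original failure.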
References
Screenshots or screen recordings
| Before | After |
|---|---|
How to set up and validate locally
Replicating this requires a broken ai-gateway, and I don't know how to write reliable instructions for that.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.