Docs: Add Prometheus metrics setup for AI Gateway monitoring

Summary

The GitLab AI Gateway and Duo Workflow Service support Prometheus metrics for monitoring, but the required configuration steps are not documented in our AI Gateway installation documentation. This is causing confusion for customers who need to enable monitoring for their self-hosted AI Gateway deployments.

Problem

When customers attempt to enable Prometheus metrics for monitoring their AI Gateway deployment, they cannot find the necessary configuration in our official documentation. The environment variables and setup steps required to expose metrics endpoints are not mentioned in the installation guide or operational documentation.

Current Documentation Gap

The following critical information is missing from https://docs.gitlab.com/install/install_ai_gateway/:

AI Gateway Metrics Configuration

The AI Gateway metrics endpoint should start automatically when properly configured, but customers need to know about these environment variables:

`AIGW_FASTAPI__METRICS_HOST=0.0.0.0`
`AIGW_FASTAPI__METRICS_PORT=8082`

Metrics endpoint: http://localhost:8082/metrics

Code reference: Configuration is defined in `ai_gateway/config.py` using Pydantic `BaseSettings`.
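
For illustration, a minimal sketch of enabling the metrics endpoint on a container-based deployment. The image name is a placeholder (use the AI Gateway image from your existing deployment), and the port mapping assumes the values shown above:

```shell
# Sketch: expose the AI Gateway Prometheus metrics endpoint.
# <ai-gateway-image> is a placeholder for your existing AI Gateway image.
docker run -d \
  -e AIGW_FASTAPI__METRICS_HOST=0.0.0.0 \
  -e AIGW_FASTAPI__METRICS_PORT=8082 \
  -p 8082:8082 \
  <ai-gateway-image>

# Verify that metrics are being served.
curl http://localhost:8082/metrics
```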

Duo Workflow Service Metrics Configuration

The Duo Workflow Service requires explicit configuration (no defaults) for Prometheus metrics:

`PROMETHEUS_METRICS__ADDR=0.0.0.0`
`PROMETHEUS_METRICS__PORT=8083`

Metrics endpoint: http://localhost:8083/metrics

Code reference: Configuration is defined in `duo_workflow_service/monitoring.py`.
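
Similarly, a minimal sketch for the Duo Workflow Service. Unlike the AI Gateway, these variables have no defaults and must be set explicitly; the image name is again a placeholder:

```shell
# Sketch: expose the Duo Workflow Service Prometheus metrics endpoint.
# <duo-workflow-service-image> is a placeholder for your existing image.
docker run -d \
  -e PROMETHEUS_METRICS__ADDR=0.0.0.0 \
  -e PROMETHEUS_METRICS__PORT=8083 \
  -p 8083:8083 \
  <duo-workflow-service-image>

# Verify that metrics are being served.
curl http://localhost:8083/metrics
```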

Expected Documentation Updates

The AI Gateway installation documentation should include:

  1. A new "Monitoring and Observability" section that covers:
    • How to enable Prometheus metrics
    • Required environment variables for both AI Gateway and Duo Workflow Service
    • Metrics endpoints and how to access them
    • Example Prometheus scrape configuration (see the sketch after this list)
    • Available metrics and their descriptions
  2. Environment variable reference table similar to other configuration pages
  3. Integration guidance for:
    • Prometheus/Grafana setup
    • Common monitoring scenarios
    • Recommended alerts for production deployments
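
As a starting point for the scrape configuration and alerting items above, a minimal sketch. Job names, hostnames, and file paths are illustrative and would need to be merged into an existing Prometheus setup; the alert relies only on the standard Prometheus `up` metric, since the gateway-specific metric names still need to be documented:

```shell
# Sketch: a minimal Prometheus configuration that scrapes both services.
# Hostnames, job names, and paths are placeholders for illustration only.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 30s

rule_files:
  - ai_gateway_alerts.yml

scrape_configs:
  - job_name: ai-gateway
    static_configs:
      - targets: ['ai-gateway.example.com:8082']
  - job_name: duo-workflow-service
    static_configs:
      - targets: ['duo-workflow.example.com:8083']
EOF

# Sketch: alert when either metrics endpoint stops responding.
cat > ai_gateway_alerts.yml <<'EOF'
groups:
  - name: ai-gateway
    rules:
      - alert: AIGatewayMetricsDown
        expr: up{job=~"ai-gateway|duo-workflow-service"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "AI Gateway or Duo Workflow Service metrics endpoint is down"
EOF
```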

Customer Impact

  • All self-hosted AI Gateway customers: Monitoring is a critical operational requirement

This is particularly important as more customers adopt self-hosted AI Gateway deployments for GitLab Duo features.

Suggested Documentation Structure

## Monitoring and Observability

### Enable Prometheus metrics

GitLab AI Gateway and Duo Workflow Service support Prometheus metrics for monitoring.

#### AI Gateway metrics

Configure the following environment variables to enable AI Gateway metrics:

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `AIGW_FASTAPI__METRICS_HOST` | Metrics server bind address | - | Yes |
| `AIGW_FASTAPI__METRICS_PORT` | Metrics server port | - | Yes |

Example configuration:

[configuration example would go here]

Metrics are available at: `http://<hostname>:8082/metrics`
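
One possible filler for the configuration example placeholder above, as a sketch that assumes the gateway process is started directly (adjust for container deployments):

```shell
# Sketch: enable the metrics endpoint before starting the AI Gateway process.
export AIGW_FASTAPI__METRICS_HOST=0.0.0.0
export AIGW_FASTAPI__METRICS_PORT=8082
```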

#### Duo Workflow Service metrics

Configure the following environment variables to enable Duo Workflow Service metrics:

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `PROMETHEUS_METRICS__ADDR` | Metrics server bind address | None | Yes |
| `PROMETHEUS_METRICS__PORT` | Metrics server port | None | Yes |

[continued with examples and Prometheus configuration]

CC @manojmj
