Proposal: Secure, Compliant, Zero-Latency Logging for Model Gateway
Depends on the outcome of this discussion: https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/357#note_1759867401
Problems to solve
- We want to log prompts and outputs in the model-gateway (ai-gateway) FastAPI webapp
- We don't want to log secrets, keys, PII and other sensitive data
- We don't want to introduce latency when redacting sensitive data from logs
- Destination: the production Kibana instance
Related Issues and MRs
- https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/357
- !519 (closed; not to be merged)
Proposal
- Use FastAPI's Background Tasks to write logs after the response has been returned, so the latency impact is zero to negligible (see the sketch after this list)
  - https://fastapi.tiangolo.com/tutorial/background-tasks/
- Use middleware to inject the logger background task on all routes: https://fastapi.tiangolo.com/advanced/middleware/
  - Are there any exceptions, i.e. routes we do not want logged?
- Implement the logger background task, which sanitizes and redacts secrets, PII, etc. from logs before publishing to Kibana
- Try out major Python libraries and measure their efficacy:
  - Compile a significant test dataset to verify the solution
  - Seek advice from the legal/compliance teams to verify the solution
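A minimal sketch of the middleware-plus-background-task approach, using Starlette's `BaseHTTPMiddleware` and `BackgroundTask` (which FastAPI builds on); `redact_and_log` and the fields it logs are placeholders, not the actual ai-gateway logger:

```python
from fastapi import FastAPI, Request
from starlette.background import BackgroundTask
from starlette.middleware.base import BaseHTTPMiddleware

app = FastAPI()


def redact_and_log(method: str, path: str, status_code: int) -> None:
    # Placeholder: the real task would sanitize prompt/output payloads with
    # the chosen redaction library before handing them to the structured logger.
    print(f"{method} {path} -> {status_code}")


class AccessLogMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)
        # Attach the logging work as a background task: Starlette runs it only
        # after the response has been sent, so the client sees no added latency.
        response.background = BackgroundTask(
            redact_and_log, request.method, request.url.path, response.status_code
        )
        return response


app.add_middleware(AccessLogMiddleware)
```

Two caveats the spike should validate: capturing the actual prompt/output bodies needs care because request and response bodies are streams inside middleware, and assigning `response.background` overwrites any background task a route handler has already set, so a helper that chains tasks may be needed.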
Guiding Principles
- Leverage the ecosystem
  - Use mechanisms offered by FastAPI (ai-gateway is a FastAPI app, after all) such as middleware, background tasks, etc.
  - Use redaction libraries available in Python (candidate libraries are sketched after this list)
  - Redacting secrets and PII is a solved problem in Python
  - Apart from handling rare edge cases, there is very little need to reinvent the wheel
- Transparent tests and results
  - Write comprehensive, near-real-world tests
  - Publish for internal approvals (legal, compliance, etc.)
  - Publish externally for trust and transparency
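For illustration, two commonly used Python redaction libraries and their out-of-the-box APIs; which (if either) we adopt should follow from the efficacy tests in the next steps:

```python
# Candidate 1: scrubadub, regex/heuristic-based PII scrubbing.
import scrubadub

print(scrubadub.clean("Contact me at jane.doe@example.com"))
# -> "Contact me at {{EMAIL}}"

# Candidate 2: Microsoft Presidio, NER-based detection plus anonymization.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "My name is Jane and my phone number is 212-555-5555"
results = analyzer.analyze(text=text, language="en")
print(anonymizer.anonymize(text=text, analyzer_results=results).text)
```

Neither library targets secrets and API keys specifically, so one of them would likely need to be combined with a secrets-pattern scanner (e.g. the regex corpus behind detect-secrets or GitLab's own secret detection rules).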
Where and what to patch and test
- FastAPI middleware access logger (a redaction-processor sketch for the structured logger follows this list)
  - https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/main/ai_gateway/api/middleware.py?ref_type=heads#L153
  - https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/main/ai_gateway/structured_logging.py?ref_type=heads#L45
  - https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/main/ai_gateway/app.py?ref_type=heads#L39
- API V2 Snowplow events tracker
- API V3 structured logging
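If the structured logger is structlog-based (as structured_logging.py suggests), one natural patch point is a custom processor that scrubs every event before it reaches the renderer. A minimal sketch, with purely illustrative regex patterns (the real set would come from the library chosen above):

```python
import re

import structlog

# Illustrative patterns only; the production set would come from the chosen
# redaction library plus a secrets-pattern corpus.
_PATTERNS = [
    (re.compile(r"glpat-[0-9a-zA-Z_\-]{20}"), "[GITLAB_PAT]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
]


def redact_event(logger, method_name, event_dict):
    """structlog processor: scrub string values before the renderer sees them."""
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern, replacement in _PATTERNS:
                value = pattern.sub(replacement, value)
            event_dict[key] = value
    return event_dict


structlog.configure(
    processors=[
        redact_event,
        structlog.processors.JSONRenderer(),
    ]
)

structlog.get_logger().info("completion", prompt="reach me at jane@example.com")
# -> {"prompt": "reach me at [EMAIL]", "event": "completion"}
```

Note this processor runs synchronously in the logging call, so the heavy lifting (NER models, etc.) should still happen inside the background task; the processor is a cheap last line of defense.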
Next steps
- Compile a comprehensive dataset of near-real-world test data
- Create redactors using various libraries and combinations of libraries
- Test, review the results, and identify the best redactors (see the benchmark sketch after this list)
- Introduce the logger as a background task on handpicked endpoints/routes
- Test for latency impact
- Introduce the logger as a background task on all endpoints using middleware
- Test, measure latency, release
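A minimal sketch of the evaluation harness these steps imply, measuring both efficacy (exact match against a labeled dataset) and per-call latency; the dataset, `naive_redactor`, and the scoring criterion are all placeholders:

```python
import re
import time
from typing import Callable

# Placeholder labeled dataset: (raw input, expected redacted output) pairs.
# The real one would be the comprehensive near-real-world corpus above.
DATASET = [
    ("my token is glpat-aaaaaaaaaaaaaaaaaaaa", "my token is [GITLAB_PAT]"),
    ("email jane.doe@example.com for access", "email [EMAIL] for access"),
]


def naive_redactor(text: str) -> str:
    # Placeholder redactor; real candidates would wrap scrubadub, Presidio, etc.
    text = re.sub(r"glpat-[0-9a-zA-Z_\-]{20}", "[GITLAB_PAT]", text)
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL]", text)


def evaluate(redactor: Callable[[str], str], dataset) -> tuple[float, float]:
    """Return (accuracy, mean seconds per call) for one candidate redactor."""
    hits = 0
    start = time.perf_counter()
    for raw, expected in dataset:
        hits += redactor(raw) == expected
    elapsed = time.perf_counter() - start
    return hits / len(dataset), elapsed / len(dataset)


accuracy, latency = evaluate(naive_redactor, DATASET)
print(f"accuracy={accuracy:.2%}, mean latency={latency * 1e6:.1f} us/call")
```

Exact match is the crudest possible metric; per-entity precision/recall would give a fairer picture of each redactor, but the harness shape stays the same.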