Auth Failure with Reason Codes
Summary
Introduce structured logging for authentication failures (HTTP 401/403/404), enriched with internal reason codes. This will enable operators and security teams to distinguish between different failure causes, correlate events across services, and detect both misconfigurations and abuse more effectively.
Problem Statement
Currently, authentication failures are only observable at the HTTP status code level (401, 403, 404). These codes are too coarse to diagnose issues or detect abuse patterns effectively:
- 401 Unauthorized could mean missing token, expired token, invalid header, or unknown token.
- 403 Forbidden could mean insufficient scope, fine-grained restriction, or job-token limitation.
- 404 Not Found could be an actual missing resource or a masked "forbidden" response.
This lack of granularity makes it difficult to:
- Troubleshoot user/project misconfigurations quickly.
- Detect spikes in specific types of failures (e.g., brute-force attempts with invalid tokens).
- Correlate failures across endpoints and projects.
Proposed Solution
Add reason codes for auth failures to request logs (e.g., invalid_token_format, expired_token, insufficient_scope, job_token_restricted, masked_forbidden).
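As a rough illustration of what such a log entry could look like, here is a minimal Python sketch. It is not the actual implementation: the field names (`event`, `auth_failure_reason`, `correlation_id`), the logger name, and the helper functions are all assumptions made for this example. Only the reason code values come from the proposal above.

```python
import json
import logging
from collections import Counter
from enum import Enum


class AuthFailureReason(str, Enum):
    """Reason codes taken from the proposal; the enum itself is hypothetical."""
    INVALID_TOKEN_FORMAT = "invalid_token_format"
    EXPIRED_TOKEN = "expired_token"
    INSUFFICIENT_SCOPE = "insufficient_scope"
    JOB_TOKEN_RESTRICTED = "job_token_restricted"
    MASKED_FORBIDDEN = "masked_forbidden"


# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("auth_failures")


def log_auth_failure(status, reason, path, correlation_id):
    """Emit one structured (JSON) log line for an auth failure and return it."""
    event = {
        "event": "auth_failure",             # assumed field name
        "status": status,                    # HTTP status sent to the client
        "auth_failure_reason": reason.value, # internal reason code
        "path": path,
        "correlation_id": correlation_id,    # lets events be joined across services
    }
    line = json.dumps(event, sort_keys=True)
    logger.warning(line)
    return line


def count_by_reason(log_lines):
    """Count auth failures per reason code, e.g. to spot a spike in one cause."""
    return Counter(json.loads(line)["auth_failure_reason"] for line in log_lines)


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING, format="%(message)s")
    lines = [
        log_auth_failure(401, AuthFailureReason.EXPIRED_TOKEN, "/example/path", "req-1"),
        log_auth_failure(404, AuthFailureReason.MASKED_FORBIDDEN, "/example/path", "req-2"),
    ]
    print(count_by_reason(lines))
```

With the reason code recorded next to the status code, a 404 that masks a "forbidden" response can be told apart from a genuinely missing resource in the logs, while the client still sees only the 404.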