Improved logs for the Container Registry
Context
We had several issues related to how inconsistent the container registry logs are. Application logs are structured, but access logs are in the Apache combined format. Even though application logs are structured, error details (err.detail) were sometimes a string and sometimes an object. This has caused some problems (e.g., https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/9316).
Considering this, gitlab-org/container-registry#31 (closed) was raised to integrate LabKit logging in the container registry and normalize all logs. The development is now complete, and we need to coordinate the tagging and deployment of a new version of the container registry with these changes, as I assume we'll need to make some changes in ELK to accommodate the new syntax.
Changes
Application Logs
We've refactored the application logs, ensuring that:
- All logs include a correlation_id;
- All keys use snake_case notation;
- There are no duplicate log entries;
- All key values are flat (no more nested objects) and have a consistent type (no more attributes that sometimes are a string and sometimes an object);
- Error messages are consistent and more descriptive.
Access Logs
We're now using LabKit and its defaults. Given that the only format available so far was the Apache combined format, we added a new configuration option (docs) that lets us customize that with new options, text, json and combined (default).
Log Output
All registry logs go to stdout and this was not configurable. A new configuration was now added (docs), with options stderr and stdout (default).
Example
Before
log:
  level: debug
  formatter: json{"environment":"development","go.version":"go1.14.4","instance.id":"b283b13f-aa6d-449e-943b-711b42ba721a","level":"info","msg":"listening on [::]:5000","service":"registry","time":"2020-07-24T10:55:59.771028+01:00","version":"v2.9.1-gitlab-277-gc4913e07"}
{"go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"95e1cf2f-5e4b-4964-b427-662d5d81fe28","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51505","http.request.uri":"/v2/_catalog","http.request.useragent":"HTTPie/2.2.0","level":"debug","msg":"authorizing request","time":"2020-07-24T10:56:11.027668+01:00"}
{"go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"95e1cf2f-5e4b-4964-b427-662d5d81fe28","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51505","http.request.uri":"/v2/_catalog","http.request.useragent":"HTTPie/2.2.0","http.response.contenttype":"application/json; charset=utf-8","http.response.duration":"19.764824ms","http.response.status":200,"http.response.written":29,"level":"info","msg":"response completed","time":"2020-07-24T10:56:11.046591+01:00"}
198.18.0.1 - - [24/Jul/2020:10:56:11 +0100] "GET /v2/_catalog HTTP/1.1" 200 29 "" "HTTPie/2.2.0"
{"go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"b5a8f81f-2057-4694-a459-761e69d98a27","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51507","http.request.uri":"/v2/busybox/tags/list","http.request.useragent":"HTTPie/2.2.0","level":"debug","msg":"authorizing request","time":"2020-07-24T10:56:18.480063+01:00","vars.name":"busybox"}
{"go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"b5a8f81f-2057-4694-a459-761e69d98a27","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51507","http.request.uri":"/v2/busybox/tags/list","http.request.useragent":"HTTPie/2.2.0","http.response.contenttype":"application/json; charset=utf-8","http.response.duration":"5.783075ms","http.response.status":200,"http.response.written":31,"level":"info","msg":"response completed","time":"2020-07-24T10:56:18.484147+01:00"}
198.18.0.1 - - [24/Jul/2020:10:56:18 +0100] "GET /v2/busybox/tags/list HTTP/1.1" 200 31 "" "HTTPie/2.2.0"
{"go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"1ab2874d-27d6-4d4e-92ff-82f8601d919e","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51512","http.request.uri":"/v2/busyboxs/tags/list","http.request.useragent":"HTTPie/2.2.0","level":"debug","msg":"authorizing request","time":"2020-07-24T10:56:23.869266+01:00","vars.name":"busyboxs"}
{"err.code":"name unknown","err.detail":{"data":{"name":"busyboxs"}},"err.message":"repository name not known to registry","go.version":"go1.14.4","http.request.host":"198.18.0.1:5000","http.request.id":"1ab2874d-27d6-4d4e-92ff-82f8601d919e","http.request.method":"GET","http.request.remoteaddr":"198.18.0.1:51512","http.request.uri":"/v2/busyboxs/tags/list","http.request.useragent":"HTTPie/2.2.0","http.response.contenttype":"application/json; charset=utf-8","http.response.duration":"1.711513ms","http.response.status":404,"http.response.written":116,"level":"error","msg":"response completed with error","time":"2020-07-24T10:56:23.870219+01:00","vars.name":"busyboxs"}
198.18.0.1 - - [24/Jul/2020:10:56:23 +0100] "GET /v2/busyboxs/tags/list HTTP/1.1" 404 116 "" "HTTPie/2.2.0"After
log:
  level: debug
  formatter: json
  output: stderr
  accesslog:
    formatter: json{"environment":"development","go_version":"go1.14.4","instance_id":"636dfc7c-70ed-469e-addd-0675ddf299c8","level":"info","msg":"listening on [::]:5000","service":"registry","time":"2020-07-24T10:58:10+01:00","version":"v2.9.1-gitlab-283-g91d5217f"}
{"correlation_id":"pPiyOYCO7C","go_version":"go1.14.4","level":"debug","msg":"authorizing request","time":"2020-07-24T10:58:50+01:00"}
{"content_type":"application/json; charset=utf-8","correlation_id":"pPiyOYCO7C","duration_ms":3,"host":"198.18.0.1:5000","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"198.18.0.1:51532","remote_ip":"198.18.0.1","status":200,"system":"http","time":"2020-07-24T10:58:50+01:00","uri":"/v2/_catalog","user_agent":"HTTPie/2.2.0","written_bytes":29}
{"correlation_id":"KAwrWL1UPC8","go_version":"go1.14.4","level":"debug","msg":"authorizing request","time":"2020-07-24T10:59:00+01:00","vars_name":"busybox"}
{"content_type":"application/json; charset=utf-8","correlation_id":"KAwrWL1UPC8","duration_ms":3,"host":"198.18.0.1:5000","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"198.18.0.1:51534","remote_ip":"198.18.0.1","status":200,"system":"http","time":"2020-07-24T10:59:00+01:00","uri":"/v2/busybox/tags/list","user_agent":"HTTPie/2.2.0","written_bytes":31}
{"correlation_id":"K9Wbn3EoKta","go_version":"go1.14.4","level":"debug","msg":"authorizing request","time":"2020-07-24T10:59:03+01:00","vars_name":"busyboxs"}
{"code":"NAME_UNKNOWN","correlation_id":"K9Wbn3EoKta","detail":"map[name:busyboxs]","error":"name unknown: repository name not known to registry","go_version":"go1.14.4","level":"error","msg":"repository name not known to registry","time":"2020-07-24T10:59:03+01:00","vars_name":"busyboxs"}
{"content_type":"application/json; charset=utf-8","correlation_id":"K9Wbn3EoKta","duration_ms":2,"host":"198.18.0.1:5000","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"198.18.0.1:51536","remote_ip":"198.18.0.1","status":404,"system":"http","time":"2020-07-24T10:59:03+01:00","uri":"/v2/busyboxs/tags/list","user_agent":"HTTPie/2.2.0","written_bytes":116}We want to improve this further, but this is a good start IMO.
Desired settings for GitLab.com
log:
  level: info
  formatter: json
  accesslog:
    formatter: jsonRelated to gitlab-org/container-registry#31 (closed).