
Remove duplicate messages and trim/supplement log data to support Ruby/Rails code

Copied from gitlab-org/charts/gitlab#2432 (comment 464138776)

Problem Statement

The current use of gitlab-logger presents two primary issues that make it unsuitable for GitLab.com:

  • Messages do not have the structure we want
  • Messages are often duplicated

Enabling it as is would produce an incorrect log structure, which would prevent our logging system from ingesting the logs. However, we still need something that provides usable logs from Ruby/Rails.

Example Logs

Edited for ease of viewing

  {
    "date": "2021-01-07T19:08:30Z",
    "component": "gitlab",
    "subcomponent": "production",
    "level": "info",
    "file": "/var/log/gitlab/production.log",
    "message": "Started GET \"/-/readiness\" for 172.17.0.1 at 2021-01-07 19:08:30 +0000"
  }
  {
    "date": "2021-01-07T19:08:30Z",
    "component": "gitlab",
    "subcomponent": "production_json",
    "level": "info",
    "file": "/var/log/gitlab/production_json.log",
    "message": {
      "method": "GET",
      "path": "/-/readiness",
      "format": "html",
      "controller": "HealthController",
      "action": "readiness",
      "status": 200,
      "time": "2021-01-07T19:08:30.881Z",
      "params": [],
      "remote_ip": null,
      "user_id": null,
      "username": null,
      "ua": null,
      "db_count": 0,
      "db_write_count": 0,
      "db_cached_count": 0,
      "correlation_id": "4fab6aec-310e-429c-aea1-ae0830f400a7",
      "cpu_s": 0.01,
      "db_duration_s": 0,
      "view_duration_s": 0.00051,
      "duration_s": 0.00235
    }
  }

The two entries above represent a single request to the /-/readiness endpoint.

Desired Structure

  • Drop date - it duplicates the message.time field, which we use for ES, so there is no need for a second field
  • Drop file - this information repeats subcomponent
  • Keep the log object as flat as possible, avoiding nested JSON objects
  • message should not alternate between objects and strings, as this may cause errors when ingesting log data and make the field impossible to search properly
  • In the above example, all items in message should be moved to the root of the JSON object
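
The transformation described in the list above can be sketched as follows. This is a minimal illustration in Python, not the gitlab-logger implementation: the helper name flatten_entry is hypothetical, and letting wrapper fields such as component win over same-named fields in message is an assumption, since the issue does not say how collisions should be handled.

```python
import json

# Wrapper fields the issue proposes to drop (assumption: no others are dropped).
DROPPED_FIELDS = {"date", "file"}


def flatten_entry(entry: dict) -> dict:
    """Return a flat copy of a wrapped log entry (hypothetical helper)."""
    # Keep every wrapper field except the dropped ones and "message" itself.
    flat = {
        key: value
        for key, value in entry.items()
        if key not in DROPPED_FIELDS and key != "message"
    }
    message = entry.get("message")
    if isinstance(message, dict):
        # Lift structured fields to the root. Wrapper keys win on collision
        # (an assumption; the issue does not specify collision handling).
        for key, value in message.items():
            flat.setdefault(key, value)
    elif message is not None:
        # Plain-text messages stay a string, so "message" never holds an object.
        flat["message"] = str(message)
    return flat


wrapped = json.loads("""
{
  "date": "2021-01-07T19:08:30Z",
  "component": "gitlab",
  "subcomponent": "production_json",
  "level": "info",
  "file": "/var/log/gitlab/production_json.log",
  "message": {"method": "GET", "path": "/-/readiness", "status": 200,
              "time": "2021-01-07T19:08:30.881Z"}
}
""")
print(json.dumps(flatten_entry(wrapped), indent=2))
```

With this sketch, message is either absent or a string, and fields such as method, path, status and time sit at the root next to component, subcomponent and level.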

Duplicated Messages

The example log output already shows that we duplicate logs between production.log and production_json.log. This needs to be handled. The duplication also violates the constraint on the message field, which is a string in the first log entry and an object in the second.
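
One possible way to handle the duplication is to drop entries from the plain-text log whenever a structured log covers the same requests. The sketch below is an illustration only: the helper drop_duplicates and the DUPLICATED_SUBCOMPONENTS set are hypothetical names, and it assumes that every production.log line has a counterpart in production_json.log, which this issue does not confirm.

```python
# Subcomponents whose output is assumed to be duplicated by a structured log.
# Only "production" is shown as duplicated in the example above.
DUPLICATED_SUBCOMPONENTS = {"production"}


def drop_duplicates(entries):
    """Yield only entries that have no structured counterpart (hypothetical)."""
    for entry in entries:
        if entry.get("subcomponent") in DUPLICATED_SUBCOMPONENTS:
            continue  # the same request is already logged as JSON
        yield entry


entries = [
    {
        "subcomponent": "production",
        "message": 'Started GET "/-/readiness" for 172.17.0.1 '
                   "at 2021-01-07 19:08:30 +0000",
    },
    {
        "subcomponent": "production_json",
        "message": {"method": "GET", "path": "/-/readiness", "status": 200},
    },
]
kept = list(drop_duplicates(entries))
print([entry["subcomponent"] for entry in kept])
```

Filtering by subcomponent keeps the approach stateless; matching the two entries on request data would need a shared key, and the plain-text line in the example carries no correlation_id.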

Careful Considerations

  • We may need to invest some effort into the application itself: gitlab-org&5059
  • Errors will look vastly different - we need to ensure that the JSON object is kept intact
  • ...
Edited by Amy Phillips