
Add support for showing the last few lines of the pod logs in kubectl describe output

Often, when a container in a pod crashes or exits with an error, a Kubernetes operator may not notice the failure right away, and the logs from the container's previous run may be rotated off the node before anyone has a chance to inspect them, capture the errors, and troubleshoot the issue. This makes intermittent crashes harder to diagnose.

The Kubernetes container spec has two fields, terminationMessagePath and terminationMessagePolicy, that can help in this situation.

These fields control what kubectl describe shows for a terminated container: the last few lines of either the file specified in terminationMessagePath or, under certain conditions, the container logs. There are some additional docs around this feature found here; however, I found that they don't go into quite enough detail to answer what happens in different scenarios.

I've been using this in my local GitLab Agent instances for about 6 months and will provide the missing details below.


The first field, terminationMessagePath, provides a writable file path in the container to which a process can write a message before the container exits. This value defaults to /dev/termination-log (hereafter the "termination log") and is set by the API server if the manifest does not provide it. The termination log provides context about a container failure via kubectl describe.
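
As a minimal sketch of a process using the termination log, the pod below writes a short message to the file and then exits with an error. The pod name, container name, image, and message are hypothetical and chosen only for illustration; terminationMessagePath is set explicitly to its default value just to make the field visible.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: termination-demo            # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: demo                    # hypothetical container name
      image: busybox:1.36           # assumption: any image with a shell would do
      # Same as the default value; shown only for clarity.
      terminationMessagePath: /dev/termination-log
      command:
        - /bin/sh
        - -c
        # Write a brief final status to the termination log, then fail.
        - |
          echo "fatal: could not load config" > /dev/termination-log
          exit 1
```

Once the container has exited, kubectl describe pod termination-demo should include the message in the container's terminated state.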

By default, the second field, terminationMessagePolicy, references the termination log (terminationMessagePolicy: File). In this default configuration, if a container exits with a non-zero exit code and the termination log is empty, the kubectl describe output shows only the container's exit code, last exit time, last start time, etc. Unfortunately, none of our deployments write to the termination log, so it is always empty. I concluded that application binaries must be designed to write to the file for kubectl describe to provide these details in the default configuration.

However, according to the second link above, if the file is populated with a message, that message (truncated by the kubelet if it is longer than 4096 bytes) will be added to the kubectl describe output to provide context about the container's last termination, even if the container's logs have already become inaccessible. Note that this only applies for as long as the pod itself exists. If a pod containing this added context is replaced during a rolling update, the new pod does not retain the additional context from the previous pod.
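
The reason the context disappears with the pod is that the message is stored in the pod's own status. The fragment below is an illustrative sketch of what kubectl get pod -o yaml might show for a container that has restarted once. The field names come from the Kubernetes API; every value is hypothetical and was not captured from a real cluster.

```yaml
# Illustrative only: values are hypothetical.
status:
  containerStatuses:
    - name: demo                    # hypothetical container name
      restartCount: 1
      lastState:
        terminated:
          exitCode: 1
          reason: Error
          # Contents of the termination log from the previous run.
          message: "fatal: could not load config"
```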

From what I've seen, most processes, including the agentk binary, don't make use of the termination log, so we end up in the situation described at the top and cannot explain the cause of a crash once the container logs are gone.

Fortunately, Kubernetes has a solution for this as well. If you set terminationMessagePolicy: FallbackToLogsOnError, then when a container exits with a non-zero exit code, Kubernetes still checks the termination log first and displays its contents if there are any. However, if the termination log is empty, as it would be if the agentk binary crashed, Kubernetes adds the last 2048 bytes or last 80 lines (whichever is smaller) of the container's logs to the kubectl describe output for the container.
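
A minimal sketch of the fallback behavior, assuming a throwaway pod whose container never touches the termination log: it only logs to stdout and stderr and then exits with a non-zero code. The pod name, container name, image, and log lines are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fallback-demo               # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: demo                    # hypothetical container name
      image: busybox:1.36           # assumption: any image with a shell would do
      # Use the tail of the container log when the termination log is empty
      # and the container exits with an error.
      terminationMessagePolicy: FallbackToLogsOnError
      command:
        - /bin/sh
        - -c
        # Nothing is written to /dev/termination-log here.
        - |
          echo "starting up"
          echo "panic: something went wrong" >&2
          exit 1
```

With the default File policy the same pod would show only the exit code and timestamps; with FallbackToLogsOnError the tail of the container's log output should appear as the termination message in kubectl describe.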

This does bear some permissions considerations: if the agent logs any sensitive data, kubectl describe could expose that data to users who don't have access to the pod logs.

I believe the risk of this for the GitLab Agent is low because (1) the GitLab Agent is intended to be deployed by users with Cluster Admin privileges and (2) agentk doesn't appear to log any sensitive data such as the agent registration token.

This MR allows the GitLab Agent pods to be configured with terminationMessagePolicy: FallbackToLogsOnError when the value crashLogsInKubectlDescribe is set to true.
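
As a sketch of how this might be enabled, a values override could look like the following. Only the key name comes from this description; its placement at the top level of the values file is an assumption and may differ in the actual chart.

```yaml
# Values override (assumption: the key sits at the top level of the chart's values).
crashLogsInKubectlDescribe: true
```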

Edited by Thomas Spear
