gitlab-sshd: comprehensive logging story
Currently, what we log to gitlab-shell.log
, especially in gitlab-sshd
, is haphazard. We need to audit the logs and ensure that they provide all the information we need to support an instance running in production.
In particular, we want to be able to track the lifecycle of individual SSH connections and channels,from looking up the SSH key to the command(s) that are run, and the Gitaly RPCs they invoke.
-
Using log.ContextLogger
everywhere it's appropriate: https://gitlab.com/gitlab-org/labkit/-/blob/master/log/logger.go -
Replacing all Infof/Warnf/Debugf
calls withInfo/Warn/Debug
, moving the variable component intoWithField[s]
instead -
Audit all code paths and ensure that the logging is sufficient for us to debug them in production -
Add request/response size to gitlab-shell logs gitlab#338497 (closed)
Code path audit
For gitlab-sshd
, these are the files we principally have to be interested in:
/
├cmd/ [OK]
├── gitlab-sshd/ [OK]
│ └── main.go [OK]
├internal/
├── command/
│ ├── commandargs/
│ │ ├── command_args.go [OK]
│ │ └── shell.go [!!]
│ ├── command.go [OK]
│ ├── lfsauthenticate/
│ │ ├── lfsauthenticate.go [!!]
│ ├── personalaccesstoken/
│ │ ├── personalaccesstoken.go [!!]
│ ├── readwriter/ [OK]
│ │ └── readwriter.go [OK]
│ ├── receivepack/
│ │ ├── gitalycall.go [OK] - can be done in `gc.RunGitalyCommand`
│ │ ├── receivepack.go [!!]
│ ├── shared/
│ │ ├── accessverifier/
│ │ │ ├── accessverifier.go [??] - probably fine if everything calling Verify already logs
│ │ ├── customaction/
│ │ │ ├── customaction.go [!!]
│ │ └── disallowedcommand/ [OK]
│ │ └── disallowedcommand.go [OK]
│ ├── twofactorrecover/
│ │ ├── twofactorrecover.go [!!]
│ ├── twofactorverify/
│ │ ├── twofactorverify.go [!!]
│ ├── uploadarchive/
│ │ ├── gitalycall.go [OK] - can be done in `gc.RunGitalyCommand`
│ │ ├── uploadarchive.go [!!]
│ └── uploadpack/
│ ├── gitalycall.go [OK] - can be done in `gc.RunGitalyCommand`
│ ├── uploadpack.go [!!]
├── config/
│ ├── config.go [!!] - one thing we could improve but unsure about order dependency
├── console/
│ ├── console.go [OK]
├── gitlabnet/
│ ├── client.go [!!]
│ ├── */ [OK] - no errors swallowed here
├── handler/
│ ├── exec.go [!!]
├── keyline/
│ ├── key_line.go [OK]
├── logger/
│ ├── logger.go [OK]
├── metrics/
│ └── metrics.go [OK]
├── pktline/
│ ├── pktline.go [OK]
├── sshd/ [!!]
│ ├── connection.go
│ ├── server_config.go
│ ├── session.go
│ ├── sshd.go
├── sshenv/ [OK]
│ ├── sshenv.go [OK]
I suppose I'll start by going through each one and adding any logging that seems missing.
Since each connection gets its own correlation-id, we don't need to worry so much about logging data like client IP in every log message; just the initial message for the connection.
Problems noted:
-
internal/command/commandargs/shell.go
-validate
method swallows the actual error by usingisValidSSHCommand
-
internal/command/lfsauthenticate/lfsauthenticate.go
-
Errors from authenticate
method are swallowed. -
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging
-
-
internal/command/personalaccesstoken/personalaccesstoken.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging -
Errors & outcomes need to be evident in the logs as well as in the user's console output
-
-
internal/command/receivepack/receivepack.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging -
Need to log custom-action status
-
-
internal/command/shared/customaction/customaction.go
-
Some logging, but could do with fleshing out
-
-
internal/command/twofactorrecover/twofactorrecover.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging -
Errors & outcomes need to be evident in the logs as well as in the user's console output
-
-
internal/command/twofactorverify/twofactorverify.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging -
Errors & outcomes need to be evident in the logs as well as in the user's console output
-
-
internal/command/uploadarchive/uploadarchive.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging
-
-
internal/command/uploadpack/uploadpack.go
-
No around-command logging; will need fixing if not addressed systematically a level up -
No intra-command logging -
Need to log custom-action status
-
-
internal/config/config.go
-
Would be nice to log when we're applying SSL_CERT_DIR
-
Logging interceptors as well as tracing interceptors for HTTP?
-
-
internal/gitlabnet/client.go
-
Swallows actual JSON parser error in client.ParseJSON
#533 (closed)
-
-
internal/handler/exec.go
-
Needs to log what's happening in RunGitalyCommand
-
Logging interceptors as well as tracing interceptors for gRPC?
-
-
internal/sshd/*.go
-
connection.go
-
session.go
-
server_config.go
-
sshd.go
-