Refactor command line logs w/ logexec (PoC) (!177) · Merge requests · GitLab.org / security-products / analyzers / gemnasium

Fabien Catteau requested to merge logexec into master Feb 03, 2021

What does this MR do?

(functional PoC, can be merged once it's been reviewed and approved)

Introduce a logexec package to refactor the logging of command lines executed by the analyzer. This simplifies the code and ensures command lines are logged in a consistent way:

		output, err := cmd.CombinedOutput()
		log.Debugf("%s\n%s", cmd.String(), output)
		if err != nil {
		if err := cmd.Run(); err != nil {
			return err
		}

Also, capture the command line and its output as an ERROR message when it fails. This way it's possible to investigate scanning job that failed because of a command line even if the log level wasn't set to debug; there's no need to set SECURE_LOG_LEVEL to debug and trigger a new pipeline, which is time consuming and not always possible.

In the following example, the analyzer logs:

git remote update and its output as a debug message because its successful
git remote checkout and its output as an error message because it fails

% ./analyzer run
[INFO] [Gemnasium] [2021-02-03T15:26:11+01:00] ▶ GitLab Gemnasium analyzer vnot-configured
[DEBU] [Gemnasium] [2021-02-03T15:26:11+01:00] ▶ scanning directory: .
...
[DEBU] [Gemnasium] [2021-02-03T15:26:11+01:00] ▶ scanning directory: convert
[DEBU] [Gemnasium] [2021-02-03T15:26:14+01:00] ▶ /usr/local/bin/git -C ../../gemnasium-db remote update
Fetching origin

[ERRO] [Gemnasium] [2021-02-03T15:26:14+01:00] ▶ /usr/local/bin/git -C ../../gemnasium-db checkout 
fatal: empty string is not a valid pathspec. please use . instead if you meant to match all paths

exit status 128

Limitations

This can't be used when an error is expected. In particular, in some cases the caller expects Cmd.Run() to return an error, in which case the combined output shouldn't be log as an error when the command line fails. We could introduce a bool attribute to handle that, but in some cases error handling is more complex, and one particular type of error is expected; other types of errors should still be processed as true errors, and the combined output should be logged with the error level. Overall it's probably best to keep it simple, and only cover the happy path.

Currently it's not possible to set the logger. This should be change to provide more flexibility, and also to make it possible to implement unit tests.

What are the next steps?

extract logexec as a separate package, or make that part of the existing logutils package
write unit tests for logexec; we probably need to stub the logger for that
document the logexec package, and improve documentation
debug the standard error when the Output method is called; see TODO
stream stdout and stderr instead of logging when the command line exits; see gitlab-org/gitlab#227130

What are the relevant issue numbers?

TODO

Does this MR meet the acceptance criteria?

Edited Feb 04, 2021 by Fabien Catteau

Refactor command line logs w/ logexec (PoC)

What does this MR do?

Limitations

What are the next steps?

What are the relevant issue numbers?

Does this MR meet the acceptance criteria?

Merge request reports