Incorrect usage of go.sum in go dependency scanning

Release notes

TBD

Summary

We recently started using the awesome dependency scanning feature, especially for Go projects.

When using the scanner, we often get false positives because the scanner does not recognize updated dependencies in the go.mod file.

When a dependency and its transitive dependencies are updated via go get -u, the go.sum file contains both the old and the new version of a dependency, while only the newly specified version is actually used in the build.

Even go mod tidy will not clean up the file here if a transitive dependency specifies a different version in its own go.mod file, because go.sum also records checksums for the go.mod files of the versions considered during version selection.

This leads to false positives during the dependency scan.

In my opinion, treating every entry in go.sum as a used package is not fully correct, as multiple versions of the same module will show up.

Steps to reproduce

Create three Go modules:

  • second-module (v1.0.0 and v1.1.0): contains exposed module function Second()
  • first-module (v1.0.0): uses Second() function from second-module and exposes it as First(), specifies v1.0.0 of second-module in its go.mod
  • main-project: calls First() function of first-module

Finally, run go get -u first-module in the main-project, which updates first-module and all its dependencies.

Take note of the updated go.sum file in main-project. It now contains references to both v1.0.0 and v1.1.0 of second-module, while only v1.1.0 is actually used.

This leads to false positives in the dependency scanning.

Example Project

tbd

What is the current bug behavior?

False positives in the Go dependency scanning, as old, unused module versions are scanned as well.

What is the expected correct behavior?

Only modules that are actually used are scanned and show up in the scan results.

Proposed fix

Utilize golang.org/x/tools/go/packages

| 👍 Benefits | 👎 Drawbacks |
|---|---|
| Outputs the modules required by the main module (the module of the repository scanned). | Requires the Go toolchain to be installed. The packaged version of this is around 100MB. |
| Can report or filter out test dependencies. | Requires a goproxy to be set up for offline module downloads, vendored dependencies, or a modcache with all dependencies preloaded. |

MR: Draft: Add go module resolution [golang-only] (gitlab-org/security-products/analyzers/gemnasium!372 - closed)

Please let me know what you think, also let me know if I misunderstood the behavior here. Thanks a lot for this awesome tool!

Older proposals

Keep the highest version of module

The correct way would probably be to parse the go.mod file and all its references recursively, merging the versions specified on upper levels into lower levels accordingly. => gitlab-org/security-products/analyzers/gemnasium!218 (closed)

| 👍 Benefits | 👎 Drawbacks |
|---|---|
| Works in online and offline environments. | Can produce false negatives, i.e. missed vulnerabilities. |
| Does not require the Go toolchain to be installed. | |

MR: Draft: Change go.sum parser to keep the highest... (gitlab-org/security-products/analyzers/gemnasium!218 - closed)
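The "keep the highest version" idea can be sketched as follows. This is not the code from the merge request. The helper names (`parseVersion`, `less`, `keepHighest`) and the module paths are hypothetical, and the version comparison is deliberately simplified: it assumes plain vMAJOR.MINOR.PATCH versions and drops any pre-release or build suffix instead of comparing it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "vMAJOR.MINOR.PATCH" into three integers.
// Assumption: pre-release and build suffixes (for example "-rc1",
// "+incompatible", or pseudo-version suffixes) are dropped, not compared.
func parseVersion(version string) ([3]int, error) {
	var out [3]int
	v := strings.TrimPrefix(version, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unsupported version %q", version)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unsupported version %q: %w", version, err)
		}
		out[i] = n
	}
	return out, nil
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// keepHighest reduces a module -> versions map to the highest version per module.
func keepHighest(versions map[string][]string) (map[string]string, error) {
	out := make(map[string]string, len(versions))
	for path, vs := range versions {
		var best [3]int
		for i, v := range vs {
			parsed, err := parseVersion(v)
			if err != nil {
				return nil, err
			}
			if i == 0 || less(best, parsed) {
				best = parsed
				out[path] = v
			}
		}
	}
	return out, nil
}

func main() {
	// Hypothetical versions collected from a go.sum file.
	versions := map[string][]string{
		"example.com/first-module":  {"v1.0.0"},
		"example.com/second-module": {"v1.0.0", "v1.1.0"},
	}
	highest, err := keepHighest(versions)
	if err != nil {
		panic(err)
	}
	fmt.Println(highest)
}
```

Comparing numerically rather than lexically matters here: a string comparison would rank v1.9.0 above v1.10.0. The sketch still inherits the false-negative risk listed in the drawbacks.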

Utilize go list tool

Also, there is go list -json -deps which looks promising and might be a better fit. => gitlab-org/security-products/analyzers/gemnasium!203 (closed)

| 👍 Benefits | 👎 Drawbacks |
|---|---|
| Outputs the modules required by the main module (the module of the repository scanned). | Requires the Go toolchain to be installed. The packaged version of this is around 100MB. |
| Can report or filter out test dependencies. | Requires a goproxy to be set up for offline module downloads, vendored dependencies, or a modcache with all dependencies preloaded. |

MR: List go dependencies using go list (gitlab-org/security-products/analyzers/gemnasium!203 - closed)

Edge Cases

Go versions

The implementation must take into account the module's Go version when selecting dependencies. For example, if the module's go.mod declares go 1.17, building with Go 1.18 might result in a different set of selected dependencies. By default, the version declared in the main module's go.mod file is used. For this iteration, matrix testing should be considered out of scope.

Vendored dependencies

Vendored dependencies can be used to lock in the dependencies that are utilized. According to the Go modules documentation on the go directive, the vendor/modules.txt file records which modules are used. Its format looks like the following example.

File: vendor/modules.txt

    # github.com/BurntSushi/toml v0.3.1
    github.com/BurntSushi/toml
    # github.com/Microsoft/go-winio v0.5.1
    github.com/Microsoft/go-winio

Each module entry starts with a commented line that gives the module path and version, followed by the packages vendored from that module, so implementing a parser for this may be possible. However, it should be noted that this can produce false negatives. See Build tags, operating systems and architectures for an example of this.

Offline environments

Offline environments are currently supported by the go.sum parser since it does not require any network access. The go list command and the golang.org/x/tools/go/packages module, on the other hand, both require either a modcache that already contains the required modules or a Go project that has been vendored. To prevent a breaking change, we can run go-modlist first and fall back to go.sum parsing if needed. Whenever the go.sum parser is used, the possibility of false positives should be explicitly logged, with a link to this issue. Running go-modlist first keeps compatibility with offline environments that have configured the GOPROXY variable to point to an internal mirror, and with repositories that contain vendored dependencies.

Build tags, operating systems and architectures

Testing the gemnasium project with both the go list -deps -json -f solution and go-modlist produces a total of 41 modules. Interestingly, vendoring the project and then parsing vendor/modules.txt produces 43. Diffing the two module lists and then running go mod why -m on the extra modules shows that they are used by the main module.

Output
diff --git a/modules.txt b/modules_2.txt
index 9170913..69df6ad 100644
--- a/modules.txt
+++ b/modules_2.txt
@@ -1,5 +1,7 @@
 github.com/BurntSushi/toml
+github.com/Microsoft/go-winio
 github.com/ProtonMail/go-crypto
+github.com/acomagu/bufpipe
 github.com/bmatcuk/doublestar
 github.com/cpuguy83/go-md2man/v2
 github.com/davecgh/go-spew
$ go mod why -m -vendor github.com/acomagu/bufpipe
# github.com/acomagu/bufpipe
gitlab.com/gitlab-org/security-products/analyzers/gemnasium/v2/advisory
gitlab.com/gitlab-org/security-products/analyzers/report/v3
gitlab.com/gitlab-org/security-products/analyzers/ruleset
github.com/go-git/go-git/v5
github.com/go-git/go-git/v5/utils/ioutil
github.com/acomagu/bufpipe

$ go mod why -m -vendor github.com/Microsoft/go-winio
# github.com/Microsoft/go-winio
gitlab.com/gitlab-org/security-products/analyzers/gemnasium/v2/advisory
gitlab.com/gitlab-org/security-products/analyzers/report/v3
gitlab.com/gitlab-org/security-products/analyzers/ruleset
github.com/go-git/go-git/v5
github.com/go-git/go-git/v5/plumbing/transport/client
github.com/go-git/go-git/v5/plumbing/transport/ssh
github.com/xanzy/ssh-agent
github.com/Microsoft/go-winio

This indicates either that go list and go-modlist are both inaccurate, or that the vendor directory contains more modules because it has to stay compatible with different environments. Inspecting the build tags of the files that contain these imports confirms the latter. The github.com/go-git/go-git/v5/utils/ioutil/pipe_js.go file only includes github.com/acomagu/bufpipe if the js build tag is set. Similarly, the github.com/xanzy/ssh-agent package only includes github.com/Microsoft/go-winio if the windows tag is set (that is, when building for Windows). This may cause false negatives if the build system is Linux-based and the production system runs on Windows.
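The effect of build constraints can be illustrated with the standard go/build/constraint package. The sketch below evaluates a constraint line while treating the target operating system as the only tag that is set, which is a simplification: real builds also set the architecture, compiler, and release tags, and file name suffixes such as _js.go act as implicit constraints. The helper `enabledFor` is hypothetical, and the constraint lines are illustrative rather than copied from the packages named above.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// enabledFor reports whether a build constraint line is satisfied when goos
// is the only build tag that is set.
// Simplification: real builds also set GOARCH, compiler, and release tags.
func enabledFor(line, goos string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	return expr.Eval(func(tag string) bool { return tag == goos }), nil
}

func main() {
	// Illustrative constraint lines, not copied from the packages named above.
	lines := []string{"//go:build js", "//go:build windows", "//go:build !windows"}
	for _, goos := range []string{"linux", "windows", "js"} {
		for _, line := range lines {
			ok, err := enabledFor(line, goos)
			if err != nil {
				panic(err)
			}
			fmt.Printf("GOOS=%s %-20s -> %v\n", goos, line, ok)
		}
	}
}
```

A file guarded by the windows constraint is skipped when the dependency list is produced on Linux, which is how a module used only on Windows can drop out of the scan.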

It would be beneficial to handle this edge case via matrix testing to cover multiple environments. Generating and propagating this information should be scoped to future iterations.

Implementation plan

Edited by Tetiana Chupryna