nodejs-scan should work with directories that contain symlinks
Problem to solve
nodejs-scan v3.1.0 fails to scan any repository that contains compatible files that are symlinks, for example a project layout like this:
nodejs-scan-debugging on main [?] via
❯ ls -la
total 32
drwxr-xr-x 8 james staff 256 29 Jul 15:26 .
drwxr-xr-x 24 james staff 768 29 Jul 15:17 ..
drwxr-xr-x 12 james staff 384 28 Jul 18:17 .git
-rw-r--r-- 1 james staff 532 28 Jul 11:50 main.js
lrwxr-xr-x 1 james staff 7 28 Jul 11:50 main_sym.js -> main.js
The analyser will crash with the following logs:
❯ analyzer-build && analyzer-run ../../tests/nodejs-scan-debugging
tag: nodejs-scan:master
[+] Building 1.2s (15/15) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-alpine 1.1s
=> [internal] load metadata for docker.io/library/golang:1.17-alpine 1.1s
=> [internal] load build context 0.0s
=> => transferring context: 3.48kB 0.0s
=> [stage-1 1/5] FROM docker.io/library/python:3.10-alpine@sha256:a746f64081fca7d6368935750ffcbf04d447cb0131408c60cbf1a4392981890a 0.0s
=> [build 1/4] FROM docker.io/library/golang:1.17-alpine@sha256:844031724987d525bd99857b3b8c00f99ff003241afdc5d1ee121d81eb4b8301 0.0s
=> => resolve docker.io/library/golang:1.17-alpine@sha256:844031724987d525bd99857b3b8c00f99ff003241afdc5d1ee121d81eb4b8301 0.0s
=> CACHED [stage-1 2/5] RUN pip install ruamel.yaml==0.16.12 njsscan==0.3.1 0.0s
=> CACHED [stage-1 3/5] RUN apk --no-cache add git ca-certificates gcc libc-dev 0.0s
=> CACHED [build 2/4] WORKDIR /go/src/app 0.0s
=> CACHED [build 3/4] COPY . . 0.0s
=> CACHED [build 4/4] RUN CHANGELOG_VERSION=$(grep -m 1 '^## v.*$' "CHANGELOG.md" | sed 's/## v//') && PATH_TO_MODULE=`go list -m` && go build -ldflags="-X '$PATH_ 0.0s
=> CACHED [stage-1 4/5] COPY --chown=root:root --from=build /go/src/app/analyzer / 0.0s
=> CACHED [stage-1 5/5] COPY .njsscan .njsscan 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:389d1d5e1db1bf345b01e7294b84393eff14b7e6b048811865bd10a7b79019b1 0.0s
=> => naming to docker.io/library/nodejs-scan:master 0.0s
image: nodejs-scan:master
[INFO] [NodeJsScan] [2022-07-28T01:50:52Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/command@v1.8.0/command.go:76] ▶ GitLab NodeJsScan analyzer v3.1.0
[INFO] [NodeJsScan] [2022-07-28T01:50:52Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/command@v1.8.0/run.go:125] ▶ Detecting project
[INFO] [NodeJsScan] [2022-07-28T01:50:52Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/command@v1.8.0/run.go:147] ▶ Found relevant files in project, analyzing entire repository
[INFO] [NodeJsScan] [2022-07-28T01:50:52Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/command@v1.8.0/run.go:159] ▶ Running analyzer
[DEBU] [NodeJsScan] [2022-07-28T01:50:52Z] [/go/src/app/loadRuleset.go:21] ▶ /tmp/app/.gitlab/sast-ruleset.toml not found, ruleset support will be disabled.
[DEBU] [NodeJsScan] [2022-07-28T01:50:53Z] [/go/src/app/analyze.go:40] ▶ /usr/local/bin/njsscan --config .njsscan --json --output /tmp/njsscan.json /tmp/app
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/semgrep/semgrep_main.py", line 336, in main
target_manager = TargetManager(
File "<attrs generated init semgrep.target_manager.TargetManager>", line 24, in __init__
File "/usr/local/lib/python3.10/site-packages/semgrep/target_manager.py", line 483, in __attrs_post_init__
self.targets = [
File "/usr/local/lib/python3.10/site-packages/semgrep/target_manager.py", line 484, in <listcomp>
Target(
File "<attrs generated init semgrep.target_manager.Target>", line 7, in __init__
File "/usr/local/lib/python3.10/site-packages/semgrep/target_manager.py", line 338, in validate_path
raise FilesNotFoundError(paths=tuple([value]))
semgrep.error.FilesNotFoundError: File not found: /tmp/app/main_sym.js
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/njsscan", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/site-packages/njsscan/__main__.py", line 77, in main
).scan()
File "/usr/local/lib/python3.10/site-packages/njsscan/njsscan.py", line 44, in scan
result = scanner.scan()
File "/usr/local/lib/python3.10/site-packages/libsast/scanner.py", line 65, in scan
self.options).scan(valid_paths)
File "/usr/local/lib/python3.10/site-packages/libsast/core_sgrep/semantic_sgrep.py", line 40, in scan
sgrep_out = invoke_semgrep(paths, self.scan_rules)
File "/usr/local/lib/python3.10/site-packages/libsast/core_sgrep/helpers.py", line 50, in invoke_semgrep
) = semgrep_main.main(
File "/usr/local/lib/python3.10/site-packages/semgrep/semgrep_main.py", line 347, in main
raise SemgrepError(e)
semgrep.error.SemgrepError: File not found: /tmp/app/main_sym.js
[FATA] [NodeJsScan] [2022-07-28T01:50:53Z] [/go/src/app/main.go:28] ▶ open /tmp/njsscan.json: no such file or directory
Note the semgrep.error.SemgrepError: File not found: /tmp/app/main_sym.js exception.
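To check whether a checkout contains files that could trigger this error, the symlinks can be listed before running the analyser. The following is a minimal sketch, not part of the analyser: the function name find_symlinked_files and the default suffix list are assumptions made for illustration, and the set of extensions njsscan actually scans may differ.

```python
import os
from pathlib import Path


def find_symlinked_files(root, suffixes=(".js",)):
    """Return sorted relative paths of file symlinks under root.

    Hypothetical helper: `suffixes` is an assumed list of extensions,
    not the scanner's real file filter.
    """
    root = Path(root)
    found = []
    # followlinks=False: do not descend into symlinked directories.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Prune Git metadata so it is never walked.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            # is_symlink() is true even when the link target is missing.
            if path.is_symlink() and path.suffix in suffixes:
                found.append(path.relative_to(root).as_posix())
    return sorted(found)


if __name__ == "__main__":
    for rel_path in find_symlinked_files("."):
        print(rel_path)
```

For the project layout shown above, this returns ['main_sym.js']. Symlinks that point at directories are not reported by this sketch.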
This problem was peculiar because the underlying scanner (njsscan) had not been upgraded between the v3.0.0 and v3.1.0 releases of our analyser, so we weren't expecting any changes in scanner behaviour. Further investigation showed that semgrep is installed as a transitive dependency of njsscan, via a library called libsast. libsast is not pinned to a specific version, so rebuilding the Docker image for njsscan can cause a newer version of libsast to be downloaded. That is what happened here:
- v3.0.0 of our nodejs-scan analyser was built with version 1.5.0 of libsast, which pulls semgrep 0.80.0
- v3.1.0 of our nodejs-scan analyser was built with version 1.5.2 of libsast, which pulls semgrep 0.104.0
Because the upstream scanner does not pin its libsast dependency, building the v3.1.0 Docker image implicitly upgraded semgrep as well.
semgrep 0.80.0 simply filtered out “invalid” files, whereas 0.104.0 raises an exception (the FilesNotFoundError in the traceback above).
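To confirm which versions a given image actually resolved, the installed package metadata can be queried from inside the container. This is a minimal sketch using the standard library's importlib.metadata; the helper name installed_versions is hypothetical, and the package names are the ones mentioned in this issue.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("njsscan", "libsast", "semgrep")):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in the v3.0.0 and v3.1.0 images side by side should make any drift in the transitive dependencies visible without reading the build logs.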
Proposal
An upstream issue has been filed: https://github.com/ajinabraham/njsscan/issues/99
In the meantime, it's possible to pin libsast to 1.5.0 in the analyser's Dockerfile, for example by adding libsast==1.5.0 to the existing pip install step, to restore the old, working behaviour.
References
- Slack thread (internal link for team members)
- Zendesk ticket (internal link for team members)
- Upstream issue