Semgrep analyzer fails with some C files
Summary
Some C files cause the Semgrep analyzer to exit with a fatal error as soon as it starts converting the SARIF report to our own format. This behavior is present in the latest Semgrep analyzer release.
This was reported by a customer in this ticket (internal use), and it also occurs in the Wireshark repository.
The team behind Wireshark have opened an issue for this within the Semgrep repository on GitHub. As the SARIF report is properly generated and the error only surfaces after the conversion process begins, I'm unsure if this issue lies with us or with Semgrep.
Steps to reproduce
- Copy this file from the Wireshark repository.
- Ensure the Security/SAST.gitlab-ci.yml template is included in your .gitlab-ci.yml and set SECURE_LOG_LEVEL: debug.
- Run the job with the latest image - the error will occur.
- Modify the job to use image version 2.13.1 of the Semgrep analyzer - the error will not occur.
You can add the following in your .gitlab-ci.yml to ensure the SARIF report is collected from the failing job for further analysis.
semgrep-sast:
  artifacts:
    when: always
    paths:
      - semgrep.sarif
    reports:
      sast: gl-sast-report.json
  image: registry.gitlab.com/security-products/semgrep:2.13.1
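Once semgrep.sarif has been collected, it can be checked for error-level tool notifications to help narrow down whether the problem is in Semgrep's output or in our conversion step. The script below is only a rough sketch and not part of the analyzer. It assumes the standard SARIF 2.1.0 layout (runs[].invocations[].toolExecutionNotifications[]) and that the "tool notification error" in the log corresponds to such an entry, which I have not confirmed against the analyzer source. The function names error_notifications and summarize are made up for illustration.

```python
import json
import sys
from typing import Iterator


def error_notifications(sarif: dict) -> Iterator[dict]:
    """Yield every error-level tool execution notification in a SARIF log.

    Assumes the SARIF 2.1.0 structure; missing keys are treated as empty.
    """
    for run in sarif.get("runs", []):
        for invocation in run.get("invocations", []):
            for note in invocation.get("toolExecutionNotifications", []):
                if note.get("level") == "error":
                    yield note


def summarize(note: dict) -> str:
    """Return '<descriptor id>: <first line of the message text>'."""
    descriptor = note.get("descriptor", {}).get("id", "<no descriptor>")
    text = note.get("message", {}).get("text", "")
    first_line = text.splitlines()[0] if text else ""
    return f"{descriptor}: {first_line}"


if __name__ == "__main__":
    # Default to the artifact name used in the job definition above.
    path = sys.argv[1] if len(sys.argv) > 1 else "semgrep.sarif"
    with open(path, encoding="utf-8") as fh:
        report = json.load(fh)
    for note in error_notifications(report):
        print(summarize(note))
```

Running it as python check_sarif.py semgrep.sarif (the script name is arbitrary) prints one line per error-level notification. If the failing file shows up here, the SARIF itself already carries the parse error and the conversion step is only surfacing it.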
Example Project
https://gitlab.com/calebw/c_hw/-/pipelines
What is the current bug behavior?
Semgrep fatally crashes with certain C files.
What is the expected correct behavior?
Semgrep properly analyses all C files.
Relevant logs and/or screenshots
[DEBU] [Semgrep] [2022-09-19T16:21:15Z] [/go/src/buildapp/analyze.go:137] ▶ METRICS: Using configs from the Registry (like --config=p/ci) reports pseudonymous rule metrics to semgrep.dev.
To disable Registry rule metrics, use "--metrics=off".
Using configs only from local files (like --config=xyz.yml) does not enable metrics.
More information: https://semgrep.dev/docs/metrics
Scanning 3 files with 70 c rules.
Some files were skipped or only partially analyzed.
Partially scanned: 1 files only partially analyzed due to a parsing or internal Semgrep error
Ran 304 rules on 3 files: 5 findings.
[INFO] [Semgrep] [2022-09-19T16:21:15Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/command@v1.9.1/run.go:179] ▶ Creating report
[DEBU] [Semgrep] [2022-09-19T16:21:15Z] [/go/src/buildapp/convert.go:25] ▶ Converting report with the root path: /builds/calebw/c_hw
[FATA] [Semgrep] [2022-09-19T16:21:15Z] [/go/src/buildapp/main.go:27] ▶ tool notification error: Fatal error Fatal error at line another_one.c:1:
Common.Todo
====[ BEGIN error trace ]====
Common.Todo
Raised at Ast_c_build.statement_sequencable in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 484, characters 37-47
Called from Stdlib__List.map in file "list.ml", line 92, characters 20-23
Called from Ast_c_build.compound in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 474, characters 7-36
Called from Ast_c_build.func_def in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 214, characters 13-41
Called from Ast_c_build.declaration in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 155, characters 27-45
Called from Ast_c_build.toplevel in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 143, characters 6-26
Called from Stdlib__List.map in file "list.ml", line 92, characters 20-23
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Stdlib__List.map in file "list.ml", line 92, characters 32-39
Called from Ast_c_build.program in file "src/pfff/lang_c/parsing/ast_c_build.ml", line 130, characters 2-18
Called from Parse_c.parse in file "src/pfff/lang_c/parsing/parse_c.ml", line 35, characters 8-33
Re-raised at Exception.reraise in file "src/pfff/commons/Exception.ml", line 20, characters 2-41
Called from Parse_c.parse in file "src/pfff/lang_c/parsing/parse_c.ml", line 41, characters 6-25
Called from Parse_target.throw_tokens in file "src/parsing/Parse_target.ml", line 214, characters 12-18
Called from Parse_target.run_parser.(fun) in file "src/parsing/Parse_target.ml", line 104, characters 22-28
Re-raised at Exception.reraise in file "src/pfff/commons/Exception.ml", line 20, characters 2-41
Called from Parse_target.parse_and_resolve_name in file "src/parsing/Parse_target.ml", line 395, characters 12-42
Called from Parse_with_caching.ast_or_exn_of_file in file "src/runner/Parse_with_caching.ml", line 115, characters 6-51
Called from Parse_with_caching.parse_and_resolve_name in file "src/runner/Parse_with_caching.ml", line 159, characters 6-641
Called from CamlinternalLazy.force_lazy_block in file "camlinternalLazy.ml", line 31, characters 17-27
Re-raised at CamlinternalLazy.force_lazy_block in file "camlinternalLazy.ml", line 36, characters 4-11
Called from Common.with_time in file "src/pfff/commons/Common.ml", line 215, characters 12-16
Called from Match_search_mode.matches_of_patterns in file "src/engine/Match_search_mode.ml", line 226, characters 8-67
Called from Match_search_mode.matches_of_xpatterns in file "src/engine/Match_search_mode.ml", line 331, characters 6-62
Called from Match_search_mode.matches_of_formula in file "src/engine/Match_search_mode.ml", line 667, characters 4-61
Called from Match_search_mode.check_rule in file "src/engine/Match_search_mode.ml", line 704, characters 26-73
Called from Common.set_timeout in file "src/pfff/commons/Common.ml", line 1437, characters 12-16
Re-raised at Exception.reraise in file "src/pfff/commons/Exception.ml", line 20, characters 2-41
Called from Match_rules.timeout_function in file "src/engine/Match_rules.ml", line 45, characters 4-73
Called from Match_rules.check.(fun) in file "src/engine/Match_rules.ml", line 108, characters 19-1019
Called from Common.partition_either.part_either in file "src/pfff/commons/Common.ml", line 972, characters 15-18
Called from Match_rules.check in file "src/engine/Match_rules.ml", line 85, characters 4-1023
Called from Run_semgrep.semgrep_with_rules.(fun) in file "src/runner/Run_semgrep.ml", line 624, characters 13-145
Called from Run_semgrep.iter_targets_and_get_matches_and_exn_to_errors.(fun) in file "src/runner/Run_semgrep.ml", line 369, characters 21-29
Called from Stdlib__Fun.protect in file "fun.ml", line 33, characters 8-15
Re-raised at Stdlib__Fun.protect in file "fun.ml", line 38, characters 6-52
Called from Memory_limit.run_with_memory_limit in file "src/system/Memory_limit.ml", line 83, characters 6-62
Called from Run_semgrep.iter_targets_and_get_matches_and_exn_to_errors.(fun) in file "src/runner/Run_semgrep.ml", line 360, characters 17-1023
=====[ END error trace ]=====