
Upgrade GitLab semgrep to v1.119.0

While attempting to upgrade GitLab Semgrep from v1.118.0 to v1.119.0, we encountered an integration-test failure.

Error Details

1) running image with test project with cpp when using .h extension created report behaves like recorded report scan is equivalent
   Failure/Error: DEFAULT_FAILURE_NOTIFIER = lambda { |failure, _opts| raise failure }
   
               json atom at path "observability/events/0/exit_code" is not equal to expected value:
   
                 expected: 0
                      got: 3
               
   Shared Example Group: "recorded report" called from ./spec/semgrep_image_spec.rb:200
   # /usr/bin/rspec:25:in `load'
   # /usr/bin/rspec:25:in `<main>'
Finished in 9 minutes 22 seconds (files took 0.43049 seconds to load)

Implementation Plan

  1. Investigate the root cause of the exit code 3 failure

    Semgrep exits with status code 3 instead of 0 when analyzing qa/fixtures/cpp/h/header_only.h because the handling of the recover_when_partial_error flag in languages/cpp/tree-sitter/Parse_cpp_tree_sitter.ml changed between semgrep v1.118.0 and v1.119.0.

    With this change, a parsing error on line 7 of the fixture file header_only.h causes Semgrep to return status code 3:

    SomeClass() {};

    The trailing semicolon ; after the constructor body is handled differently by semgrep v1.118.0 and v1.119.0:

    • semgrep v1.118.0

      The recover_when_partial_error flag controlled error handling. When recover_when_partial_error = true (the default), parsing errors were recovered from and logged only as warnings; Semgrep continued the analysis and exited with status 0 (success).

    • semgrep v1.119.0

      This version removed the conditional error recovery logic, so the function now always logs warnings and never raises errors.

      However, parsing errors are now tracked and reported more explicitly. Semgrep still completes the scan but exits with status 3 to indicate "partial analysis due to parsing errors".

      This shows up in the semgrep logs when scanning header_only.h:7:

      Partially analyzed due to parsing or internal Semgrep errors
      
       • header_only.h (~6.7% of lines always skipped)
      
         The following lines were skipped for all analysis:
      
         • lines 7-7 due to exception "PartialParsing" raised during analysis   

      Note: although semgrep exits with a different status code in v1.118.0 and v1.119.0, the vulnerabilities detected are the same, so this change has no impact on the scan results.
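The claim that only the exit code differs can be checked mechanically. The following is a minimal sketch, assuming both reports are parsed JSON with a top-level "vulnerabilities" list and the "observability/events/0/exit_code" path seen in the test failure. The sample reports, the rule name, and the helper names are hypothetical and only illustrate the comparison.

```python
def finding_keys(report: dict) -> set:
    """Return one comparable key per vulnerability in a parsed report."""
    keys = set()
    for vuln in report.get("vulnerabilities", []):
        location = vuln.get("location", {})
        keys.add((vuln.get("name"), location.get("file"), location.get("start_line")))
    return keys


def same_findings(old_report: dict, new_report: dict) -> bool:
    """True when both reports contain the same set of findings."""
    return finding_keys(old_report) == finding_keys(new_report)


def exit_code(report: dict) -> int:
    """Read the atom at observability/events/0/exit_code."""
    return report["observability"]["events"][0]["exit_code"]


# Hypothetical sample data: identical findings, different exit codes.
finding = {
    "name": "example-rule",  # hypothetical rule name
    "location": {"file": "header_only.h", "start_line": 12},
}
old_report = {"vulnerabilities": [finding], "observability": {"events": [{"exit_code": 0}]}}
new_report = {"vulnerabilities": [finding], "observability": {"events": [{"exit_code": 3}]}}

print(same_findings(old_report, new_report))
print(exit_code(old_report), exit_code(new_report))
```

Comparing on name, file, and start line is a simplification; a stricter check could compare the full vulnerability objects.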

  2. Identify what changed between v1.118.0 and v1.119.0 that causes this issue

  3. Fix the underlying problem causing the integration test failure

    We don't need to fix the underlying problem; we can just update our expectation file to reflect the new status code value.
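Updating the expectation amounts to changing one JSON atom. Below is a minimal sketch, assuming the expectation file is JSON and that the slash-separated path from the failure message ("observability/events/0/exit_code") addresses it, with numeric segments indexing into lists. The helper name set_json_path and the in-memory sample are hypothetical; the real expectation file and its location are not shown here.

```python
import copy


def set_json_path(doc, path: str, value):
    """Return a copy of doc with the atom at a slash-separated path replaced."""
    updated = copy.deepcopy(doc)
    segments = path.split("/")
    node = updated
    # Walk to the parent of the target atom.
    for segment in segments[:-1]:
        node = node[int(segment)] if isinstance(node, list) else node[segment]
    last = segments[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        if last not in node:
            raise KeyError(path)  # refuse to create new keys silently
        node[last] = value
    return updated


# Hypothetical excerpt of an expectation document.
expectation = {"observability": {"events": [{"exit_code": 0}]}}
updated_expectation = set_json_path(expectation, "observability/events/0/exit_code", 3)
print(updated_expectation)
```

The helper works on a copy and raises on an unknown key, so a mistyped path fails loudly instead of adding a stray field to the expectation.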

  4. Successfully upgrade Semgrep to v1.119.0 (or latest available version)

  5. Ensure all integration tests pass after the upgrade

Edited by Adam Cohen