SV-COMP un-preprocessed programs

Goal

Get rid of preprocessed .i files in sv-benchmarks for tasks which have corresponding .c files.

Motivation

Preprocessing inconsistencies between .c and .i files.
1. Maintaining consistency is a burden on benchmark maintenance: additional/duplicate work when fixing benchmarks.
2. Inconsistencies happen for various reasons:
  1. .i file (used by tools) is fixed, but .c file is forgotten.
  2. .c file is fixed and re-preprocessed, but with wrong architecture (than in task definition .yml).
  3. .c file is fixed and .i is also manually fixed, but differently.
    1. This is often done to avoid re-preprocessing which yields larger-than-necessary diffs due to version differences (see below).
  4. Some tasks are excluded from preprocessing consistency CI:
    1. ConcurrencySafety, because of pthread.h (see below).
    2. LDV, because of many large tasks.
Preprocessing is not stable across OS/compiler/libc versions.
1. Preprocessing consistency CI uses a fixed set of old versions.
2. Implementation-detail types in pthread.h have changed over the years.
3. Libc headers are conditional on the compiler (GCC vs Clang) and its version.
  1. Most .i files have been generated with GCC, which can cause problems for Clang-based tools.
  2. For example, using compiler-specific attributes which are rejected by the other.
Size of sv-benchmarks repository and programs themselves.
1. .c files which include anything from the C standard library result in lots of C standard library declarations in the .i files, with the same declarations repeated in many .i files.
2. Often, most of the .i file is included standard declarations and the actual code is a few dozen lines at the very end.
3. Such .i files are not very human-readable, which is desirable by:
  1. Tool developers for debugging their tools.
  2. Benchmark maintainers for fixing benchmarks and reviewing MRs.
4. No benefit from code reuse: some tasks (e.g. LDV) have been preprocessed using common model headers.
  1. Fixes in such headers yield especially large MR diffs.

Additional advantages

Programs are verified w.r.t. the C standard (which specifies the standard library), as opposed to some implementation of the standard (which has implementation details in the header files).
1. For example, fesetround takes an integer rounding mode whose values are not fixed by the standard, but only required to present as certain #define-s with no stability guarantees.
  1. With preprocessing the tools never see the macro name and must hard-code magic constants.
ConcurrencySafety programs can be cleaned up notably.
1. As mentioned above, pthread.h pulls in particularly much (non-portable) declarations, which would be avoided.
2. Deprecated non-standard __VERIFIER_atomic-s can be replaced with C11 atomics (atomic.h) whose macro expansions are very unreadable.
Tools can customize preprocessing if they desire.
1. For example, instead of libc headers, use custom versions with their own stubs/models, which may be easier to deal with than libc macro expansions.
2. This is entirely optional and the tool's own responsibility to make sure it follows the C standard.
3. Perhaps some tool even prefers working with un-preprocessed programs directly.
Glibc headers are not included with their licence text removed!
1. Headers are mostly API declarations matching the C standard, but they also contain some implementation details (e.g. inline functions, type definitions).
Tools are easier to use in the real world.
Improved traceability: benchmarks remain closer to original real-world code.

Disadvantages/concerns

Some programs in sv-benchmarks have only .i files and no .c files.
1. Non-issue: preprocessed benchmarks can (and will) still exist.
Benchmark programs are no longer (so) self-contained.
1. #include-d C standard library headers contain declarations and minimal implementation detail, not implementations of libc itself.
2. We will verify C programs w.r.t. The C standard, not an implementation of the standard itself.
Many/most tools need to each implement preprocessing.
1. Simple to implement: execute C preprocessor (cpp).
  1. Requires -m32 or -m64 argument depending on data model from task definition .yml.
  2. The latter requires multi-arch headers installed.
2. Preprocessing should have negligible resource usage.
3. Solution: provide clear instructions.
4. Already supported by some tools: Goblint, Mopsa, CPAchecker?, …
Should witnesses refer to locations in the .c input file or the preprocessed .i file?
1. The input .c file, which is the same for all tools, whereas the preprocessed .i file might be different, depending on possible tool customizations.
2. Solution: how will tools know locations in .c file if they only parse the .i file?
  1. Run cpp without -P argument: this produces a .i file with #line directives referring to original locations: https://gcc.gnu.org/onlinedocs/gcc-14.2.0/cpp/Line-Control.html
  2. Parse the #line directives and use them to override locations of parsed tokens.
3. Additional problem: locations inside (multiple) macro expansions may be difficult to refer to, so the proposal avoids them.
4. Additional problem: macros from C standard.

Open questions

sv-benchmarks is also used for Test-Comp. How would this affect Test-Comp?

Proposed roadmap

We propose to move towards the goal gradually, not in one huge step. That is, all .i files (with .c counterparts) will not be removed at once.

SV-COMP 2026

Change SV-COMP (and Test-Comp?) rules to allow (but not require) un-preprocessed .c files under the following constraints:

An un-preprocessed task consists of a single .c file.
1. Generalizing to multi-file programs is left for the future to not require too many changes in tools at once.
2. Thus, there will be the following kinds of tasks:
  1. Only un-preprocessed .c file as input — these require new tool support.
  2. Only preprocessed .i file as input — tools can assume .i files are preprocessed, no new tool support needed.
  3. Only preprocessed .c file as input — some tasks exist, may still be preprocessed by tool (will make no difference), but no new tool support is needed.
  4. Preprocessed .i file as input, but .c file exists for reference — majority of current tasks, no new tool support needed.
Only #include-s headers specified by the C standard or pthread.h.
1. Currently excludes programs with custom headers.
2. Simplifies preprocessing because task consists of a single file and no include paths need to be given to cpp.
No #define-s.
1. Avoids potential issues where witnesses may need to refer to locations under macro expansion.
2. ~~Possible relaxation:~~ ~~Only #define numeric constants (e.g. for loop bounds, array sizes, …)~~
No custom preprocessor arguments are needed.
1. That is, no -D arguments for #define-s from command line.
~~Possible addition:~~ **At most …% of tasks in every category may be un-preprocessed.**
1. In case some teams are worried about too drastic changes.
2. The percentage is then to be relaxed over years.
3. Would probably require scripting to measure this to keep track.

Proposal MR: !537.

The rule changes allow the following to take place in sv-benchmarks:

New benchmarks submitted with only un-preprocessed .c files.
1. Tasks with only .i files could still be accepted if either:
  1. Un-preprocessed code is no longer available.
  2. Task requires no #include-s at all.
2. Tasks with both .c and .i files could still be accepted if either:
  1. Task consists of multiple .c files combined and preprocessed into a single .i file.
  2. Code uses other #include-s not allowed above.
  3. Code uses #define-s not allowed above.
  4. Code uses custom preprocessor arguments.
Some existing tasks to be changed from .i to .c files and the .i files removed.
1. It will not be done en masse, but will be a voluntary process.
2. Requires checking of the above rule constraints by MR submitter and reviewers.

SV-COMP 2027+

Add cppflags option field to task definition .yml (like data model).
1. These are to be given to cpp when preprocessing the .c file.
2. This allow preprocessing (the same .c file) with different #define-s from the command line, a convenient pattern for some tasks to get different values of N with minimal duplication.
Custom (possibly multiple) #include files in sv-benchmarks.
1. All (transitively) #include-d files under required_files in task definition .yml.
  1. Technically already supported by the format and BenchExec: https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/issues/1124#note_693279901
2. Tools may need to add additional include paths to cpp?
3. Can create risky situations where a fix to one header accidentally affects many tasks.
  1. Would be currently done by mass regex replace, which has similar risks.
Multi-file programs.
1. All .c files under input_files in task definition .yml format.
  1. Technically already supported by the format and BenchExec: https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/issues/1124#note_693279901
  2. Most(?) BenchExec tool info modules do not support multiple input files.
2. Assumes homogeneous preprocessing for all .c files.
  1. Not true for large/complex real-world programs where different files get different compiler/preprocessor arguments (include directories, -D arguments, etc).
3. Much bigger challenge for tools: essentially source-level linking.
4. Allows real-world programs to be included into sv-benchmarks much easier and with less alterations.
Real-world programs with heterogeneous preprocessing.
1. Makefile? Compilation database (compile_commands.json)?

Previous discussions

Community meetings

SV-COMP 2021: https://gitlab.com/sosy-lab/sv-comp/archives-2021/-/issues/31
1. 2020-09-29: Get rid of preprocessing? Result of discussions on this years ago:
  1. We want exactly the files we give to the tools in the repo
  2. We want to go with the C standard (would be an argument against the preprocessing)
  3. We will add a Demo Category where tools get unpreprocessed .c files
2. 2020-10-13: ”Unpreprocessed" Demo Category in January adding optional attribute in *.yml - no new branch, but it would require changes in benchexec; if there is no "source_files" attribute, testcase is ignored something (or similar) like source_files: 'program.c', 'custom_header.h' preprocessed_file: 'program.i' insead of input_files: 'program.i'
SV-COMP 2024: 2024 SV-COMP Community Meetings
1. 2023-10-20: T05: Benchmarks preprocessing/ no preprocessing
  1. This year, we will still use preprocessed benchmarks (the `.i` files)
  2. In the future we primarily want to use non-processed benchmarks. We need a discussion in Luxembourg.
  3. New preprocessed benchmarks should contain source file(s) and the preprocessing command that was used as mandatory entries of the .yaml file.
2. 2023-11-21: Simmo: are standard library declarations in preprocessed files licensed?
  1. So far, all tasks have been preprocessed without comments, but now there’s precedent otherwise (~1800 lines of comments from headers): https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/merge_requests/1501#note_1659573598
  2. Another argument for using unpreprocessed files.
  3. Outcome: We do not require comments from headers. License information related to the headers is lost. We accept it. (And yes, it is another reason to not preprocess benchmarks in future.)
3. 2024-04-08 (Luxembourg): (PredatorHP, Franz, Philipp, Matthias Heizmann) Support of non-preprocessed files in SV-COMP ’25. Issue #1124
  1. Problems for validator tools atm.:
    1. #line directives not supported by tools nor expected by GraphML format
    2. 32- and 64-bit pre-processing not properly distinguished (at least for CPAchecker, it always runs `cpp $FILE`)
    3. Probably requires multi-arch headers to be installed (not really a problem) on the host (UAutomizer bails out for <assert.h> and <limits.h>, maybe others)
  2. Question of header licenses: https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/merge_requests/1501#note_1659573598
  3. Poor maintainability - fixes in preprocessed files, not in source files; stdlibs funcs declarations included in "*.c files" instead of header files includes, …
SV-COMP 2025: 2025 Community Meetings SV-COMP/Test-Comp
1. 2024-11-05: Hernan: use un-preprocessed sources. The topic is at least 4 years old. Many people seem to be in favor. We keep saying “let's discuss it in person in next TACAS” and nothing ever happens. Can we already decide to move forward and agree that this will be implemented for SVCOMP 2026?
  Draft: Define concurrency tasks on un-preprocessed .c sources where available (!1359) · Merge requests · SoSy-Lab / Benchmarking / SV-Benchmarks · GitLab
  1. Simmo: The first step would be to agree on a rules change allowing unpreprocessed programs for SV-COMP 2026 as early as possible (after 2025). Then it’s possible to make MRs that change subsets of tasks over/require new tasks to be unpreprocessed.
  2. ~~TODO (Hernan)~~: Open MR that will remove from competition rules the request to not have include in benchmarks.
2. 2024-12-03: Raphaël (but mentioned previously by Simmo, Hernan): un-preprocessed files? https://gitlab.com/sosy-lab/sv-comp/bench-defs/-/merge_requests/489#note_2207944643
  1. Advantages:
    1. It is difficult to keep some tasks in sync between their original and preprocessed version
  2. Disadvantages:
    1. Changing everything at once is difficult
  3. Conclusion: Will be discussed by interested people and the benchmark quality assurance team in: https://gitlab.com/sosy-lab/sv-comp/bench-defs/-/merge_requests/489#note_2207944643
  4. Conclusion: Should be done incrementally i.e. by allowlisting

Issues

Incorrect pre-processing of source files leads to incompatible C library redeclarations: sosy-lab/benchmarking/sv-benchmarks#121
Pre-processed files should be removed: sosy-lab/benchmarking/sv-benchmarks#122 (closed)
Increase Traceability and Reduce Maintenance Effort for Verification Tasks: sosy-lab/benchmarking/sv-benchmarks#123
Floating-point rounding mode set directly due to preprocessing: sosy-lab/benchmarking/sv-benchmarks#571
Do not track preprocessed files in this repository: sosy-lab/benchmarking/sv-benchmarks#1123
Use unpreprocessed files in task definitions: sosy-lab/benchmarking/sv-benchmarks#1124
Fix preprocessing inconsistency with newer CBMC versions: sosy-lab/benchmarking/sv-benchmarks#1375
Fix preprocessing consistency of modified LDV tasks: sosy-lab/benchmarking/sv-benchmarks#1408
Include all tasks in preprocessing-consistency checks: sosy-lab/benchmarking/sv-benchmarks#1409

MRs

Define concurrency tasks on un-preprocessed .c sources where available: sosy-lab/benchmarking/sv-benchmarks!1359
Remove requirement that task should not contain include statements: !489 (closed)

This MR includes the proposed rule changes but also includes a few hints on how tools should handle un-preprocessed programs.

This is a more concrete alternative to !489 (closed).

Edited Apr 10, 2025 by Simmo Saan

Relax rules to allow some un-preprocessed programs