7b021579 · fix: replace docker-in-docker with buildah for container build · Mar 10, 2026

sysml-bench v0.1.0: Reproducible benchmark for SysML v2 model comprehension

Initial public release:
- 132 tasks across 13 categories (O1-O14)
- 5 models evaluated (Claude 3.5, Claude 3.7, GPT-4o, Gemini 2.0 Flash, DeepSeek-V3)
- Tool-augmented evaluation with sysml-cli (tree-sitter parser)
- Install: pip install sysml-bench
- Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.0
- HuggingFace: nomograph/sysml-v2-reasoning-benchmark
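
A minimal sketch of fetching the artifacts listed above, pinned to this tag. Two assumptions are not confirmed by these notes: that the PyPI version is the tag without its leading "v" (0.1.0), and that docker is the container client in use. The RUN variable is a hypothetical dry-run switch: by default the script only prints the commands. The HuggingFace repository is left out because its repository type is not stated here.

```shell
#!/bin/sh
# Sketch only: print (or run) the install commands for sysml-bench v0.1.0.
TAG="v0.1.0"

# Container reference exactly as given in the release notes.
IMAGE="registry.gitlab.com/nomograph/sysml-bench:${TAG}"

# ASSUMPTION: the PyPI release is versioned as the tag minus the leading "v".
PIP_SPEC="sysml-bench==${TAG#v}"

# Hypothetical dry-run switch: unset RUN means "echo", so nothing is executed.
# Invoke the script with RUN set to the empty string to run the commands.
RUN=${RUN-echo}

$RUN pip install "$PIP_SPEC"
$RUN docker pull "$IMAGE"
```

Run as is, the script prints the two commands for review; running it with `RUN=` set to the empty string in the environment executes them.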