sysml-bench v0.1.0: Reproducible benchmark for SysML v2 model comprehension

Initial public release:
- 132 tasks across 13 categories (O1-O14)
- 5 models evaluated (Claude 3.5/3.7, GPT-4o, Gemini 2.0 Flash, DeepSeek-V3)
- Tool-augmented evaluation with sysml-cli (tree-sitter parser)
- pip install sysml-bench
- Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.0
- HuggingFace: nomograph/sysml-v2-reasoning-benchmark