Tags mark specific points in history as important
-
v0.1.1
protected · 10ef7731
sysml-bench v0.1.1: CI fix (buildah container build, decoupled publish jobs)
Fixes:
  - Replace docker-in-docker with buildah for podman runner compatibility
  - Decouple publish stages so PyPI, container, and HF dataset run in parallel
  - Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.1
  - pip install sysml-bench==0.1.1
-
v0.1.0
protected · 7b021579
sysml-bench v0.1.0: Reproducible benchmark for SysML v2 model comprehension
Initial public release:
  - 132 tasks across 13 categories (O1-O14)
  - 5 models evaluated (Claude 3.5/3.7, GPT-4o, Gemini 2.0 Flash, DeepSeek-V3)
  - Tool-augmented evaluation with sysml-cli (tree-sitter parser)
  - pip install sysml-bench
  - Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.0
  - HuggingFace: nomograph/sysml-v2-reasoning-benchmark