Tags

Tags mark specific points in the repository history as important.
  • v0.1.1

    protected
    sysml-bench v0.1.1: CI fix — buildah container build, decoupled publish jobs
    
    Fixes:
    - Replace docker-in-docker with buildah for podman runner compatibility
    - Decouple publish stages so PyPI, container, and HF dataset run in parallel
    - Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.1
    - PyPI: pip install sysml-bench==0.1.1
  • v0.1.0

    protected
    sysml-bench v0.1.0: Reproducible benchmark for SysML v2 model comprehension
    
    Initial public release:
    - 132 tasks across 13 categories (O1-O14)
    - 5 models evaluated (Claude 3.5/3.7, GPT-4o, Gemini 2.0 Flash, DeepSeek-V3)
    - Tool-augmented evaluation with sysml-cli (tree-sitter parser)
    - PyPI: pip install sysml-bench
    - Container: registry.gitlab.com/nomograph/sysml-bench:v0.1.0
    - HuggingFace: nomograph/sysml-v2-reasoning-benchmark
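
The tag notes above pair each tag with a PyPI install line and a container reference. The following is a minimal sketch of that mapping, assuming the PyPI version is always the tag name without its leading "v" and the container is always tagged with the full tag name (as the v0.1.1 notes show). The helper name `release_refs` and the constants are hypothetical and not part of sysml-bench.

```python
import re

# Assumption: package name and registry path are copied from the tag notes.
# Everything else here is illustrative and not part of sysml-bench itself.
PACKAGE = "sysml-bench"
IMAGE = "registry.gitlab.com/nomograph/sysml-bench"

# Tags in this list look like "v0.1.1": a "v" followed by three numeric parts.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def release_refs(tag: str) -> dict[str, str]:
    """Map a tag such as 'v0.1.1' to a pinned pip requirement and an image reference."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    version = ".".join(match.groups())
    return {
        "pip": f"{PACKAGE}=={version}",
        "container": f"{IMAGE}:{tag}",
    }


if __name__ == "__main__":
    for tag in ("v0.1.1", "v0.1.0"):
        refs = release_refs(tag)
        print(f"{tag}: pip install {refs['pip']} | {refs['container']}")
```

Pinning the version this way keeps an environment tied to one tagged release, whereas an unpinned `pip install sysml-bench` resolves to whatever release is newest at install time.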