llm-as-judge

Projects with this topic

    Nomograph Labs / jig

    Agent-shape testing harness that measures how an LLM-driven agent uses a tool's CLI, scored by an LLM judge.

    AI agent claude claude-code Rust cli developer-tools benchmark testing llm-as-judge
    Updated Apr 27, 2026