Split duochat documentation eval

Problem

eli5 implements an end-to-end approach to evaluating documentation search. This captures the user experience as a whole, but it makes it harder for a team trying to improve the score to decide where to focus.

Proposed Solution

A request for documentation has a few different steps:

  1. Identify from the input that documentation is needed
  2. Format action input
  3. Retrieve the correct documents based on the action input
  4. Generate response based on results

All of these steps need to be logged, and each can be evaluated independently.
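
As a rough illustration of what per-step logging and evaluation could look like, here is a minimal Python sketch. Every name in it (`DocRequestTrace`, `Expected`, `evaluate`, the field names) is hypothetical and not part of eli5 or Duo Chat. It scores only steps 1 and 3, since those can be checked against labels directly. Steps 2 and 4 would likely need a reference query or a judged comparison, which is left out here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocRequestTrace:
    """Hypothetical log record: one field per step of a documentation request."""
    user_input: str
    tool_selected: bool                 # step 1: was documentation search chosen?
    action_input: str = ""              # step 2: formatted input passed to retrieval
    retrieved_ids: list[str] = field(default_factory=list)  # step 3: documents returned
    response: str = ""                  # step 4: generated answer


@dataclass
class Expected:
    """Hypothetical ground-truth labels for one request."""
    needs_docs: bool
    relevant_ids: set[str] = field(default_factory=set)


def tool_selection_correct(trace: DocRequestTrace, expected: Expected) -> bool:
    # Step 1 is correct when the tool was selected exactly when docs were needed.
    return trace.tool_selected == expected.needs_docs


def retrieval_recall(trace: DocRequestTrace, expected: Expected) -> float:
    # Step 3: fraction of the relevant documents that were actually retrieved.
    if not expected.relevant_ids:
        return 1.0
    hits = expected.relevant_ids & set(trace.retrieved_ids)
    return len(hits) / len(expected.relevant_ids)


def evaluate(pairs: list[tuple[DocRequestTrace, Expected]]) -> dict[str, float]:
    selection = [tool_selection_correct(t, e) for t, e in pairs]
    # Score retrieval only where it should have run and did run, so a
    # step 1 failure does not also count against step 3.
    recalls = [
        retrieval_recall(t, e)
        for t, e in pairs
        if e.needs_docs and t.tool_selected
    ]
    return {
        "tool_selection_accuracy": sum(selection) / len(selection) if selection else 0.0,
        "retrieval_recall": sum(recalls) / len(recalls) if recalls else 0.0,
    }
```

Reporting one number per step, instead of a single end-to-end score, is what lets a team see which stage is dragging the overall result down.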