Draft: PoC: pytest/langsmith integration tests (!320) · Merge requests · GitLab.org / duo-workflow / Duo Workflow Service

A proof of concept of pytest/LangSmith integration for agent, tool, and prompt integration testing.

tests/duo_workflow_service/tools/test_issue_integration_simple.py
- Uses a simple system prompt rather than the real one, so the test focuses on the tool prompts. It checks only 2 URLs.
- LangSmith results
tests/duo_workflow_service/tools/test_issue_integration_description_variants.py
- Compares different description variants for the get_issue tool. This shows how we could quickly compare a few prompt variants to see which performs best, and rerun the test several times to check reliability.
- LangSmith results
- The failures show that some forms of URL input still can't be handled properly
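
To make the variant comparison concrete, here is a minimal standalone sketch of the idea, not the code in this MR. The variant texts, the example URLs, and `fake_agent` (a deterministic stand-in for the LLM call) are all hypothetical. In the real tests the agent would be the model with the get_issue tool bound, and each case would be a pytest test (for example via `pytest.mark.parametrize`) reported to LangSmith.

```python
import re
from collections import Counter

# Hypothetical description variants for the get_issue tool.
# The wording used in the MR is not reproduced here.
DESCRIPTION_VARIANTS = {
    "ids_only": "Fetch an issue given project_id and issue_iid.",
    "url_aware": "Fetch an issue given project_id and issue_iid, or a full issue URL.",
}

# Hypothetical cases: (user input, arguments we expect the agent to pass to the tool).
CASES = [
    ("https://gitlab.com/group/project/-/issues/42",
     {"project": "group/project", "iid": 42}),
    ("https://gitlab.com/group/sub/project/-/issues/7",
     {"project": "group/sub/project", "iid": 7}),
]


def fake_agent(description, user_input):
    """Stand-in for the LLM: returns the tool arguments it would choose.

    It only handles URLs when the description mentions them, which mimics
    a description variant changing the model's behaviour.
    """
    if "URL" not in description:
        return None
    match = re.search(r"https?://[^/]+/(.+?)/-/issues/(\d+)", user_input)
    if match is None:
        return None
    return {"project": match.group(1), "iid": int(match.group(2))}


def compare_variants(agent, runs=3):
    """Run every case `runs` times per variant and return the pass rate per variant."""
    passes = Counter()
    total = runs * len(CASES)
    for name, description in DESCRIPTION_VARIANTS.items():
        for _ in range(runs):
            for user_input, expected in CASES:
                if agent(description, user_input) == expected:
                    passes[name] += 1
    return {name: passes[name] / total for name in DESCRIPTION_VARIANTS}
```

With a real model the repeated runs matter because outputs are not deterministic; a variant that passes every case once may still fail on a rerun, so the pass rate is a more useful signal than a single green result.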

Note: The tests don't run in CI at the moment because they need a LangSmith token. You can run them locally like any other pytest tests, with one additional LangSmith environment variable set:

export LANGSMITH_TEST_SUITE="LLM Integration tests" # the name of the LangSmith dataset that the tests will be grouped into 
poetry run python -m pytest "tests/duo_workflow_service/tools/test_issue_integration_simple.py"
