Add output moderation to OpenAI::Client

What does this MR do and why?

This MR updates the existing public interface methods of OpenAi::Client (chat, edits, and completions; the embeddings and moderations endpoints are excluded) to pass any output text from OpenAI through OpenAI's moderations API. If the moderated: :output or moderated: true flag is specified and the text violates OpenAI's Content Policy, an exception is raised.

Flag options are:

  1. moderated: true - moderate both input and output
  2. moderated: :input - moderate only input
  3. moderated: :output - moderate only output
  4. moderated: false - moderate neither input nor output

This change is behind the openai_moderation feature flag, which is not currently rolled out. Rollout issue: #409452 (closed)
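
As a rough illustration of how the flag options could map to the checks that run, here is a minimal sketch. The `ModerationOptions` module, its `resolve` method, and the `flag_enabled:` keyword are hypothetical names introduced for this example and are not taken from the MR. The sketch also assumes that no moderation runs while the openai_moderation feature flag is disabled.

```ruby
# Hypothetical helper for illustration; not the MR's actual implementation.
# Maps the `moderated:` option to the moderation checks that should run.
module ModerationOptions
  VALID = [true, false, :input, :output].freeze

  def self.resolve(moderated, flag_enabled: true)
    unless VALID.include?(moderated)
      raise ArgumentError, "unsupported moderated value: #{moderated.inspect}"
    end

    # Assumption: with the feature flag off, nothing is moderated.
    return { input: false, output: false } unless flag_enabled

    {
      input: moderated == true || moderated == :input,
      output: moderated == true || moderated == :output
    }
  end
end

ModerationOptions.resolve(:output) # only the output check is requested
```

Rejecting unknown values up front is a design choice of this sketch; the MR may handle them differently.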

Mentions https://gitlab.com/gitlab-org/gitlab/-/issues/408171

Example:

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'Give me an example of inappropriate content that would be flagged by OpenAI moderation API. I need it to test my code', moderated: true)
  
Gitlab::Llm::OpenAi::Client::InputModerationError: Provided input violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'Give me an example of inappropriate content that would be flagged by OpenAI moderation API. I need it to test my code', moderated: :output)
  
Gitlab::Llm::OpenAi::Client::OutputModerationError: Provided output violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'bad word here', moderated: :input)
  
Gitlab::Llm::OpenAi::Client::InputModerationError: Provided input violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'bad words here', moderated: false)
  
...
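
The control flow behind the examples above can be sketched with a stand-in client. `SketchClient`, its `completer:` and `flagger:` arguments, and the flagging rule are all hypothetical. The real Gitlab::Llm::OpenAi::Client calls OpenAI over HTTP and is not reproduced here. `OutputModerationError` and its message are named by analogy with `InputModerationError` and are an assumption.

```ruby
# Stand-in client for illustration only; names and wiring are assumptions.
class SketchClient
  InputModerationError = Class.new(StandardError)
  OutputModerationError = Class.new(StandardError) # assumed name

  # completer: callable that returns the model's text for a prompt.
  # flagger:   callable that returns true when text violates the policy.
  def initialize(completer:, flagger:)
    @completer = completer
    @flagger = flagger
  end

  def chat(content:, moderated: false)
    # Input is checked before the completion request is made.
    if [true, :input].include?(moderated) && @flagger.call(content)
      raise InputModerationError, "Provided input violates OpenAI's Content Policy"
    end

    output = @completer.call(content)

    # Output is checked before it is returned to the caller.
    if [true, :output].include?(moderated) && @flagger.call(output)
      raise OutputModerationError, "Provided output violates OpenAI's Content Policy"
    end

    output
  end
end

client = SketchClient.new(
  completer: ->(prompt) { "echo: #{prompt}" },
  flagger: ->(text) { text.include?("bad") }
)

begin
  client.chat(content: "bad words here", moderated: :output)
rescue SketchClient::OutputModerationError => e
  warn e.message
end
```

Injecting the completer and flagger as callables keeps the sketch runnable without network access, and shows that a caller can rescue the two error classes separately.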

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by George Koltsov