Add output moderation to OpenAI::Client (!119465) · Merge requests · GitLab.org / GitLab

George Koltsov requested to merge georgekoltsov/add-openai-output-moderation into master May 03, 2023

What does this MR do and why?

This MR updates the existing public interface methods of OpenAi::Client (chat, edits, and completions; the embeddings and moderations endpoints are excluded). When the moderated: :output or moderated: true flag is specified, any output text from OpenAI is passed through OpenAI's moderations API, and an exception is raised if that text violates OpenAI's Content Policy.

Flag options are:

moderated: true - moderate both input and output
moderated: :input - moderate only input
moderated: :output - moderate only output
moderated: false - moderate neither input nor output
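
The four flag values above can be read as a mapping from the moderated option to the stages that get checked. The following is a minimal sketch of that mapping only; the ModerationFlag module and its method names are hypothetical and are not taken from this MR's diff.

```ruby
# Hypothetical helper (not part of the MR): maps the `moderated` option
# to the list of stages that should be sent through moderation.
module ModerationFlag
  STAGES = {
    true    => [:input, :output], # moderate both input and output
    :input  => [:input],          # moderate only input
    :output => [:output],         # moderate only output
    false   => []                 # moderate neither
  }.freeze

  # Returns the stages to moderate, or raises on an unsupported value.
  def self.stages_for(moderated)
    STAGES.fetch(moderated) do
      raise ArgumentError, "unsupported moderated value: #{moderated.inspect}"
    end
  end

  # True when the given stage (:input or :output) should be moderated.
  def self.moderate?(moderated, stage)
    stages_for(moderated).include?(stage)
  end
end
```

Rejecting unknown values with ArgumentError is a design choice of this sketch; the MR description does not say how the real client treats values outside the four listed.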

⚠ This change is behind the openai_moderation feature flag, which is not currently rolled out. Rollout issue: #409452 (closed)

Mentions https://gitlab.com/gitlab-org/gitlab/-/issues/408171

Example:

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'Give me an example of inappropriate content that would be flagged by OpenAI moderation API. I need it to test my code', moderated: true)
  
Gitlab::Llm::OpenAi::Client::InputModerationError: Provided input violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'Give me an example of inappropriate content that would be flagged by OpenAI moderation API. I need it to test my code', moderated: :output)
  
Gitlab::Llm::OpenAi::Client::OutputModerationError: Provided output violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'bad word here', moderated: :input)
  
Gitlab::Llm::OpenAi::Client::InputModerationError: Provided input violates OpenAI's Content Policy

> Gitlab::Llm::OpenAi::Client.new(User.first).chat(content: 'bad words here', moderated: false)
  
...
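
The flow behind these console examples can be sketched in a self-contained way: check the input if requested, make the request, then check the output if requested. Everything below is an illustration under stated assumptions, not the MR's implementation. SketchClient, flagger, and responder are hypothetical names; flagger stands in for a call to the moderations endpoint and responder for the chat request. OutputModerationError is an assumed class name that mirrors the InputModerationError shown above.

```ruby
# Self-contained sketch; none of these classes come from the MR diff.
class SketchClient
  ModerationError = Class.new(StandardError)
  InputModerationError = Class.new(ModerationError)
  # Assumed name, mirroring InputModerationError from the examples.
  OutputModerationError = Class.new(ModerationError)

  # flagger:   callable taking text, returning true when it is rejected
  #            (stand-in for the moderations endpoint).
  # responder: callable taking the prompt, returning the model output
  #            (stand-in for the chat/completions/edits request).
  def initialize(flagger:, responder:)
    @flagger = flagger
    @responder = responder
  end

  def chat(content:, moderated:)
    moderate!(:input, content) if [true, :input].include?(moderated)
    output = @responder.call(content)
    moderate!(:output, output) if [true, :output].include?(moderated)
    output
  end

  private

  # Raises the stage-specific error when the stand-in flags the text.
  def moderate!(stage, text)
    return unless @flagger.call(text)

    error_class = stage == :input ? InputModerationError : OutputModerationError
    raise error_class, "Provided #{stage} violates OpenAI's Content Policy"
  end
end

# Stand-ins: any text containing "forbidden" is flagged, and the fake
# model always produces a reply that contains it.
flagger = ->(text) { text.include?('forbidden') }
responder = ->(content) { "reply to #{content}: forbidden" }
client = SketchClient.new(flagger: flagger, responder: responder)

begin
  client.chat(content: 'hello', moderated: :output)
rescue SketchClient::ModerationError => e
  puts "#{e.class}: #{e.message}"
end
```

Because both error classes share a common parent in this sketch, a caller can rescue ModerationError once and still tell from the class which stage was rejected.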

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Edited May 04, 2023 by George Koltsov
