Add prompt caching opt-out backend
What does this MR do and why?
We would like to implement an opt-out mechanism for users who do not want to use prompt caching for code completion.
This MR does the following:
- Add a top-level namespace setting (in GitLab Rails) to let admins opt out of prompt caching
- This setting should then apply to all groups and projects within the top-level namespace.
- Pass that information along in the headers returned to the client when fetching direct_access (see code here). We need to make sure this information is added for indirect connections as well.
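The header addition can be sketched roughly as follows. This is an illustrative sketch only; the method name and call site are hypothetical, not the actual GitLab code:

```ruby
# Hypothetical sketch: the resolved opt-out setting is serialized as a
# string-valued header in the direct_access response. Names are illustrative.
def model_prompt_cache_header(enabled)
  { 'X-Gitlab-Model-Prompt-Cache-Enabled' => enabled.to_s }
end

model_prompt_cache_header(false)
# => {"X-Gitlab-Model-Prompt-Cache-Enabled"=>"false"}
```

Serializing the boolean to `"true"`/`"false"` keeps the header shape consistent with the other `X-Gitlab-*` headers, which are all strings.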
Implementation
Followed the guide here: !143278 (diffs)
AI Settings notes
We will also need to work across teams to ensure there's an AI Settings admin option to support the opt-out. Prior details on this related to Chat are captured here: &16708 (closed)
Note: using !188732 (diffs) as an example.
```ruby
let(:top_level_namespace) { create(:group) }
let(:enabled_group) { create(:group, parent: top_level_namespace) }
let(:project_setting) { create(:project_setting, model_prompt_cache_enabled: false) }
let(:disabled_project) { create(:project, group: enabled_group, project_setting: project_setting) }
```
```ruby
# set at project_setting
project.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => false
group.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => true
application_setting.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => true
project.model_prompt_cache_enabled # => false
group.model_prompt_cache_enabled # => true
application_setting.model_prompt_cache_enabled # => true

# set at group namespace_setting
project.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => nil
group.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => false
application_setting.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => true
project.model_prompt_cache_enabled # => false
group.model_prompt_cache_enabled # => false
application_setting.model_prompt_cache_enabled # => true

# set at application_setting
project.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => nil
group.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => nil
application_setting.read_attribute_before_type_cast(:model_prompt_cache_enabled)
# => true
project.model_prompt_cache_enabled # => true
group.model_prompt_cache_enabled # => true
application_setting.model_prompt_cache_enabled # => true
```
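The cascade above can be sketched as a simple nil-fallthrough, where a stored `nil` at one level defers to the next level up. This is a standalone sketch only; the actual implementation relies on the cascading-settings framework referenced in !143278:

```ruby
# Sketch of the cascading resolution: nil at one level falls through to the
# next (project -> group -> application setting). Illustrative only.
# Note that `false` is a real value and must NOT fall through, which is why
# we compact out nils rather than testing truthiness.
def resolve_model_prompt_cache_enabled(project_value, group_value, app_value)
  [project_value, group_value, app_value].compact.first
end

resolve_model_prompt_cache_enabled(false, true, true) # => false (set at project)
resolve_model_prompt_cache_enabled(nil, false, true)  # => false (set at group)
resolve_model_prompt_cache_enabled(nil, nil, true)    # => true  (application default)
```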
References
gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!2376 (merged)
Screenshots or screen recordings
Verify:
- set the project setting to false:

```ruby
[33] pry(main)> p = Project.where(name: "Gitlab Shell")
[33] pry(main)> p = p.first
[33] pry(main)> ps = p.project_setting
[33] pry(main)> ps.model_prompt_cache_enabled = false
[33] pry(main)> ps.save
```
```shell
curl --request POST \
  --url 127.0.0.1:3000/api/v4/code_suggestions/direct_access \
  --header "Authorization: Bearer glpat-4dXV1h" \
  --header "Content-Type: application/json" \
  -d '{"project_path":"gitlab-org/gitlab-shell"}'
```
```json
{
  "base_url":"http://0.0.0.0:5052",
  "token":"e",
  "expires_at":1745421283,
  "headers":{
    "X-Gitlab-Feature-Enablement-Type":"add_on",
    "x-gitlab-enabled-feature-flags":"",
    "x-gitlab-enabled-instance-verbose-ai-logs":"true",
    "X-Gitlab-Host-Name":"127.0.0.1",
    "X-Gitlab-Instance-Id":"5b4465f2-986e-44b1-adcf-35492a19369a",
    "X-Gitlab-Realm":"self-managed",
    "X-Gitlab-Version":"18.0.0",
    "X-Gitlab-Global-User-Id":"pYtWOLH65PxMJSvaiKZhAESiyW2Gi/I/c+/0QjgQls8=",
    "X-Gitlab-Duo-Seat-Count":"100",
    "X-Gitlab-Feature-Enabled-By-Namespace-Ids":"1000000",
    "X-Gitlab-Model-Prompt-Cache-Enabled":"false",
    "X-Gitlab-Authentication-Type":"oidc"
  },
  "model_details":{
    "model_provider":"fireworks_ai",
    "model_name":"codestral-2501"
  }
}
```
- set the top-level namespace setting to false:

```ruby
[33] pry(main)> ps.model_prompt_cache_enabled = nil
[33] pry(main)> ps.save
[33] pry(main)> g = p.parent
[33] pry(main)> g.parent # => nil, so g is the top-level namespace
[33] pry(main)> gn = g.namespace_settings
[33] pry(main)> gn.model_prompt_cache_enabled = false
[33] pry(main)> gn.save
```
```shell
curl --request POST \
  --url 127.0.0.1:3000/api/v4/code_suggestions/direct_access \
  --header "Authorization: Bearer glpat-4dXV1hMz" \
  --header "Content-Type: application/json" \
  -d '{"project_path":"gitlab-org/gitlab-shell"}'
```
```json
{
  "base_url":"http://0.0.0.0:5052",
  "token":"e",
  "expires_at":1745421283,
  "headers":{
    "X-Gitlab-Feature-Enablement-Type":"add_on",
    "x-gitlab-enabled-feature-flags":"",
    "x-gitlab-enabled-instance-verbose-ai-logs":"true",
    "X-Gitlab-Host-Name":"127.0.0.1",
    "X-Gitlab-Instance-Id":"5b4465f2-986e-44b1-adcf-35492a19369a",
    "X-Gitlab-Realm":"self-managed",
    "X-Gitlab-Version":"18.0.0",
    "X-Gitlab-Global-User-Id":"pYtWOLH65PxMJSvaiKZhAESiyW2Gi/I/c+/0QjgQls8=",
    "X-Gitlab-Duo-Seat-Count":"100",
    "X-Gitlab-Feature-Enabled-By-Namespace-Ids":"1000000",
    "X-Gitlab-Model-Prompt-Cache-Enabled":"false",
    "X-Gitlab-Authentication-Type":"oidc"
  },
  "model_details":{
    "model_provider":"fireworks_ai",
    "model_name":"codestral-2501"
  }
}
```
- leave the project and namespace settings unset, so the application setting is used as the default:

```ruby
[33] pry(main)> gn.model_prompt_cache_enabled = nil
[33] pry(main)> gn.save
[42] pry(main)> ApplicationSetting.first.model_prompt_cache_enabled
# => true
```
```json
{
  "base_url":"http://0.0.0.0:5052",
  "token":"e",
  "expires_at":1745422111,
  "headers":{
    "X-Gitlab-Feature-Enablement-Type":"add_on",
    "x-gitlab-enabled-feature-flags":"",
    "x-gitlab-enabled-instance-verbose-ai-logs":"true",
    "X-Gitlab-Host-Name":"127.0.0.1",
    "X-Gitlab-Instance-Id":"5b4465f2-986e-44b1-adcf-35492a19369a",
    "X-Gitlab-Realm":"self-managed",
    "X-Gitlab-Version":"18.0.0",
    "X-Gitlab-Global-User-Id":"pYtWOLH65PxMJSvaiKZhAESiyW2Gi/I/c+/0QjgQls8=",
    "X-Gitlab-Duo-Seat-Count":"100",
    "X-Gitlab-Feature-Enabled-By-Namespace-Ids":"1000000",
    "X-Gitlab-Model-Prompt-Cache-Enabled":"true",
    "X-Gitlab-Authentication-Type":"oidc"
  },
  "model_details":{
    "model_provider":"fireworks_ai",
    "model_name":"codestral-2501"
  }
}
```
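To spot-check the returned header without eyeballing the full payload, the response body can be inspected with a short script (a convenience sketch for local verification, not part of the MR):

```ruby
require 'json'

# Parse a direct_access response body (truncated here to the relevant part)
# and pull out the prompt-cache header with Hash#dig.
body = '{"headers":{"X-Gitlab-Model-Prompt-Cache-Enabled":"true"}}'
value = JSON.parse(body).dig('headers', 'X-Gitlab-Model-Prompt-Cache-Enabled')
puts value # prints "true"
```

Piping the curl output into a script like this (or `jq '.headers'`) makes it quick to re-check the header after toggling each settings level.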
How to set up and validate locally
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #535651 (closed)
Edited by Allen Cook