Allow users to opt out of prompt caching for code completion
We would like to implement an opt-out mechanism for any users who do not want to use prompt caching for code completion.
I think the implementation would be something like this:
1. Add a top-level namespace setting (in GitLab Rails) to let admins opt out of prompt caching
   1. This setting should then apply to all groups and projects within the top-level namespace.
2. Pass that setting along in the headers returned to the client when it fetches `direct_access` (see [code here](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/api/code_suggestions.rb#L204)). The same information must also be added for indirect connections.
3. The client/IDE attaches these headers to each request sent to the AI Gateway (AIGW); this already happens.
4. AIGW checks the header and passes the information to Fireworks (see [code here](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/ccf5ad2494dbaee40f6d86c5ac1a529a7d442dd9/ai_gateway/models/litellm.py#L211))
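
As a rough illustration of step 4, here is a minimal Python sketch of how AIGW might read the opt-out header and adjust the arguments it forwards to the model provider. Nothing here is taken from the existing code: the header name `X-Gitlab-Model-Prompt-Cache-Enabled`, the helper names, and the `prompt_cache_max_len` argument are all assumptions made for illustration, and the real header and provider parameter still need to be agreed on.

```python
from typing import Any, Dict, Mapping

# Hypothetical header name; this issue does not define one yet.
PROMPT_CACHE_HEADER = "x-gitlab-model-prompt-cache-enabled"


def prompt_cache_enabled(headers: Mapping[str, str]) -> bool:
    """Return False only when the header explicitly opts out.

    A missing or unrecognised value keeps caching on, so existing
    clients that do not send the header are unaffected.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == PROMPT_CACHE_HEADER:
            return value.strip().lower() != "false"
    return True


def build_completion_args(
    headers: Mapping[str, str], base_args: Mapping[str, Any]
) -> Dict[str, Any]:
    """Copy the base completion arguments and apply the opt-out if requested."""
    args = dict(base_args)
    if not prompt_cache_enabled(headers):
        # Assumed provider-side switch for disabling prompt caching;
        # the actual parameter name must be confirmed against the provider.
        args["prompt_cache_max_len"] = 0
    return args
```

Defaulting to "caching enabled" when the header is absent keeps the change backwards compatible; only namespaces that have opted out would see different behaviour.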
**AI Settings notes**
We will also need to work across teams to ensure there's an AI Settings admin option to support the opt out. Prior details on this related to Chat are captured here: https://gitlab.com/groups/gitlab-org/-/epics/16708