Connect abstraction layer to AI Gateway

Context

The reasons for connecting the monolith to the gateway for provider requests are these:

We need to support massive scale, including self-managed instances, Dedicated instances, and .com. Having all of this traffic go directly to .com presents an availability and reliability issue. There are a lot of benefits to unpack here, for example being able to deploy the gateway in multiple regions (unlike .com).
We need to support a level of real-time updates and, most importantly, no breaking changes for self-managed and Dedicated instances. While it is true that prompts are specific to providers, we will always have the option to patch a fix in the gateway and have those instances pick up the fix in real time. Keep in mind that Dedicated customers are enormous customers who we do not want to upset with broken instances.
We need flexibility for providers. Yes, we would ❤ love ❤ to commit to one provider, but the truth is that we have pivoted providers repeatedly in the last 5 months and therefore need to remain flexible. Instances are not going to update as frequently as we change providers.
If there is a provider outage, or an outage of the gateway itself, it will be isolated from .com. In the worst case, AI features will not function, but you can still use everything else, and we won't see the knock-on effects on other services that we see today.
We need to reduce the burden on development environments and the need for dozens of access keys per provider.
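
To make the provider flexibility and outage isolation points concrete, here is a minimal sketch of what the abstraction layer's call into the gateway could look like. It is illustrative only: `GatewayClient`, `GatewayUnavailable`, `CompletionResult`, and the payload fields are hypothetical names, not the actual monolith or gateway API. The sketch assumes the instance sends a provider-agnostic payload (no provider name, no provider key) and treats a gateway or provider failure as "AI feature unavailable" rather than letting the error propagate.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class GatewayUnavailable(Exception):
    """Hypothetical error a transport raises when the gateway or its provider is down."""


@dataclass
class CompletionResult:
    text: Optional[str]  # None when the AI feature is degraded
    available: bool


# A transport takes a provider-agnostic payload and returns the completion text.
# In a real system this would be an HTTP call to the gateway; here it is injected.
Transport = Callable[[dict], str]


class GatewayClient:
    """Hypothetical client the abstraction layer would use instead of calling providers."""

    def __init__(self, transport: Transport):
        self._transport = transport

    def complete(self, feature: str, prompt: str) -> CompletionResult:
        # The instance never names a provider or holds a provider key;
        # choosing the provider is assumed to be the gateway's job.
        payload = {"feature": feature, "prompt": prompt}
        try:
            return CompletionResult(text=self._transport(payload), available=True)
        except GatewayUnavailable:
            # Outage stays contained: only the AI feature degrades.
            return CompletionResult(text=None, available=False)
```

Under these assumptions, swapping providers is a gateway-side change, and an instance that has not been upgraded keeps sending the same payload. Development environments would need only gateway access rather than a key per provider.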

Edited Sep 08, 2023 by Michelle Gill