Clarify current vs prospective properties of the AI architecture

The following discussion from !131772 (merged) should be addressed:

  • @mkaeppler started a discussion: (+5 comments)

    Just thinking out loud (question to the room!)

    I wonder if we should make this documentation more "alive" by better clarifying which parts of this architecture are fact and which are aspirational, i.e. should exist but don't yet. We could then add a section that tracks all work required to reach the desired state via epic and issue links.

    Currently, this page (and the diagram) is a mix of fact and fiction, which makes it unclear whether its role is to be a reference for the status quo (i.e. documentation) or some kind of proposal/north star (i.e. a design document).

    For example, in the architecture diagram, all AI requests go through the AI gateway, as there is no arrow that connects GitLab with 3P models directly. However, this is not true as of today. If our AIAL documentation is to be believed, the AIAL talks to 3P providers directly: https://main-ee-131772.docs.gitlab-review.app/ee/development/ai_features/index.html#abstraction-layer

    This is definitely true for Duo Chat, which even talks to several different 3P providers directly. Our team is now looking to push this access down into the AI gateway so we can use it for self-managed.

    We should make it clear somewhere that we must not talk to 3P models directly from gitlab-rails anymore, as this is in direct conflict with our ability to make AI features available to self-managed/dedicated, and we should provide links to where this work is planned, who will do it, and when.

    cc @andrewn @m_gill