Refactor: Messaging Adapter Architecture for AI Flow Triggers

🤖 This has been AI assisted with Duo Chat

Summary

Redesign the messaging adapter infrastructure introduced in !234959 (closed) / !232343 (merged) to provide a clean, scalable foundation for triggering AI flows from any messaging surface (Slack, GitLab note mentions, @GitLabDuo, future surfaces). The current design conflates orchestration with policy, blocks several upcoming use cases, and misses architectural feedback (!234959#note_3328524508). This issue captures the agreed-on end-state design and the reasoning that led to it.

Context

Where this came from

The messaging adapter pattern was introduced for the Slack MVC (epic &590434):

The PoCs surfaced bugs and architectural pain points (composite identity missing in Slack path, no authorization in messaging path, duplicated orchestration logic between RunService and TriggerFlowService). !234959 (closed) attempted to consolidate these. Review surfaced deeper problems that warrant a clean foundation rather than incremental fixes.

  • ADR: gitlab-com/content-sites/handbook!19714 — needs to be updated to reflect this design
  • Architectural discussion #599330 (closed)@GitLabDuo mention vs Slack integration approaches; established that messaging adapter pattern + CI runner is the chosen direction
  • Flow versioning: !235204 (merged) — introduces FoundationalFlow#flow_version, flow_config_id, flow_config_schema_version; surfaces should be able to override version

Surfaces that need to work

Surface Auth model Service account source Flow source Project source
GitLab note @mention (foundational flows) :trigger_ai_flow upstream + FlowTrigger record existence From FlowTrigger.service_account From the foundational flow tied to the consumer note.project
@GitLabDuo mention in MR MR-level ability (e.g. :execute_ai_review_merge_request) Dedicated bot service account Today: borrows existing flow. Long-term: own catalog item that routes internally MR's project
Slack Slack workspace install + namespace mapping From namespace's catalog item consumer Hardcoded developer/v1 initially Lazily-created duo-workspace project
Future custom entry points (e.g. WhatsApp, webhooks) Surface-defined Surface-defined Surface-defined Surface-defined

Note on @GitLabDuo: today it already runs flows but bypasses the modern Duo Workflow path. The long-term plan is for @GitLabDuo to become a routing entry point — potentially with its own catalog item that internally dispatches to developer/v1, code_review/v1, or a future router agent. The redesign must not assume that the surface's catalog item and the executed flow's catalog item are the same. What the redesign locks in is the property that the surface owns auth and flow selection; it does not preclude @GitLabDuo later having its own first-class catalog item with its own quota/billing/audit attachment.

Constraints to honor

  • Must not require Developer foundational flow to be enabled in a namespace just because @GitLabDuo borrows its current flow (kinsingh's UX problem in #599330 (closed))
  • Must support flow versioning per !235204 (merged) (surface can override version)
  • Must use composite identity primitives from lib/gitlab/auth/identity.rb, not parallel modules (Fred's blocking review on !234959#note_3328524508)
  • Non-foundational catalog flows are out of scope but must not be a one-way-door decision
  • Existing functionality (Code Review, Duo Developer assign/mention, Fix Pipeline) must keep working

Problems with the current design (!234959)

  1. Orchestrator conflates orchestration with policy. TriggerFlowService does namespace resolution, enablement check, service account resolution, authorization, project resolution, composite identity linking, SA membership grant, callback enrichment, and workflow execution. Too many responsibilities.
  2. Flow reference is overloaded. A single flow_reference string drives four independent decisions: which FoundationalFlow to look up, which catalog item to authorize against, which catalog item to check enablement for, and which workflow definition to execute. This blocks any surface that wants to authorize differently than via namespace-level catalog item enablement (e.g. @GitLabDuo).
  3. Two parallel orchestration paths. RunService (mention path, pre-resolved) and TriggerFlowService (Slack path, cold-start) both end up at ExecuteWorkflowService but resolve and gate completely differently. The promised "single orchestration primitive" doesn't exist.
  4. Composite identity duplicates auth-domain code. Ai::CompositeIdentity::Linker recreates a pattern (fabricate(sa).link!(user)) that already exists in Gitlab::Auth::Identity as link_from_web_request, link_from_oauth_token, etc. Fred (review) and Sahasranamam (auth/composite identity owner) prefer the linking primitive lives in the auth domain.
  5. No room for flow version override. !235204 (merged) introduces version pinning, but the orchestrator hardcodes version derivation from the catalog item consumer. A surface can't say "run developer/v1 at 1.0.0."
  6. Surface flexibility is poor. Any new surface inherits the cold-start gating logic, even when it has its own auth path. Any divergence requires escape hatches.
  7. Blanket safe { } swallows errors that should fail loudly. Wrapping every lifecycle hook in rescue StandardError + track_exception means that if on_request_received (which creates the user-visible "🔄 Processing..." note) fails, the user sees nothing yet the flow proceeds. The result is a workflow whose final response posts back into a discussion that never had an acknowledgement note. Resilience is good; silent resilience is not.
  8. Two failure hooks (on_flow_failed, on_trigger_failed) for what is really a single concern. The dual hooks were introduced in !232441 (closed) to handle the Slack adapter logging spurious no_reaction errors when removing the 👀 reaction in the sync failure path (where the reaction had never been added). The actual requirement is "adapters need to know whether earlier lifecycle state exists, so cleanup can be conditional," not two distinct hook names.

Design decisions and reasoning

Surface owns policy, base adapter owns mechanism

The fundamental restructuring: the surface (concrete adapter) handles all decisions specific to its context — authorization, service account selection, flow selection, project selection, goal building. The shared infrastructure (Adapters::Base) handles only the mechanical work that no surface can or should do for itself — composite identity linking, SA project membership, callback context enrichment, resource translation, and workflow execution.

This eliminates the "two parallel paths" problem for adapter-shaped surfaces because RunService-style pre-resolved surfaces and TriggerFlowService-style cold-start surfaces both fit the same model — they just do their resolution differently inside build_trigger_bundle. RunService itself remains a parallel path for non-foundational flows and the assignment-trigger path until a future consolidation moves the security primitives into ExecuteWorkflowService itself (out of scope, see "Out of scope").

Typed TriggerBundle value object instead of symbol-keyed hash

build_trigger_bundle returns a Data.define-based value object, not a hash:

TriggerBundle = Data.define(
  :current_user, :service_account, :flow_reference, :flow_version,
  :project, :goal, :resource, :additional_context, :source_branch
)

Data.define raises ArgumentError on unknown keys and on missing required keys. This catches contract violations at construction time inside the adapter, instead of with a deep NoMethodError inside ExecuteWorkflowService. Adapter authors get attribute access (bundle.service_account) instead of payload[:service_account].

This pattern is established in the codebase, see for reference:

  • ee/lib/security/scan_result_policies/push_settings_overrides.rb
  • ee/lib/gitlab/security/orchestration/pipeline_execution_policies/intervals.rb
  • lib/api/helpers/notes_helpers.rb
  • ee/app/services/search/zoekt/rollout_service.rb

No separate TriggerFlowService

We considered keeping a thin TriggerFlowService between the adapter and ExecuteWorkflowService for symmetry. We rejected this because the service ends up being ~50 lines of glue around 3 lines of real work, with no behavior that isn't already part of the adapter's trigger lifecycle. A standalone service would be a fig leaf, not a meaningful boundary.

The alternative of moving everything into concrete adapters was also rejected: composite identity linking and SA project membership are security-critical primitives that should not be per-adapter discretion. Putting them in Adapters::Base#trigger makes them inherited by contract.

Composite identity preconditions split

Three preconditions exist before linking today:

  1. service_account.composite_identity_enforced? — auth-domain knowledge, lives in Gitlab::Auth::Identity
  2. Ai::Setting.instance.duo_workflow_oauth_application is configured — AI-platform configuration check
  3. :ai_flow_triggers_use_composite_identity feature flag — temporary rollout gate

(1) goes into Gitlab::Auth::Identity.link_for_ai_flow. (2) and (3) are AI-flow concerns, kept as caller responsibilities in Adapters::Base (or removed when rollout completes). This addresses Fred's review without recreating Ai::CompositeIdentity::Linker inside Gitlab::Auth::Identity (which would miss the point).

Fred's blocking review concern (!234959#note_3328524508) is satisfied by either this split or by pushing the FF + OAuth-app checks into Gitlab::Auth::Identity.link_for_ai_flow itself. We're going with the split above since the FF is an AI-flow rollout concern, not an auth-domain concern. Loop @skundapur in for review on the link_for_ai_flow addition (he owns lib/gitlab/auth/identity.rb); if he prefers everything in the auth domain, the change is a small refactor that doesn't affect the rest of this design.

Tiered hook resilience (replaces blanket safe { })

Three categories of work in Base#trigger:

  1. Must succeed for the user to know anything happened. Example: on_request_received creating the "🔄 Processing..." note. If this fails, the user has no acknowledgement that their request was received. The flow MUST NOT proceed silently.
  2. Should succeed but the world keeps spinning if it doesn't. Example: on_flow_started updating the existing note from "Processing" to "Started." If this fails, the note still exists from step 1; worst case it shows "Processing" longer than it should.
  3. Hard requirements for correctness. Example: composite identity linking, SA project membership grant, ExecuteWorkflowService.execute. These cannot be skipped, no wrapper.

Implementation:

def must
  yield
  ServiceResponse.success
rescue StandardError => e
  ::Gitlab::ErrorTracking.track_exception(e, adapter: self.class.name)
  ServiceResponse.error(message: e.message, reason: :hook_failed)
end

def safe
  yield
rescue StandardError => e
  ::Gitlab::ErrorTracking.track_and_log_exception(e, adapter: self.class.name)
end

Base#trigger short-circuits if must { on_request_received } returns an error.

Single failure hook with workflow context

We collapse on_flow_failed and on_trigger_failed into a single hook with an optional workflow: kwarg:

def on_flow_failed(callback_context:, error:, workflow: nil)
  deliver_error(callback_context: callback_context, error: error)
end
  • Sync path (Base#trigger after ExecuteWorkflowService returns error): pass workflow: nil.
  • Async path (CallbackWorker after a started workflow fails): pass the actual workflow.

Adapters that need conditional cleanup branch on workflow.present?. The Slack adapter only removes the 👀 reaction when workflow.present?, eliminating the phantom no_reaction log Marco flagged in !232441#note_3317304974.

This is strictly more useful than two hook names because adapters get richer state (the workflow object itself, not just a phase boolean) and the abstraction matches natural language ("the flow failed").

Adapter registry built from descendants, not manual registration

Each concrete adapter declares its own key as a class method:

class GitlabNote < Base
  ADAPTER_KEY = 'gitlab_note'

  def self.adapter_key = ADAPTER_KEY
end

CallbackWorker::ADAPTER_REGISTRY is built from Adapters::Base.descendants, with a CI-time uniqueness check on adapter_key. Base#enriched_callback_context auto-injects 'adapter' => self.class.adapter_key so adapters cannot accidentally write a string into the callback context that doesn't match their registry key. Removes a whole class of "I forgot to register / I wrote the wrong string" bugs.

No Flow value object (yet)

We initially proposed an Ai::Messaging::Flow struct to carry reference, config_id, schema_version, version. We dropped it: !235204 (merged) already does the splitting inside StartWorkflowService via FoundationalFlow.for_reference. The adapter only needs to pass flow_reference (string) and flow_version (string) to ExecuteWorkflowService. Avoids introducing a value object that would need to evolve as flow versioning matures.

Always call ServiceAccountMemberAddService

ServiceAccountMemberAddService short-circuits when membership exists. The cost is one indexed DB lookup. Always-on guarantees no surface ever forgets to grant access. The cost of forgetting (runtime permission failure deep in workflow execution) far exceeds the cost of one extra query.

Proposed architecture

Surface event (Slack message, MR note, webhook, ...)


Concrete Adapter (extends Adapters::Base)
        │   - knows the surface
        │   - does its own auth
        │   - resolves: user, service_account, flow_reference, flow_version, project, resource, goal
        │   - implements build_trigger_bundle + build_callback_context + delivery

        ▼ adapter.trigger
Adapters::Base#trigger
        │   - must { on_request_received }            -- short-circuits on failure
        │   - build_trigger_bundle (returns TriggerBundle or error)
        │   - build_callback_context                  -- runs after on_request_received
        │   - composite identity link (via Gitlab::Auth::Identity)
        │   - service account project membership (via ServiceAccountMemberAddService)
        │   - callback context enrichment (service_account_id, flow_reference, flow_version, adapter_key)
        │   - resource translation (Issue → issue_id, MergeRequest → merge_request_id)


Ai::Catalog::ExecuteWorkflowService.execute  (existing, unchanged)


Workflow runs on CI runner

        ▼ async, on completion
Ai::Messaging::CallbackWorker
        │   - looks up adapter from descendants registry
        │   - dispatches deliver_result / on_flow_completed / on_flow_failed (with workflow:)

Adapter posts result back to surface

Components to add

# in lib/gitlab/auth/identity.rb
def self.link_for_ai_flow(service_account:, scoped_user:)
  return unless service_account
  return unless scoped_user
  return unless service_account.composite_identity_enforced?

  link_from_web_request(service_account: service_account, scoped_user: scoped_user)
end

Lives alongside existing factory methods (link_from_web_request, link_from_oauth_token, etc.). Owner: @skundapur and the Auth team.

Ai::Messaging::Adapters::Base (redesigned)

module Ai
  module Messaging
    module Adapters
      class Base
        # === Sync construction (called from #trigger) ===
        # Concrete adapters call super(...) with full constructor state.

        # === Async construction (called from CallbackWorker) ===
        # Subclass implements from_callback_context. Returned instance
        # has access to callback_context only -- constructor attrs like
        # current_user, project, etc. are nil.
        def self.from_callback_context(_ctx) = new

        def self.adapter_key = raise NotImplementedError

        # Wraps the lifecycle: ack -> resolve -> link -> grant -> execute -> notify.
        # `on_request_received` runs BEFORE `build_callback_context` so adapters
        # can stash state (e.g. progress note) used during context serialization.
        def trigger
          ack = must { on_request_received }
          return ack if ack.error?

          bundle = build_trigger_bundle
          ctx = build_callback_context

          unless bundle.success?
            safe { on_flow_failed(callback_context: ctx, error: bundle.reason, workflow: nil) }
            return bundle
          end

          payload = bundle.payload  # TriggerBundle instance

          if composite_identity_eligible?(payload.current_user)
            ::Gitlab::Auth::Identity.link_for_ai_flow(
              service_account: payload.service_account,
              scoped_user: payload.current_user
            )
          end

          member_result = ::Ai::ServiceAccountMemberAddService
            .new(payload.project, payload.service_account).execute
          unless member_result.success?
            safe { on_flow_failed(callback_context: ctx, error: :service_account_error, workflow: nil) }
            return ServiceResponse.error(message: member_result.message, reason: :service_account_error)
          end

          result = ::Ai::Catalog::ExecuteWorkflowService.new(
            payload.current_user,
            container: payload.project,
            goal: payload.goal,
            service_account: payload.service_account,
            flow_definition: payload.flow_reference,
            item_version: payload.flow_version,
            source_branch: payload.source_branch || payload.project.default_branch_or_main,
            additional_context: payload.additional_context,
            messaging_callback_context: enriched_callback_context(ctx, payload),
            **resource_params(payload.resource)
          ).execute

          if result.success?
            safe { on_flow_started(callback_context: ctx, workflow: result.payload[:workflow]) }
          else
            safe { on_flow_failed(callback_context: ctx, error: :execute_workflow_failed, workflow: nil) }
          end

          result
        end

        # === Surface contract — concrete adapter MUST implement ===

        TriggerBundle = Data.define(
          :current_user, :service_account, :flow_reference, :flow_version,
          :project, :goal, :resource, :additional_context, :source_branch
        )

        # Returns ServiceResponse.success(payload: TriggerBundle.new(...))
        # OR ServiceResponse.error(message:, reason:)
        def build_trigger_bundle = raise NotImplementedError
        def build_callback_context = raise NotImplementedError
        def deliver_result(callback_context:, message:) = raise NotImplementedError
        def deliver_error(callback_context:, error:) = raise NotImplementedError

        # === Lifecycle hooks ===
        # Override as needed. Defaults are no-ops or delegate to deliver_error.

        # Sync, called from #trigger before resolving the bundle.
        # MUST raise on failure -- the framework treats this as a hard precondition.
        def on_request_received; end

        # Sync, called from #trigger after ExecuteWorkflowService succeeds.
        def on_flow_started(callback_context:, workflow:); end

        # Async, called from CallbackWorker on success.
        def on_flow_completed(callback_context:, workflow:); end

        # Called from BOTH paths:
        # - Sync (Base#trigger): workflow: nil
        # - Async (CallbackWorker): workflow: <Workflow instance>
        # Adapters that need conditional cleanup branch on workflow.present?.
        def on_flow_failed(callback_context:, error:, workflow: nil)
          deliver_error(callback_context: callback_context, error: error)
        end

        private

        def composite_identity_eligible?(user)
          return false unless user
          return false unless Feature.enabled?(:ai_flow_triggers_use_composite_identity, user)
          return false if ::Ai::Setting.instance.duo_workflow_oauth_application.nil?

          true
        end

        def resource_params(resource)
          case resource
          when Issue then { issue_id: resource.iid }
          when MergeRequest then { merge_request_id: resource.iid }
          else {}
          end
        end

        def enriched_callback_context(ctx, payload)
          ctx.merge(
            'adapter' => self.class.adapter_key,
            'service_account_id' => payload.service_account.id,
            'flow_reference' => payload.flow_reference,
            'flow_version' => payload.flow_version
          ).compact
        end

        def must
          yield
          ServiceResponse.success
        rescue StandardError => e
          ::Gitlab::ErrorTracking.track_exception(e, adapter: self.class.name)
          ServiceResponse.error(message: e.message, reason: :hook_failed)
        end

        def safe
          yield
        rescue StandardError => e
          ::Gitlab::ErrorTracking.track_and_log_exception(e, adapter: self.class.name)
        end
      end
    end
  end
end

Ai::Messaging::CallbackWorker (descendants-based registry)

  • Registry built from Adapters::Base.descendants, indexed by adapter_key
  • CI-time uniqueness check on adapter_key across all subclasses
  • Resolves adapter via from_callback_context
  • Dispatches deliver_result / on_flow_completed / on_flow_failed(workflow: workflow) based on workflow status
  • Per-hook safe { } isolation so a failing on_flow_completed doesn't prevent deliver_result from running

Components to remove

  • Ai::Messaging::TriggerFlowService (introduced in !232343 (merged), currently on master) — deleted; logic absorbed into Adapters::Base

Note: Ai::CompositeIdentity::Linker was introduced in !234959 (closed) and does not exist on master. Since !234959 (closed) is being closed without merge and this issue starts fresh from master, there is nothing to delete — Linker is simply never added.

Surface contract

Concern Responsibility
Receive surface event Adapter constructor
Authorize the user Inside build_trigger_bundle (any ability check, surface-specific policy)
Pick the service account Inside build_trigger_bundle
Pick the flow (flow_reference, flow_version) Inside build_trigger_bundle
Pick the project Inside build_trigger_bundle
Build the goal Inside build_trigger_bundle
Build callback context build_callback_context (runs after on_request_received)
Deliver result/error to surface deliver_result, deliver_error
Lifecycle UX (e.g. progress note) Override on_request_received (must succeed) and on_flow_started, on_flow_failed (best effort)
Composite identity link Never — Base handles it
SA project membership Never — Base handles it
Resource translation Never — Base handles it
Adapter key in callback context Never — Base injects from self.class.adapter_key
Flow enablement check Never — surface decides during auth if relevant

Surface examples (informative, not part of this issue)

GitLab note mention adapter (foundational flows): authorizes via :trigger_ai_flow upstream; resolves SA + flow from a pre-existing FlowTrigger record; project = note.project; resource = note.noteable. on_request_received creates the progress note; build_callback_context (runs after) embeds progress_note_id into the context. on_flow_failed updates the progress note in place when workflow.present? (async failure) or when the progress note exists (sync failure).

@GitLabDuo MR mention adapter: authorizes via MR-level ability; uses dedicated bot service account; flow + version from whatever flow it borrows or routes to. Does not require the borrowed flow to be enabled in the namespace because authorization is independent of catalog enablement. Long-term, may grow its own catalog item that internally dispatches.

Slack adapter: authorizes via Slack workspace install + namespace mapping; resolves SA from namespace's catalog item consumer; project = lazily-created workspace project via WorkspaceProjectService; flow hardcoded to developer/v1 initially. on_flow_started adds 👀; on_flow_failed removes 👀 only if workflow.present?, then adds .

Custom future entry point: any auth model, any SA, any flow + version.

Properties this gives us

  1. Composability for adapter-shaped surfaces: New adapter-shaped surface = new adapter. No changes to Base, ExecuteWorkflowService, or anything else.
  2. No redundant gating: Each surface gates exactly once, in its own auth path. The shared infrastructure never re-validates.
  3. Versioning-ready: Surface picks flow_version. Aligns with !235204 (merged).
  4. No model coupling: flow_reference and flow_version are strings.
  5. No one-way doors for non-foundational flows or @GitLabDuo evolution: Future surface could derive flow_reference and flow_version (or leave version nil) from a non-foundational catalog item; ExecuteWorkflowService already handles both.
  6. Security primitives centralized for adapter-shaped surfaces: Composite identity link and SA membership are guaranteed by Base#trigger. RunService (non-foundational mention path, assignment path) remains a parallel flow until a future consolidation moves these primitives into ExecuteWorkflowService itself (out of scope).
  7. Tiered resilience: Hard preconditions raise; UX-polish hooks are best-effort with full error tracking. No silent acknowledgement failures.
  8. Auth-domain alignment: Composite identity linking lives in Gitlab::Auth::Identity per Fred's review.
  9. Type-checked adapter contract: TriggerBundle (Data.define) catches missing keys at adapter-construction time, not deep in ExecuteWorkflowService.

Scope of this issue

This issue captures one piece of work: redesigning the messaging adapter base from current master. !234959 (closed) will be closed without merge; we start fresh. Concretely, one MR delivers:

  • Add Gitlab::Auth::Identity.link_for_ai_flow (review by @skundapur since he owns the file)
  • Redesigned Ai::Messaging::Adapters::Base: TriggerBundle value object, tiered resilience (must + safe), single failure hook with workflow: kwarg, adapter_key class method, auto-injection into callback context
  • Ai::Messaging::CallbackWorker updated: descendants-based registry, per-hook safe { } isolation
  • Delete Ai::Messaging::TriggerFlowService (introduced in !232343 (merged), currently on master)
  • Refactor Ai::FlowTriggers::RunService to use link_for_ai_flow directly (replaces the inline fabricate(sa).link!(user) pattern that exists on master). If this pushes the diff size beyond reviewable scope, defer to a follow-up — it does not affect the adapter pattern.

The ADR gitlab-com/content-sites/handbook!19714 will be updated in a separate MR to reflect this design.

Out of scope (tracked separately)

  • Rebasing !235039 (closed) (GitlabNote adapter) on the new Base. Stacked work, lands after this.
  • Slack adapter (!232441 (closed)) and @GitLabDuo MR mention adapter — separate efforts in their own MRs.
  • Long-term consolidation: moving composite identity link + SA membership into ExecuteWorkflowService itself. This is the strictly cleaner end-state (eliminates RunService as a parallel path entirely) but is a substantive separate effort. Worth a follow-up issue.

Implementation notes

Things the implementer needs to know but that don't gate implementation.

  1. Hook ordering contract: on_request_received runs before build_callback_context so adapters like GitlabNote can stash state (e.g. @progress_note) used during context serialization. Document this contract loudly in Base#trigger so it is not accidentally inverted in a future refactor.

  2. Async vs sync method visibility: methods on a concrete adapter run in two contexts — sync via #trigger (full constructor state available) and async via from_callback_context (only callback_context hash available). Document which constructor attributes are nil in the async path. Group methods with comment dividers (# === Sync === / # === Async ===) so adapter authors don't accidentally call sync-only state from async hooks.

  3. Flow version precedence: !235204 (merged)'s StartWorkflowService resolves flow version with feature-flag overrides (duo_developer_next_unstable, fix_pipeline_next). The precedence is: adapter flow_version arg > FF override replaces it for unstable variants. Document this in Adapters::Base so adapter authors are not surprised that flow_version: '2.0.0' becomes '1.0.0' when duo_developer_next_unstable is set.

  4. Spec isolation: !234959 (closed)'s trigger_flow_service_spec.rb had a CI flake (12 tests failing on first run, all passing on retry — likely test ordering issue from spec restructuring). When writing the new Base spec, prefer let over let_it_be for state stubbed via stub_feature_flags; isolate flow-enablement contexts in their own before blocks.

  5. Test coverage: follow standard GitLab contribution requirements and the patterns used by maintainers in this area. Reference !234959 (closed)'s spec structure for context-coverage breadth (auth, namespace resolution, project resolution, FF-on/off, error reasons), but write the spec fresh against the new Base.

  6. Rollback / migration considerations: messaging adapter is not yet used in production by anyone (Slack adapter is internal MVC, GitlabNote adapter is unmerged PoC). No production callback-context rows to migrate; no rollback FF needed.

  7. What carries over from !234959 (closed): !234959 (closed) contained genuine bug fixes alongside the architectural mistakes. When implementing fresh, preserve these:

    • :execute_ai_catalog_item authorization check on the messaging path
    • service_account_id and flow_reference enrichment of callback_context
    • pinned_version_prefix resolution from the item consumer
    • service_account_error / unauthorized / workspace_project_error reason symbols and their log_and_error pattern
    • The resolve_service_account_and_consumer consolidation (passes the project as container so the child consumer is found, not the group-level parent)
    • Resolve service account service payload includes item_consumer (the ResolveServiceAccountService change in !234959 (closed))

    These pre-date the architectural concerns and should not be lost.

Known limitations to address before rollout

  • Epic noteables: note.project is nil for epic notes, so the GitlabNote adapter cannot post a reply. Today this means epic mentions silently fail. Add an explicit guard at the post-process layer that rejects epic-noteable triggers with a clear error, until a separate group-scoped delivery path exists.
  • Conversation context performance: GitlabNote adapter loads all discussion notes via Note.where(discussion_id:).user.includes(:author) before slicing the most recent N. For a 500-note discussion this is a moderately expensive query. Verify with a load test before rolling out widely, or add a SQL LIMIT upstream.

Migration / cleanup

  • Close !234959 (closed) without merge (referencing this issue)
  • Rebase !235039 (closed) (GitlabNote adapter) on the new Base once this lands
  • Update ADR gitlab-com/content-sites/handbook!19714 in a separate MR to reflect this design
  • Loop @skundapur in for review on the Gitlab::Auth::Identity.link_for_ai_flow addition

References

Edited by Thomas Schmidt