Feature flag rollout: `agentic_manual_retry_for_duo_chat_responses`
## Summary
This issue is to roll out the [Add retry button to Duo Chat responses](https://gitlab.com/gitlab-org/gitlab/-/work_items/587828) feature on production, which is currently behind the `agentic_manual_retry_for_duo_chat_responses` feature flag.
**Feature flag type:** `gitlab_com_derisk` (default: off)
**Milestone:** 19.1
**MR:** https://gitlab.com/gitlab-org/gitlab/-/merge_requests/237062
**Parent epic:** https://gitlab.com/groups/gitlab-org/-/work_items/21289
> **Note:** The feature flag should remain **off** until all three implementation MRs are merged:
> - Frontend wire-up: !237062 (the MR linked above)
> - Frontend `isRetry` plumbing: !238116
> - Backend duo-workflow-service: [ai-assist!5570](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/merge_requests/5570)
## Owners
- Most appropriate Slack channel to reach out to: `#g_duo_chat`
- Best individual to reach out to: @tbulva
## Expectations
### What are we expecting to happen?
A retry button (redo icon) appears in the Duo Chat response action bar after every completed or stopped response. Clicking it resubmits the preceding user prompt as a fresh chat message, consuming credits like any normal submission. The button is hidden during active streaming and does not appear on intermediate tool call steps.
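The expected behavior above can be sketched as two small rules: when the button is visible, and which prompt a click resubmits. This is an illustrative sketch only. The `ChatMessage` shape, the `status` values, and both function names are assumptions made for this example and are not taken from the actual Duo Chat frontend code.

```typescript
// Hypothetical message shape; the real Duo Chat types are not shown in this issue.
type ChatMessage = {
  id: string;
  role: 'user' | 'assistant' | 'tool';
  content: string;
  // Assumed status values, used for illustration only.
  status?: 'streaming' | 'completed' | 'stopped';
};

// The retry button is shown only on assistant responses that have finished
// (completed or stopped). It is hidden while streaming and on tool call steps.
function canShowRetry(message: ChatMessage): boolean {
  if (message.role !== 'assistant') return false;
  return message.status === 'completed' || message.status === 'stopped';
}

// Walk backwards from the response to find the user prompt that preceded it.
// Returns null when the response is unknown or no earlier user prompt exists.
function findPrecedingUserPrompt(
  messages: ChatMessage[],
  responseId: string,
): string | null {
  const index = messages.findIndex((m) => m.id === responseId);
  for (let i = index - 1; i >= 0; i -= 1) {
    const candidate = messages[i];
    if (candidate && candidate.role === 'user') return candidate.content;
  }
  return null;
}
```

Under these assumptions, a click on the retry button would resubmit the string returned by `findPrecedingUserPrompt` as a new chat message, which is why it consumes credits like any other submission.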
### What can go wrong and how would we detect it?
- Unexpected credit consumption if retries are triggered unintentionally — monitor via billing/usage dashboards.
- Duplicate or looping messages if the retry handler fires multiple times — monitor error rates and Sentry for JS exceptions in `DuoAgenticChatStateManager`.
- Regression in normal chat flow (non-retry) — monitor Duo Chat error rates and user-facing error reports.
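To make the duplicate-message risk concrete, here is a minimal sketch of an in-flight guard that drops retry clicks while a previous retry is still running. `RetryGuard` and `handleRetryClick` are hypothetical names invented for this example. This is not how `DuoAgenticChatStateManager` is implemented; it only shows the kind of check whose absence would produce the symptom described above.

```typescript
// Hypothetical guard; not taken from DuoAgenticChatStateManager.
class RetryGuard {
  private inFlight = false;

  // Returns true and marks a retry as running, or false if one is already running.
  tryBegin(): boolean {
    if (this.inFlight) return false;
    this.inFlight = true;
    return true;
  }

  // Call when the retried response has completed or been stopped.
  finish(): void {
    this.inFlight = false;
  }
}

// Submits the retry only if no other retry is in progress.
// Returns whether the submission was actually sent.
function handleRetryClick(guard: RetryGuard, submit: () => void): boolean {
  if (!guard.tryBegin()) return false;
  submit();
  return true;
}
```

With a guard like this, a double click sends one message instead of two. If duplicates still show up in Sentry or in the usage dashboards, the handler is probably being registered or invoked more than once.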
## Rollout Steps
Note: Please make sure to run the chatops commands in the Slack channel that gets impacted by the command.
### Rollout on non-production environments
- Verify the MR with the feature flag is merged to `master` and has been deployed to non-production environments with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- [ ] Deploy the feature flag at a percentage (recommended percentage: 50%) with `/chatops gitlab run feature set agentic_manual_retry_for_duo_chat_responses 50 --actors --dev --pre --staging --staging-ref`
- [ ] Monitor that the error rates did not increase (repeat with a different percentage as necessary).
- [ ] Enable the feature globally on non-production environments with `/chatops gitlab run feature set agentic_manual_retry_for_duo_chat_responses true --dev --pre --staging --staging-ref`
- [ ] Verify that the feature works as expected.
The best environment to validate the feature in is [`staging-canary`](https://about.gitlab.com/handbook/engineering/infrastructure/environments/#staging-canary) as this is the first environment deployed to. Make sure you are [configured to use canary](https://next.gitlab.com/).
- [ ] If the feature flag causes end-to-end tests to fail, disable the feature flag on staging to avoid blocking [deployments](https://about.gitlab.com/handbook/engineering/deployments-and-releases/deployments/).
  - See [`#e2e-run-staging` Slack channel](https://gitlab.enterprise.slack.com/archives/CBS3YKMGD) and look for the following messages:
    - test kicked off: `Feature flag agentic_manual_retry_for_duo_chat_responses has been set to true on **gstg**`
    - test result: `This pipeline was triggered due to toggling of agentic_manual_retry_for_duo_chat_responses feature flag`
### Before production rollout
- [ ] Confirm all three implementation MRs (!237062, !238116, ai-assist!5570) are merged and deployed.
- [ ] If the change is significant and you want to announce it in [#whats-happening-at-gitlab](https://gitlab.enterprise.slack.com/archives/C0259241C), it is best to do so before the rollout to `gitlab-org/gitlab-com`.
### Specific rollout on production
For visibility, all `/chatops` commands that target production must be executed in the [`#production` Slack channel](https://gitlab.slack.com/archives/C101F3796)
and cross-posted (with the command results) to the responsible team's Slack channel.
- Ensure that the feature MRs have been deployed to both production and canary with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- [ ] Enable for all internal GitLab team members first (per the epic's feature delivery process):
  - `/chatops gitlab run feature set --feature-group=gitlab_team_members agentic_manual_retry_for_duo_chat_responses true`
- [ ] Verify that the feature works for internal users. Gather feedback from `#g_duo_chat`.
### Preparation before global rollout
- [ ] Set a milestone on this rollout issue to signal when the feature flag should be enabled and, once it is stable, removed.
- [ ] Check if the feature flag change needs to be accompanied with a
[change management issue](https://about.gitlab.com/handbook/engineering/infrastructure-platforms/change-management/#feature-flags-and-the-change-management-process).
Cross link the issue here if it does.
- [ ] Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production.
If a different developer will be covering, or an exception is needed, please inform the on-call SRE by using the `@sre-oncall` Slack alias.
- [ ] Ensure that documentation exists for the feature, and the [version history text](https://docs.gitlab.com/development/documentation/feature_flags/#add-history-text) has been updated.
- [ ] Notify the [`#support_gitlab-com` Slack channel](https://gitlab.slack.com/archives/C4XFU81LG) and your team channel ([more guidance when this is necessary in the dev docs](https://docs.gitlab.com/development/feature_flags/controls/#communicate-the-change)).
### Global rollout on production
For visibility, all `/chatops` commands that target production must be executed in the [`#production` Slack channel](https://gitlab.slack.com/archives/C101F3796)
and cross-posted (with the command results) to the responsible team's Slack channel.
- [ ] [Incrementally roll out](https://docs.gitlab.com/development/feature_flags/controls/#process) the feature on production.
  - Example: `/chatops gitlab run feature set agentic_manual_retry_for_duo_chat_responses <rollout-percentage> --actors`.
  - Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
- [ ] After the feature has been 100% enabled, wait for at least one day before releasing the feature.
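The incremental rollout can be planned ahead of time by generating the chatops command for each step. The sketch below only formats the command shown in the example above. The percentage schedule passed to `rolloutPlan` is an assumption for illustration, and both helper names are hypothetical. The actual steps should follow the linked rollout process.

```typescript
// Flag name as used throughout this issue.
const FLAG = 'agentic_manual_retry_for_duo_chat_responses';

// Builds the percentage-of-actors command from the example above.
function rolloutCommand(percentage: number): string {
  if (!Number.isInteger(percentage) || percentage < 0 || percentage > 100) {
    throw new RangeError(`percentage must be an integer from 0 to 100, got ${percentage}`);
  }
  return `/chatops gitlab run feature set ${FLAG} ${percentage} --actors`;
}

// Returns one command per step. Run them one at a time in #production,
// waiting at least 15 minutes and checking the dashboards between steps.
function rolloutPlan(percentages: number[]): string[] {
  return percentages.map((p) => rolloutCommand(p));
}

// Hypothetical schedule, chosen for illustration only.
const plan = rolloutPlan([10, 25, 50, 100]);
plan.forEach((command, i) => console.log(`Step ${i + 1}: ${command}`));
```

Generating the commands up front makes it easier to cross-post the exact command and its result to `#g_duo_chat` after each step.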
### Release the feature
After the feature has been [deemed stable](https://about.gitlab.com/handbook/product-development-flow/feature-flag-lifecycle/#including-a-feature-behind-feature-flag-in-the-final-release),
the [clean up](https://docs.gitlab.com/development/feature_flags/controls/#cleaning-up)
should be done as soon as possible to permanently enable the feature and reduce
complexity in the codebase.
You can either [create a follow-up issue for Feature Flag Cleanup](https://gitlab.com/gitlab-org/gitlab/-/issues/new?description_template=Feature%20Flag%20Cleanup)
or use the checklist below in this same issue.
- [ ] Create a merge request to remove the `agentic_manual_retry_for_duo_chat_responses` feature flag. Ask for review/approval/merge as usual. The MR should include the following changes:
  - Remove all references to the feature flag from the codebase.
  - Remove the YAML definitions for the feature from the repository.
- [ ] Ensure that the cleanup MR has been included in the release package.
- [ ] Close [the feature issue](https://gitlab.com/gitlab-org/gitlab/-/work_items/587828) to indicate the feature will be released in the current milestone.
- [ ] Once the cleanup MR has been deployed to production, clean up the feature flag from all environments by running this chatops command in the `#production` channel: `/chatops gitlab run feature delete agentic_manual_retry_for_duo_chat_responses --dev --pre --staging --staging-ref --production`
- [ ] Close this rollout issue.
## Rollback Steps
- [ ] This feature can be disabled on production by running the following Chatops command:
```
/chatops gitlab run feature set agentic_manual_retry_for_duo_chat_responses false
```
- [ ] Disable the feature flag on non-production environments:
```
/chatops gitlab run feature set agentic_manual_retry_for_duo_chat_responses false --dev --pre --staging --staging-ref
```
- [ ] Delete the feature flag from all environments:
```
/chatops gitlab run feature delete agentic_manual_retry_for_duo_chat_responses --dev --pre --staging --staging-ref --production
```