DRAFT Implementation plan: Evolving from Assistant to a @GitLabDuo Team Member

DRAFT - Use cases

Brainstorming list

DRAFT - NOT READY - DO NOT READ BEYOND THIS POINT: Recommendation: Duo as an e2e DevSecOps collaborator for individuals and teams

This is what GitLab should do to go after the vision laid out at the top.

Phase 1a: Agentic Chat in IDE

Users can provide development-related instructions to Duo, and Duo will change code for them. Users can accept or reject each suggestion, or choose to auto-accept.

Use Cases: Code Author job steps

Missing capabilities

Agentic: enable Duo to take actions

  • Enable Duo to take the same actions that humans can take in the IDE and in GL. Specifically, we need to focus on
    • Read epics/issues/tasks/workitems/MRs
    • Search through Code Base and read files (Chat with your code base)
    • Edit multiple files
  • Enable Duo to take into consideration more than one context (currently it cannot consider, for instance, an MR and an issue in the same question)

Artifact relationship understanding

  • Based on how GitLab models its domain data, provide Duo with the understanding of how different artifacts (code, issues, MRs, etc.) relate so that Duo can gain a holistic understanding of individual customer projects.

Historical context preservation

  • Extend Custom Rules so that Duo itself can contribute to them and learn from its interactions with users.
  • Specifically, enable it to capture and maintain the following to inform AI actions:
    • project history,
    • decisions and their rationale, and
    • ways of working.
  • Humans should approve the custom rules proposed by Duo.

Choice between user approval or auto-approve

Phase 1b: Agentic Chat for Plan tasks in GitLab Web Application

Users can provide Plan-related instructions, such as asking for a report, and Duo responds in the chat with that report. Users can also ask Duo to take an action, like creating a work plan in a workitem description or breaking down an epic into implementation issues.

Use cases:

  • Reporting tasks as identified in Initial Chat Use Cases for EM / Planner Persona
  • Getting up to speed. For example:
    • What changed in this workitem since I last commented on it?
  • Work preparation tasks such as updating descriptions in workitems based on discussions in comment threads.
    • Could you please update the description of this issue based on the discussion thread <link>?
    • Could you better structure the description of this issue into goals and a solution proposal? Please also consider the discussion in the parent epic.
  • Creating an implementation plan that considers the existing code base
    • Could you break down this feature into an implementation plan and create tasks for each step? Consider the existing code base in this project here.

Missing capabilities

Agentic: enable Duo to take actions

  • Enable Duo to take the same actions that humans can take in GL. Specifically, we need to focus on
    • Search through epics/issues/tasks/workitems/MRs/comments via
      • natural language query
      • tree walking to parents, children, and linked items
      • filtering by milestones, labels, assignees, etc.
      • any mix of the above
    • Search through Code Base and read files (Chat with your code base)
    • Read/write/create epics/issues/tasks/workitems/comments
    • Read/comment on MRs
  • Enable Duo to take into consideration more than one context (currently it cannot consider for instance an MR and an issue in the same question)
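
The search capabilities listed above (tree walking plus attribute filters, in any mix) could be sketched roughly as follows. This is a minimal in-memory illustration, not GitLab's data model or API: the WorkItem class, its fields, and the walk and search functions are hypothetical names introduced only for this sketch, links are treated as one-directional, and natural language query is left out.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class WorkItem:
    # Hypothetical, simplified stand-in for an epic/issue/task.
    id: int
    title: str
    labels: set[str] = field(default_factory=set)
    milestone: Optional[str] = None
    assignees: set[str] = field(default_factory=set)
    parent: Optional[int] = None
    linked: set[int] = field(default_factory=set)


def walk(items: dict[int, WorkItem], start: int) -> Iterator[WorkItem]:
    """Yield every item reachable from `start` via parent, child, or link edges."""
    # Build a parent -> children index once so children can be looked up quickly.
    children: dict[int, list[int]] = {}
    for item in items.values():
        if item.parent is not None:
            children.setdefault(item.parent, []).append(item.id)

    seen: set[int] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        if current in seen or current not in items:
            continue
        seen.add(current)
        item = items[current]
        yield item
        stack.extend(item.linked)
        stack.extend(children.get(current, []))
        if item.parent is not None:
            stack.append(item.parent)


def search(
    items: dict[int, WorkItem],
    start: int,
    *,
    label: Optional[str] = None,
    milestone: Optional[str] = None,
    assignee: Optional[str] = None,
) -> list[int]:
    """Combine tree walking with optional filters; returns matching ids, sorted."""
    matches = []
    for item in walk(items, start):
        if label is not None and label not in item.labels:
            continue
        if milestone is not None and item.milestone != milestone:
            continue
        if assignee is not None and assignee not in item.assignees:
            continue
        matches.append(item.id)
    return sorted(matches)


items = {
    1: WorkItem(1, "Epic", labels={"plan"}),
    2: WorkItem(2, "Issue A", labels={"bug"}, milestone="17.0", parent=1),
    3: WorkItem(3, "Issue B", labels={"bug"}, milestone="17.1", parent=1, linked={4}),
    4: WorkItem(4, "Linked issue", labels={"bug"}, milestone="17.0", assignees={"alice"}),
    5: WorkItem(5, "Unrelated issue", labels={"bug"}, milestone="17.0"),
}
```

With this sample data, search(items, 1, label="bug", milestone="17.0") would reach items 2 and 4 through the epic's children and their links, while item 5 stays out of the result because nothing connects it to the epic.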

User experience: approving changes in GitLab in the user's name

  • User sees which changes Duo proposes to make in epics/issues/workitems/comments (ideally as diff)
  • User approves changes
  • Changes are made in the name of the user, not in the name of Duo. (Potentially, include a note such as "changes made with help of AI".)
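
A minimal sketch of this approval flow, assuming the proposed change is a plain-text description edit. It uses Python's standard difflib to render the diff the user would see; ProposedChange, apply_if_approved, the store dictionary, and the audit entry fields are hypothetical names for illustration, not existing GitLab features.

```python
import difflib
from dataclasses import dataclass


@dataclass
class ProposedChange:
    item_ref: str   # hypothetical reference such as "issue-1"
    old_text: str
    new_text: str

    def diff(self) -> str:
        """Render the proposal as a unified diff for the user to review."""
        return "\n".join(
            difflib.unified_diff(
                self.old_text.splitlines(),
                self.new_text.splitlines(),
                fromfile=f"{self.item_ref} (current)",
                tofile=f"{self.item_ref} (proposed by Duo)",
                lineterm="",
            )
        )


def apply_if_approved(
    change: ProposedChange,
    approved: bool,
    user: str,
    store: dict[str, str],
    audit_log: list[dict],
) -> bool:
    """Apply the change only after approval, attributed to the approving user."""
    if not approved:
        return False
    store[change.item_ref] = change.new_text
    # The change is recorded under the user's name, with an AI-assistance marker.
    audit_log.append({"item": change.item_ref, "author": user, "ai_assisted": True})
    return True


store = {"issue-1": "Goal: ship the feature\nStatus: in progress"}
change = ProposedChange("issue-1", store["issue-1"], "Goal: ship the feature\nStatus: done")
print(change.diff())
```

Keeping the diff rendering separate from the apply step means nothing is written until the user has seen and approved exactly what will change.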

Phase 2: Enhance @GitLabDuo to become a collaborator that supports Plan tasks in GitLab Web Application

@GitLabDuo currently only supports MR reviews. In this phase, we will extend @GitLabDuo so it can work on work-items to help teams with plan tasks. Users will interact with it as they would with a human user in GitLab, simply by @mentioning it by name in comments or work-items.

Use cases are the same as in the phase above, but their use becomes more visible for the entire team.

A pre-condition is that the output of AI is almost always correct. This collaborator is not so valuable when it has to come back many times to ask for clarification or makes mistakes. In a synchronous, 1:1 collaboration in a chat this is more forgivable.

Missing capabilities

AI Identity and Presence

  • Establish Duo as a first-class citizen within the GitLab platform, with an identity, a profile, and clear attribution of its actions.
  • Trigger Duo to run when human co-workers or other AI agents @-mention Duo in comments or assign work-items to Duo, or when a certain event happens, like a new bug being created.
  • Enable Duo to get back to the user or team that assigned a task by @-mentioning the user or team in comments or by re-assigning the item to the user
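
The three triggers above (an @-mention, an assignment, or an event such as a new bug) could be dispatched along these lines. This is a sketch under assumptions: the event dictionaries and their keys ("kind", "body", "assignee", "labels") are a made-up shape for illustration and do not reflect GitLab's webhook payloads.

```python
import re
from typing import Optional

AGENT_USERNAME = "GitLabDuo"

# Match "@GitLabDuo" only as a whole username: not inside an email address
# (foo@GitLabDuo) and not as a prefix of a longer name (@GitLabDuoFan).
MENTION = re.compile(rf"(?<![\w@.])@{re.escape(AGENT_USERNAME)}(?![\w-])")


def should_trigger(event: dict) -> Optional[str]:
    """Return the reason Duo should run for this event, or None if it should not."""
    kind = event.get("kind")
    if kind == "comment" and MENTION.search(event.get("body", "")):
        return "mentioned"
    if kind == "assignment" and event.get("assignee") == AGENT_USERNAME:
        return "assigned"
    if kind == "work_item_created" and "bug" in event.get("labels", []):
        return "new_bug"
    return None


print(should_trigger({"kind": "comment", "body": "Hi @GitLabDuo, what is the status?"}))
```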

Use existing tracking and governance also for AI co-workers - no change needed

Most likely nothing needs to change in GitLab's existing governance and tracking when AI agents are added as co-workers.

  • The actions that AI takes will be tracked just like those of humans (e.g. when they commit to a repo in GL, create an MR/workitem, comment in GL, approve MRs, etc.).
  • Also it is at the customers' discretion which roles they want to assign to AI agents in which project/group.
  • They can also apply MR approval rules that would require humans to review AI work and vice versa before merging. This is possible with groups.
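
The cross-review idea in the last bullet (humans review AI work and vice versa) can be expressed as a simple check. This sketch only illustrates the rule itself, not how GitLab approval rules are configured; the Member class and its is_ai flag are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    username: str
    is_ai: bool  # hypothetical flag used only in this sketch


def satisfies_cross_review(author: Member, approvers: list[Member]) -> bool:
    """True if at least one approver is of the other kind than the author.

    A human must approve AI-authored work and an AI must approve
    human-authored work. Self-approval never counts.
    """
    return any(
        approver.username != author.username and approver.is_ai != author.is_ai
        for approver in approvers
    )


duo = Member("GitLabDuo", is_ai=True)
alice = Member("alice", is_ai=False)
bob = Member("bob", is_ai=False)
print(satisfies_cross_review(duo, [alice]))
```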

However, it may be more convenient if the role model were extended to differentiate human roles from AI roles, e.g. developer vs. AI-developer. But this can wait until AI co-workers are a reality.

Probability to be correct

  • High probability to be correct the first time (98%)
  • Self-awareness of limits

Duo's user profile page: https://gitlab.com/GitLabDuo:

For example:

Hi, I am @GitLabDuo. You can ask me about the workitems (epics/issues/tasks/OKRs) or about Merge Requests. I can also update these workitems/MRs. Here is how you can interact with me:

  • You can @mention me by my user name @GitLabDuo in comments or descriptions, or assign me as a reviewer on MRs.
    • Your question and my answer are then visible to your colleagues for their benefit as well.
    • Changes that I make, for instance to an issue description, will then be under my user name.
  • You can also ask your question in DuoChat.
    • This is then a private communication between you and me.
    • Changes that I make, for instance to an issue description, will be tracked under your user name. Of course, you can undo changes as always.

For Planning tasks

Just leave a comment in the respective workitem mentioning @GitLabDuo with your specific question. For instance,

  • @GitLabDuo what is the progress of this epic? and I will respond with a comment considering relevant content such as child items, comments, MRs, etc.
  • @GitLabDuo, could you update the status in the description? and I will find the relevant text in the description and update it based on the information I find in child items, comments, MRs, etc.
  • @GitLabDuo could you see if this epic could be cleaned up? Please close all issues that are done and report back what you have done. and I will do so.
  • @GitLabDuo could you take the conclusion from this discussion and update this epic/issue/task/MR/OKR? and I will review the discussion and update the description or, in an MR, propose a change based on the discussion.
  • @GitLabDuo could you go through all the customer comments in this issue here and in issue-123 and summarize the distinct needs that they raise?
  • @GitLabDuo could you show me all the related but not linked workitems that you can find and explain how they relate? and I will search for them based on labels and natural language search and respond with a summary of what I found.
  • @GitLabDuo please break down this issue into implementation steps. Please consider the current implementation in repo-123 as you do so. and I will compare the goals of the issue with the current implementation and come up with a plan.

Access Control and Context

  • If you can @-mention me, then I have access to the workitem that you are attempting to inquire about.
  • I can also go through linked workitems and content behind URLs mentioned if I have access to them. If I don't have access I will mention it in my response.
  • I can also search through workitems based on your question, based on parent-child relationships or linkage, based on milestones, assignees or labels.
  • If I don't have access, you can provide me with access like you do with any other team member, simply by assigning me a role in the respective project, for instance the "developer" role.

Automation tasks

You can also assign automation tasks to me.

  • @GitLabDuo could you provide a status update on this epic every Monday? I will then assign myself to the epic and provide this weekly update until I am told to stop or until the epic is closed.

You can find the full list of my assigned automations under Agents>@GitLabDuo, where you can cancel or change assignments.

For Code Reviews or other MR related tasks

When you create an MR, I will automatically give it a review. You can request a re-review any time.

You can also ask me questions about the MR by commenting in the MR. This may be particularly helpful if you are reviewing someone else's MR. For example:

  • @GitLabDuo why was this function renamed?
  • @GitLabDuo why was the .vue file changed?
  • @GitLabDuo could you propose an improvement of this code selection?

Phase 3: Enhance @GitLabDuo to become a developer that can carry out development tasks independently (no IDE involved)

After the above phase, @GitLabDuo will support planning tasks via work-items and MR-related tasks such as MR review. This phase is about enhancing @GitLabDuo to become a developer that can work independently on development tasks that are assigned to it or that it self-assigns.

The Duo co-worker picks up development work, like fixing bugs and vulnerabilities, on its own and tries to close them. Human team members or AI co-workers can assign tasks/issues to the Duo developer. When done, results are reviewed, approved, and merged by a human co-worker or a Duo reviewer, depending on branch rules.

This is very different from current AI developers that are assistants inside IDEs. The @GitLabDuo developer will work largely autonomously. An IDE will not be involved - the development happens in a safe environment in the backend. In cases where Duo cannot finalize a task on its own, it will decide whether to give up or to ask a human co-worker for help. In both cases, it will comment in an MR or an issue with its decision/question.

This will likely create a significant cost, so it must be under the control of the buyer, and the value must be transparent.

Missing capabilities

Agentic capabilities to develop and test software independently

  • Artifact modification capabilities: Building secure mechanisms for AI to directly modify system artifacts, including:
    • Changing code across multiple files and multiple repositories
    • Updating work items, issues, and epics
    • Creating and modifying documentation
    • Adding comments and insights to discussions
  • Secure execution environments: sandboxed environments where AI can:
    • Execute code to validate proposed changes
    • Run tests to verify functionality and prevent regressions
    • Validate security compliance of changes
    • Experiment with different approaches safely
  • Computer use: to try out the application
    • Duo needs to be able to test the application
    • Clicking buttons, etc.
    • Seeing output
  • Internet access: for learning
    • For example to find the latest version of a library
  • Development workflow integration: Enabling AI to participate directly in standard development workflows by:
    • Creating and updating merge requests
    • Triggering and responding to CI/CD pipelines
    • Addressing review feedback autonomously
    • Implementing requested changes across multiple systems
  • Bounded decision autonomy: Establishing frameworks for appropriate AI decision-making:
    • Defining clear boundaries for autonomous action vs. human approval
    • Creating escalation paths when confidence is low or stakes are high; Duo should comment in the issue or MR
    • Building transparent confidence metrics for AI judgments
    • Implementing progressive permission models based on trust and experience
  • Cross-system coordination: Supporting AI work that spans multiple repositories or systems:
    • Coordinating changes across related components
    • Understanding and managing dependencies between systems
    • Maintaining consistency across distributed changes
    • Handling complex refactorings that touch multiple repositories
  • Feedback-driven improvement: Creating systems for AI to learn from its actions:
    • Tracking the outcomes of AI-initiated changes
    • Capturing and applying human feedback to future actions
    • Building institutional knowledge about successful patterns
    • Developing project-specific expertise through continued engagement
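
The "bounded decision autonomy" item above, together with the give-up-or-ask behavior described earlier in this phase, might be sketched as a small policy function. The thresholds, the high-stakes path prefixes, and all names here (Decision, Policy, decide) are illustrative assumptions, not proposed values or an existing API.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACT = "act"              # proceed autonomously
    ASK_HUMAN = "ask_human"  # escalate via a comment in the issue or MR
    GIVE_UP = "give_up"      # stop and report back via a comment


@dataclass(frozen=True)
class Policy:
    # Illustrative thresholds; real values would be tuned and customer-controlled.
    act_threshold: float = 0.9
    ask_threshold: float = 0.5
    # Hypothetical path prefixes where stakes are considered high.
    high_stakes_paths: tuple[str, ...] = ("db/migrate/", ".gitlab-ci.yml")


def decide(confidence: float, touched_paths: list[str], policy: Policy = Policy()) -> Decision:
    """Pick an action from the agent's confidence and the stakes of the change."""
    if confidence < policy.ask_threshold:
        return Decision.GIVE_UP
    high_stakes = any(path.startswith(policy.high_stakes_paths) for path in touched_paths)
    if high_stakes or confidence < policy.act_threshold:
        return Decision.ASK_HUMAN
    return Decision.ACT


print(decide(0.95, ["app/models/user.rb"]))
```

A progressive permission model could then be approximated by lowering act_threshold or shrinking high_stakes_paths per project as trust grows.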

AI Administration and Governance

  • AI resource management: Tools to allocate, monitor, and govern computational resources for AI agents.
  • Performance tracking: Building systems to measure the effectiveness and impact of AI team members, for example dollars spent vs. issues closed.
  • Safety mechanisms: Implementing guardrails and oversight processes to ensure AI actions align with team goals and organizational policies.
  • Attribution and licensing management: Developing capabilities to track the origins and licensing of AI-generated content.
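
As a rough illustration of the "dollars spent vs. issues closed" metric, the sketch below computes cost per closed issue from a list of agent runs. AgentRun and its fields are hypothetical; how cost and closure would actually be recorded is not specified in this plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentRun:
    issue_id: int
    cost_usd: float      # hypothetical: compute and model cost of one agent run
    closed_issue: bool   # whether this run led to the issue being closed


def cost_per_closed_issue(runs: list[AgentRun]) -> Optional[float]:
    """Total spend (including unsuccessful runs) divided by distinct issues closed."""
    total_cost = sum(run.cost_usd for run in runs)
    closed = {run.issue_id for run in runs if run.closed_issue}
    if not closed:
        return None  # no value delivered yet, so the ratio is undefined
    return total_cost / len(closed)


runs = [
    AgentRun(issue_id=1, cost_usd=2.0, closed_issue=False),
    AgentRun(issue_id=1, cost_usd=1.0, closed_issue=True),
    AgentRun(issue_id=2, cost_usd=3.0, closed_issue=False),
]
print(cost_per_closed_issue(runs))
```

Counting failed runs in the numerator is a deliberate choice in this sketch, so that the buyer sees the full cost of the value delivered.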

Phase 4: Custom agents in GitLab and Duo Identities in other Enterprise Software Platforms

Extensibility

  • Enable customers and 3rd parties to easily create custom AI agents for their special purposes, by offering APIs to fetch GitLab context.
  • Enable customers and 3rd parties to integrate GL agents into other platforms such as JIRA, so that users can kick off agent workflows from other platforms or so that agents can respond in other platforms.

🚧 WIP: Consideration: One Duo Agent vs. Multiple Duo Agents

There are two distinctions to make:

  • How does it look from the user perspective?
  • How is it implemented?

The two are largely independent. Duo can act as one user towards its human collaborators and still be built as a multi-agent system under the hood. It could also be the other way around, or experience and implementation could follow the same pattern.

The following looks at the user perspective.

Multi-agent systems from the user perspective:

  • Discovery: Users can probably not remember more than 7 agents and their specific capabilities.
  • Each agent must be sufficiently clear in its capabilities.
  • Each agent must be sufficiently distinct to justify its existence next to the other agents.
  • Mapping agents to traditional job roles is difficult, as AI can often cover a broader spectrum of work than humans can do.
    • For instance, if an agent has access to workitems that contain customer feedback, research results, competitor analysis, plans, features, bugs, etc., the agent can do the work of a product manager, project manager, and engineering manager.
    • Or let's say there is a Development Collaborator Agent that claims to be able to create and modify code across multiple files, and there is also a Quality Guardian Agent that claims to be able to generate tests. Could the user not ask the Development Collaborator to create tests? Would it refuse to create them and point to the Quality Guardian Agent? Probably not.

Single-agent system from user perspective

  • Discovery: When there is only one agent (e.g. @GitLabDuo) that stands for an AI that can in principle do all conceivable tasks, it may be more difficult to explain what it can and cannot actually do (similar to DuoChat today). By contrast, in a multi-agent system, if the system can't change code, the developer agent would simply not exist.
  • This will become less of a problem when this one agent can actually do most things and can express that it cannot do a certain thing being asked.
  • What the agent can do is probably largely defined by what it has access to and which models are used under the hood.

Single-agent system vs. Multi-agent system from an eco-system perspective

??? CONCLUSION: Duo should be one agent that does work across the SDLC. Customers and 3rd parties can create their own agents. They can choose to integrate their capabilities in Duo or create separate agents.
