@vincywilson I started this issue as a discussion point to define requirements. We would need a semi-quick solution, so we should keep low-effort in mind, and focus on near-term stability.
Do we have existing automations for updating test / demo environments in this way (to stay X days, releases, branches behind)?
What do you think about the idea of defining actual customer demo workflows, and scripting E2E tests for those workflows? Do you think it would be a viable solution?
To have a proper list of known issues for a given release, I would think any environment would need to be at least 2-3 days behind in order for us to identify a regression in a worst-case scenario (relying on customer feedback), address the root cause, and update the prior release documentation. Because usage of Duo features is relatively low, it could take this long to find one feature not working properly. For more visible features this is of course not true, but I'm thinking holistically about the suite.
@m_gill, Yes, I agree with the idea of having a stable demo environment for SAs to utilize for customer demos.
We can spin up an environment using reference architectures, but as @wortschi pointed out, we will need to understand the requirements better. Our RA is defined based on RPS primarily or the number of users. Do you know what should be the RPS for this demo environment?
Do we have existing automations for updating test / demo environments in this way (to stay X days, releases, branches behind)?
In our performance envs we configure them to deploy nightly, and no additional work is required. It's automated as long as you set the environment to use nightly builds.
For custom packages, you need to manually/directly provide the new package URLs.
What do you think about the idea of defining actual customer demo workflows, and scripting E2E tests for those workflows? Do you think it would be a viable solution?
Before we look into automating them, could we have these customer demo workflows defined? We can have our team look into our existing E2E tests to see if we already have coverage.
To have a proper list of known issues for a given release, I would think any environment would need to be at least 2-3 days behind for us to identify a regression in a worst-case scenario
If we need a stable branch, I don't think even 2-3 days behind would be enough. Should we pin it to the last GitLab n-1 release for stability? Do SAs demo AI features that are not quite released yet as an insight into what's coming? If so, then we need to define the cadence of the environment. In either case, we can deploy any Omnibus package that the SAs need deployed.
Questions for you:
Once an env is stood up, who will review the test results and ensure they are up to par with the release we need?
On what cadence do we want to deploy to this environment? Who will own that piece?
Bug fixes -- All bug fixes should go through the proper bug-fix process and only be deployed to this env once they have gone through the staging env. We should avoid having anything in the demo env that is not already in our test environments. How do we ensure this process is followed?
For how long do we want to keep this env up?
Is there anyone in your team that can maintain this env once it's stood up?
Thanks for the thoroughness @vincywilson , I think answering the questions in your comments will get us everything we need. For the questions on my side:
We want to avoid anyone needing to manually check the demo environment to ensure quality. Hopefully this can be done automatically, or if we use a stable release, this problem will go away. Engineering teams will need to document known issues for these releases in a central location so that even stable environments have transparency into the known issues of those releases.
Cadence will also depend on how far behind we keep the demo environment. If there is documentation around this, we can own that piece.
Completely agree on the bug process, demo environments will not be used for engineering.
Because this is a short-term solution while we work through stability, I feel it should follow the same timeline as this issue (reassess after 30 days) - WDYT?
@wortschi are you all already used to managing this? If not, the responsibility likely falls to AI Framework and they will need to make this part of their process. We can look at offloading that in the interim given their other responsibilities around incident reduction.
@wortschi are you all already used to managing this? If not, the responsibility likely falls to AI Framework and they will need to make this part of their process. We can look at offloading that in the interim given their other responsibilities around incident reduction.
We used the env only internally, i.e., for our team. We did have one engineer who would perform ad-hoc updates to the instance to test code changes. Currently, we're not using the environment anymore and are not maintaining it.
@m_gill @vincywilson @wortschi Strongly agree with investing in end-to-end testing to detect issues earlier and be less reliant on customer reports.
QA is an important piece here, but it's prone to flakiness and thus often requires triage, which takes time -- and it's tied to the monolith deployment cycle, which may not reflect deployments of other moving pieces (AIGW, customersdot, cloud connector). It's useful, but not sufficient.
To complement more in-depth tests, I think it would be useful to also have some synthetic smoke tests to catch the worst issues. This could be something fairly rudimentary, such as a blackbox_exporter making authenticated calls against AIGW and expecting a 2xx response. This will not catch everything, but it should help with some of the subtler issues we've seen around authentication and licensing -- which are gaps in our current SLO monitoring. And since this would be probing continuously, it would detect issues very quickly, also making it easier to correlate with changes.
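For illustration, here is a rough sketch of what such a synthetic probe could look like as a standalone scheduled script (a blackbox_exporter module would be the more production-ready shape); the endpoint URL, payload, and token variable below are placeholders, not the real AIGW contract:

```python
# Minimal synthetic smoke probe (sketch only). The endpoint path, payload, and
# token env var are assumptions -- the real AIGW routes/auth would need to be filled in.
import os
import sys
import requests

AIGW_URL = os.environ.get("AIGW_PROBE_URL", "https://example.gitlab.test/ai/probe")  # hypothetical
TOKEN = os.environ["AIGW_PROBE_TOKEN"]  # hypothetical service/probe token


def probe() -> int:
    resp = requests.post(
        AIGW_URL,
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"prompt": "ping"},  # trivial payload; we only care about the status code
        timeout=30,
    )
    if 200 <= resp.status_code < 300:
        print(f"OK {resp.status_code}")
        return 0
    print(f"FAIL {resp.status_code}: {resp.text[:200]}")
    return 1


if __name__ == "__main__":
    sys.exit(probe())
```

Something like this could run on a short schedule and alert on repeated failures, which would make authentication/licensing regressions visible within minutes rather than after a customer report.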
Happy to set up a call for us to discuss and no longer spend cycles on this, but the real issues come from outages like the Anthropic outage today, SAs not being aware of changes like the CWE issue yesterday, and SAs being a part of model test groups that have issues. The automated testing on our workflows should fix that problem, and I've been told that SAs shouldn't be added to the test models in the future.
@jfullam do you want to help answer @vincywilson's questions above (regarding RPS, workflows, and how far behind the demo environment should be) or do you have someone who can engage here more closely? I think a zoom sync between the 3 of us would iron out next steps.
Once an env is stood up, who will review the test results and ensure they are up to par with the release we need?
Someone in SA.
On what cadence do we want to deploy to this environment? Who will own that piece?
Monthly should be fine. SA can own that piece.
Bug fixes -- All bug fixes should go through the proper bug-fix process and only be deployed to this env once they have gone through the staging env. We should avoid having anything in the demo env that is not already in our test environments. How do we ensure this process is followed?
I'm not sure why this is needed. I need more context.
For how long do we want to keep this env up?
I would target at least a full quarter, at which point we'd reassess.
Is there anyone in your team that can maintain this env once it's stood up?
Someone in SA.
Some additional information
SA would still selectively use GitLab.com when needing to demo the latest that might not be in the stable demo environment.
I don't know what the required RPS is. There are around 150 SAs at GitLab and we are definitely not all doing demos at the same time.
@lfstucker @poffey21 - Please feel free to additionally comment. Having a stable demo environment for Duo is a top priority from a demo perspective.
Hi @jfullam @m_gill @vincywilson, what is the target problem this new instance is meant to solve? We already have https://cs.gitlabdemo.cloud that SAs use for demos and showing off Duo, as it has a full Duo Pro license attached. All SAs (and CSMs/CSEs) have access by default, and a monthly deployment schedule is already in place but can be changed if the ask is a 3-day release lag. From this issue's context, I believe the existing demo instance should solve all listed problems.
"Solutions Architects need a stable demo environment that does not change unexpectedly during critical prospect/customer demos and workshops" my read from this is more so the problem that SA accounts keep getting added to the testing group for new models and are effected by the higher rates of outages, as well as not having their own duo license in their own group which heavily reduces their demoing capabilities
@lfstucker - SAs are mostly using GitLab.com to demo Duo. It's likely for the reasons you mentioned. Bottom line is that we don't have a stable and reliable environment to demo Duo.
@reshmikrishna for additional context, given this ask came originally from you.
Hello @lfstucker, currently SAs are running into the challenges noted below:
The demo environment is constantly changing, resulting in regressions and surprises during real-time demos, even with dry runs a day prior. Here's the latest example.
We provide pre-canned workshop steps so customers can follow along, typically 2 weeks before the hands-on workshops. In recent workshops, customers have expressed frustration with not being able to follow along, as our UI and workflows have changed and are changing regularly.
SAs have raised this as a major risk to delivering demos, workshops, and trials that we can proactively prescribe and control. This leaves customers with negative sentiment about Duo.
Does that help?
Also adding @kkwentus1 @sbailey1 @msingh-gitlab @johnbush @yhsueh here, who have been part of a lot of customer demos/workshops recently that have encountered these issues. Please feel free to add anything I have missed.
@reshmikrishna the latest example you posted - based on the Slack message it appears this was on .com and not using the demo environment that @lfstucker mentioned above. Is that right? If so, I'd like to put focus back on a stable demo environment, while we work to make .com more stable in parallel.
I'm also interested in these pre-canned workshop steps that result in frustration because of workflow updates. A demo environment can help with this because the updates will not be as frequent; however, we can't reduce the frequency of workflow changes in all cases, given some of our Duo features are still in development and not yet finalized. Can you share the pre-canned workshop steps here?
Hello @m_gill, as the referenced Slack thread was related to my customer, I'll add my thoughts, although please consider what I am saying as only one person's experience: the issues I have experienced with Duo are not related to the demo environment, but rather to Duo itself. One case was where Code Suggestions was outright returning an error; while a conclusive root cause was never reached, it was likely related to a stale cache (gitlab-com/gl-infra/production#18262 (comment 2028170837)). The Slack thread referenced above is in regard to IDE plugins (gitlab-org/editor-extensions/gitlab-jetbrains-plugin#561 (comment 2031475139)). I also hit the issue whereby Duo Chat was returning error M4000 (gitlab-org/gitlab#473481 (closed)).
These did impact the demo environment, but the demo environment relies on Duo as much as GitLab.com does, so they were not issues with the demo environment.
The only challenge I have personally experienced (again, being just a one-person data point) with the demo environment is when the instructions are "out of sync" with some recent change in Duo. For example, when the tanuki icon recently moved to a different location in the UI without the SA team's knowledge, or when the text for a button has changed, or when the sequence of buttons to click to access a feature has changed. These changes have historically been difficult for the SA team to become aware of ahead of time, and they do make the workshop flow challenging for customers to follow, as @reshmikrishna points out, but they are also not major issues in my experience, at least not vis-à-vis when Duo itself has an active incident/bug.
What would help would be if there were a release of Duo that is effectively "frozen" or "stable" and that is what is utilized by the demo environment, but to my knowledge this is not possible with our current deployment model. And given that it seems most of our customers use the demo environment for enablement but subsequently test on GitLab.com (many have licensed Ultimate namespaces on GitLab.com), even that probably does not make a material difference. I'll let the other SA indicate if I am wrong on that latter point.
My latest experience was when the demo flow instructed me to click on a button for root cause analysis on a failed job in a pipeline.
The button name was changed from RCA to Troubleshoot
The button was moved from the top to the bottom; I felt something had broken and was moving away from talking more about this feature when one of the participants from the customer team pointed me back to the button.
Also, the vulnerability explanation was moved and is now a link at the top that blends in with the rest of the content.
Hi @reshmikrishna @m_gill, I'll try to respond to all of the questions here, but I'm also happy to hop on a call to discuss. As @yhsueh mentioned, the demo environment itself isn't the problem, as both the .com and self-managed solutions have been stable and reliable since the Duo release last summer. All of the problems stem from two issues:
SAs & demoers have been added to experimental models in the past - this causes RCA, Chat, etc. outages that only company accounts are affected by. I've worked with the model team to have the SAs temporarily removed from this group, but it should be standard practice going forward. @tmccaslin, curious on your thoughts here: how can we standardize the process of not adding every GitLab team member to test models and features with Duo? The alternative is to just use personal accounts, which I know a few SAs have started to do.
Duo updates are fast and frequent - no matter what demo instance we use, we will run into this issue. The self-managed instance is the most stable approach to avoid trouble, but back-end model changes will still affect us. The planned solution for this is to add UI testing to all workshops, starting with AI, hopefully sometime this week. I'm in the process of checking with the UX team right now to see what testing is already in place and whether it fits our needs.
Getting off of the experimental models and offering self-managed for demos and .com for workshops should be as stable as we can get.
Totally with @lfstucker on this! As far as having a stable demo/workshop environment, I'd agree that a self-managed instance like cs.gitlabdemo.cloud could be a good solution, especially once self-managed reaches feature parity with .com.
@lfstucker: Yes, we are OK with any environment (.com or self-managed). There have also been regression issues, for example Chat working intermittently. Self-managed works as long as there's enough feature parity and not significantly more latency, which I think is achievable with the 17.3 release?
Possibly. I'll update the instance once 17.3 is confirmed stable. Chat working/not working is due to GitLab team members being added to experimental models. For now, the best approach to get around this is to use personal accounts for demos/workshops that showcase Duo until the product is in a better spot.
Not including field folks in new models is a double-edged sword. While they should have more stable features, they also won't get improvements to demo earlier, which has been a common request; additionally, this is commonly used as a last dogfooding effort before rolling out to customers, so I'd rather have internal folks trip over issues than customers.
Additionally I question how technically feasible it is to be able to opt out a select group of people as our feature flag system to my understanding doesn't make this easy.
@tmccaslin is it at all possible to provide better insight / visibility into what is currently in production for our AI capabilities? I think a gap here is it's extremely difficult to know what to expect.
Steps worth considering:
Working with @lfstucker and @seraphiney would be of value to ensure they also know what they're offering within cs.gitlabdemo.cloud.
Provide insight into what has been shipped regarding Duo to .com (and available to SM instances)
Align these updates with IDE updates to ensure SAs know what and when they should be demoing.
Additionally, I'm in conversation with @lfstucker to determine if GitLab Workspaces (and the VSCode IDE with plugins available) could be a good failsafe way of giving SAs not only a GitLab Environment with Duo but also having a consistent IDE Plugin experience as that could be provided by the Demo Architecture team.
In the spirit of iteration, I feel an easy first step to take, regardless of what model GitLab team members are on, would be to improve the speed of communications. Maybe that's as simple as a dedicated "team readiness" Slack channel for the field whereby each SA that is about to go into a demo, workshop, Office Hour, or other customer touchpoint just posts "anyone aware of active issues/incidents today?"
As an example, I noticed this Slack message regarding Android Studio yesterday (https://gitlab.slack.com/archives/C02UY9XKABH/p1722874729482139). Other team members likely did as well, but probably not all team members. With a "readiness" channel that is just for people who interface with customers, we could inform each other of issues that can impact customers. I'm not saying the rest of the considerations brought up are not important - they are - but this seems like it would be a good first step.
Probably need to think through a better approach as a team for how to showcase new or upcoming Duo features, because the risk of having a full-day outage (which has happened twice now) is much more detrimental, and I've seen it backfire during hands-on sessions where customers get confused why the look or output is so different from what they are seeing on their end.
Hi @m_gill @igorwwwwwwwwwwwwwwwwwwww, what would be the added benefit over the existing self-managed instance? It's updated fairly regularly; are there Duo benefits with this approach? From a far-out view it just seems to shift maintenance but keep the same Duo problems.
@igorwwwwwwwwwwwwwwwwwwww definitely agree; from a use case outside of this issue, I see a lot of value in having a CS/SA-focused Dedicated instance for demos. We get requests to just showcase self-managed to see that it works, so being able to do the same with Dedicated would, I imagine, be a huge plus.
In the context of this issue, and after reaching out to the SAs listed above, the demo instance itself hasn't been the problem; it's the models, experimental groups, and unannounced changes that have caused hiccups. The proposed automated testing plus removal from the experimental groups should fix that problem instead of shifting the problem to another instance.
The Support Sandbox might be the best way to go. It already exists, and it gets updated on a regular schedule. @bcarranza @ashiel - do you agree with this, or do we have a better environment that non-dedicated teams could depend on?
I know that @bcarranza and team use this instance, so if there are any concerns, another option is to use the ashieltest9d1acd instance in the preprod environment. Either of these would be appropriate.
@amyphillips @ashiel I would request that we not use the Support sandbox for demos. I say that because the more other purposes it's used for, the harder it is for the Support team to feel able to freely "break" or misconfigure it in some ways to help customers. (Example: testing SAML configs.) We haven't needed a detailed usage policy but this topic has come up a bit as different folks from outside Support have requested access. I think that demonstrates the demand for folks outside Support to have access to a GitLab Dedicated instance.
I'd be interested in thoughts from @weimeng-gtlb on this topic and happy to take Amy up on her ashieltest9d1acd offer.
(Historical context for those who may be interested in gitlab-com/gl-infra/gitlab-dedicated/team#2417.)
I'm aligned with you 100% Brie. We cannot guarantee stability on the Support Sandbox, as Support can test configurations and actions which will have the effect of "breaking" the instance for all other users.
To illustrate what I mean using the example which Brie mentioned: While troubleshooting a ticket, a support engineer (SE) may reproduce a customer's SAML single sign-on config using the SE's own test SSO provider. This has the effect of preventing all Support Sandbox users from logging in while testing is ongoing. I believe this would be undesirable for the purposes stated in this issue.
To be fair, disruptive actions happen relatively infrequently, but Support cannot guarantee that it will not happen during customer/prospect workshops and demos. If we have an emergency or other escalated customer situation, we may need to use the Support Sandbox with little to no prior notice.
However, if the SA team is willing to take on this risk, I don't see a reason why they can't share use of the Support Sandbox with Support.
I agree, stability is top priority for the SAs to use the instance so the Support Sandbox would not work. @amyphillips what would the correct path be to request this internally?
Duo features in SaaS being unstable not only impacts SAs, it impacts every SaaS customer trialing or owning Duo products.
The problem with pure SM for all demos is that it is always going to be last in a rollout from a feature standpoint - as @tmccaslin reminded us, we iterate in SaaS. Especially the new models - those benefits are big. I feel we need to just stay in SaaS but look for ways to contribute to making it better for all users.
100% yes to having the option of @lfstucker's cs demo with the latest SM - it is a very good option, and we need it. It will be the simplest "fallback" if we have a huge problem. But it doesn't solve for things like general behavior problems for SaaS Duo, where LearnLabs could also be impacted, or ongoing SaaS trials being impacted.
I really like the @yhsueh idea of Office Hours and more teaming - mostly because we discussed it together earlier. We need to spread more information quicker. SAs are professionals and we can rebound, if given information.
I love office hours, but also no one really attends them. We could create a DuoOfficeHours Slack channel (please forgive me) meant for statements like "just had a hiccup in a demo 10 minutes ago" - constant feedback from SAs on what they are experiencing. We can open a Zoom in the channel at any time for bigger discussions.
I've also always wondered how we can build a better bridge between engineering and SAs. SAs are surely among the best users of Duo in any company - and engineering can access us at any time.
If they think there is a problem - we are right here to help validate.
Is there a staging environment we can look at to help before changes get pushed to Father SaaS?
Piggybacking off of this and some messages with @poffey21: DA is in the process of attempting to get access to the needed agent to bypass Cloudflare on .com and enable Selenium testing. A huge part of this effort will be fully testing the AI workshop to ensure it's always up to date, but also confirming responses are what we expect. Because the workshop covers almost all AI features, ideally we can pipe the results to a Slack channel daily (or twice a day) so that everyone is aware whether certain features are behaving as expected.
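To make that concrete, here is a minimal sketch of what one scheduled Selenium check plus a Slack webhook alert could look like; the selector, URLs, env vars, and the assumption that Cloudflare/session handling is already solved are all placeholders, not the actual implementation:

```python
# Sketch of a scheduled Selenium smoke check for one workshop step, with a Slack
# webhook alert on failure. Selectors, URLs, and env var names are assumptions.
import os
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]  # hypothetical alerting channel webhook
PROJECT_URL = os.environ["DEMO_PROJECT_URL"]     # e.g. a workshop project page


def notify(message: str) -> None:
    # Post a plain-text alert to the Slack channel via incoming webhook.
    requests.post(SLACK_WEBHOOK, json={"text": message}, timeout=10)


def check_workshop_step() -> None:
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(PROJECT_URL)  # assumes auth/Cloudflare bypass is handled separately
        # Placeholder selector: whatever element the workshop step tells attendees to click.
        WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.XPATH, "//button[contains(., 'Troubleshoot')]"))
        )
    except Exception as exc:
        notify(f":rotating_light: Workshop smoke check failed: {exc}")
        raise
    finally:
        driver.quit()


if __name__ == "__main__":
    check_workshop_step()
```

Run from a scheduled pipeline once or twice a day, a set of checks like this would give the "features are behaving as expected" signal in Slack without anyone manually clicking through the workshop.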
I've received an escalation request for guidance on this since I built the original demo systems. I've discussed this at length with @lfstucker to get more context.
I'd like to ask that we take a step back and look at the bigger picture. Although many of the ideas here are interesting, not all of them are the right solution technically and many wouldn't solve the problem.
The goal of a stable demo environment is well understood, however it would be helpful for us to paint a better picture of the shared responsibility model of the upstream systems and APIs that we depend on that are outside of our control.
In other words, spinning up yet another demo environment/instance won't help here and any discussion of GitLab Dedicated does not solve our problem since the cs.gitlabdemo.cloud environment is already identical.
Let me chat with @m_gill next week and collaborate more with @lfstucker. Once we understand our options, we will validate the ideas with @vincywilson.
We'll come back with a viable path forward in a week or two.
Thanks @jeffersonmartin @lfstucker, I like your thought process. We have a meeting with Meks, Hillary, and Tim Zallman this Tuesday (August 13) regarding this. Do you and Logan want to be added to it, or do you want to try to brainstorm and resolve it with Vincy first, and I can let Hillary know to reschedule or cancel the meeting?
Thank you to @m_gill and @lfstucker for the extensive collaboration today. We've finished looking at the full landscape of the demo environments and identified the problems, pain points, and whether proposed solutions are viable or not.
We've created a new diagram today as part of our collaboration that will be sanitized later to publish onto our handbook page.
Action for Stakeholders: Please take time to open and study this diagram to understand the full picture. We've identified and labeled the different problem areas that correspond to the threads below to discuss each problem separately.
(A1) Upstream AI Models can have outages and service unavailability (reported to be a frequent problem). GitLab has no control of this.
(A2) GitLab Outage and Slow UX: AI features are new, and although they are classified as GA, we may see user experience challenges with response times and "perceived" outages.
This is an expected "fact of life" during the early maturity iterations of these features and the ever-changing state of AI as the industry "moves fast and breaks things".
Proposed Solutions
No product, infrastructure, or technical solutions.
The Solution Architects could have pre-recorded demo videos (with or without audio) that they can use as a backdrop in their demos when they have problems. This could be small 5-minute snippets of "the server is having a problem, but let me show you what it looks like" or a 20-30 minute scripted demo so you can pick up where it broke at the 12-minute mark.
Also planning a new approach to tracking when SAs or customers experience outages during demos/workshops - that way we are able to classify what is user error and what may be a problem with the backend systems.
Even a click-through demo would be a viable alternative here - there's nothing that can be done about others' uptime/availability, so this is a problem we'll just have to be adaptable to, IMO.
Agreed and understood around third-party dependencies, though there are probably other partnership discussions that could be had. The fact is, when GitLab Duo doesn't work because of our third-party dependencies, customer and prospect perceptions of GitLab diminish. These all pose risks to our go-to-market, which will have account teams being extra cautious. Better reporting methods like what @lfstucker is looking into and alternative demo options like @jeffersonmartin and @simon_mansfield mention will be key.
While I agree that there are no product, infrastructure, or technical solutions here for the demo environment, this remains a top priority for our teams in terms of overall Duo stability.
For A1 - We are investigating fallback strategies in the event of an AI model outage like we saw last week. This could mean falling back to a different model, or a brand new provider altogether.
For A2 - We have numerous corrective actions from the AI-related incidents, but I have tried to keep the most global and most impactful in this list to help address problems with AI features themselves, which we continue to focus on.
(B1) We need real live users for learning for the upcoming iteration or version (something that can't be solved with unit tests). The team decided to force opt-in all team members based on our dogfooding philosophy. This means that team members have a different experience (that may include unstable features or inconsistent performance or results) than what a customer sees, even in the same classroom, simply because they are using a user account that has been force-opted-in as a beta tester for live model learning and testing by virtue of being a team member.
(B2) We have a technical limitation where user profiles on GitLab.com do not have department/team metadata, so we cannot "easily" exempt them from or conditionally add them to the beta testers group. We've done some API work in the past, but it turns into a shoestring-and-bubblegum script pretty quickly.
(B3) Team members are determined by whether their GitLab.com user account is a member of the gitlab-com top-level group.
(B5) Team members only have a single GitLab.com user account today, which has access to all groups and projects in gitlab-com and gitlab-org and any work-related groups or projects outside of those namespaces. It is also used for demo and sandbox use cases.
Proposed Solutions
We could use a different group that would be an MVC, but it would create a lot of administrative pain for a one-off use case that we would outgrow, plus a lot of tech debt in 6-12 months on the CorpSec and Compliance side. We could do a one-off with an easy script for all team members not in the Sales division; however, it would not be maintained and would be a frozen-in-time stopgap.
(B4) The most viable solution so far is to have team members that do demos use a secondary GitLab.com user account that is not joined to corporate groups (effectively creating a GREEN-only user account that behaves like a personal account but is provisioned by the company). This solves for other Corporate Security and Security Assurance risks as well, is the best path forward long term, and aligns with other security hardening work we've been thinking about for later this year.
For the time being, SAs have two options for this as well: they can use the cs demo instance, as the accounts there should not be affected by dogfooding, or temporarily use personal accounts as a last-ditch effort.
The proposed solutions are brand new (invented in this thread) and presented as possibilities, and there is no issue or scoped initiative around them yet.
Thanks for asking - I'm interested in B4. Specifically, I'm wondering how systematic we will be in creating secondary user accounts. I have been curious whether we could get to a point where we provide team members multiple usernames that are persona-based so that they can do persona-based demos (carrying multiple personas within a single demo).
We have options to do it, it's just an investment on my team's side to build the automation. One user or five users doesn't matter (the automation adds a foreach loop), so if we want to do avatar-based or role-based, that's viable.
Based on current roadmap, this isn’t something we would see happen until probably Q4 at the earliest unless we sacrificed HackyStack growing pains we are fixing in Q3 (really don’t want to delay any longer).
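For what it's worth, the "foreach loop" could be as small as the sketch below using python-gitlab with an admin token; the persona list, naming convention, and email domain are purely illustrative and not how the real automation would necessarily be built:

```python
# Rough sketch of provisioning persona-based secondary demo accounts (illustrative only).
import os
import gitlab

# Requires an admin token; the env var name is a placeholder.
gl = gitlab.Gitlab("https://gitlab.com", private_token=os.environ["ADMIN_TOKEN"])

PERSONAS = ["dev", "secops", "pm"]  # example personas for persona-based demos


def provision_personas(handle: str) -> None:
    for persona in PERSONAS:
        username = f"{handle}-demo-{persona}"  # hypothetical naming convention
        gl.users.create({
            "email": f"{username}@example.com",  # placeholder email domain
            "username": username,
            "name": f"{handle} ({persona} persona)",
            "reset_password": True,  # let the account owner set their own password
        })


provision_personas("jdoe")
```

The real investment is less in the loop itself and more in lifecycle management (deprovisioning, group membership, and keeping these accounts out of corporate groups), which is where the team's automation effort would go.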
The state of the industry is rapidly evolving and we are seeing significant changes between versions of AI models. Whether it is GitLab native models or Vertex 3rd party models, we will always see a constant state of change (it is the only constant) and have to accept that this is what it is.
Proposed Solutions
No product, infrastructure, or technical solutions.
Invest more enablement time into staying on top of what's coming next with the product and with the models that the product uses, to stay one step ahead, or track the issues through product development.
Add more requirements for the AI team to document and publish release notes (or more of them) or a changelog that's disseminated to interested parties. The sentiment is that it's "build to ship" and less "build to docs to ship"; however, this is not surprising (or blameworthy) given the breakneck speed at which the team is operating and delivering new features.
Also, to cover the rapid UX changes on .com, we will implement automated UI testing for our canned workflows to be sure SAs are up to date when things break. Access requests are currently out to be able to test on .com, and we'll have a Slack webhook so that all SAs can be alerted when our testing fails. Will keep this issue up to date as development progresses.
Add more requirements for the AI team to document and publish release notes (or more of them) or a changelog that's disseminated to interested parties.
This. 1000%. I get the fact we have to move fast, but if we can't keep up with the changes, then I can tell you now for certain that our customers can't either. And they form opinions quickly; if we can't demonstrate that things have changed, then they will not look to change their opinions.
Here are some anecdotal points to back up the need for more changelogs.
The What Is New Since AI Powered Stage changelog looks bleak with only 1 "change" in 17.1 and 2 changes in 17.0 from an "AI-Powered" perspective.
Maybe there's a better way to find these in What Is New Since? Regardless, WINS (did anyone know What Is New Since spelled WINS?) is one of the field's greatest assets.
The AI Powered stage is specific to only these groups. Of these groups, only 2 are feature teams (Duo Chat which has a GA product and Custom Models which does not)
We expect all product groups across GitLab to be able to provide AI features (so a filter on a single group would be difficult)
We offer the ability to filter by Tier, but not Add-Ons, which is new
(D1) A Duo license is not available for each team member's top-level group to support group-level features not available in personal namespaces. We have been patiently waiting for several quarters for these features, but the Fulfillment team needs more resources to build licensing features, with additional thought into how to handle internal licenses, which have been a shoestring-and-bubblegum approach to date.
(D2) Workaround Group with Duo License: Since a Duo license cannot be assigned by the Fulfillment team to each team member's top-level group yet, we have a child group under the Learn Labs parent group that does have Duo licenses as a temporary stopgap (credit to Logan for the clever workaround).
(D3) Workaround Group for Team Member: We manually create a group for each team member under the parent group.
(D4) Am I demoing AI features? Team members need to evaluate which group they are demoing in, or sometimes demo from two different groups based on where their demo projects are configured and the features that they are showcasing.
Proposed Solutions
The Fulfillment team is aware of the challenges and has our issues in the backlog that are prioritized for the near future, but work has not commenced yet. The team recently had a management change so it is expected that some things are in flux. I'd like to offer the opportunity for @rhardarson and @courtmeddaugh to re-evaluate the situation and add feedback on what is needed to prioritize this properly.
@lfstucker is the DRI that can help with understanding the internal pain points. Here are links to related issues and threads.
Based on the past history of the team being under-resourced, I would encourage the product and engineering leadership team to consider loaning engineers to this team, plus additional headcount long term to keep up with the ever-evolving needs. The CorpSec/IT and Demo Architecture teams have been suffering for 18+ months and have taken to reverse-engineering undocumented API calls to provision things that we needed, but have spent more time building workarounds as part of our bias for action to meet sales demand and pressure to execute.
It would be great to improve our collaboration and counterpart partnership with the Fulfillment product and engineering management team, and to schedule a call with @lfstucker and @jeffersonmartin to discuss the broader requirements further and get ahead of the broader internal/demo licensing gaps and problems (several more exist outside of this) in addition to just trying to catch up with the current blockers.
In the past, there has been a sentiment that internal license provisioning is a 3rd-class citizen, and the focus has understandably been on meeting external-audience licensing for marketing and sales deadlines for new features and moving on to the next high-priority epic. It would be worth revisiting our holistic view of how we approach internal licensing as part of our efficiency for the right group sub-value. @jeffersonmartin came up with a proposal a while back that outlines additional pain points; however, it was largely put on the shelf due to the evolution of CustomersDot and license generation enforcement restrictions related to SOX compliance.
I couldn't agree more. To put it succinctly; how are we supposed to sell a SKU/product that we ourselves don't have access to in a reliable and consistent manner?
@jeromezng can you (or a delegate - @rhardarson?) weigh in here to see how much effort this would require, and potentially how many borrows? Having that information handy will at least allow product to make the right trade-offs given the competing priorities on the team.
I'm just dropping a note to recognize that, yes, prioritizing this is something that we want to do within Fulfillment. As Jerome and Ragnar mentioned this is already in OKRs and we hope to staff it and make progress soon.
The rest of my note here is for the sake of transparency.
The current stack-ranking of priorities for group::provision includes several top company priorities, which is often the case. The tradeoffs we are asked to make are painful because of this. Our FY25 Q3 OKR list includes (just for Provision work):
Launching Duo Enterprise SKU to production so sales can book orders
Launch customer-facing Duo Enterprise trials
Enabling offline provisioning for Duo add-ons (critical for pubsec and other sensitive organizations that will pay for custom models functionality and can't have connections out to CDot on their environment).
This ask to figure out Duo Provisioning for team member namespaces.
Furthermore, another recent ask from e-group is gaining a lot of steam and we need to keep an eye on its prioritization, adding more pressure to the group.
As we look to other groups within Fulfillment, and the overall list of requirements in our OKR list, I don't see clear things to deprioritize to bump this up in the overall Fulfillment queue. As we look to other engineering groups, there are also no obvious teams to pull resources from to staff up Fulfillment (something beyond my direct control, but an exercise we've done in the past, including having e-group and Sabrina look at staffing options for other Fulfillment efforts we deprioritized). All that said, we hope to make progress soon with the team we have and hopefully relieve the pain/pressure from all teams involved in this.
I'm feeling optimistic about addressing this pain point for GitLab Duo Pro/Enterprise within Q3, and using that effort as groundwork towards a long-term solution. I don't think a borrow request will help us here. The Fulfillment domain requires quite a bit of time to get up to speed.
Provision is also performing better now than in the last few months. We have 2 new FS hires and a recent EM (that's me) who are getting up to speed nicely.
I've seen asks like this one from e-group a number of times now. This doesn't happen every quarter but routinely enough that I am cautious about the long term. Ideally I would hire 2 additional engineers to ~group::provision so we have capacity for high priority projects + internal users + maintenance + long-term architecture.
We react to the priorities set by our stakeholders. Over the last few quarters, the message on priority coming from the CRO was to add more product offerings, not to deliver internal license provisioning. I think that to avoid internal license provisioning being treated as a 3rd-class citizen, we need e-group to agree with us on the priority.
Based on follow-up collaboration with @courtmeddaugh @rhardarson @paulobarros @mtimoustafa @lfstucker, I did a Friday afternoon MVC spike and was able to provision Duo Enterprise licenses with 5 seats for all existing gl-demo-premium-{handle} and gl-demo-ultimate-{handle} team member demo groups across the company. We have also updated the gitlab-learn-labs group to Duo Enterprise.
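For context on the shape of the spike, enumerating the existing demo groups is straightforward with python-gitlab (a sketch below, with an assumed admin token); the actual Duo Enterprise seat assignment goes through internal provisioning tooling and is intentionally not shown:

```python
# Sketch: list the team-member demo groups by prefix before applying a license change.
import os
import gitlab

gl = gitlab.Gitlab("https://gitlab.com", private_token=os.environ["ADMIN_TOKEN"])  # placeholder


def demo_groups(prefix: str):
    # `search` is a substring match, so filter again on the actual prefix.
    for group in gl.groups.list(search=prefix, iterator=True):
        if group.full_path.startswith(prefix):
            yield group


for prefix in ("gl-demo-premium-", "gl-demo-ultimate-"):
    for group in demo_groups(prefix):
        print(group.full_path)  # candidates for the Duo Enterprise (5-seat) provisioning
```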
This problem will resolve itself with other upstream fixes.
Analysis
The most noticeable impact for demos or end users is when using the Web IDE or desktop IDE extension, since many of the real-time benefits are tied to Code Suggestions and Duo Chat, and a slow response time or service outage impacts the end user in real time (see problem A).
This is coupled with the UX differences (see problem B) between what a customer sees and what a team member sees that is part of the next-gen testing group.
GitLab.com SaaS Web IDE
GitLab.com connection for desktop IDE extension (VS Code, Jetbrains, etc.)
GitLab self-managed Web IDE
GitLab self-managed connection for desktop IDE extension (VS Code, Jetbrains, etc.)
When performing demos, you are often switching between multiple user personas or accounts to provide the cleanest state and avoid showing production/work data in your demo. The VS Code extension is tied to one user account with an API token and does not support multiple user profiles, so users would almost always have their GitLab.com account configured and rarely/never use a secondary account (e.g., a self-managed account if they have a GitLab.com account).
Proposed Solutions
There is no immediate action for this problem, since most of the user experience issues stem from upstream problems; once those are solved, this becomes a non-issue. It is expected that this will always be imperfect (the same way that some days the Internet just feels slow).
Is this its own problem? I think this is more like a side effect of the problems above. It is a problem that the user journey would be for one account while the VS Code extension is still going to be tied to the team member's GitLab.com account, but as you said, it is a non-issue once upstream problems are solved.
There are no actions for this iteration. Additional content discussion is encouraged to understand more.
Analysis
Updates between SaaS and self-managed will always lag by 1-2 months. There is not 100% feature parity for the latest features. We've had cron jobs enabled in the past for patching to the latest release every weekend but have seen stability issues off and on over the years, especially around the major releases (v15 and v17 were rough). Even with monthly sysadmin upgrades, we have still seen it lag behind.
The inverse problem exists with rapid iteration changes. AI features are evolving daily and weekly, and are moving very fast compared to the stable release cycle, which can introduce instability or breaking changes. For those of us that used to be engineers, it has a lot of the "ship it to production as fast as possible" feel, even if quality still has some work to do. This is the nature of the world of AI, where you don't know what will work and what won't until users try it.
Proposed Solutions
This is more of a "live with it" pain point. We do a reasonably good job of maintaining feature parity; however, there are always going to be a few gaps for the latest and greatest things, particularly with AI features.
For more details on what's perceived to be missing, @lfstucker can comment with more details.
As I understand it, we should have full parity going into the Duo Enterprise release, but until then self-managed is rarely used for demos outside of the "prove it works on self-managed" case, due to the fact that our Duo demos encompass all Duo features. This makes using self-managed as the default for all demos difficult.
@lfstucker I am confused by the statement that self-managed is rarely used for demos, since you mentioned that there is already a demo instance available to the team. If it is only used rarely, perhaps encouraging more usage would make sense?
due to the fact that our Duo Demos encompass all Duo features.
Can you elaborate on this, which sounds like it might be the reason why it is used rarely?
Sure, @m_gill - When we conduct a Duo demo or workshop, we focus on the "AI throughout the SDLC" messaging and aim to cover every AI feature—whether experimental, beta, or GA. This emphasis makes the self-managed version less desirable for demos as it doesn't currently include all the features. This issue will be resolved once there’s feature parity, but until then .com is usually the only viable option.
Problem G. Need for Additional or Separate Demo Infrastructure
There are no actions for this iteration.
Analysis
Additional "Stable" Instance? There are no additional benefits to another self-managed instance that we can't already do today with cs.gitlabdemo.cloud instance. We can change any policies we need to without more infra.
Use GitLab Dedicated? The behavior (expected or problematic) of cs.gitlabdemo.cloud would be identical on a GitLab Dedicated instance. A new GitLab Dedicated instance does not solve anything. Jeff has been using a GitLab Dedicated instance for CorpSec and the only thing that has changed is that we do not have many administrative capabilities and have another team to call if the system crashes, so it is more of an offloaded responsibility problem than a feature, capability, or stability benefit story. We've actually lost more ability to be successful since we are dependent on another team's issue backlog to enable feature flags or set configurations for us.
Proposed Solutions
This is not needed and will not solve any problems.
Problem H. Need for Additional or Separate AI Infrastructure
There are no actions for this iteration.
Analysis
There is potential for building and deploying separate infrastructure for AI that could be considered more stable due to less frequent changes. This would likely require additional headcount on the SRE team and a lot of investment to make possible. It is similar to creating a new "staging"-style environment that mimics production but would be version-locked and updated monthly instead of daily. At this early stage of growth and incubation, this would be more inhibitive than helpful. It is a heavy lift that in concept (theoretically) moves us from 99.9% to 99.999% stability for demos, but at what cost? (Assuming headcount is needed, this starts at hundreds of thousands of dollars.)
Proposed Solutions
This is not something that the engineers closest to the problem believe is viable; however, it is being listed here for leadership as a (high-cost) option in the future if needed.
To be honest this would likely only cause other issues, similar to problem statement B. We want to show customers exactly what they will get, not what it was like 2 months ago, or what it could be like 2 months from now. (There is a benefit to being able to access the latter however!)
When kicking off a POV we want to demonstrate how to use functionality that they might have just been given access to; we want our experience to mirror their experience 1:1, and this solution would probably move us away from that.
@m_gill - After reading through the different problem statements and their proposed solutions, I understand there is no need for a new demo environment or automated test coverage. Please confirm if my understanding is correct. Thank you!
@lfstucker This is the right place to collaborate on Selenium test ideas that you had. Can you provide a deep dive overview of the thinking there to see if @vincywilson team can help?
Hi @vincywilson, the best deep-dive overview will come from this issue. The short version is that we update our workflows for demos + workshops extremely frequently and it doesn't make sense to have your team maintain it (Discussion here), as our end goal is daily walk-throughs of all of the content using Selenium. My one question would be: how does your team currently handle sensitive tokens? Are they just stored as protected variables on GitLab, or are you using a solution similar to Vault to host them?
@vincywilson based on my understanding today, the SA team does not need our help on offering a demo environment.
I do still highly recommend integration tests for the customer demo workflows or the "golden journeys" that product is meant to define as part of Duo Enterprise. I understand from the linked comment that this may be difficult; however, it would help reduce incidents overall and is not specific to demos. This is because we have disparate, complex applications that need to be tested end to end in that way.
@lfstucker @poffey21 it can be a future discussion, but I am noticing throughout this overall discussion that there are a lot of folks wanting to help but not a lot of help being needed in the end. It might be a good idea to collaborate more closely with Engineering, Product, and Test Platforms to get to a more aligned process on how we do demos and workshops at GitLab.
I do still highly recommend integration tests for the customer demo workflows or the "golden journeys" that product is meant to define as part of Duo Enterprise.
@m_gill, please feel free to tag me once these "golden journeys" are defined so we can see what's doable. I want to stress that e2e tests are very expensive in terms of maintenance and running costs, so they are worth automating only when we know those flows won't change regularly.
My one question would be: how does your team currently handle sensitive tokens? Are they just stored as protected variables on GitLab, or are you using a solution similar to Vault to host them?
@marin and I were pinged regarding this issue and want to provide the SaaS Platforms perspective:
If demoing GitLab Duo on GitLab.com is a problem because of stability concerns, spinning up another demo environment - Dedicated or otherwise - is not the solution to this problem. We have thousands of customers on GitLab.com that may want to buy GitLab Duo. Instead of side-stepping this problem, we need to understand why there is instability or perceived instability on GitLab.com and address it. That will help SAs and our customers, and improve the overall experience.
The support sandbox is not meant to be stable and as such not a target environment for SAs. If there is a need for a demo environment for Dedicated specifically for SAs and PS, we can certainly discuss this but this should focus on providing access to Switchboard, Dedicated and allow those teams to further their understanding of Dedicated.
In short, we are not going to spin up another Dedicated instance at this moment.
Please let me know if there are any additional questions. If additional support is needed by teams in SaaS platforms to improve the stability of Duo, please reach out!
Hi @fzimmer, I agree with all of your points. The ask for a Dedicated instance/access would not be a solution for Duo demos, but SAs are getting more asks to show off Switchboard and Dedicated, which they currently don't have an environment for. Providing access to Switchboard/Dedicated would be a huge win, especially as the SAs are starting to try to build out a Dedicated SME team.
Having just gone through GitLab Dedicated onboarding and understanding the behind the scenes challenges, I would strongly encourage only doing pre-recorded video demos of this. Once the instance is up and running with onboarding, it's the same as a self-managed instance.
While I agree that the instances are pretty much the same, there are a few challenges that having direct access for enablement/learning/demos would solve:
It's not unlikely for a customer to request proof that something works on an instance before buying. We are seeing this with Duo currently, where even though it functions the same, the customer requests the demo be completely on self-managed.
Internal education/testing - SAs can learn from some of the recordings that we have, but I still think there would be value in personally walking through Dedicated/Switchboard use or being able to fact-check some of their demos on the live instance.
I'm open to solutions other than just spinning up a new instance, but I have heard demo prep/education around Dedicated is difficult right now.
It's not a viable option right now, at least for end-to-end flow testing. Onboarding has a UX, but it is not automated administratively, and it goes through assisted onboarding via Zoom that pulls Dedicated Engineers or Engineering Managers away from their work.
The onboarding is per instance (not per user), and the cost and labor burden is simply too high based on current state.
After a quick Zoom with @lfstucker, we came to the following conclusions.
TLDR
We do need the ability to demo GitLab Dedicated as a platform/architecture for customers that are buying Dedicated. This is not a replacement/alternative to what we have today, it does not solve or provide any stability, and it is not related to Duo.
Switchboard flow should be a video demo only.
There would not be multiple Dedicated instances for different SA/CSM/PS teams.
There should be one instance (cs.gitlab-dedicated.com) for anything related to Dedicated demos and @lfstucker would be the Instance Admin, similar to what we do with cs.gitlabdemo.cloud. @lfstucker will manage team members that should have access.
Details
This issue topic is about Duo and the ability to demo it. This has unearthed several gaps across the way that we demo that need to be addressed. There is a need for a GitLab Dedicated instance, but it is not related to stability or Duo; it is related to the ability to show the "proof in the pudding" and demonstrate that customer workflows do work on a GitLab Dedicated instance when the customer is interested in buying Dedicated.
Based on this clarity, I do support and sponsor the need to create a GitLab Dedicated tenant instance for GitLab Customer Success (cs.gitlab-dedicated.com most likely). This will be shared by ~200 users, with a regular usage by <20 of them (Dedicated SMEs).
The Dedicated Team is welcome to ping me with context questions. A 2K or 3K architecture is perfectly fine.
This is not designed for end-to-end workflow demos, this is only for performing group/project/runner demos on the GitLab Dedicated instance, the same way that we would on any other self-managed instance (ex. what we do on https://cs.gitlabdemo.cloud). It does not replace the existing instance, and it does not solve for Duo related demos. It is a reference architecture proof demo only.
Action Items
Here are related issues to understand the onboarding flow as an internal customer.
Just want to be clear on my side: Dedicated and Duo are two different things, and we are not going to be discussing these two orthogonal projects in an issue that is discussing Duo demo environments.
For Dedicated, @fzimmer and I are working with SA leadership to agree on the path forward to enable SA org. Right now, that is not a demo environment, and as such we will not be providing a Dedicated environment for the time being.
Based on Slack DM with @marin, there are several overlapping private discussions happening, at the engineer level and at the Director level. The good news is, we are collaborating extensively on this topic, however it is a new area and we need to ensure we have alignment as we move forward.
We'll get there, we just need to get the right folks in the same room. We'll follow up as those discussions happen and we have business and technical alignment worked out.