There is a single feature flag, remote_development_feature_flag. This is a development type flag which disables all
all graphql endpoints,
the controller which renders the Vue UI,
the internal/kubernetes/modules/remote_development/reconcile and internal/kubernetes/authorize_proxy_user endpoints.
Description
Put everything behind the feature flag on the remote_dev branch.
We will do this by putting guard clauses in the API to throw exception if flag isn't on: in the GraphQL create/update, and in the REST kubernetes internal API method.
We will also put the UI behind the flag, by making the controller return a 404 if it's not enabled.
Tasks
(TODO: more details to list including actual endpoint paths)
REST internal kubernetes agent config (existing endpoint that is used by other GA4K features, but we should put guard around our part of the request logic): internal/kubernetes/authorize_proxy_user
Add guard to controller to return 404 for UI if flag is off
Update documentation with references on how to enable feature flags.
✓
10 of 10 checklist items completed
· Edited by
Chad Woolley
Designs
An error occurred while loading designs. Please try again.
Child items
0
Show closed items
GraphQL error: The resource that you are attempting to access does not exist or you don't have permission to perform this action
@oregand@cwoolley-gitlab Let's discuss the appropriate actor for the feature flag so we can configure it correctly from the start.
Given our current strategy, my understanding is that nothing will prevent Self-managed instances from enabling the flag globally. Question: Does the scope of the flag impact their ability to enable it as well? So if we scope it to user and Acme Inc. enables it, could they pass in specific users on their end?
Given our current strategy, my understanding is that nothing will prevent Self-managed instances from enabling the flag globally. Question: Does the scope of the flag impact their ability to enable it as well? So if we scope it to user and Acme Inc. enables it, could they pass in specific users on their end?
Correct.
My suggestion would be we use Group level actors, WDYT @cwoolley-gitlab ?
Oh right, that's new after we did the Web IDE roll out, right?
I agree, if we can pass in a group instead of individual users, that is ideal.
I'd prefer not to do it by project because it could get confusing for someone who thinks they have access to the feature... navigating to a project and not remembering whether it's enabled or disabled.
I'd prefer not to do it by project because it could get confusing for someone who thinks they have access to the feature... navigating to a project and not remembering whether it's enabled or disabled.
The same actor type must be used consistently for all invocations of Feature.enabled?
So if we pick group as our actor, that means we can't use the same flag to enable it on a per-user or per-project basis.
I think this was the limitation we ran into with the Web IDE too.
I also think that's probably part of what https://docs.gitlab.com/ee/development/feature_flags/#feature-groups was intended to help address, but it seems like that's hardcoded to just allow all gitlab_team_members but we couldn't use it as is for anything other than that, unless we implemented our own custom group.
@ericschurter@oregand@vtak Now that I have started implementing this, I realize there's some problems with using the group actor.
The problems arise because we've now moved the feature to the My Work level in the menu instead of under a Group context.
This means that we can't use a group-scoped feature flag to disable the overall UI rendering and Workspace Create form under "My Work", because there's no concept of a "group" to scope it to at that level.
But, we have options.
First, lets remember that there's two separate goals we could want from our feature flag(s) for this feature:
To prevent users from using the feature if it is not enabled for a specific top-level group.
As a global "kill switch", which can completely disable the feature in the event of unexpected problems.
Option 1:
If we are OK with the feature being enabled globally for all users, on both SaaS and self-managed instances, we can go with a single, non-scoped feature flag that achieves goal number 2.
This is the easiest approach, but we have to be OK with enabling or disabling it globally.
Option 2:
If we still want to be able to enable-disable it on a per-root-group basis, then we can still have one "global" flag which achieves goal number 2, and then have an additional "per-root-group" flag which is required to enable it on a per-root-group basis.
However, there's implications of this approach:
It's a more complex approach (and these flags are already complex to reason about, implement and test, and trying to test two flags in conjunction makes it even more so). Plus, every flag is supposed to have a dedicated rollout issue and be managed over its lifetime (we get reports for every existing feature flag for our group until they are deleted), so adding a second one adds even more overhead.
Both flags have to be enabled in conjunction as appropriate. E.g. the global flag will enable the UI and kubernetes API, and the 'per-root-group' flag will control the actual display of the workspaces as returned from the GraphQL API to be displayed in the UI, such as for the workspace list, or starting/restarting/stopping/terminating a workspace.
If you want to allow users to create workspaces in multiple root groups, then the per-root-group flag should be consistently enabled/disabled for all root groups at the same time, or else they would see nothing in the list UI (and currently it's just a generic error). This is because it would be even more complex at this point to try to selectively apply the per-group flag to individual workstation retrieval from the GraphQL - the flag checks are done at a higher level in the logic, and also there's even more implications of this about potentially orphaning some of the user's workspaces.
All of this has to be documented and explained to on-prem admins.
Again, as you can see, it's more complex and harder to reason about.
So, if it's acceptable, I'm in favor of Option 1 - make it a global per-instance flag.
But if that isn't acceptable, we need to have some discussions about the implications of this and make sure they are fully understood.
This means that we can't use a group-scoped feature flag to disable the overall UI rendering and Workspace Create form under "My Work", because there's no concept of a "group" to scope it to at that level.
Is it acceptable to still have a group-level FF like we originally planned but let the UI be out of scope of the feature flag. For me, the feature flag is a protective switch to save GitLab in case of a disaster.
This will also take care of the following point Chad raised above -
If you want to allow users to create workspaces in multiple root groups, then the per-root-group flag should be consistently enabled/disabled for all root groups at the same time, or else they would see nothing in the list UI (and currently it's just a generic error). This is because it would be even more complex at this point to try to selectively apply the per-group flag to individual workstation retrieval from the GraphQL - the flag checks are done at a higher level in the logic, and also there's even more implications of this about potentially orphaning some of the user's workspaces.
So
The UI ("My Workspaces" sidebar entry, workspace creation form, workspace listing page) will be visible to everyone
When a user tries to create a project(whose root group has FF enabled), backend will allow it.
When a user tries to create a project(whose root group has FF disabled), backend will NOT allow it and the UI can show an error that this root-group does not have the FF enabled yet.
Is it acceptable to still have a group-level FF like we originally planned but let the UI be out of scope of the feature flag. For me, the feature flag is a protective switch to save GitLab in case of a disaster.
No, this doesn't really help much. Because it's not just the UI, it's the kubernetes REST API that also has no group scope (because it can send requests applying to any group in the instance). We'd still have to have a separate global flag for it.
Plus, this still has all the same complexity as the second case, where we have to reason about coordinating flag enabling for overlapping group cases when an admin wants to enable the flag for multiple groups.
Finally, this would still result in a broken and unusable UI being presented to the user if the flag is disabled, and we have already had reviewers questioning that in the reviewers channel.
Also, I forgot to mention another drawback of the per-root-group flag appraoch: the implementation of the feature flag logic itself could potentially cause performance problems.
This is because finding the root ancestor can be an expensive operation, especially in groups with large hierarchies and many subgroups.
And we have to perform this check for every workspace the user retrieves to list, because each workstations project could potentially be in any different root group.
@cwoolley-gitlab This issue looks like it may slip this current milestone. Can you leave a 👍 or 👎 to signify if you are on track to deliver this issue?
Please also consider updating the issue's Health Status or Milestone to reflect its current state,
and communicate with your Product Manager as appropriate.
@shekharpatnaik and @vtak - Can you review the task list I've added in the description here and make sure we have all endpoints covered, especially any ones related to auth?
For workspaces proxy, we don't really have feature flags since its outside the rails monolith. However, customers always have the option to uninstall it using helm uninstall if they don't want to use it.
For authentication - I think we are using an existing GraphQL endpoint - https://gitlab.com/gitlab-org/remote-development/gitlab-workspaces-proxy/-/blob/main/pkg/gitlab/user.go#L15For authorization, we use a new GraphQL endpoint - https://gitlab.com/gitlab-org/remote-development/gitlab-workspaces-proxy/-/blob/main/pkg/gitlab/workspace.go#L17So authentication should work and authorization should fail because that endpoint would be behind feature flag AFAICT. @Shekhar , please correct me if I'm wrong here.
Could you help update the task list in the description here with any additional endpoints other than the ones I've listed?
internal/kubernetes/authorize_proxy_user - I'm not sure what new thing we've added here? Maybe you meant kubernetes/agent_configuration ? In which case, I think there is little need to put this behind a FF. It is merely updating the DB and then doing normal GA4K thing(sending agent configuration to agent). But if this helps with the reviews, by all means.
The create, update, reconcile endpoints cover most of the exposure points. 👍
@cwoolley-gitlab ah I see. Its the workspace(id: $Id) endpoint at ee/app/graphql/ee/types/query_type.rb that we use. We compare the user Id returned from that endpoint with the logged in user.
Right, the fields on the root QueryType. Thanks Shekar, I've added those to the list. I've asked a question in slack about how to check the flag for this one:
If I have the following field on the root EE QueryType:
field :workspace, ::Types::RemoteDevelopment::WorkspaceType, null: true, description: 'Find a workspace.' do argument :id, ::Types::GlobalIDType[::RemoteDevelopment::Workspace], required: true, description: 'Find a workspace by its ID.'
…how would I put a feature flag check on it to disable its use if the flag is disabled? Is there a generic ID resolver that gets used that I can somehow add to?
Chad Woolleymarked the checklist item GraphQL root QueryType workspaces by ids field as completed
marked the checklist item GraphQL root QueryType workspaces by ids field as completed
Chad Woolleymarked the checklist item GraphQL User interface workspaces field as completed
marked the checklist item GraphQL User interface workspaces field as completed
Chad Woolleymarked the checklist item REST internal kubernetes workspace reconcile: internal/kubernetes/modules/remote_development/reconcile as completed
marked the checklist item REST internal kubernetes workspace reconcile: internal/kubernetes/modules/remote_development/reconcile as completed
Chad Woolleymarked the checklist item REST internal kubernetes agent config (existing endpoint that is used by other GA4K features, but we should put guard around our part of the request logic): internal/kubernetes/authorize_proxy_user as completed
marked the checklist item REST internal kubernetes agent config (existing endpoint that is used by other GA4K features, but we should put guard around our part of the request logic): internal/kubernetes/authorize_proxy_user as completed
Chad Woolleymarked the checklist item Update documentation with references on how to enable feature flags. as completed
marked the checklist item Update documentation with references on how to enable feature flags. as completed