We're shipping #21971 (closed) in 8.12, but it only supports creating environments dynamically for branches. We need to provide some way to clean up (delete) environments when MRs are merged and/or branches are deleted. This means deleting the environment in GitLab, as well as deleting the backing environment, wherever it is hosted.
Proposal
Specify on_close: job_name to point to a manual action job to run when deleting a branch or when someone manually deletes an environment from the web UI.
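A minimal sketch of how this might look in `.gitlab-ci.yml`, assuming the proposed `on_close` key sits at the job level next to `environment`. The job names, the `deploy.sh` and `teardown.sh` scripts, and the use of `$CI_BUILD_REF_NAME` for the branch name are illustrative assumptions, not confirmed syntax:

```yaml
# Sketch of the syntax proposed in this issue; `on_close` is a proposal,
# not a confirmed keyword.
deploy_review:
  stage: deploy
  script:
    - ./deploy.sh "$CI_BUILD_REF_NAME"      # hypothetical deploy script
  environment:
    name: review/$CI_BUILD_REF_NAME
  on_close: close_review                    # job to run when the branch or environment is deleted
  only:
    - branches
  except:
    - master

close_review:
  stage: deploy
  script:
    - ./teardown.sh "$CI_BUILD_REF_NAME"    # hypothetical teardown script
  when: manual                              # exposed as a manual action
  environment:
    name: review/$CI_BUILD_REF_NAME
```

Keeping the close job as an ordinary top-level job means it can be triggered by hand from the pipeline as well as by a branch deletion.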
[Optional] At top-level, declare all environments so that they can be declared once, but used in multiple jobs. Also, by using an array, we can have a defined order for the environments which may help with Cycle Analytics detecting "production":
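One hypothetical shape for such a top-level declaration is sketched here. The `environments` key, its fields, and the idea that list position carries meaning are all assumptions made for illustration:

```yaml
# Hypothetical top-level declaration; the key name and fields are assumptions.
environments:
  - name: staging
  - name: production        # listed last; the order could hint that this is "production"

deploy_staging:
  stage: deploy
  script:
    - ./deploy.sh staging   # hypothetical deploy script
  environment: staging

deploy_production:
  stage: deploy
  script:
    - ./deploy.sh production
  environment: production
  when: manual

migrate_production:
  stage: deploy
  script:
    - ./migrate.sh production   # a second job that reuses the same declared environment
  environment: production
  when: manual
```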
The simplest implementation for the pipeline graph is to replace the play icon with the stop icon. We will also leave the background color intact for now:
In the future we may change this, perhaps to: (image)
@dimitrieh We already have a Destroy button for environments in the web UI. Do we want/need to do anything special here? Do we want to show the build log triggered by the destroy? Do we want to inform people of the side effects of destroying the environment? Should they be optional?
@markpundsack Is it correct to assume we may want a mass destroy function? Or is it just for incidental individual environments?
Also, could you give me master access to a repository which has environments, so I can see all controls live for myself?
Either way, yes, I think it would be great to see a log of what happens when those scripts run. Shouldn't they then get their own "destroyed" build log in the commit view and pipeline view from which they originated?
So in the discussion I had with @ayufan we came to the following:
Currently it's possible to "Destroy" an environment without leaving a trace anywhere. This is bad: there should be a history, and a way to reach the review app's "delete" build log from the environment list as well, since a review app is basically a dynamic environment.
Therefore I propose the following: open and closed tabs for the environments view, similar to Issues, MRs, etc.:
In the case of a dynamic environment (aka review app, as it is defined in the .gitlab-ci.yml file) we can close the environment via a manual action (which will trigger a build with a script) in:
The problem with this is that we now have a manual action that is pretty different from a deploy. @markpundsack the question here is whether there are other possible manual actions apart from deploying to, and now closing, an environment.
I would like to have a different icon for the destroy/close action on environments to give more context, as the implications are more "severe" when closing something. (We are closing on GitLab, but effectively destroying on the external server.)
The pipeline graph:
The problem with the play button is that you have to get your context (that you are destroying the environment instead of deploying to it) from somewhere else, in this case the column header. Here the close icon has clear benefits.
or:
If we go with the play button, I believe we must add extra context, along the lines of Close Review/hello-world.
The build list:
The problem with the close icon here is that it is the same icon used to stop a build from running.
A possible solution if we use the "close" icon:
As we created a new icon for https://gitlab.com/gitlab-org/gitlab-ce/issues/22628, we should update the status icon for manual actions. A manual action has no real status: "skipped" is sort of what happened, but it does not convey that the action is still possible and just needs to be triggered manually :) As a bonus, we can also lose the manual tag in this case (we are also getting rid of the allowed to fail tag in https://gitlab.com/gitlab-org/gitlab-ce/issues/21948#note_16048269).
Also, the play icon for a manual action seems slightly bigger than all the other icons; we can do the same for a manual "close environment" action button.
Both of these options are included in the following mock:
The environment list as well:
The problem here is that when only a manual "play" button is shown, it looks as if you are deploying to the environment instead of closing/destroying it. It's the same as in the pipeline graph: you then have to get your context from somewhere else.
Options:
Optionally, we can put the destroy/close actions in the environment list in their own button, next to the manual deployment actions button.
Basically, the point here is that we have to choose, or be okay with mixed use, which I think is weird.
I came to a conclusion that should make more sense, as an environment is not always deployed, or managed by GitLab in that sense (the external server, that is). The environment can exist without being used or deployed to. In that sense it would be wise to change open/closed not to deployed/stopped but to available/unavailable.
However, we can still use the play/stop icons idea. Or, to quote @awhildy:
But available/unavailable describes the state the environment is in, not the action you take. Like if the environment is available, you want to stop it and then it is unavailable. If it is unavailable you want to start it and it is now available
I would love @markpundsack's opinion on these semantics.
Also, would we be able to start the environment again from the unavailable tab? For manual environments this would be no biggie imo. But I think in the case of review apps (aka dynamically created environments) this could only be done by using the same name for the review app and the branch, effectively doing a new dynamic deploy.
this would change the following for
Perhaps include an "All" tab, with unavailable environments marked by a tag, an icon, or a grey background...
Thanks @stanhu, I was thinking about that as well :) However, this brings up a problem of its own, as I believe there is mixed use of this throughout the application (cc: @awhildy @cperessini).
Mmh I think the best would be to leave it white though, as we don't color it in the pipeline graph or MR view
The square icon looks like a placeholder to me. Maybe we should use these icons for deploy/close?
Unavailable sounds to me like something went wrong and I can't touch it. Maybe available and stopped?
If all actions performed on an environment are under the same dropdown, the dropdown title should be something that reflects all actions. Or if they are separated, I don't think the stop button needs a dropdown, right?
Should all copy that says "Close environment" say "Stop environment" to match the button?
We use slightly different icons (different line thickness, etc.) than the standard ones from Font Awesome, but I provided them in the production/svg's directory in the design repo.
I like this one, although it wouldn't be in line with what we have used up until now; see the build list (it would just need the updated icon from above).
The dropdown doesn't say anything about what's in there...
Apart from that, the stop button would need a dropdown as well, as it can have multiple review apps, plus you want to know what the stopped dynamic env would be called.
However, I think this is not looking good (imo let's still go with actions). It sort of represents "manual environment actions in here". It doesn't say anything, but you can guess it.
This looks terrific; it just needs the icon mentioned above:
Btw I am okay with "available/stopped".
@markpundsack so the jobs could be named like that, couldn't they, given the right example?
@tauriedavis what do you think? how would you tackle the build-list actions?
If we come to a conclusion I'll do an overall update on the designs.
Besides the UI discussion above, the question of when the review app gets deleted is still a good one relating to on_close. Does a failure during the on_close action prevent the merge/delete? Or is it a best-effort step after the merge/delete? Can we have a general 'on_close' that is unrelated to environments (run this job on merge/delete of a branch)?
To add a real world example, here is a setup I use at work:
This works, but the automatic environment close would be awesome. Thanks.
As far as I know, this will be tackled in the next iteration? @markpundsack This was a question of mine as well, or rather an assumption that the review apps would close on MR merge.
When I started implementing the backend changes, the amount of work that needed to be done was very significant and required a lot of redundant configuration in .gitlab-ci.yml.
@ayufan I'm not opposed to this, but here are a few thoughts:
I like the simplicity of it. If the vast majority of real use-cases is a single command, then this is great.
Defining a job inside a job feels weird. It kind of ruins the purity of thinking of top-level keys as jobs.
Where would these pseudo-jobs live in the pipeline?
Would closing still be manually triggerable in the pipeline?
It either limits flexibility in the close job, or requires a lot of complicated, nested YAML. e.g. what if I want/need to use a different docker image when closing?
For environments that have multiple jobs that deploy to it, would each one have to define close?
It seems to be going away from the explicitness we value elsewhere. More magic, which has its benefits of course, but also limitations.
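To make the trade-off in the points above concrete, here is a sketch contrasting the two shapes. The inline `close:` key in shape A is purely hypothetical, inferred from the wording of this discussion; shape B uses the `on_close` key from the proposal. Job names, images, and scripts are illustrative assumptions:

```yaml
# Shape A (hypothetical): teardown nested inside the deploy job.
deploy_review_inline:
  stage: deploy
  image: ruby:2.3
  script:
    - ./deploy.sh               # hypothetical deploy script
  environment:
    name: review/$CI_BUILD_REF_NAME
    close:                      # assumed key name, not confirmed syntax
      script:
        - ./teardown.sh         # hypothetical teardown script
      # Using a different image here would require more nested keys.

# Shape B: teardown as an ordinary top-level job, referenced by name.
deploy_review_explicit:
  stage: deploy
  image: ruby:2.3
  script:
    - ./deploy.sh
  environment:
    name: review/$CI_BUILD_REF_NAME
  on_close: close_review        # key from the proposal in this issue

close_review:
  stage: deploy
  image: alpine:3.4             # free to use a different image than the deploy job
  script:
    - ./teardown.sh
  when: manual                  # still visible and triggerable in the pipeline
  environment:
    name: review/$CI_BUILD_REF_NAME
```

Shape A is shorter for the single-command case; shape B keeps every top-level key a job and gives the close job the full set of job options.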
This reminds me of http://www.slideshare.net/SkeltonThatcher/continuous-delivery-antipatterns-from-the-wild-matthew-skelton-ipexpo-europe, slide 12, where one of the basic ideas behind continuous delivery is to deploy the same way to every environment. And with Distelli, for example, you declare how you create and destroy environments in a single place, and it is applied to every environment you create. I like the idea of declaring how to deploy and destroy environments once for all environments. Or perhaps once per class of environments (e.g. one for Heroku and one for Openshift, if you're using both).
Today I ran into an issue where my stop environment job does not match the state of the server it runs against -- i.e. the container was stopped and removed not by GitLab CI, but manually. I missed the tags directive on the stop job, and therefore it was not connecting to the correct server to stop the docker container (using shell executor).
I now find myself in a situation where I don't see a way to remove this environment from my active env list without trying to run the stop job -- which fails since the container no longer exists.
I'm not sure if A) There is something I'm missing, and can easily be resolved within the UI, or B) I should open a feature request to manually remove environments, regardless of whether cleanup was successful. Ideally the state would be consistent, but surely there will be times where an admin just needs to beat something up.
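For reference, a sketch of a deploy/stop pair in which both jobs carry the same `tags`, so the stop job runs on the same server as the deploy. It assumes the `on_stop` / `action: stop` environment keys; the `docker-host` runner tag, the image name, and the container naming scheme are hypothetical:

```yaml
# Assumes the `on_stop` / `action: stop` environment keys; the runner tag,
# image name, and container naming scheme are hypothetical.
deploy_review:
  stage: deploy
  script:
    - docker run -d --name "review-$CI_BUILD_REF_NAME" registry.example.com/my-app:latest
  environment:
    name: review/$CI_BUILD_REF_NAME
    on_stop: stop_review
  tags:
    - docker-host          # runner on the server that hosts the container

stop_review:
  stage: deploy
  script:
    # `|| true` lets the job succeed even if the container was already removed by hand.
    - docker rm -f "review-$CI_BUILD_REF_NAME" || true
  when: manual
  environment:
    name: review/$CI_BUILD_REF_NAME
    action: stop
  tags:
    - docker-host          # must match the deploy job, or the stop job may run elsewhere
```

Making the teardown command tolerant of an already-missing container is one way to let the stop job finish when the server state has drifted; it does not replace a way to remove the environment directly from the UI.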