Currently, the spin-up of this instance is gated by the entire test stage, which may not need to pass in full for the review app to be up and running. We should consider deploying the review app even when some tests are failing.
Proposal
Have review apps be deployed in a more aggressive manner. This can be done once we have resolved all the dependencies and achieved a stable pass rate. We need to determine which tests are allowed to fail. E.g.
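As an illustration only, here is a minimal `.gitlab-ci.yml` sketch of what such decoupling could look like. It reuses the job names mentioned later in this thread (`review-build-cng`, `review-deploy`), but the stage layout, the lint job, and the script paths are placeholders, not the actual gitlab-org/gitlab configuration:

```yaml
stages:
  - build
  - test
  - review

review-build-cng:
  stage: build
  script:
    - ./scripts/trigger-build-cng        # hypothetical build-trigger script

# A test we might decide is allowed to fail without blocking the review app.
lint:
  stage: test
  script:
    - yarn run lint                      # placeholder lint command
  allow_failure: true

review-deploy:
  stage: review
  # `needs` starts the deploy as soon as the build job succeeds,
  # instead of waiting for every job in the test stage to finish.
  needs: ["review-build-cng"]
  script:
    - ./scripts/review-apps/deploy       # hypothetical deploy script
```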
thanks @meks. I'll copy my original statement here:
Quick question related to review apps. Are we able to make review apps spin up even if tests do not pass?
Review apps can be seen as a test themselves. The experience and usefulness of review apps is severely degraded when a review app becomes unavailable only because, for example, some linting/ee-compat/code-quality tests fail. The iteration cycle between dev and review should be as short as possible.
@vkarnes it would be great if we could give this some priority from a gitlab-ce~2024184 perspective. Often the review apps are not working/deployed for this reason. A simple test failure often proves to be a blocker for deploying an environment which is meant to be a test in itself.
If it only needs a rearrangement of the CI stages/configuration, the Engineering Productivity team can do this. We are trying to make review apps faster and more stable before allowing more usage.
@meks this issue was discussed in the UX/FE meeting of 22 Aug 2019. Do we have a new update regarding progress on this? It would be very welcome for the functional and visual review process.
Is there a possibility to give this issue a bit more visibility?
Once these 2 workstreams are completed, we can open this up more. If we enabled it now, it would add more complexity to the configuration. I have optimistically put %12.5 on this.
@kwiebers I am assigning this to you for now since we need someone to drive this from Eng Prod. I am optimistically setting this to %12.5; by then we should have more stable deploys, with metrics to back it up, so review apps can be expanded further.
review-build-cng and review-deploy were adjusted to run with fewer dependencies in !24803 (merged). Are you still experiencing the same friction with how often review apps are deployed?
@kwiebers bringing visibility to the fact that this topic was discussed in the UX weekly yesterday and it still seems to be a problem. Would love for you to look into this.
I'd like to hear more about Review App success challenges. I know the success rate had been lower than 70% since February but improved to 90% in April.
@dimitrieh @mvanremmerden - Are you and the other UX team members still seeing challenges since some of the corrective actions taken during April?
Some of the other items that drew me in were related to these topics:
Feature flags for review apps
There are a few ways to set Feature Flags in a review app:
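For illustration only, one such approach might be a manual CI job that toggles a flag on the running review app through GitLab's Features API (`POST /api/v4/features/:name`); the job name, token variable, flag name, and URL variable below are all hypothetical:

```yaml
review-enable-feature-flag:
  stage: review
  needs: ["review-deploy"]
  when: manual                           # run on demand once the review app is up
  script:
    # The Features API requires an administrator token on the target instance.
    # REVIEW_APP_URL, REVIEW_APP_ADMIN_TOKEN, and my_feature_flag are placeholders.
    - >
      curl --request POST
      --header "PRIVATE-TOKEN: ${REVIEW_APP_ADMIN_TOKEN}"
      --data "value=true"
      "${REVIEW_APP_URL}/api/v4/features/my_feature_flag"
```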
I think this could be improved and created #216838 (closed) to track this.
Slow cycle time for quick changes
This definitely falls under the Engineering Productivity area. The quote that resonated with me was "CSS took 15 minutes to change and MR has lingered for a couple of days".
I appreciate hearing cases of very small MRs where the MR process seems to take longer than it should.
@hollyreynolds and @pedroms - Could you share some examples of these MRs or changes to look at a little closer?
In addition to that, most of my MRs are around the Web IDE, and in the past I also had trouble with getting that to work in review apps, but that might have been solved as we also made some changes to be more flexible for the gdk, as can be seen here: gitlab-development-kit!982 (comment 315333982)
This type of review app failure would not surface in the Deployment Success Rate metric that I referred to above. If you observe issues where review-deploy passes but the deployment is not working how you would expect, please let us know either on Slack (#g_qe_engineering_productivity) or in an issue in gitlab-org/gitlab.
In addition to that, most of my MRs are around the Web IDE, and in the past I also had trouble with getting that to work in review apps
@mvanremmerden - Can you expand on the trouble that you've had with Web IDE for Review Apps? Is it the same CORS error or something else?
but that might have been solved as we also made some changes to be more flexible for the gdk, as can be seen here:
Review apps use the GitLab Charts and aren't leveraging the GDK for these types of improvements. I'd be curious whether the issues are the same, because it may be a Review App implementation problem or an upstream GitLab Charts problem.
If you observe issues where review-deploy passes but the deployment is not working how you would expect please let us know on either Slack (#g_qe_engineering_productivity) or an issue in gitlab-org/gitlab.
I will try to do that more again, thanks!
Can you expand on the trouble that you've had with Web IDE for Review Apps?
The Web IDE was relying on a certain host that was configured somewhere in the environment. As you can see in the gdk example, that was something that broke in a couple of places over the last year (and if I remember correctly, at a certain point, the review apps were one of them), but we made some changes to make that more flexible, so it might already be fixed.
@kwiebers Here are a couple of MRs that hung around for me for a while but this was primarily due to pipeline failures and addressing problems there. Not sure if that helps? !29212 (merged) !30636 (merged)
@kwiebers @caalberts yesterday I opened 4 merge requests with minor changes (changing icons or replacing a button with a new component). All of them failed to spin up the review app:
Replace fa-link icons with GitLab SVG link icon (!36973 (merged))
Replace fa-plus icons with GitLab SVG plus icon (!36972 (merged))
Replace with in app/assets/javascripts/pipelines/components/graph/linked_pipeline.vue (!36968 (merged))
Updates deprecated button in in app/assets/javascripts/pipelines/components/graph/action_component.vue with Pajamas component (!36966 (closed))
What can I do differently here to make sure the review apps are reliable?
Additionally, there is an interesting conversation in Slack:
I’d like to understand our product testing capabilities better. Should either gdk or a review app be viewed as a better way to test? Or are they equal? Why are review apps created from some MRs, but not others?
Review apps are created via a CI job. If the job doesn't run or fails before the review app is deployed, no review app will be available. Projects that are not using CI or do not have the necessary job will not have a review app created.
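For context, a bare-bones sketch of how such a job is typically wired up in `.gitlab-ci.yml` (the script path, URL, and rule are placeholders, not the gitlab-org/gitlab configuration); the environment and its URL only exist if this job actually runs and succeeds:

```yaml
review-deploy:
  stage: review
  script:
    - ./scripts/deploy-review-app "$CI_COMMIT_REF_SLUG"   # hypothetical deploy script
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    url: https://$CI_ENVIRONMENT_SLUG.review.example.com
  rules:
    - if: '$CI_MERGE_REQUEST_IID'        # only create review apps for merge request pipelines
```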
In my experience, review apps are great if you are reviewing something in the application that is fairly basic and the job runs/passes. Review apps are not set up to include every feature, but we have this issue open to populate review apps further #25297 (closed)
This is where GDK comes in, because we have special setups that allow us to configure certain features to test locally.
Imo review apps are the superior testing method in an ideal world.
They allow anyone to spin up the testing environment in an instant just by using a browser. No complicated tools necessary. This is incredibly powerful!
This makes it possible to evangelise functional reviews to additional stakeholder personas such as PMs, tech writers, marketing, etc.
They will also soon be made available to non-project members with #22090 (closed)
However, there are currently some drawbacks to using a review app:
1. They rely on a successful pipeline, meaning that if it fails or is not kicked off there is no review app. (This happens fairly often.)
2. Review apps are only kept online for 2 days after spinning up, AFAIK. This means any review app spun up on Friday is not available on Monday. Alternatively, this requires activity on the merge request within a 2-day period (this is a recurring theme). A configuration sketch for this follows the list.
3. Review apps are indeed, as stated above, not always provisioned with the data needed to test the feature in dev. Additionally, any data setup is deleted after a new commit is pushed and a new review app is spun up (again a recurring theme).
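As an aside on point 2, a minimal sketch of how that lifetime maps to GitLab CI's `environment:auto_stop_in` keyword, assuming placeholder job names and scripts rather than the actual gitlab-org/gitlab configuration:

```yaml
review-deploy:
  stage: review
  script:
    - ./scripts/deploy-review-app        # hypothetical deploy script
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    url: https://$CI_ENVIRONMENT_SLUG.review.example.com
    on_stop: review-stop
    auto_stop_in: 2 days                 # tear the review app down 2 days after the last deploy

review-stop:
  stage: review
  when: manual                           # also lets reviewers stop the environment on demand
  script:
    - ./scripts/stop-review-app          # hypothetical teardown script
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    action: stop
```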
In the past I have opened issues to improve 1. and 2. (I would need to look them up, though). However, for review apps to become a true tool in our arsenal, this needs greater prioritisation.
Hope that helps!
Thanks for all that. So overhead aside, it sounds like review apps are preferred, but gdk would be more robust?
Review apps save you from checking out a branch and spinning up gdk so I can see why it would be preferred by many who are not doing development.
I don't personally prefer one over the other. I often want to make changes in the codebase while reviewing, so I usually default to gdk. They both have their place, depending on the MR you are reviewing.
@dimitrieh - How can Engineering Productivity share these types of breaking issues for Review Apps in a more efficient manner? Would communication in #ux on Slack be the best place?
@kwiebers I think that would be a great place to start.
The problem here is that every review app deploy that fails without a good reason decreases the trust placed in review apps in general. My assumption about the general thinking is: can I rely on this or not?
@kwiebers @caalberts @rymai FYI, here are the results from the poll. I would love to know how I can help further and to see how we can further prioritize and improve our review app offering here!
Thanks for this useful information, @dimitrieh. The friction aspect is very helpful. We are actively planning how to seed the data in a more efficient manner; this will likely be a Q3 OKR for us, and it will be used in review apps as well: gitlab-com/www-gitlab-com#7397 (closed)
The reliability aspect is being measured as a KPI. I wanted to call out that some of these errors also occur because we are dogfooding our cloud-native installer, which still has room for improvement.
We will have work along these lines to improve the situation. cc @kwiebers @tpazitny for the demo data aspect.
That sounds great, @meks! Wondering if a collaboration with UX might prove useful here to keep tabs on the perceived experience vs. the raw data, as that will ultimately influence how much review apps are used as well. A platform for error reporting might also prove helpful, so failures do not go without reason/background knowledge.
@dimitrieh do we have a known catalog of seed/demo data that the UX dept uses frequently? We could look at expediting some of this data into the review apps.
If I remember correctly, having demo projects correctly set up for testing features is something that the Secure team is also struggling with, so this might be interesting as well for @jmandell.
Thanks for the ping, @mvanremmerden. Actually, both of my teams have had issues getting a GDK setup that supports their use cases. Configure/Monitor struggle with getting clusters and Terraform set up etc. to see those UIs in action, and Sec/Def also need dummy projects to see their UI in action.
For Configure features, you need to set up QA Tunnel access in order to create a cluster and install GitLab Runner. This allows you to test Auto DevOps or other features such as Serverless and Prometheus. This isn't as simple as having a demo project but maybe someone smarter than me can figure out a way to automate this setup further.
If using a review app, you could skip the QA tunnel portion needed for local development but I imagine a blocker could potentially be the cost of automatically spinning up a cluster for each review app?
@gl-quality/managers please kindly follow along with the discussions here. We are working on seed data this Q3, but the valuable feedback here points to more opportunities besides just static project/issue/MR/CI setup.
At Monitor we're experiencing the same challenges @tauriedavis described in her comment above. I need to have QA Tunnel set up to create a cluster, install Prometheus, and deploy to an environment to monitor Metrics.
@svistas @ddavison given the nice work we have done with K3s to decrease the cost of setting up Auto DevOps tests, would it be possible to use K3s to help with some efficiencies here?
k3s uses the GitLab Tunnel and the test app is deployed using Auto DevOps, but we would at least save the cost of running time in GKE clusters.
Mek Stittri changed title from "Remove friction for review apps to be deployed spin up" to "Remove friction for review apps to be deployed so environments are available for UX and PM"
A very small (I think) improvement that I've mentioned to @meks would be adding a description to each demo project of what's seeded/configured. For example, in this screenshot, replace "My awesome project" with what's actually available in the project.
In theory, review apps would be more accessible than local GDK or Gitpod, but responses show that review apps are largely unused (31.82% have never used them).
Pros: Useful to review documentation changes (Pajamas or GitLab handbook).
Cons: Some mentioned their lack of familiarity with review apps. They are seen as unreliable even though they are much more reliable than they were in the past. As mentioned in the local GDK section, using review apps often means setting up the environment from scratch.
Engineering Productivity has greatly improved the stability and usefulness of the Review App implementation in gitlab-org/gitlab since this issue was created.
I'm going to close this based on the improvements listed below. Please open new issues and mention me if there are specific usefulness opportunities to improve UX/PM interaction with Review Apps that aren't captured in &606 or a child epic.