Test plan to verify failed-over instance
As part of the GCP Migration, we need a comprehensive test plan that we can use to certify that the failed-over instance of GitLab is working correctly.
This plan should be used in both the Staging and Production environments.
Some questions:
- Can we use GitLab QA against the failed-over environment?
- How much functionality that is not included in GitLab QA will we need to test manually?
- Once the production environment is failed over to GCP:
  - What's the best way for us to get access to the instance for testing, without letting the public in yet?
  - How long will QA of the failed-over production environment take? This affects the overall outage window.
Test plan
A number of features are already covered by test automation. We aim to run these automated tests as soon as the failover has finished.
High level plan
Staging
- Run the automated tests on staging right after failover
- Triage test results
- Rerun tests / debug & fix as necessary
- Start manual testing of the areas listed below, following the owner assignments
- Track this in a staging manual test issue; each owner is responsible for updating the status and checking off their items
Production
- Run the automated tests on production right after failover
- Triage test results
- If we do this right, we shouldn't run into the same failures we fixed in staging.
- Start manual testing of the areas listed below
- Track this in a production manual test issue; each owner is responsible for updating the status and checking off their items
- Communicate with RMs
During the testing phase, production engineers should be assigned and ready to fix any issues found in production.
Automated Tests
The following features are covered by automated tests:
- Login to the site
- Basic features
- CI/CD pipeline
- pushing code
- registering runners
- using variables
- deploy tokens
- Repository operations
- https
- Traces
- CI runners
- receiving jobs
- uploading traces and artifacts
Run automated tests on staging after the failover
Run command
env GITLAB_USERNAME=gitlab-qa GITLAB_PASSWORD=... gitlab-qa Test::Instance::Any EE latest https://staging.gitlab.com
Geo tests
CHROME_HEADLESS=0 GITLAB_USERNAME=gitlab-qa GITLAB_SANDBOX_NAME=mkozono-gitlab-qa-sandbox bin/qa QA::EE::Scenario::Test::Geo --primary-address https://gitlab.com --secondary-address https://gprd.gitlab.com --primary-name primary --secondary-name secondary --without-setup
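Since the length of the QA run feeds directly into the outage window, it may help to time each run and keep its output for triage. The following is a minimal sketch, not part of GitLab QA: the `timed_run` helper and the log file name are hypothetical, and `GITLAB_PASSWORD` is assumed to be exported in the shell beforehand.

```shell
#!/bin/sh
# Sketch only: timed_run is a hypothetical helper, not a GitLab QA feature.
#
# timed_run LOGFILE COMMAND [ARGS...]
# Runs COMMAND, writes its stdout and stderr to LOGFILE for later triage,
# prints a one-line summary (exit status, wall-clock seconds, log path),
# and returns the exit status of COMMAND.
timed_run() {
  log_file=$1
  shift
  start=$(date +%s)
  status=0
  "$@" >"$log_file" 2>&1 || status=$?
  end=$(date +%s)
  echo "exit=$status duration=$((end - start))s log=$log_file"
  return "$status"
}

# Example invocation (not executed here; qa-staging.log is an assumed name):
#   timed_run qa-staging.log env GITLAB_USERNAME=gitlab-qa \
#     gitlab-qa Test::Instance::Any EE latest https://staging.gitlab.com
```

Timing the staging run this way gives a rough lower bound for the automated part of production QA; manual testing time has to be added on top.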
Manual Testing
Here are the areas, the functionality to test in each, and the owner assignments:
| Area | Functionality | Run in parallel w/ GitLab QA | Test on staging | Test on production | Owner |
|---|---|---|---|---|---|
| Emails | - | - | - | - | @toon |
| | Outgoing notification (do DKIM, SPF, etc. continue to work from the new sending hosts?) | | yes | yes | |
| | Incoming => notes | | yes | yes | |
| Repository operations | - | - | - | - | @mkozono |
| | Pushing to a protected branch (provides access when it should and protects when it should) | | yes | yes | |
| | ssh | | yes | yes | |
| | Forking | | yes | yes | |
| | Access existing LFS object | | yes | yes | |
| | Push commit with LFS object | | yes | yes | |
| Wikis | - | - | - | - | @mkozono |
| | Access existing wiki | | yes | yes | |
| | Create new wiki | | yes | yes | |
| Uploads | - | - | - | - | @mkozono |
| | Access existing upload | | yes | yes | |
| | Create new upload | | yes | yes | |
| Artifacts | - | - | - | - | @mkozono |
| | Access existing artifact | | yes | yes | |
| | Create new artifact | | yes | yes | |
| Pages | - | - | - | - | @toon |
| | Access existing page | | yes | post deploy | |
| | Create new page | | yes | post deploy | |
| | Access from Azure (need a browser on a VM in Azure?) | | yes | post deploy | |
| | Access from GCP (need a browser on a VM in GCP?) | | yes | post deploy | |
| Mirror | - | - | - | - | @vsizov |
| | Push | | yes | post deploy | |
| | Pull | | yes | post deploy | |
| Backups (Production team) | - | - | - | - | @dawsmith |
| | Redis | | yes | post deploy | |
| | Postgres | | yes | post deploy | |
| | Git Data | | yes | post deploy | |
| GitLab Service Desk | - | - | - | - | @stanhu |
| | Do we only need to test this on production? | | ??? | yes | |
| version.gitlab.com | - | - | - | - | @dbalexandre |
| | Do we only need to test this on production? | | ??? | yes | |
| Alerts | | | yes | yes | @dawsmith |
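Some of the manual checks can be given a quick scripted first pass before an owner looks closer. The sketch below covers two of them, under stated assumptions: the function names are hypothetical, the hostnames are supplied by the caller, and the SSH check relies on the "Welcome to GitLab" greeting that GitLab prints for `ssh -T`. A published SPF record does not prove that the new sending hosts are authorised, so the headers of a received notification still need to be inspected by hand.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; hostnames are passed by the caller.

# Reads DNS TXT record lines on stdin; succeeds if one of them is an SPF policy.
has_spf_record() {
  grep -q 'v=spf1'
}

# Reads `ssh -T` output on stdin; succeeds if it contains the GitLab greeting.
is_gitlab_ssh_welcome() {
  grep -q 'Welcome to GitLab'
}

# check_spf DOMAIN: looks up the TXT records of DOMAIN and checks for SPF.
check_spf() {
  dig +short TXT "$1" | has_spf_record
}

# check_ssh HOST: attempts a non-interactive SSH login as the git user.
# BatchMode avoids hanging on a password or host-key prompt.
check_ssh() {
  ssh -o BatchMode=yes -T "git@$1" 2>&1 | is_gitlab_ssh_welcome
}

# Example invocations (not executed here):
#   check_spf gitlab.com && echo "SPF record published"
#   check_ssh staging.gitlab.com && echo "SSH access OK"
```

The parsing helpers are kept separate from the network calls so they can be exercised with canned output before the failover.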