Design of initial data for performance testbed and test data created from test runs
Overview
For our on-prem performance testbed, we should have a way to separate static data from incremental data generated by tests.
Static data should be the baseline. This ensures that as time progresses we test against a consistent database shape and that our data is not continuously growing. If the database keeps getting bigger, performance measured 6-12 months from now may differ from today's for that reason alone, and any improvement or degradation would not be an accurate comparison with prior results.
- Static data: This is the initial data set up in the database. It should be static and large enough to satisfy the SQL data shapes that we need.
  - Will be set up in a seed state and will remain static.
  - Will be used read-only. Tests will not create more data in these projects and groups.
- Incremental data: This is data generated dynamically by our test runs. It is mainly used for traffic generation and functional usage load.
  - New projects, groups, issues, and merge requests that get created or set up as part of an automated test.
  - This data should be in a separate project / group from the static data above.
  - Allow easy delete/drop to maintain the data size baseline.
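One way to enforce the static/incremental split on the test side is a small guard that every test calls before it creates data. The following is only a minimal sketch: the group paths `perf-static` and `perf-sandbox` and both function names are hypothetical placeholders, not names that exist in our environment.

```python
# Hypothetical top-level group paths; the real names are still to be decided.
STATIC_ROOT = "perf-static"
SANDBOX_ROOT = "perf-sandbox"


def is_under(path: str, root: str) -> bool:
    """Return True if `path` is `root` itself or is nested below it."""
    path = path.strip("/")
    return path == root or path.startswith(root + "/")


def assert_writable(path: str) -> None:
    """Raise PermissionError if a test tries to create data outside the sandbox."""
    if is_under(path, STATIC_ROOT):
        raise PermissionError(f"{path!r} is static seed data and is read-only")
    if not is_under(path, SANDBOX_ROOT):
        raise PermissionError(f"{path!r} is outside the sandbox {SANDBOX_ROOT!r}")


# Usage: a test checks its target namespace before creating anything.
assert_writable("perf-sandbox/run-001/new-project")
```

Matching on the `root + "/"` prefix rather than a plain string prefix keeps a sibling group such as `perf-sandbox-2` from being treated as part of the sandbox.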
A simple outline is shown below.
Task
- Identify the static data needed. This can be set up via project import. It is meant to be static, so cleanup is not a priority.
- Import data into the environment
- Set up a sandbox area for incremental data generated from tests
- Ensure that tests are outputting data/artifacts into this sandbox area (group/project)
- Delete/drop data mechanism. We might want to consider having the sandbox data on a separate db shard so we can just drop it. Deleting will still keep the data in the database, so the database will still be bigger than the baseline.
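To tell whether the cleanup actually returns the database to its baseline, per-table row counts taken after seeding could be compared with counts taken after a cleanup. This is a rough sketch under assumptions: the function name `size_drift`, the 5% tolerance, and the table names in the usage lines are illustrative, and how the counts are collected is left open.

```python
def size_drift(
    baseline: dict[str, int],
    current: dict[str, int],
    tolerance: float = 0.05,
) -> dict[str, float]:
    """Return tables whose row count moved more than `tolerance` from baseline.

    Values are relative changes, e.g. 0.10 means 10% more rows than baseline.
    """
    drifted = {}
    for table, base_rows in baseline.items():
        now = current.get(table, 0)
        if base_rows == 0:
            # Any rows in a table that was empty at baseline count as drift.
            change = float("inf") if now > 0 else 0.0
        else:
            change = (now - base_rows) / base_rows
        if abs(change) > tolerance:
            drifted[table] = change
    return drifted


# Usage with made-up numbers: only `issues` exceeds the 5% tolerance.
baseline = {"issues": 1000, "projects": 100}
after_cleanup = {"issues": 1100, "projects": 102}
print(size_drift(baseline, after_cleanup))
```

Row counts are only a proxy for database shape. They will not show space still held by deleted rows, which is the concern behind dropping a separate shard instead of deleting.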
@at.ramya @sliaquat @stanhu @ayufan this is the issue for the design of the test data. Let's use this as the starting point.
Edited by Eric Brinkman