Design of initial data for performance testbed and test data created from test runs
For our on-prem performance testbed, we should have a way to separate static data from the incremental data generated by tests.
Static data should be the baseline. This ensures that, as time progresses, we test against a consistent database shape and that our data is not continuously growing. If the database keeps getting bigger, performance today may differ from performance 6-12 months from now, and any improvement or degradation would not be an accurate comparison with prior results.
Static data: This is the initial data set up in the database. It should be static and large enough to satisfy the SQL data shapes that we need.
- Will be set up in a seed state and will remain static
- Will be used in a read-only fashion. Tests will not create more data in these projects and groups.
Incremental data: This is data generated dynamically from our test runs. Data here is mainly used for traffic generation and functional usage load.
- New projects, groups, issues, and merge requests that get created or set up as part of an automated test
- This data should be in a separate project / group from the above
- Allow easy delete/drop to maintain the data size baseline.
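One way to enforce the static/incremental split in test code is a small guard that every test helper calls before it creates data. This is only a sketch: the group names `perf-static` and `perf-sandbox` and the functions `classify` and `assert_writable` are hypothetical and not part of any existing tooling.

```python
# Hypothetical group paths; replace with the real seed and sandbox groups.
STATIC_ROOTS = ("perf-static",)   # seeded baseline data, read-only for tests
SANDBOX_ROOT = "perf-sandbox"     # incremental data created by test runs


def _is_under(path: str, root: str) -> bool:
    """True if path is the root itself or nested below it."""
    return path == root or path.startswith(root + "/")


def classify(path: str) -> str:
    """Classify a group/project path as 'static', 'sandbox', or 'unknown'."""
    path = path.strip("/")
    if any(_is_under(path, root) for root in STATIC_ROOTS):
        return "static"
    if _is_under(path, SANDBOX_ROOT):
        return "sandbox"
    return "unknown"


def assert_writable(path: str) -> str:
    """Return the path if tests may create data there, otherwise raise."""
    kind = classify(path)
    if kind != "sandbox":
        raise PermissionError(
            f"tests may only write under {SANDBOX_ROOT!r}; got {path!r} ({kind})"
        )
    return path
```

A test helper would call `assert_writable("perf-sandbox/run-42/new-project")` before creating a project, and would fail fast if pointed at `perf-static/...`.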
A simple outline is shown below.
- Identify the static data needed. This can be set up via project import. This data is meant to be static, so cleanup is not a priority.
- Import data into the environment
- Set up a sandbox area for incremental data generated from tests
- Ensure that tests are outputting data/artifacts into this sandbox area (group/project)
- Delete/drop data mechanism. We might want to consider having the sandbox data on a separate DB shard so we can just drop it. Deleting will still keep the data in the database, hence the database shape will still be bigger than the baseline.
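To confirm that cleanup actually returns the environment to its baseline, a check like the following could run after each delete/drop. It is a minimal sketch under stated assumptions: the function `size_drift`, the table names, and the 2% tolerance are all hypothetical, and the per-table row counts would have to be collected separately. It compares row counts only, so it would not detect deleted data that still occupies space in the database.

```python
def size_drift(baseline: dict, current: dict, tolerance: float = 0.02) -> dict:
    """Return {table: row_delta} for tables that moved away from the baseline.

    baseline and current map table names to row counts. A table is reported
    when its count differs from the baseline by more than `tolerance`
    (a fraction of the baseline count). Tables missing from `current` are
    treated as empty; tables not in `baseline` are ignored.
    """
    drifted = {}
    for table, base_rows in baseline.items():
        now_rows = current.get(table, 0)
        allowed = base_rows * tolerance
        if abs(now_rows - base_rows) > allowed:
            drifted[table] = now_rows - base_rows
    return drifted


# Illustrative numbers only, not measurements.
baseline = {"projects": 1000, "issues": 50000}
after_cleanup = {"projects": 1010, "issues": 56000}
drift = size_drift(baseline, after_cleanup)
```

With these illustrative counts, `projects` stays within tolerance and `issues` is flagged, which would indicate that sandbox data was left behind or that tests wrote into the static area.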