#356 Use TimescaleDB in Timeseries Endpoints
Merge Request
💣 Breaking Changes
- InfluxDB has been replaced with TimescaleDB to persist timeseries data
- The `POSTGRES_...` variables described in `env.example` have to be set for TimescaleDB to work properly
- Timeseries data needs to be migrated from InfluxDB to the new database. Otherwise shepard will NOT start!
  - Update/Migration instructions can be found below
- Some Influx specific classes have been renamed to be more generic. This does not change the API but can lead to changes in generated clients. The following classes have been renamed:
  - `InfluxPoint` has been renamed to `DataPoint`
  - `SingleValuedUnaryFunction` has been renamed to `AggregateFunction`
  - `TimeseriesPayload...` has been renamed to `TimeseriesWithDataPoints...`
- The integral function is not available for the moment since it needs to be implemented for TimescaleDB
- When uploading a list of data points, which contains two or more data points that share the same timestamp value, an HTTP error `InvalidBodyExecption` with status code 400 (Bad Request) is returned.
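Since uploads with repeated timestamps are now rejected with a 400, clients may want to check their payload before sending it. The following is a minimal sketch of such a client-side check, not part of shepard itself. It assumes each data point is a dict with a `timestamp` key (the `value` key and the helper names are hypothetical, chosen only for illustration), and it assumes that keeping the last point per timestamp is an acceptable policy for your data.

```python
from collections import Counter


def find_duplicate_timestamps(points):
    """Return the timestamps that occur more than once, in first-seen order."""
    counts = Counter(p["timestamp"] for p in points)
    return [ts for ts, n in counts.items() if n > 1]


def deduplicate(points):
    """Keep only the last data point per timestamp (assumed policy)."""
    latest = {}
    for p in points:
        # A later point with the same timestamp overwrites the earlier one,
        # while the dict keeps the position of the first occurrence.
        latest[p["timestamp"]] = p
    return list(latest.values())


if __name__ == "__main__":
    # Hypothetical payload; the field names are assumptions.
    payload = [
        {"timestamp": 1000, "value": 1.0},
        {"timestamp": 2000, "value": 2.0},
        {"timestamp": 1000, "value": 3.0},
    ]
    print(find_duplicate_timestamps(payload))
    print(deduplicate(payload))
```

Whether to keep the first point, keep the last point, or reject the batch locally depends on the data source, so the policy in `deduplicate` should be treated as a placeholder.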
🥅 Description
- The timeseries endpoints now talk to TimescaleDB instead of InfluxDB.
- Feature Flag removed for experimental timeseries.
- Experimental timeseries code is not experimental anymore.
Migration Instructions
- Update shepard to version 4.0.0 (migration is only part of this specific version)
  - Also update the shepard clients if you make use of them
- Before starting the migration, it is recommended to run a short check of the existing data for known issues.
  - The `docker-compose.yml` file in the `infrastructure` folder contains an entry `timescale-migration-preparation`. It is in a separate profile `timeseries-migration` and thus will only be executed by manual interaction. You can take over that service into your own docker-compose file or directly use it from the infrastructure folder if you have set up the environment variables correctly.
  - Stop all shepard services (`docker compose down`).
  - Run `docker compose --profile timeseries-migration run timescale-migration-preparation` to start the migration container. This will start the migration container, as well as all necessary dependencies (InfluxDB, Neo4j), and open a bash shell within the container.
  - Wait until all containers have started (around 10s).
  - Within the container, run `python check.py` to analyze the database.
  - If necessary, run `python fix.py` to fix any problems (can take considerable time).
  - To exit the container, type `exit`.
  - Stop all other containers using `docker compose down`
  - Continue with shepard TimescaleDB migration
- In your `.env` file, set `SHEPARD_MIGRATION_MODE_ENABLED=true`
- Start shepard to run the migrations (`docker compose up`)
- The migration state can be retrieved via the `/temp/migrations/state` REST endpoint
- Stop shepard once the migrations succeeded and the endpoint returns success
  - If there are any errors during migration you have to fix them and run the migration again. There is a retry mechanism that will repeat the migration for all containers that were not successfully migrated.
- The migration mode flag can now be removed from the `.env` file or set to false (default)
- Start shepard again. It will now run with all data migrated to TimescaleDB.
  - Shepard will not start if not all containers are successfully migrated. This behavior is by design to prevent data loss. You have to finish the migration first.
- Cleanup
  - InfluxDB and Chronograf docker containers can be stopped with `docker compose down influxdb chronograf`
  - `docker-compose.yml` can be cleaned up by removing `INFLUX_HOST`, `INFLUX_USERNAME` and `INFLUX_PASSWORD` from the `backend` service and the `influxdb` and `chronograf` services
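Waiting for the `/temp/migrations/state` endpoint to report success can be scripted. The sketch below only illustrates the polling loop. The base URL, the helper names, and the shape of the JSON response are assumptions, because the response format is not described here. For that reason the caller has to supply the `is_finished` predicate after inspecting what the endpoint actually returns. Authentication, if your instance requires it, is not shown.

```python
import json
import time
import urllib.request


def fetch_state(url):
    """GET the given URL and return the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_migration(fetch, is_finished, interval_s=30.0,
                       max_attempts=120, sleep=time.sleep):
    """Call fetch() repeatedly until is_finished(state) is true.

    Returns the last state on success and raises TimeoutError if the
    predicate is still false after max_attempts polls.
    """
    for attempt in range(max_attempts):
        state = fetch()
        if is_finished(state):
            return state
        if attempt + 1 < max_attempts:
            sleep(interval_s)  # no need to sleep after the final attempt
    raise TimeoutError(f"migration not finished after {max_attempts} polls")


# Example usage (not executed here). BASE_URL and the predicate are
# hypothetical and must be adapted to your deployment and to the actual
# response of the endpoint:
#
#   BASE_URL = "http://localhost:8080"
#   state = wait_for_migration(
#       fetch=lambda: fetch_state(BASE_URL + "/temp/migrations/state"),
#       is_finished=lambda s: ...,  # inspect the real response first
#   )
```

Passing `fetch` and `is_finished` as callables keeps the loop independent of the response format and lets it be exercised without a running shepard instance.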
Further notes:
- With one of the next versions (probably 4.1.0) we will remove the migration mode. To migrate after that, you first have to update to 4.0.0, run the migration there, and then update to the current version.
- shepard will not start unless the migration has finished without errors
- In case of errors:
  - fix the problem first
  - restart shepard, it will automatically repeat the migration for containers that were not successfully migrated
  - the whole migration can be re-run by deleting the data in TimescaleDB (or removing the timescale container including its volumes).
📓 Checklist
- The code compiles without any warnings.
- I followed the code review checklist.
- The documentation has been added/updated.
🔗 Related Issues
- Related #356 (closed)
Edited by Maximilian Heykeroth