#356 Use TimescaleDB in Timeseries Endpoints

Merge Request

💣 Breaking Changes

  • InfluxDB has been replaced with TimescaleDB to persist timeseries data
  • The POSTGRES_... variables described in env.example have to be set for TimescaleDB to work properly
  • Timeseries data needs to be migrated from InfluxDB to the new database. Otherwise shepard will NOT start!
  • Update/Migration instructions can be found below
  • Some Influx specific classes have been renamed to be more generic. This does not change the API but can lead to changes in generated clients. The following classes have been renamed:
    • InfluxPoint has been renamed to DataPoint
    • SingleValuedUnaryFunction has been renamed to AggregateFunction
    • TimeseriesPayload... has been renamed to TimeseriesWithDataPoints...
  • The integral function is temporarily unavailable because it has not yet been implemented for TimescaleDB
  • Uploading a list of data points in which two or more data points share the same timestamp now returns the HTTP error InvalidBodyException with status code 400 (Bad Request).
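
Because a request containing duplicate timestamps is now rejected with status code 400, clients may want to check their payload before uploading. The following is a minimal sketch, not part of shepard or its generated clients. It assumes each data point is a dictionary with a `timestamp` key and a `value` key; the helper names and the "keep the last point" policy are hypothetical choices made for illustration.

```python
from typing import Any

# Assumption: a data point is represented as {"timestamp": ..., "value": ...}.
DataPoint = dict[str, Any]


def find_duplicate_timestamps(points: list[DataPoint]) -> list[Any]:
    """Return every timestamp that occurs more than once, in first-seen order."""
    seen = set()
    duplicates = []
    for point in points:
        ts = point["timestamp"]
        if ts in seen and ts not in duplicates:
            duplicates.append(ts)
        seen.add(ts)
    return duplicates


def drop_duplicate_timestamps(points: list[DataPoint]) -> list[DataPoint]:
    """Keep only the last data point for each timestamp.

    The result is ordered by the first occurrence of each timestamp.
    Keeping the last point is an arbitrary policy; pick whatever fits your data.
    """
    latest: dict[Any, DataPoint] = {}
    for point in points:
        latest[point["timestamp"]] = point
    return list(latest.values())
```

Calling `find_duplicate_timestamps` first makes it possible to log or inspect conflicting points before deciding whether dropping them is acceptable.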

🥅 Description

  • The timeseries endpoints now talk to TimescaleDB instead of InfluxDB.
  • The feature flag for experimental timeseries has been removed.
  • The former experimental timeseries code is no longer experimental.

Migration Instructions

  1. Update shepard to version 4.0.0 (the migration is included only in this specific version)
  2. Also update the shepard clients if you make use of them
  3. Before starting the migration, it is recommended to run a short check of the existing data for known issues.
    1. The docker-compose.yml file in the infrastructure folder contains an entry timescale-migration-preparation. It is in a separate profile, timeseries-migration, and therefore only runs when started manually. You can copy that service into your own docker-compose file or use it directly from the infrastructure folder if you have set up the environment variables correctly.
    2. Stop all shepard services (docker compose down).
    3. Run docker compose --profile timeseries-migration run timescale-migration-preparation. This starts the migration container together with all necessary dependencies (InfluxDB, Neo4j) and opens a bash shell within the container.
    4. Wait until all containers have started (around 10 s).
    5. Within the container, run python check.py to analyze the database.
    6. If necessary, run python fix.py to fix any problems (can take considerable time).
    7. To exit the container, type exit.
    8. Stop all other containers using docker compose down.
    9. Continue with the shepard TimescaleDB migration.
  4. In your .env file, set SHEPARD_MIGRATION_MODE_ENABLED=true
  5. Start shepard to run the migrations (docker compose up)
  6. The migration state can be retrieved via the /temp/migrations/state REST endpoint
  7. Stop shepard once the migrations succeeded and the endpoint returns success
  8. If any errors occur during the migration, you have to fix them and run the migration again. A retry mechanism repeats the migration for all containers that were not successfully migrated.
  9. The migration mode flag can now be removed from the .env file or set to false (default)
  10. Start shepard again. It will now run with all data migrated to TimescaleDB.
  11. Shepard will not start unless all containers have been migrated successfully. This behavior is by design to prevent data loss. You have to finish the migration first.
  12. Cleanup
    1. InfluxDB and Chronograf docker containers can be stopped with docker compose down influxdb chronograf
    2. docker-compose.yml can be cleaned up by removing INFLUX_HOST, INFLUX_USERNAME and INFLUX_PASSWORD from the backend service and by removing the influxdb and chronograf services
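
Steps 6 and 7 above amount to polling the /temp/migrations/state endpoint until it reports success. The following is a minimal sketch of such a polling loop. The response format of the endpoint is not described here, so the caller supplies the `is_done` predicate; the base URL, the polling interval and both function names are hypothetical, and the sketch does not handle any authentication your instance may require.

```python
import time
import urllib.request
from typing import Callable


def fetch_state(base_url: str) -> str:
    """Return the raw response body of the migration state endpoint.

    Assumption: base_url is the root of your shepard backend API and the
    endpoint is reachable without extra headers. Adjust as needed.
    """
    url = f"{base_url}/temp/migrations/state"
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def wait_for_migration(
    fetch: Callable[[], str],
    is_done: Callable[[str], bool],
    interval_s: float = 30.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until is_done(body) is true; return False if attempts run out."""
    for attempt in range(max_attempts):
        body = fetch()
        if is_done(body):
            return True
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False
```

A call could look like `wait_for_migration(lambda: fetch_state("http://localhost:8080"), my_predicate)`, where the URL is a placeholder and `my_predicate` encodes however your instance reports success. Passing `fetch` and `sleep` as parameters keeps the loop testable without a running backend.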

Further notes:

  • One of the next versions (probably 4.1.0) will remove the migration mode. From then on, migrating requires updating to 4.0.0 first and updating to the current version afterwards.
  • shepard will not start until the migration has finished without errors
  • In case of errors:
    • Fix the problem first
    • Restart shepard; it will automatically repeat the migration for containers that were not successfully migrated
    • The whole migration can be re-run by deleting the data in TimescaleDB (or by removing the timescale container including its volumes)
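
The upgrade rule from the first note can be expressed as a small helper. This is a sketch under stated assumptions: versions are plain `major.minor.patch` strings, the migration mode exists only in 4.0.0 (the note says its removal in 4.1.0 is probable, not certain), and `upgrade_path` is a hypothetical name that is not part of shepard.

```python
# Assumption: the migration mode is only available in this version.
MIGRATION_VERSION = (4, 0, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain 'major.minor.patch' string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def upgrade_path(current: str, target: str) -> list[str]:
    """Return the versions to install, in order, to get from current to target.

    If the upgrade crosses 4.0.0, that version is inserted as a mandatory
    intermediate stop so the TimescaleDB migration can run.
    """
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        raise ValueError("target version must be newer than current version")
    if cur < MIGRATION_VERSION < tgt:
        return ["4.0.0", target]
    return [target]
```

For example, an instance on a 3.x release that wants a release after 4.0.0 gets 4.0.0 as an intermediate stop, while an instance already on 4.0.0 can update directly.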

📓 Checklist

  • The code compiles without any warnings.
  • I followed the code review checklist.
  • The documentation has been added/updated.

🔗 Related Issues

Edited by Maximilian Heykeroth
