Use case for required data integration functions
Note: Issue still in work
User stories
As an energy supplier, I would like to analyze the district heating network, bring together all the necessary data, and transfer it into a suitable structure.
Source Systems
Basic Data: The basic data for a production plant is available in an Excel file. The plants and their components are listed in the table as entries with unique IDs. The id of every entry is prefixed with a p.
Operating Data: The operating data for a plant is exported daily from the source system as a CSV file and uploaded to an FTP server. The plant IDs in these CSV files are not prefixed with a p.
Remote meters: All remotely transferred meters are stored in an SQL database. Because this system does not recognize plants as such, the plant are stored there as customers with their meter readings. In the source system, a meter can have multiple customers, but this is not used in practice because a meter is always assigned to exactly one system.
Manually read meters: All manually read meters are stored in a system that writes the data to two MySQL databases. The information about the meters is stored in one database and the information about the readings in a second database. Due to this structure, there is no relationship at the database level, i.e., no defined primary key and foreign key, even though these exist in terms of content.
Important
Apart from the features described above, the remaining information in the source systems corresponds to the database schema of the target system. For example, the operating hours files have an id, operating_hours, starts, and plant_id, whereby only the plant_id is missing the prefix p.
Target System
Description of needs and purpose
Basic Data:
- It should be possible to import an Excel file at regular intervals.
- It should be possible to write all plants and all parts from the Excel file into a table. Plants are only those that do not have a type, and parts are those with a type.
- It should be possible to write the nets to a separate table, with each net occurring only once.
Operating Data
- It should be possible to import the latest CSV files from an FTP server on a daily basis.
- It should be possible to import only certain files from the directory using a regular expression.
- It should be possible to prefix values within the CSV, as the plant ID is stored in this CSV without the prefix
p.
Remote Meters
- It should be possible to combine two fields from different tables into one field in a new table.
- It should be possible to combine two fields from different tables into one field in a new table even if they are not directly joined
- It should be possible to adjust the relation of the source system from a
many to manyrelation to aone to manyrelation.
Manually read meters
- It should be possible to join values of two different tables in two different databases
- It should be possible to create a table from a source table under certain conditions. For example, it should be possible to create a table called “CurrentMeter” containing all current measurements for a specific location. This ensures that old meters that have already been replaced do not appear there.




