# pgetu

## An extension to PostgreSQL and TimescaleDB

This library implements C extensions to PostgreSQL that allow many of the feature extraction functions from [tsfresh](https://tsfresh.readthedocs.io/en/latest/) to be run in PostgreSQL as aggregates (or as window functions). In addition, the same functions are implemented with support for TimescaleDB's `TIME_BUCKET`, to account for the uncertainty about whether the values will arrive in time order.
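
To illustrate why time order matters for some features, here is a minimal sketch in Python (the language of tsfresh). It is not pgetu's implementation: the helper `mean_abs_change` is a plain-Python stand-in modeled on the tsfresh feature of the same name, and the `(timestamp, value)` rows are made-up data. Sorting by timestamp is shown only as one possible way to remove the dependence on arrival order; how pgetu handles this internally is not stated here.

```python
def mean_abs_change(values):
    """Mean of the absolute differences between consecutive values."""
    if len(values) < 2:
        # Not defined for fewer than two values.
        return float("nan")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical (timestamp, value) rows in the order an aggregate might
# receive them, which is not guaranteed to be time order.
rows = [(3, 30.0), (1, 10.0), (2, 20.0), (4, 40.0)]

arrival_order = [value for _, value in rows]
time_order = [value for _, value in sorted(rows)]  # sort by timestamp

print(mean_abs_change(arrival_order))  # about 16.67
print(mean_abs_change(time_order))     # 10.0
```

The same four values give different results depending on the order in which they are processed, which is why an order-sensitive feature needs the rows in time order.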

## Requirements

- [etu](https://gitlab.com/tsetu/etu)
- [cmake](https://cmake.org/) version 3.10

In addition, pgetu was originally written using the [Intel oneAPI Math Kernel Library](https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html) and the included C compiler. While this compiler is no longer needed, the package will take advantage of it when it is available.

## Installation

Clone etu:

```
git clone https://gitlab.com/tsetu/etu
```

Change to the `build` directory and run `cmake`:

```
cd build
cmake ..
```

Compile the library with `make`:

```
make
```

This will create the library, `libetu.so`. The library and the include file `etu.h` can then be installed with:

```
make install
```

Next, load the relevant functions into the database. Change to the `sql` directory and connect to the database with `psql`:

```
cd sql
psql <database>
```

Then run the following scripts, in the order shown, to create the necessary types, functions, and aggregate functions:

```
\i 'create_types.sql'
\i 'create_tsfunctions.sql'
\i 'create_aggregate.sql'
```

## Notes

At this time, not all of the feature calculations from tsfresh have been implemented. Currently missing are:

- augmented Dickey-Fuller
- absolute maximum
- agg linear trend
- ar coefficient
- Benford correlation
- change quantiles
- count above t
- count below t
- CWT coefficients
- Fourier entropy
- Friedrich coefficients
- Lempel-Ziv complexity
- linear trend timewise
- matrix profile
- max Langevin fixed point
- maximum
- mean
- mean N absolute maximum
- minimum
- permutation entropy
- query similarity count
- range count
- spkt Welch density
- standard deviation
- sum values
- value count
- variance

Once they have been added to etu, they will be added to this library as well.

In addition, not all of the functions return the same values as their tsfresh equivalents. tsfresh tends to use biased statistics in its calculations, while etu uses the unbiased versions.
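
As a rough illustration of the size of such differences, the sketch below compares a biased and an unbiased variance using Python's standard `statistics` module. It assumes the discrepancy is of the familiar "divide by n" versus "divide by n - 1" kind; which functions are affected, and by how much, is not stated here, and the sample values are made up.

```python
import statistics

# Made-up sample values for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)

# Biased (population) estimator: sum of squared deviations divided by n.
biased_var = statistics.pvariance(values)

# Unbiased (sample) estimator: divided by n - 1 (Bessel's correction).
unbiased_var = statistics.variance(values)

print(biased_var)                # 4.0
print(round(unbiased_var, 4))    # 4.5714
print(unbiased_var / biased_var) # ratio is n / (n - 1)
```

The two estimates differ by a factor of n / (n - 1), so the gap shrinks as the number of values grows. PostgreSQL's built-in `var_pop` and `var_samp` aggregates make the same distinction.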