
Add configuration file feature for learn

deck requested to merge deckbsd/polaris:add-ai-process-config-file into master

This work is part of what is needed for #66 (closed).

This work lays the groundwork for letting users supply a configuration file for AI processes. Here it covers the learn part (cross correlation), but the idea is to do the same for anomaly, so that this feature is handled in a coherent, general way across Polaris.

Main principles:

  • A parameters object that wraps every parameter needed by a Polaris AI process.
  • A configurator object that handles the configuration logic (default and custom) in one place.
  • The parameters generated by the configurator are passed to the AI process (cross correlation in this case).
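
The three principles above can be sketched roughly as follows. This is only an illustration, not the code in this merge request: the class names `CrossCorrelationParameters` and `CrossCorrelationConfigurator`, the method `get_configuration`, and the choice to let a custom value replace the default wholesale are all assumptions. The default values are taken from the example configuration file in this description.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


def _default_model_params() -> Dict[str, Any]:
    # Defaults copied from the "model_params" section of the example file.
    return {
        "objective": "reg:squarederror",
        "n_estimators": 80,
        "learning_rate": 0.1,
        "n_jobs": -1,
        "max_depth": 8,
    }


@dataclass
class CrossCorrelationParameters:
    """Hypothetical parameters object wrapping what the learn process needs."""
    use_gridsearch: bool = False
    random_state: int = 42
    test_size: float = 0.2
    model_params: Dict[str, Any] = field(default_factory=_default_model_params)


class CrossCorrelationConfigurator:
    """Hypothetical configurator: the only place that knows about defaults."""

    def __init__(self, config_file: Optional[str] = None) -> None:
        self._config_file = config_file

    def get_configuration(self) -> CrossCorrelationParameters:
        params = CrossCorrelationParameters()
        if self._config_file is None:
            return params  # default configuration
        custom = json.loads(Path(self._config_file).read_text())
        for key, value in custom.items():
            if not hasattr(params, key):
                raise ValueError(f"Unknown configuration key: {key}")
            # Assumption: a custom value replaces the default entirely.
            setattr(params, key, value)
        return params
```

With this shape, the AI process receives a ready-made parameters object and never reads a file itself, for example `CrossCorrelationConfigurator("xcorr_cfg.json").get_configuration()`.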

Why these principles?

I found that in the current version of the code it is difficult to know which default parameters are used for XCorr. The configuration is spread across different places in the code, and XCorr shouldn't actually handle the configuration part; it should only use the parameters we give it. So I created a configurator class to remove all the configuration code from XCorr (making the class easier to read through) and to keep all the configuration in one place. If you want to know which default parameters we use for XCorr, just look at the configurator; it should be clearer.

Configuration file for the cross correlation process:

Here is an example configuration file:

{
        "use_gridsearch": false,
        "random_state": 42,
        "test_size": 0.2,
        "gridsearch_scoring": "neg_mean_squared_error",
        "gridsearch_n_splits": 18,
        "dataset_cleaning_params": {
            "col_max_na_percentage": 100,
            "row_max_na_percentage": 100
        },
        "model_cpu_params": {
            "objective": "reg:squarederror",
            "n_estimators": 81,
            "learning_rate": 0.9,
            "n_jobs": 1,
            "predictor": "cpu_predictor",
            "tree_method": "auto",
            "max_depth": 10
        },
        "model_params": {
            "objective": "reg:squarederror",
            "n_estimators": 80,
            "learning_rate": 0.1,
            "n_jobs": -1,
            "max_depth": 8
        }
}

Of course, in the model_params (and model_cpu_params) section we can add any parameter supported by xgboost (that's why I left it as a dict). I just took the default configuration as an example.
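
To illustrate why a plain dict is convenient here, the following minimal sketch forwards every key of `model_params` to a model constructor as a keyword argument, so new xgboost parameters need no code change. The helper `build_model` is hypothetical, and `dict` is used as a stand-in factory so the sketch runs without xgboost installed; with xgboost available, a constructor such as `xgboost.XGBRegressor` could be passed instead (whether Polaris uses that exact class is an assumption).

```python
from typing import Any, Callable, Dict


def build_model(model_params: Dict[str, Any], factory: Callable[..., Any]) -> Any:
    """Forward every entry of model_params to the factory as a keyword argument."""
    return factory(**model_params)


# Values copied from the "model_params" section of the example file.
model_params = {
    "objective": "reg:squarederror",
    "n_estimators": 80,
    "learning_rate": 0.1,
    "n_jobs": -1,
    "max_depth": 8,
}

# Stand-in factory (assumption): dict simply collects the keyword arguments.
model = build_model(model_params, dict)
```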

Configuration file for batch to run a learn (cross correlation in this case) process:

Here is an example of a batch config file that customizes the learn parameters:

{
  "file_layout": {
    "root_dir": "/tmp/polaris"
  },
  "satellite": {
    "name": "LightSail-2",
    "_comment": "Fields that begin with an underscore are ignored but preserved.  _comment is suggested as the default way to include comments anywhere they might be needed.",
    "batch": {
      "fetch": false,
      "learn": true,
      "viz": false
    },
    "learn": {
      "configuration_file": "/home/deckbsd/repos/xcorr_cfg.json",
      "input_file": "/tmp/outm.json",
      "output_graph_file": "/tmp/graph.json"
    }
  }
}

(Every argument that you can use directly with learn can be overridden in the file. Here I only set the file parameters, but you can also add the graph link threshold, and so on.)
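
As a rough sketch of how such a batch file could be consumed, the snippet below drops the underscore-prefixed fields (which the `_comment` above says are ignored) and returns the learn section only when the learn step is enabled. The helper names `strip_ignored` and `learn_settings` are hypothetical and are not taken from the Polaris code.

```python
from typing import Any, Dict


def strip_ignored(node: Any) -> Any:
    """Return a copy of node without dict keys that begin with an underscore."""
    if isinstance(node, dict):
        return {
            key: strip_ignored(value)
            for key, value in node.items()
            if not key.startswith("_")
        }
    if isinstance(node, list):
        return [strip_ignored(value) for value in node]
    return node


def learn_settings(batch_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the learn section if the learn step is enabled, else an empty dict."""
    satellite = strip_ignored(batch_config).get("satellite", {})
    if not satellite.get("batch", {}).get("learn", False):
        return {}
    return satellite.get("learn", {})


# Same structure as the batch config example above.
batch_config = {
    "file_layout": {"root_dir": "/tmp/polaris"},
    "satellite": {
        "name": "LightSail-2",
        "_comment": "ignored by the processing code",
        "batch": {"fetch": False, "learn": True, "viz": False},
        "learn": {
            "configuration_file": "/home/deckbsd/repos/xcorr_cfg.json",
            "input_file": "/tmp/outm.json",
            "output_graph_file": "/tmp/graph.json",
        },
    },
}

settings = learn_settings(batch_config)
```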

Readme:

  • The README file has been updated; I added a "Configuring Polaris" section.
  • I updated the polaris_config.json.EXAMPLE file by adding a learn example.
Edited by deck
