
ML Interface Engine (InterfaceFinal branch)

I am opening this issue to follow up on our earlier discussion and to list quick changes for the branch InterfaceFinal.

Branch InterfaceFinal:

  1. In ml/engine.py. The attribute isBest was renamed to best_score, but the function test() still refers to self.isBest. It needs to be updated to self.best_score.

  2. In ml/loss/loss.py. Rename the function log_loss to store_loss or write_loss. In ML, the term log_loss often refers to the logistic loss (logit loss), so the current name is misleading.

  3. In ml/metrics/metrics.py. To follow the same nomenclature, rename log_metric to store_metric or write_metric.

  4. Python 3 syntax: instead of super(sub_class, self).__init__(*args, **kwargs), use super().__init__(*args, **kwargs).

  5. In ml/utils/*. What is the purpose of the files preprocess.py, postprocess.py and vis_utils.py?

  6. In ml/utils/config_parser.py. Remark on the getter and setter functions: please refer to Romain's note [!15 (comment 420330969)].

  7. ml/metrics/metrics_custom.py could be merged into ml/metrics/metrics.py. The same applies to ml/loss/loss_custom.py and ml/loss/loss.py.

  8. In ml/models/*. unet.py and unet_dynamic_prune.py differ in the use of the pruning percent stored in self.scale, which is injected as the output_size of the ConvNet blocks. We can merge unet.py and unet_dynamic_prune.py and keep the use of the pruning percent as a boolean option in the configuration parameters.

  9. When using existing functions or implementations from the literature, the comment section of each function needs to cite the source (for instance, the original paper or book describing the method or algorithm).

  10. In ml/data/*:

  • The function _split_set in coronalholes_testing.py contains an unused variable c1.
  • ml/data/coronalholes_testing.py and ml/data/coronalholes.py are duplicates, apart from the imported packages.
  • What is the purpose of ml/data/augmenter.py?
  11. In ml/data/augmentations/*. ml/data/augmentations/ch_augs.py duplicates the class defined in ml/data/augmentations/coronalholes.py.

  12. In ml/utils/config_parser. The keys of the data dictionary depend on the entries listed in the YAML files. An initialization procedure may be needed to set all of the properties used in the code to default values (None or 0).

  13. In ml/data/omniwebdata.py. The scaler is never stored. It should be kept so that predictions can be transformed back into the original scale using sklearn.preprocessing.StandardScaler.inverse_transform().

  14. In ml/data/*. Data should be split into TRAIN/VALIDATION/TEST sets. For instance, the use case "../configs/config_omni_web.yml", run through "aidapy/ml/cli.py", should use the engine train() and test() functions, evaluated on the training and the validation datasets. Branch "Interface_v1.0" mentions "validation procedures" in the engine test().
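
To illustrate the TRAIN/VALIDATION/TEST remark above, here is a minimal sketch of an order-preserving split. The helper name split_dataset and the fractions are hypothetical and not taken from the branch. No shuffling is done, on the assumption that the data are time-ordered and the most recent segment should be held out.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    samples: Sequence[T],
    val_fraction: float = 0.15,
    test_fraction: float = 0.15,
) -> Tuple[Sequence[T], Sequence[T], Sequence[T]]:
    """Split samples into train/validation/test parts, preserving order.

    Hypothetical helper, not part of the branch. The last segment is the
    test set, the one before it the validation set, the rest is training.
    """
    if not 0.0 <= val_fraction < 1.0 or not 0.0 <= test_fraction < 1.0:
        raise ValueError("fractions must be in [0, 1)")
    if val_fraction + test_fraction >= 1.0:
        raise ValueError("val_fraction + test_fraction must be < 1")

    n = len(samples)
    n_test = int(n * test_fraction)
    n_val = int(n * val_fraction)
    n_train = n - n_val - n_test  # whatever is left goes to training

    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


# Example with placeholder data: 100 samples, 20% validation, 10% test.
train, val, test = split_dataset(list(range(100)), 0.2, 0.1)
print(len(train), len(val), len(test))
```

With such a split, any scaler would be fitted on the training part only and reused unchanged on the validation and test parts.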

Validated changes from "Interface_v1.0" should be moved into "Interface_final".
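
Regarding the config_parser remark on default values above, one possible initialization procedure is sketched below. The key names in DEFAULTS and the function with_defaults are placeholders for illustration only; the real keys have to come from the project's YAML files.

```python
import copy
from typing import Any, Dict

# Hypothetical defaults: placeholder keys, set to None or 0 as suggested.
DEFAULTS: Dict[str, Any] = {
    "data": {"batch_size": 0, "scaler": None},
    "model": {"name": None},
}


def with_defaults(parsed: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which keys missing from `parsed` fall back to `defaults`.

    Nested dictionaries are merged recursively; neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in parsed.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = with_defaults(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# `parsed` stands in for the dictionary produced by the YAML loader.
parsed = {"data": {"batch_size": 32}}
config = with_defaults(parsed, DEFAULTS)
print(config)
```

Every property read by the code is then guaranteed to exist, whether or not the YAML file mentions it.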

Edited by Sara Jamal