Create wrapper class for handling fit data normalization and constraints

Description

Some optimization methods, specifically ARDR, perform much better if the fit data is normalized and centered before it is passed to the optimizer. This is currently possible but inconvenient. Applications in both icet and hiphive also need to impose constraints. Both situations can be handled via an intermediate class as suggested by @angqvist and @erikfransson.

Background

Current procedure

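# retrieve the fit data (fit matrix A and target vector y)
A, y = sc.get_fit_data(key='metal_spacing_01')
# center and scale the target vector by hand
y_mean_01 = y.mean()
y -= y_mean_01
y_scale_01 = 1/y.std()
y *= y_scale_01
# validate and train on the normalized data
cve = CrossValidationEstimator((A, y))
cve.validate()
cve.train()
# undo the scaling; the shift is added to the first parameter,
# which assumes that the first column of A is constant (zerolet)
parameters = cve.parameters
parameters /= y_scale_01
parameters[0] += y_mean_01
ce = ClusterExpansion(cs, parameters)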
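
As a sanity check of the back-transformation above, the following sketch reproduces the manual procedure on synthetic data. NumPy's ordinary least squares stands in for CrossValidationEstimator, and the data, the variable names, and the choice of solver are assumptions made for illustration, not part of icet. For ordinary least squares the back-transformed parameters should coincide with a direct fit as long as the first column of A is constant; with regularized optimizers such as ARDR the two fits will generally differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fit data (illustrative only). The first column of A is constant,
# which is the assumption behind `parameters[0] += y_mean_01`.
n_structures, n_parameters = 40, 5
A = np.hstack([np.ones((n_structures, 1)),
               rng.normal(size=(n_structures, n_parameters - 1))])
true_parameters = np.array([3.0, 0.5, -1.2, 0.0, 2.0])
y = A @ true_parameters + 0.01 * rng.normal(size=n_structures)

# Reference: fit the raw data directly.
p_direct, *_ = np.linalg.lstsq(A, y, rcond=None)

# Manual procedure: center and scale the target, fit, then transform back.
y_mean = y.mean()
y_scale = 1 / y.std()
y_normalized = (y - y_mean) * y_scale
p_normalized, *_ = np.linalg.lstsq(A, y_normalized, rcond=None)
p_back = p_normalized / y_scale
p_back[0] += y_mean

# Largest difference between the two parameter vectors.
max_deviation = np.max(np.abs(p_back - p_direct))
print(max_deviation)
```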

Suggested solution

Pass the fit data through a FitDataTransformer object/class.

Normalize data

fdt = FitDataTransformer(sc.get_fit_data(key='metal_spacing_01'), normalize_target=True, ...)
cve = CrossValidationEstimator(fdt.get_fit_data())
cve.validate()
cve.train()
ce = ClusterExpansion(cs, fdt.transform_parameters(cve.parameters))
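
To make the proposal concrete, here is a minimal sketch of what the normalization part of such a class could look like. Only the names FitDataTransformer, normalize_target, get_fit_data, and transform_parameters come from the usage above; the internals, the attribute names, and the error handling are assumptions, and the sketch is not the icet implementation. It also assumes that the first column of A is constant, as in the current procedure.

```python
import numpy as np


class FitDataTransformer:
    """Hypothetical sketch of the proposed class (normalization only)."""

    def __init__(self, fit_data, normalize_target=False):
        A, y = fit_data
        # Work on copies so that the caller's arrays are left untouched.
        self._A = np.array(A, dtype=float)
        self._y = np.array(y, dtype=float)
        self._y_mean = 0.0
        self._y_scale = 1.0
        if normalize_target:
            std = self._y.std()
            if std == 0:
                raise ValueError('target vector has zero variance')
            self._y_mean = self._y.mean()
            self._y_scale = 1.0 / std

    def get_fit_data(self):
        """Return A and the centered and scaled target vector."""
        y = (self._y - self._y_mean) * self._y_scale
        return self._A.copy(), y

    def transform_parameters(self, parameters):
        """Map parameters fitted to the normalized target back to the
        original units (assumes the first column of A is constant)."""
        p = np.array(parameters, dtype=float) / self._y_scale
        p[0] += self._y_mean
        return p


# Usage with ordinary least squares standing in for the optimizer.
rng = np.random.default_rng(1)
A = np.hstack([np.ones((30, 1)), rng.normal(size=(30, 3))])
y = A @ np.array([10.0, 1.0, -2.0, 0.5])

fdt = FitDataTransformer((A, y), normalize_target=True)
A_fit, y_fit = fdt.get_fit_data()
p_fit, *_ = np.linalg.lstsq(A_fit, y_fit, rcond=None)
parameters = fdt.transform_parameters(p_fit)
```

Keeping the mean and scale inside the object means the user no longer has to carry variables such as y_mean_01 and y_scale_01 between the fit and the construction of the cluster expansion.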

Impose constraints

fdt = FitDataTransformer(sc.get_fit_data(key='metal_spacing_01'), normalize_target=True, ...)
fdt.add_constraint(constraint_matrix1, weight=100)
fdt.add_constraint(constraint_matrix2, analytical=True)

cve = CrossValidationEstimator(fdt.get_fit_data())
cve.validate()
cve.train()
ce = ClusterExpansion(cs, fdt.transform_parameters(cve.parameters))
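
The issue does not specify how the two kinds of constraints would be realized. One possible reading, sketched below under stated assumptions, is that a weighted constraint appends weight * C as extra rows with zero targets (a soft constraint), while analytical=True restricts the fit to the null space of C so that C @ p = 0 holds exactly. The function names, the restriction to homogeneous constraints, and the use of ordinary least squares are all assumptions, and normalization is left out to keep the sketch short.

```python
import numpy as np


def apply_weighted_constraint(A, y, C, weight):
    """Soft constraint: append weight * C as rows with zero targets."""
    A_aug = np.vstack([A, weight * C])
    y_aug = np.concatenate([y, np.zeros(C.shape[0])])
    return A_aug, y_aug


def null_space_basis(C, tol=1e-12):
    """Orthonormal basis N with C @ N = 0, obtained from the SVD of C."""
    _, s, vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


# Synthetic data; the constraint demands that parameters 1 and 2 are equal.
rng = np.random.default_rng(2)
A = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 3))])
y = A @ np.array([1.0, 2.0, 2.0, -0.5]) + 0.05 * rng.normal(size=50)
C = np.array([[0.0, 1.0, -1.0, 0.0]])

# Unconstrained reference fit.
p_free, *_ = np.linalg.lstsq(A, y, rcond=None)

# Soft constraint: violations are penalized but not eliminated.
A_soft, y_soft = apply_weighted_constraint(A, y, C, weight=100)
p_soft, *_ = np.linalg.lstsq(A_soft, y_soft, rcond=None)

# Exact constraint: fit reduced parameters q, then map back via p = N @ q.
N = null_space_basis(C)
q, *_ = np.linalg.lstsq(A @ N, y, rcond=None)
p_hard = N @ q
```

In the null-space variant the optimizer sees a matrix with fewer columns, so a transform_parameters step is needed to recover the full parameter vector, which fits the interface proposed above.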

Demo

  • FitDataTransformer class implemented for both normalization and constraints
  • new code is fully tested and documented (docstrings)
  • new code is type hinted (#362)
  • new module integrated in user guide (under doc/source/module_ref)
  • new tutorial section

2019/03/15 @erhart rewrote issue description after EXPLORE was resolved

Edited by Paul Erhart