Create wrapper class for handling fit data normalization and constraints
Description
Some optimization methods, ARDR in particular, perform much better when the fit data is normalized and centered before being passed to the optimizer. This is currently possible but inconvenient, because the transformation has to be applied, and later undone on the fitted parameters, by hand. Applications in both icet and hiphive also need to impose constraints. Both situations can be handled by an intermediate class, as suggested by @angqvist and @erikfransson.
Background
Current procedure
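# sc (structure container) and cs (cluster space) are assumed to be defined elsewhere
A, y = sc.get_fit_data(key='metal_spacing_01')
# center and scale the target vector by hand
y_mean_01 = y.mean()
y -= y_mean_01
y_scale_01 = 1/y.std()
y *= y_scale_01
# fit on the normalized data
cve = CrossValidationEstimator((A, y))
cve.validate()
cve.train()
# undo the normalization: rescale all parameters, shift only the first one
parameters = cve.parameters
parameters /= y_scale_01
parameters[0] += y_mean_01
ce = ClusterExpansion(cs, parameters)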
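The back-transformation in the last lines is only exact if the first column of A is constant and equal to one, so that shifting the target changes the first parameter alone. The NumPy-only sketch below checks this on synthetic data. The matrix, the parameter values, and the use of ordinary least squares in place of CrossValidationEstimator are illustrative assumptions and not part of icet.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fit matrix; the constant first column is an assumption that
# mirrors the role of parameters[0] in the snippet above.
A = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 4))])
true_parameters = np.array([3.0, 0.5, -1.0, 2.0, 0.0])
y = A @ true_parameters

# Normalize the target in the same way as the manual procedure.
y_mean = y.mean()
y_scale = 1 / y.std()
y_normalized = (y - y_mean) * y_scale

# Fit on the normalized target (ordinary least squares, for illustration only).
parameters, *_ = np.linalg.lstsq(A, y_normalized, rcond=None)

# Undo the normalization: rescale all parameters, shift only the constant term.
parameters = parameters / y_scale
parameters[0] += y_mean

# The original parameters are recovered.
assert np.allclose(parameters, true_parameters)
```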
Suggested solution
Pass the fit data through a FitDataTransformer object, which applies the transformation and later maps the fitted parameters back to the original scale.
Normalize data
fdt = FitDataTransformer(sc.get_fit_data(key='metal_spacing_01'), normalize_target=True, ...)
cve = CrossValidationEstimator(fdt.get_fit_data())
cve.validate()
cve.train()
ce = ClusterExpansion(cs, fdt.transform_parameters(cve.parameters))
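To make the proposed interface more concrete, here is a minimal sketch of what such a class could look like for the normalization case. It is not an existing icet class. Only the names FitDataTransformer, normalize_target, get_fit_data, and transform_parameters are taken from the snippet above. The internals are assumptions, the arguments elided as `...` in the proposal are not guessed at, and ordinary least squares stands in for the optimizer.

```python
import numpy as np


class FitDataTransformer:
    """Hypothetical sketch of the proposed class; normalization only."""

    def __init__(self, fit_data, normalize_target=False):
        A, y = fit_data
        self._A = np.array(A, dtype=float)
        y = np.array(y, dtype=float)
        # Remember shift and scale so that they can be undone after the fit.
        self._y_mean = y.mean() if normalize_target else 0.0
        self._y_scale = 1 / y.std() if normalize_target else 1.0
        self._y = (y - self._y_mean) * self._y_scale

    def get_fit_data(self):
        """Return the fit matrix and the (possibly normalized) target vector."""
        return self._A, self._y

    def transform_parameters(self, parameters):
        """Map parameters fitted on normalized data back to the original scale."""
        parameters = np.array(parameters, dtype=float) / self._y_scale
        # Assumption: column 0 of the fit matrix is constant and equal to one.
        parameters[0] += self._y_mean
        return parameters


# Usage on synthetic data.
rng = np.random.default_rng(1)
A = np.hstack([np.ones((40, 1)), rng.normal(size=(40, 3))])
expected = np.array([1.0, 2.0, -0.5, 0.25])
fdt = FitDataTransformer((A, A @ expected), normalize_target=True)
A_fit, y_fit = fdt.get_fit_data()
fitted, *_ = np.linalg.lstsq(A_fit, y_fit, rcond=None)
parameters = fdt.transform_parameters(fitted)
assert np.allclose(parameters, expected)
```

Keeping the shift and scale inside the object means the caller never handles them directly, which is the inconvenience the manual procedure suffers from.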
Impose constraints
fdt = FitDataTransformer(sc.get_fit_data(key='metal_spacing_01'), normalize_target=True, ...)
fdt.add_constraint(constraint_matrix1, weight=100)
fdt.add_constraint(constraint_matrix2, analytical=True)
cve = CrossValidationEstimator(fdt.get_fit_data())
cve.validate()
cve.train()
ce = ClusterExpansion(cs, fdt.transform_parameters(cve.parameters))
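The issue does not spell out what the weight and analytical options should do. One plausible reading, sketched below, is that a constraint matrix C encodes the linear condition C p = 0 on the parameters p, that weight appends C as heavily weighted extra rows of the fit matrix (a soft constraint), and that analytical=True restricts the fit to the null space of C (an exact constraint). The function names, the synthetic data, and this interpretation are assumptions, not the icet implementation.

```python
import numpy as np


def add_weighted_constraint(A, y, C, weight):
    """Soft constraint: append weight * C as extra rows with zero targets."""
    A_aug = np.vstack([A, weight * C])
    y_aug = np.concatenate([y, np.zeros(C.shape[0])])
    return A_aug, y_aug


def null_space_basis(C, tol=1e-10):
    """Orthonormal basis N with C @ N = 0, obtained from the SVD of C."""
    _, s, vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


rng = np.random.default_rng(2)
A = rng.normal(size=(30, 4))
y = rng.normal(size=30)
C = np.array([[1.0, -1.0, 0.0, 0.0]])  # demands p[0] == p[1]

# Soft variant: the constraint is violated less as the weight grows.
A_soft, y_soft = add_weighted_constraint(A, y, C, weight=100)
p_soft, *_ = np.linalg.lstsq(A_soft, y_soft, rcond=None)

# Exact variant: fit in the reduced space p = N q, then map back.
N = null_space_basis(C)
q, *_ = np.linalg.lstsq(A @ N, y, rcond=None)
p_exact = N @ q
```

In the exact variant the optimizer only sees the reduced matrix A @ N, so mapping q back to p = N @ q would be a natural job for transform_parameters in this reading.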
Demo
- FitDataTransformer class implemented for both normalization and constraints
- new code is fully tested and documented (docstrings)
- new code is type hinted (#362)
- new module integrated in user guide (under doc/source/module_ref)
- new tutorial section
2019/03/15 @erhart rewrote issue description after EXPLORE was resolved
Edited by Paul Erhart