Improve sampling of the concentration domain (!598) · Merge requests · computationalmaterials / CLEASE

This merge request is an attempt to address concentration sampling issues outlined in https://gitlab.com/computationalmaterials/clease/-/issues/337 and https://gitlab.com/computationalmaterials/clease/-/issues/334.

CLEASE uses scipy.optimize.minimize (David describes how in more detail in https://gitlab.com/computationalmaterials/clease/-/issues/334) to randomly sample concentrations that satisfy the provided constraints. It then attempts to convert these concentrations to integers corresponding to the number of atoms of each element in a given structure. The conversion may involve rounding, and the result is then validated. A failed conversion currently raises an error. However, raising an error at that point is not particularly useful, because sampling a concentration that converts to valid integer quantities involves a degree of luck: not every valid concentration can be converted to corresponding integer quantities. I suggest instead allowing up to some maximum number of sampling attempts, e.g. 100, in the get_random_concentration function before raising an error.
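The retry idea can be sketched roughly as follows. This is a minimal illustration, not the code in this merge request: counts_from_concentration, sample_counts and the Dirichlet sampler are hypothetical names and stand-ins, and the validity check (rounded counts must sum to the number of atoms) is an assumed simplification of the real validation.

```python
import numpy as np


def counts_from_concentration(conc, num_atoms):
    """Round a concentration vector to integer atom counts.

    Returns None when the rounded counts do not add up to num_atoms,
    which stands in for a failed conversion (assumed simplification).
    """
    counts = np.rint(np.asarray(conc) * num_atoms).astype(int)
    if counts.sum() != num_atoms or np.any(counts < 0):
        return None
    return counts


def sample_counts(sampler, num_atoms, max_attempts=100):
    """Call sampler() until a concentration converts cleanly, or give up."""
    for _ in range(max_attempts):
        counts = counts_from_concentration(sampler(), num_atoms)
        if counts is not None:
            return counts
    raise RuntimeError(
        f"No valid integer composition found in {max_attempts} attempts."
    )


rng = np.random.default_rng(0)
# Hypothetical sampler: uniform draw on the 3-component simplex.
counts = sample_counts(lambda: rng.dirichlet(np.ones(3)), num_atoms=7)
```

The error is still raised, but only after the attempt budget is exhausted, so a single unlucky draw no longer aborts the sampling.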

Another significant proposed change concerns the initial guess passed to scipy.optimize.minimize. Currently it is set as follows:

# Setup the constraints
constraints = self._get_constraints()
x0 = np.random.rand(self.num_concs)
pinv = invert_matrix(self.A_eq)
x_init = pinv.dot(self.b_eq)

I'm not sure what the reasoning behind this choice was. I can see how it makes sense for systems with equality constraints, but I would argue that it significantly reduces randomness for systems with inequality constraints.
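To illustrate why this starting point is not random, here is a small sketch under the assumption that invert_matrix behaves like a Moore-Penrose pseudo-inverse (suggested by the variable name pinv, but not verified here). The constraint matrix is a hypothetical example in which three concentrations sum to one; np.linalg.pinv stands in for invert_matrix.

```python
import numpy as np

# Hypothetical equality constraint: the three concentrations sum to one.
A_eq = np.array([[1.0, 1.0, 1.0]])
b_eq = np.array([1.0])

# np.linalg.pinv stands in for CLEASE's invert_matrix (an assumption).
# The result is the minimum-norm solution of A_eq @ x = b_eq.
x_init = np.linalg.pinv(A_eq).dot(b_eq)
```

Under that assumption x_init is the same point on every call (here, equal concentrations), so the only random input to the minimizer is x0.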

To improve concentration domain sampling, I suggest changing x_init and x0 here:

opt_res = minimize(
    objective_random,
    x_init,
    args=(x0,),
    method="SLSQP",
    jac=obj_jac_random,
    constraints=constraints,
    bounds=self.trivial_bounds,
)

to randomly generated values:

opt_res = minimize(
    objective_random,
    np.random.rand(self.num_concs),
    args=(np.random.rand(self.num_concs),),
    method="SLSQP",
    jac=obj_jac_random,
    constraints=constraints,
    bounds=self.trivial_bounds,
)
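For anyone who wants to try the call pattern outside CLEASE, here is a self-contained toy version. It assumes the objective is the squared distance to a random target point, which may differ from the actual objective_random; the constraints (concentrations sum to one, first concentration at most 0.5) and the helper name sample are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x, target):
    # Squared distance to a random target point (assumed objective form).
    return np.sum((x - target) ** 2)


def objective_jac(x, target):
    # Gradient of the squared distance with respect to x.
    return 2.0 * (x - target)


def sample(rng, num_concs=3):
    constraints = [
        # Equality: concentrations sum to one.
        {"type": "eq", "fun": lambda x: np.sum(x) - 1.0},
        # Inequality (SLSQP expects fun(x) >= 0): x[0] <= 0.5.
        {"type": "ineq", "fun": lambda x: 0.5 - x[0]},
    ]
    res = minimize(
        objective,
        rng.random(num_concs),          # random starting point
        args=(rng.random(num_concs),),  # random target
        method="SLSQP",
        jac=objective_jac,
        constraints=constraints,
        bounds=[(0.0, 1.0)] * num_concs,
    )
    return res.x, res.success


rng = np.random.default_rng(1)
samples = [sample(rng) for _ in range(20)]
```

Each successful result is a concentration vector inside the constrained domain. In this sketch the spread of the samples comes mainly from the random target, while the random start only changes where the solver begins.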

With all these changes implemented, I've managed to obtain satisfactory concentration domain coverage for the case described in https://gitlab.com/computationalmaterials/clease/-/issues/337. To achieve this, I also had to disable the _add_fixed_element_in_each_basis function before running the minimizer. I reckon that function can be safely discarded.

Please note that this merge request has some flaws. For example, because I moved concentration validation into get_random_concentration, the concentration validation filter becomes kinda superfluous (correct me if I'm wrong). Currently, the concentration is validated twice (both in get_random_concentration and in the concentration validation filter), which is overkill. If the filter is not used elsewhere, perhaps we can safely remove it?

@davidkleiven, @AlTy, @jinchang - your input on this merge request is greatly appreciated.

Edited May 14, 2024 by Juozas Miškinis
