Ground state finder using Mixed Integer Programming

As a user, I am frequently interested in ground state structures at specific compositions. To find these, I would typically run simulated annealing or (if feasible) enumerate structures. These methods can often be very time-consuming, and the former does not even leave any guarantees for whether the true ground state is found. An elegant solution to this problem was recently published in PRL. It is based on Mixed Integer Programming (MIP) and very efficiently finds ground states even in large cells, and guarantees the right solution.

Use case

In principle this should be possible to implement such that you use it in this way:

gsf = GroundStateFinder(cluster_expansion)
gs = gsf.get_ground_state(supercell, {'Ga': 10, 'Ge': 12})

I've done some test implementations and it is indeed extremely efficient, at least if you have a large primitive cell, so that you're not interested in every possible supercell.

Implementation considerations

MIP solver

To implement this, an MIP solver is needed. I haven't looked into the details of how this actually works but I don't expect this to be something we want to implement ourselves. We thus need an external library. There is a python library called PuLP which I used successfully. Its installation is not very clean though, since it requires an extra step sudo pulptest. Also, the project is no longer actively developed. Thus I do not think we want to make this a dependency, but perhaps it could be an optional dependency?

Cluster vector transformation

The algorithm described in the above linked paper requires that, for a binary system, one uses spin variables 0 and 1 instead of -1 and 1. This is no big deal per se, but if one does such a transformation after fitting ECI:s, one has to transform the ECI:s too. In my implmenentation, I made a dirty solution where I calculated the cluster vector in both versions for a bunch of structures, solved for the transformation matrix that mapped one into the other, and then applied that transformation on the ECI:s. This will not be doable in a general fashion, so one basically would have to find this transformation without comparing to cluster vectors. I think it basically would involve seeing what pairs are part of a triplet and so on, and maybe there is also such functionality hidden in icet somewhere? I would need help with this.

Multicomponent systems

I also don't know how the algorithm generalizes to ternaries etc. This would have to be addressed at some point.

Edited by Magnus Rahm