Feature: new calculator for RuNNer (Runner Neural Network Energy Representation)

Alexander Knoll requested to merge aknoll/ase:enhancement_io_runner into master Mar 02, 2022

Hello everyone,

this pull request contains an ASE calculator for RuNNer, a Fortran-based software package for generating machine learning potentials. RuNNer is open-source and is being actively developed in the group of Prof. Dr. Jörg Behler at Georg-August-Universität Göttingen, where I am also working.

The PR is very long (several thousand lines of code). Please let me know what the ASE policy is here. I am happy to provide e.g. stacked PRs to facilitate code review.

Checklist:

The code follows the style guidelines of this project.
Self-review and production testing were performed.
Comments and docstrings are provided.
Unit tests were written for the calculator (coverage is still incomplete, see "Possible Pain Points").
CI pipeline passes.
Up-to-date with the upstream master branch.

Overview

The PR contains the following changes:

Calculator Runner: The heart of the module. This calculator enables typical ASE jobs (like get_potential_energy) as well as additional functionality (see "Detailed Description").
Calculator RunnerSinglePointCalculator: Extends the native SinglePointCalculator class to store the total charge in a structure (required by RuNNer, different from sum of charges).
I/O routines for reading and writing all RuNNer-typical file formats.
Storage classes for all results produced by RuNNer (energy and forces are still provided in the ASE standard format of float/numpy array).
Utility functions for automatically generating symmetry functions (a mandatory input to RuNNer)
Unit Tests for the calculator.
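
To illustrate why the total charge is stored separately from the atomic charges, here is a minimal, ASE-free sketch. The class and attribute names (`StructureResult`, `atomic_charges`, `total_charge`, `charge_mismatch`) are hypothetical and are not the interface of RunnerSinglePointCalculator; the sketch only shows that the stored total charge is independent of the sum of atomic charges.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructureResult:
    """Hypothetical stand-in for the results attached to one structure."""

    energy: float
    atomic_charges: List[float] = field(default_factory=list)
    # Stored explicitly instead of being derived from atomic_charges,
    # because the two values are allowed to differ.
    total_charge: float = 0.0

    def charge_mismatch(self) -> float:
        """Stored total charge minus the sum of the atomic charges."""
        return self.total_charge - sum(self.atomic_charges)


# A structure labelled as neutral whose atomic charges do not sum to zero.
result = StructureResult(energy=-1.5, atomic_charges=[0.5, -0.25], total_charge=0.0)
mismatch = result.charge_mismatch()
```

A derived property (total charge computed on the fly from the atomic charges) could not represent this case, which is the reason for the extra stored value.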

A diagram of the module architecture is attached on this PR.

Detailed Description

The calculator fulfills two main purposes:

provide a unified Python interface to RuNNer based on ASE so that one can make use of ASE's functionalities.
enable the management of RuNNer workflows via Python.

A typical RuNNer workflow consists of three modes:

Calculation of atom-centered many-body descriptors ("symmetry functions").
Generation of a potential by training atomic neural networks based on the symmetry functions from Mode 1.
Prediction of energies and forces for an unknown configuration based on the trained neural network from Mode 2.

As one can see, RuNNer Mode 3 corresponds to the typical use case of an ASE calculator: the prediction of energies, forces, ... for an atomic structure.
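
The dependency between the three modes can be written down as a small sketch. The enum `RunnerMode`, the `PREREQUISITE` table and the helper `modes_required_for` are hypothetical names chosen for illustration only; they are not part of RuNNer or of this PR.

```python
from enum import IntEnum
from typing import List


class RunnerMode(IntEnum):
    """Hypothetical labels for the three RuNNer modes described above."""

    SYMMETRY_FUNCTIONS = 1
    TRAINING = 2
    PREDICTION = 3


# The mode whose output each mode builds on, following the description above.
PREREQUISITE = {
    RunnerMode.SYMMETRY_FUNCTIONS: None,
    RunnerMode.TRAINING: RunnerMode.SYMMETRY_FUNCTIONS,
    RunnerMode.PREDICTION: RunnerMode.TRAINING,
}


def modes_required_for(target: RunnerMode) -> List[RunnerMode]:
    """Return the modes that have to run, in order, to reach `target`."""
    chain = []
    mode = target
    while mode is not None:
        chain.append(mode)
        mode = PREREQUISITE[mode]
    return list(reversed(chain))


chain = modes_required_for(RunnerMode.PREDICTION)
```

Starting from scratch, a prediction therefore needs all three modes in sequence, whereas a conventional ASE calculator only ever covers the last step.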

This calculator, however, strives to also enable Modes 1 and 2. As a result, it offers considerable functionality beyond that of other, more 'traditional' calculators in the ASE code base. Major additions are:

A calc.run(mode=X) function, which can be used to run any of the RuNNer Modes.
The calculator has an additional dataset property which stores a list of Atoms objects. dataset serves as the training dataset and is mandatory for RuNNer Mode 1 and Mode 2.
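
The following sketch mimics the two additions with a stand-in class. `SketchCalculator` is hypothetical and is not the Runner calculator from this PR: it does not call RuNNer, it uses plain dictionaries where the real dataset holds Atoms objects, and the error types are assumptions. It only shows the intended relationship between `run(mode=X)` and `dataset`.

```python
from typing import List, Optional


class SketchCalculator:
    """Hypothetical stand-in; not the Runner calculator from this PR."""

    def __init__(self, dataset: Optional[List[dict]] = None):
        # In the PR, dataset is a list of ASE Atoms objects.
        self.dataset = dataset
        self.completed_modes: List[int] = []

    def run(self, mode: int) -> None:
        """Pretend to run one mode, enforcing the dataset requirement."""
        if mode not in (1, 2, 3):
            raise ValueError(f"Unknown mode: {mode}")
        # Modes 1 and 2 operate on the training dataset, so it must exist.
        if mode in (1, 2) and not self.dataset:
            raise RuntimeError(f"Mode {mode} requires a training dataset.")
        self.completed_modes.append(mode)


calc = SketchCalculator(dataset=[{"symbols": "H2O"}, {"symbols": "CO2"}])
calc.run(mode=1)
calc.run(mode=2)

# Without a dataset, Mode 1 is rejected.
empty = SketchCalculator()
try:
    empty.run(mode=1)
    failed = False
except RuntimeError:
    failed = True
```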

Possible Pain Points / Help Appreciated

ASE Design Principles: I am aware that ASE aims for maximal code reusability. However, the additional functionalities supported by the Runner calculator also mean that it has a comparatively large code base that differs quite a lot from existing calculators.
calc.atoms vs calc.dataset: When both a dataset and an atoms object are attached to the calculator, get_potential_energy, get_forces, etc. will always yield the result for all configurations in the dataset. The atoms object will be ignored.
RunnerSinglePointCalculator: tbh I would love to eliminate this class. It extends the native SinglePointCalculator class only to store an additional property, the total charge in the system. In RuNNer, this value is a requirement for the correct treatment of electrostatics. It needs to be able to differ from the sum of all atomic charges. Could this perhaps be included in SinglePointCalculator?
Results Classes: The results classes currently do not know how to write themselves to JSON.
Unit tests: The coverage of the code is still in its infancy. More unit tests are probably needed. I would appreciate some guidance on which features in particular require test coverage before the PR can be accepted.
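
The precedence described under "calc.atoms vs calc.dataset" can be summarized in a few lines. The helper `select_structures` is hypothetical and uses dictionaries in place of Atoms objects. The fallback to the single atoms object when no dataset is attached, and the error raised when neither is attached, are assumptions made for this sketch and are not stated in the description above.

```python
from typing import List, Optional


def select_structures(atoms: Optional[dict], dataset: Optional[List[dict]]) -> List[dict]:
    """Return the structures a property request would be evaluated on.

    Mirrors the precedence described above: a non-empty dataset wins
    and the single atoms object is ignored.
    """
    if dataset:
        return list(dataset)
    # Assumption: with no dataset attached, the single structure is used.
    if atoms is not None:
        return [atoms]
    raise ValueError("Neither a dataset nor an atoms object is attached.")


single = {"name": "single"}
training = [{"name": "a"}, {"name": "b"}]

both = select_structures(single, training)
only_atoms = select_structures(single, None)
```

One consequence is that get_potential_energy returns one value per dataset entry in the first case, which differs from the single float most ASE users expect.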

How Has This Been Tested?

Unit tests for running a whole workflow and restarting a calculation.
We are using it in our group for conversion to and from RuNNer file formats.
I have used the calculator many times to run RuNNer workflows.
Nudged Elastic Band calculations with the calculator and the ASE NEB module.
Phonon Calculations with the ASE Phonon Module powered by the calculator.

Sample code can be provided if necessary.

Conclusion

I would love it if this calculator could be accepted into the ASE code base, and I am eager to hear your thoughts!

Thank you very much for your time and comments.

Cheers,
Alex (PhD with Prof. Dr. Jörg Behler)