Make gmxapi.operation compatible with MPI-based ensemble management.
The initial execution management code in gmxapi.operation is very minimal. Ensemble awareness is limited, and only serial execution of operations for each ensemble member is supported.
Note that gmxapi.simulation.mdrun uses the legacy gmxapi.simulation.context module for parallel execution of ensemble simulations. While mpi4py is required by gmxapi.simulation.context, gmxapi.operation was not aware of it. This meant that mixing gmxapi.simulation operations with, say, gmxapi.commandline operations might not behave as intended for ensembles (non-gmxapi.simulation work would be duplicated across ranks).
We will need to confirm that both naively parallel and broadcast data flow work correctly across ranks.
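For context, the following sketch shows the kind of mixed workflow affected. It uses calls from the public gmxapi Python API (read_tpr, mdrun, commandline_operation); the TPR file names and the `gmx check` post-processing step are placeholders chosen for illustration.

```python
import gmxapi as gmx

# An ensemble is expressed by providing a list of inputs; here, two
# (placeholder) TPR files produce an ensemble of width 2.
simulation_input = gmx.read_tpr(['run0.tpr', 'run1.tpr'])

# MPI-aware ensemble simulation: members are distributed across ranks.
md = gmx.mdrun(simulation_input)

# A non-MPI-aware command-line task consuming ensemble output. Without the
# changes described below, work like this could be silently duplicated on
# every rank instead of being mapped to ensemble members or run once.
check = gmx.commandline_operation(
    'gmx',
    arguments=['check'],
    input_files={'-f': md.output.trajectory})

check.run()
```

Such a script would typically be launched with something like `mpiexec -n 2 python -m mpi4py workflow.py`, so that mpi4py initializes MPI before gmxapi work is dispatched.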
Updates
Some additional bugs were identified and fixed in the submitted patch.
- Ensemble width for subgraph variables and their updates is now clarified when the subgraph instance is built.
- while_loop explicitly behaves as an AllGather of the results it wraps, and correctly represents ensemble outputs as having an array dimension (see the sketch after this list).
- Naming of nodes in the workflow graph (operation instance identifiers) is clarified, normalized, and made consistent across the ensemble for operations in the gmxapi.simulation module.
- A distinction is clarified between work that is and is not duplicated on each MPI rank. Generally, non-MPI-aware tasks (that are not sufficiently integrated with gmxapi) should only be executed from a single process, whereas other tasks must be executed on all ranks (such as the subgraph+while_loop logic and the MPI-aware ensemble mdrun task). A new allow_duplicate annotation determines whether tasks should be launched on all ranks, or whether they should run on a single rank and share their results. (The implementation is minimal and naive, with much room for future optimization, but the solution seems appropriate for now.)
- For subgraphs and ensemble subgraphs, we recognized that Futures provided by the user to the subgraph should not be modified ("reset") during loop execution. In addition to much more rigorous handling and repackaging of inputs to subgraph variables, we introduce the ability to block the propagation of reset() to data providers.
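As a reference point, here is a minimal sketch of the subgraph + while_loop pattern that the linked test exercises, adapted from the documented gmxapi example. The add_float and less_than helpers are illustrative wrapped functions, not part of the gmxapi package.

```python
import gmxapi as gmx

# Illustrative wrapped functions (not part of the gmxapi package).
@gmx.function_wrapper(output={'data': float})
def add_float(a: float, b: float) -> float:
    return a + b

@gmx.function_wrapper(output={'data': bool})
def less_than(lhs: float, rhs: float) -> bool:
    return lhs < rhs

# Declare loop variables with defaults; ensemble width is established when
# the subgraph instance is built from the inputs bound to these variables.
subgraph = gmx.subgraph(variables={'float_with_default': 1.0, 'bool_data': True})
with subgraph:
    subgraph.float_with_default = add_float(
        a=subgraph.float_with_default, b=1.0).output.data
    subgraph.bool_data = less_than(
        lhs=subgraph.float_with_default, rhs=6.0).output.data

# Iterate while the condition variable remains True.
loop = gmx.while_loop(operation=subgraph, condition=subgraph.bool_data)
handle = loop()
print(handle.output.float_with_default.result())  # 6.0 in the scalar case
```

With list (ensemble) inputs bound to the subgraph variables, the loop output carries an additional array dimension, consistent with the AllGather behavior described above.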
Examples
Minimal tests of the necessary functionality are at https://gitlab.com/gromacs/gromacs/-/blob/8342ea76b1065524ad20768742a1bc859c84f4e4/python_packaging/src/test/test_subgraph.py#L98
A richer example is at https://github.com/kassonlab/gmxapi-tutorials/blob/main/examples/fs-peptide.py
Deferred
In conjunction with supporting parallel execution, we should expand the interface between the Context and operations to describe co-scheduling requirements and data locality issues.
Additionally, we need to improve the interaction between Contexts, such as with subscribability of Futures.
This is also related to data shaping issues (#2994 (closed)) and management of working files.
Some additional operations (e.g. scatter(), gather(), and reduce()) may be needed.

Update: some data shape transformation logic has been formalized, and two additional call-back facilities have been introduced to allow ResourceManagers to send and receive results between ranks when allow_duplicate=False.
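The new call-back facilities are internal to the ResourceManager, but the pattern they enable when allow_duplicate=False is essentially single-rank execution followed by a broadcast, roughly as in this mpi4py sketch (illustrative only; the dictionary payload is a placeholder, not gmxapi data structures).

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

result = None
if rank == 0:
    # Stand-in for a non-MPI-aware task that should execute exactly once.
    result = {'output_file': 'analysis.xvg', 'value': 42.0}

# All ranks participate in the broadcast; afterwards every rank holds the
# same result, so downstream consumers see a consistent Future value.
result = comm.bcast(result, root=0)
print(f'rank {rank}: {result}')
```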