Infrastructure supporting external API - Redmine #2585
This issue describes the major feature developments necessary to support
development of external API facilities with appropriate abstraction and
integration testing. Even with issues like \#988 still open, it is
possible to describe the GROMACS infrastructure changes needed to support
API development and the best known API functionality goals.
Goals
=====
The following list summarizes feature requirements established during
development of https://github.com/kassonlab/gmxapi
- MPI environment sharing
- Binding external objects with interfaces for external code to retain
access to launched tasks (such as objects instantiated during
thread-MPI launch)
- Accessible simulation stop signal
- Launch MD runner through library calls
- Extensible ForceProviders
- Restraint framework / extensible “pull” code
- Libgmxapi build target and public headers
- Libgmxapi doxygen documentation
- Manage filesystem input and output locations
- Interaction / synchronization with predetermined integer time steps
rather than time value.
Packaging and project structure
===============================
The following notes may belong in a different issue, such as \#2045, but
are provided here for context.
Libgromacs
----------
Long term expectations consistent with this proposal:
- Its modules may have headers internal to them,
- Exposes a library API of deliberately selected components
- Its modules may expose components to that library API e.g.
IForceProvider,
- That library API must be unit tested so it is known to work
- Does not expose a user-facing API (i.e. no installed headers)
- Library API will be stable over the lifetime of a release branch (i.e.
from about the time of the official release)
- Soversion bumps with each major release
- Library API will not be stable in master branch (but must pass its
own tests in CI)
- Expects to be able to link itself dynamically
- Expects to be able to link itself statically (and its dependencies,
if available)
- Linked as position-independent code
- Requires Python 3 (patch from Pascal in Gerrit; releng probably needs
an update too)
Libgmxapi in GROMACS repo
-------------------------
While the exposed symbols and public headers are reconsidered and as
code in libgromacs continues to undergo more compartmentalization and
modernization, it is convenient to define a second library and set of
public headers. For now, we defer the question of whether libgmxapi
would ultimately be subsumed into libgromacs, but we propose that the
gmxapi headers ultimately replace the current set of public headers.
- Builds with access to libgromacs library API headers
- Installs the user-facing C++ headers
for client code (such as what the Python module builds against)
- “plugin” API. E.g. a stable framework(s) for implementing various
sorts of potentials while hiding IForceProvider e.g. by exposing
adapters to things like domain-distributed positions. May warrant
different versioning semantics than API/ABI supporting user
interfaces: probably more stable API; possibly less stable ABI.
- (Proposing) semantic versioning (standard guarantees starting
somewhere between 0.1 and 1.0) https://semver.org/
- One name or separate name for tMPI / MPI? (suggest “no”)
- Plugin API only in C++, but has a
supported method to expose bindings to the inserted code that can be
used in Python scripts, i.e. plugins are not written in pure Python
- Main (high-level workflow) API complete only in Python, but will
migrate to C++. (We can accelerate the
C++ version if there is a need)
- No plans to load C++ plugin modules
natively (i.e. no dlopen) because this can be handled more
portably at the Python level (or in some other user interface code), but we
need to be able to receive and register factory functions to get
objects from external code
Parts of this discussion may belong under \#988
Python package
--------------
A GROMACS Python package like https://github.com/kassonlab/gmxapi is
nominally beyond the scope of this issue. A few points from \#2045 worth
mentioning, though:
- serves to prove libgmxapi functionality; needs integration testing
with libgromacs and libgmxapi
- Ultimately, it should be available as part of a GROMACS
installation, but should it be in the same repository and/or CMake
build environment?
- pybind11 source bundled with repo
- Provides both an implementation and an API specification.
- Researchers can be compatible with it without depending on it.
- Writers of GROMACS extension code are free to use other Python
bindings frameworks.
- We provide tools and helpers, but C++
helpers use pybind11
Details
=======
A big picture of planned development is necessary even before Redmine
Issues exist, so milestones are enumerated with feature ID tags.
Dependencies are better illustrated in the accompanying chart. Each
numbered feature (chart node) is expected to be from 1 to 10 Gerrit
changes, generally 3 to 5.
Proposed development targets
----------------------------
### Gmx1 (this issue) Design documentation strategy / project management plan
- this issue and a cluster of Redmine issues with subtasks should be
good
- documentation and visual aids like the attached progress chart
should probably be in the repository somewhere
### Gmx2 (Issue #2586) Versioned libgmxapi target for build, install, headers, docs
### Gmx3 Integration testing (Issue #2756)
- Gmxapi interfaces should continue functioning with unchanged
semantics for other GROMACS changes, or API level needs to be
incremented according to semantic versioning.
- External projects need to be tested outside of the gromacs build
tree to validate external interfaces of installation. Suggested
external projects: Python package, sample\_restraint,
yet-to-be-written integration test suite.
- Tests should be clear about the API version they are testing. We
should test all supported versions (though we need a policy on which
versions are supported), and we can note whether new API updates are
backwards compatible.
- Forward-compatibility testing: we should at least *know* whether we
are breaking old client code and include release notes, regardless
of policy
- ABI compatibility testing? (should we test mismatched compilers and
such?)
- Example code in documentation should be tested, if possible.
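
The versioning expectation in this list could be enforced by a small check that a test suite runs before exercising an interface. The following is a minimal sketch in Python, assuming plain `MAJOR.MINOR.PATCH` version strings. The names `parse_version` and `is_compatible` are hypothetical and are not part of gmxapi. Treating a minor bump below 1.0 as breaking is an assumed convention, since semantic versioning makes no guarantees for 0.y.z.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """Return True if code written against `required` may use `installed`."""
    have = parse_version(installed)
    need = parse_version(required)
    if have[0] != need[0]:
        # A different major version signals an incompatible API.
        return False
    if have[0] == 0 and have[1] != need[1]:
        # Assumption: before 1.0, a minor bump is treated as breaking.
        return False
    # Same major version: the installed API must be at least as new.
    return have >= need
```

A test could call `is_compatible(installed, required)` first and report an explicit version mismatch instead of an unrelated failure later in the run.
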
### Gmx4 Library access to MD runner
- mdrun CLI program is an API client
Relates to #2229
### Gmx5 (Issue #2587) Provide runner with context manager
### Gmx6 Extensible MDModules and ForceProviders
- ForceProviders obtained after tMPI threads have spawned.
- MDModules list extended at runtime during simulation launch.
- External code may be provided to the runner to instantiate or get a
handle to a module.
- Expanded Context class can broker object binding by registering and
holding factory functions for modules, as well as other resources
that may be implemented differently in different environments.
- Somewhere in here, MDModules either need access to the integral
timestep number or the ability to register call-backs or signals on
a schedule.
Relates to #2590, #2574, #1972
Do MDModules live in a scope of tight association with an integrator? Do
we need other concepts, like RunnerModules? Or subdivisions like
MDForceModule, MDObserverModule, MDControlModule?
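
The brokering role described for the Context can be sketched in a few lines. This is an illustration in Python, not the GROMACS design: `Context`, `register_module_factory`, `build_modules`, `SampleRestraint`, and `launch` are all hypothetical names, and ranks are modeled as plain integers. It shows only that the client registers a factory before launch and that the runner calls it once per rank after the ranks exist.

```python
from typing import Callable, Dict, List


class Context:
    """Hypothetical broker that holds factory functions for modules."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[int], object]] = {}

    def register_module_factory(self, name: str,
                                factory: Callable[[int], object]) -> None:
        """Store a factory; nothing is instantiated at registration time."""
        if name in self._factories:
            raise ValueError(f"module factory already registered: {name}")
        self._factories[name] = factory

    def build_modules(self, rank: int) -> List[object]:
        """Call every registered factory for one rank."""
        return [factory(rank) for factory in self._factories.values()]


class SampleRestraint:
    """Hypothetical extension object supplied by external code."""

    def __init__(self, rank: int) -> None:
        self.rank = rank


def launch(context: Context, n_ranks: int) -> Dict[int, List[object]]:
    """Stand-in for simulation launch: modules are built after ranks exist."""
    return {rank: context.build_modules(rank) for rank in range(n_ranks)}
```

For example, after `context.register_module_factory("restraint", SampleRestraint)`, calling `launch(context, 2)` gives each of the two ranks its own `SampleRestraint` instance.
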
### Gmx7 Binding API for higher-level client code
### Gmx8 Binding API for plug-in ForceProviders
Ultimately tied to gmx5 and gmx24, but we can start stabilizing the
external interfaces now. The external interfaces are for (a) user
interface / workflow management code, and (b) MD extension code. We
define a simple message-passing C structure along with PyCapsule name
and semantics. An MD extension object can provide a factory method with
which the MD Runner can get an IMDModule interface at simulation
launch. The object pointed to may exist before and/or after the lifetime
of the simulation. It must be understood that the IMDModule handle will
be obtained on every rank. The design should consider future
infrastructure and needs (expressing data dependencies and locality,
negotiating parallelism, expressing periodicity), but does not need to
implement them now. A short-term implementation may require workarounds
for some of these, but the workarounds can mostly be segregated from
this issue’s resolution.
Relates to #2590
### Gmx9 Headers and adapter classes for Restraint framework
Relates to #1972, #2590,
https://github.com/kassonlab/sample\_restraint
### Gmx10 MD signalling API
Relates to #2224
### Gmx11 Replace MPI\_COMM\_WORLD with a Context resource
### Gmx12 Runtime API for sharing / discovering hardware / parallelism resources
- Libgmxapi requests resources from libgromacs from the current node
- CUDA environment can be manipulated but we shouldn’t have to deal
with that for a while
- Evolving task scheduling interfaces, expressing data locality
- Concepts of time and timestep
### Gmx13 API for working directory, input/output targets?
### Gmx14 Generalized pull groups / “generalized sites”
Christian Blau actively working on this from mid-July
### Gmx15 API logging resource
Log “file” artifacts are produced through API, allowing extensibility
and abstraction from filesystem dependence. Progress has already been
made in this direction, but the logging resource could be more clearly
owned by the client code (or a Context object owned or managed on behalf
of the client code) rather than created and destroyed in, say, the
Mdrunner.
Also relates to #2570
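
The ownership question above can be illustrated with Python's standard `logging` module. In this sketch the client creates the logger and chooses where its output goes (an in-memory buffer here, so no file is involved), while the library code only writes to the logger it is handed. `run_simulation` and `make_client_logger` are hypothetical stand-ins, not GROMACS or gmxapi functions.

```python
import io
import logging


def make_client_logger(stream: io.StringIO) -> logging.Logger:
    """Client side: create a logger and decide where its output goes."""
    logger = logging.getLogger("gmxapi_sketch.client")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def run_simulation(logger: logging.Logger, n_steps: int) -> None:
    """Library side: write to the provided logger, never open a log file."""
    logger.info("starting run of %d steps", n_steps)
    logger.info("finished run")
```

Because the client owns the handler, the same library call can log to a file, a buffer, or a workflow manager without changes to the runner.
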
### Gmx16 Exception handler firewall
Currently the gmx binary has a command-line runner that catches
exceptions, reports an error, and exits. The API plays the same role as
the command-line runner, but it can and should do something other than exit.
### Gmx17 API status object
- Status type defines the interface for discovering operation success
or failure, plus details.
- Consistent status object interface is portable across Python,
C++, and C
- Status object can be used to ferry information across API boundaries
from exceptions thrown. Exceptions could be chained / status nested.
Questions:
- What are concerns and solutions for memory allocation for status
objects? Should objects own one or generate one on function return?
- Should the API (or Context) keep a Status singleton? A Status stack?
Or should operations create ephemeral Status objects, or objects
implementing a Status interface?
- Should the status object contain strings, reference strings mapped
by enum, or defer textual messages to messaging and logging
facilities?
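
One possible answer to these questions, offered only as a sketch, is an ephemeral, immutable status value that carries its own message string and optionally nests the status that caused it. The Python below assumes that design for illustration. `Status`, `from_exception`, and `details` are hypothetical names, and the sketch does not address the C or C++ memory-allocation concerns raised above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Status:
    """Hypothetical status value: success flag, message, optional cause."""

    success: bool
    message: str = ""
    cause: Optional["Status"] = None

    @classmethod
    def from_exception(cls, error: BaseException) -> "Status":
        """Convert an exception and its chain of causes to nested statuses."""
        inner = error.__cause__
        nested = cls.from_exception(inner) if inner is not None else None
        return cls(False, f"{type(error).__name__}: {error}", nested)

    def details(self) -> List[str]:
        """Return the messages from the outermost status to the root cause."""
        messages = []
        status: Optional["Status"] = self
        while status is not None:
            messages.append(status.message)
            status = status.cause
        return messages
```

A caller that only needs success or failure reads `status.success`; a caller that wants the full chain reads `status.details()`.
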
### Gmx18 Thinner test harness (for API client tests)
### Gmx19 API manipulation of simulation input / output
(for better testing)
- The GlobalTopology class and IGlobalTopologyUser interface now underway
will help here: client changes to the global topology can ripple through
to the modules, because the modules that care have registered themselves
at setup time
### Gmx20 Accessible test data resources
### Gmx21 Break up runner state into a hierarchy of RAII classes with API hooks
- break up mdrun program into clearly defined layers and phases
- CLI program parses various inputs in order to launch an Mdrunner
object that is CLI-agnostic
- launching tMPI threads and other significant changes of state
establish a sequence or hierarchy of invariants through RAII and/or
State pattern.
- Sebastian Wingbermuhle working now on aspects of this for hybrid
MC/MD (ref #2375, …)
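
The idea of a hierarchy of invariants established through RAII can be approximated in Python with nested context managers, where each phase is entered in order and released in reverse order even if a later phase fails. This is a sketch of the pattern only; the phase names and the `phase` and `run` helpers are hypothetical and do not correspond to actual mdrun stages.

```python
from contextlib import ExitStack, contextmanager
from typing import Iterator, List


@contextmanager
def phase(name: str, journal: List[str]) -> Iterator[str]:
    """Acquire a named phase on entry and release it on exit."""
    journal.append(f"enter {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and when an exception propagates.
        journal.append(f"exit {name}")


def run(journal: List[str]) -> None:
    """Enter phases in order; ExitStack unwinds them in reverse order."""
    with ExitStack() as stack:
        stack.enter_context(phase("context", journal))
        stack.enter_context(phase("ranks", journal))
        stack.enter_context(phase("session", journal))
        journal.append("integrate")
```

Each phase can rely on the invariants of the phases entered before it, which is the property the layered runner classes are meant to provide.
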
### Gmx22 API management of input objects
- Structure, topology
- Microstate
- Simulation state
- Simulation parameters
- Runtime parameters / execution environment
- Anything else?
### Gmx23 Event hooks or signals
Event hooks or signals for
- checkpoint
- time step number or delta / trajectory advancement
- input configuration
- input topology
- input state
- simulation parameters
- output data streams
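
A hook registry keyed on integer step numbers might look like the following. This is a minimal Python sketch under the assumption that hooks fire on a fixed step period. `StepSignals`, `every`, `emit`, and `run_loop` are hypothetical names rather than a proposed GROMACS interface. Keying on the integer step avoids comparing floating-point time values.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class StepSignals:
    """Hypothetical registry of callbacks keyed by integer step period."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[int, List[Callable[[int], None]]] = (
            defaultdict(list))

    def every(self, period: int, hook: Callable[[int], None]) -> None:
        """Register `hook` to be called on steps divisible by `period`."""
        if period < 1:
            raise ValueError("period must be a positive number of steps")
        self._hooks[period].append(hook)

    def emit(self, step: int) -> None:
        """Call every hook whose period divides the current step."""
        for period, hooks in self._hooks.items():
            if step % period == 0:
                for hook in hooks:
                    hook(step)


def run_loop(signals: StepSignals, n_steps: int) -> None:
    """Stand-in for the MD loop: emit once per step, steps 0 to n_steps - 1."""
    for step in range(n_steps):
        signals.emit(step)
```

For instance, `signals.every(4, checkpoints.append)` followed by `run_loop(signals, 10)` records steps 0, 4, and 8.
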
### Gmx24 API expression of MDOptions interfaces and embedded user documentation
### Gmx25 Avoid std::exit
Generally, replace std::exit (gmx\_fatal) with exceptions
- Root out gmx\_fatal, clearly define regular exit points and
exception throwers
- API firewall should catch exceptions from gmx and convert to status
objects for ABI compatibility. (gmx17)
- Clearly document regular and irregular shutdown behavior under MPI,
tMPI, and generally, specifying responsibilities
- Create issue tickets for discovered missing exception safety, memory
leaks, opportunities for RAII refactoring, and complicated protocols
that should either be better documented or replaced with a clearer
hierarchy (or sequence) of invariants
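
The combination described in this list, library code that throws instead of exiting plus a firewall at the API boundary, can be sketched as follows. The Python is illustrative only. `SimulationError`, `firewall`, `load_input`, and `api_load_input` are hypothetical, and the `(ok, message)` tuple stands in for whatever status type the API eventually defines.

```python
import functools
from typing import Callable, Tuple


class SimulationError(Exception):
    """Hypothetical exception raised where library code used to exit."""


def firewall(function: Callable[..., None]) -> Callable[..., Tuple[bool, str]]:
    """Wrap an API entry point so that no exception crosses the boundary."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs) -> Tuple[bool, str]:
        try:
            function(*args, **kwargs)
        except Exception as error:
            # Report the failure as data instead of terminating the process.
            return False, f"{type(error).__name__}: {error}"
        return True, ""
    return wrapper


def load_input(path: str) -> None:
    """Library code: raise instead of calling a fatal-exit routine."""
    if not path.endswith(".tpr"):
        raise SimulationError(f"unsupported input file: {path}")


@firewall
def api_load_input(path: str) -> None:
    """API entry point protected by the firewall."""
    load_input(path)
```

The client process stays alive after a failure and can decide for itself whether to retry, clean up, or stop.
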
### Gmx26 API messaging resources
Abstraction for status messages, such as are currently printed to stdout
or stderr
### Gmx27 (retracted)
### Gmx28 Set simulation parameters from API
Short term: mdrun CLI-like functionality to override other input is
sufficient
Long term: sufficient API to update parameters between phases of
simulation work
Implementation roadmap is probably
1. Inject argv fields
2. Write to input\_rec or other structures
3. Interact with MDOptions framework
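
The first step of this roadmap, injecting argv fields, amounts to translating API-level overrides into command-line tokens. The Python below is a sketch of that translation, not of any existing gmxapi function: `inject_overrides` is a hypothetical helper, and it assumes the GROMACS command-line convention of `-flag value` pairs with `-flag` / `-noflag` for booleans. It appends overrides without checking whether a flag is already present.

```python
from typing import Dict, List, Union

Value = Union[bool, int, float, str]


def inject_overrides(argv: List[str],
                     overrides: Dict[str, Value]) -> List[str]:
    """Return a copy of `argv` with the overrides appended as CLI tokens."""
    result = list(argv)  # leave the caller's list untouched
    for flag, value in overrides.items():
        if isinstance(value, bool):
            # Booleans become '-flag' or '-noflag' with no value token.
            result.append(f"-{flag}" if value else f"-no{flag}")
        else:
            result.extend([f"-{flag}", str(value)])
    return result
```

For example, `inject_overrides(["mdrun", "-s", "topol.tpr"], {"nsteps": 1000})` yields `["mdrun", "-s", "topol.tpr", "-nsteps", "1000"]`.
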
### Gmx29 API access to grompp functionality
- Generate runnable input from user input
- Unified implementation for workflow API and utility functions (e.g.
possibility of deferred execution / data transfer)
- Ultimately should not require writing output to (tpr) file
- File inputs ultimately should be generalized to API objects
### Gmx30 API access to GROMACS file manipulation and topology manipulation tools
- Unified implementation for workflow API and utility functions (e.g.
possibility of deferred execution / data transfer)
- Utility API should be sufficient to reimplement CLI tools
- I/O should ultimately be separate from algorithm; filesystem
interaction optional
- Consider feature requirements of other projects such as MDAnalysis.
### Gmx31 (Issue #2698) Documentation integration
Establish policies and layout for external (installed) API
documentation, extension API for plug-in developers, and developer
documentation for API and library implementation levels. Integrate with
previous `webpage-sphinx` and `doxygen` targets and output.
Scope
-----
There are definitely design points for consideration that are left out
of this list merely because they are not essential to gmxapi
functionality or because gmxapi doesn’t have strong dependence on the
ultimate design choice. These topics include:
- Task scheduling framework
- Insertion points in the MD loop
- Encapsulation of integrator
Further downstream, this infrastructure is necessary to support new high
level interfaces to GROMACS, but the discussion of such interfaces is
deferred as much as possible to separate issues to streamline
incorporation of the changes proposed here in less public / stable code.
Criteria for Completion
=======================
This issue is resolved when sufficient infrastructure is in place to
support ongoing development of the other subprojects in issue \#2045
against the GROMACS master branch in Gerrit. In particular, the
proof-of-concept client code at https://github.com/kassonlab/gmxapi and
https://github.com/kassonlab/sample\_restraint would not require a
forked specialized copy of GROMACS.
*(from redmine: issue id 2585, created on 2018-07-24 by eirrgang)*
* Relations:
* relates #2045
* relates #2229
* relates #2590
* relates #2574
* relates #1972
* relates #2224
* child #2586
* child #2587
* child #2605
* child #2610
* child #2620
* child #2623
* child #2630
* child #2651
* child #2756