Issue created Feb 13, 2020 by Paul Bauer (@acmnpv, Maintainer)

C++ API for simulation input and output

Functionalities such as (hybrid) Monte Carlo, simulation replicas, replica exchange, and input preparation/manipulation share a need for API access to simulation inputs and outputs.

Additionally, efforts to limit the responsibilities of individual tools (and separate out convenience options) warrant light-weight ways to connect tools, including ways to filter or manipulate trajectory output before it hits the filesystem. See, for instance, #3286 (closed).

This issue is intended to collect a roadmap for design and development.

Related efforts include

  • encapsulation, abstraction, and interface development under nb-lib
  • restructuring of simulator launch, collaborations, and data structures related to expansion of the ModularSimulator (links from Pascal? Paul?)
  • expansion of the MdModules framework (links from Christian? Others?)
  • evolution of modular input handling
  • evolution of the checkpoint facilities
  • clarifying simulator program state and invariants (#3325 (closed), #2375 (closed))

Preliminary tasks

  • Confirm test coverage for parallel runs.
  • Confirm test coverage for checkpointed "multisim" multi-simulation runs.
  • Confirm test coverage for start when -cpi is specified but checkpoint file does not (yet) exist.
  • Confirm test coverage for out-of-sync multisim case where some but not all checkpoint files are present.

Use cases

To clarify the scope of this issue, define some use cases.

Application level

Features and tools enabled by the API functionality described in this issue.

Ensemble simulation / multi-sim

Temperature replica exchange

Hamiltonian replica exchange

Monte Carlo rejection of a trajectory segment

convert-tpr / gmxapi.modify_input

gmx dump

grompp

nb-lib translation

Filesystem-decoupled input preparation and simulation

Filesystem-decoupled simulation output handling

API level

API use cases driving features within this issue scope, supporting the scenarios expected within the application use cases above.

Obtain a reference to the output of a simulation segment.

Produce input for a simulation segment from the output of a simulation segment.

Obtain a modified SimulationInput from an “editing” operation.

Compose a SimulationInput

Decompose a SimulationInput (topology, microstate, simulation parameters, metadata, others?)

Fingerprint a SimulationInput: identify the trajectory of which it is a part and the segment that will be produced, uniquely to the point of reproducibility and/or scientific relevance.
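To make the API-level use cases above more concrete, here is a minimal sketch of how compose, decompose, edit, and fingerprint could fit together on an immutable value type. Only the name SimulationInput comes from this issue. Topology, Microstate, Parameters, compose(), withParameters(), and fingerprint() are hypothetical names chosen for illustration, not existing GROMACS or gmxapi interfaces, and the hash is a toy that is only stable within one process.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real topology, state, and parameter structures.
struct Topology { std::vector<std::string> atomNames; };
struct Microstate { std::vector<double> positions; long step; };
struct Parameters { double timeStep; long numSteps; };

// Immutable value: components are shared and never modified in place.
class SimulationInput
{
public:
    // Compose an input from its parts.
    static SimulationInput compose(Topology top, Microstate state, Parameters params)
    {
        return SimulationInput(std::make_shared<const Topology>(std::move(top)),
                               std::make_shared<const Microstate>(std::move(state)),
                               std::make_shared<const Parameters>(std::move(params)));
    }

    // Decompose: read-only access to each component.
    const Topology&   topology() const { return *topology_; }
    const Microstate& microstate() const { return *microstate_; }
    const Parameters& parameters() const { return *parameters_; }

    // "Editing" yields a new object; untouched components are shared, not copied.
    SimulationInput withParameters(Parameters params) const
    {
        return SimulationInput(topology_, microstate_,
                               std::make_shared<const Parameters>(std::move(params)));
    }

    // Toy fingerprint over all components (illustration only, not portable across runs).
    std::size_t fingerprint() const
    {
        std::size_t seed = 0;
        for (const auto& name : topology_->atomNames) { combine(seed, std::hash<std::string>{}(name)); }
        for (double x : microstate_->positions) { combine(seed, std::hash<double>{}(x)); }
        combine(seed, std::hash<long>{}(microstate_->step));
        combine(seed, std::hash<double>{}(parameters_->timeStep));
        combine(seed, std::hash<long>{}(parameters_->numSteps));
        return seed;
    }

private:
    SimulationInput(std::shared_ptr<const Topology> top,
                    std::shared_ptr<const Microstate> state,
                    std::shared_ptr<const Parameters> params) :
        topology_(std::move(top)), microstate_(std::move(state)), parameters_(std::move(params))
    {
        assert(topology_ && microstate_ && parameters_);
    }

    // Common hash-mixing step.
    static void combine(std::size_t& seed, std::size_t h)
    {
        seed ^= h + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    }

    std::shared_ptr<const Topology>   topology_;
    std::shared_ptr<const Microstate> microstate_;
    std::shared_ptr<const Parameters> parameters_;
};
```

Holding each component behind a shared pointer to const is one way to keep ensemble workflows cheap: many replicas that differ only in parameters can share a single topology and microstate.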

Library level

Library-internal use cases included by the above API implementation scenarios, or connected to the accompanying (re)factoring.

Apply SimulationInput to consuming modules.

Initialize volatile data (internal state) from the (immutable) record of input.

Coordinate a Memento, or publish a light-weight (opaque) handle to simulator output or a checkpoint (don’t bake in details of data locality or structure).
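As an illustration of the Memento idea, the sketch below shows a simulator that hands out an opaque checkpoint handle. ToySimulator, Checkpoint, checkpoint(), and restore() are hypothetical names, and the "integrator" is a placeholder; nothing here reflects the actual mdrun checkpoint code. The point is only that a client (for example a Monte Carlo driver rejecting a trajectory segment) can hold and return the handle without seeing how or where the state is stored.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical simulator whose internal state is private.
class ToySimulator
{
public:
    // Opaque handle: clients can copy and pass it back, but cannot inspect it.
    class Checkpoint
    {
    private:
        friend class ToySimulator;
        struct State
        {
            long                step;
            std::vector<double> positions;
        };
        explicit Checkpoint(std::shared_ptr<const State> state) : state_(std::move(state)) {}
        std::shared_ptr<const State> state_;
    };

    explicit ToySimulator(std::vector<double> positions) : positions_(std::move(positions)) {}

    // Placeholder dynamics: shift every coordinate by 0.5 per step.
    void run(long numSteps)
    {
        for (long i = 0; i < numSteps; ++i)
        {
            for (double& x : positions_) { x += 0.5; }
            ++step_;
        }
    }

    // Capture the current state behind an opaque handle.
    Checkpoint checkpoint() const
    {
        return Checkpoint(std::make_shared<const Checkpoint::State>(
                Checkpoint::State{step_, positions_}));
    }

    // Roll back to a previously captured state.
    void restore(const Checkpoint& saved)
    {
        assert(saved.state_);
        step_      = saved.state_->step;
        positions_ = saved.state_->positions;
    }

    long                       step() const { return step_; }
    const std::vector<double>& positions() const { return positions_; }

private:
    long                step_ = 0;
    std::vector<double> positions_;
};
```

Because the handle exposes no accessors, the storage behind it could later become a file, a serialized buffer, or distributed data without changing client code.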

Module level

Interactions between GROMACS internal modules and the new API facilities or supporting infrastructure.

(Re)initialize internal state.

Dump internal state.

Confirm input validity.

Register information or collaboration dependencies.

Register, publish, or be able to describe available outputs.
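The module-level interactions listed above could be gathered into a single interface, sketched here under stated assumptions. IModule, KeyValues, ToyThermostat, and the "reference_temperature" key are all hypothetical and are not part of the MdModules framework; a real design would likely use a structured key-value tree rather than a flat map.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical flat key/value record standing in for module input.
using KeyValues = std::map<std::string, double>;

// Hypothetical interface covering the module-level use cases.
class IModule
{
public:
    virtual ~IModule() = default;
    // Confirm input validity without changing internal state.
    virtual bool validate(const KeyValues& input) const = 0;
    // (Re)initialize internal state from the immutable input.
    virtual void initialize(const KeyValues& input) = 0;
    // Dump internal state in a form that could be fed back to initialize().
    virtual KeyValues dumpState() const = 0;
    // Describe the outputs this module can publish.
    virtual std::vector<std::string> availableOutputs() const = 0;
};

class ToyThermostat final : public IModule
{
public:
    bool validate(const KeyValues& input) const override
    {
        auto it = input.find("reference_temperature");
        return it != input.end() && it->second > 0.0;
    }

    void initialize(const KeyValues& input) override
    {
        if (!validate(input))
        {
            throw std::invalid_argument("reference_temperature must be present and positive");
        }
        temperature_ = input.at("reference_temperature");
    }

    KeyValues dumpState() const override
    {
        return KeyValues{{"reference_temperature", temperature_}};
    }

    std::vector<std::string> availableOutputs() const override
    {
        return {"temperature"};
    }

private:
    double temperature_ = 0.0;
};
```

Keeping validate() const and separate from initialize() lets a caller check a candidate input against every module before any module changes state.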

Additional goals

Distinguish between (immutable) input and (mutable) program state (clarify stages of initialization, reform inputrec use cases).

Clarify the information hierarchy represented by SimulationInput (and SimulationOutput)

Maximize reusability of the MD runner

  • allow SimulationInput to be reapplied in a process lifetime
  • understand reusable resources or data structures that do not need reinitialization

Define SimulationState encapsulation, or coordinate with its road map.

Deferred

To further clarify the scope of this issue, identify related tasks that should have a more explicit road map, but which are (currently) considered beyond the scope of this feature topic.

  • Decouple Mdrunner collaborations from assumptions of file-based I/O (Remove the ArrayRef<const t_filenm> from gmx::Mdrunner.)
  • Modernize/unify run time simulation options handling (#2877 (closed))
  • Clean up the mdrun call hierarchy and program flow (input aggregation, acquisition of run time resources, component initialization and binding, creation protocols, “runner” versus “simulator”)
  • Decouple Mdrunner from membed and essential-dynamics implementation details.
  • Logging abstraction (#2999 (closed))

Tasks

Use the new SimulationInput abstraction as the focal point for restructuring simulation setup and simulator initialization in flexible API-friendly ways. Work towards clearer representations of prescribed work while decoupling from specific file formats. Allow lighter weight representations and transformations of simulation input for ensemble methods and other many-simulation workflows.

  • Fork t_inputrec into a representation of the (immutable) input data and a separate structure for the remaining (mutable) working data: t_inputrec and t_working_data, respectively.
  • Provide a way to initialize t_working_data from t_inputrec
  • Read files during SimulationInput construction and store their contents in a serialization memory buffer or in copyable versions of t_inputrec, mtop, and t_state. Let the existence of TPR and checkpoint input files be a client-level concern, and hide their handling from the rest of the mdrun call stack.
  • Create RAII holders for shared data that does not already have a clear owner at the Mdrunner::mdrunner level.
  • Apply SimulationInput directly to its consumers. Remove legacy structures from the Mdrunner::mdrunner() level that are used only to ferry data between SimulationInput and its consumers.
  • Allow the SimulationInput to manage distributed data. (Encapsulate the data locality management.)
  • Identify and implement some minor transformations that can be applied to a SimulationInput without re-preprocessing.
  • Allow SimulationInput to be used to generate the filesystem representation of simulation input data (TPR+CPT or some new format).
  • Let grompp produce complete simulation input (including the structures that are not initialized until the checkpointing has been set up).
  • Decouple grompp from the file format(s), and just produce a SimulationInput object.
  • Reimplement gmxapi.simulation operations in terms of SimulationInput.
  • Reimplement gmxapi.modify_input and convert-tpr in terms of SimulationInput. (also ref #3295 (closed))
  • Allow the Simulator (or its output object) to produce a SimulationInput.
  • Converge SimulationInput development with hybrid MD/MC development and Nb-lib input handling.
  • more (please contribute)
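The first two tasks in the list above (forking t_inputrec and initializing working data from it) can be sketched roughly as follows. ToyInputRecord and ToyWorkingData are hypothetical stand-ins for t_inputrec and the proposed t_working_data, with made-up fields; makeWorkingData() and advance() are likewise illustrative names only.

```cpp
#include <cassert>

// Stand-in for the immutable record of input. Consumers only see it by const reference.
struct ToyInputRecord
{
    long   initialStep;
    long   numSteps;
    double timeStep;
};

// Stand-in for the mutable working data derived from the input.
struct ToyWorkingData
{
    long   currentStep;
    long   stepsRemaining;
    double currentTime;
};

// Initialize volatile state from the immutable input. Can be called repeatedly.
ToyWorkingData makeWorkingData(const ToyInputRecord& input)
{
    assert(input.numSteps >= 0);
    return ToyWorkingData{input.initialStep, input.numSteps,
                          input.initialStep * input.timeStep};
}

// Only the working data changes while the simulation advances.
void advance(ToyWorkingData& work, const ToyInputRecord& input)
{
    if (work.stepsRemaining > 0)
    {
        ++work.currentStep;
        --work.stepsRemaining;
        work.currentTime = work.currentStep * input.timeStep;
    }
}
```

With this split, reapplying the same input within one process lifetime amounts to calling makeWorkingData() again, since nothing in the input record was modified during the run.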

A complete concept of the hierarchy of information comprising SimulationInput should be explored, but is neither necessary nor likely for near-term efforts.

Criteria for completion

This issue may remain open as long as it is a useful road map, but can likely be considered “resolved” when the API use cases to support the targeted applications are well understood, and either implemented or independently tracked on another road map.

(from redmine: issue id 3379, created on 2020-02-13 by eirrgang)

  • Relations:
    • relates #3286 (closed)
    • relates #3285 (closed)
    • relates #3433 (closed)
    • relates #3422 (closed)
    • child #3374 (closed)
    • child #3439 (closed)
Edited Mar 08, 2022 by M. Eric Irrgang