Python user interface for obtaining simulation artifacts as files.
Guarantee access to trajectory and checkpoint output files.
- Allow arbitrary command line arguments to mdrun. (#4284 (closed))
- Guarantee -o, and map to mdrun.output.trajectory. (!2192 (merged))
- Allow a runtime_args keyword argument to gmxapi.mdrun(). (!2192 (merged))
- Map mdrun.output.checkpoint to the (curated) -cpo value. (!2340 (merged))
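As a rough illustration of how the items above fit together, the sketch below passes an extra mdrun flag through runtime_args and then reads the trajectory and checkpoint outputs. It assumes a gmxapi version that includes the merge requests listed above, a single (non-ensemble) simulation, and a TPR file supplied by the caller. The function name, the choice of -nsteps, and its default value are illustrative only.

```python
def run_and_locate_outputs(tpr_path, nsteps="1000"):
    """Run one simulation and return the output paths reported by gmxapi.

    Assumption: md.output.trajectory and md.output.checkpoint resolve to
    file path strings, which is the current (not guaranteed) behavior.
    """
    # Imported lazily so that defining this sketch does not require gmxapi.
    import gmxapi as gmx

    simulation_input = gmx.read_tpr(tpr_path)
    # runtime_args maps mdrun command line flags to their values.
    md = gmx.mdrun(simulation_input, runtime_args={"-nsteps": nsteps})
    # Calling result() triggers execution if the work has not run yet.
    return {
        "trajectory": md.output.trajectory.result(),
        "checkpoint": md.output.checkpoint.result(),
    }
```

Because the returned values are ordinary strings today, nothing prevents a caller from treating them as paths, which is the usage the rest of this issue argues should not be relied on.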
Scenario
A user wants to run a simulation and acquire the resulting trajectory in a file with a specific name or even a specific filesystem location.
Remediation
With the resolution of #3144 (closed) and #4284 (closed), a user is able to get
the trajectory file name from an MD operation handle with
md.output.trajectory.result().
Tools outside of the API can be used to stash the output where the user
would like.
This approach has several problems, including:
- If the file is not named or located where the user wants, the user must either copy or move the file. If the file is moved, gmxapi loses access to the simulation output for an operation that was previously considered to have been completed.
- The “trajectory” output of the simulation operation is not intended to remain a file path string in the near future. Treating it as one has to be considered an unsupported use case.
- The approach is not integrated at all with the trajectory appending semantics.
Deferred
- integration with the gromacs library-internal output file management
- robust workflow checkpointing and artifact validation
- non-filesystem-based output.
Follow up (tbd)
Significant questions:
- Under what circumstances should we checkpoint the entire trajectory product of the simulation (retain a complete copy)? That is, are trajectory frames a stream of data events that are consumed and then dropped, or is the entire trajectory a single result? My initial thought is that, in the long run, it is a series of data events, and whether they are retained for the entire graph execution is a function of the consuming operation, not of the mdrun operation.
- If write_trajectory() is a gmxapi operation, what options (if any) should the user have regarding how it is checkpointed? Should completion be a function of the target location, and, if so, what sort of error is encountered if the Context thinks the work is complete, but the output file does not exist or has a different fingerprint than expected? Should the Context retain the ability to re-deliver the file? How might the answer be affected by whether the working directory of the operation is on the same filesystem as the user-named output target (i.e. whether a filesystem “move” operation is a rename versus a data transfer)?
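To make the second question more concrete, here is a minimal sketch of the checks it implies: detecting a missing or altered output file, and deciding whether a move would be a rename or a data transfer. None of this is gmxapi API. The function names, the use of SHA-256 as the fingerprint, and the st_dev comparison (which is only meaningful on POSIX-like systems) are all assumptions made for illustration.

```python
import hashlib
import os
from pathlib import Path


def fingerprint(path, chunk_size=1 << 20):
    """Return a SHA-256 hex digest of the file contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_delivered(target, expected_fingerprint):
    """Classify a previously delivered file as 'ok', 'missing', or 'modified'."""
    target = Path(target)
    if not target.exists():
        return "missing"
    if fingerprint(target) != expected_fingerprint:
        return "modified"
    return "ok"


def same_filesystem(source, target_dir):
    """Return True if moving source into target_dir could be a plain rename.

    Compares device IDs, so both paths must already exist.
    """
    return os.stat(source).st_dev == os.stat(target_dir).st_dev
```

A Context that recorded the fingerprint at delivery time could use a check like check_delivered() to choose between reporting an error and re-delivering the file. Which of those is the right behavior is exactly the open question.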
Proposal
I believe that the best idiom at the high-level interface is to
explicitly convert gmxapi operation output to a filesystem artifact with
a helper that consumes the simulation operation’s trajectory output.
For example:
write_trajectory(myfilename, trajectory=md.output.trajectory)
This allows us flexibility in the details of the trajectory output
handle. It also allows us to confine details of trajectory writing
semantics to the write_trajectory operation. For instance, it is clear
when the expected behavior is to produce a complete trajectory in a
single file, so it becomes clear what is required to checkpoint or
relaunch a partially executed work graph. It also clarifies that we do
not need to keep the full trajectory produced by mdrun in order to
consider an MD node to be complete, as long as the Context is able to
support the checkpointing behavior required by the trajectory consumers.
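The proposed helper does not exist yet. The following is only a plain-Python sketch of the calling convention, not a gmxapi operation. It assumes the trajectory handle either is a path or exposes a result() method that resolves to a path, as md.output.trajectory.result() does today, and it copies rather than moves so that the producing operation keeps its own output.

```python
import shutil
from pathlib import Path


def write_trajectory(filename, trajectory):
    """Hypothetical helper: deliver trajectory output to a named file.

    `trajectory` may be a path, or any handle with a result() method that
    resolves to a path. Only this helper needs to know which.
    """
    resolved = trajectory.result() if hasattr(trajectory, "result") else trajectory
    source = Path(resolved)
    target = Path(filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy instead of move, so the upstream operation's output stays in place.
    shutil.copy2(source, target)
    return str(target)
```

Since the caller only ever passes the handle, the representation behind md.output.trajectory could later change (for example, to a stream of frames) without changing the call shown in the proposal.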
Additional discussion
Some older relevant discussions should be migrated to the GROMACS issue tracking system, but can be found on GitHub: