Use build system infrastructure instead of custom scripts to manage API levels.
Through GROMACS 2021, we use source validation scripts and extra macros in various build system tools to reconcile documentation organization and visibility, and internal header usage in terms of declared API levels.
The heuristics are not particularly easy to resolve during development and are not enforced at the most useful times.
This is a proposal to use a combination of repository layout and basic CMake functionality to improve the clarity, maintainability, and rigor of API / header scoping. As a result, we will need less custom infrastructure (source tree checker scripts).
Side benefits include decoupling from Doxygen in several ways that should make it easier to update our Doxygen version requirements in the future.
Proposal: Define API level and dependency policies via CMake
Module dependencies are explicit in the CMakeLists.txt.
Dependence on a target in api/
allows access to (one of the) installed API(s).
Additional build-tree-only targets provide interfaces within the build-tree / library-internal module public API.
This necessarily means that the sources and headers of all API levels
cannot stay mixed together the way they are. The ability to
#include "gromacs/moduleA_installed_header.h"
should not convey the
ability to #include "gromacs/moduleB_detail.h"
at build time.
moduleB_detail.h
would only be available as moduleB_detail.h
to
sources in the moduleB
target. moduleB_public.h
would only be
available to moduleA
as gromacs/moduleB_public.h
by depending on
moduleA in CMakeLists.txt.
api/gmxapi
and api/gmxapi/include/
demonstrate how this can
work.
However, applied within src/
, this confuses the current include
sorter script.
Headers are accidentally exposed because all of the modules in
src/gromacs/moduleX
have src/
in their include path.
The five headers in src/
could easily be moved to a subdirectory.
A target that exposes headers for inclusion without an additional path element
(e.g. #include "header.h"
) should not transitively allow headers to be
included through nested directories if those include paths are expected to
have different visibility or control.
Similarly, the headers that are accessible through #include "gromacs/header.h"
should be exposed by a target that does not transitively allow
#include "gromacs/module/header.h"
if we want to enforce a difference in
visibility of the latter.
Module-internal headers can be in the same directory as the module source
files. The simplest way to distinguish module-internal headers from the
allowed library-internal or public installed interface is not to make this
directory accessible through any CMake INCLUDE_DIRECTORY (either directly
or via the immediate parent directory, which would correspond to enabling
the two #include
styles).
Module API level:
Option 1: API level is determined at the scope of a module. A module’s public interface is either in the installed headers / public API or the module is available in the build-tree only. Doxygen visibility directives are not necessary, because the modules in the two APIs are curated.
Option 1.1: The installed API is provided by one (or a few) module(s) explicitly as a distinct API and C++ namespace.
Option 2: Each module may have installed headers and library-internal
headers. Two completely separate sets of API documentation are
generated, or at least C++ namespace should
help alert library API doc viewers / module maintainers when a symbol is
actually part of the installed API. It is straight-forward to export
different sets of include directories for the build tree interface and
for the installed interface / imported targets. I think that the scheme
that would be most consistent with past interface levels would be to
export the build-tree / library-internal interface include directory
such that headers are properly referenced as #include "module-header.h"
while installed headers are in include directories such that they are
included as #include "gromacs/module/module-header.h"
. If that is a sufficient
distinction, then only a single build-tree linking target is necessary
per module.
Documentation for API levels becomes scoped by file and location. The list(s) of installed headers (necessary for the CMake target INSTALL command) is the complete list of files for Doxygen to process for the public/installed API. The library-internal / modules API can be documented largely as before, but should link to the public API documentation for entities that are no longer in scope.
Update
Summarizing the conclusions of recent meetings and code reviews:
- Modules within
src/
are not installed. - Modules within
src/
provide their "public" headers (to other modules and build-tree clients) by providing an INTERFACE_INCLUDE_DIR such assrc/gromacs/moduleA/include/
. - Installed headers are provided by a target in
api/
. The canonical interfaces includeapi/nblib
andapi/gmxapi
. Alegacy_api
target is considered "unsupported," but is established as the transitional home for previous installed headers that were not tightly coupled to any CMake target (though nominally related to thelibgromacs
installed target). - The
legacy_modules
target in !793 (merged) is a very short-term solution and should be eliminated within a week or two through a concerted effort. - Since we will minimize the transitional period, it is reasonable to allow temporary transitive exposure of private headers, so we will not bother to move sources from, say,
src/gromacs/moduleA/private.h
tosrc/gromacs/moduleA/cpp/private.h
. Developer suggestion: temporarily move sources to a subdirectory to help identify dependencies, but move them back again before committing the update.
Distinguish between API documentation scope and API documentation verbosity.
- It may be appropriate or unavoidable in some cases for public API
headers to include notes, details, or references beyond the scope of
the public API (such as a developer note explaining the need for a
forward reference from the library API, or a necessary detail that
is explicitly excluded from the compatibility guarantee of the
public API specification). Such details can be hidden (e.g.
`\internal) to be excluded from the standard public API documentation, but revealed in developer builds of the public API documentation. Ref #3402 (closed).
- The library-internal API may similarly have standard documentation for normative use cases, plus internal implementation details or other more verbose documentation revealed by a more verbose documentation build target. Public API documentation accessible from the library-internal API documentation should have multiple visual reminders of the API distinction.
- When browsing the library documentation, it should be obvious to the developer of new library code whether a dependency they are considering is in the public API or library-internal API, and easy to identify module details that are not part of the library-internal API. Doxygen doesn't have great mechanisms for identifying which content was rendered conditionally (
\\cond@ or
\internal`), so it may be appropriate to use a clearly distinct documentation scope for module-internal docs, so that the scope of the library-internal API is not obscured when browsing non-public module details.
The easiest way to make these distinctions is to use distinct Doxygen groups for different API levels. In other words, if we want to continue to define “modules” such that a module expresses in multiple documented API levels, documentation clarity and maintainability can be improved by using multiple Doxygen groups per module instead of one group per module.
Questions:
- Should the public API documentation link to library API documentation or provide separate local docs for e.g. forward references or types / functionality that is opaque at that API level?
Further Questions
Question: Is there a preference for whether library or module interface headers
are necessarily nested in single subdirectory trees versus separated
into parallel directory trees? E.g.
$PROJECT_SOURCE_DIR/src/gromacs/module/private.h
plus
$PROJECT_SOURCE_DIR/src/gromacs/module/include/gromacs/module/private.h
,
versus $PROJECT_SOURCE_DIR/include/gromacs/module
plus
$PROJECT_SOURCE_DIR/src/gromacs/module
.
Answer: Modules in src/
are not installed targets.
Modules in src/
may be responsible for implementing headers from the legacy_api
INTERFACE target as part of the libgromacs installed target, but this is only encoded by naming conventions and not enforced.
Question: A handful of “convenience headers” in src/gromacs
aggregate “include”s
of module headers, providing multiple ways to access the same API
without a clear indication of what is normative. If we want module APIs
to be expressed in gromacs/module.h
headers, or to distinguish between
gromacs/module.h
and gromacs/module/feature.h
, we should clarify the
meaning of these paths. Is there a compelling reason not to just get rid
of these headers?
Answer: (proposed) these headers are part of the legacy API, which has been officially unsupported.
We can remove these headers as an easy step in trimming the installation footprint of legacy_api
,
but there are some side-effects with non-zero visibility, such as the sample TAF code in share/template
.
Question: Is more necessary to decouple from specific Doxygen versions (constraints on Doxygen version?)
Other checks to reimplement
gmxpre.h
go away?
Can Check for a precompiler define instead of #include "gmxpre.h"
, or
replace “gmxpre.h” with a preprocessor define provided by CMake?
sort includes with clang-format instead of includesorter.py
Use Doxygen directly instead of script acting on Doxygen XML for dependency graphs.
reconsider ${PROJECT_BINARY_DIR}/src/gromacs/installed-headers.txt
Document doxygen groups and directories in module headers instead of misc.cpp and directories.cpp
Consider freezing API and implementation versions.
Use monotonically versioned implementation namespaces (for (a) supporting multiple API versions such as file format revisions, and (b) clarifying API versions at compile time without worrying about API compatibility guarantees or versioning API. Very limited use may be allowed for © feature development or API migration across multiple commits, with tracking and approval of the release manager)
Documentation
(From !902 (closed))
Several policies and conventions are affected by these changes to build system dependency management.
Several files should at least be reviewed. I don't think GitLab let's us comment on files that aren't in the MR yet, so here is a list we can check off once files are confirmed "okay" or have been added to the MR.
-
docs/doxygen/user/usinglibrary.md
-
docs/release-notes/2022/?
-
docs/dev-manual/build-system.rst
-
docs/dev-manual/documentation-generation.rst
-
docs/dev-manual/doxygen.rst
-
docs/dev-manual/formatting.rst
-
docs/dev-manual/gmxtree.rst
-
docs/dev-manual/includestyle.rst
-
docs/dev-manual/naming.rst
-
docs/dev-manual/style.rst
-
docs/dev-manual/testutils.rst
-
docs/dev-manual/tools.rst
There isn't really a single doc dedicated to our testing patterns, so it may not be clear where to document strategies for test dependencies. If we want to reinforce the notion that gtest tests should be built in terms of the src/testutils/unittest_main.cpp
, then this file or other testutils
module documentation would be reasonable. Otherwise, we can just double-check the clarity of the docs/dev-manual
tree.