Terminology for extension module naming

This one is slightly long, @lewisjarednz. Jumping on a call might be simpler; just shoot me a message.

The problem

Our terminology isn't crystal clear, which makes using fgen downstream a little tricky and will affect our own development at some point too.

Further context

I am building a copier template in https://gitlab.com/magicc/copier-fgen-based-repository. This has revealed the following inconsistency in our language:

  • In pyproject.toml, under [tool.scikit-build.cmake.define], we set (for example) EXTENSION_MODULE_NAME = "_lib". Here we're saying that the extension module is called "_lib", so scikit-build creates a library called "_lib", which we can then import and use from Python if the paths/installation are correct.
  • When we call fgen from the command line, we do something like fgen generate --extension derived_types._lib .... Here 'extension' (which is close to 'extension module') must include the Python package (in this case derived_types) so that the Python code does from derived_types._lib import w_fortran_thing rather than the incorrect from _lib import w_fortran_thing. The second is incorrect because the library is installed inside a Python package, i.e. it isn't a top-level import.

So we use extension/extension module name to mean two different things: in one case, the name of the extension module that scikit-build will build; in the other, the full Python import path.
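
To make the two meanings concrete, here is a minimal sketch of how they relate. The helper name full_import_path is hypothetical and is not part of fgen; it only illustrates that the value currently passed to --extension is the Python package name joined with the EXTENSION_MODULE_NAME from pyproject.toml.

```python
def full_import_path(python_package: str, extension_module_name: str) -> str:
    """Join a Python package name and an extension module name into an import path.

    Hypothetical helper for illustration only; fgen does not provide this.
    """
    return f"{python_package}.{extension_module_name}"


# Meaning 1: the value set as EXTENSION_MODULE_NAME in pyproject.toml
extension_module_name = "_lib"

# The Python package that the extension module is installed into
python_package = "derived_types"

# Meaning 2: the value currently passed to `fgen generate --extension`
extension = full_import_path(python_package, extension_module_name)

# The import statement that the Python code needs
import_statement = f"from {extension} import w_fortran_thing"
print(import_statement)
```

Keeping the two pieces separate and deriving the full path from them would mean each term has exactly one meaning.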

Proposed solution

  1. Clarify terminology
  2. Re-consider our CLI

To clarify terminology, here are some notes on the terms I think we currently use (although this is probably something we should capture somewhere more formally):

  • extension/extension module name: the name of the extension module that will be available from Python, usually a more generic name like "_lib". We should also recommend using a leading underscore in the name (as in "_lib") to signal to users that this library is not meant to be used directly (numpy also discourages users from directly using the Fortran libraries).
  • ancillary_lib_name: the name of the ancillary library that gets built from your Fortran, e.g. ocean_carbon_cycle, core, methane_cycle. I think this name is only used internally, so it is less important.
  • python_project_name: the name of the Python project (e.g. magiccly)
  • wrapper-directory: when generating files with fgen, where to put the generated Fortran *manager.f90 and *wrapped.f90 files
  • python-directory: when generating files with fgen, where to put the generated Python files

For our CLI, it currently looks like

fgen generate --extension [x in 'from x import w_fortran_thing'] --wrapper-directory [wrapper-directory] --python-directory [python-directory] [yaml-file-to-generate-from]

I would suggest updating to

fgen generate --wrapper-directory [wrapper-directory] --python-directory [python-directory] [yaml-file-to-generate-from]

where we would put the Python information into the .yaml file. I think this could work because it would create a single, clear place for our Python import information. That covers both the Python import path needed to use whatever the yaml file refers to (e.g. from derived_types._lib import w_derived_type_build) and any other imports. I remember us discussing the other day that you need to know the location of other wrapped Python classes for some dependencies to flow correctly, so this is where we could specify that the Python wrapper also needs from some_other_wrapped_class.path import WrappedClass.
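
As a rough sketch of what this could look like, the snippet below takes a dict standing in for the parsed .yaml content and renders the import lines for the generated Python wrapper. All of the keys (extension_import_path, extension_imports, additional_imports) and the function render_imports are assumptions made up for illustration; none of them exist in fgen today.

```python
# Hypothetical content of the .yaml file after parsing into a dict.
# These keys are illustrative assumptions, not an existing fgen schema.
python_config = {
    # Full import path of the extension module (what --extension carries today)
    "extension_import_path": "derived_types._lib",
    # Names to import from the extension module
    "extension_imports": ["w_derived_type_build"],
    # Other wrapped classes that the generated wrapper depends on
    "additional_imports": [
        {"module": "some_other_wrapped_class.path", "names": ["WrappedClass"]},
    ],
}


def render_imports(config: dict) -> list:
    """Render the import statements implied by the Python section of the config."""
    names = ", ".join(config["extension_imports"])
    lines = [f"from {config['extension_import_path']} import {names}"]

    # Additional imports are optional, so default to an empty list
    for extra in config.get("additional_imports", []):
        extra_names = ", ".join(extra["names"])
        lines.append(f"from {extra['module']} import {extra_names}")

    return lines


import_lines = render_imports(python_config)
for line in import_lines:
    print(line)
```

With something like this, the CLI would only need the two output directories and the yaml file, and every import-related decision would live in one place.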

Edited by Zebedee Nicholls