Why is importlib.metadata much slower than importlib.util.find_spec?
Not sure whether this should be reported here or on bugs.python.org.
In a clean Py3.8 venv with just ipython installed:
In [1]: import importlib.util, importlib.metadata
In [2]: %timeit importlib.util.find_spec("pip")
93.5 µs ± 256 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [3]: %timeit importlib.metadata.distribution("pip")
5.59 ms ± 58.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
As far as I know, importlib.util.find_spec does no caching (that's why I'm timing it rather than the import itself), so my uneducated guess is that it should spend about as long looking for the importable module/package as importlib.metadata.distribution spends looking for the dist-info/egg-info. But with the numbers above (5.59 ms vs. 93.5 µs), importlib.metadata is roughly 60x slower. Why is this the case?
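For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using plain timeit. The helper names (per_call_seconds, lookup_dist) are my own, and the loop count is an arbitrary choice; absolute numbers will differ by machine and environment.

```python
import importlib.metadata
import importlib.util
import timeit


def per_call_seconds(func, number=100):
    """Average wall-clock seconds per call of func over `number` calls."""
    return timeit.timeit(func, number=number) / number


def lookup_dist(name):
    """Look up a distribution; return None if it is not installed."""
    try:
        return importlib.metadata.distribution(name)
    except importlib.metadata.PackageNotFoundError:
        return None


name = "pip"  # same target as in the session above
spec_time = per_call_seconds(lambda: importlib.util.find_spec(name))
dist_time = per_call_seconds(lambda: lookup_dist(name))

print(f"find_spec:    {spec_time * 1e6:10.1f} µs per call")
print(f"distribution: {dist_time * 1e6:10.1f} µs per call")
```

Both lookups are timed with the same loop count so the per-call figures are directly comparable.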
As a real application, consider packages that may want to set their __version__ at runtime by querying their own distribution version using importlib.metadata (e.g., if the version originally comes from git tags, using setuptools_scm). 5 ms would be a not insignificant price to pay at import time for that, especially if multiple dependent packages do the same.