Decentralised package manager
Summary
Inko needs a package manager to facilitate the creation and distribution of third-party modules. Traditionally languages use a centralised package archive, making it easy to find modules to install. For Inko, I want a decentralised model like Go: modules are shipped as Git repositories, and can be hosted anywhere you'd like. The package manager has to work on all platforms that Inko supports.
Motivation
Without a package manager, Inko will never take off, unless somebody writes their own. The problem with a third-party package manager is that it may not integrate as well as a built-in one. There's also the risk of different people writing their own package managers. Take Python, for example: it has easy_install, pip, conda, poetry, pipenv, and probably more. Some of these tools overlap, some don't. I don't want to repeat this mess.
Since we don't have the resources to host a central repository, I would like the package manager to take a decentralised approach. This lets us offload hosting costs to the likes of GitHub and GitLab, so we can focus on building the CLI.
To make it easier to discover these packages, we could provide some sort of package registry. This would just be a Git repository and a static website. Packages are added using regular merge requests, and only provide some basic information such as:
- The URL they are located at
- A description
- Some contact information
- Links to the issue tracker, documentation, etc
The decentralised approach also comes with the benefit that name squatting is not an issue. Since packages are identified by a URL (or part of a URL), you can't register the name "apple" and prevent anybody else from using it. Instead, you'd end up registering "gitlab.com/alice/apple".
Implementation
The list below is not exhaustive, just what I can think of at this time.
Package names and namespaces
Packages are just Git repositories, for now. In the future we could perhaps
support other SCM software such as Mercurial, but that's not a priority for now.
The package name is not tied to the namespace, meaning that the package gitlab.com/inko-lang/http is free to choose what namespace(s) it defines, instead of being forced to use gitlab_com::inko_lang::http or something like that. Instead, the namespace could just be http.
Packages aren't allowed to use the std namespace, mostly so they don't mess with the standard library by accident. Installing such a package should lead to an error.
To reduce the amount of typing, a package is identified by its host and a path. So instead of https://gitlab.com/inko-lang/http, the package ID is gitlab.com/inko-lang/http.
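As a rough sketch of how a package ID could be derived from a repository URL, here is a small Python function. The name package_id is hypothetical, and stripping a trailing ".git" suffix is my own assumption, not something decided above.

```python
from urllib.parse import urlsplit


def package_id(url: str) -> str:
    """Derive a package ID (host plus path) from a repository URL."""
    parts = urlsplit(url)
    if not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    # Drop the scheme and any leading or trailing slashes.
    path = parts.path.strip("/")
    # Assumption: a ".git" suffix is not part of the package ID.
    if path.endswith(".git"):
        path = path[: -len(".git")]
    if not path:
        raise ValueError(f"URL has no repository path: {url!r}")
    return f"{parts.netloc}/{path}"
```

Calling package_id("https://gitlab.com/inko-lang/http") gives gitlab.com/inko-lang/http, matching the example above.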
Each package must contain a src directory that contains the Inko source code of the package.
Versioning and dependency resolution
For versioning we'll use semantic versioning. To simplify version selection, we should only support specifying a minimum version of a dependency. When multiple packages depend on the same dependency, we'll pick the most recent version that matches the constraints. For this we'll enforce semantic versioning, meaning it's an error to have different major versions for the same package in your dependency tree.
An example: your project "kittens" specifies the following dependencies:
- foo >= 1.0
- bar >= 1.5
- baz >= 2.0
Because we enforce semantic versioning, this translates to the following:
- foo >= 1.0, < 2.0
- bar >= 1.5, < 2.0
- baz >= 2.0, < 3.0
Package "baz" specifies the following dependency:
- bar >= 1.6
In this case, the following packages/versions will be installed:
- foo: 1.0 or a newer 1.x
- bar: 1.6 or a newer 1.x
- baz: 2.0 or a newer 2.x
If "baz" instead required bar >= 2.0, a dependency error is produced, because "kittens" doesn't work with "bar" 2.0 or newer.
More complex ranges such as >= 1.5, <= 1.8 are not supported. I don't want to complicate our dependency solver just so a few people out there can specify crazy version requirements. Instead, packages should follow semantic versioning properly.
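To make the minimum-version rule a bit more concrete, here is a minimal Python sketch. It is not a full solver: it only merges the minimums collected from the dependency tree and rejects conflicting major versions. The function names, and the use of two-part versions (major.minor, as in the example), are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int]  # (major, minor); patch is omitted to mirror the example


def parse(version: str) -> Version:
    major, minor = version.split(".")
    return int(major), int(minor)


def merge_minimums(requirements: List[Tuple[str, str]]) -> Dict[str, Version]:
    """Combine ">= X.Y" requirements into one minimum per package.

    Each requirement implies "< (X + 1).0", so two requirements on the same
    package with different major versions can never both be satisfied.
    """
    merged: Dict[str, Version] = {}
    for name, minimum in requirements:
        wanted = parse(minimum)
        current = merged.get(name)
        if current is None:
            merged[name] = wanted
        elif current[0] != wanted[0]:
            raise ValueError(
                f"{name}: conflicting major versions {current[0]} and {wanted[0]}"
            )
        else:
            # Same major version: the highest minimum satisfies both.
            merged[name] = max(current, wanted)
    return merged


def pick(minimum: Version, available: List[str]) -> str:
    """Pick the newest available version within the minimum's major version."""
    candidates = [
        v for v in available
        if parse(v)[0] == minimum[0] and parse(v) >= minimum
    ]
    if not candidates:
        raise ValueError("no available version satisfies the requirement")
    return max(candidates, key=parse)
```

Feeding this the requirements of "kittens" plus the bar >= 1.6 requirement of "baz" yields a single minimum per package, while adding bar >= 2.0 raises an error instead.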
To resolve such versions we probably need some form of SAT solver. I think we might be able to implement this using a heuristic, provided we can prove that it completes in reasonable time.
Package versions and metadata
A version is just a Git tag or branch. So when you install gitlab.com/inko-lang/http version 1.0.0, it will install it from tag v1.0.0.
Each package must define a file at the root called package.toml. I'm not a huge fan of TOML, but it's better than YAML and JSON. This file must specify at least the following:
- The current version
- The dependencies and their versions
- Some basic info such as the description, authors, project website, etc
- The minimum Inko version required
The version in this TOML file is used when creating the necessary directories (see below). The exact structure of this file is not yet defined.
When specifying a dependency, you must specify not only the version but also the SHA of the commit that version points to. This way, if a tag is changed, you don't end up installing the wrong thing unexpectedly.
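The tag and SHA check could look roughly like the Python sketch below. The names tag_for and verify_pin are hypothetical, and remote_tags stands in for whatever the package manager learns from the remote (for example the output of git ls-remote --tags); how that data is gathered is not covered here.

```python
from typing import Dict


def tag_for(version: str) -> str:
    """Map a version such as "1.0.0" to the Git tag "v1.0.0"."""
    return f"v{version}"


def verify_pin(version: str, pinned_sha: str, remote_tags: Dict[str, str]) -> str:
    """Return the SHA to check out, or raise if the tag doesn't match the pin.

    `remote_tags` maps tag names to commit SHAs (an assumed input format).
    """
    tag = tag_for(version)
    actual = remote_tags.get(tag)
    if actual is None:
        raise LookupError(f"tag {tag} does not exist")
    if actual != pinned_sha:
        # The tag was moved after the dependency was pinned.
        raise ValueError(f"tag {tag} points at {actual}, expected {pinned_sha}")
    return actual
```

If somebody moves v1.0.0 to a different commit, verify_pin raises instead of silently installing the new code.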
Install location and structure
Packages should be installed into XDG_DATA_HOME/inko/packages. We shouldn't install directly into XDG_DATA_HOME/inko so we can add future directories/files in there more easily.
In this directory there is a directory for every host, so:
- XDG_DATA_HOME/inko/packages/gitlab.com/
- XDG_DATA_HOME/inko/packages/github.com/
These directories will then contain directory trees that match the package paths. So a package gitlab.com/inko-lang/http maps to XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http.
These directories will then have a version directory for every version. For example:
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.0
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.1
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.1.0
These directories in turn will have a src directory that contains the source code:
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.0/src
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.1/src
- XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.1.0/src
These directories can then define whatever modules they wish to expose. These src directories can then be added to the compiler's load path. Assuming you were to specify these manually using the CLI, that would translate to something like this:
inko run \
    -i $XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.1/src \
    -i $XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/kittens/1.5.0/src \
    foo.inko
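The mapping from package ID and version to a source directory is mechanical, so here is a small Python sketch of it. The function names are hypothetical. The fallback to ~/.local/share when XDG_DATA_HOME is unset or empty follows the XDG Base Directory specification; what to do on platforms without XDG conventions (such as Windows) is left open.

```python
from pathlib import PurePosixPath
from typing import List, Mapping, Tuple


def data_home(env: Mapping[str, str]) -> PurePosixPath:
    """Resolve XDG_DATA_HOME, falling back to the XDG default ~/.local/share."""
    configured = env.get("XDG_DATA_HOME")
    if configured:
        return PurePosixPath(configured)
    return PurePosixPath(env["HOME"]) / ".local" / "share"


def source_dir(package: str, version: str, env: Mapping[str, str]) -> PurePosixPath:
    """Return the src directory of an installed package version."""
    return data_home(env) / "inko" / "packages" / package / version / "src"


def include_flags(deps: List[Tuple[str, str]], env: Mapping[str, str]) -> List[str]:
    """Build the -i arguments for a list of (package, version) pairs."""
    flags: List[str] = []
    for package, version in deps:
        flags += ["-i", str(source_dir(package, version, env))]
    return flags
```

In practice env would be os.environ; passing it in explicitly just keeps the sketch easy to test.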
Namespace collisions
In this setup it's possible for two packages to define the same namespace. For example, both the http and kittens packages above could define foo::bar as a namespace. Due to how the load path is traversed, whatever package defines the namespace first ends up winning. I can see three options:
1. Error when two or more packages define the same namespace (when compiling your project).
2. Ignore it and let the developer handle it.
3. Make the last segment of the package identifier's path the root namespace.
For option 3 that means package gitlab.com/inko-lang/http can only introduce a top-level http namespace. This would have to be enforced at install time. This doesn't require a different directory structure compared to the one mentioned above, though it leads to a somewhat funny structure like so:
XDG_DATA_HOME/inko/packages/gitlab.com/inko-lang/http/1.0.0/src/http/foo.inko
The downside of this approach is that renaming the repository also requires renaming the namespace. With that said, if a repository is renamed I think it should be treated as a different package (it may very well be at that point), and thus also a different namespace.
Either way, this will require some thinking before we make a final decision.
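To help compare the options, here is a Python sketch of what the first and third could check. Both function names are hypothetical, and I'm assuming each package's namespaces are available as a list of full names such as foo::bar; how that list is produced (for example by walking src) is not shown.

```python
from collections import defaultdict
from typing import Dict, List


def find_collisions(modules: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """First option: return every namespace defined by more than one package."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for package, namespaces in modules.items():
        for namespace in set(namespaces):
            owners[namespace].append(package)
    return {ns: sorted(pkgs) for ns, pkgs in owners.items() if len(pkgs) > 1}


def check_root_namespace(package: str, namespaces: List[str]) -> None:
    """Third option: every namespace must start with the last path segment."""
    root = package.rsplit("/", 1)[-1]
    for namespace in namespaces:
        if namespace.split("::", 1)[0] != root:
            raise ValueError(
                f"{package} may only define {root}::*, found {namespace}"
            )
```

The first check needs to know about every package in the project, so it can only run at compile time. The second only looks at a single package, which is why it can run at install time.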
Incremental compilation
One day the Inko compiler will support incremental compilation. At that point installing a package should also compile its bytecode, removing the need to do so every time a project needs it. For now this is not a priority, as incremental compilation is something we likely won't support for quite some time.
System dependencies
I'm not planning on adding support for compiling code, such as C libraries, upon installation (e.g. how in Rust you can use a build.rs file to compile additional non-Rust code). This introduces a myriad of security problems (e.g. build scripts uploading your SSH keys), and I want developers to write their code in Inko, not C or something else. Managing system-level dependencies gets super hairy, and I prefer to defer that to tools dedicated to that.
We may change this decision in the future, but not until there are enough use cases that warrant supporting this.
CLI
The CLI should be built into the inko command. I'm thinking of the following commands:
- inko install: installs a package
- inko uninstall: uninstalls a package
- inko show: shows info about a package (its name, install location, etc)
- inko list: lists all installed packages
These would all be implemented in Rust, as Inko won't have the necessary dependencies (e.g. an HTTP client) for quite some time.
Drawbacks
Using Git requires the use of libgit2, which in turn introduces a whole bunch of Rust dependencies. It may also slow down compile times.
An alternative is to install tarballs instead of Git repositories. This requires that for every host (GitHub, GitLab, etc) we know how to map a package name and version to the correct tarball URL. This is not that much work, but it prevents developers from hosting packages in private Git repositories on hosts the Inko package manager doesn't know about. In addition, this may not play nicely with installing dependencies from a branch. Since developers inevitably want to install from Git (e.g. when testing some quick fix), we probably need to support that anyway.
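To illustrate the per-host mapping the tarball approach needs, here is a Python sketch. The function name tarball_url is hypothetical, and the URL patterns reflect my understanding of how GitHub and GitLab currently expose tag archives; they would have to be checked against each host's documentation before relying on them.

```python
def tarball_url(package: str, version: str) -> str:
    """Map a package ID and version to a tarball URL for known hosts."""
    host, _, path = package.partition("/")
    name = path.rsplit("/", 1)[-1]
    tag = f"v{version}"
    if host == "github.com":
        # Assumed pattern for GitHub tag archives.
        return f"https://github.com/{path}/archive/refs/tags/{tag}.tar.gz"
    if host == "gitlab.com":
        # Assumed pattern for GitLab tag archives.
        return f"https://gitlab.com/{path}/-/archive/{tag}/{name}-{tag}.tar.gz"
    # Any other host, including self-hosted ones, is unsupported.
    raise ValueError(f"no tarball URL scheme known for host {host}")
```

The final ValueError is the drawback in a nutshell: a package on git.example.com simply can't be installed this way unless we teach the package manager about that host.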