Scaling Git: S3 Proof of Concept

Introduce a new "s3" object database source that stores Git packfiles in S3-compatible object storage. This complements the existing "files" backend and lets a repository keep its entire object database in object storage.

The backend is built on top of the embedded packed-object source: on first access it fetches a content-addressed manifest from S3, downloads any packfiles not already present in a local cache, and then delegates all object lookups to the packed store. Writes produce a new packfile, upload it together with its index, and atomically update the manifest pointer. Transactions queue objects and flush them as a single packfile on commit.
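
The read and write paths described above can be sketched roughly as follows. This is a toy model, not the code from this merge request: `FakeBucket`, the key layout (`current`, `manifests/`, `packs/`), the JSON manifest shape, and the placeholder pack and index contents are all assumptions made for illustration. It also assumes a single writer, so that overwriting the pointer object is the only step that changes what readers see.

```python
import hashlib
import json


class FakeBucket:
    """Hypothetical in-memory stand-in for an S3 bucket (key -> bytes)."""

    def __init__(self):
        self.objects = {}
        self.fetched = []  # keys requested via get(), in order

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        self.fetched.append(key)
        return self.objects[key]


def sync(bucket, cache):
    """Fetch the current manifest and download packfiles missing from the cache."""
    if "current" not in bucket.objects:
        return []  # empty store, nothing to download
    manifest_id = bucket.get("current").decode()
    manifest = json.loads(bucket.get("manifests/" + manifest_id))
    for name in manifest["packs"]:
        for ext in (".pack", ".idx"):
            if name + ext not in cache:
                cache[name + ext] = bucket.get("packs/" + name + ext)
    return manifest["packs"]


def commit(bucket, cache, queued_objects):
    """Flush queued objects as one pack, upload it, then swap the pointer."""
    packs = sync(bucket, cache)

    # Placeholder encoding: a real backend would write Git's pack and index formats.
    pack = b"".join(queued_objects)
    idx = json.dumps([len(obj) for obj in queued_objects]).encode()
    name = "pack-" + hashlib.sha256(pack).hexdigest()

    bucket.put("packs/" + name + ".pack", pack)
    bucket.put("packs/" + name + ".idx", idx)
    cache[name + ".pack"] = pack
    cache[name + ".idx"] = idx

    # The manifest is named after the hash of its own content.
    body = json.dumps({"packs": packs + [name]}, sort_keys=True).encode()
    manifest_id = hashlib.sha256(body).hexdigest()
    bucket.put("manifests/" + manifest_id, body)

    # Last step: readers keep seeing the old manifest until this single write.
    bucket.put("current", manifest_id.encode())
    return manifest_id
```

Because packfiles and manifests are written under content-derived names and never modified, a reader that resolved an older manifest still finds every object it references. With concurrent writers, the final pointer update would additionally need some form of conditional write, which this sketch does not model.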

Authentication uses AWS Signature Version 4; credentials are read from the S3_KEY_ID and S3_KEY_SECRET environment variables. The current manifest version can be pinned via GIT_S3_MANIFEST to obtain a stable, point-in-time view of the object store.
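
For readers unfamiliar with Signature Version 4, the sketch below shows the standard signing-key derivation chain and how the two environment variables might feed into it. It is an illustration in Python, not the implementation in this merge request; the helper names are hypothetical, and the region value is an assumption, since the description does not say which region the backend signs for.

```python
import hashlib
import hmac
import os


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def credentials_from_env(env=os.environ):
    """Read the key ID and secret from the variables named in the description."""
    return env["S3_KEY_ID"], env["S3_KEY_SECRET"]


def derive_signing_key(secret: str, date: str, region: str, service: str = "s3") -> bytes:
    """SigV4 key derivation: chained HMAC-SHA256 over date, region, service.

    `date` is the request date as YYYYMMDD.
    """
    k_date = _hmac_sha256(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(string_to_sign: str, signing_key: bytes) -> str:
    """Hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

The derived key is scoped to one date, region, and service, so the long-lived secret never signs a request directly. Building the canonical request and the string to sign is omitted here.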

To use the backend, spin up an S3-compatible store (e.g. MinIO or SeaweedFS), initialize a repository, and set the object format extension:

```shell
$ git init repo
$ git -C repo config set extensions.objectFormat \
      s3://http://localhost:9000/<bucket>/<prefix>
$ S3_KEY_ID=... S3_KEY_SECRET=... git <command>
```
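
The configured value nests an HTTP endpoint inside an `s3://` URL. As a rough illustration of how such a value could be split into endpoint, bucket, and key prefix, here is a small Python sketch. The function and type names are hypothetical, and the path-style layout (bucket as the first path component) is inferred from the `<bucket>/<prefix>` form shown above rather than taken from the implementation.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class S3Location(NamedTuple):
    endpoint: str  # e.g. "http://localhost:9000"
    bucket: str
    prefix: str    # may be empty


def parse_s3_url(url: str) -> S3Location:
    """Split "s3://<endpoint-url>/<bucket>/<prefix>" into its parts."""
    scheme = "s3://"
    if not url.startswith(scheme):
        raise ValueError("expected an s3:// URL")

    # Everything after "s3://" is treated as an ordinary HTTP(S) URL.
    inner = urlsplit(url[len(scheme):])
    if inner.scheme not in ("http", "https") or not inner.netloc:
        raise ValueError("expected an http(s) endpoint after s3://")

    bucket, _, prefix = inner.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("missing bucket name")

    return S3Location(f"{inner.scheme}://{inner.netloc}", bucket, prefix)
```

For example, `parse_s3_url("s3://http://localhost:9000/mybucket/repos/demo")` yields the endpoint `http://localhost:9000`, the bucket `mybucket`, and the prefix `repos/demo`.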
Edited by Patrick Steinhardt
