S

sel4-alloc

sel4-alloc

Crates.io

x86 Documentation | ARM Documentation

Implementations of the ObjectAllocator trait from rust-sel4. You can use this crate to manage CSpace, VSpace, and untyped regions.

Resource allocation on seL4 is entirely delegated to userspace. The kernel has no internal dynamic memory at all, and relies on userspace to provide it memory (in the form of untyped capabilities) where it can store objects.

There are two main resources that need management: capability slots and untyped memory objects. There is a delicate interplay between these - in order to make more capability slots available, more CNodes may need to be created, which needs untyped memory and some slots to store capabilities.

Implicit is the assumption that there is also some way to acquire blocks of memory to store bookkeeping information for the allocators. These dependencies are also fundamentally cyclic - if you need a slot, you may need to allocate a new CNode, which may need more memory to store bitmaps or soforth, which in turn may require a new untyped to be mapped, which might need some slots.

Thus, while the strategies for managing these resources can be distinct, the managers themselves often need some amount of interaction. Some strategies are defined by this crate. Others may be created externally by implementing the various *Manager traits. When aggregated together into a single Allocator, the amount of resources that must be maintained for guaranteed availability can be determined. It is the responsibility of the user to determine that at least that many resources are available before making an allocation.

Design

This library tries to not impose policy, but instead provide the mechanisms for easily implementing policy.

CSpace layout

There are two CSpace allocators defined. One is a simple bump allocator which supports free's only if a strict stack discipline is followed (that is, every time you free a slot allocation, you free the most recently allocated slot). The other uses a bitmap to track used entries, and a linear scan to find free slots. The linear scan is relatively efficient, but still scales poorly to larger CNode sizes.

The default CSpace layout is a two-level tree. The top-level is a single CNode with 216 entries and no guard. The second level is parameterizable. By default, you should use a CNode with 216 entries and no guard. Otherwise, you can use anything else, but the only thing that really makes sense is a boxed trait object.

Untyped management

Currently, only a primitive UtBucket is provided, which mirrors the kernel's watermark bookkeeping. It is currently expected that users will do their own slab management, using one UtBucket per object type, and recycling objects as necessary, without retyping. This is somewhat prone to fragmentation, in that a worst-case access pattern could cause mostly-empty buckets except for one object, causing excessive resource consumption. This is not a fundamental limitation of the kernel - an intelligent userspace could make it possible to revoke objects and re-create the object graph, with the objects living in new, more compact untyped memory objects. The current implementation provides no mechanism for this.

Memory management

The entire memory management scheme is ad-hoc right now. Eventually, it should provide a flexible base for implementing things like demand paging and efficient shared memory.

Frame allocation

The frame allocator is responsible for taking untyped memory objects and splitting them into pages, and dishing them out as request. Currently, it uses a simple free list, and is not aware of large pages.

Eventually this will be replaced with something like OpenBSD's pmemrange or Windows' PFN database.

Reservation Trees

The core of virtual address space management is the "reservation tree". At its heart is a binary search tree (in particular, an intrusive red-black tree). This is used to provide best-fit reservation: when a range of N bytes is requested, the node to service the request from is selected such that its available range of bytes M minimizes M-N, minimizing fragmentation. This node is then split. The red-black tree guarantees logN worst-case operations.

Reservations in no way reflect the actual structure of the page table, they just ensure that when requesting virtual memory only free memory is returned.

Adjacent blocks are eagerly coalesced on free and reservations can be shrunk, or expanded if there is available space after it.

Page table management

Page table management is provided as a hierarchical tree, mirroring the hardware layout in spirit. It provides for large pages. It doesn't care about where pages are mapped - it will lazily create paging structures as needed as the address space fills. Currently, there is no facility for coalescing small pages into large pages, although it can split large pages into small pages where necessary due to changing mappings.

In the future, a paging daemon could be responsible for coalescing small pages, freeing unused paging structures, and implementing demand paging maintenance tasks. Right now, a small amount of bookkeeping is tracked that would be useful for such a daemon to determine when cleanup should be performed.

Bootstrapping

The most challenging part of this entire setup is initialization of bookkeeping structures for these allocators. There is a certain amount of magic and handwaving involved - before these structures are set up, there is no reasonable malloc. The current solution is to sidestep the problem and use resources from a fixed-size pool to bring up these structures to satisfaction. The resources allocated during bootstrap are never freed.

Currently, the following resources are required per complete set of allocators (frame, cspace, utspace, vspace):

  • At least 8 slots in the CSpace.
  • 4KiB for a page directory
  • 4KiB for a page table
  • At least one 4KiB page in that page table. This space is required for initial bookkeeping.

Given this and either additional untyped memory or an endpoint for acquiring such, this is an adequate environment to fully initialize a set of allocators.

Given the restriction on untyped memory to be a power of two number of bytes large, rounding this up to 16KiB and expanding the CSpace to 256 slots comes at no cost.

This ultimately is the minimum cost of any process using Robigalia: 16KiB. This also assumes there is no memory reserved for code or the stack, so the practical minimum resource consumption will be somewhat higher.

Status

Functional, still needs polish and testing.