Many objects require extra files or data that are external to their own data. For instance, a piece of software may require a source code repository, perhaps hosted on GitHub, or a .tar file containing its source code.
Unlike Versioned Objects, these files are often large and do not change in the same manner as other objects. As a result, we maintain them separately as Resource Objects. Such files or data are usually stored as-is and versioned by a hash of their contents, using a function such as SHA-256.
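As an illustration of content-based versioning, the following minimal sketch computes a SHA-256 digest for a file. The function name sha256_of_file and the chunk size are assumptions made for this example and are not taken from Occam's code.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    Hypothetical helper: the file is read in chunks so that large
    resources do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical contents produce the same digest, so the digest can serve as the version identifier regardless of the file's name or location.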
Occam supports a variety of archive formats as resources and can perform actions upon them. For instance, Occam understands .zip and .tar files and can inspect their contents. The handler for these files can even retrieve parts of files within them. This is useful when exploring preserved archives, when analyzing the files they contain, or when only a small file from a large archive is of interest.
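The kind of archive access described above can be sketched with Python's standard tarfile and zipfile modules. This is not Occam's handler; list_members and read_member are hypothetical names chosen for the example.

```python
import tarfile
import zipfile


def list_members(archive_path):
    """Return the member names of a .tar or .zip archive."""
    if tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as archive:
            return archive.getnames()
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


def read_member(archive_path, member, start=0, length=-1):
    """Return bytes from one member without extracting the archive.

    `start` bytes are skipped first; `length=-1` reads to the end.
    """
    if tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as archive:
            handle = archive.extractfile(member)
            if handle is None:
                # extractfile() returns None for directories and links.
                raise ValueError(f"{member} is not a regular file")
            with handle:
                handle.read(start)  # skip the leading bytes
                return handle.read(length)
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as handle:
            handle.read(start)  # skip the leading bytes
            return handle.read(length)
```

Skipping by reading and discarding is used instead of seeking so the same code path works for compressed members.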
Each type of resource is implemented as a plugin behind a common interface. The plugins are located in the resources/plugins directory. Currently, several are supported:
file (the fallback when all else fails)
tar (archive file)
zip (archive file)
git (versioned repository)
mercurial (versioned repository)
docker (docker containers)
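One way to picture a common interface with a fallback is the sketch below. The class names, the retrieve method, and the registry are all hypothetical and only illustrate the idea; Occam's actual plugin interface may look quite different.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a resource type name to its plugin class.
REGISTRY = {}


def register(plugin_class):
    """Class decorator that records a plugin under its `name`."""
    REGISTRY[plugin_class.name] = plugin_class
    return plugin_class


class ResourcePlugin(ABC):
    """Assumed common interface shared by every resource type."""

    name = None

    @abstractmethod
    def retrieve(self, path):
        """Return the raw bytes of the resource stored at `path`."""


@register
class FilePlugin(ResourcePlugin):
    """Most basic plugin: treats the resource as an opaque file."""

    name = "file"

    def retrieve(self, path):
        with open(path, "rb") as handle:
            return handle.read()


def plugin_for(resource_type):
    """Return a plugin instance, falling back to `file` when unknown."""
    plugin_class = REGISTRY.get(resource_type, REGISTRY["file"])
    return plugin_class()
```

With this arrangement, a tar or git plugin would register itself the same way, and callers never need to know which concrete class handles a given type.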
Each provides a set of functions to retrieve its data or, if it acts as a collection of files, the data within the archive. All files are versioned by hashing their contents, so a particular version can be requested. Sometimes the versioning is inherent to the format and stored in a format-specific way. git resources, for instance, have their own versioning mechanism, so that hash is provided by git itself.
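The distinction between content hashing and format-provided versioning might be sketched as follows. The function resource_version is hypothetical, and using git rev-parse HEAD to obtain the revision is an assumption about how a git hash could be read, not a description of how Occam does it.

```python
import hashlib
import subprocess


def resource_version(kind, path):
    """Return a version identifier for the resource at `path`.

    Hypothetical helper. For `git`, the repository already names each
    commit by a hash, so that hash is reused. Every other kind is
    versioned by hashing the file contents.
    """
    if kind == "git":
        result = subprocess.run(
            ["git", "-C", path, "rev-parse", "HEAD"],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout.strip()

    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The git branch requires the git command-line tool to be installed and `path` to point at a repository with at least one commit.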
The resource plugin can create any data it wants to represent the resource. The file plugin uses the most basic representation: the file data stored in a directory named after its SHA hash. However, this can be extended based on the format. The tar plugin, for instance, creates a cache of its directory listing when reading a directory within its archive file. This is because reading the file list of a tar file can be rather slow, yet listing the contents is very useful. The plugin can lay out this cache within its own space however it wishes; it essentially owns the storage directory it is given.
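The storage layout and listing cache described above can be approximated with the following sketch. The file names data and listing.json, the choice of SHA-256, and both function names are assumptions for illustration only.

```python
import hashlib
import json
import os
import shutil
import tarfile


def store_file(source_path, storage_root):
    """Copy a file into a directory named after its content hash."""
    with open(source_path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    target_dir = os.path.join(storage_root, digest)
    os.makedirs(target_dir, exist_ok=True)
    # "data" is an assumed file name inside the per-hash directory.
    shutil.copy(source_path, os.path.join(target_dir, "data"))
    return digest


def tar_listing(storage_dir):
    """Return the archive's member names, caching them on first use."""
    cache_path = os.path.join(storage_dir, "listing.json")
    if os.path.exists(cache_path):
        # Cache hit: avoid scanning the whole tar file again.
        with open(cache_path) as handle:
            return json.load(handle)
    with tarfile.open(os.path.join(storage_dir, "data")) as archive:
        names = archive.getnames()
    with open(cache_path, "w") as handle:
        json.dump(names, handle)
    return names
```

Because the directory is keyed by the content hash, the cached listing never goes stale: different archive contents would land in a different directory.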