Feature: Partial Tasks and Their Maintenance
This feature will add specific and explicit memory of tasks for individual objects.
These "partial" tasks just resolve dependencies local to the object in question and are not "full" tasks, which are defined as those that can be deployed to a backend and/or involve a Provider object. Such Provider objects are those that provide an environment/architecture pair that can support an object. Examples of such providers would be emulators or container base environments.
The partial tasks, in contrast, have a single level: the object to be built or run, together with a full resolution of its dependencies. Each dependency carries revision information derived from its version tag, if any, and is resolved to an object known to the system at the point when the partial task was generated.
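As a rough illustration of the idea, a partial task can be pictured as a single-level record whose dependencies are pinned to revisions. The field names, the sample objects, and the prefix-based tag matching below are all assumptions made for this sketch. They are not the system's actual manifest format or resolution rules.

```python
# Hypothetical shape of a partial task. Every field name here is an
# assumption for illustration, not the real manifest format.
from typing import Optional


def resolve_dependency(name: str, version_tag: Optional[str], known: dict) -> dict:
    """Pin a dependency to the newest locally known revision matching its tag.

    `known` maps a dependency name to (version, revision) pairs ordered from
    oldest to newest. A tag of None accepts any version. Prefix matching is a
    deliberately simplistic stand-in for real version-tag matching.
    """
    candidates = [
        (version, revision)
        for version, revision in known.get(name, [])
        if version_tag is None or version.startswith(version_tag)
    ]
    if not candidates:
        raise LookupError(f"no known revision of {name} matches {version_tag!r}")
    version, revision = candidates[-1]
    return {"name": name, "version": version, "revision": revision}


# Objects known to this system at the time the partial task is generated.
known_objects = {
    "libfoo": [("1.0", "rev-a1"), ("1.2", "rev-b7"), ("2.0", "rev-c3")],
    "toolbar": [("0.9", "rev-d4")],
}

partial_task = {
    "object": {"name": "my-simulator", "revision": "rev-e5"},
    "section": "run",
    "dependencies": [
        resolve_dependency("libfoo", "1.", known_objects),
        resolve_dependency("toolbar", None, known_objects),
    ],
}
```

Because the revisions are recorded in the task itself, a later release of libfoo would not change what this particular task refers to.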
Progress
- Update the ManifestManager to generate partial tasks.
- Update the ManifestManager to store partial tasks.
- Update the KeyManager to sign partial tasks for discovery.
- Update the KeyManager to verify partial tasks for discovery.
- Update the BuildManager to generate a run task based on the successful build. (mostly done)
- Update the DiscoverManager to pull the partial run task.
- Update the DiscoverManager to pull the partial build task.
- Update the objects status command to expose the build or run task.
- Update the builds status command to export the run task.
- Update the objects pull command to generate a build or run task upon seeing a new stored revision. (mostly done; staged objects are a complication, as usual)
Motivation
There were several reasons why task generation was lacking in the original design. First, task generation and dependency resolution were taxing processes. It was non-trivial to look up and possibly reject several rounds of dependencies and, recursively, their sub-dependencies to decide the eventual full task.
Second, it was possible to pull an object and its built binary and not know which objects were necessary to pull in order to fully resolve the task. There were occasions where the "latest" versions of dependencies that matched the version tag listed on the discovered object were rejected when that object's run task was generated on the new system. This left the object in limbo: it could be run on one system but not on a new one.
Third, and similarly, an object might decide on a completely new approach to its dependency resolution from one run to the next. This happens when a new version of a dependency is pushed. It might break known objects over time, which would be surprising behavior. Generating a task upon creating a new revision of an object (or a new build, as the case may be) locks in a particular way of running the object. This task is then discovered along with the object, which ensures that the exact same set of objects is used by everyone making use of that object or its build.
Advantages
Along with solving some of the issues listed above, keeping track of partial tasks has a few advantages over the more naïve design. In line with the first issue, the partial task process is much more efficient. The system can take a resolved partial task and simply layer it together with each Provider (emulator, container environment, operating system, etc.) to form the full task. Manifest generation only has to augment the partial task to meet any extra requirements, such as updating the command being invoked or adding input objects.
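The layering step could look roughly like the sketch below. It assumes that each Provider is a plain dict and that a provider refers to the task it hosts through a `running` key. The function name `layer_full_task` and the manifest fields are hypothetical.

```python
import copy


def layer_full_task(partial_task, providers, command=None, inputs=None):
    """Wrap a resolved partial task in its providers.

    `providers` is ordered outermost first, for example a container base
    environment followed by an emulator. The partial task is copied, so the
    cached version is never modified.
    """
    inner = copy.deepcopy(partial_task)

    # Augment the partial task only where the full task needs something extra.
    if command is not None:
        inner["command"] = list(command)
    if inputs:
        inner["inputs"] = list(inputs)

    # Wrap from the innermost provider outward.
    for provider in reversed(providers):
        layer = copy.deepcopy(provider)
        layer["running"] = inner
        inner = layer
    return inner


partial = {"name": "app", "dependencies": [{"name": "libfoo", "revision": "rev-b7"}]}
providers = [{"name": "container-base"}, {"name": "emulator"}]
full = layer_full_task(partial, providers, command=["app", "--fast"])
```

No dependency resolution happens in this step, which is where the efficiency gain comes from: the expensive work was done once, when the partial task was generated.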
The code for task generation is also much clearer, since the design now presumes that partial tasks can be generated in isolation and that such partial tasks are always available. Instead of one large function that generates a full task, there is now a small function that generates the full task by calling a new function that generates a partial task (or pulls it from the object store). This should make it far easier to test task generation and to document exactly how it is meant to function.
As part of issue three listed above, the discover methods can now be redesigned to pull the object and then use the existing methods to pull the partial task and the task's (and thus the object's) direct dependencies. There will be no real need, other than for general healthy propagation of objects throughout the system, to recursively pull dependency objects as listed in the normal object metadata. This breaks down if a partial task is not actually made available, so the normal object discovery methods may still be needed as a fallback mechanism.
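The discovery flow with its fallback might be sketched as follows. The function `plan_pull`, the `fetch_partial_task` callable, and the dict shapes are hypothetical stand-ins, not the DiscoverManager API.

```python
def plan_pull(obj, fetch_partial_task):
    """Decide which dependencies to pull for a newly discovered object.

    `fetch_partial_task` is a callable that returns the partial task for an
    object id, or None when no partial task is available.
    """
    task = fetch_partial_task(obj["id"])
    if task is not None:
        # The partial task pins exact revisions, so nothing is re-resolved.
        pinned = [(dep["name"], dep["revision"]) for dep in task["dependencies"]]
        return "partial-task", pinned

    # Fallback: the object metadata only names dependencies and version tags,
    # so revisions are unknown here and must be resolved recursively later.
    unpinned = [(dep["name"], None) for dep in obj.get("dependencies", [])]
    return "metadata-fallback", unpinned


remote_tasks = {"obj-1": {"dependencies": [{"name": "libfoo", "revision": "rev-b7"}]}}
with_task = {"id": "obj-1", "dependencies": [{"name": "libfoo", "version": "1.x"}]}
without_task = {"id": "obj-2", "dependencies": [{"name": "libbar", "version": "2.x"}]}

print(plan_pull(with_task, remote_tasks.get))
print(plan_pull(without_task, remote_tasks.get))
```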
Design
The manifest manager desires to reuse task generation and dependency resolution whenever possible. Partial tasks are a slice of a full task and stored alongside the object they depict. In order to promote trust upon discovery, such partial tasks are signed by an actor. The actor that maintains an object will hopefully provide such a signed object, much like that actor hopefully provides a signed build binary. However, at any time any actor might create their own task to build or run an object when they wish to make different dependency resolution decisions. This is the same as how any actor can decide to build their own version of any object in spite of a signed build already existing.
Changelog
New methods have been added to the ManifestManager to support partial tasks:
ManifestManager#partialTaskFor
Generates or retrieves the cached partial task for the given object.
It is possible that such a task is cached. In that case, it will retrieve and return that existing partial task.
This function does not resolve tags or mount paths. The main taskFor function does this instead, so that more complicated tasks can be built from an existing partial task in the future.
Arguments:
- object (Object): The object for which to generate a task.
- section (str): The type of task to generate (run or build).
- running (dict): The manifest for the subtask this task is invoking.
- inputs (list): A set of wires depicting objects connected to this object.
- buildTask (dict): The task metadata used to previously build the object.
- person (Person): The actor generating the task manifest.
Returns:
- dict: The partial task manifest.
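The retrieve-or-generate behavior described for partialTaskFor can be pictured with the in-memory stand-in below. Only the control flow is meant to mirror the description. The class `SketchManifestManager`, the simplified argument lists, the `id` key on the build task, and the choice to store a freshly generated task immediately are assumptions of this sketch.

```python
class SketchManifestManager:
    """In-memory stand-in for illustration; not the real ManifestManager."""

    def __init__(self):
        self._store = {}
        self.generated = 0  # counts how often a task was actually generated

    @staticmethod
    def _build_id(section, buildTask):
        # A build id only applies to run tasks.
        if section == "run" and buildTask is not None:
            return buildTask["id"]
        return None

    def retrieveTaskFor(self, obj, section, buildTask=None):
        key = (obj["id"], section, self._build_id(section, buildTask))
        return self._store.get(key)

    def generateTaskFor(self, obj, section, buildTask=None, person=None):
        self.generated += 1
        # Dependency resolution would happen here in a real implementation.
        return {"object": obj["id"], "section": section, "dependencies": []}

    def writeTaskFor(self, obj, task, person, section, buildId=None):
        self._store[(obj["id"], section, buildId)] = task

    def partialTaskFor(self, obj, section, buildTask=None, person=None):
        task = self.retrieveTaskFor(obj, section, buildTask)
        if task is None:
            task = self.generateTaskFor(obj, section, buildTask, person)
            build_id = self._build_id(section, buildTask)
            self.writeTaskFor(obj, task, person, section, build_id)
        return task


manager = SketchManifestManager()
obj = {"id": "obj-1"}
first = manager.partialTaskFor(obj, "build")
second = manager.partialTaskFor(obj, "build")  # served from the cache
run_task = manager.partialTaskFor(obj, "run", buildTask={"id": "build-9"})
```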
ManifestManager#generateTaskFor
Generates the partial task for the given object.
Ignores any existing cached version of the partial task.
When buildTask is provided, that task metadata identifies which build should be used to run the object. It does not apply when a build task is being generated (when section is 'build').
Arguments:
- object (Object): The object for which to generate a task.
- section (str): The type of task to generate (run or build).
- buildTask (dict): The task metadata used to previously build the object.
- person (Person): The actor generating the task manifest.
Returns:
- dict: The generated task manifest.
ManifestManager#writeTaskFor
Stores the generated task for the given object and criteria.
This creates and stores the task as a normal object as the actor provided in the person field. The resulting record of the task is stored along with a signature also associated with that actor. This signature will be verified when the object given in object is discovered and pulled on another system along with this new task object.
The buildId helps identify run tasks associated with particular builds of the provided object. It does not apply when writing a task for building said object (when the section is 'build'). In that case, buildId should be None.
Arguments:
- object (Object): The object for which the given task is associated.
- task (dict): The generated task metadata.
- person (Person): The actor generating this task.
- section (str): The type of task (run or build).
- buildId (str): The build task id used for this task.
Returns:
- Object: The newly formed task as an Object.
ManifestManager#retrieveTaskFor
Retrieves the existing cached task for the given object and criteria.
The task records were previously created or discovered and associated with the given object. They are identified by a combination of the given section and buildId fields along with the object id and its owner's public key (identity URI).
This also falls back to the provided person identity in case the given actor has generated their own task for the object.
This assumes that the task records that are queryable have already been validated via their signatures. It makes no attempt to verify signatures upon retrieval.
The value returned is the task metadata and is meant to be exactly aligned with the expected data returned from a call to the generateTaskFor function. If no such task was stored or discovered on the local instance, this function returns None.
Arguments:
- object (Object): The object for which to retrieve a task.
- section (str): The type of task (run or build).
- buildTask (dict): The build task metadata to use.
Returns:
- dict: The task manifest or None if such a task was not found locally.
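The lookup order described for retrieveTaskFor (the owner's identity first, then the requesting actor's) might be sketched like this. The function `retrieve_task` and the flat dict of records are hypothetical. The real records live in the database and are assumed to have been signature-checked before they become queryable.

```python
def retrieve_task(records, object_id, owner_uri, section, build_id=None, person_uri=None):
    """Return stored task metadata, preferring the object owner's task.

    `records` maps (object id, section, build id, identity URI) to task
    metadata. No signature verification happens at this point.
    """
    for identity in (owner_uri, person_uri):
        if identity is None:
            continue
        task = records.get((object_id, section, build_id, identity))
        if task is not None:
            return task
    return None


records = {
    ("obj-1", "run", "build-9", "owner-uri"): {"by": "owner"},
    ("obj-2", "build", None, "alice-uri"): {"by": "alice"},
}

owner_task = retrieve_task(records, "obj-1", "owner-uri", "run", "build-9", "alice-uri")
alice_task = retrieve_task(records, "obj-2", "owner-uri", "build", None, "alice-uri")
missing = retrieve_task(records, "obj-3", "owner-uri", "run", "build-1")
```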
ManifestManager#taskFor
Refactored: This function has been greatly simplified to make use of the above functions.
KeyManager#verifyTask
Verifies that the given task association was signed by the given identity.
Arguments:
- obj (Object): The object associated with the task.
- verifyKeyId (str): The verification key uri.
- signature (bytes): The signature data.
- signature_type (str): The signature algorithm.
- signature_digest (str): The signature digest used.
- uri (str): The identity of the signer.
- task (Object): The task object.
- section (str): The type of task (run or build).
- buildId (str): The id of the build task used to build the object.
- signed (datetime): The time the task was signed.
- published (datetime): The time the task was generated.
Returns:
- bool: True if the signature matches the provided association.
KeyWriteManager#signTask
Returns a signature representing the association of a task and an object.
Arguments:
- obj (Object): The object associated with the task.
- uri (str): The identity to sign as.
- task (Object): The task object.
- section (str): The type of task (build or run).
- buildId (str): The build task id.
- published (datetime): When the task was generated.
- signed (datetime): When the task was signed (default: now)
Returns: (tuple)
- str: The hashing function used.
- str: The signing algorithm used.
- bytes: A digest representing the derived signature.
- datetime: The time that the signature was created or the value of 'signed'.
- str: The verify key id that was used to sign this signature.
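To show what signing an association between a task and an object means, the sketch below serializes the association fields into a canonical byte string and signs that. HMAC-SHA256 is used only as a stand-in so the example runs with the standard library. The actual system refers to verify keys, which implies an asymmetric signature scheme. The payload layout and the function names are assumptions.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def association_payload(object_id, uri, task_id, section, build_id, published, signed):
    """Serialize the task/object association deterministically."""
    fields = {
        "object": object_id,
        "identity": uri,
        "task": task_id,
        "section": section,
        # A build id only applies to run tasks.
        "buildId": build_id if section == "run" else None,
        "published": published.isoformat(),
        "signed": signed.isoformat(),
    }
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_association(secret, payload):
    # Stand-in only: a real implementation would sign with a private key.
    return hmac.new(secret, payload, hashlib.sha256).digest()


def verify_association(secret, payload, signature):
    return hmac.compare_digest(sign_association(secret, payload), signature)


secret = b"demo-secret"
when = datetime(2020, 1, 1, tzinfo=timezone.utc)
payload = association_payload("obj-1", "alice-uri", "task-7", "run", "build-9", when, when)
signature = sign_association(secret, payload)

# Changing any field of the association invalidates the signature.
tampered = association_payload("obj-1", "alice-uri", "task-7", "build", "build-9", when, when)
```

Signing the whole association, rather than the task alone, ties a task to one object, one section, and one build.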
manifests.records.task.TaskRecord
Database: The tasks table and TaskRecord are introduced to associate an object, optionally its build, with the generated partial task. This record is then retrieved and used to pull the task object when it is known. This effectively caches the generation of the task.
This TaskRecord also contains a signature that associates the generated task directly with an actor. Generally, the most trusted actor is the one that also owns the object and build within the association depicted by the generated task. However, a task can be generated and signed by any actor, which some entity may want when it prefers different or more up-to-date choices in dependency resolution.
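One way to picture the tasks table is the SQLite sketch below. The column names and types are guesses based on the fields mentioned in this document, not the actual schema, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tasks (
        object_id    TEXT NOT NULL,
        section      TEXT NOT NULL CHECK (section IN ('build', 'run')),
        build_id     TEXT,            -- NULL for build tasks
        identity_uri TEXT NOT NULL,   -- the actor that signed the task
        task_id      TEXT NOT NULL,   -- id of the stored task object
        signature    BLOB NOT NULL,
        published    TEXT NOT NULL,
        signed       TEXT NOT NULL
    )
    """
)


def add_record(object_id, section, build_id, identity_uri, task_id, signature, published, signed):
    conn.execute(
        "INSERT INTO tasks VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (object_id, section, build_id, identity_uri, task_id, signature, published, signed),
    )


def find_task_id(object_id, section, build_id, identity_uri):
    # "IS" is used so that a NULL build_id compares equal to None.
    row = conn.execute(
        "SELECT task_id FROM tasks "
        "WHERE object_id = ? AND section = ? AND build_id IS ? AND identity_uri = ?",
        (object_id, section, build_id, identity_uri),
    ).fetchone()
    return row[0] if row else None


# Two actors may each sign their own run task for the same object and build.
add_record("obj-1", "run", "build-9", "owner-uri", "task-a", b"sig-a", "2020-01-01", "2020-01-02")
add_record("obj-1", "run", "build-9", "alice-uri", "task-b", b"sig-b", "2020-02-01", "2020-02-01")
add_record("obj-1", "build", None, "owner-uri", "task-c", b"sig-c", "2020-01-01", "2020-01-01")
```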