@@ -3,72 +3,137 @@ Contributing
Some tips and guidelines for developers hacking on BuildStream
.. _contributing_filing_issues:
Filing issues
-------------
If you are experiencing an issue with BuildStream, or would like to submit a patch
to fix an issue, then you should first search the list of `open issues <https://gitlab.com/BuildStream/buildstream/issues>`_
to see if the issue is already filed, and `open an issue <https://gitlab.com/BuildStream/buildstream/issues/new>`_
if no issue already exists.
For policies on how to submit an issue and how to use our project labels,
we recommend that you read the `policies guide
<https://gitlab.com/BuildStream/nosoftware/alignment/blob/master/BuildStream_policies.md>`_.
.. _contributing_fixing_bugs:
Fixing bugs
-----------
Before fixing a bug, it is preferred that an :ref:`issue be filed <contributing_filing_issues>`
first in order to better document the defect; however, this need not be followed to the
letter for minor fixes.
Patches which fix bugs should always come with a regression test.
.. _contributing_adding_features:
Adding new features
-------------------
Feature additions should be proposed on the `mailing list
<https://mail.gnome.org/mailman/listinfo/buildstream-list>`_
before being considered for inclusion. To save time and avoid any frustration,
we strongly recommend proposing your new feature in advance of commencing work.
Once consensus has been reached on the mailing list, then the proposing
party should :ref:`file an issue <contributing_filing_issues>` to track the
work. Please use the *bst_task* template for issues which represent
feature additions.
New features must be well documented and tested in our test suite.
It is expected that the individual submitting the work take ownership
of their feature within BuildStream for a reasonable timeframe of at least
one release cycle after their work has landed on the master branch. This is
to say that the submitter is expected to address and fix any side effects,
bugs or regressions which may have fallen through the cracks in the review
process, giving us a reasonable timeframe for identifying these.
.. _contributing_submitting_patches:
Submitting patches
------------------
Branches must be submitted as merge requests in gitlab. If the branch
fixes an issue or is related to any issues, these issues must be mentioned
in the merge request or preferably the commit messages themselves.
Ask for developer access
~~~~~~~~~~~~~~~~~~~~~~~~
If you want to submit a patch, do ask for developer permissions, either
by asking us directly on our public IRC channel (irc://irc.gnome.org/#buildstream)
or by visiting our `project page on GitLab <https://gitlab.com/BuildStream/buildstream>`_
and using the GitLab UI to ask for permission.
This will make your contribution experience smoother, as you will not
need to set up any complicated CI settings, and rebasing your branch
against the upstream master branch will be less painful.
Branch names
~~~~~~~~~~~~
Branch names for merge requests should be prefixed with the submitter's
name or nickname, followed by a forward slash, and then a descriptive
name. e.g.::
username/fix-that-bug
This allows us to more easily identify which branch does what and
belongs to whom, especially so that we can effectively clean up stale
branches in the upstream repository over time.
Merge requests
~~~~~~~~~~~~~~
Once you have created a local branch, you can push it to the upstream
BuildStream repository using the command line::
git push origin username/fix-that-bug:username/fix-that-bug
GitLab will respond to this with a message and a link to allow you to create
a new merge request. You can also `create a merge request for an existing branch
<https://gitlab.com/BuildStream/buildstream/merge_requests/new>`_.
You may open merge requests for the branches you create before you are ready
to have them reviewed and considered for inclusion if you like. Until your merge
request is ready for review, the merge request title must be prefixed with the
``WIP:`` identifier.
Organized commits
~~~~~~~~~~~~~~~~~
Submitted branches must not contain a history of the work done in the
feature branch. For example, if you had to change your approach, or
have a later commit which fixes something in a previous commit on your
branch, we do not want to include the history of how you came up with
your patch in the upstream master branch.
Please use git's interactive rebase feature in order to compose a clean
patch series suitable for submission upstream.
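For example, assuming your branch was created from the upstream ``master``
branch, a typical cleanup before submission might look like this (a sketch;
adapt the remote and branch names to your own setup)::

  git fetch origin
  git rebase -i origin/master

In the interactive editor you can then ``reword``, ``squash`` or ``fixup``
individual commits so that each commit in the resulting series stands on
its own.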
Every commit in the series should pass the test suite; this is very important
for tracking down regressions and performing git bisections in the future.
We prefer that documentation changes be submitted in separate commits from
the code changes which they document, and newly added test cases are also
preferred in separate commits.
If a commit in your branch modifies behavior such that a test must also
be changed to match the new behavior, then the tests should be updated
with the same commit, so that every commit passes its own tests.
Commit messages
~~~~~~~~~~~~~~~
Commit messages must be formatted with a brief summary line, followed by
an empty line and then a free form detailed description of the change.
The summary line must start with what changed, followed by a colon and
a very brief description of the change.
If the commit fixes an issue, or is related to an issue, then the issue
number must be referenced in the commit message.
**Example**::
element.py: Added the frobnicator so that foos are properly frobbed.
@@ -77,33 +142,464 @@ a very brief description of the change.
the element. Elements that are not properly frobnicated raise
an error to inform the user of invalid frobnication rules.
Fixes #123
In the case that you have a commit which necessarily modifies multiple
components, then the summary line should still mention generally what
changed (if possible), followed by a colon and a brief summary.
In this case the free form detailed description of the change should
contain a bullet list describing what was changed in each component
separately.
**Example**::
artifact cache: Fixed automatic expiry in the local cache
o _artifactcache/artifactcache.py: Updated the API contract
of ArtifactCache.remove() so that something detailed is
explained here.
o _artifactcache/cascache.py: Adhere to the new API contract
dictated by the abstract ArtifactCache class.
o tests/artifactcache/expiry.py: Modified test expectations to
match the new behavior.
This is a part of #123
Coding guidelines
-----------------
This section discusses coding style and other guidelines for hacking
on BuildStream. This is important to read through for writing any non-trivial
patches and especially outlines what people should watch out for when
reviewing patches.
Much of the rationale behind what is laid out in this section has good
traceability of lines of code with *git blame*, overall sensible
modular structure, consistency in how we write code, and long term maintenance
in mind.
Approximate PEP-8 Style
~~~~~~~~~~~~~~~~~~~~~~~
Python coding style for BuildStream is approximately `pep8 <https://www.python.org/dev/peps/pep-0008/>`_.
We have a couple of minor exceptions to this standard; for instance, we don't
want to compromise code readability by being overly restrictive on line length.
The pep8 linter will run automatically when :ref:`running the test suite <contributing_testing>`.
Line lengths
''''''''''''
Regarding laxness on the line length in our linter settings, it should be clarified
that the line length limit is a hard limit which causes the linter to bail out
and reject commits which exceed it - not an invitation to write exceedingly
long lines of code, comments, or API documenting docstrings.
Code, comments and docstrings should strive to remain written for approximately 80
or 90 character lines, where exceptions can be made when code would be less readable
when exceeding 80 or 90 characters (often this happens in conditional statements
when raising an exception, for example). Or, when comments contain a long link that
causes the given line to exceed 80 or 90 characters, we don't want this to cause
the linter to refuse the commit.
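To illustrate, a line such as the following is generally more readable left
slightly over the limit than wrapped awkwardly (the error class usage and
message here are purely illustrative)::

  raise ElementError("{}: Failed to frobnicate element {}".format(self, element.name))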
.. _contributing_documenting_symbols:
Documenting symbols
~~~~~~~~~~~~~~~~~~~
In BuildStream, we maintain what we call a *"Public API Surface"* that
is guaranteed to be stable and unchanging across stable releases. The
symbols which fall into this special class are documented using Python's
standard *docstrings*, while all other internals of BuildStream are documented
with comments above the related symbol.
When documenting the public API surface which is rendered in the reference
manual, we always mention the major version in which the API was introduced,
as shown in the examples below. If a public API exists without the *Since*
annotation, this is taken to mean that it was available since the first stable
release 1.0.
Here are some examples to get the hang of the format of API documenting
comments and docstrings.
**Public API Surface method**::
def frobnicate(self, source, *, frobilicious=False):
"""Frobnicates this element with the specified source
Args:
source (Source): The Source to frobnicate with
frobilicious (bool): Optionally specify that frobnication should be
performed fribiliciously
Returns:
(Element): The frobnicated version of this Element.
*Since: 1.2*
"""
...
**Internal method**::
# frobnicate():
#
# Frobnicates this element with the specified source
#
# Args:
# source (Source): The Source to frobnicate with
# frobilicious (bool): Optionally specify that frobnication should be
# performed fribiliciously
#
# Returns:
# (Element): The frobnicated version of this Element.
#
def frobnicate(self, source, *, frobilicious=False):
...
**Public API Surface instance variable**::
def __init__(self, context, element):
self.name = self._compute_name(context, element)
"""The name of this foo
*Since: 1.2*
"""
.. note::
Python does not support docstrings on instance variables, but sphinx does
pick them up and includes them in the generated documentation.
**Internal instance variable**::
def __init__(self, context, element):
self.name = self._compute_name(context, element) # The name of this foo
**Internal instance variable (long)**::
def __init__(self, context, element):
# This instance variable required a longer explanation, so
# it is on a line above the instance variable declaration.
self.name = self._compute_name(context, element)
**Public API Surface class**::
class Foo(Bar):
"""The main Foo object in the data model
Explanation about Foo. Note that we always document
the constructor arguments here, and not beside the __init__
method.
Args:
context (Context): The invocation Context
count (int): The number to count
*Since: 1.2*
"""
...
**Internal class**::
# Foo()
#
# The main Foo object in the data model
#
# Args:
# context (Context): The invocation Context
# count (int): The number to count
#
class Foo(Bar):
...
.. _contributing_class_order:
Class structure and ordering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When creating or modifying an object class in BuildStream, it is
important to keep in mind the order in which symbols should appear
and keep this consistent.
Here is an example to illustrate the expected ordering of symbols
on a Python class in BuildStream::
class Foo(Bar):
# Public class-wide variables come first, if any.
# Private class-wide variables, if any
# Now we have the dunder/magic methods, always starting
# with the __init__() method.
def __init__(self, name):
super().__init__()
# NOTE: In the instance initializer we declare any instance variables,
# always declare the public instance variables (if any) before
# the private ones.
#
# It is preferred to avoid any public instance variables, and
# always expose an accessor method for it instead.
#
# Public instance variables
#
self.name = name # The name of this foo
#
# Private instance variables
#
self._count = 0 # The count of this foo
################################################
# Abstract Methods #
################################################
# NOTE: Abstract methods in BuildStream are allowed to have
# default implementations.
#
# Subclasses must NEVER override any method which was
# not advertized as an abstract method by the parent class.
# frob()
#
# Implementors should implement this to frob this foo
# count times if possible.
#
# Args:
# count (int): The number of times to frob this foo
#
# Returns:
# (int): The number of times this foo was frobbed.
#
# Raises:
# (FooError): Implementors are expected to raise this error
#
def frob(self, count):
#
# An abstract method in BuildStream is allowed to have
# a default implementation.
#
self._count = self._do_frobbing(count)
return self._count
################################################
# Implementation of abstract methods #
################################################
# NOTE: Implementations of abstract methods defined by
# the parent class should NEVER document the API
# here redundantly.
def frobbish(self):
#
# Implementation of the "frobbish" abstract method
# defined by the parent Bar class.
#
return True
################################################
# Public Methods #
################################################
# NOTE: Public methods here are the ones which are expected
# to be called from outside of this class.
#
# These, along with any abstract methods, usually
# constitute the API surface of this class.
# frobnicate()
#
# Perform the frobnication process on this Foo
#
# Raises:
# (FrobError): In the case that a frobnication error was
# encountered
#
def frobnicate(self):
frobnicator.frobnicate(self)
# set_count()
#
# Sets the count of this foo
#
# Args:
# count (int): The new count to set
#
def set_count(self, count):
self._count = count
# get_count()
#
# Accessor for the count value of this foo.
#
# Returns:
# (int): The count of this foo
#
def get_count(self):
return self._count
################################################
# Private Methods #
################################################
# NOTE: Private methods are the ones which are internal
# implementation details of this class.
#
# Even though these are private implementation
# details, they still MUST have API documenting
# comments on them.
# _do_frobbing()
#
# Does the actual frobbing
#
# Args:
# count (int): The number of times to frob this foo
#
# Returns:
# (int): The number of times this foo was frobbed.
#
def _do_frobbing(self, count):
return count
.. _contributing_public_and_private:
Public and private symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~
BuildStream mostly follows the PEP-8 for defining *public* and *private* symbols
for any given class, with some deviations. Please read the `section on inheritance
<https://www.python.org/dev/peps/pep-0008/#designing-for-inheritance>`_ for
reference on how the PEP-8 defines public and non-public.
* A *public* symbol is any symbol which you expect to be used by clients
of your class or module within BuildStream.
Public symbols are written without any leading underscores.
* A *private* symbol is any symbol which is entirely internal to your class
or module within BuildStream. These symbols cannot ever be accessed by
external clients or modules.
A private symbol must be denoted by a leading underscore.
* When a class can have subclasses, then private symbols should be denoted
by two leading underscores. For example, the ``Sandbox`` or ``Platform``
classes which have various implementations, or the ``Element`` and ``Source``
classes which plugins derive from.
The double leading underscore naming convention invokes Python's name
mangling algorithm which helps prevent namespace collisions in the case
that subclasses might have a private symbol with the same name, as
illustrated in the sketch below.
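Here is a minimal sketch of that mangling behavior (the class and attribute
names are purely illustrative)::

  class Base():
      def __init__(self):
          # Stored on the instance as '_Base__config' due to name mangling
          self.__config = {}

  class Derived(Base):
      def __init__(self):
          super().__init__()
          # A separate attribute, stored as '_Derived__config', so it
          # cannot collide with the parent's private state
          self.__config = {'depth': 1}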
In BuildStream, we have what we call a *"Public API Surface"*, as previously
mentioned in :ref:`contributing_documenting_symbols`. In the :ref:`next section
<contributing_public_api_surface>` we will discuss the *"Public API Surface"* and
outline the exceptions to the rules discussed here.
.. _contributing_public_api_surface:
Public API surface
~~~~~~~~~~~~~~~~~~
BuildStream exposes what we call a *"Public API Surface"* which is stable
and unchanging. This is for the sake of stability of the interfaces which
plugins use, so it can also be referred to as the *"Plugin facing API"*.
Any symbols which are a part of the *"Public API Surface*" are never allowed
to change once they have landed in a stable release version of BuildStream. As
such, we aim to keep the *"Public API Surface"* as small as possible at all
times, and never expose any internal details to plugins inadvertently.
One problem which arises from this is that we end up having symbols
which are *public* according to the :ref:`rules discussed in the previous section
<contributing_public_and_private>`, but must be hidden away from the
*"Public API Surface"*. For example, BuildStream internal classes need
to invoke methods on the ``Element`` and ``Source`` classes, whereas these
methods need to be hidden from the *"Public API Surface"*.
This is where BuildStream deviates from the PEP-8 standard for public
and private symbol naming.
In order to disambiguate between:
* Symbols which are publicly accessible details of the ``Element`` class, can
be accessed by BuildStream internals, but must remain hidden from the
*"Public API Surface"*.
* Symbols which are private to the ``Element`` class, and cannot be accessed
from outside of the ``Element`` class at all.
We denote the former category of symbols with only a single underscore, and the latter
category of symbols with a double underscore. We often refer to this distinction
as *"API Private"* (the former category) and *"Local Private"* (the latter category).
Classes which are a part of the *"Public API Surface"* and require this disambiguation
were not discussed in :ref:`the class ordering section <contributing_class_order>`, for
these classes, the *"API Private"* symbols always come **before** the *"Local Private"*
symbols in the class declaration.
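As a rough sketch of how this plays out on a *"Public API Surface"* class,
consider the following (the method names are illustrative only)::

  class Element():

      # Public API Surface: documented with docstrings, usable by plugins
      def frobnicate(self):
          ...

      # API Private: hidden from plugins, but called from other parts
      # of the BuildStream core
      def _schedule_assemble(self):
          ...

      # Local Private: only ever called from within Element itself
      def __update_state(self):
          ...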
Modules which are not a part of the *"Public API Surface"* have their Python files
prefixed with a single underscore, and are not imported in BuildStream's master
``__init__.py`` which is used by plugins.
.. note::
The ``utils.py`` module is public and exposes a handful of utility functions,
however many of the functions it provides are *"API Private"*.
In this case, the *"API Private"* functions are prefixed with a single underscore.
Any objects which are a part of the *"Public API Surface"* should be exposed via the
toplevel ``__init__.py`` of the ``buildstream`` package.
File naming convention
~~~~~~~~~~~~~~~~~~~~~~
With the exception of a few helper objects and data structures, we structure
the code in BuildStream such that every filename is named after the object it
implements. E.g. The ``Project`` object is implemented in ``_project.py``, the
``Context`` object in ``_context.py``, the base ``Element`` class in ``element.py``,
etc.
As mentioned in the previous section, objects which are not a part of the
:ref:`public, plugin facing API surface <contributing_public_api_surface>` have their
filenames prefixed with a leading underscore (like ``_context.py`` and ``_project.py``
in the examples above).
When an object name has multiple words in it, e.g. ``ArtifactCache``, then the
resulting file is named all in lower case without any underscore to separate
words. In the case of ``ArtifactCache``, the filename implementing this object
is found at ``_artifactcache/artifactcache.py``.
Imports
~~~~~~~
Module imports inside BuildStream are done with relative ``.`` notation:
**Good**::
from ._context import Context
**Bad**::
from buildstream._context import Context
The exception to the above rule is when authoring plugins;
plugins do not reside in the same namespace so they must
@@ -122,128 +618,586 @@ This makes things clear when reading code that said functions
are not defined in the same file but come from utils.py for example.
.. _contributing_instance_variables:
Instance variables
~~~~~~~~~~~~~~~~~~
It is preferred that all instance state variables be declared as :ref:`private symbols
<contributing_public_and_private>`, however in some cases, especially when the state
is immutable for the object's life time (like an ``Element`` name for example), it
is acceptable to save some typing by using a publicly accessible instance variable.
It is never acceptable to modify the value of an instance variable from outside
of the declaring class, even if the variable is *public*. In other words, the class
which exposes an instance variable is the only one in control of the value of this
variable.
* If an instance variable is public and must be modified; then it must be
modified using a :ref:`mutator <contributing_accessor_mutator>`.
* Ideally for better encapsulation, all object state is declared as
:ref:`private instance variables <contributing_public_and_private>` and can
only be accessed by external classes via public :ref:`accessors and mutators
<contributing_accessor_mutator>`.
.. note::
In some cases, we may use small data structures declared as objects for the sake
of better readability, where the object class itself has no real supporting code.
In these exceptions, it can be acceptable to modify the instance variables
of these objects directly, unless they are otherwise documented to be immutable.
.. _contributing_accessor_mutator:
Accessors and mutators
~~~~~~~~~~~~~~~~~~~~~~
An accessor and a mutator are methods defined on the object class to access (get)
or mutate (set) a value owned by the declaring class, respectively.
An accessor might derive the returned value from one or more of its components,
and a mutator might have side effects, or delegate the mutation to a component.
Accessors and mutators are always :ref:`public <contributing_public_and_private>`
(even if they might have a single leading underscore and are considered
:ref:`API Private <contributing_public_api_surface>`), as their purpose is to
enforce encapsulation with regards to any accesses to the state which is owned
by the declaring class.
Accessors and mutators are functions prefixed with ``get_`` and ``set_``
respectively, e.g.::
class Foo():
def __init__(self):
# Declare some internal state
self._count = 0
# get_count()
#
# Gets the count of this Foo.
#
# Returns:
# (int): The current count of this Foo
#
def get_count(self):
return self._count
# set_count()
#
# Sets the count of this Foo.
#
# Args:
# count (int): The new count for this Foo
#
def set_count(self, count):
self._count = count
.. attention::
We are aware that Python offers a facility for accessors and
mutators using the ``@property`` decorator instead. Do not use
the ``@property`` decorator.
The decision to use explicitly defined functions instead of the
``@property`` decorator is rather arbitrary, there is not much
technical merit to preferring one technique over the other.
However as :ref:`discussed below <contributing_always_consistent>`,
it is of the utmost importance that we do not mix both techniques
in the same codebase.
.. _contributing_abstract_methods:
Abstract methods
~~~~~~~~~~~~~~~~
In BuildStream, an *"Abstract Method"* is a bit of a misnomer and does
not match up to how Python defines abstract methods; we need to seek out
new nomenclature to refer to these methods.
In Python, an *"Abstract Method"* is a method which **must** be
implemented by a subclass, whereas all methods in Python can be
overridden.
In BuildStream, we use the term *"Abstract Method"*, to refer to
a method which **can** be overridden by a subclass, whereas it
is **illegal** to override any other method.
* Abstract methods are allowed to have default implementations.
* Subclasses are not allowed to redefine the calling signature
of an abstract method, or redefine the API contract in any way.
* Subclasses are not allowed to override any other methods.
The key here is that in BuildStream, we consider it unacceptable
that a subclass overrides a method of its parent class unless
the said parent class has explicitly given permission to subclasses
to do so, and outlined the API contract for this purpose. No surprises
are allowed.
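For example, a subclass of the ``Foo`` class from the :ref:`class ordering
section <contributing_class_order>` might implement its ``frob()`` abstract
method like this (a sketch only), and override nothing else::

  class BetterFoo(Foo):

      # Implementation of the frob() abstract method advertised by Foo.
      #
      # The calling signature and API contract must remain exactly as
      # documented by the parent class.
      #
      def frob(self, count):
          return super().frob(count * 2)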
Error handling
~~~~~~~~~~~~~~
In BuildStream, all non-recoverable errors are expressed via
subclasses of the ``BstError`` exception.
This exception is handled deep in the core in a few places, and
it is rarely necessary to handle a ``BstError``.
Raising exceptions
''''''''''''''''''
When writing code in the BuildStream core, ensure that all system
calls and third party library calls are wrapped in a ``try:`` block,
and raise a descriptive ``BstError`` of the appropriate class explaining
what exactly failed.
Ensure that the original system call error is formatted into your new
exception, and that you use the Python ``from`` semantic to retain the
original call trace, example::
try:
os.utime(self._refpath(ref))
except FileNotFoundError as e:
raise ArtifactError("Attempt to access unavailable artifact: {}".format(e)) from e
Enhancing exceptions
''''''''''''''''''''
Sometimes the ``BstError`` originates from a lower level component,
and the code segment which raised the exception did not have enough context
to create a complete, informative summary of the error for the user.
In these cases it is necessary to handle the error and raise a new
one, e.g.::
try:
extracted_artifact = self._artifacts.extract(self, cache_key)
except ArtifactError as e:
raise ElementError("Failed to extract {} while checking out {}: {}"
.format(cache_key, self.name, e)) from e
Programming errors
''''''''''''''''''
Sometimes you are writing code and have detected an unexpected condition,
or a broken invariant which the code cannot be expected to handle
gracefully.
In these cases, do **not** raise any of the ``BstError`` class exceptions.
Instead, use the ``assert`` statement, e.g.::
assert utils._is_main_process(), \
"Attempted to save workspace configuration from child process"
This will result in a ``BUG`` message with the stack trace included being
logged and reported in the frontend.
BstError parameters
'''''''''''''''''''
When raising ``BstError`` class exceptions, there are some common properties
which can be useful to know about (an example follows this list):
* **message:** The brief human readable error, will be formatted on one line in the frontend.
* **detail:** An optional detailed human readable message to accompany the **message** summary
of the error. This is often used to recommend the user some course of action, or to provide
additional context about the error.
* **temporary:** Some errors are allowed to be *temporary*, this attribute is only
observed from child processes which fail in a temporary way. This distinction
is used to determine whether the task should be *retried* or not. An error is usually
only a *temporary* error if the cause of the error was a network timeout.
* **reason:** A machine readable identifier for the error. This is used for the purpose
of regression testing, such that we check that BuildStream has errored out for the
expected reason in a given failure mode.
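For instance, a ``Source`` implementation hitting a network timeout might
raise its error roughly like this (the message, reason string and call site
are hypothetical)::

  raise SourceError("{}: Timed out fetching from {}".format(self, url),
                    detail="Check your network connection and try again",
                    reason="fetch-timeout",
                    temporary=True)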
Documenting Exceptions
''''''''''''''''''''''
We have already seen :ref:`some examples <contributing_class_order>` of how
exceptions are documented in API documenting comments, but this is worth some
additional disambiguation.
* Only document the exceptions which are raised directly by the function in question.
It is otherwise nearly impossible to keep track of what exceptions *might* be raised
indirectly by calling the given function.
* For a regular public or private method, your audience is a caller of the function;
document the exception in terms of what exception might be raised as a result of
calling this method.
* For an :ref:`abstract method <contributing_abstract_methods>`, your audience is the
implementor of the method in a subclass; document the exception in terms of what
exception is prescribed for the implementing class to raise.
.. _contributing_always_consistent:
Always be consistent
~~~~~~~~~~~~~~~~~~~~
There are various ways to define functions and classes in Python,
which has evolved with various features over time.
In BuildStream, we may not have leveraged all of the nice features
we could have; that is okay, and where it does not break API, we
can consider changing it.
Even if you know there is a *better* way to do a given thing in
Python when compared to the way we do it in BuildStream, *do not do it*.
Consistency of how we do things in the codebase is more important
than the actual way in which things are done, always.
Instead, if you like a certain Python feature and think the BuildStream
codebase should use it, then propose your change on the `mailing list
<https://mail.gnome.org/mailman/listinfo/buildstream-list>`_. Chances
are that we will reach agreement to use your preferred approach, and
in that case, it will be important to apply the change uniformly
across the entire codebase, such that we continue to have a consistent
codebase.
Avoid tail calling
~~~~~~~~~~~~~~~~~~
With the exception of tail calling with simple functions from
the standard Python library, such as splitting and joining lines
of text and encoding/decoding text, always avoid tail calling.
**Good**::
# Variables that we will need declared up top
context = self._get_context()
workspaces = context.get_workspaces()
...
# Saving the workspace configuration
workspaces.save_config()
**Bad**::
# Saving the workspace configuration
self._get_context().get_workspaces().save_config()
**Acceptable**::
# Decode the raw text loaded from a log file for display,
# join them into a single utf-8 string and strip away any
# trailing whitespace.
return '\n'.join([line.decode('utf-8') for line in lines]).rstrip()
When you need to obtain a delegate object via an accessor function,
either do it at the beginning of the function, or at the beginning
of a code block within the function that will use that object.
There are several reasons for this convention:
* When observing a stack trace, it is always faster and easier to
determine what went wrong when all statements are on separate lines.
* We always want individual lines to trace back to their origin as
much as possible for the purpose of tracing the history of code
with *git blame*.
One day, you might need the ``Context`` or ``Workspaces`` object
in the same function for another reason, at which point it will
be unacceptable to leave the existing line as written, because it
will introduce a redundant accessor to the same object, so the
line written as::
self._get_context().get_workspaces().save_config()
Will have to change at that point, meaning we lose the valuable
information of which commit originally introduced this call
when running *git blame*.
* For similar reasons, we prefer delegate objects be accessed near
the beginning of a function or code block so that there is less
chance that this statement will have to move in the future, if
the same function or code block needs the delegate object for any
other reason.
Aside from this, code is generally more legible and uniform when
variables are declared at the beginning of function blocks.
Vertical stacking of modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the sake of overall comprehensibility of the BuildStream
architecture, it is important that we retain the vertical stacking
order of the dependencies and knowledge of modules as much as
possible, and avoid any cyclic relationships between modules.
For instance, the ``Source`` objects are owned by ``Element``
objects in the BuildStream data model, and as such the ``Element``
will delegate some activities to the ``Source`` objects in its
possession. The ``Source`` objects should however never call functions
on the ``Element`` object, nor should the ``Source`` object itself
have any understanding of what an ``Element`` is.
If you are implementing a low level utility layer, for example
as a part of the ``YAML`` loading code layers, it can be tempting
to derive context from the higher levels of the codebase which use
these low level utilities, instead of defining properly stand alone
APIs for these utilities to work: Never do this.
Unfortunately, unlike other languages where include files play
a big part in ensuring that it is difficult to make a mess, Python
allows you to just call methods on arbitrary objects passed through
a function call without having to import the module which defines
those methods - this quickly leads to cyclic dependencies between modules
if the developer does not take special care to ensure this does not
happen.
Minimize arguments in methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When creating an object, or adding a new API method to an existing
object, always strive to keep as much context as possible on the
object itself rather than expecting callers of the methods to provide
everything the method needs every time.
If the value or object that is needed in a function call is a constant
for the lifetime of the object which exposes the given method, then
that value or object should be passed in the constructor instead of
via a method call.
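For example (a sketch with hypothetical names), rather than threading the
``context`` through every call::

  class ArtifactVerifier():
      def verify(self, artifact, context):
          ...

prefer passing it once at construction time, since it is constant for the
lifetime of the verifier::

  class ArtifactVerifier():
      def __init__(self, context):
          self._context = context

      def verify(self, artifact):
          ...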
Minimize API surfaces
~~~~~~~~~~~~~~~~~~~~~
When creating an object, or adding new functionality in any way,
try to keep the number of :ref:`public, outward facing <contributing_public_and_private>`
symbols to a minimum, this is important for both
:ref:`internal and public, plugin facing API surfaces <contributing_public_api_surface>`.
When anyone visits a file, there are two levels of comprehension:
* What do I need to know in order to *use* this object.
* What do I need to know in order to *modify* this object.
For the former, we want the reader to understand with as little effort
as possible, what the public API contract is for a given object and consequently,
how it is expected to be used. This is also why we
:ref:`order the symbols of a class <contributing_class_order>` in such a way
as to keep all outward facing public API surfaces at the top of the file, so that the
reader never needs to dig deep into the bottom of the file to find something they
might need to use.
For the latter, when it comes to having to modify the file or add functionality,
you want to retain as much freedom as possible to modify internals, while
being sure that nothing external will be affected by internal modifications.
Less client facing API means that you have less surrounding code to modify
when your API changes. Further, ensuring that there is minimal outward facing
API for any module minimizes the complexity for the developer working on
that module, by limiting the considerations needed regarding external side
effects of their modifications to the module.
When modifying a file, one should not have to understand or think too
much about external side effects, when the API surface of the file is
well documented and minimal.
When adding new API to a given object for a new purpose, consider whether
the new API is in any way redundant with other API (should this value now
go into the constructor, since we use it more than once? could this
value be passed along with another function, and the other function renamed,
to better suit the new purposes of this module/object?) and repurpose
the outward facing API of an object as a whole every time.
Avoid transient state on instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
At times, it can be tempting to store transient state that is
the result of one operation on an instance, only to be retrieved
later via an accessor function elsewhere.
As a basic rule of thumb, if the value is transient and just the
result of one operation, which needs to be observed directly after
by another code segment, then never store it on the instance.
BuildStream is complicated in the sense that it is multi processed
and it is not always obvious how to pass the transient state around
as a return value or a function parameter. Do not fall prey to this
obstacle and pollute object instances with transient state.
Instead, always refactor the surrounding code so that the value
is propagated to the desired end point via a well defined API, either
by adding new code paths or changing the design such that the
architecture continues to make sense.
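As a small sketch (with hypothetical names), prefer returning the transient
value over stashing it on the instance::

  class SourceFetcher():

      # Discouraged: stash the transient result on the instance, for
      # other code to read back later through an accessor.
      def fetch_and_stash(self):
          self._last_fetch_result = self._do_fetch()

      # Preferred: propagate the transient value directly to the caller.
      def fetch(self):
          return self._do_fetch()

      def _do_fetch(self):
          return True  # placeholder for the real fetching logic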
Refactor the codebase as needed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Especially when implementing features, always move the BuildStream
codebase forward as a whole.
Taking a shortcut is alright when prototyping, but circumventing
existing architecture and design to get a feature implemented without
re-designing the surrounding architecture to accommodate the new
feature is never acceptable upstream.
For example, let's say that you have to implement a feature and you've
successfully prototyped it, but it launches a ``Job`` directly from a
``Queue`` implementation to get the feature to work, while the ``Scheduler``
is normally responsible for dispatching ``Jobs`` for the elements on
a ``Queue``. This means that you've proven that your feature can work,
and now it is time to start working on a patch for upstream.
Consider what the scenario is and why you are circumventing the design,
and then redesign the ``Scheduler`` and ``Queue`` objects to accommodate for
the new feature and condition under which you need to dispatch a ``Job``,
or how you can give the ``Queue`` implementation the additional context it
needs.
Adding core plugins
-------------------
This is a checklist of things which need to be done when adding a new
core plugin to BuildStream proper.
Update documentation index
~~~~~~~~~~~~~~~~~~~~~~~~~~
The documentation generating scripts will automatically pick up your
newly added plugin and generate HTML, but will not add a link to the
documentation of your plugin automatically.
Whenever adding a new plugin, you must add an entry for it in ``doc/source/core_plugins.rst``.
Bump format version
~~~~~~~~~~~~~~~~~~~
In order for projects to assert that they have a new enough version
of BuildStream to use the new plugin, the ``BST_FORMAT_VERSION`` must
be incremented in the ``_versions.py`` file.
Remember to include in your plugin's main documentation, the format
version in which the plugin was introduced, using the standard annotation
which we use throughout the documentation, e.g.::
.. note::
The ``foo`` plugin is available since :ref:`format version 16 <project_format_version>`
Add tests
~~~~~~~~~
Needless to say, all new feature additions need to be tested. For ``Element``
plugins, these usually need to be added to the integration tests. For ``Source``
plugins, the tests are added in two ways:
* For most normal ``Source`` plugins, it is important to add a new ``Repo``
implementation for your plugin in the ``tests/testutils/repo/`` directory
and update ``ALL_REPO_KINDS`` in ``tests/testutils/repo/__init__.py``. This
will include your new ``Source`` implementation in a series of already existing
tests, ensuring it works well under normal operating conditions.
* For other source plugins, or in order to test edge cases, such as failure modes,
which are not tested under the normal test battery, add new tests in ``tests/sources``.
Extend the cachekey test
~~~~~~~~~~~~~~~~~~~~~~~~
For any newly added plugins, it is important to add some new simple elements
in ``tests/cachekey/project/elements`` or ``tests/cachekey/project/sources``,
and ensure that the newly added elements are depended on by ``tests/cachekey/project/target.bst``.
One new element should be added to the cache key test for every configuration
value which your plugin understands which can possibly affect the result of
your plugin's ``Plugin.get_unique_key()`` implementation.
This test ensures that cache keys do not unexpectedly change or become incompatible
due to code changes. As such, the cache key test should have full coverage of every
YAML configuration which can possibly affect cache key outcome at all times.
See the ``tests/cachekey/update.py`` file for instructions on running the updater,
you need to run the updater to generate the ``.expected`` files and add the new
``.expected`` files in the same commit which extends the cache key test.
Protocol buffers
----------------
BuildStream uses protobuf and gRPC for serialization and communication with
artifact cache servers. This requires ``.proto`` files and Python code
generated from the ``.proto`` files using protoc. All these files live in the
``buildstream/_protos`` directory. The generated files are included in the
git repository to avoid depending on grpcio-tools for user installations.
Regenerating code
~~~~~~~~~~~~~~~~~
When ``.proto`` files are modified, the corresponding Python code needs to
be regenerated. As a prerequisite for code generation you need to install
``grpcio-tools`` using pip or some other mechanism::
pip3 install --user grpcio-tools
To actually regenerate the code::
./setup.py build_grpc
Documenting
-----------
BuildStream starts out as a documented project from day one and uses
sphinx to document itself.
This section discusses formatting policies for editing files in the
``doc/source`` directory, and describes the details of how the docs are
generated so that you can easily generate and view the docs yourself before
submitting patches to the documentation.
For details on how API documenting comments and docstrings are formatted,
refer to the :ref:`documenting section of the coding guidelines
<contributing_documenting_symbols>`.
Documentation formatting policy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The BuildStream documentation style is as follows:
* Titles and headings require two leading empty lines above them.
Only the first word in a title should be capitalized.
* If there is an ``.. _internal_link:`` anchor, there should be two empty lines
above the anchor, followed by one leading empty line.
* Within a section, paragraphs should be separated by one empty line.
* Notes are defined using: ``.. note::`` blocks, followed by an empty line
and then indented (3 spaces) text.
* Other kinds of notes can be used throughout the documentation and will
be decorated in different ways, these work in the same way as ``.. note::`` does.
Feel free to also use ``.. attention::`` or ``.. important::`` to call special
attention to a paragraph, ``.. tip::`` to give the reader a special tip on how
to use an advanced feature or ``.. warning::`` to warn the user about a potential
misuse of the API and explain its consequences.
* Code blocks are defined using: ``.. code:: LANGUAGE`` blocks, followed by an empty
line and then indented (3 spaces) text. Note that the default language is ``python``.
* Cross references should be of the form ``:role:`target```.
* Explicit anchors can be declared as ``.. _anchor_name:`` on a line by itself.
* To cross reference arbitrary locations with, for example, the anchor ``anchor_name``,
always provide some explicit text in the link instead of deriving the text from
the target, e.g.: ``:ref:`Link text <anchor_name>```.
Note that the "_" prefix is not used when referring to the target.
For further information, please see the `Sphinx Documentation
<http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html>`_.
Building Docs
@@ -312,54 +1266,28 @@ the ``man/`` subdirectory, which will be automatically included
in the buildstream distribution.
Documenting conventions
~~~~~~~~~~~~~~~~~~~~~~~
We use the sphinx.ext.napoleon extension for the purpose of having
a bit nicer docstrings than the default sphinx docstrings.
A docstring for a method, class or function should have the following
format::
"""Brief description of entity
Args:
argument1 (type): Description of arg
argument2 (type): Description of arg
Returns:
(type): Description of returned thing of the specified type
Raises:
(SomeError): When some error occurs
(SomeOtherError): When some other error occurs
A detailed description can go here if one is needed, only
after the above part documents the calling conventions.
"""
Documentation Examples
~~~~~~~~~~~~~~~~~~~~~~
The examples section of the documentation contains a series of standalone
examples; here are the criteria for an example addition.
* The example has a ``${name}``.
* The example has a project users can copy and use.
* This project is added in the directory ``doc/examples/${name}``.
* The example has a documentation component.
* This is added at ``doc/source/examples/${name}.rst``
* A reference to ``examples/${name}`` is added to the toctree in ``doc/source/examples.rst``
* This documentation discusses the project elements declared in the project and may
provide some BuildStream command examples.
* This documentation links out to the reference manual at every opportunity.
* The example has a CI test component.
* This is an integration test added at ``tests/examples/${name}``.
* This test runs BuildStream in the ways described in the example
and asserts that we get the results which we advertise to users in
the said examples.
@@ -386,17 +1314,17 @@ The ``.run`` file format is just another YAML dictionary which consists of a
Each *command* is a dictionary, the members of which are listed here:
* ``directory``: The input file relative project directory.
* ``output``: The input file relative output html file to generate (optional).
* ``fake-output``: Don't really run the command, just pretend to and pretend
this was the output; an empty string will enable this too.
* ``command``: The command to run, without the leading ``bst``.
* ``shell``: Specifying ``True`` indicates that ``command`` should be run as
a shell command from the project directory, instead of a bst command (optional).
When adding a new ``.run`` file, one should normally also commit the new
resulting generated ``.html`` file(s) into the ``doc/source/sessions-stored/``
@@ -419,30 +1347,10 @@ regenerate them locally in order to build the docs.
command: build hello.bst
.. _contributing_testing:
Testing
-------
BuildStream uses pytest for regression tests and testing out
the behavior of newly added components.
@@ -495,6 +1403,7 @@ with::
Alternatively, any IDE plugin that uses pytest should automatically
detect the ``.pylintrc`` in the project's root directory.
Adding tests
~~~~~~~~~~~~
Tests are found in the tests subdirectory, inside of which
@@ -513,7 +1422,7 @@ a subdirectory beside your test in which to store data.
When creating a test that needs data, use the datafiles extension
to decorate your test case (again, examples exist in the existing
tests for this), documentation on the datafiles extension can
be found here: https://pypi.python.org/pypi/pytest-datafiles.
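A minimal data driven test might then look roughly like this (the ``cli``
helper fixture from ``tests/testutils`` and the element name are assumptions
for the sake of illustration)::

  import os
  import pytest

  # Project data for this test lives in a directory beside this file
  DATA_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'project')

  @pytest.mark.datafiles(DATA_DIR)
  def test_show_target(cli, datafiles):
      # 'datafiles' is a temporary copy of DATA_DIR provided by pytest-datafiles
      project = str(datafiles)

      # Run a bst command against the copied project and assert success
      result = cli.run(project=project, args=['show', 'target.bst'])
      result.assert_success()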
Tests that run a sandbox should be decorated with::
@@ -521,8 +1430,9 @@ Tests that run a sandbox should be decorated with::
and use the integration cli helper.
Measuring performance
---------------------
Benchmarking framework
......
......@@ -24,6 +24,9 @@ buildstream 1.3.1
o Add new `pip` source plugin for downloading python packages using pip,
based on requirements files from previous sources.
o Generate Docker images from built artifacts using
`contrib/bst-docker-import` script.
=================
buildstream 1.1.5
......
......@@ -426,6 +426,22 @@ class ArtifactCache():
raise ImplError("Cache '{kind}' does not implement contains()"
.format(kind=type(self).__name__))
# contains_subdir_artifact():
#
# Check whether an artifact element contains a digest for a subdir
# which is populated in the cache, i.e. non-dangling.
#
# Args:
# element (Element): The Element to check
# key (str): The cache key to use
# subdir (str): The subdir to check
#
# Returns: True if the subdir exists & is populated in the cache, False otherwise
#
def contains_subdir_artifact(self, element, key, subdir):
raise ImplError("Cache '{kind}' does not implement contains_subdir_artifact()"
.format(kind=type(self).__name__))
# list_artifacts():
#
# List artifacts in this cache in LRU order.
......@@ -551,11 +567,12 @@ class ArtifactCache():
# element (Element): The Element whose artifact is to be fetched
# key (str): The cache key to use
# progress (callable): The progress callback, if any
# subdir (str): The optional specific subdir to pull
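# excluded_subdirs (list): The optional list of subdirs to not pull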
#
# Returns:
# (bool): True if pull was successful, False if artifact was not available
#
def pull(self, element, key, *, progress=None):
def pull(self, element, key, *, progress=None, subdir=None, excluded_subdirs=None):
raise ImplError("Cache '{kind}' does not implement pull()"
.format(kind=type(self).__name__))
......
......@@ -92,6 +92,16 @@ class CASCache(ArtifactCache):
# This assumes that the repository doesn't have any dangling pointers
return os.path.exists(refpath)
def contains_subdir_artifact(self, element, key, subdir):
tree = self.resolve_ref(self.get_artifact_fullname(element, key))
# This assumes that the subdir digest is present in the element tree
subdirdigest = self._get_subdir(tree, subdir)
objpath = self.objpath(subdirdigest)
# True if subdir content is cached or if empty as expected
return os.path.exists(objpath)
def extract(self, element, key):
ref = self.get_artifact_fullname(element, key)
......@@ -228,7 +238,7 @@ class CASCache(ArtifactCache):
remotes_for_project = self._remotes[element._get_project()]
return any(remote.spec.push for remote in remotes_for_project)
def pull(self, element, key, *, progress=None):
def pull(self, element, key, *, progress=None, subdir=None, excluded_subdirs=None):
ref = self.get_artifact_fullname(element, key)
project = element._get_project()
......@@ -247,8 +257,14 @@ class CASCache(ArtifactCache):
tree.hash = response.digest.hash
tree.size_bytes = response.digest.size_bytes
self._fetch_directory(remote, tree)
# Check if the element artifact is already present; if so, just fetch the subdir
if subdir and os.path.exists(self.objpath(tree)):
self._fetch_subdir(remote, tree, subdir)
else:
# Fetch artifact, excluded_subdirs determined in pullqueue
self._fetch_directory(remote, tree, excluded_subdirs=excluded_subdirs)
# tree is the remote value, so it is the same with or without a dangling ref locally
self.set_ref(ref, tree)
element.info("Pulled artifact {} <- {}".format(display_key, remote.spec.url))
......@@ -668,8 +684,10 @@ class CASCache(ArtifactCache):
stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH)
for dirnode in directory.directories:
fullpath = os.path.join(dest, dirnode.name)
self._checkout(fullpath, dirnode.digest)
# Don't try to checkout a dangling ref
if os.path.exists(self.objpath(dirnode.digest)):
fullpath = os.path.join(dest, dirnode.name)
self._checkout(fullpath, dirnode.digest)
for symlinknode in directory.symlinks:
# symlink
......@@ -948,10 +966,12 @@ class CASCache(ArtifactCache):
# remote (Remote): The remote to use.
# dir_digest (Digest): Digest object for the directory to fetch.
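# excluded_subdirs (list): The optional list of subdirs to not fetch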
#
def _fetch_directory(self, remote, dir_digest):
def _fetch_directory(self, remote, dir_digest, *, excluded_subdirs=None):
fetch_queue = [dir_digest]
fetch_next_queue = []
batch = _CASBatchRead(remote)
if not excluded_subdirs:
excluded_subdirs = []
while len(fetch_queue) + len(fetch_next_queue) > 0:
if len(fetch_queue) == 0:
......@@ -966,8 +986,9 @@ class CASCache(ArtifactCache):
directory.ParseFromString(f.read())
for dirnode in directory.directories:
batch = self._fetch_directory_node(remote, dirnode.digest, batch,
fetch_queue, fetch_next_queue, recursive=True)
if dirnode.name not in excluded_subdirs:
batch = self._fetch_directory_node(remote, dirnode.digest, batch,
fetch_queue, fetch_next_queue, recursive=True)
for filenode in directory.files:
batch = self._fetch_directory_node(remote, filenode.digest, batch,
......@@ -976,6 +997,10 @@ class CASCache(ArtifactCache):
# Fetch final batch
self._fetch_directory_batch(remote, batch, fetch_queue, fetch_next_queue)
def _fetch_subdir(self, remote, tree, subdir):
subdirdigest = self._get_subdir(tree, subdir)
self._fetch_directory(remote, subdirdigest)
def _fetch_tree(self, remote, digest):
# download but do not store the Tree object
with tempfile.NamedTemporaryFile(dir=self.tmpdir) as out:
......
......@@ -110,6 +110,9 @@ class Context():
# Make sure the XDG vars are set in the environment before loading anything
self._init_xdg()
# Whether or not to attempt to pull buildtrees globally
self.pullbuildtrees = False
# Private variables
self._cache_key = None
self._message_handler = None
......@@ -160,7 +163,7 @@ class Context():
_yaml.node_validate(defaults, [
'sourcedir', 'builddir', 'artifactdir', 'logdir',
'scheduler', 'artifacts', 'logging', 'projects',
'cache'
'cache', 'pullbuildtrees'
])
for directory in ['sourcedir', 'builddir', 'artifactdir', 'logdir']:
......@@ -185,6 +188,9 @@ class Context():
# Load artifact share configuration
self.artifact_cache_specs = ArtifactCache.specs_from_config_node(defaults)
# Load pull buildtrees configuration
self.pullbuildtrees = _yaml.node_get(defaults, bool, 'pullbuildtrees', default_value=False)
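# Example of the corresponding user configuration (buildstream.conf):
#
#   pullbuildtrees: True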
# Load logging config
logging = _yaml.node_get(defaults, Mapping, 'logging')
_yaml.node_validate(logging, [
......
......@@ -305,10 +305,12 @@ def init(app, project_name, format_version, element_path, force):
help="Allow tracking to cross junction boundaries")
@click.option('--track-save', default=False, is_flag=True,
help="Deprecated: This is ignored")
@click.option('--pull-buildtrees', default=False, is_flag=True,
help="Pull buildtrees from a remote cache server")
@click.argument('elements', nargs=-1,
type=click.Path(readable=False))
@click.pass_obj
def build(app, elements, all_, track_, track_save, track_all, track_except, track_cross_junctions):
def build(app, elements, all_, track_, track_save, track_all, track_except, track_cross_junctions, pull_buildtrees):
"""Build elements in a pipeline"""
if (track_except or track_cross_junctions) and not (track_ or track_all):
......@@ -327,7 +329,8 @@ def build(app, elements, all_, track_, track_save, track_all, track_except, trac
track_targets=track_,
track_except=track_except,
track_cross_junctions=track_cross_junctions,
build_all=all_)
build_all=all_,
pull_buildtrees=pull_buildtrees)
##################################################################
......@@ -429,10 +432,12 @@ def track(app, elements, deps, except_, cross_junctions):
help='The dependency artifacts to pull (default: none)')
@click.option('--remote', '-r',
help="The URL of the remote cache (defaults to the first configured cache)")
@click.option('--pull-buildtrees', default=False, is_flag=True,
help="Pull buildtrees from a remote cache server")
@click.argument('elements', nargs=-1,
type=click.Path(readable=False))
@click.pass_obj
def pull(app, elements, deps, remote):
def pull(app, elements, deps, remote, pull_buildtrees):
"""Pull a built artifact from the configured remote artifact cache.
By default the artifact will be pulled one of the configured caches
......@@ -446,7 +451,7 @@ def pull(app, elements, deps, remote):
all: All dependencies
"""
with app.initialized(session_name="Pull"):
app.stream.pull(elements, selection=deps, remote=remote)
app.stream.pull(elements, selection=deps, remote=remote, pull_buildtrees=pull_buildtrees)
##################################################################
......
......@@ -32,9 +32,20 @@ class PullQueue(Queue):
complete_name = "Pulled"
resources = [ResourceType.DOWNLOAD, ResourceType.CACHE]
def __init__(self, scheduler, buildtrees=False):
super().__init__(scheduler)
# Current default exclusions on pull
self._excluded_subdirs = ["buildtree"]
self._subdir = None
# If buildtrees are to be pulled, remove the value from exclusion list
if buildtrees:
self._subdir = "buildtree"
self._excluded_subdirs.remove(self._subdir)
def process(self, element):
# returns whether an artifact was downloaded or not
if not element._pull():
if not element._pull(subdir=self._subdir, excluded_subdirs=self._excluded_subdirs):
raise SkipJob(self.action_name)
def status(self, element):
......@@ -49,7 +60,7 @@ class PullQueue(Queue):
if not element._can_query_cache():
return QueueStatus.WAIT
if element._pull_pending():
if element._pull_pending(subdir=self._subdir):
return QueueStatus.READY
else:
return QueueStatus.SKIP
......
......@@ -160,12 +160,14 @@ class Stream():
# track_cross_junctions (bool): Whether tracking should cross junction boundaries
# build_all (bool): Whether to build all elements, or only those
# which are required to build the target.
# pull_buildtrees (bool): Whether to pull buildtrees from a remote cache server
#
def build(self, targets, *,
track_targets=None,
track_except=None,
track_cross_junctions=False,
build_all=False):
build_all=False,
pull_buildtrees=False):
if build_all:
selection = PipelineSelection.ALL
......@@ -195,7 +197,10 @@ class Stream():
self._add_queue(track_queue, track=True)
if self._artifacts.has_fetch_remotes():
self._add_queue(PullQueue(self._scheduler))
# Query if pullbuildtrees has been set globally in user config
if self._context.pullbuildtrees:
pull_buildtrees = True
self._add_queue(PullQueue(self._scheduler, buildtrees=pull_buildtrees))
self._add_queue(FetchQueue(self._scheduler, skip_cached=True))
self._add_queue(BuildQueue(self._scheduler))
......@@ -295,7 +300,8 @@ class Stream():
#
def pull(self, targets, *,
selection=PipelineSelection.NONE,
remote=None):
remote=None,
pull_buildtrees=False):
use_config = True
if remote:
......@@ -310,8 +316,12 @@ class Stream():
if not self._artifacts.has_fetch_remotes():
raise StreamError("No artifact caches available for pulling artifacts")
# Query if pullbuildtrees has been set globally in user config
if self._context.pullbuildtrees:
pull_buildtrees = True
self._pipeline.assert_consistent(elements)
self._add_queue(PullQueue(self._scheduler))
self._add_queue(PullQueue(self._scheduler, buildtrees=pull_buildtrees))
self._enqueue_plan(elements)
self._run()
......
......@@ -1316,7 +1316,8 @@ class Element(Plugin):
#
@contextmanager
def _prepare_sandbox(self, scope, directory, deps='run', integrate=True):
with self.__sandbox(directory, config=self.__sandbox_config) as sandbox:
# bst shell and bst checkout require a local sandbox.
with self.__sandbox(directory, config=self.__sandbox_config, allow_remote=False) as sandbox:
# Configure always comes first, and we need it.
self.configure_sandbox(sandbox)
......@@ -1691,18 +1692,26 @@ class Element(Plugin):
# _pull_pending()
#
# Check whether the artifact will be pulled.
# Check whether the artifact will be pulled. If the pull operation is to
# include a specific subdir of the element artifact (from cli or user conf)
# then the local cache is queried for the subdir's existence.
#
# Args:
# subdir (str): The specific subdir to check for in the local cache, if the pull was invoked with one
#
# Returns:
# (bool): Whether a pull operation is pending
#
def _pull_pending(self):
def _pull_pending(self, subdir=None):
if self._get_workspace():
# Workspace builds are never pushed to artifact servers
return False
if self.__strong_cached:
# Artifact already in local cache
if self.__strong_cached and subdir:
# If we've specified a subdir, check if the subdir is cached locally
if self.__artifacts.contains_subdir_artifact(self, self.__strict_cache_key, subdir):
return False
elif self.__strong_cached:
return False
# Pull is pending if artifact remote server available
......@@ -1724,11 +1733,10 @@ class Element(Plugin):
self._update_state()
def _pull_strong(self, *, progress=None):
def _pull_strong(self, *, progress=None, subdir=None, excluded_subdirs=None):
weak_key = self._get_cache_key(strength=_KeyStrength.WEAK)
key = self.__strict_cache_key
if not self.__artifacts.pull(self, key, progress=progress):
if not self.__artifacts.pull(self, key, progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs):
return False
# update weak ref by pointing it to this newly fetched artifact
......@@ -1736,10 +1744,10 @@ class Element(Plugin):
return True
def _pull_weak(self, *, progress=None):
def _pull_weak(self, *, progress=None, subdir=None, excluded_subdirs=None):
weak_key = self._get_cache_key(strength=_KeyStrength.WEAK)
if not self.__artifacts.pull(self, weak_key, progress=progress):
if not self.__artifacts.pull(self, weak_key, progress=progress, subdir=subdir,
excluded_subdirs=excluded_subdirs):
return False
# extract strong cache key from this newly fetched artifact
......@@ -1757,17 +1765,17 @@ class Element(Plugin):
#
# Returns: True if the artifact has been downloaded, False otherwise
#
def _pull(self):
def _pull(self, subdir=None, excluded_subdirs=None):
context = self._get_context()
def progress(percent, message):
self.status(message)
# Attempt to pull artifact without knowing whether it's available
pulled = self._pull_strong(progress=progress)
pulled = self._pull_strong(progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs)
if not pulled and not self._cached() and not context.get_strict():
pulled = self._pull_weak(progress=progress)
pulled = self._pull_weak(progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs)
if not pulled:
return False
......@@ -1790,10 +1798,21 @@ class Element(Plugin):
if not self._cached():
return True
# Do not push tained artifact
# Do not push tainted artifact
if self.__get_tainted():
return True
# Do not push elements that have a dangling buildtree artifact unless element type is
# expected to have an empty buildtree directory
if not self.__artifacts.contains_subdir_artifact(self, self.__strict_cache_key, 'buildtree'):
return True
# strict_cache_key can't be relied on to be available when running in non-strict mode
context = self._get_context()
if not context.get_strict():
if not self.__artifacts.contains_subdir_artifact(self, self.__weak_cache_key, 'buildtree'):
return True
return False
# _push():
......@@ -2149,17 +2168,18 @@ class Element(Plugin):
# stdout (fileobject): The stream for stdout for the sandbox
# stderr (fileobject): The stream for stderr for the sandbox
# config (SandboxConfig): The SandboxConfig object
# allow_remote (bool): Whether the sandbox is allowed to be remote
#
# Yields:
# (Sandbox): A usable sandbox
#
@contextmanager
def __sandbox(self, directory, stdout=None, stderr=None, config=None):
def __sandbox(self, directory, stdout=None, stderr=None, config=None, allow_remote=True):
context = self._get_context()
project = self._get_project()
platform = Platform.get_platform()
if directory is not None and self.__use_remote_execution():
if directory is not None and allow_remote and self.__use_remote_execution():
self.info("Using a remote sandbox for artifact {} with directory '{}'".format(self.name, directory))
......@@ -2173,7 +2193,7 @@ class Element(Plugin):
yield sandbox
elif directory is not None and os.path.exists(directory):
if self.__remote_execution_url:
if allow_remote and self.__remote_execution_url:
self.warn("Artifact {} is configured to use remote execution but element plugin does not support it."
.format(self.name), detail="Element plugin '{kind}' does not support virtual directories."
.format(kind=self.get_kind()), warning_token="remote-failure")
......@@ -2193,7 +2213,8 @@ class Element(Plugin):
rootdir = tempfile.mkdtemp(prefix="{}-".format(self.normal_name), dir=context.builddir)
# Recursive contextmanager...
with self.__sandbox(rootdir, stdout=stdout, stderr=stderr, config=config) as sandbox:
with self.__sandbox(rootdir, stdout=stdout, stderr=stderr, config=config,
allow_remote=allow_remote) as sandbox:
yield sandbox
# Cleanup the build dir
......
......@@ -93,10 +93,10 @@ echo "INFO: Checking out $element ..." >&2
$bst_cmd checkout --tar "$element" "$checkout_tar" || die "Failed to checkout $element"
echo "INFO: Successfully checked out $element" >&2
echo "INFO: Importing Docker image ..."
echo "INFO: Importing Docker image ..." >&2
"${docker_import_cmd[@]}" "$checkout_tar" "$docker_image_tag" || die "Failed to import Docker image from tarball"
echo "INFO: Successfully import Docker image $docker_image_tag"
echo "INFO: Successfully import Docker image $docker_image_tag" >&2
echo "INFO: Cleaning up ..."
echo "INFO: Cleaning up ..." >&2
rm "$checkout_tar" || die "Failed to remove $checkout_tar"
echo "INFO: Clean up finished"
echo "INFO: Clean up finished" >&2
.. _bst_and_docker:
BuildStream and Docker
======================
BuildStream integrates with Docker in multiple ways. The sections below
describe how these integrations work.
Run BuildStream inside Docker
-----------------------------
Refer to the :ref:`BuildStream inside Docker <docker>` documentation for
instructions on how to run BuildStream as a Docker container.
Generate Docker images
----------------------
The
`bst-docker-import script <https://gitlab.com/BuildStream/buildstream/blob/master/contrib/bst-docker-import>`_
can be used to generate a Docker image from built artifacts.
You can download it and make it executable like this:
.. code:: bash
mkdir -p ~/.local/bin
curl --get https://gitlab.com/BuildStream/buildstream/raw/master/contrib/bst-docker-import > ~/.local/bin/bst-docker-import
chmod +x ~/.local/bin/bst-docker-import
Check if ``~/.local/bin`` appears in your PATH environment variable -- if it
doesn't, you should
`edit your ~/.profile so that it does <https://stackoverflow.com/questions/14637979/>`_.
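
For example, appending a line like the following to your ``~/.profile``
would typically do the trick (adjust as needed for your shell and setup):

.. code:: bash

  export PATH="${HOME}/.local/bin:${PATH}"
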
Once the script is available in your PATH and assuming you have Docker
installed, you can start using the ``bst-docker-import`` script. Here is a
minimal example to generate an image called ``bst-hello`` from an element
called ``hello.bst`` assuming it is already built:
.. code:: bash
bst-docker-import -t bst-hello hello.bst
This script can also be used if you are running BuildStream inside Docker. In
this case, you will need to supply the command that you are using to run
BuildStream using the ``-c`` option. If you are using the
`bst-here wrapper script <https://gitlab.com/BuildStream/buildstream/blob/master/contrib/bst-here>`_,
you can achieve the same results as the above example like this:
.. code:: bash
bst-docker-import -c bst-here -t bst-hello hello.bst
......@@ -8,3 +8,4 @@ Additional writings
additional_cachekeys
additional_sandboxing
additional_docker
......@@ -103,7 +103,7 @@ def test_commands(cli, cmd, word_idx, expected):
('bst --no-colors build -', 3, ['--all ', '--track ', '--track-all ',
'--track-except ',
'--track-cross-junctions ', '-J ',
'--track-save ']),
'--track-save ', '--pull-buildtrees ']),
# Test the behavior of completing after an option that has a
# parameter that cannot be completed, vs an option that has
......
import os
import shutil
import pytest
from tests.testutils import cli_integration as cli, create_artifact_share
from tests.testutils.integration import assert_contains
DATA_DIR = os.path.join(
os.path.dirname(os.path.realpath(__file__)),
"project"
)
# Remove artifact cache & set cli.config value of pullbuildtrees
# to false, which is the default user context
def default_state(cli, integration_cache, share):
# Removing individual artifact refs could leave the buildtree dir object
# behind in the cache, so remove the whole artifacts directory instead.
shutil.rmtree(os.path.join(integration_cache, 'artifacts'))
cli.configure({'pullbuildtrees': False, 'artifacts': {'url': share.repo, 'push': False}})
# A test to capture the integration of the pullbuildtrees
# behaviour, which by default is to not include the buildtree
# directory of an element
@pytest.mark.integration
@pytest.mark.datafiles(DATA_DIR)
def test_pullbuildtrees(cli, tmpdir, datafiles, integration_cache):
project = os.path.join(datafiles.dirname, datafiles.basename)
element_name = 'autotools/amhello.bst'
# Create artifact shares for pull & push testing
with create_artifact_share(os.path.join(str(tmpdir), 'remote1')) as share1,\
create_artifact_share(os.path.join(str(tmpdir), 'remote2')) as share2:
cli.configure({
'artifacts': {'url': share1.repo, 'push': True},
})
# Build the autotools element, check that it is pushed, then delete the local cache
result = cli.run(project=project, args=['build', element_name])
assert result.exit_code == 0
assert cli.get_element_state(project, element_name) == 'cached'
assert share1.has_artifact('test', element_name, cli.get_element_key(project, element_name))
default_state(cli, integration_cache, share1)
# Pull artifact with default config, assert that pulling again
# doesn't create a pull job, then assert that pulling with the
# buildtrees user config set does create a pull job.
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with default config, then assert that pulling
# with buildtrees cli flag set creates a pull job.
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with pullbuildtrees set in user config, then assert
# that pulling with the same user config doesn't create a pull job,
# nor does pulling with the buildtrees cli flag set.
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name not in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with default config and buildtrees cli flag set, then assert
# that pulling with pullbuildtrees set in user config doesn't create a pull
# job.
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Assert that a partial build element (not containing a populated buildtree dir)
# can't be pushed to an artifact share, then assert that a complete build element
# can be. This will attempt a partial pull from share1 and then a partial push
# to share2
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'artifacts': {'url': share2.repo, 'push': True}})
result = cli.run(project=project, args=['push', element_name])
assert element_name not in result.get_pushed_elements()
assert not share2.has_artifact('test', element_name, cli.get_element_key(project, element_name))
# Assert that after pulling the missing buildtree the element artifact can be
# successfully pushed to the remote. This will attempt to pull the buildtree
# from share1 and then a 'complete' push to share2
cli.configure({'artifacts': {'url': share1.repo, 'push': False}})
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'artifacts': {'url': share2.repo, 'push': True}})
result = cli.run(project=project, args=['push', element_name])
assert element_name in result.get_pushed_elements()
assert share2.has_artifact('test', element_name, cli.get_element_key(project, element_name))
default_state(cli, integration_cache, share1)
......@@ -128,7 +128,7 @@ class ArtifactShare():
valid_chars = string.digits + string.ascii_letters + '-._'
element_name = ''.join([
x if x in valid_chars else '_'
x if x in valid_chars else '-'
for x in element_name
])
artifact_key = '{0}/{1}/{2}'.format(project_name, element_name, cache_key)
......