@@ -3,72 +3,137 @@ Contributing
Some tips and guidelines for developers hacking on BuildStream
.. _contributing_filing_issues:
Filing issues
-------------
If you are experiencing an issue with BuildStream, or would like to submit a patch
to fix an issue, then you should first search the list of `open issues <https://gitlab.com/BuildStream/buildstream/issues>`_
to see if the issue is already filed, and `open an issue <https://gitlab.com/BuildStream/buildstream/issues/new>`_
if no issue already exists.
For policies on how to submit an issue and how to use our project labels,
we recommend that you read the `policies guide
<https://gitlab.com/BuildStream/nosoftware/alignment/blob/master/BuildStream_policies.md>`_.
.. _contributing_fixing_bugs:
Fixing bugs
-----------
Before fixing a bug, it is preferred that an :ref:`issue be filed <contributing_filing_issues>`
first in order to better document the defect; however, this need not be followed to the
letter for minor fixes.
Patches which fix bugs should always come with a regression test.
.. _contributing_adding_features:
Adding new features
-------------------
Feature additions should be proposed on the `mailing list
<https://mail.gnome.org/mailman/listinfo/buildstream-list>`_
before being considered for inclusion. To save time and avoid any frustration,
we strongly recommend proposing your new feature in advance of commencing work.
Once consensus has been reached on the mailing list, then the proposing
party should :ref:`file an issue <contributing_filing_issues>` to track the
work. Please use the *bst_task* template for issues which represent
feature additions.
New features must be well documented and tested in our test suite.
It is expected that the individual submitting the work take ownership
of their feature within BuildStream for a reasonable timeframe of at least
one release cycle after their work has landed on the master branch. This is
to say that the submitter is expected to address and fix any side effects,
bugs or regressions which may have fallen through the cracks in the review
process, giving us a reasonable timeframe for identifying these.
.. _contributing_submitting_patches:
Submitting patches
------------------
Branches must be submitted as merge requests in gitlab. If the branch
fixes an issue or is related to any issues, these issues must be mentioned
in the merge request or preferably the commit messages themselves.
Ask for developer access
~~~~~~~~~~~~~~~~~~~~~~~~
If you want to submit a patch, do ask for developer permissions, either
by asking us directly on our public IRC channel (irc://irc.gnome.org/#buildstream)
or by visiting our `project page on GitLab <https://gitlab.com/BuildStream/buildstream>`_
and using the GitLab UI to ask for permission.
This will make your contribution experience smoother, as you will not
need to set up any complicated CI settings, and rebasing your branch
against the upstream master branch will be less painful.
Branch names
~~~~~~~~~~~~
Branch names for merge requests should be prefixed with the submitter's
name or nickname, followed by a forward slash, and then a descriptive
name. e.g.::
username/fix-that-bug
This allows us to more easily identify which branch does what and
belongs to whom, especially so that we can effectively clean up stale
branches in the upstream repository over time.
Merge requests
~~~~~~~~~~~~~~
Once you have created a local branch, you can push it to the upstream
BuildStream repository using the command line::
git push origin username/fix-that-bug:username/fix-that-bug
GitLab will respond to this with a message and a link to allow you to create
a new merge request. You can also `create a merge request for an existing branch
<https://gitlab.com/BuildStream/buildstream/merge_requests/new>`_.
You may open merge requests for the branches you create before you are ready
to have them reviewed and considered for inclusion if you like. Until your merge
request is ready for review, the merge request title must be prefixed with the
``WIP:`` identifier.
Organized commits
~~~~~~~~~~~~~~~~~
Submitted branches must not contain a history of the work done in the
feature branch. For example, if you had to change your approach, or
have a later commit which fixes something in a previous commit on your
branch, we do not want to include the history of how you came up with
your patch in the upstream master branch.
Please use git's interactive rebase feature in order to compose a clean
patch series suitable for submission upstream.
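For example, assuming your branch was created from the upstream ``master``
branch, a typical cleanup before submission might look like this (a sketch;
adapt the remote and branch names to your own setup)::

  git fetch origin
  git rebase -i origin/master

In the interactive editor you can then ``reword``, ``squash`` or ``fixup``
individual commits so that each commit in the resulting series stands on
its own.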
Every commit in the series should pass the test suite; this is very important
for tracking down regressions and performing git bisections in the future.
We prefer that documentation changes be submitted in separate commits from
the code changes which they document, and newly added test cases are also
preferred in separate commits.
If a commit in your branch modifies behavior such that a test must also
be changed to match the new behavior, then the tests should be updated
with the same commit, so that every commit passes its own tests.
Commit messages
~~~~~~~~~~~~~~~
Commit messages must be formatted with a brief summary line, followed by
an empty line and then a free form detailed description of the change.
The summary line must start with what changed, followed by a colon and
a very brief description of the change.
If the commit fixes an issue, or is related to an issue, then the issue
number must be referenced in the commit message.
**Example**::
element.py: Added the frobnicator so that foos are properly frobbed.
@@ -77,33 +142,464 @@ a very brief description of the change.
the element. Elements that are not properly frobnicated raise
an error to inform the user of invalid frobnication rules.
Fixes #123
In the case that you have a commit which necessarily modifies multiple
components, then the summary line should still mention generally what
changed (if possible), followed by a colon and a brief summary.
In this case the free form detailed description of the change should
contain a bullet list describing what was changed in each component
separately.
**Example**::
artifact cache: Fixed automatic expiry in the local cache
o _artifactcache/artifactcache.py: Updated the API contract
of ArtifactCache.remove() so that something detailed is
explained here.
o _artifactcache/cascache.py: Adhere to the new API contract
dictated by the abstract ArtifactCache class.
o tests/artifactcache/expiry.py: Modified test expectations to
match the new behavior.
This is a part of #123
Coding guidelines
-----------------
This section discusses coding style and other guidelines for hacking
on BuildStream. This is important to read through for writing any non-trivial
patches and especially outlines what people should watch out for when
reviewing patches.
Much of the rationale behind what is laid out in this section has good
traceability of lines of code with *git blame*, overall sensible
modular structure, consistency in how we write code, and long term maintenance
in mind.
Approximate PEP-8 Style
~~~~~~~~~~~~~~~~~~~~~~~
Python coding style for BuildStream is approximately `pep8 <https://www.python.org/dev/peps/pep-0008/>`_.
We have a couple of minor exceptions to this standard; for instance, we don't
want to compromise code readability by being overly restrictive on line length.
The pep8 linter will run automatically when :ref:`running the test suite <contributing_testing>`.
Line lengths
''''''''''''
Regarding laxness on the line length in our linter settings, it should be clarified
that the line length limit is a hard limit which causes the linter to bail out
and reject commits which exceed it - not an invitation to write exceedingly
long lines of code, comments, or API documenting docstrings.
Code, comments and docstrings should strive to remain written for approximately 80
or 90 character lines, where exceptions can be made when code would be less readable
when exceeding 80 or 90 characters (often this happens in conditional statements
when raising an exception, for example). Or, when comments contain a long link that
causes the given line to exceed 80 or 90 characters, we don't want this to cause
the linter to refuse the commit.
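To illustrate, a line such as the following is generally more readable left
slightly over the limit than wrapped awkwardly (the error class usage and
message here are purely illustrative)::

  raise ElementError("{}: Failed to frobnicate element {}".format(self, element.name))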
.. _contributing_documenting_symbols:
Documenting symbols
~~~~~~~~~~~~~~~~~~~
In BuildStream, we maintain what we call a *"Public API Surface"* that
is guaranteed to be stable and unchanging across stable releases. The
symbols which fall into this special class are documented using Python's
standard *docstrings*, while all other internals of BuildStream are documented
with comments above the related symbol.
When documenting the public API surface which is rendered in the reference
manual, we always mention the major version in which the API was introduced,
as shown in the examples below. If a public API exists without the *Since*
annotation, this is taken to mean that it was available since the first stable
release 1.0.
Here are some examples to get the hang of the format of API documenting
comments and docstrings.
**Public API Surface method**::
def frobnicate(self, source, *, frobilicious=False):
"""Frobnicates this element with the specified source
Args:
source (Source): The Source to frobnicate with
frobilicious (bool): Optionally specify that frobnication should be
performed fribiliciously
Returns:
(Element): The frobnicated version of this Element.
*Since: 1.2*
"""
...
**Internal method**::
# frobnicate():
#
# Frobnicates this element with the specified source
#
# Args:
# source (Source): The Source to frobnicate with
# frobilicious (bool): Optionally specify that frobnication should be
# performed fribiliciously
#
# Returns:
# (Element): The frobnicated version of this Element.
#
def frobnicate(self, source, *, frobilicious=False):
...
**Public API Surface instance variable**::
def __init__(self, context, element):
self.name = self._compute_name(context, element)
"""The name of this foo
*Since: 1.2*
"""
.. note::
Python does not support docstrings on instance variables, but sphinx does
pick them up and includes them in the generated documentation.
**Internal instance variable**::
def __init__(self, context, element):
self.name = self._compute_name(context, element) # The name of this foo
**Internal instance variable (long)**::
def __init__(self, context, element):
# This instance variable required a longer explanation, so
# it is on a line above the instance variable declaration.
self.name = self._compute_name(context, element)
**Public API Surface class**::
class Foo(Bar):
"""The main Foo object in the data model
Explanation about Foo. Note that we always document
the constructor arguments here, and not beside the __init__
method.
Args:
context (Context): The invocation Context
count (int): The number to count
*Since: 1.2*
"""
...
**Internal class**::
# Foo()
#
# The main Foo object in the data model
#
# Args:
# context (Context): The invocation Context
# count (int): The number to count
#
class Foo(Bar):
...
.. _contributing_class_order:
Class structure and ordering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When creating or modifying an object class in BuildStream, it is
important to keep in mind the order in which symbols should appear
and keep this consistent.
Here is an example to illustrate the expected ordering of symbols
on a Python class in BuildStream::
class Foo(Bar):
# Public class-wide variables come first, if any.
# Private class-wide variables, if any
# Now we have the dunder/magic methods, always starting
# with the __init__() method.
def __init__(self, name):
super().__init__()
# NOTE: In the instance initializer we declare any instance variables,
# always declare the public instance variables (if any) before
# the private ones.
#
# It is preferred to avoid any public instance variables, and
# always expose an accessor method for it instead.
#
# Public instance variables
#
self.name = name # The name of this foo
#
# Private instance variables
#
self._count = 0 # The count of this foo
################################################
# Abstract Methods #
################################################
# NOTE: Abstract methods in BuildStream are allowed to have
# default implementations.
#
# Subclasses must NEVER override any method which was
# not advertized as an abstract method by the parent class.
# frob()
#
# Implementors should implement this to frob this foo
# count times if possible.
#
# Args:
# count (int): The number of times to frob this foo
#
# Returns:
# (int): The number of times this foo was frobbed.
#
# Raises:
# (FooError): Implementors are expected to raise this error
#
def frob(self, count):
#
# An abstract method in BuildStream is allowed to have
# a default implementation.
#
self._count = self._do_frobbing(count)
return self._count
################################################
# Implementation of abstract methods #
################################################
# NOTE: Implementations of abstract methods defined by
# the parent class should NEVER document the API
# here redundantly.
def frobbish(self):
#
# Implementation of the "frobbish" abstract method
# defined by the parent Bar class.
#
return True
################################################
# Public Methods #
################################################
# NOTE: Public methods here are the ones which are expected
# to be called from outside of this class.
#
# These, along with any abstract methods, usually
# constitute the API surface of this class.
# frobnicate()
#
# Perform the frobnication process on this Foo
#
# Raises:
# (FrobError): In the case that a frobnication error was
# encountered
#
def frobnicate(self):
frobnicator.frobnicate(self)
# set_count()
#
# Sets the count of this foo
#
# Args:
# count (int): The new count to set
#
def set_count(self, count):
self._count = count
# get_count()
#
# Accessor for the count value of this foo.
#
# Returns:
# (int): The count of this foo
#
def get_count(self):
return self._count
################################################
# Private Methods #
################################################
# NOTE: Private methods are the ones which are internal
# implementation details of this class.
#
# Even though these are private implementation
# details, they still MUST have API documenting
# comments on them.
# _do_frobbing()
#
# Does the actual frobbing
#
# Args:
# count (int): The number of times to frob this foo
#
# Returns:
# (int): The number of times this foo was frobbed.
#
def _do_frobbing(self, count):
return count
.. _contributing_public_and_private:
Public and private symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~
BuildStream mostly follows the PEP-8 for defining *public* and *private* symbols
for any given class, with some deviations. Please read the `section on inheritance
<https://www.python.org/dev/peps/pep-0008/#designing-for-inheritance>`_ for
reference on how the PEP-8 defines public and non-public.
* A *public* symbol is any symbol which you expect to be used by clients
of your class or module within BuildStream.
Public symbols are written without any leading underscores.
* A *private* symbol is any symbol which is entirely internal to your class
or module within BuildStream. These symbols cannot ever be accessed by
external clients or modules.
A private symbol must be denoted by a leading underscore.
* When a class can have subclasses, then private symbols should be denoted
by two leading underscores. For example, the ``Sandbox`` or ``Platform``
classes which have various implementations, or the ``Element`` and ``Source``
classes which plugins derive from.
The double leading underscore naming convention invokes Python's name
mangling algorithm which helps prevent namespace collisions in the case
that subclasses might have a private symbol with the same name, as
illustrated in the sketch below.
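Here is a minimal sketch of that mangling behavior (the class and attribute
names are purely illustrative)::

  class Base():
      def __init__(self):
          # Stored on the instance as '_Base__config' due to name mangling
          self.__config = {}

  class Derived(Base):
      def __init__(self):
          super().__init__()
          # A separate attribute, stored as '_Derived__config', so it
          # cannot collide with the parent's private state
          self.__config = {'depth': 1}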
In BuildStream, we have what we call a *"Public API Surface"*, as previously
mentioned in :ref:`contributing_documenting_symbols`. In the :ref:`next section
<contributing_public_api_surface>` we will discuss the *"Public API Surface"* and
outline the exceptions to the rules discussed here.
.. _contributing_public_api_surface:
Public API surface
~~~~~~~~~~~~~~~~~~
BuildStream exposes what we call a *"Public API Surface"* which is stable
and unchanging. This is for the sake of stability of the interfaces which
plugins use, so it can also be referred to as the *"Plugin facing API"*.
Any symbols which are a part of the *"Public API Surface*" are never allowed
to change once they have landed in a stable release version of BuildStream. As
such, we aim to keep the *"Public API Surface"* as small as possible at all
times, and never expose any internal details to plugins inadvertently.
One problem which arises from this is that we end up having symbols
which are *public* according to the :ref:`rules discussed in the previous section
<contributing_public_and_private>`, but must be hidden away from the
*"Public API Surface"*. For example, BuildStream internal classes need
to invoke methods on the ``Element`` and ``Source`` classes, whereas these
methods need to be hidden from the *"Public API Surface"*.
This is where BuildStream deviates from the PEP-8 standard for public
and private symbol naming.
In order to disambiguate between:
* Symbols which are publicly accessible details of the ``Element`` class, can
be accessed by BuildStream internals, but must remain hidden from the
*"Public API Surface"*.
* Symbols which are private to the ``Element`` class, and cannot be accessed
from outside of the ``Element`` class at all.
We denote the former category of symbols with only a single underscore, and the latter
category of symbols with a double underscore. We often refer to this distinction
as *"API Private"* (the former category) and *"Local Private"* (the latter category).
Classes which are a part of the *"Public API Surface"* and require this disambiguation
were not discussed in :ref:`the class ordering section <contributing_class_order>`, for
these classes, the *"API Private"* symbols always come **before** the *"Local Private"*
symbols in the class declaration.
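As a rough sketch of how this plays out on a *"Public API Surface"* class,
consider the following (the method names are illustrative only)::

  class Element():

      # Public API Surface: documented with docstrings, usable by plugins
      def frobnicate(self):
          ...

      # API Private: hidden from plugins, but called from other parts
      # of the BuildStream core
      def _schedule_assemble(self):
          ...

      # Local Private: only ever called from within Element itself
      def __update_state(self):
          ...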
Modules which are not a part of the *"Public API Surface"* have their Python files
prefixed with a single underscore, and are not imported in BuildStream's master
``__init__.py`` which is used by plugins.
.. note::
The ``utils.py`` module is public and exposes a handful of utility functions,
however many of the functions it provides are *"API Private"*.
In this case, the *"API Private"* functions are prefixed with a single underscore.
Any objects which are a part of the *"Public API Surface"* should be exposed via the
toplevel ``__init__.py`` of the ``buildstream`` package.
File naming convention
~~~~~~~~~~~~~~~~~~~~~~
With the exception of a few helper objects and data structures, we structure
the code in BuildStream such that every filename is named after the object it
implements. E.g. The ``Project`` object is implemented in ``_project.py``, the
``Context`` object in ``_context.py``, the base ``Element`` class in ``element.py``,
etc.
As mentioned in the previous section, objects which are not a part of the
:ref:`public, plugin facing API surface <contributing_public_api_surface>` have their
filenames prefixed with a leading underscore (like ``_context.py`` and ``_project.py``
in the examples above).
When an object name has multiple words in it, e.g. ``ArtifactCache``, then the
resulting file is named all in lower case without any underscore to separate
words. In the case of ``ArtifactCache``, the filename implementing this object
is found at ``_artifactcache/artifactcache.py``.
Imports
~~~~~~~
Module imports inside BuildStream are done with relative ``.`` notation:
**Good**::
from ._context import Context
**Bad**::
from buildstream._context import Context
The exception to the above rule is when authoring plugins;
plugins do not reside in the same namespace so they must
@@ -122,128 +618,586 @@ This makes things clear when reading code that said functions
are not defined in the same file but come from utils.py for example.
.. _contributing_instance_variables:
Instance variables
~~~~~~~~~~~~~~~~~~
It is preferred that all instance state variables be declared as :ref:`private symbols
<contributing_public_and_private>`, however in some cases, especially when the state
is immutable for the object's life time (like an ``Element`` name for example), it
is acceptable to save some typing by using a publicly accessible instance variable.
It is never acceptable to modify the value of an instance variable from outside
of the declaring class, even if the variable is *public*. In other words, the class
which exposes an instance variable is the only one in control of the value of this
variable.
* If an instance variable is public and must be modified; then it must be
modified using a :ref:`mutator <contributing_accessor_mutator>`.
* Ideally for better encapsulation, all object state is declared as
:ref:`private instance variables <contributing_public_and_private>` and can
only be accessed by external classes via public :ref:`accessors and mutators
<contributing_accessor_mutator>`.
.. note::
In some cases, we may use small data structures declared as objects for the sake
of better readability, where the object class itself has no real supporting code.
In these exceptions, it can be acceptable to modify the instance variables
of these objects directly, unless they are otherwise documented to be immutable.
.. _contributing_accessor_mutator:
Accessors and mutators
~~~~~~~~~~~~~~~~~~~~~~
An accessor and a mutator are methods defined on the object class to access (get)
or mutate (set) a value owned by the declaring class, respectively.
An accessor might derive the returned value from one or more of its components,
and a mutator might have side effects, or delegate the mutation to a component.
Accessors and mutators are always :ref:`public <contributing_public_and_private>`
(even if they might have a single leading underscore and are considered
:ref:`API Private <contributing_public_api_surface>`), as their purpose is to
enforce encapsulation with regards to any accesses to the state which is owned
by the declaring class.
Accessors and mutators are functions prefixed with ``get_`` and ``set_``
respectively, e.g.::
class Foo():
def __init__(self):
# Declare some internal state
self._count = 0
# get_count()
#
# Gets the count of this Foo.
#
# Returns:
# (int): The current count of this Foo
#
def get_count(self):
return self._count
# set_count()
#
# Sets the count of this Foo.
#
# Args:
# count (int): The new count for this Foo
#
def set_count(self, count):
self._count = count
.. attention::
We are aware that Python offers a facility for accessors and
mutators using the ``@property`` decorator instead. Do not use
the ``@property`` decorator.
The decision to use explicitly defined functions instead of the
``@property`` decorator is rather arbitrary, there is not much
technical merit to preferring one technique over the other.
However as :ref:`discussed below <contributing_always_consistent>`,
it is of the utmost importance that we do not mix both techniques
in the same codebase.
.. _contributing_abstract_methods:
Abstract methods
~~~~~~~~~~~~~~~~
In BuildStream, an *"Abstract Method"* is a bit of a misnomer and does
not match up to how Python defines abstract methods; we need to seek out
new nomenclature to refer to these methods.
In Python, an *"Abstract Method"* is a method which **must** be
implemented by a subclass, whereas all methods in Python can be
overridden.
In BuildStream, we use the term *"Abstract Method"*, to refer to
a method which **can** be overridden by a subclass, whereas it
is **illegal** to override any other method.
* Abstract methods are allowed to have default implementations.
* Subclasses are not allowed to redefine the calling signature
of an abstract method, or redefine the API contract in any way.
* Subclasses are not allowed to override any other methods.
The key here is that in BuildStream, we consider it unacceptable
that a subclass overrides a method of its parent class unless
the said parent class has explicitly given permission to subclasses
to do so, and outlined the API contract for this purpose. No surprises
are allowed.
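For example, a subclass of the ``Foo`` class from the :ref:`class ordering
section <contributing_class_order>` might implement its ``frob()`` abstract
method like this (a sketch only), and override nothing else::

  class BetterFoo(Foo):

      # Implementation of the frob() abstract method advertised by Foo.
      #
      # The calling signature and API contract must remain exactly as
      # documented by the parent class.
      #
      def frob(self, count):
          return super().frob(count * 2)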
Error handling
~~~~~~~~~~~~~~
In BuildStream, all non-recoverable errors are expressed via
subclasses of the ``BstError`` exception.
This exception is handled deep in the core in a few places, and
it is rarely necessary to handle a ``BstError``.
Raising exceptions
''''''''''''''''''
When writing code in the BuildStream core, ensure that all system
calls and third party library calls are wrapped in a ``try:`` block,
and raise a descriptive ``BstError`` of the appropriate class explaining
what exactly failed.
Ensure that the original system call error is formatted into your new
exception, and that you use the Python ``from`` semantic to retain the
original call trace, example::
try:
os.utime(self._refpath(ref))
except FileNotFoundError as e:
raise ArtifactError("Attempt to access unavailable artifact: {}".format(e)) from e
Enhancing exceptions
''''''''''''''''''''
Sometimes the ``BstError`` originates from a lower level component,
and the code segment which raised the exception did not have enough context
to create a complete, informative summary of the error for the user.
In these cases it is necessary to handle the error and raise a new
one, e.g.::
try:
extracted_artifact = self._artifacts.extract(self, cache_key)
except ArtifactError as e:
raise ElementError("Failed to extract {} while checking out {}: {}"
.format(cache_key, self.name, e)) from e
Programming errors
''''''''''''''''''
Sometimes you are writing code and have detected an unexpected condition,
or a broken invariant which the code cannot be expected to handle
gracefully.
In these cases, do **not** raise any of the ``BstError`` class exceptions.
Instead, use the ``assert`` statement, e.g.::
assert utils._is_main_process(), \
"Attempted to save workspace configuration from child process"
This will result in a ``BUG`` message with the stack trace included being
logged and reported in the frontend.
BstError parameters
'''''''''''''''''''
When raising ``BstError`` class exceptions, there are some common properties
which can be useful to know about (an example follows this list):
* **message:** The brief human readable error, will be formatted on one line in the frontend.
* **detail:** An optional detailed human readable message to accompany the **message** summary
of the error. This is often used to recommend the user some course of action, or to provide
additional context about the error.
* **temporary:** Some errors are allowed to be *temporary*, this attribute is only
observed from child processes which fail in a temporary way. This distinction
is used to determine whether the task should be *retried* or not. An error is usually
only a *temporary* error if the cause of the error was a network timeout.
* **reason:** A machine readable identifier for the error. This is used for the purpose
of regression testing, such that we check that BuildStream has errored out for the
expected reason in a given failure mode.
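For instance, a ``Source`` implementation hitting a network timeout might
raise its error roughly like this (the message, reason string and call site
are hypothetical)::

  raise SourceError("{}: Timed out fetching from {}".format(self, url),
                    detail="Check your network connection and try again",
                    reason="fetch-timeout",
                    temporary=True)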
Documenting Exceptions
''''''''''''''''''''''
We have already seen :ref:`some examples <contributing_class_order>` of how
exceptions are documented in API documenting comments, but this is worth some
additional disambiguation.
* Only document the exceptions which are raised directly by the function in question.
It is otherwise nearly impossible to keep track of what exceptions *might* be raised
indirectly by calling the given function.
* For a regular public or private method, your audience is a caller of the function;
document the exception in terms of what exception might be raised as a result of
calling this method.
* For an :ref:`abstract method <contributing_abstract_methods>`, your audience is the
implementor of the method in a subclass; document the exception in terms of what
exception is prescribed for the implementing class to raise.
.. _contributing_always_consistent:
Always be consistent
~~~~~~~~~~~~~~~~~~~~
There are various ways to define functions and classes in Python,
which has evolved with various features over time.
In BuildStream, we may not have leveraged all of the nice features
we could have; that is okay, and where it does not break API, we
can consider changing it.
Even if you know there is a *better* way to do a given thing in
Python when compared to the way we do it in BuildStream, *do not do it*.
Consistency of how we do things in the codebase is more important
than the actual way in which things are done, always.
Instead, if you like a certain Python feature and think the BuildStream
codebase should use it, then propose your change on the `mailing list
<https://mail.gnome.org/mailman/listinfo/buildstream-list>`_. Chances
are that we will reach agreement to use your preferred approach, and
in that case, it will be important to apply the change uniformly
across the entire codebase, such that we continue to have a consistent
codebase.
Avoid tail calling
~~~~~~~~~~~~~~~~~~
With the exception of tail calling with simple functions from
the standard Python library, such as splitting and joining lines
of text and encoding/decoding text, always avoid tail calling.
**Good**::
# Variables that we will need declared up top
context = self._get_context()
workspaces = context.get_workspaces()
...
# Saving the workspace configuration
workspaces.save_config()
**Bad**::
# Saving the workspace configuration
self._get_context().get_workspaces().save_config()
**Acceptable**::
# Decode the raw text loaded from a log file for display,
# join them into a single utf-8 string and strip away any
# trailing whitespace.
return '\n'.join([line.decode('utf-8') for line in lines]).rstrip()
When you need to obtain a delegate object via an accessor function,
either do it at the beginning of the function, or at the beginning
of a code block within the function that will use that object.
There are several reasons for this convention:
* When observing a stack trace, it is always faster and easier to
determine what went wrong when all statements are on separate lines.
* We always want individual lines to trace back to their origin as
much as possible for the purpose of tracing the history of code
with *git blame*.
One day, you might need the ``Context`` or ``Workspaces`` object
in the same function for another reason, at which point it will
be unacceptable to leave the existing line as written, because it
will introduce a redundant accessor to the same object, so the
line written as::
self._get_context().get_workspaces().save_config()
Will have to change at that point, meaning we lose the valuable
information of which commit originally introduced this call
when running *git blame*.
* For similar reasons, we prefer delegate objects be accessed near
the beginning of a function or code block so that there is less
chance that this statement will have to move in the future, if
the same function or code block needs the delegate object for any
other reason.
Aside from this, code is generally more legible and uniform when
variables are declared at the beginning of function blocks.
Vertical stacking of modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the sake of overall comprehensibility of the BuildStream
architecture, it is important that we retain the vertical stacking
order of the dependencies and knowledge of modules as much as
possible, and avoid any cyclic relationships between modules.
For instance, the ``Source`` objects are owned by ``Element``
objects in the BuildStream data model, and as such the ``Element``
will delegate some activities to the ``Source`` objects in its
possession. The ``Source`` objects should however never call functions
on the ``Element`` object, nor should the ``Source`` object itself
have any understanding of what an ``Element`` is.
If you are implementing a low level utility layer, for example
as a part of the ``YAML`` loading code layers, it can be tempting
to derive context from the higher levels of the codebase which use
these low level utilities, instead of defining properly stand alone
APIs for these utilities to work: Never do this.
Unfortunately, unlike other languages where include files play
a big part in ensuring that it is difficult to make a mess, Python
allows you to just call methods on arbitrary objects passed through
a function call without having to import the module which defines
those methods - this quickly leads to cyclic dependencies between modules
if the developer does not take special care to ensure this does not
happen.
Minimize arguments in methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When creating an object, or adding a new API method to an existing
object, always strive to keep as much context as possible on the
object itself rather than expecting callers of the methods to provide
everything the method needs every time.
If the value or object that is needed in a function call is a constant
for the lifetime of the object which exposes the given method, then
that value or object should be passed in the constructor instead of
via a method call.
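For example (a sketch with hypothetical names), rather than threading the
``context`` through every call::

  class ArtifactVerifier():
      def verify(self, artifact, context):
          ...

prefer passing it once at construction time, since it is constant for the
lifetime of the verifier::

  class ArtifactVerifier():
      def __init__(self, context):
          self._context = context

      def verify(self, artifact):
          ...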
Minimize API surfaces
~~~~~~~~~~~~~~~~~~~~~
When creating an object, or adding new functionality in any way,
try to keep the number of :ref:`public, outward facing <contributing_public_and_private>`
symbols to a minimum, this is important for both
:ref:`internal and public, plugin facing API surfaces <contributing_public_api_surface>`.
When anyone visits a file, there are two levels of comprehension:
* What do I need to know in order to *use* this object.
* What do I need to know in order to *modify* this object.
For the former, we want the reader to understand with as little effort
as possible, what the public API contract is for a given object and consequently,
how it is expected to be used. This is also why we
:ref:`order the symbols of a class <contributing_class_order>` in such a way
as to keep all outward facing public API surfaces at the top of the file, so that the
reader never needs to dig deep into the bottom of the file to find something they
might need to use.
For the latter, when it comes to having to modify the file or add functionality,
you want to retain as much freedom as possible to modify internals, while
being sure that nothing external will be affected by internal modifications.
Less client facing API means that you have less surrounding code to modify
when your API changes. Further, ensuring that there is minimal outward facing
API for any module minimizes the complexity for the developer working on
that module, by limiting the considerations needed regarding external side
effects of their modifications to the module.
When modifying a file, one should not have to understand or think too
much about external side effects, when the API surface of the file is
well documented and minimal.
When adding new API to a given object for a new purpose, consider whether
the new API is in any way redundant with other API (should this value now
go into the constructor, since we use it more than once? could this
value be passed along with another function, and the other function renamed,
to better suit the new purposes of this module/object?) and repurpose
the outward facing API of an object as a whole every time.
Avoid transient state on instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
At times, it can be tempting to store transient state that is
the result of one operation on an instance, only to be retrieved
later via an accessor function elsewhere.
As a basic rule of thumb, if the value is transient and just the
result of one operation, which needs to be observed directly after
by another code segment, then never store it on the instance.
BuildStream is complicated in the sense that it is multi processed
and it is not always obvious how to pass the transient state around
as a return value or a function parameter. Do not fall prey to this
obstacle and pollute object instances with transient state.
Instead, always refactor the surrounding code so that the value
is propagated to the desired end point via a well defined API, either
by adding new code paths or changing the design such that the
architecture continues to make sense.
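As a small sketch (with hypothetical names), prefer returning the transient
value over stashing it on the instance::

  class SourceFetcher():

      # Discouraged: stash the transient result on the instance, for
      # other code to read back later through an accessor.
      def fetch_and_stash(self):
          self._last_fetch_result = self._do_fetch()

      # Preferred: propagate the transient value directly to the caller.
      def fetch(self):
          return self._do_fetch()

      def _do_fetch(self):
          return True  # placeholder for the real fetching logic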
Refactor the codebase as needed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Especially when implementing features, always move the BuildStream
codebase forward as a whole.
Taking a shortcut is alright when prototyping, but circumventing
existing architecture and design to get a feature implemented without
re-designing the surrounding architecture to accommodate the new
feature is never acceptable upstream.
For example, let's say that you have to implement a feature and you've
successfully prototyped it, but it launches a ``Job`` directly from a
``Queue`` implementation to get the feature to work, while the ``Scheduler``
is normally responsible for dispatching ``Jobs`` for the elements on
a ``Queue``. This means that you've proven that your feature can work,
and now it is time to start working on a patch for upstream.
Consider what the scenario is and why you are circumventing the design,
and then redesign the ``Scheduler`` and ``Queue`` objects to accommodate for
the new feature and condition under which you need to dispatch a ``Job``,
or how you can give the ``Queue`` implementation the additional context it
needs.
Adding core plugins
-------------------
This is a checklist of things which need to be done when adding a new
core plugin to BuildStream proper.
Update documentation index
~~~~~~~~~~~~~~~~~~~~~~~~~~
The documentation generating scripts will automatically pick up your
newly added plugin and generate HTML, but will not add a link to the
documentation of your plugin automatically.
Whenever adding a new plugin, you must add an entry for it in ``doc/source/core_plugins.rst``.
Bump format version
~~~~~~~~~~~~~~~~~~~
In order for projects to assert that they have a new enough version
of BuildStream to use the new plugin, the ``BST_FORMAT_VERSION`` must
be incremented in the ``_versions.py`` file.
Remember to include in your plugin's main documentation, the format
version in which the plugin was introduced, using the standard annotation
which we use throughout the documentation, e.g.::
.. note::
The ``foo`` plugin is available since :ref:`format version 16 <project_format_version>`
Add tests
~~~~~~~~~
Needless to say, all new feature additions need to be tested. For ``Element``
plugins, these usually need to be added to the integration tests. For ``Source``
plugins, the tests are added in two ways:
* For most normal ``Source`` plugins, it is important to add a new ``Repo``
implementation for your plugin in the ``tests/testutils/repo/`` directory
and update ``ALL_REPO_KINDS`` in ``tests/testutils/repo/__init__.py``. This
will include your new ``Source`` implementation in a series of already existing
tests, ensuring it works well under normal operating conditions.
* For other source plugins, or in order to test edge cases, such as failure modes,
which are not tested under the normal test battery, add new tests in ``tests/sources``.
Extend the cachekey test
~~~~~~~~~~~~~~~~~~~~~~~~
For any newly added plugins, it is important to add some new simple elements
in ``tests/cachekey/project/elements`` or ``tests/cachekey/project/sources``,
and ensure that the newly added elements are depended on by ``tests/cachekey/project/target.bst``.
One new element should be added to the cache key test for every configuration
value which your plugin understands which can possibly affect the result of
your plugin's ``Plugin.get_unique_key()`` implementation.
This test ensures that cache keys do not unexpectedly change or become incompatible
due to code changes. As such, the cache key test should have full coverage of every
YAML configuration which can possibly affect cache key outcome at all times.
See the ``tests/cachekey/update.py`` file for instructions on running the updater,
you need to run the updater to generate the ``.expected`` files and add the new
``.expected`` files in the same commit which extends the cache key test.
Protocol buffers
----------------
BuildStream uses protobuf and gRPC for serialization and communication with
artifact cache servers. This requires ``.proto`` files and Python code
generated from the ``.proto`` files using protoc. All these files live in the
``buildstream/_protos`` directory. The generated files are included in the
git repository to avoid depending on grpcio-tools for user installations.
Regenerating code
~~~~~~~~~~~~~~~~~
When ``.proto`` files are modified, the corresponding Python code needs to
be regenerated. As a prerequisite for code generation you need to install
``grpcio-tools`` using pip or some other mechanism::
pip3 install --user grpcio-tools
To actually regenerate the code::
./setup.py build_grpc
Documenting
-----------
BuildStream starts out as a documented project from day one and uses
sphinx to document itself.
This section discusses formatting policies for editing files in the
``doc/source`` directory, and describes the details of how the docs are
generated so that you can easily generate and view the docs yourself before
submitting patches to the documentation.
For details on how API documenting comments and docstrings are formatted,
refer to the :ref:`documenting section of the coding guidelines
<contributing_documenting_symbols>`.
Documentation formatting policy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The BuildStream documentation style is as follows:
* Titles and headings require two leading empty lines above them.
Only the first word in a title should be capitalized.
* If there is an ``.. _internal_link:`` anchor, there should be two empty lines
above the anchor, followed by one leading empty line.
* Within a section, paragraphs should be separated by one empty line.
* Notes are defined using: ``.. note::`` blocks, followed by an empty line
and then indented (3 spaces) text.
* Other kinds of notes can be used throughout the documentation and will
be decorated in different ways, these work in the same way as ``.. note::`` does.
Feel free to also use ``.. attention::`` or ``.. important::`` to call special
attention to a paragraph, ``.. tip::`` to give the reader a special tip on how
to use an advanced feature or ``.. warning::`` to warn the user about a potential
misuse of the API and explain its consequences.
* Code blocks are defined using: ``.. code:: LANGUAGE`` blocks, followed by an empty
line and then indented (3 spaces) text. Note that the default language is ``python``.
* Cross references should be of the form ``:role:`target```.
* Explicit anchors can be declared as ``.. _anchor_name:`` on a line by itself.
* To cross reference arbitrary locations with, for example, the anchor ``anchor_name``,
always provide some explicit text in the link instead of deriving the text from
the target, e.g.: ``:ref:`Link text <anchor_name>```.
Note that the "_" prefix is not used when referring to the target.
For further information, please see the `Sphinx Documentation
<http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html>`_.
Building Docs
@@ -312,54 +1266,28 @@ the ``man/`` subdirectory, which will be automatically included
in the buildstream distribution.
Documenting conventions
~~~~~~~~~~~~~~~~~~~~~~~
We use the sphinx.ext.napoleon extension for the purpose of having
a bit nicer docstrings than the default sphinx docstrings.
A docstring for a method, class or function should have the following
format::
"""Brief description of entity
Args:
argument1 (type): Description of arg
argument2 (type): Description of arg
Returns:
(type): Description of returned thing of the specified type
Raises:
(SomeError): When some error occurs
(SomeOtherError): When some other error occurs
A detailed description can go here if one is needed, only
after the above part documents the calling conventions.
"""
Documentation Examples
~~~~~~~~~~~~~~~~~~~~~~
The examples section of the documentation contains a series of standalone
examples; here are the criteria for an example addition.
* The example has a ``${name}``.
* The example has a project users can copy and use.
* This project is added in the directory ``doc/examples/${name}``.
* The example has a documentation component.
* This is added at ``doc/source/examples/${name}.rst``
* A reference to ``examples/${name}`` is added to the toctree in ``doc/source/examples.rst``
* This documentation discusses the project elements declared in the project and may
provide some BuildStream command examples.
* This documentation links out to the reference manual at every opportunity.
* The example has a CI test component.
* This is an integration test added at ``tests/examples/${name}``.
* This test runs BuildStream in the ways described in the example
and asserts that we get the results which we advertise to users in
the said examples.
@@ -386,17 +1314,17 @@ The ``.run`` file format is just another YAML dictionary which consists of a
Each *command* is a dictionary, the members of which are listed here:
* ``directory``: The input file relative project directory.
* ``output``: The input file relative output html file to generate (optional).
* ``fake-output``: Don't really run the command, just pretend to and pretend
this was the output; an empty string will enable this too.
* ``command``: The command to run, without the leading ``bst``.
* ``shell``: Specifying ``True`` indicates that ``command`` should be run as
a shell command from the project directory, instead of a bst command (optional).
When adding a new ``.run`` file, one should normally also commit the new
resulting generated ``.html`` file(s) into the ``doc/source/sessions-stored/``
@@ -419,30 +1347,10 @@ regenerate them locally in order to build the docs.
command: build hello.bst
.. _contributing_testing:
Testing
-------
BuildStream uses pytest for regression tests and testing out
the behavior of newly added components.
@@ -495,6 +1403,7 @@ with::
Alternatively, any IDE plugin that uses pytest should automatically
detect the ``.pylintrc`` in the project's root directory.
Adding tests
~~~~~~~~~~~~
Tests are found in the tests subdirectory, inside of which
@@ -513,7 +1422,7 @@ a subdirectory beside your test in which to store data.
When creating a test that needs data, use the datafiles extension
to decorate your test case (again, examples exist in the existing
tests for this), documentation on the datafiles extension can
be found here: https://pypi.python.org/pypi/pytest-datafiles.
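A minimal data driven test might then look roughly like this (the ``cli``
helper fixture from ``tests/testutils`` and the element name are assumptions
for the sake of illustration)::

  import os
  import pytest

  # Project data for this test lives in a directory beside this file
  DATA_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'project')

  @pytest.mark.datafiles(DATA_DIR)
  def test_show_target(cli, datafiles):
      # 'datafiles' is a temporary copy of DATA_DIR provided by pytest-datafiles
      project = str(datafiles)

      # Run a bst command against the copied project and assert success
      result = cli.run(project=project, args=['show', 'target.bst'])
      result.assert_success()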
Tests that run a sandbox should be decorated with::
@@ -521,8 +1430,9 @@ Tests that run a sandbox should be decorated with::
and use the integration cli helper.
Measuring performance
---------------------
Benchmarking framework
......
......@@ -24,6 +24,9 @@ buildstream 1.3.1
o Add new `pip` source plugin for downloading python packages using pip,
based on requirements files from previous sources.
o Generate Docker images from built artifacts using
`contrib/bst-docker-import` script.
=================
buildstream 1.1.5
......
......@@ -426,6 +426,22 @@ class ArtifactCache():
raise ImplError("Cache '{kind}' does not implement contains()"
.format(kind=type(self).__name__))
# contains_subdir_artifact():
#
# Check whether an artifact element contains a digest for a subdir
# which is populated in the cache, i.e. non-dangling.
#
# Args:
# element (Element): The Element to check
# key (str): The cache key to use
# subdir (str): The subdir to check
#
# Returns: True if the subdir exists & is populated in the cache, False otherwise
#
def contains_subdir_artifact(self, element, key, subdir):
raise ImplError("Cache '{kind}' does not implement contains_subdir_artifact()"
.format(kind=type(self).__name__))
# list_artifacts():
#
# List artifacts in this cache in LRU order.
......@@ -551,11 +567,12 @@ class ArtifactCache():
# element (Element): The Element whose artifact is to be fetched
# key (str): The cache key to use
# progress (callable): The progress callback, if any
# subdir (str): The optional specific subdir to pull
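# excluded_subdirs (list): The optional list of subdirs to not pull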
#
# Returns:
# (bool): True if pull was successful, False if artifact was not available
#
def pull(self, element, key, *, progress=None):
def pull(self, element, key, *, progress=None, subdir=None, excluded_subdirs=None):
raise ImplError("Cache '{kind}' does not implement pull()"
.format(kind=type(self).__name__))
......
......@@ -92,6 +92,16 @@ class CASCache(ArtifactCache):
# This assumes that the repository doesn't have any dangling pointers
return os.path.exists(refpath)
def contains_subdir_artifact(self, element, key, subdir):
tree = self.resolve_ref(self.get_artifact_fullname(element, key))
# This assumes that the subdir digest is present in the element tree
subdirdigest = self._get_subdir(tree, subdir)
objpath = self.objpath(subdirdigest)
# True if subdir content is cached or if empty as expected
return os.path.exists(objpath)
def extract(self, element, key):
ref = self.get_artifact_fullname(element, key)
......@@ -228,7 +238,7 @@ class CASCache(ArtifactCache):
remotes_for_project = self._remotes[element._get_project()]
return any(remote.spec.push for remote in remotes_for_project)
def pull(self, element, key, *, progress=None):
def pull(self, element, key, *, progress=None, subdir=None, excluded_subdirs=None):
ref = self.get_artifact_fullname(element, key)
project = element._get_project()
......@@ -247,8 +257,14 @@ class CASCache(ArtifactCache):
tree.hash = response.digest.hash
tree.size_bytes = response.digest.size_bytes
self._fetch_directory(remote, tree)
# Check if the element artifact is already present; if so, just fetch the subdir
if subdir and os.path.exists(self.objpath(tree)):
self._fetch_subdir(remote, tree, subdir)
else:
# Fetch artifact, excluded_subdirs determined in pullqueue
self._fetch_directory(remote, tree, excluded_subdirs=excluded_subdirs)
# tree is the remote value, so it is the same with or without a dangling ref locally
self.set_ref(ref, tree)
element.info("Pulled artifact {} <- {}".format(display_key, remote.spec.url))
......@@ -668,8 +684,10 @@ class CASCache(ArtifactCache):
stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH)
for dirnode in directory.directories:
fullpath = os.path.join(dest, dirnode.name)
self._checkout(fullpath, dirnode.digest)
# Don't try to checkout a dangling ref
if os.path.exists(self.objpath(dirnode.digest)):
fullpath = os.path.join(dest, dirnode.name)
self._checkout(fullpath, dirnode.digest)
for symlinknode in directory.symlinks:
# symlink
......@@ -948,10 +966,12 @@ class CASCache(ArtifactCache):
# remote (Remote): The remote to use.
# dir_digest (Digest): Digest object for the directory to fetch.
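# excluded_subdirs (list): The optional list of subdirs to not fetch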
#
def _fetch_directory(self, remote, dir_digest):
def _fetch_directory(self, remote, dir_digest, *, excluded_subdirs=None):
fetch_queue = [dir_digest]
fetch_next_queue = []
batch = _CASBatchRead(remote)
if not excluded_subdirs:
excluded_subdirs = []
while len(fetch_queue) + len(fetch_next_queue) > 0:
if len(fetch_queue) == 0:
......@@ -966,8 +986,9 @@ class CASCache(ArtifactCache):
directory.ParseFromString(f.read())
for dirnode in directory.directories:
batch = self._fetch_directory_node(remote, dirnode.digest, batch,
fetch_queue, fetch_next_queue, recursive=True)
if dirnode.name not in excluded_subdirs:
batch = self._fetch_directory_node(remote, dirnode.digest, batch,
fetch_queue, fetch_next_queue, recursive=True)
for filenode in directory.files:
batch = self._fetch_directory_node(remote, filenode.digest, batch,
......@@ -976,6 +997,10 @@ class CASCache(ArtifactCache):
# Fetch final batch
self._fetch_directory_batch(remote, batch, fetch_queue, fetch_next_queue)
def _fetch_subdir(self, remote, tree, subdir):
subdirdigest = self._get_subdir(tree, subdir)
self._fetch_directory(remote, subdirdigest)
def _fetch_tree(self, remote, digest):
# download but do not store the Tree object
with tempfile.NamedTemporaryFile(dir=self.tmpdir) as out:
......
......@@ -110,6 +110,9 @@ class Context():
# Make sure the XDG vars are set in the environment before loading anything
self._init_xdg()
# Whether or not to attempt to pull buildtrees globally
self.pullbuildtrees = False
# Private variables
self._cache_key = None
self._message_handler = None
......@@ -160,7 +163,7 @@ class Context():
_yaml.node_validate(defaults, [
'sourcedir', 'builddir', 'artifactdir', 'logdir',
'scheduler', 'artifacts', 'logging', 'projects',
'cache'
'cache', 'pullbuildtrees'
])
for directory in ['sourcedir', 'builddir', 'artifactdir', 'logdir']:
......@@ -185,6 +188,9 @@ class Context():
# Load artifact share configuration
self.artifact_cache_specs = ArtifactCache.specs_from_config_node(defaults)
# Load pull buildtrees configuration
self.pullbuildtrees = _yaml.node_get(defaults, bool, 'pullbuildtrees', default_value=False)
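# Example of the corresponding user configuration (buildstream.conf):
#
#   pullbuildtrees: True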
# Load logging config
logging = _yaml.node_get(defaults, Mapping, 'logging')
_yaml.node_validate(logging, [
......
......@@ -305,10 +305,12 @@ def init(app, project_name, format_version, element_path, force):
help="Allow tracking to cross junction boundaries")
@click.option('--track-save', default=False, is_flag=True,
help="Deprecated: This is ignored")
@click.option('--pull-buildtrees', default=False, is_flag=True,
help="Pull buildtrees from a remote cache server")
@click.argument('elements', nargs=-1,
type=click.Path(readable=False))
@click.pass_obj
def build(app, elements, all_, track_, track_save, track_all, track_except, track_cross_junctions):
def build(app, elements, all_, track_, track_save, track_all, track_except, track_cross_junctions, pull_buildtrees):
"""Build elements in a pipeline"""
if (track_except or track_cross_junctions) and not (track_ or track_all):
......@@ -327,7 +329,8 @@ def build(app, elements, all_, track_, track_save, track_all, track_except, trac
track_targets=track_,
track_except=track_except,
track_cross_junctions=track_cross_junctions,
build_all=all_)
build_all=all_,
pull_buildtrees=pull_buildtrees)
##################################################################
......@@ -429,10 +432,12 @@ def track(app, elements, deps, except_, cross_junctions):
help='The dependency artifacts to pull (default: none)')
@click.option('--remote', '-r',
help="The URL of the remote cache (defaults to the first configured cache)")
@click.option('--pull-buildtrees', default=False, is_flag=True,
help="Pull buildtrees from a remote cache server")
@click.argument('elements', nargs=-1,
type=click.Path(readable=False))
@click.pass_obj
def pull(app, elements, deps, remote):
def pull(app, elements, deps, remote, pull_buildtrees):
"""Pull a built artifact from the configured remote artifact cache.
By default the artifact will be pulled one of the configured caches
......@@ -446,7 +451,7 @@ def pull(app, elements, deps, remote):
all: All dependencies
"""
with app.initialized(session_name="Pull"):
app.stream.pull(elements, selection=deps, remote=remote)
app.stream.pull(elements, selection=deps, remote=remote, pull_buildtrees=pull_buildtrees)
##################################################################
......
......@@ -32,9 +32,20 @@ class PullQueue(Queue):
complete_name = "Pulled"
resources = [ResourceType.DOWNLOAD, ResourceType.CACHE]
def __init__(self, scheduler, buildtrees=False):
super().__init__(scheduler)
# Current default exclusions on pull
self._excluded_subdirs = ["buildtree"]
self._subdir = None
# If buildtrees are to be pulled, remove the value from exclusion list
if buildtrees:
self._subdir = "buildtree"
self._excluded_subdirs.remove(self._subdir)
def process(self, element):
# returns whether an artifact was downloaded or not
if not element._pull():
if not element._pull(subdir=self._subdir, excluded_subdirs=self._excluded_subdirs):
raise SkipJob(self.action_name)
def status(self, element):
......@@ -49,7 +60,7 @@ class PullQueue(Queue):
if not element._can_query_cache():
return QueueStatus.WAIT
if element._pull_pending():
if element._pull_pending(subdir=self._subdir):
return QueueStatus.READY
else:
return QueueStatus.SKIP
......
......@@ -160,12 +160,14 @@ class Stream():
# track_cross_junctions (bool): Whether tracking should cross junction boundaries
# build_all (bool): Whether to build all elements, or only those
# which are required to build the target.
# pull_buildtrees (bool): Whether to pull buildtrees from a remote cache server
#
def build(self, targets, *,
track_targets=None,
track_except=None,
track_cross_junctions=False,
build_all=False):
build_all=False,
pull_buildtrees=False):
if build_all:
selection = PipelineSelection.ALL
......@@ -195,7 +197,10 @@ class Stream():
self._add_queue(track_queue, track=True)
if self._artifacts.has_fetch_remotes():
self._add_queue(PullQueue(self._scheduler))
# Query if pullbuildtrees has been set globally in user config
if self._context.pullbuildtrees:
pull_buildtrees = True
self._add_queue(PullQueue(self._scheduler, buildtrees=pull_buildtrees))
self._add_queue(FetchQueue(self._scheduler, skip_cached=True))
self._add_queue(BuildQueue(self._scheduler))
......@@ -295,7 +300,8 @@ class Stream():
#
def pull(self, targets, *,
selection=PipelineSelection.NONE,
remote=None):
remote=None,
pull_buildtrees=False):
use_config = True
if remote:
......@@ -310,8 +316,12 @@ class Stream():
if not self._artifacts.has_fetch_remotes():
raise StreamError("No artifact caches available for pulling artifacts")
# Query if pullbuildtrees has been set globally in user config
if self._context.pullbuildtrees:
pull_buildtrees = True
self._pipeline.assert_consistent(elements)
self._add_queue(PullQueue(self._scheduler))
self._add_queue(PullQueue(self._scheduler, buildtrees=pull_buildtrees))
self._enqueue_plan(elements)
self._run()
......
......@@ -1316,7 +1316,8 @@ class Element(Plugin):
#
@contextmanager
def _prepare_sandbox(self, scope, directory, deps='run', integrate=True):
with self.__sandbox(directory, config=self.__sandbox_config) as sandbox:
# bst shell and bst checkout require a local sandbox.
with self.__sandbox(directory, config=self.__sandbox_config, allow_remote=False) as sandbox:
# Configure always comes first, and we need it.
self.configure_sandbox(sandbox)
......@@ -1691,18 +1692,26 @@ class Element(Plugin):
# _pull_pending()
#
# Check whether the artifact will be pulled.
# Check whether the artifact will be pulled. If the pull operation is to
# include a specific subdir of the element artifact (from cli or user conf)
# then the local cache is queried for the subdir's existence.
#
# Args:
# subdir (str): The specific subdir to check for in the local cache, if the pull was invoked with one
#
# Returns:
# (bool): Whether a pull operation is pending
#
def _pull_pending(self):
def _pull_pending(self, subdir=None):
if self._get_workspace():
# Workspace builds are never pushed to artifact servers
return False
if self.__strong_cached:
# Artifact already in local cache
if self.__strong_cached and subdir:
# If we've specified a subdir, check if the subdir is cached locally
if self.__artifacts.contains_subdir_artifact(self, self.__strict_cache_key, subdir):
return False
elif self.__strong_cached:
return False
# Pull is pending if artifact remote server available
......@@ -1724,11 +1733,10 @@ class Element(Plugin):
self._update_state()
def _pull_strong(self, *, progress=None):
def _pull_strong(self, *, progress=None, subdir=None, excluded_subdirs=None):
weak_key = self._get_cache_key(strength=_KeyStrength.WEAK)
key = self.__strict_cache_key
if not self.__artifacts.pull(self, key, progress=progress):
if not self.__artifacts.pull(self, key, progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs):
return False
# update weak ref by pointing it to this newly fetched artifact
......@@ -1736,10 +1744,10 @@ class Element(Plugin):
return True
def _pull_weak(self, *, progress=None):
def _pull_weak(self, *, progress=None, subdir=None, excluded_subdirs=None):
weak_key = self._get_cache_key(strength=_KeyStrength.WEAK)
if not self.__artifacts.pull(self, weak_key, progress=progress):
if not self.__artifacts.pull(self, weak_key, progress=progress, subdir=subdir,
excluded_subdirs=excluded_subdirs):
return False
# extract strong cache key from this newly fetched artifact
......@@ -1757,17 +1765,17 @@ class Element(Plugin):
#
# Returns: True if the artifact has been downloaded, False otherwise
#
def _pull(self):
def _pull(self, subdir=None, excluded_subdirs=None):
context = self._get_context()
def progress(percent, message):
self.status(message)
# Attempt to pull artifact without knowing whether it's available
pulled = self._pull_strong(progress=progress)
pulled = self._pull_strong(progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs)
if not pulled and not self._cached() and not context.get_strict():
pulled = self._pull_weak(progress=progress)
pulled = self._pull_weak(progress=progress, subdir=subdir, excluded_subdirs=excluded_subdirs)
if not pulled:
return False
......@@ -1790,10 +1798,21 @@ class Element(Plugin):
if not self._cached():
return True
# Do not push tained artifact
# Do not push tainted artifact
if self.__get_tainted():
return True
# Do not push elements that have a dangling buildtree artifact unless element type is
# expected to have an empty buildtree directory
if not self.__artifacts.contains_subdir_artifact(self, self.__strict_cache_key, 'buildtree'):
return True
# strict_cache_key can't be relied on to be available when running in non-strict mode
context = self._get_context()
if not context.get_strict():
if not self.__artifacts.contains_subdir_artifact(self, self.__weak_cache_key, 'buildtree'):
return True
return False
# _push():
......@@ -2149,17 +2168,18 @@ class Element(Plugin):
# stdout (fileobject): The stream for stdout for the sandbox
# stderr (fileobject): The stream for stderr for the sandbox
# config (SandboxConfig): The SandboxConfig object
# allow_remote (bool): Whether the sandbox is allowed to be remote
#
# Yields:
# (Sandbox): A usable sandbox
#
@contextmanager
def __sandbox(self, directory, stdout=None, stderr=None, config=None):
def __sandbox(self, directory, stdout=None, stderr=None, config=None, allow_remote=True):
context = self._get_context()
project = self._get_project()
platform = Platform.get_platform()
if directory is not None and self.__use_remote_execution():
if directory is not None and allow_remote and self.__use_remote_execution():
self.info("Using a remote sandbox for artifact {} with directory '{}'".format(self.name, directory))
......@@ -2173,7 +2193,7 @@ class Element(Plugin):
yield sandbox
elif directory is not None and os.path.exists(directory):
if self.__remote_execution_url:
if allow_remote and self.__remote_execution_url:
self.warn("Artifact {} is configured to use remote execution but element plugin does not support it."
.format(self.name), detail="Element plugin '{kind}' does not support virtual directories."
.format(kind=self.get_kind()), warning_token="remote-failure")
......@@ -2193,7 +2213,8 @@ class Element(Plugin):
rootdir = tempfile.mkdtemp(prefix="{}-".format(self.normal_name), dir=context.builddir)
# Recursive contextmanager...
with self.__sandbox(rootdir, stdout=stdout, stderr=stderr, config=config) as sandbox:
with self.__sandbox(rootdir, stdout=stdout, stderr=stderr, config=config,
allow_remote=allow_remote) as sandbox:
yield sandbox
# Cleanup the build dir
......
......@@ -93,10 +93,10 @@ echo "INFO: Checking out $element ..." >&2
$bst_cmd checkout --tar "$element" "$checkout_tar" || die "Failed to checkout $element"
echo "INFO: Successfully checked out $element" >&2
echo "INFO: Importing Docker image ..."
echo "INFO: Importing Docker image ..." >&2
"${docker_import_cmd[@]}" "$checkout_tar" "$docker_image_tag" || die "Failed to import Docker image from tarball"
echo "INFO: Successfully import Docker image $docker_image_tag"
echo "INFO: Successfully import Docker image $docker_image_tag" >&2
echo "INFO: Cleaning up ..."
echo "INFO: Cleaning up ..." >&2
rm "$checkout_tar" || die "Failed to remove $checkout_tar"
echo "INFO: Clean up finished"
echo "INFO: Clean up finished" >&2
.. _bst_and_docker:
BuildStream and Docker
======================
BuildStream integrates with Docker in multiple ways. The sections below
describe how these integrations work.
Run BuildStream inside Docker
-----------------------------
Refer to the :ref:`BuildStream inside Docker <docker>` documentation for
instructions on how to run BuildStream as a Docker container.
Generate Docker images
----------------------
The
`bst-docker-import script <https://gitlab.com/BuildStream/buildstream/blob/master/contrib/bst-docker-import>`_
can be used to generate a Docker image from built artifacts.
You can download it and make it executable like this:
.. code:: bash
mkdir -p ~/.local/bin
curl --get https://gitlab.com/BuildStream/buildstream/raw/master/contrib/bst-docker-import > ~/.local/bin/bst-docker-import
chmod +x ~/.local/bin/bst-docker-import
Check if ``~/.local/bin`` appears in your PATH environment variable -- if it
doesn't, you should
`edit your ~/.profile so that it does <https://stackoverflow.com/questions/14637979/>`_.
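
For example, appending a line like the following to your ``~/.profile``
would typically do the trick (adjust as needed for your shell and setup):

.. code:: bash

  export PATH="${HOME}/.local/bin:${PATH}"
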
Once the script is available in your PATH and assuming you have Docker
installed, you can start using the ``bst-docker-import`` script. Here is a
minimal example to generate an image called ``bst-hello`` from an element
called ``hello.bst`` assuming it is already built:
.. code:: bash
bst-docker-import -t bst-hello hello.bst
This script can also be used if you are running BuildStream inside Docker. In
this case, you will need to supply the command that you are using to run
BuildStream using the ``-c`` option. If you are using the
`bst-here wrapper script <https://gitlab.com/BuildStream/buildstream/blob/master/contrib/bst-here>`_,
you can achieve the same results as the above example like this:
.. code:: bash
bst-docker-import -c bst-here -t bst-hello hello.bst
......@@ -8,3 +8,4 @@ Additional writings
additional_cachekeys
additional_sandboxing
additional_docker
......@@ -103,7 +103,7 @@ def test_commands(cli, cmd, word_idx, expected):
('bst --no-colors build -', 3, ['--all ', '--track ', '--track-all ',
'--track-except ',
'--track-cross-junctions ', '-J ',
'--track-save ']),
'--track-save ', '--pull-buildtrees ']),
# Test the behavior of completing after an option that has a
# parameter that cannot be completed, vs an option that has
......
import os
import shutil
import pytest
from tests.testutils import cli_integration as cli, create_artifact_share
from tests.testutils.integration import assert_contains
DATA_DIR = os.path.join(
os.path.dirname(os.path.realpath(__file__)),
"project"
)
# Remove artifact cache & set cli.config value of pullbuildtrees
# to false, which is the default user context
def default_state(cli, integration_cache, share):
# Removing individual artifact refs could leave the buildtree dir object
# behind in the cache, so remove the whole artifacts directory instead.
shutil.rmtree(os.path.join(integration_cache, 'artifacts'))
cli.configure({'pullbuildtrees': False, 'artifacts': {'url': share.repo, 'push': False}})
# A test to capture the integration of the pullbuildtrees
# behaviour, which by default is to not include the buildtree
# directory of an element
@pytest.mark.integration
@pytest.mark.datafiles(DATA_DIR)
def test_pullbuildtrees(cli, tmpdir, datafiles, integration_cache):
project = os.path.join(datafiles.dirname, datafiles.basename)
element_name = 'autotools/amhello.bst'
# Create artifact shares for pull & push testing
with create_artifact_share(os.path.join(str(tmpdir), 'remote1')) as share1,\
create_artifact_share(os.path.join(str(tmpdir), 'remote2')) as share2:
cli.configure({
'artifacts': {'url': share1.repo, 'push': True},
})
# Build the autotools element, check that it is pushed, then delete the local cache
result = cli.run(project=project, args=['build', element_name])
assert result.exit_code == 0
assert cli.get_element_state(project, element_name) == 'cached'
assert share1.has_artifact('test', element_name, cli.get_element_key(project, element_name))
default_state(cli, integration_cache, share1)
# Pull artifact with default config, assert that pulling again
# doesn't create a pull job, then assert that pulling with the
# buildtrees user config set does create a pull job.
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with default config, then assert that pulling
# with buildtrees cli flag set creates a pull job.
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with pullbuildtrees set in user config, then assert
# that pulling with the same user config doesn't create a pull job,
# nor does pulling with the buildtrees cli flag set.
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name not in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Pull artifact with default config and buildtrees cli flag set, then assert
# that pulling with pullbuildtrees set in user config doesn't create a pull
# job.
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'pullbuildtrees': True})
result = cli.run(project=project, args=['pull', element_name])
assert element_name not in result.get_pulled_elements()
default_state(cli, integration_cache, share1)
# Assert that a partial build element (not containing a populated buildtree dir)
# can't be pushed to an artifact share, then assert that a complete build element
# can be. This will attempt a partial pull from share1 and then a partial push
# to share2
result = cli.run(project=project, args=['pull', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'artifacts': {'url': share2.repo, 'push': True}})
result = cli.run(project=project, args=['push', element_name])
assert element_name not in result.get_pushed_elements()
assert not share2.has_artifact('test', element_name, cli.get_element_key(project, element_name))
# Assert that after pulling the missing buildtree the element artifact can be
# successfully pushed to the remote. This will attempt to pull the buildtree
# from share1 and then a 'complete' push to share2
cli.configure({'artifacts': {'url': share1.repo, 'push': False}})
result = cli.run(project=project, args=['pull', '--pull-buildtrees', element_name])
assert element_name in result.get_pulled_elements()
cli.configure({'artifacts': {'url': share2.repo, 'push': True}})
result = cli.run(project=project, args=['push', element_name])
assert element_name in result.get_pushed_elements()
assert share2.has_artifact('test', element_name, cli.get_element_key(project, element_name))
default_state(cli, integration_cache, share1)
......@@ -128,7 +128,7 @@ class ArtifactShare():
valid_chars = string.digits + string.ascii_letters + '-._'
element_name = ''.join([
x if x in valid_chars else '_'
x if x in valid_chars else '-'
for x in element_name
])
artifact_key = '{0}/{1}/{2}'.format(project_name, element_name, cache_key)
......