Commit 55726338 authored by toscalix

Gathering 2018 minutes in different formats

parent 95242e9f
## BuildStream Gathering 2018 minutes
### Day 1
#### ELCE 2018 Session (toscalix)
- Calendar ELCE. Please subscribe to the ELCE-Codethink calendar.
- Create some slides
- Schedule presentations, and take interested people aside to present
the slides to them.
- Get the person's email address and send them the slides.
- Add a slide with links to interesting information.
- Probably Agustin will have 10 min. to present BuildStream at the CIP
TSC meeting.
- CIP booth as meeting point.
#### Minimal runtime by default (Javier J.)
- Suggested by Javier. Javier introduces
[*Build*](https://gitlab.com/freedesktop-sdk/freedesktop-sdk/blob/18.08/elements/public-stacks/buildsystems.bst)
- This is about users who are new to BuildStream and building systems
(specifically integrating components). We currently expect people to
build these from scratch? And thus have the domain expertise to know
which components to use for a full toolchain and/or full build
system? Do we not have a default list of components for people to
use that will work with BuildStream?
- Well, we don’t expect everyone to build from scratch. Also, more
pertinently, we want to be independent here. Don’t want to document
in BuildStream how to build an operating system. We are not biased,
whatever you want to use.
- Maybe we could recommend a list? Yes, maybe we could. But not a
‘default’.
- We want to provide examples (which we do some of already, eg with
Alpine) but stay agnostic. Could we point to downstream projects?
Just a link from BuildStream. Downstream has to deal with the
details. We can take care of this within the BuildStream examples.
Also need to talk to the downstream projects about this.
**Summary and TODO**
- Freedesktop SDK: Javier to add MR to BuildStream docs to point to
the freedesktop-sdk build system examples as one of the ways to
create a base (in addition to Alpine).
- Freedesktop SDK: Javier to document in freedesktop-sdk how to use
the buildsystem compose element to get started with BuildStream.
- BuildStream: change the Alpine tarball that we currently generate
with a script and host to be accessed via a junction (will have own
element) in a quarantined location (in case they ever go down). This
is for test cases only, or examples also? Unclear.
- Javier: Propose a freedesktop-sdk MR to be used instead of Alpine to
build examples. Decide later whether to keep using Alpine or not (or
have a mix).
- BuildStream: At some point we can have a page on the website
outlining who is using BuildStream, ie, which other projects are
using BuildStream.
#### Workspace Handling (Jürg B.)
- Jürg presents an idea for how to handle workspaces, followed by a
brainstorming session. The proposal will be sent to the mailing
list after the Gathering, since there are different use cases that
need to be considered.
- The proposal is to replace bind-mounting workspaces with
synchronizing sources from the workspace directory to CAS and
keeping object files only in CAS as part of the cached build tree.
The plan is to implement this after BuildBox is available for local
execution, either as part of or following the SourceCache effort.
For compatibility, only do this for plugins that advertise
BST\_VIRTUAL\_DIRECTORY. Detailed proposal will follow on the list.
#### Extending Plug-ins (Will B.)
- Working on a tool in the trustable space named
[*gitect*](https://gitlab.com/trustable/gitect) that detects certain
metadata from the CI pipeline. Built artifacts, where they are, what
they are. Details of different binaries generated during build
process. This is not to check source code, it’s just to capture
information about what has been built: focus on binaries.
- Agustin points out a tool named
[*Quartermaster*](https://www.bitkom.org/Termine/2018/Forum-Open-Source-2018/Praesentationen/Fricke-Continuous-Compliance-Eine-Einfuehrung-in-Quartermaster.pdf)
that does this at build time, not just after.
- Is it a similar concept to what this app does?
- Ideas:
- The simplest way to do this is to simply have the build element
do the analysis and include both the analysis and finished build
result in the produced artifact. We can add a filter element
after the instrumented build to remove the metadata, if
required.
- Tristan also suggests recording the analysis with 'public data'
which can then be extracted with 'bst show' after the build.
- Custom plug-in to stage artifacts one by one? Could add build
instructions to the post-install of everything built in the system?
Would also need extra build dep on the new tooling - is this
currently possible? Not really atm in BuildStream. This is YAML
changes, not code. Public data can be adjusted during builds and
read back at the end of the process. Output would be a text file.
- ToDo: create a proposal to solve Will B’s issue mentioning the two
ideas listed above.
- Will will send a mail to the bst mailing list with the outcome of
the initial example.
- Document the use case when we are using junctions
#### Post processing with external tools (cpp check and license check for the whole system) (Javier J.)
Javier brings up the [*problem*](https://gitlab.com/BuildStream/buildstream/issues/441#note_108984699).
This can currently be done as part of the element (in a post-install
step, for example). However, if you want to add this to every build
element, it becomes onerous.
Tristan Maat suggested using some extensions which had been proposed as
part of the IDE integration work: see gitlab issue and link to the
original ML proposal
[*here*](https://gitlab.com/BuildStream/buildstream/issues/188). Tristan
did some work here but the MR never landed in the end.
Valentin raised this same issue for things you need to process after the
build is done: strip commands, removing libtool files, split rules … See
[*gitlab*](http://gitlab.com/BuildStream/issues/304)
Tristan VB suggests that we should have better ways to represent
pipelines so we can add these extra elements to do post analysis, rather
than making elements more complicated.
One way to do this is to allow specification of more than one element in
one .bst file.
Valentin: suggested introducing post-processing plugins?
Tristan: it may be better to create a new class of elements that take
groups of elements and then process their output.
TODO: Create a simple example showing how we can do this with
BuildStream at the moment, by adding (for example) a licence check
inside each element.
We may discuss later more efficient ways to do this.
#### Hacking session
- Jim, Juerg and Tristan went to a breakout room to discuss
architecture and specifically the architecture of remote execution.
- Javier, Emmet and Will had a breakout and discussed licence checking
tools.
- Agustin, Javier and Laurence worked on freedesktop-sdk planning.
#### 2019 Release Schedule (toscalix)
- Agustin proposed 3 gatherings next year, Emmet pushed for a fourth,
ie every quarter.
- When should we release? 1.4 - before FOSDEM?
- Do we still need to tether to GNOME? No, this was arbitrary. We
can change dates but we can keep the cycle.
- Plan:
- Shift Agustin's proposal for the 1.4 release by one week - Feature
Freeze (ie, Alpha) in Week 49, 2018
- For the second release in 2019: choose Week 30
**Outcome**
Agreed to aim for 2 releases and 4 gatherings per year (FOSDEM, GUADEC,
probably ELCE, and one more). Discuss the details further at a hackathon.
### Day 2
#### BuildGrid - State of the art
- Laurence describes what [*BuildGrid*](https://gitlab.com/BuildGrid/buildgrid) is.
- Presentation done by Sander about BuildGrid using the slides created
for the recent talk at BazelCon.
- Jürg explains BuildBox.
- Discussion on speed of running locally vs sending to BuildGrid. For example, a configure script is I/O bound, not CPU bound.
- Discussion on resource limits and heterogeneous worker pool.
- The case of testing was brought up by Sander. In this case the workers should all be limited to the smallest available worker
resource.
- A few build jobs require more memory than others, and not all builders need to have a lot of memory. There is a way to pass requirements for workers, but there is no short-term plan yet to provide hints that way.
- Javier expressed interest in spending some time with the BuildGrid team to build a POC with some of the FOSS projects he manages.
- Another conversation was triggered about caches, basically the problem described at [*https*](https://gitlab.com/BuildStream/buildstream/issues/401): there should be a way to prioritize caches, so that by default you use your personal cache server and only go to the remote one if the artifact is already there.
#### Plugins strategy (Javier and Jonathan M.)
- What is the purpose of the bst-external repo?
- Should be for two things:
- Plug-ins that are domain specific, given that BuildStream is agnostic in this regard and so only multi-purpose plug-ins should go into the core. Communities should maintain their own domain-specific plug-ins. BuildStream doesn’t want to keep all of that baggage and wants to encourage downstreams to keep their own repos.
- The second reason it exists is to keep plug-ins that are not yet mature enough and not yet API stable, but that eventually want to get into BuildStream core, eg, X86 image.
- Well then we need to clearly define ‘mature’ and also ‘general
purpose’ - this is not just about code or management, it’s about
stakeholders, how we invite people to join the project, we do not
want to send the message ‘unofficial’ and ‘official’.
- There are risks in any strategy we choose, but the policy we have
right now is just not clear.
- If we were to split out the current repo into two, would we break
any projects currently?
- Don’t want BuildStream to get into ‘blessing’ of certain plug-ins,
so the maintainers should decide and declare when it’s stable and
the user should trust the maintainer of the plug-in.
- What about when you have a huge set of plug-ins, how does a new user
know what to trust, what is the ‘golden set’ that are safe to use?
It’s true that we probably need a base set of plug-ins in this
regard. One possible approach is having an organisation (eg,
Codethink) saying ‘here’s what we think is the golden set of
plug-ins to use’, and they can be recommended in that sense, which is
a transient recommendation valid at that time only.
- BuildStream cannot maintain every plug-in that people write, we
cannot bring them all into core. But we should have a convenient set
of plug-ins distributed with the core, but in parallel rather than at
the same time; this way they are maintained by the core but don’t
have to be released at the same time, which of course means you just
have to guarantee API stability.
- **ToDo:** Agustin will talk to different people to try to come up
with a proposal on this front.
- More notes
- Discussions leaning towards an eventual goal of making all but
the bare minimum of plugins stored externally to the core
buildstream repo\
(suitable list of elements like junction, compose, stack,
import, filter)\
the specific build elements, for example, may be external.
- Is it worth disrupting downstream users? Maybe, once we have a
full distro packaging solution. What about pip dependencies? It's
not very nice, but it's the preferred solution for some users.
- What about windows? We'd provide an installer for that.
- When is a plugin maintainer ready to claim that the API will
only grow, never change (i.e. maintain full backward
compatibility)?\
-> currently buildstream core makes this guarantee,
bst-external doesn't.\
-> This sounds like a discussion that hinges on individual
maintainers.\
-> Should we have a central plugin listing?\
\
What about curated lists where an individual / organisation vets
plugins for quality and stability?\
\
What can buildstream do without bst-external? Just directories
or tarballs?\
\* all the image exports are from bst-external. For system
integrators\
bst-external is effectively mandatory.\
\
Experiences from users: bst-external is not visible enough, it
exists as a footnote in the main documentation. A common error
is trying to build something and seeing an error that a plugin
is missing.\
\
Having plugins in separate repositories has a technical
complication that CI is not easy.
#### Hacking session
- Release schedule
- The target release cycle is December and June.
- 2019 will be January, June and December.
- ToDo: Agustin will send the proposal to the mailing list.
- Performance discussion\
Present:\
\* Jonathan Maw\
\* Jürg Billeter\
\* Will Salmon\
\* Tristan Van Berkom (popped out near the end)\
\* Sander Striker\
\
Topics discussed:\
\* Is json loading faster than unpickling?\
- No answer, probably strongly depends on how much of pickle is
written in C, and how much extra effort would be required to
reproduce provenance from json.\
\* Element ready state check includes recursive syscalls to
interrogate the cache, apparently.\
- skipping interrogating the cache has ramifications with multiple
instances.\
- non-weird uses of multiple instances are:\
- running a build and querying the state of the pipeline\
- building two different projects at once\
- building two different versions of the same project (probably
via junctions)\
\* Element ready state check is done for all elements on the
completion of a single element, rather than just the elements that
are reverse dependencies.\
\* Interrogation of source consistency could be parallelised.\
- Important because that involves shelling out.\
\* Realistic tree shape (for benchmarking) is more like:\
[ASCII sketch: a short vertical stem, then a wide trapezoid, then a smaller triangle beneath it]\
James' anonymised tree is also useful, but the provider of the
original data is not comfortable with the anonymised tree being made
public.\
\* Cache key calculation should not require syscalls every time.\
\* Caching the entire Element was discussed. The huge number of
factors that affect it (going as far as command-line options) was
brought up.\
- Piecemeal caching of parts of element construction is probably
more useful.\
\* The high memory requirements (making 100k pipelines impossible)
was discussed, especially because remote execution doesn't require
all that information at once.\
\* Remote Execution optimisation:\
- Easier to see where to optimise once outstanding issues have been
resolved.\
- CAS-to-CAS transfers?\
- Maximising parallelism\
- Partial CAS
- Alpine tarball is on the cloud now:
[*https*](https://gitlab.com/BuildStream/buildstream/merge_requests/880), so [*https*](https://gitlab.com/BuildStream/nosoftware/alignment/issues/36) is almost fixed now
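
One point from the performance discussion above, re-checking readiness only for the reverse dependencies of a completed element rather than for every element, can be sketched as follows. This is a minimal illustration in Python and not BuildStream's scheduler; `build_reverse_deps`, `newly_ready` and the element names are hypothetical.

```python
from collections import defaultdict


def build_reverse_deps(deps):
    """Invert a mapping of element -> set of dependencies."""
    reverse = defaultdict(set)
    for element, dependencies in deps.items():
        for dep in dependencies:
            reverse[dep].add(element)
    return reverse


def newly_ready(finished, deps, reverse, done):
    """Return the elements that become ready once `finished` completes.

    Only the reverse dependencies of `finished` are examined, instead of
    every element in the pipeline.
    """
    done.add(finished)
    return sorted(
        element
        for element in reverse.get(finished, ())
        if element not in done and deps[element] <= done
    )


# Hypothetical pipeline: app depends on lib and base; lib depends on base.
deps = {
    "base.bst": set(),
    "lib.bst": {"base.bst"},
    "app.bst": {"lib.bst", "base.bst"},
}
reverse = build_reverse_deps(deps)
done = set()
first = newly_ready("base.bst", deps, reverse, done)
second = newly_ready("lib.bst", deps, reverse, done)
```

With this shape, the cost of each completion scales with the number of direct reverse dependencies rather than with the size of the whole pipeline.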
#### Git LFS (Richard Dale)
- Richard Dale explains the issue which is described in a mail to the
mailing list from 12th July.
- At staging time, we assume there is no network.
- This is a critical problem for system integrators trying to create a
platform with blobs.
- Issue \#567
- Action: raise criticality and evaluate how to include the patch in
1.4 release.
#### BuildStream events
- DevOps seems like an acceptable approach
- The other approach is to attend community events once we have
done the groundwork to get into those conferences.
- ToDo: Agustin will send a proposal to the mailing list.
### Day 3
#### Opening Session
- Added a topic for features for the 1.4 release, from TVB. Discussion
slot added.
- Summary of the training session: talked about the plug-in system in
BuildStream. Source plug-ins, build plug-ins, examples of junctions,
GNOME and Freedesktop, BuildStream defaults, caching: locally and
remote. People pulled DOOM from the cache. Martin talked about
Remote Execution.
- Docker/OCI discussion between Chandan, Javier and Daniel at the
11:30 hackathon session.
#### Source Cache
- Workspaces into SourceCache is a new idea.
- SourceCache should go to local CAS first.
- This should go into 1.4.
- Juerg can add some notes to
[*the ticket*](https://gitlab.com/BuildStream/buildstream/issues/440).
See
[*here*](https://gitlab.com/BuildStream/buildstream/issues/440#note_109851108)
for the specific comment.
- SourceCache alone doesn't solve the mirroring use case, however,
it's still useful to avoid fetches using CAS. And sources are
anyway needed in CAS for remote execution.
- The fetch job will attempt to fetch sources from the remote
SourceCache and store it in the local CAS. If not available, it
will fetch from the original source repository and also store it
in the local CAS.
- In a first step sources will always be fetched for elements that
are scheduled for build, even if they are built remotely. This
will be optimized in a second step.
- \~/.cache/buildstream/sources will still be used.
- Use ReferenceStorage to map source keys to CAS directory trees.
- Sources fetched from the original source repository are pushed
to the remote SourceCache, if the user has push access.
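
The fetch order described above can be sketched as follows. This is a minimal sketch under stated assumptions, not BuildStream code: `InMemoryStore` stands in for a CAS-backed store, `fetch_source` and `fetch_upstream` are hypothetical names, and checking the local CAS before the remote SourceCache is an assumption based on the "local CAS first" note.

```python
class InMemoryStore:
    """Stand-in for a CAS-backed source store (local or remote)."""

    def __init__(self, writable=True):
        self.blobs = {}
        self.writable = writable  # models whether the user has push access

    def get(self, key):
        return self.blobs.get(key)

    def put(self, key, data):
        if not self.writable:
            raise PermissionError("no push access")
        self.blobs[key] = data


def fetch_source(key, local_cas, remote_cache, fetch_upstream):
    """Fetch one source, preferring caches over the original repository.

    Returns a (data, origin) tuple, where origin is "local", "remote"
    or "upstream".
    """
    data = local_cas.get(key)
    if data is not None:
        return data, "local"

    data = remote_cache.get(key)
    if data is not None:
        # Found in the remote SourceCache: keep a copy in the local CAS.
        local_cas.put(key, data)
        return data, "remote"

    # Not cached anywhere: fall back to the original source repository.
    data = fetch_upstream(key)
    local_cas.put(key, data)
    if remote_cache.writable:
        # Push only when the user has push access.
        remote_cache.put(key, data)
    return data, "upstream"
```

A caller would pass the source's cache key and a callable that knows how to reach the original repository; a second call with the same key is then served from the local store.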
#### IDE Integration
- CB looked at VS Code integration in Jan, found the basics very
trivial but the more advanced features much more difficult to
integrate. Debugging with gdb ‘remotely’ via bst-shell was shown to
work back then.
- [*This*](https://mail.gnome.org/archives/buildstream-list/2017-November/msg00040.html)
triggered the topic.
- There will be two classes of devs: those that care about the
underlying build system and those that just want their IDE to do
things for them with a build command. We should optimize for the
latter.
- Something relevant here and not yet on the Roadmap: sysroot for an
entire project, so you can point your IDE directly at it. All source
and elements? No, just what is work-spaced.
- Language Server is also supported by most IDEs so integrating that
may be helpful. For that we need two things:
- Ability to launch a \`bst shell\` on multiple elements from BST,
also needed for things like debuggers
- A reference implementation to run Language Server inside a
BuildStream project
- Add a new \`bst session\` command to keep a persistent session open
for working on a set of elements. This can be used by IDEs as well
as other integrations.
- Richard Maw said he was already working on a bst-shell feature for
combining multiple elements.
#### Extending plugins
- Issue:
[*https://gitlab.com/BuildStream/buildstream/issues/697*](https://gitlab.com/BuildStream/buildstream/issues/697)
- For example, to do something different with a plugin, you need to
extend it. Some people have subclassed, but subclassing is dangerous.
- Other users have forked plugins because plugins cannot be
subclassed.
- Proposed solution for code sharing (Tristan):
- Use composition instead of derivation (explanation already in
the issue comments)
- Problematic for the plugins that provide YAML files
- Use extensions as a dictionary?
    - Kind: git
    - Config:
        - plugin: git-lfs
    - Extensions:
        - git-lfs:
            - Username: tomjom
- Using extension: can they overwrite the parent plugin?
    - Kind: manual
    - Variables: overrides
    - Config:
        - Bla: override
    - Extensions:
        - custom: foo
- Discussion on the issue of deleting keys in YAML (for example to
delete a variable). This is not possible yet.
- Worries about complexity: maybe handle the simpler cases for now?
- Seems the **only plugin that has been extended is the git one**
- Autotools, cmake elements you can change them in project.conf,
but some people prefer to extend them instead
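
The "composition instead of derivation" idea can be illustrated with a small sketch. It assumes nothing about the real BuildStream plugin API: `GitFetcher` and `GitLfsSource` are hypothetical classes, and the commands are only built as lists, never executed.

```python
class GitFetcher:
    """Reusable helper holding the shared git logic (hypothetical)."""

    def __init__(self, url, ref):
        self.url = url
        self.ref = ref

    def commands(self):
        return [
            ["git", "clone", self.url, "repo"],
            ["git", "-C", "repo", "checkout", self.ref],
        ]


class GitLfsSource:
    """A separate plugin that composes GitFetcher instead of subclassing a git plugin."""

    def __init__(self, url, ref):
        self._git = GitFetcher(url, ref)

    def commands(self):
        # Reuse the shared steps, then add the LFS-specific one.
        return self._git.commands() + [["git", "-C", "repo", "lfs", "pull"]]
```

Because `GitLfsSource` only depends on the helper's public methods, changes to the internals of another plugin class cannot silently break it, which is the risk raised about subclassing.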
ACTIONS:
- Create an abstract class for the git plugin as a short-term solution
#### Docker / OCI plugins
- Generating docker images issue:
[https://gitlab.com/BuildStream/buildstream/issues/349](https://gitlab.com/BuildStream/buildstream/issues/349)
- Proposal for source OCI images in ML:
[*https*](https://mail.gnome.org/archives/buildstream-list/2018-October/msg00022.html)
- Docker source plugin:
[*https*](https://gitlab.com/BuildStream/bst-external/blob/master/bst_external/sources/docker.py)
- Discussion again about what bst-external is / should be
- Chandan explains the difference between OCI and Docker. Discussion
about how to design the plugin to get both images.
- That would be 5 plugins at the moment
- Problem: integration commands
- Seems something similar to the compose elements will work for this
ACTIONS:
- Create container-plugins repo under buildstream org
- Move docker source plugin from bst-external to this repo
- The new OCI source plugin should move here as well
- New OCI/Docker element plugins should also move here (Integration
commands and preserving layers in source plugins can be addressed
later)
#### Lunch
#### BuildStream 1.4 features
January 2nd is feature freeze
Blockers:
- Debug remotely failed builds:
[*https*](https://gitlab.com/BuildStream/buildstream/issues/539)
- Download build trees on demand:
[*https*](https://gitlab.com/BuildStream/buildstream/issues/494)
- Allow (insecure) use of git describe:
[*https*](https://gitlab.com/BuildStream/buildstream/issues/487)
- Remote execution
- Execution env requirements
- Decouple configuration of remote execution, CAS and artifact
caches
- Sandbox API for command batching:
[*https://gitlab.com/BuildStream/buildstream/issues/675*](https://gitlab.com/BuildStream/buildstream/issues/675)
- CAS to CAS directory import:
[*https*](https://gitlab.com/BuildStream/buildstream/issues/574)
- Workspace UX reworking (include API breaks)
- bst artifact subgroup
- log
- checkout
- Not including reconstructing of build graph from artifacts
- bst source-checkout (with removal of source-bundle)
- Remove default implementation of “strip-commands” (BREAKING CHANGE)
- BuildStream should not fetch by default (principle of least surprise)
(user config and user prompts are nice to have)
- Separation of plugins out of buildstream main repo, maybe in a
“buildstream-plugins-base” repo
- Have a clear story about plugins that are not in the future
“buildstream-plugins-base” repo
Nice to have:
- BuildBox for local builds
- SourceCache
- Workspace handling changes (use CAS)
- Tracking actual file dependencies -
[*https*](https://gitlab.com/BuildStream/buildstream/issues/56)
- Remove MAKEFLAGS & V from “manual” build element
#### bst artifacts subcommand
This point depends on a summary that Tristan should send before starting
the discussion about how to proceed. Both ‘log’ and ‘checkout’ are
covered in the 1.4 release.
#### Synchronizing bst command flag behavior
In light of some changes made to the ‘workspace’ command, Chandan
uncovered some inconsistency in how the ‘--fetch’ flag behaves compared
to the ‘--track’ flag - BuildStream will implicitly fetch files before
any command that makes use of sources (that is, ‘workspace’, ‘shell’,
…), but has a flag to ‘track’ instead, and doesn’t build before artifact
use at all. This is rather inconsistent.
We decided that the ‘build’ command is special and should do these
things implicitly, but that all other commands should stop and prompt
the user to ask them what they want to do. It should also be possible to
disable these prompts in user configuration.
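
The decision above can be sketched as a small policy function. This is an illustration only, not the BuildStream implementation: `resolve_fetch_action` and its parameters are hypothetical, and the behaviour when prompts are disabled (fetch without asking) is an assumption, since the minutes do not say what the non-interactive default should be.

```python
def resolve_fetch_action(command, sources_cached, prompts_enabled=True, ask=input):
    """Decide what to do about missing sources for a given command.

    Returns "proceed", "fetch" or "abort".
    """
    if sources_cached:
        return "proceed"
    if command == "build":
        # 'build' is special: it fetches implicitly.
        return "fetch"
    if not prompts_enabled:
        # Assumption: with prompts disabled in user configuration,
        # the command fetches without asking.
        return "fetch"
    answer = ask(f"Sources for '{command}' are not cached. Fetch them? [y/N] ")
    return "fetch" if answer.strip().lower() in ("y", "yes") else "abort"
```

Passing `ask` as a parameter keeps the policy testable without a terminal; a front end would leave the default `input` in place.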
### Day 4
#### Stripping of whitespaces from loaded yaml
- Ticket:
[*\#403*](https://gitlab.com/BuildStream/buildstream/issues/403)
check it for comments.
#### Minimal rebuilds when tweaking build instructions, e.g. caching like Docker instructions
- A motivating example for this: when Angelos was tweaking
install-commands for his 'gcc.bst', he incurred a penalty of
\~30 mins for every silly mistake. Mistakes made included forgetting
to make a symlink in the install instructions, and then subsequently
invoking ‘ln’ incorrectly.
- This MR may help when landed, by allowing folks to interactively
iterate and take advantage of incremental builds, if they want to
open a workspace: “WIP: Opening a workspace with a cached build”
[*https*](https://gitlab.com/BuildStream/buildstream/merge_requests/873)
- Three possible solutions were discussed:
a. Taking inspiration from ‘recc’, we could create caching versions
of common tools, e.g. linkers, gzip, xz, sort, etc. They would
use knowledge of how the underlying tools work to identify
inputs and outputs, for caching purposes.
b. We could create a high-level caching tool, to be used as a
replacement for the shell inside the sandbox. It takes the hash
of the entire sandbox filesystem, and the command, as a cache
key.
c. We could create a smarter high-level caching tool, again to be
used as a replacement for the shell inside the sandbox. This one
can monitor file-system reads. It takes the hashes of the parts
of the filesystem that were read on previous runs, and the
command, as a cache key.
- For (a, b), we don’t need any capabilities from BuildStream/BuildBox
that recc doesn’t already need. Namely efficient lookups for file
content hashes, and access to a persistent cache.
- For (a, b), it would be nice to have a merkle-dag of the filesystem
in the sandbox, for fast cache-key calculations.
- For (c), we would need extra functionality from the sandbox. Juerg
suggested that it would be technically possible to add this
functionality to BuildBox, e.g. perhaps we could expose a virtual
file into the sandbox that provides the information we need. We
would also need functionality for clearing the file-read-flags.
- For (a, b, c), this seems like it will be possible for local and
remote builds with BuildBox.
- This would move us towards being fast in addition to being correct.
- The ‘quickness’ would be provided by tools orthogonal to
BuildStream, it need only keep out of the way.
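
Option (b) above, a cache key derived from the whole sandbox filesystem plus the command, can be sketched like this. It is a simplified illustration and not how BuildStream, BuildBox or recc compute keys: `tree_digest` and `command_cache_key` are hypothetical names, and the sketch ignores permissions, symlinks and empty directories, which a real merkle-dag of the sandbox would have to cover.

```python
import hashlib
import os


def tree_digest(root):
    """Hash a directory tree: relative paths plus file contents, in sorted order."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode())
            digest.update(b"\0")
            with open(path, "rb") as f:
                digest.update(hashlib.sha256(f.read()).digest())
    return digest.hexdigest()


def command_cache_key(root, command):
    """Cache key for running `command` (a list of strings) in the tree at `root`."""
    digest = hashlib.sha256()
    digest.update(tree_digest(root).encode())
    for arg in command:
        digest.update(b"\0")
        digest.update(arg.encode())
    return digest.hexdigest()
```

Any change to a file in the tree or to the command changes the key, so a cached result is reused only when both are identical. Option (c) would narrow the hashed set to the files actually read on a previous run.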
Sander: can we reevaluate after local buildbox builds have been added,