Things to think about when redesigning YAML
Background
From a performance and memory usage perspective, the YAML-loading parts of the code base could use some improvements, especially to improve the CLI's startup performance for commands like `bst show`.
I am sure this has been discussed and attempted previously; the most recent discussion is here: https://mail.gnome.org/archives/buildstream-list/2019-March/msg00017.html. Now seems like a good time to tackle it, given the ongoing metrics/benchmarking efforts. I created this ticket to capture decisions, things we find while working, and hopefully the finalized roadmap.
Things we have noticed so far that complicate a possible redesign are:
- Generally, the `provenance`-related logic
  - We mostly use provenance when we encounter an error and want to print something useful. Most of the time, we do not need it.
  - In addition, `provenance` objects are also used for different reasons, 90% in `source.py` and 10% in `element.py`, which complicates things.
  - See `git grep -n "provenance\."` and `git grep node_get_provenance`.
- The family of mutable `composite` functions in `_yaml.py` and the places where they are used in the rest of the codebase.
  - See `git grep -n -p "composite("`.
  - `_project.py`, `element.py` and some other small parts seem to be squashing configuration dictionaries using them. There are around 350 lines in `_yaml.py` for those operations.
- Includes functionality and variable substitution
- The parts of the codebase that do `isinstance` checks on the raw `values` and call `.get` on the node object instead of using `node_get`, sometimes in order to allow `Union[dict, str]` types in the YAML files. Not having a clear and restricted API allows other parts to grow more complex.
  - See `git grep -n isinstance | grep -v _frontend | grep -v "_yaml\.py" | grep -v "tests"`
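To make the provenance and restricted-API points a bit more concrete, here is a minimal sketch of what a narrower node interface could look like. Everything in it is hypothetical and is not the current `_yaml.py` API: the `Node` and `LoadError` names, the `get` signature, and the provenance string format are all assumptions made for illustration. The idea is that positions are stored as plain tuples, the provenance text is only built when an error is actually reported, and callers state the allowed types up front instead of doing their own `isinstance` checks.

```python
# Hypothetical sketch, not BuildStream's actual API.

_MISSING = object()  # sentinel so that None can be a legitimate default


class LoadError(Exception):
    """Raised when a key is missing or its value has an unexpected type."""


class Node:
    """Wraps a loaded mapping together with cheap position information."""

    __slots__ = ("_values", "_positions", "_filename")

    def __init__(self, values, positions, filename):
        self._values = values        # key -> raw loaded value
        self._positions = positions  # key -> (line, column) tuple
        self._filename = filename

    def provenance(self, key):
        # Built on demand; the common, error-free path never calls this.
        line, column = self._positions.get(key, (0, 0))
        return f"{self._filename} [line {line} column {column}]"

    def get(self, key, *expected_types, default=_MISSING):
        # Single entry point for typed access; passing several types
        # covers cases such as Union[dict, str].
        value = self._values.get(key, _MISSING)
        if value is _MISSING:
            if default is _MISSING:
                raise LoadError(f"{self._filename}: missing key '{key}'")
            return default
        if not isinstance(value, expected_types):
            names = " or ".join(t.__name__ for t in expected_types)
            raise LoadError(
                f"{self.provenance(key)}: '{key}' must be {names}, "
                f"got {type(value).__name__}"
            )
        return value


# Example data; the file name and keys are made up for the sketch.
node = Node(
    values={"kind": "autotools", "depends": {"filename": "base.bst"}},
    positions={"kind": (1, 0), "depends": (2, 0)},
    filename="hello.bst",
)

kind = node.get("kind", str)
depends = node.get("depends", dict, str)  # dict or str both accepted
```

With something along these lines, the call sites that currently inspect raw values would only need to name the types they accept, and the cost of building provenance would be paid only on the error path. Whether that maps cleanly onto the existing `source.py` and `element.py` uses of provenance is exactly what would need discussing.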
I think the points above need to be thoroughly discussed, since a possible redesign/refactor would probably touch all of them. It is possible that the points above account for only 20% of the cases while we are trying to speed up the other 80%; the redesign could turn out to be simpler than we imagined if we change (fix?) other things first.