Choosing between dependency configurations
I thought of this in relation to #52, however it also applies generally to dependency resolution.
In the case where you have unresolved asset dependencies, the current plan was to complain and abort, with tools to allow users to see which packages provide which asset groups. While it would be relatively easy to integrate asset dependencies into the dependency solver, I had thought this wouldn't be wise since the dependency resolver would essentially be pulling random packages which happen to contain the required assets out of the tree, and we couldn't provide a way to let the user choose since the formula is complicated enough that we can't just pull a list of packages to choose from out of it (With all the relationships, it's much more complicated than choosing from a list, given that there might be a number of side effects and conflicts caused by choosing a package that satisfies the asset group, and we would need to ensure that each of the choices would actually be valid).
Similarly, any time you have
|| dependencies the dependency resolver will arbitrarily choose one for you (or in actuality, choose the simpler option, noting that choosing packages which are already installed is the simplest configuration). This actually led to a small feature providing non-standard behaviour for the
sys-bin category, which prioritizes
sys-bin if the program in question is installed. This feature could be dropped in favour of this new system.
We could solve the issue of being unable to let a user choose between packages that provide asset dependencies by enumerating the various solutions to the solver up to some small delta (which could even just be 0 to get all the optimal solutions, but that wouldn't account for essentially optimal solutions which are trivially more complex (e.g. having extra dependencies)), and letting the user choose between configurations (displaying the differences between each), which are likely to vary only slightly (e.g. swapping out one package for another), particularly if we keep the delta small.
Pysat provides a mechanism to do that, using, in the case of the RC2 solver we use, RC2.enumerate, a generator function yielding models (including their costs) from best to worst.
With the way the solver is currently implemented, there will almost certainly exist many close to optimal solutions that simply consist of various re-configurations of free flags (i.e. flags whose value have no effect without some other requirement being satisfied, such as flags for packages which aren't installed (which we could admittedly force into their default positions)), and the number of configurations grows exponentially with the number of flags, so enumerating those configurations could be slow. This is also a problem for non-free flags too, given that those we would have a harder time filtering out, and the point of this feature is not to have the system manually prompt the user to choose which flags they want enabled (potentially in every package on their system!) when they install a package.
The severity of this problem will depend on delta, given that there is a cost for changing the value of a flag, however it might be difficult to keep these costs lower than the cost of suboptimal, but potentially desirable, configurations that the user may want to choose (e.g. those which pull in a package with a large dependency tree).
Probably the way around this is just to keep delta as small as possible and just make the user reconfigure things manually if the automatically detected solution isn't satisfactory. Zero is of course the ideal delta value in terms of avoiding these problems, but a delta of zero may not be sufficient to make this feature useful.
While this would nicely handle a simple situation where the user has to choose between one of, for example, three packages that can satisfy a dependency (even if the packages were not exclusive, given that a configuration which pulls in multiple dependencies where only one is needed would be less optimal than one which pulls in just the one), it would be much less helpful if there are multiple such groups.
E.g. if you have two groups of three dependencies each that the user needs to select, you would end up choosing between nine different configurations (all possible combinations of choices), rather than, as you might expect, one out of three, followed by another out of three. This would become much worse as the number of groups increases. Fortunately this situation may be rare enough in practice to ignore without too much worry, and it might be possible to eventually throw in some heuristics that can simplify certain cases.
#52, but not the general solution)Caveat (for
This doesn't address the fact that this implementation is likely to send runtimes skyrocketing due the sheer number of tokens that would need to be fed to the solver, noting that since asset groups can be filled by multiple packages contributing one or more files each, every asset provided by a package (for tracked groups anyway) would need to be represented in the dependency formula)). Also note that the performance impact may be negligible, I don't really know. At the moment performance is really good, but due to the exponential worst case runtime of the solver, we might hit a formula size where runtimes suddenly jump from <1s to minutes or even hours. Given the heuristics employed by the solver, this is probably unlikely unless the formula is actually complex, which our formulas usually are not.
Side Note: This might be a useful feature in portage. I'm certainly tempted to try and improve portage's dependency resolution system (which, while it is slightly less powerful than portmod's*, is admittedly vastly more verbose), but there are also technical reasons why having pysat be a dependency for portage would be problematic, and I have plenty of things to do without also trying to wrestle with portage's massive codebase.
* I'm tempted to qualify that further. I don't think there's anything that portage couldn't solve but portmod could, however portage's solver has some practical limitations which portmod lacks, or at least I think it does (time, for one, since portage sometimes cuts of calculations and aborts, but given the order of magnitude difference in tree sizes, it's hard for me to compare the two in everyday usage. I also don't think that a feature like this could be implemented as easily for portmod as for portage, as I think it's dependency solver is much more tightly integrated with the rest of the system).