per-unit timeout vs global timeout
When per-unit timeouts are enabled without any other change, we do not get the desired behavior: if a unit has a long unitTimeout (e.g. the cluster unit, with 150m), sylvactl watch will still time out after 20m (the default value of APPLY_WATCH_TIMEOUT_MIN).
Possible solutions/improvements:
- (1) when per-unit timeouts are enabled we could automatically pick higher default values for the global timeouts
- this isn't ideal: unless we pick a very high value, we can still run into the problematic case, and picking a very high value partially defeats the purpose of having a global timeout
- (2) we could stop passing --timeout when per-unit timeouts are used
- doing only this will break what we have today to ensure that debug-on-exit has time to run in CI (we need sylvactl to time out a few minutes before the GitLab job timeout)
- what we can do:
- in CI, keep using --timeout to play this role
- outside of CI, stop using --timeout when per-unit timeouts are used
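
A minimal sketch of the rule proposed in (2), to make the intended behavior concrete. It is an illustration, not the actual implementation: the function name `watch_timeout_args` and the `per_unit_timeouts` parameter are hypothetical, detecting CI through the `CI=true` variable that GitLab sets in jobs is an assumption about how we would do it, and the `20m` value format passed to `--timeout` is assumed as well.

```python
import os
from typing import List, Mapping, Optional


def watch_timeout_args(per_unit_timeouts: bool,
                       env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the extra arguments to append to `sylvactl watch` (hypothetical helper)."""
    env = os.environ if env is None else env

    # Assumption: GitLab CI jobs export CI=true.
    in_ci = env.get("CI", "").lower() == "true"

    # Global timeout in minutes; 20 is the default mentioned above.
    minutes = int(env.get("APPLY_WATCH_TIMEOUT_MIN", "20"))

    # Outside of CI, per-unit timeouts are enough: drop the global timeout.
    if per_unit_timeouts and not in_ci:
        return []

    # In CI (or without per-unit timeouts), keep the global timeout so that
    # debug-on-exit still has time to run before the job timeout.
    # Assumption: --timeout accepts a value such as "20m".
    return ["--timeout", f"{minutes}m"]


if __name__ == "__main__":
    print(watch_timeout_args(per_unit_timeouts=True))
```

With this rule, a long unitTimeout is only capped by the global timeout in CI, where APPLY_WATCH_TIMEOUT_MIN would still need to be set a few minutes below the job timeout.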
Edited by Thomas Morin