Observables are evaluated in serial in a DoE with multi-processing enabled
Summary
A DoE run in parallel with multi-processing sees its observables evaluated in serial.
The observables are computed by a callback that is triggered when data is inserted into the database, and this insertion happens in the main process.
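The mechanism can be illustrated with a toy model. The sketch below is not GEMSEO code: `ToyDatabase`, `worker_evaluate`, `compute_observable` and the `calls` counter are hypothetical names, and no real process pool is used ("worker" and "main" are only labels). It assumes the worker sends back the objective alone, so the store callback has to call the discipline again on the main side.

```python
from collections import Counter

calls = Counter()  # counts discipline calls per (simulated) process


def discipline(x, where):
    """Toy discipline returning the objective and the observable."""
    calls[where] += 1
    return {"obj": x, "obs": x + 1.0}


class ToyDatabase:
    """Hypothetical database that notifies listeners on each store."""

    def __init__(self):
        self.data = {}
        self.listeners = []

    def add_store_listener(self, listener):
        self.listeners.append(listener)

    def store(self, x, values):
        self.data.setdefault(x, {}).update(values)
        for listener in self.listeners:
            listener(x)


def worker_evaluate(x):
    """Simulated worker: only the objective is sent back to the main process."""
    return {"obj": discipline(x, "worker")["obj"]}


database = ToyDatabase()


def compute_observable(x):
    """Listener run where the store happens, that is, on the main side."""
    if "obs" not in database.data[x]:
        database.data[x]["obs"] = discipline(x, "main")["obs"]


database.add_store_listener(compute_observable)

# Four samples: each one is evaluated by a "worker", then stored by "main".
for x in (0.0, 0.5, 1.0, 0.25):
    database.store(x, worker_evaluate(x))

print(dict(calls))
```

In this toy model every sample costs one call on the worker side and one more on the main side, which is the serial re-evaluation described above.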
Gemseo version
HEAD of develop (2564111c) and previous versions.
Platform info
Observed on Linux (CentOS 8) and Windows 10 Professional.
Environment info
Not relevant
Steps to reproduce
Here is a reproducer:
from typing import Tuple

from gemseo.algos.design_space import DesignSpace
from gemseo.api import create_discipline, create_scenario


def compute_obj_and_obs(x: float = 0.0) -> Tuple[float, float]:
    """Compute objective and observable variables.

    Args:
        x: The input x value.

    Returns:
        obj: The objective value.
        obs: The observable value.
    """
    obj = x
    obs = x + 1.0
    return obj, obs


def test_evaluate_samples_multiproc_with_observables(doe):
    """Evaluate a DoE in parallel with multiprocessing and with observables."""
    disc = create_discipline("AutoPyDiscipline", py_func=compute_obj_and_obs)
    # Disable the cache so that every evaluation is a real discipline call.
    disc.cache = None

    design_space = DesignSpace()
    design_space.add_variable("x", l_b=0.0, u_b=1.0, value=0.5)

    scenario = create_scenario(
        [disc],
        design_space=design_space,
        objective_name="obj",
        formulation="DisciplinaryOpt",
        scenario_type="DOE",
    )
    scenario.add_observable("obs")
    scenario.execute(
        {"algo": "fullfact", "n_samples": 4, "algo_options": {"n_processes": 2}}
    )

    # In multi-processing mode, the disciplinary calls are only made
    # on the worker processes, so the discipline should never be called
    # on the main process.
    assert disc.n_calls == 0
What is the current bug behavior?
Disciplines are re-evaluated in serial on the main process as soon as at least one observable is added to the DoE scenario.
What is the expected correct behavior?
The observables should be computed in the worker processes.
Relevant logs and/or screenshots
Not relevant
Possible fixes
One possible fix, which should also avoid side effects, is to add an eval_observables boolean argument to OptimizationProblem.evaluate_functions, set to False by default. Only the parallel DoE worker would set it to True.
The evaluate_functions method would then be modified as follows:
def evaluate_functions(
    self,
    x_vect: ndarray = None,
    eval_jac: bool = False,
    eval_obj: bool = True,
    eval_observables: bool = False,
    normalize: bool = True,
    no_db_no_norm: bool = False,
) -> tuple[dict[str, float | ndarray], dict[str, ndarray]]:
    #################################################################
    # Get the functions to be evaluated
    if self.__functions_are_preprocessed and no_db_no_norm:
        functions = list(self.nonproc_constraints)
        if eval_obj:
            functions += [self.nonproc_objective]
        if eval_observables:
            functions += self.nonproc_observables
    else:
        functions = list(self.constraints)
        if eval_obj:
            functions += [self.objective]
        if eval_observables:
            functions += self.observables
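To make the effect of the proposed flag concrete, here is a minimal, self-contained sketch of the selection logic. `ToyProblem` and its three toy functions are hypothetical stand-ins, not the GEMSEO `OptimizationProblem` API. Only the `eval_obj` and `eval_observables` arguments mirror the proposal above.

```python
class ToyProblem:
    """Hypothetical stand-in for an optimization problem."""

    def __init__(self):
        # Each function is stored as a (name, callable) pair.
        self.objective = ("obj", lambda x: x)
        self.constraints = [("cstr", lambda x: x - 0.5)]
        self.observables = [("obs", lambda x: x + 1.0)]

    def evaluate_functions(self, x, eval_obj=True, eval_observables=False):
        """Evaluate the selected functions at x and return their values by name."""
        # Copy the list so that appending does not alter self.constraints.
        functions = list(self.constraints)
        if eval_obj:
            functions.append(self.objective)
        if eval_observables:
            functions.extend(self.observables)
        return {name: function(x) for name, function in functions}


problem = ToyProblem()

# Default behaviour: observables are skipped, as before the change.
default_values = problem.evaluate_functions(1.0)

# What a parallel DoE worker would request under the proposal.
worker_values = problem.evaluate_functions(1.0, eval_observables=True)

print(default_values)
print(worker_values)
```

Keeping the default at False means existing callers see no change in behaviour, while the worker opts in explicitly. The `list(...)` copy matters because an in-place `+=` on the problem's own constraint list would otherwise append the objective and observables to it on every call.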