
Observables are evaluated in serial in a DoE with multi-processing enabled

Summary

A DoE run in parallel with multi-processing sees its observables evaluated serially.

This is because the observables are computed by a callback triggered when data is inserted into the database, and this insertion is done in the main process.
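As a minimal sketch of that pattern (ToyDatabase, add_store_listener and evaluate_observables are hypothetical names chosen for illustration, not the actual GEMSEO classes), the observable evaluation is registered as a store callback, so it always runs in the process that performs the store, i.e. the main process, even when the samples themselves are computed by the workers:

from typing import Callable, Dict, List


class ToyDatabase:
    """Toy stand-in for the optimization database (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[Dict[str, float]], None]] = []

    def add_store_listener(self, listener: Callable[[Dict[str, float]], None]) -> None:
        """Register a callback fired whenever a sample is stored."""
        self._listeners.append(listener)

    def store(self, sample: Dict[str, float]) -> None:
        """Store a sample and notify the listeners.

        The listeners run in the process calling store(), which is the main
        process in a multi-processing DoE.
        """
        for listener in self._listeners:
            listener(sample)


def evaluate_observables(sample: Dict[str, float]) -> None:
    """Stand-in for the observable evaluation triggered on store."""
    print("observables evaluated in the main process for", sample)


database = ToyDatabase()
database.add_store_listener(evaluate_observables)
database.store({"x": 0.5, "obj": 0.5})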

Gemseo version

HEAD of develop (2564111c) and previous versions.

Platform info

Observed under Linux (CentOS 8) and Windows 10 Professional.

Environment info

Not relevant

Steps to reproduce

Here is a reproducer:

# Imports added to make the reproducer self-contained (GEMSEO high-level API).
from typing import Tuple

from gemseo.algos.design_space import DesignSpace
from gemseo.api import create_discipline, create_scenario


def compute_obj_and_obs(x: float = 0.0) -> Tuple[float, float]:
    """Compute objective and observable variables.

    Args:
        x: The input x value.

    Returns:
        obj: The objective value.
        obs: The observable value.
    """
    obj = x
    obs = x + 1.0
    return obj, obs


def test_evaluate_samples_multiproc_with_observables():
    """Evaluate a DoE in parallel with multi-processing and with observables."""

    disc = create_discipline("AutoPyDiscipline", py_func=compute_obj_and_obs)
    disc.cache = None
    design_space = DesignSpace()
    design_space.add_variable("x", l_b=0.0, u_b=1.0, value=0.5)

    scenario = create_scenario(
        [disc],
        design_space=design_space,
        objective_name="obj",
        formulation="DisciplinaryOpt",
        scenario_type="DOE",
    )
    scenario.add_observable("obs")
    scenario.execute(
        {"algo": "fullfact", "n_samples": 4, "algo_options": {"n_processes": 2}}
    )

    # The discipline should not be called on the main process
    # In multi-processing mode,
    # the disciplinary calls are only made on the worker processes
    assert disc.n_calls == 0

What is the current bug behavior?

The disciplines are re-evaluated serially on the main process as soon as at least one observable is added to the DoE scenario.

What is the expected correct behavior?

The observables should be computed in the worker processes.

Relevant logs and/or screenshots

Not relevant

Possible fixes

A proposed fix, which would also avoid any side effect, is to add an eval_observables boolean argument to OptimizationProblem.evaluate_functions, set to False by default. It would be set to True only for the parallel DoE worker.

The modifications to the evaluate_functions method would then be the following:

    def evaluate_functions(
        self,
        x_vect: ndarray = None,
        eval_jac: bool = False,
        eval_obj: bool = True,
        eval_observables: bool = False,
        normalize: bool = True,
        no_db_no_norm: bool = False,
    ) -> tuple[dict[str, float | ndarray], dict[str, ndarray]]:

#################################################################


        # Get the functions to be evaluated
        if self.__functions_are_preprocessed and no_db_no_norm:
            functions = list(self.nonproc_constraints)
            if eval_obj:
                functions += [self.nonproc_objective]
            if eval_observables:
                functions += self.nonproc_observables
        else:
            functions = list(self.constraints)
            if eval_obj:
                functions += [self.objective]
            if eval_observables:
                functions += self.observables
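For illustration, a worker-side call could then look as follows (opt_problem and x_sample are assumed names, not taken from the actual parallel DoE worker code); only the parallel DoE worker would pass eval_observables=True, so that all other callers keep the current behavior unchanged:

# Hypothetical call from the parallel DoE worker (names assumed for illustration).
outputs, jacobians = opt_problem.evaluate_functions(
    x_vect=x_sample,
    eval_observables=True,  # only the // DoE worker sets this to True
)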