
Add runtime multithreading support to apply_rule

Here we go! 🎉

What does this MR do?

This MR re-introduces the parallelization capabilities of Utopia originally implemented in !98 (merged). The original MR was reverted in !165 (merged) due to incompatibilities on macOS, see #225 (closed).

What were the problems?

We were misinformed about the contents of the TBB and ParallelSTL (PSTL) packages across various platforms, see #225 (comment 54465). Here are the key points:

  • PSTL is implemented in the GNU standard library since GCC 9, but requires TBB to be installed separately. This concerns Ubuntu/Linux users.
  • PSTL is a separate package on Homebrew because LLVM has not included it in its standard library (yet!). This concerns macOS users.

Additionally, there was an issue in the code: Without PSTL definitions, the std::execution namespace is not available. We therefore have to consider the case where the original STL algorithms are called without an execution policy at all.

How was it resolved?

CMake now has an extended detection procedure for both PSTL and TBB, allowing for "internal" (standard library) and "external" PSTL implementations to be used at the users' discretion:

  1. Check if the <execution> header is available for the compiler. This indicates that PSTL is shipped with the standard library.
  2. If yes, check for TBB.
  3. If yes, but a cache or environment variable ParallelSTL_ROOT is defined, search for an external PSTL package.
  4. If no, search for an external PSTL package.

If any of 2), 3), or 4) is successful, parallel features are enabled. The information on which type of PSTL package is used is passed to the code, where some additional corner cases are handled.
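For reviewers who want the decision logic at a glance, here is a rough model of the four steps written in C++. The actual detection lives in CMake; this is only a sketch. All names (`Probe`, `PstlSource`, `select_pstl`) are hypothetical. The precedence is also an assumption: an external package is preferred when `ParallelSTL_ROOT` is defined and the package is found, with a fallback to the internal PSTL plus TBB otherwise.

```cpp
// Hypothetical model of the detection steps; not the actual CMake code.
enum class PstlSource { none, internal, external };

struct Probe {
    bool has_execution_header;  // step 1: <execution> header is available
    bool tbb_found;             // step 2: TBB was found
    bool pstl_root_defined;     // step 3: ParallelSTL_ROOT is defined
    bool external_pstl_found;   // steps 3/4: external PSTL package was found
};

constexpr PstlSource select_pstl(const Probe& p)
{
    if (p.has_execution_header) {
        // Assumption: an explicitly requested external package wins.
        if (p.pstl_root_defined && p.external_pstl_found) {
            return PstlSource::external;
        }
        // The standard library PSTL is only usable together with TBB.
        return p.tbb_found ? PstlSource::internal : PstlSource::none;
    }
    // No <execution> header: only an external package can help.
    return p.external_pstl_found ? PstlSource::external : PstlSource::none;
}
```

Anything other than `PstlSource::none` corresponds to parallel features being enabled.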

The code issue was resolved by packing the STL algorithm arguments into a tuple and using std::apply() to call a function with a tuple that contains the function arguments. This complicated the syntax, but works around the issue of writing two cases for every STL algorithm overload.
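To illustrate the idea, here is a minimal sketch of the tuple approach for a single algorithm. It is not the code in parallel.hh. The macro `HYPOTHETICAL_HAVE_PSTL`, the enum `ExecPolicy`, and the functions `for_each_dispatch` and `example` are made-up names, and the macro is assumed to be set by the build system when an execution policy can be used.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical macro, assumed to be defined by the build system.
#ifdef HYPOTHETICAL_HAVE_PSTL
#include <execution>
#endif

// Hypothetical runtime switch between sequential and parallel execution.
enum class ExecPolicy { seq, par };

// Pack the algorithm arguments into a tuple and prepend an execution policy
// only if std::execution is available. One code path serves both cases.
template <class... Args>
void for_each_dispatch(ExecPolicy policy, Args&&... args)
{
    auto packed = std::forward_as_tuple(std::forward<Args>(args)...);

    // std::for_each is an overload set, so wrap it in a generic lambda.
    auto call = [](auto&&... a) {
        std::for_each(std::forward<decltype(a)>(a)...);
    };

#ifdef HYPOTHETICAL_HAVE_PSTL
    if (policy == ExecPolicy::par) {
        std::apply(call,
                   std::tuple_cat(std::forward_as_tuple(std::execution::par),
                                  std::move(packed)));
        return;
    }
#endif
    (void)policy;  // without PSTL, every policy falls back to sequential
    std::apply(call, std::move(packed));
}

// Usage example: double every element, regardless of PSTL availability.
inline void example()
{
    std::vector<int> values{1, 2, 3};
    for_each_dispatch(ExecPolicy::par, values.begin(), values.end(),
                      [](int& x) { x *= 2; });
    assert((values == std::vector<int>{2, 4, 6}));
}
```

Without the macro, the call compiles to the plain sequential overload, so no second implementation per algorithm is needed.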

Additional changes

The unit tests have been split up into one test case for the parallel interface (parallel_interface_test) and one for the STL algorithm overloads (parallel_stl_test). The first can and will only be compiled if the dependencies for parallel features are detected. The second is additionally registered as a strictly sequential test (parallel_stl_seq_test) to check that the algorithm overloads work with no dependencies for parallelization installed. Finally, the parallel_stl_test and apply_rule_parallel tests are compiled but disabled if said dependencies are not installed, as they would execute the exact same code as their sequential counterparts.

Additionally, the FindTBB.cmake module was deleted. It was only required on Ubuntu 18.04 (and will have to be re-introduced for a backport). On more recent OS versions and on macOS, CMake config files for TBB and PSTL are available.

Caveats for macOS

In addition, the parallelstl package provided by Homebrew has a faulty configuration. It wants to place includes in /usr/local/include/ and /usr/local/stdlib/. The latter is not possible, and therefore using the symlinked location of PSTL in /usr/local/ will lead to missing include files. Users have to specify the path to the actual Homebrew installation in /usr/local/Cellar/parallelstl/. This is reflected in the README.md. The issue has been reported to Homebrew but was considered an upstream problem: https://github.com/oneapi-src/oneDPL/issues/33

Is there something that needs to be double checked?

I checked these configurations:

  • macOS, AppleClang 10, PSTL not installed: 32 core tests are compiled, 2 are disabled, test suite succeeds.
  • macOS, AppleClang 10, PSTL installed: 33 core tests are compiled, 0 are disabled, test suite succeeds.
  • Debian, GCC 9, PSTL not installed: 32 core tests are compiled, 2 are disabled, test suite succeeds.
  • Debian, GCC 9, PSTL installed: 33 core tests are compiled, 0 are disabled, test suite succeeds.
  • (GitLab CI:) Ubuntu 20.04, GCC 10, PSTL installed: 33 core tests are compiled, 0 are disabled, test suite succeeds.
  • Any more to check?
  • Documentation clear?
  • CMake stuff comprehensible?

Can this MR be accepted?

  • Implemented the changes
    • CMake handling of ParallelSTL and TBB dependencies
    • parallel.hh now works for any (possible) definition of PSTL stuff
  • Added or extended tests
    • Run parallel tests sequentially as well
    • Disable parallel tests if required dependencies are not detected
  • Checked test code coverage on new and adjusted code
  • Added or updated documentation
  • Reasonably up-to-date with current master
  • Pipeline passing without warnings
  • Squash option set
  • Set labels to pick this MR into support branches (nope.)
  • Approved by @blsqr
  • Approved by ... anyone else?

Related issues

Closes #225 (closed)

Original issue: #27 (closed)

Edited by Utopia Developers
