
Add Hexagon DSP partial vectorization support and test pipeline

What does this implement/fix?

We are extending Eigen's support for the Qualcomm Hexagon DSP Vector Extension (HVX). HVX uses 128-byte vector registers, each of which holds 32 single-precision floats. Vectors and matrices whose sizes are not a multiple of 32 therefore incur considerable overhead.
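To make the overhead concrete, here is a minimal sketch of the packet arithmetic for a 128-byte register. All names (`kHvxRegisterBytes`, `kFloatsPerPacket`, `fullPackets`, `tailElements`, `paddedSize`) are hypothetical and chosen for illustration; they are not taken from Eigen or from this MR.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative constants; hypothetical names, not Eigen identifiers.
constexpr std::size_t kHvxRegisterBytes = 128;
constexpr std::size_t kFloatsPerPacket = kHvxRegisterBytes / sizeof(float);
static_assert(sizeof(float) == 4, "sketch assumes 4-byte single-precision floats");

// Number of complete 32-float packets contained in n elements.
inline std::size_t fullPackets(std::size_t n) { return n / kFloatsPerPacket; }

// Elements left over after the complete packets; this tail is what
// cannot be handled by a full-width load or store.
inline std::size_t tailElements(std::size_t n) { return n % kFloatsPerPacket; }

// Size rounded up to the next multiple of the packet width.
inline std::size_t paddedSize(std::size_t n) {
  const std::size_t padded =
      (n + kFloatsPerPacket - 1) / kFloatsPerPacket * kFloatsPerPacket;
  assert(padded >= n);  // guards against overflow for very large n
  return padded;
}
```

For example, a vector of 40 floats has one full packet and a tail of 8 elements, and its padded size is 64.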

With this change, packets shorter than a full register can also use the HVX registers through partial loads and stores. We use the existing pload_partial and pstore_partial APIs and add a new preverse_partial API. For many operations, the runtime becomes equal to that of an object whose dimensions are rounded up to the next multiple of 32, which is faster than the current implementation. The new code is guarded by a new macro, EIGEN_VECTORIZE_PARTIAL, so that other architectures are not affected.
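The idea of partial loads and stores can be sketched in plain C++ as follows. This is only an emulation under stated assumptions: `FloatPacket`, `loadPartial`, `storePartial`, and `scaleShort` are hypothetical names, the unused lanes are assumed to be zero-filled, and none of this is the HVX implementation from the MR, which would use vector intrinsics instead of scalar loops.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 32-lane HVX float packet (not an Eigen type).
constexpr std::size_t kLanes = 32;
using FloatPacket = std::array<float, kLanes>;

// Load n elements from `from` into lanes [offset, offset + n).
// Assumption: the remaining lanes are zero-filled in this sketch.
inline FloatPacket loadPartial(const float* from, std::size_t n,
                               std::size_t offset = 0) {
  assert(offset + n <= kLanes);
  FloatPacket p{};  // value-initialised, so every lane starts at 0
  for (std::size_t i = 0; i < n; ++i) p[offset + i] = from[i];
  return p;
}

// Store lanes [offset, offset + n) of p to `to`; memory past to[n - 1]
// is never written.
inline void storePartial(float* to, const FloatPacket& p, std::size_t n,
                         std::size_t offset = 0) {
  assert(offset + n <= kLanes);
  for (std::size_t i = 0; i < n; ++i) to[i] = p[offset + i];
}

// Scale a short vector (n <= 32) using a single partially filled packet.
inline void scaleShort(const float* in, float* out, std::size_t n,
                       float factor) {
  FloatPacket p = loadPartial(in, n);
  for (float& lane : p) lane *= factor;  // stands in for one full-width multiply
  storePartial(out, p, n);
}
```

The point of the sketch is that the arithmetic always runs at full register width, while only the load and the store are restricted to the valid n elements. This matches the observation above that the cost approaches that of an object padded to the next multiple of 32.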

We also add a Dockerfile for building Eigen and its unit tests with the Hexagon SDK, and make small changes to the build system so that everything works together. See below for additional information.

Finally, this merge request combines and supersedes our two older ones, !1472 (closed) and !1634 (closed). Please disregard the older MRs.

Additional information

Unfortunately, we cannot distribute a Docker image with the Hexagon SDK preloaded, because the SDK must be obtained through a Qualcomm Developer account and we are not allowed to redistribute it in any other way. Setting up a testing environment with the provided Dockerfile therefore requires creating a Qualcomm Developer account and downloading the SDK from there. We hope that hexagon.dockerfile and the build system changes help others build the unit test environment for debugging and testing.

See the instructions in ci/hexagon.dockerfile for creating a Qualcomm Developer account, signing the agreements, and downloading the QPM deb package. Once that is done, you can build and test this MR branch by overriding the default REPO_URL and REPO_BRANCH build arguments as follows:

docker build -t eigen-hex --pull --build-arg QPM_USER=<qcd-email> --build-arg QPM_PASS=<qcd-password> --build-arg REPO_URL=https://gitlab.com/bardia5/eigen.git --build-arg REPO_BRANCH=hexagon_build_partial_vectorization -f hexagon.dockerfile .
docker run --rm -v $(pwd):/output eigen-hex -- -j 32 --timeout 36000 -O /output/ctest.log --output-junit /output/ctest-report.xml

The unit tests inside the Docker container run on the DSP simulator from the Hexagon SDK. Because the simulator is rather slow, some unit tests do not produce a result within 10 hours and are reported as failed. Other failures may need further debugging. However, we think the change is already worthwhile as it stands, and we will continue to debug the failing test cases.

We also include a list of the tests that failed in the simulator in simulator_10_hours_no_pass_unit_result.html.

Edited by Bardia Behabadi
