CUDA-related Tensor unit tests fail
@chhtz
Submitted by Christoph Hertzberg
Assigned to Nobody
Link to original bugzilla bug (#1554)
Version: 3.4 (development)
Operating system: Linux
Description
I'm having trouble running the CUDA test-cases.
First of all, when compiling after just enabling EIGEN_TEST_CUDA, I'm getting
"__builtin_ia32_monitorx" is undefined
and related errors.
A workaround is to add some flags to CUDA_NVCC_FLAGS:
-D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -D__STRICT_ANSI__
(Source: https://github.com/NVIDIA/nccl/issues/29)
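For reference, a minimal sketch of how the workaround flags might be passed at configure time. It assumes the build uses CMake's FindCUDA module (which reads CUDA_NVCC_FLAGS as a semicolon-separated list) and that the project's CMake files append to that variable rather than overwrite it. The source path is a placeholder, and `__STRICT_ANSI__` is spelled with its trailing underscores, as GCC defines it.

```shell
# Run from an empty build directory.
# /path/to/eigen is a placeholder for the Eigen source checkout.
# FindCUDA expects multiple nvcc arguments to be separated by semicolons,
# so the three defines are joined with ';' and quoted for the shell.
cmake \
  -DEIGEN_TEST_CUDA=ON \
  -DCUDA_NVCC_FLAGS="-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES;-D__STRICT_ANSI__" \
  /path/to/eigen
```

If the flags are joined with spaces instead, FindCUDA would hand them to nvcc as one argument, so the semicolons matter here.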
After that, all tests compile (with numerous warnings), but all tests except cuda_basic fail:
Test project /home/chtz/workspace/eigen-bisect/build-nvcc
Start 701: cuda_basic
1/35 Test #701: cuda_basic ............................ Passed 0.37 sec
Start 891: cxx11_tensor_complex_cuda
2/35 Test #891: cxx11_tensor_complex_cuda .............***Exception: Illegal 0.29 sec
Start 892: cxx11_tensor_complex_cwise_ops_cuda
3/35 Test #892: cxx11_tensor_complex_cwise_ops_cuda ...***Exception: Illegal 0.17 sec
Start 893: cxx11_tensor_reduction_cuda_1
4/35 Test #893: cxx11_tensor_reduction_cuda_1 .........***Exception: Illegal 0.14 sec
Start 894: cxx11_tensor_reduction_cuda_2
5/35 Test #894: cxx11_tensor_reduction_cuda_2 .........***Exception: Illegal 0.14 sec
Start 895: cxx11_tensor_reduction_cuda_3
6/35 Test #895: cxx11_tensor_reduction_cuda_3 .........***Exception: Illegal 0.15 sec
Start 896: cxx11_tensor_reduction_cuda_4
7/35 Test #896: cxx11_tensor_reduction_cuda_4 .........***Exception: Illegal 0.14 sec
Start 897: cxx11_tensor_reduction_cuda_5
8/35 Test #897: cxx11_tensor_reduction_cuda_5 .........***Exception: Illegal 0.14 sec
Start 898: cxx11_tensor_reduction_cuda_6
9/35 Test #898: cxx11_tensor_reduction_cuda_6 .........***Exception: Illegal 0.14 sec
Start 899: cxx11_tensor_argmax_cuda_1
10/35 Test #899: cxx11_tensor_argmax_cuda_1 ............***Exception: Illegal 0.14 sec
Start 900: cxx11_tensor_argmax_cuda_2
11/35 Test #900: cxx11_tensor_argmax_cuda_2 ............***Exception: Illegal 0.13 sec
Start 901: cxx11_tensor_argmax_cuda_3
12/35 Test #901: cxx11_tensor_argmax_cuda_3 ............***Exception: Illegal 0.14 sec
Start 902: cxx11_tensor_cast_float16_cuda
13/35 Test #902: cxx11_tensor_cast_float16_cuda ........***Exception: Illegal 0.15 sec
Start 903: cxx11_tensor_scan_cuda_1
14/35 Test #903: cxx11_tensor_scan_cuda_1 ..............***Exception: Illegal 0.13 sec
Start 904: cxx11_tensor_scan_cuda_2
15/35 Test #904: cxx11_tensor_scan_cuda_2 ..............***Exception: Illegal 0.14 sec
Start 907: cxx11_tensor_cuda_1
16/35 Test #907: cxx11_tensor_cuda_1 ...................***Exception: Illegal 0.14 sec
Start 908: cxx11_tensor_cuda_2
17/35 Test #908: cxx11_tensor_cuda_2 ...................***Exception: Illegal 0.14 sec
Start 909: cxx11_tensor_cuda_3
18/35 Test #909: cxx11_tensor_cuda_3 ...................***Exception: Illegal 0.14 sec
Start 910: cxx11_tensor_cuda_4
19/35 Test #910: cxx11_tensor_cuda_4 ...................***Exception: Illegal 0.14 sec
Start 911: cxx11_tensor_cuda_5
20/35 Test #911: cxx11_tensor_cuda_5 ...................***Exception: Illegal 0.14 sec
Start 912: cxx11_tensor_cuda_6
21/35 Test #912: cxx11_tensor_cuda_6 ...................***Exception: Illegal 0.14 sec
Start 913: cxx11_tensor_contract_cuda_1
22/35 Test #913: cxx11_tensor_contract_cuda_1 ..........***Exception: Illegal 0.13 sec
Start 914: cxx11_tensor_contract_cuda_2
23/35 Test #914: cxx11_tensor_contract_cuda_2 ..........***Exception: Illegal 0.13 sec
Start 915: cxx11_tensor_contract_cuda_3
24/35 Test #915: cxx11_tensor_contract_cuda_3 ..........***Exception: Illegal 0.13 sec
Start 916: cxx11_tensor_contract_cuda_4
25/35 Test #916: cxx11_tensor_contract_cuda_4 ..........***Exception: Illegal 0.14 sec
Start 917: cxx11_tensor_contract_cuda_5
26/35 Test #917: cxx11_tensor_contract_cuda_5 ..........***Exception: Illegal 0.13 sec
Start 918: cxx11_tensor_contract_cuda_6
27/35 Test #918: cxx11_tensor_contract_cuda_6 ..........***Exception: Illegal 0.15 sec
Start 919: cxx11_tensor_contract_cuda_7
28/35 Test #919: cxx11_tensor_contract_cuda_7 ..........***Exception: Illegal 0.14 sec
Start 920: cxx11_tensor_contract_cuda_8
29/35 Test #920: cxx11_tensor_contract_cuda_8 ..........***Exception: Illegal 0.14 sec
Start 921: cxx11_tensor_contract_cuda_9
30/35 Test #921: cxx11_tensor_contract_cuda_9 ..........***Exception: Illegal 0.13 sec
Start 922: cxx11_tensor_of_float16_cuda_1
31/35 Test #922: cxx11_tensor_of_float16_cuda_1 ........***Exception: Illegal 0.17 sec
Start 923: cxx11_tensor_of_float16_cuda_2
32/35 Test #923: cxx11_tensor_of_float16_cuda_2 ........***Exception: Other 0.17 sec
Start 924: cxx11_tensor_of_float16_cuda_3
33/35 Test #924: cxx11_tensor_of_float16_cuda_3 ........***Exception: Illegal 0.17 sec
Start 925: cxx11_tensor_of_float16_cuda_4
34/35 Test #925: cxx11_tensor_of_float16_cuda_4 ........***Exception: Other 0.18 sec
Start 926: cxx11_tensor_of_float16_cuda_5
35/35 Test #926: cxx11_tensor_of_float16_cuda_5 ........***Exception: Other 0.18 sec
3% tests passed, 34 tests failed out of 35
Label Time Summary:
Official = 0.37 sec (1 test)
Unsupported = 5.05 sec (34 tests)
Output of ./test/cuda_basic (yes, the GPU is quite old):
CUDA device info:
name: GeForce GTX 460 SE
capability: 2.1
multiProcessorCount: 6
maxThreadsPerMultiProcessor: 1536
warpSize: 32
regsPerBlock: 32768
concurrentKernels: 1
clockRate: 1296000
canMapHostMemory: 1
computeMode: 0
And nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
Am I missing something obvious, or is my compiler or GPU too old for tensor-CUDA support?