• Still not fixed in nvcc V12.5.40. Or rather, the error has changed with GCC 13.3:

    $ nvcc -O3 -ccbin g++-13 test.cu
    /usr/lib/gcc/x86_64-pc-linux-gnu/13.3.0/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
        __builtin_ia32_ldtilecfg (__config);
        ^
    
    /usr/lib/gcc/x86_64-pc-linux-gnu/13.3.0/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
        __builtin_ia32_sttilecfg (__config);
        ^
    
    2 errors detected in the compilation of "test.cu".
    $ g++-13 --version
    g++-13 (GCC) 13.3.0
    Copyright (C) 2023 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    Edited by Jakub Klinkovský
  • Still not fixed in nvcc V12.6.20 😡

  • Issue

    I have the same issue:

    ...
    Making with 20 threads. Your PC provides 20 threads.
    -- The CXX compiler identification is GNU 13.3.0
    -- The CUDA compiler identification is NVIDIA 12.6.68
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/g++-13 - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Detecting CUDA compiler ABI info
    -- Detecting CUDA compiler ABI info - done
    -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
    -- Detecting CUDA compile features
    -- Detecting CUDA compile features - done
    ...
    [ 89%] Building CXX object CMakeFiles/nia_start_core.dir/src/main.cpp.o
    [ 94%] Building CUDA object CMakeFiles/nia_start_core.dir/src/GpuProcessing.cu.o
    [ 94%] Building CXX object CMakeFiles/nia_start_core.dir/src/ModelingMainDriver.cpp.o
    nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
    /usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
        __builtin_ia32_ldtilecfg (__config);
        ^
    
    /usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
        __builtin_ia32_sttilecfg (__config);
        ^
    
    2 errors detected in the compilation of "/home/vladislavsemykin/Documents/Work/Start/src/GpuProcessing.cu".
    make[2]: *** [CMakeFiles/nia_start_core.dir/build.make:91: CMakeFiles/nia_start_core.dir/src/GpuProcessing.cu.o] Error 2
    make[2]: *** Waiting for unfinished jobs....
    make[1]: *** [CMakeFiles/Makefile2:285: CMakeFiles/nia_start_core.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2

    Configs

    Here are my configs

    (base) vladislavsemykin@loveit:~/Documents/Work/Start$ nvidia-smi
    Mon Sep 30 17:04:57 2024       
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA GeForce RTX 4060 ...    Off |   00000000:01:00.0 Off |                  N/A |
    | N/A   39C    P0             15W /   80W |      15MiB /   8188MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+
                                                                                             
    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |    0   N/A  N/A      6241      G   /usr/lib/xorg/Xorg                              4MiB |
    +-----------------------------------------------------------------------------------------+
    (base) vladislavsemykin@loveit:~/Documents/Work/Start$ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2024 NVIDIA Corporation
    Built on Wed_Aug_14_10:10:22_PDT_2024
    Cuda compilation tools, release 12.6, V12.6.68
    Build cuda_12.6.r12.6/compiler.34714021_0
  • Issue Report: Compilation Failure with Various GCC Versions and CUDA Toolkit 12.5

    Compiling CUDA code with several versions of GCC (g++-11, g++-12, g++-13, g++-14) produces multiple errors, mostly undefined identifiers and built-in functions in system header files. The errors occur while compiling GpuProcessing.cu (a simple sample file) with CUDA Toolkit 12.5. These are the results for each GCC version:

    g++-11

    nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
    /usr/lib/gcc/x86_64-linux-gnu/11/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
        __builtin_ia32_ldtilecfg (__config);
        ^
    
    /usr/lib/gcc/x86_64-linux-gnu/11/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
        __builtin_ia32_sttilecfg (__config);
        ^
    
    2 errors detected in the compilation of "/home/vladislavsemykin/Documents/Work/Start/src/GpuProcessing.cu".
    make[2]: *** [CMakeFiles/nia_start_core.dir/build.make:91: CMakeFiles/nia_start_core.dir/src/GpuProcessing.cu.o] Error 2
    make[2]: *** Waiting for unfinished jobs....

    g++-12

    nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(53): error: identifier "__builtin_ia32_cvtne2ps2bf16_v16hi" is undefined
        return (__m256bh)__builtin_ia32_cvtne2ps2bf16_v16hi(__A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(60): error: identifier "__builtin_ia32_cvtne2ps2bf16_v16hi_mask" is undefined
        return (__m256bh)__builtin_ia32_cvtne2ps2bf16_v16hi_mask(__C, __D, __A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(67): error: identifier "__builtin_ia32_cvtne2ps2bf16_v16hi_maskz" is undefined
        return (__m256bh)__builtin_ia32_cvtne2ps2bf16_v16hi_maskz(__B, __C, __A);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(74): error: identifier "__builtin_ia32_cvtne2ps2bf16_v8hi" is undefined
        return (__m128bh)__builtin_ia32_cvtne2ps2bf16_v8hi(__A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(81): error: identifier "__builtin_ia32_cvtne2ps2bf16_v8hi_mask" is undefined
        return (__m128bh)__builtin_ia32_cvtne2ps2bf16_v8hi_mask(__C, __D, __A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16vlintrin.h(88): error: identifier "__builtin_ia32_cvtne2ps2bf16_v8hi_maskz" is undefined
        return (__m128bh)__builtin_ia32_cvtne2ps2bf16_v8hi_maskz(__B, __C, __A);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16intrin.h(60): error: identifier "__builtin_ia32_cvtne2ps2bf16_v32hi" is undefined
        return (__m512bh)__builtin_ia32_cvtne2ps2bf16_v32hi(__A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16intrin.h(67): error: identifier "__builtin_ia32_cvtne2ps2bf16_v32hi_mask" is undefined
        return (__m512bh)__builtin_ia32_cvtne2ps2bf16_v32hi_mask(__C, __D, __A, __B);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/avx512bf16intrin.h(74): error: identifier "__builtin_ia32_cvtne2ps2bf16_v32hi_maskz" is undefined
        return (__m512bh)__builtin_ia32_cvtne2ps2bf16_v32hi_maskz(__B, __C, __A);
                         ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
        __builtin_ia32_ldtilecfg (__config);
        ^
    
    /usr/lib/gcc/x86_64-linux-gnu/12/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
        __builtin_ia32_sttilecfg (__config);

    g++-13

    [ 94%] Building CXX object CMakeFiles/nia_start_core.dir/src/ModelingMainDriver.cpp.o
    /usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
        __builtin_ia32_ldtilecfg (__config);
        ^
    
    /usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
        __builtin_ia32_sttilecfg (__config);
        ^
    
    2 errors detected in the compilation of "/home/vladislavsemykin/Documents/Work/Start/src/GpuProcessing.cu".
    make[2]: *** [CMakeFiles/nia_start_core.dir/build.make:91: CMakeFiles/nia_start_core.dir/src/GpuProcessing.cu.o] Error 2
    make[2]: *** Waiting for unfinished jobs....
    ^Cmake[2]: *** [CMakeFiles/nia_start_core.dir/build.make:76: CMakeFiles/nia_start_core.dir/src/main.cpp.o] Interrupt
    make[2]: *** [CMakeFiles/nia_start_core.dir/build.make:105: CMakeFiles/nia_start_core.dir/src/ModelingMainDriver.cpp.o] Interrupt
    make[1]: *** [CMakeFiles/Makefile2:285: CMakeFiles/nia_start_core.dir/all] Interrupt
    make: *** [Makefile:91: all] Interrupt

    g++-14

    [ 94%] Building CXX object CMakeFiles/nia_start_core.dir/src/ModelingMainDriver.cpp.o
    nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
    /usr/include/x86_64-linux-gnu/c++/14/bits/c++config.h(827): error: user-defined literal operator not found
        typedef __decltype(0.0bf16) __bfloat16_t;
                           ^
    
    /usr/include/c++/14/type_traits(529): error: type name is not allowed
          : public __bool_constant<__is_array(_Tp)>
                                              ^
    
    /usr/include/c++/14/type_traits(529): error: identifier "__is_array" is undefined
          : public __bool_constant<__is_array(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(581): error: type name is not allowed
          : public __bool_constant<__is_member_object_pointer(_Tp)>
                                                              ^
    
    /usr/include/c++/14/type_traits(581): error: identifier "__is_member_object_pointer" is undefined
          : public __bool_constant<__is_member_object_pointer(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(603): error: type name is not allowed
          : public __bool_constant<__is_member_function_pointer(_Tp)>
                                                                ^
    
    /usr/include/c++/14/type_traits(603): error: identifier "__is_member_function_pointer" is undefined
          : public __bool_constant<__is_member_function_pointer(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(695): error: type name is not allowed
          : public __bool_constant<__is_reference(_Tp)>
                                                  ^
    
    /usr/include/c++/14/type_traits(695): error: identifier "__is_reference" is undefined
          : public __bool_constant<__is_reference(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(731): error: type name is not allowed
          : public __bool_constant<__is_object(_Tp)>
                                               ^
    
    /usr/include/c++/14/type_traits(731): error: identifier "__is_object" is undefined
          : public __bool_constant<__is_object(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(760): error: type name is not allowed
          : public __bool_constant<__is_member_pointer(_Tp)>
                                                       ^
    
    /usr/include/c++/14/type_traits(760): error: identifier "__is_member_pointer" is undefined
          : public __bool_constant<__is_member_pointer(_Tp)>
                                   ^
    
    /usr/include/c++/14/type_traits(3247): error: type name is not allowed
        inline constexpr bool is_array_v = __is_array(_Tp);
                                                      ^
    
    /usr/include/c++/14/type_traits(3271): error: type name is not allowed
          __is_member_object_pointer(_Tp);
                                     ^
    
    /usr/include/c++/14/type_traits(3281): error: type name is not allowed
          __is_member_function_pointer(_Tp);
                                       ^
    
    /usr/include/c++/14/type_traits(3298): error: type name is not allowed
        inline constexpr bool is_reference_v = __is_reference(_Tp);
                                                              ^
    
    /usr/include/c++/14/type_traits(3315): error: type name is not allowed
        inline constexpr bool is_object_v = __is_object(_Tp);
                                                        ^
    
    /usr/include/c++/14/type_traits(3328): error: type name is not allowed
        inline constexpr bool is_member_pointer_v = __is_member_pointer(_Tp);
                                                                        ^
    
    /usr/include/c++/14/type_traits(3649): error: type name is not allowed
          inline constexpr bool is_bounded_array_v = __is_bounded_array(_Tp);
                                                                        ^
    
    /usr/include/c++/14/type_traits(3649): error: identifier "__is_bounded_array" is undefined
          inline constexpr bool is_bounded_array_v = __is_bounded_array(_Tp);
                                                     ^
    
    /usr/include/c++/14/bits/utility.h(237): error: __type_pack_element is not a template
          { using type = __type_pack_element<_Np, _Types...>;
                             ^
    /usr/include/c++/14/tuple(2515): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(const std::tuple<_Types...> &&) noexcept" failed
          get(const tuple<_Types...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2503): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(const std::tuple<_Types...> &) noexcept" failed
          get(const tuple<_Types...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2492): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(std::tuple<_Types...> &&) noexcept" failed
          get(tuple<_Types...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2481): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(std::tuple<_Types...> &) noexcept" failed
          get(tuple<_Types...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2474): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &)" failed deduction
          get(const tuple<_Elements...>&) = delete;
          ^
    /usr/include/c++/14/bits/ranges_util.h(444): note #3327-D: candidate function template "std::ranges::get<_Num,_It,_Sent,_Kind>(const std::ranges::subrange<_It, _Sent, _Kind> &)" failed deduction
          get(const subrange<_It, _Sent, _Kind>& __r)
          ^
    /usr/include/c++/14/bits/ranges_util.h(455): note #3327-D: candidate function template "std::ranges::get<_Num,_It,_Sent,_Kind>(std::ranges::subrange<_It, _Sent, _Kind> &&)" failed deduction
          get(subrange<_It, _Sent, _Kind>&& __r)
          ^
    /usr/include/c++/14/bits/stl_pair.h(1307): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Up, _Tp> &&) noexcept" failed
          get(const pair<_Up, _Tp>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1302): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Up, _Tp> &&) noexcept" failed
          get(pair<_Up, _Tp>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1297): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Up, _Tp> &) noexcept" failed
          get(const pair<_Up, _Tp>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1292): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Up, _Tp> &) noexcept" failed
          get(pair<_Up, _Tp>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1287): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Tp, _Up> &&) noexcept" failed
          get(const pair<_Tp, _Up>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1282): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Tp, _Up> &&) noexcept" failed
          get(pair<_Tp, _Up>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1277): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Tp, _Up> &) noexcept" failed
          get(const pair<_Tp, _Up>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1272): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Tp, _Up> &) noexcept" failed
          get(pair<_Tp, _Up>& __p) noexcept
          ^
    /usr/include/c++/14/array(417): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(const std::array<_Tp, _Nm> &&) noexcept" failed deduction
          get(const array<_Tp, _Nm>&& __arr) noexcept
          ^
    /usr/include/c++/14/array(408): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(const std::array<_Tp, _Nm> &) noexcept" failed deduction
          get(const array<_Tp, _Nm>& __arr) noexcept
          ^
    /usr/include/c++/14/array(399): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(std::array<_Tp, _Nm> &&) noexcept" failed deduction
          get(array<_Tp, _Nm>&& __arr) noexcept
          ^
    /usr/include/c++/14/array(390): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(std::array<_Tp, _Nm> &) noexcept" failed deduction
          get(array<_Tp, _Nm>& __arr) noexcept
          ^
    /usr/include/c++/14/tuple(2464): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &&) noexcept" failed deduction
          get(const tuple<_Elements...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2455): note #3327-D: candidate function template "std::get<__i,_Elements...>(std::tuple<_Elements...> &&) noexcept" failed deduction
          get(tuple<_Elements...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2449): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &) noexcept" failed deduction
          get(const tuple<_Elements...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2443): note #3327-D: candidate function template "std::get<__i,_Elements...>(std::tuple<_Elements...> &) noexcept" failed deduction
          get(tuple<_Elements...>& __t) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1265): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(const std::pair<_Tp1, _Tp2> &&) noexcept" failed deduction
          get(const pair<_Tp1, _Tp2>&& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1260): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(const std::pair<_Tp1, _Tp2> &) noexcept" failed deduction
          get(const pair<_Tp1, _Tp2>& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1255): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(std::pair<_Tp1, _Tp2> &&) noexcept" failed deduction
          get(pair<_Tp1, _Tp2>&& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1250): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(std::pair<_Tp1, _Tp2> &) noexcept" failed deduction
          get(pair<_Tp1, _Tp2>& __in) noexcept
          ^
    
    /usr/local/include/Kokkos_Tuners.hpp(512): error: no instance of overloaded function "std::get" matches the argument list
                argument types are: (<error-type>)
            auto vector_length = std::get<0>(configuration);
                                 ^
    /usr/include/c++/14/tuple(2515): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(const std::tuple<_Types...> &&) noexcept" failed
          get(const tuple<_Types...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2503): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(const std::tuple<_Types...> &) noexcept" failed
          get(const tuple<_Types...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2492): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(std::tuple<_Types...> &&) noexcept" failed
          get(tuple<_Types...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2481): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get<_Tp,_Types...>(std::tuple<_Types...> &) noexcept" failed
          get(tuple<_Types...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2474): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &)" failed deduction
          get(const tuple<_Elements...>&) = delete;
          ^
    /usr/include/c++/14/bits/ranges_util.h(444): note #3327-D: candidate function template "std::ranges::get<_Num,_It,_Sent,_Kind>(const std::ranges::subrange<_It, _Sent, _Kind> &)" failed deduction
          get(const subrange<_It, _Sent, _Kind>& __r)
          ^
    /usr/include/c++/14/bits/ranges_util.h(455): note #3327-D: candidate function template "std::ranges::get<_Num,_It,_Sent,_Kind>(std::ranges::subrange<_It, _Sent, _Kind> &&)" failed deduction
          get(subrange<_It, _Sent, _Kind>&& __r)
          ^
    /usr/include/c++/14/bits/stl_pair.h(1307): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Up, _Tp> &&) noexcept" failed
          get(const pair<_Up, _Tp>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1302): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Up, _Tp> &&) noexcept" failed
          get(pair<_Up, _Tp>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1297): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Up, _Tp> &) noexcept" failed
          get(const pair<_Up, _Tp>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1292): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Up, _Tp> &) noexcept" failed
          get(pair<_Up, _Tp>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1287): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Tp, _Up> &&) noexcept" failed
          get(const pair<_Tp, _Up>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1282): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Tp, _Up> &&) noexcept" failed
          get(pair<_Tp, _Up>&& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1277): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(const std::pair<_Tp, _Up> &) noexcept" failed
          get(const pair<_Tp, _Up>& __p) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1272): note #3323-D: substituting explicit template arguments "<<expression>>" for function template "std::get(std::pair<_Tp, _Up> &) noexcept" failed
          get(pair<_Tp, _Up>& __p) noexcept
          ^
    /usr/include/c++/14/array(417): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(const std::array<_Tp, _Nm> &&) noexcept" failed deduction
          get(const array<_Tp, _Nm>&& __arr) noexcept
          ^
    /usr/include/c++/14/array(408): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(const std::array<_Tp, _Nm> &) noexcept" failed deduction
          get(const array<_Tp, _Nm>& __arr) noexcept
          ^
    /usr/include/c++/14/array(399): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(std::array<_Tp, _Nm> &&) noexcept" failed deduction
          get(array<_Tp, _Nm>&& __arr) noexcept
          ^
    /usr/include/c++/14/array(390): note #3327-D: candidate function template "std::get<_Int,_Tp,_Nm>(std::array<_Tp, _Nm> &) noexcept" failed deduction
          get(array<_Tp, _Nm>& __arr) noexcept
          ^
    /usr/include/c++/14/tuple(2464): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &&) noexcept" failed deduction
          get(const tuple<_Elements...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2455): note #3327-D: candidate function template "std::get<__i,_Elements...>(std::tuple<_Elements...> &&) noexcept" failed deduction
          get(tuple<_Elements...>&& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2449): note #3327-D: candidate function template "std::get<__i,_Elements...>(const std::tuple<_Elements...> &) noexcept" failed deduction
          get(const tuple<_Elements...>& __t) noexcept
          ^
    /usr/include/c++/14/tuple(2443): note #3327-D: candidate function template "std::get<__i,_Elements...>(std::tuple<_Elements...> &) noexcept" failed deduction
          get(tuple<_Elements...>& __t) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1265): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(const std::pair<_Tp1, _Tp2> &&) noexcept" failed deduction
          get(const pair<_Tp1, _Tp2>&& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1260): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(const std::pair<_Tp1, _Tp2> &) noexcept" failed deduction
          get(const pair<_Tp1, _Tp2>& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1255): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(std::pair<_Tp1, _Tp2> &&) noexcept" failed deduction
          get(pair<_Tp1, _Tp2>&& __in) noexcept
          ^
    /usr/include/c++/14/bits/stl_pair.h(1250): note #3327-D: candidate function template "std::get<_Int,_Tp1,_Tp2>(std::pair<_Tp1, _Tp2> &) noexcept" failed deduction
          get(pair<_Tp1, _Tp2>& __in) noexcept
          ^
    
    /usr/include/c++/14/format(3349): error: user-defined literal operator not found
         else if constexpr (is_same_v<_Td, decltype(0.0bf16)>)
                                                    ^
    
    /usr/include/boost/mp11/algorithm.hpp(352): error: __type_pack_element is not a template
          using type = __type_pack_element<I, T...>;
                       ^
    
    /usr/include/boost/mp11/algorithm.hpp(359): error: __type_pack_element is not a template
          using type = __type_pack_element<I, mp_value<A>...>;
                       ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/usermsrintrin.h(43): error: identifier "__builtin_ia32_urdmsr" is undefined
        return (unsigned long long) __builtin_ia32_urdmsr (__A);
                                    ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/usermsrintrin.h(50): error: identifier "__builtin_ia32_uwrmsr" is undefined
        __builtin_ia32_uwrmsr (__A, __B);
        ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/avxvnniint16intrin.h(42): error: identifier "__builtin_ia32_vpdpwsud128" is undefined
          __builtin_ia32_vpdpwsud128 ((__v4si) __W, (__v4si) __A, (__v4si) __B);
          ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/avxvnniint16intrin.h(50): error: identifier "__builtin_ia32_vpdpwsuds128" is undefined
          __builtin_ia32_vpdpwsuds128 ((__v4si) __W, (__v4si) __A, (__v4si) __B);
          ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/avxvnniint16intrin.h(58): error: identifier "__builtin_ia32_vpdpwusd128" is undefined
          __builtin_ia32_vpdpwusd128 ((__v4si) __W, (__v4si) __A, (__v4si) __B);
    
    ...
    /usr/lib/gcc/x86_64-linux-gnu/14/include/avx512vlbwintrin.h(5136): error: identifier "__builtin_shufflevector" is undefined
        __v16qi __T1 = (__v16qi)__W; __v16qi __T2 = __builtin_shufflevector (__T1, __T1, 8, 9, 10, 11, 12, 13, 14, 15, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T3 = __T1 & __T2; __v16qi __T4 = __builtin_shufflevector (__T3, __T3, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T5 = __T3 & __T4; __v16qi __T6 = __builtin_shufflevector (__T5, __T5, 2, 3, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T7 = __T5 & __T6; __v16qi __T8 = __builtin_shufflevector (__T7, __T7, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T9 = __T7 & __T8; return __T9[0];
                                                    ^
    
    /usr/lib/gcc/x86_64-linux-gnu/14/include/avx512vlbwintrin.h(5144): error: identifier "__builtin_shufflevector" is undefined
        __v16qi __T1 = (__v16qi)__W; __v16qi __T2 = __builtin_shufflevector (__T1, __T1, 8, 9, 10, 11, 12, 13, 14, 15, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T3 = __T1 | __T2; __v16qi __T4 = __builtin_shufflevector (__T3, __T3, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T5 = __T3 | __T4; __v16qi __T6 = __builtin_shufflevector (__T5, __T5, 2, 3, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T7 = __T5 | __T6; __v16qi __T8 = __builtin_shufflevector (__T7, __T7, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); __v16qi __T9 = __T7 | __T8; return __T9[0];
                                                    ^
    
    Error limit reached.
    100 errors detected in the compilation of "/home/vladislavsemykin/Documents/Work/Start/src/GpuProcessing.cu".
    Compilation terminated.
    make[2]: *** [CMakeFiles/nia_start_core.dir/build.make:91: CMakeFiles/nia_start_core.dir/src/GpuProcessing.cu.o] Error 4
    make[2]: *** Waiting for unfinished jobs....
    make[1]: *** [CMakeFiles/Makefile2:285: CMakeFiles/nia_start_core.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2
    
    The CUDA Toolkit version is 12.5.

    Possible Root Causes

    • Incompatible GCC versions: The errors suggest that these GCC versions are either too recent for, or not fully compatible with, CUDA Toolkit 12.5, especially in how nvcc handles the intrinsic and built-in functions that GCC's headers use for instruction-set extensions such as AVX-512 and AMX.
    • nvcc warning: The warning about an incompatible redefinition of the 'compiler-bindir' option might indicate that the host compiler is being set twice, with conflicting values, between CMake and nvcc.
    • GCC intrinsics: The undefined built-ins in amxtileintrin.h and avx512bf16vlintrin.h suggest that nvcc does not recognize certain AVX-512 and AMX built-in functions used by the headers that ship with these GCC versions.
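
    Because the 'compiler-bindir' warning says only the last value was used, it can help to confirm which host compiler actually built a file. The following is a minimal sketch, not taken from the project above: the function names are hypothetical, and it relies only on the version macros that GCC and nvcc predefine. It can report anything only for a translation unit that compiles successfully.

    ```cpp
    // Hypothetical helper (not part of the project in this thread): reports
    // which compilers processed this translation unit. It builds as plain C++
    // and should also build as a .cu file.
    #include <string>

    // Host compiler version, read from GCC's predefined macros.
    std::string host_compiler() {
    #if defined(__GNUC__) && !defined(__clang__)
        return "GCC " + std::to_string(__GNUC__) + "." +
               std::to_string(__GNUC_MINOR__) + "." +
               std::to_string(__GNUC_PATCHLEVEL__);
    #else
        return "unknown (not GCC)";
    #endif
    }

    // nvcc version, read from the macros nvcc predefines. They are absent
    // when the file is compiled by the host compiler alone.
    std::string nvcc_version() {
    #if defined(__CUDACC_VER_MAJOR__)
        return std::to_string(__CUDACC_VER_MAJOR__) + "." +
               std::to_string(__CUDACC_VER_MINOR__) + "." +
               std::to_string(__CUDACC_VER_BUILD__);
    #else
        return "none (not compiled by nvcc)";
    #endif
    }

    // One-line summary suitable for printing at program start.
    std::string describe_toolchain() {
        return "host compiler: " + host_compiler() + ", nvcc: " + nvcc_version();
    }
    ```

    Printing describe_toolchain() from main in a small file built by the same CMake target shows whether the intended g++ was picked up.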
  • @ViNN280801

    I have the same issue:

    Of course! It is an nvcc bug...

    Issue Report: Compilation Failure with Various GCC Versions and CUDA Toolkit 12.5

    Did you report it directly to NVIDIA? Unless they get a direct bug report on their tracker, nothing will happen.

    Also why did you use CUDA 12.5 in the second post while you have nvcc 12.6 in the first post?

    Edited by Jakub Klinkovský
  • @lahwaacz, yes, I reported it at the following link, but the report still has no answers.

    Also why did you use CUDA 12.5 in the second post while you have nvcc 12.6 in the first post?

    I tried to build it with different configurations:

    GCC version   CUDA compiler (nvcc) version
    11            12.5
    12            12.5
    13            12.5
    14            12.5
    11            12.6
    12            12.6
    13            12.6
    14            12.6
  • @lahwaacz, you can view my old post on the NVIDIA Forums or this post on GitHub for the solution.
