Description
Environment
- GCC 13.2.0
- ARM Neoverse V1 (in an AWS instance)
- GCC build flags: -ftree-vectorize -mcpu=native -fno-math-errno -O3 -DNDEBUG -std=c++17 -fPIC -march=native
- xsimd version: 12.1.1
Error
I'm running into a build issue when compiling code that uses xsimd:
/tmp/bot/easybuild/build/DP3/6.0/foss-2023b/DP3/antennaflagger/Flagger.cc:116:66: required from here
/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/IDG/1.2.0-foss-2023b/include/xsimd/arch/xsimd_neon.hpp:943:36: error: could not convert dispatcher.xsimd::kernel::detail::neon_dispatcher_base<xsimd::kernel::detail::comp_return_type, __Uint8x16_t, __Int8x16_t, __Uint16x8_t, __Int16x8_t, __Uint32x4_t, __Int32x4_t, __Float32x4_t>::binary::apply<__Float32x4_t>((& lhs)->xsimd::batch<float, xsimd::i8mm<xsimd::neon64> >::<anonymous>.xsimd::types::simd_register<float, xsimd::i8mm<xsimd::neon64> >::<anonymous>.xsimd::types::simd_register<float, xsimd::neon64>::<anonymous>.xsimd::types::simd_register<float, xsimd::neon>::operator register_type(), (& rhs)->xsimd::batch<float, xsimd::i8mm<xsimd::neon64> >::<anonymous>.xsimd::types::simd_register<float, xsimd::i8mm<xsimd::neon64> >::<anonymous>.xsimd::types::simd_register<float, xsimd::neon64>::<anonymous>.xsimd::types::simd_register<float, xsimd::neon>::operator register_type()) from xsimd::kernel::detail::comp_return_type<__Float32x4_t> {aka uint32x4_t} to xsimd::batch_bool<float, xsimd::i8mm<xsimd::neon64> >
943 | return dispatcher.apply(register_type(lhs), register_type(rhs));
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| xsimd::kernel::detail::comp_return_type<__Float32x4_t> {aka uint32x4_t}
The same code, with the same compiler flags / compiler version builds fine on Neoverse N1 (and on zen2, zen3, haswell and skylake by the way).
I've tried to dig into the code of xsimd a bit, but in the above error I'm in a bit over my head when it comes to all the types flying around :) Hoping that someone with more expertise in xsimd spots where this might be going wrong... My bet is there was some change in terms of datatypes, intrinsics, or similar in Neoverse V1 that was not accounted for (yet) in xsimd that makes this go wrong compared to e.g. Neoverse N1.
Not sure if this is useful, but here is an overview of the supported instructions on N1 vs V1. On Neoverse N1:
$ lscpu | grep Flags
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
And for Neoverse V1:
$ lscpu | grep Flags
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs dcpodp svei8mm svebf16 i8mm bf16 dgh rng
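To make the difference between the two flag lists easier to see, here is a small sketch that computes which flags appear on only one of the two CPUs. The helper names `parse_flags` and `only_in` are made up for this illustration and are not part of xsimd; the flag strings are copied from the `lscpu` output above. It does not prove the cause of the build error, but it does show that `i8mm`, which also appears in the failing type `xsimd::i8mm<xsimd::neon64>`, is among the flags present only on V1.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split a whitespace-separated flag list
// (as printed by lscpu) into a sorted set of flag names.
std::set<std::string> parse_flags(const std::string& line) {
    std::set<std::string> flags;
    std::istringstream in(line);
    std::string flag;
    while (in >> flag) {
        flags.insert(flag);
    }
    return flags;
}

// Hypothetical helper: return the flags that are in `a` but not in `b`.
std::set<std::string> only_in(const std::set<std::string>& a,
                              const std::set<std::string>& b) {
    std::set<std::string> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}

// Flag lists copied from the lscpu output in this report.
const std::string n1_flags =
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid "
    "asimdrdm lrcpc dcpop asimddp ssbs";
const std::string v1_flags =
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid "
    "asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm "
    "dit uscat ilrcpc flagm ssbs dcpodp svei8mm svebf16 i8mm bf16 dgh rng";
```

Calling `only_in(parse_flags(v1_flags), parse_flags(n1_flags))` gives the V1-only flags (including `sve`, `i8mm`, `bf16`, `svei8mm` and `svebf16`), while the reverse call gives an empty set, since every N1 flag is also present on V1.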
N.B. None of this code is mine: I'm just the guy having the pleasure of trying to build it on different hardware architectures :)