Instruction set selection


During compilation, libsimdpp must be told explicitly which instruction set to use. This is done by defining one or more of the macros listed in the table below before the first inclusion of the simdpp/simd.h header. Multiple options may be specified: for example, to select AVX and FMA4, define both the SIMDPP_ARCH_X86_AVX and SIMDPP_ARCH_X86_FMA4 macros.
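As a minimal sketch of the AVX plus FMA4 selection described above, the two macros are defined ahead of the header. The `__has_include` guard and the `selected_arch_macros` helper are assumptions made for this illustration and are not part of libsimdpp; the guard only keeps the sketch compilable where the library is not installed.

```cpp
// Select AVX and FMA4. Both macros must appear before the first
// inclusion of simdpp/simd.h in this translation unit.
#define SIMDPP_ARCH_X86_AVX
#define SIMDPP_ARCH_X86_FMA4

// Assumption for this sketch: include the header only when it can be found
// (__has_include is standard since C++17 and a common extension before that).
#if defined(__has_include)
#  if __has_include(<simdpp/simd.h>)
#    include <simdpp/simd.h>
#  endif
#endif

#include <string>
#include <vector>

// Hypothetical helper, not a libsimdpp function: reports which of the two
// selection macros used in this example are visible to the preprocessor.
inline std::vector<std::string> selected_arch_macros()
{
    std::vector<std::string> names;
#ifdef SIMDPP_ARCH_X86_AVX
    names.push_back("SIMDPP_ARCH_X86_AVX");
#endif
#ifdef SIMDPP_ARCH_X86_FMA4
    names.push_back("SIMDPP_ARCH_X86_FMA4");
#endif
    return names;
}
```

The same selection can usually be made on the compiler command line instead, for example with `-DSIMDPP_ARCH_X86_AVX -DSIMDPP_ARCH_X86_FMA4`. With GCC or Clang the matching code-generation flags (here `-mavx` and `-mfma4`) generally have to be passed as well, so that the compiler accepts the corresponding instructions.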


The following instruction sets are supported by the library.

| Instruction set | Macro to enable it | Remarks |
|---|---|---|
| Non-SIMD | (none) | Uses plain C. May be slower than an equivalent C implementation, as it makes it harder for the compiler to reason about the code. The compiler may still vectorize the code if it knows that a certain SIMD instruction set is available. |
| x86 SSE2 | SIMDPP_ARCH_X86_SSE2 | (none) |
| x86 SSE3 | SIMDPP_ARCH_X86_SSE3 | Implies SSE2. |
| x86 SSSE3 | SIMDPP_ARCH_X86_SSSE3 | Implies SSE3. |
| x86 SSE4.1 | SIMDPP_ARCH_X86_SSE4_1 | Implies SSSE3. |
| x86 AVX | SIMDPP_ARCH_X86_AVX | 256-bit vectors for floating-point values, 128-bit vectors for integers. Implies SSE4.1. |
| x86 FMA3 (Intel flavor) | SIMDPP_ARCH_X86_FMA3 | Implies SSE3. |
| x86 FMA4 (AMD flavor) | SIMDPP_ARCH_X86_FMA4 | Implies SSE3. |
| x86 XOP | SIMDPP_ARCH_X86_XOP | Implies SSE3. |
| x86 AVX2 | SIMDPP_ARCH_X86_AVX2 | 256-bit vectors. Implies AVX. |
| x86 AVX512F | SIMDPP_ARCH_X86_AVX512F | 512-bit vectors for floating-point and for 32-bit and 64-bit integer types. 8-bit and 16-bit integer vectors are 256 bits long. Implies AVX2. |
| ARM NEON without floating-point support | SIMDPP_ARCH_ARM_NEON | Does not use SIMD instructions for floating-point computations. The rationale for this mode is that certain NEON implementations have imprecise single-precision floating-point units. |
| ARM NEON with floating-point support | SIMDPP_ARCH_ARM_NEON_FLT_SP | Uses SIMD instructions for single-precision floating-point computations and non-SIMD instructions for double-precision floating-point computations. |
| ARM NEONv2 | SIMDPP_ARCH_ARM_NEON_FLT_SP or SIMDPP_ARCH_ARM_NEON | Automatically enabled when compiling for ARM64. All floating-point computations are done on the NEON unit. |
| PowerPC Altivec | SIMDPP_ARCH_POWER_ALTIVEC | Does not use SIMD for double-precision and 64-bit integer computations. |
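The "Implies" remarks in the table form chains: AVX2 implies AVX, which implies SSE4.1, and so on. The following sketch models that relation as plain data and follows it to list everything a single selection brings in, assuming the relation is transitive. The short names, `kImplies`, and `expand` are labels invented for this illustration; they are not library identifiers and do not show how libsimdpp resolves the macros internally.

```cpp
#include <map>
#include <set>
#include <string>

// Direct "Implies" relation, transcribed from the Remarks column of the table.
// Keys and values are short labels chosen for this sketch only.
static const std::map<std::string, std::string> kImplies = {
    {"SSE3",    "SSE2"},
    {"SSSE3",   "SSE3"},
    {"SSE4.1",  "SSSE3"},
    {"AVX",     "SSE4.1"},
    {"FMA3",    "SSE3"},
    {"FMA4",    "SSE3"},
    {"XOP",     "SSE3"},
    {"AVX2",    "AVX"},
    {"AVX512F", "AVX2"},
};

// Returns the selected instruction set together with every set it implies,
// directly or through a chain of implications.
inline std::set<std::string> expand(const std::string& selected)
{
    std::set<std::string> result;
    std::string current = selected;
    // insert() reports whether the element was new, which also guards
    // against looping forever if the table ever contained a cycle.
    while (result.insert(current).second) {
        auto it = kImplies.find(current);
        if (it == kImplies.end()) {
            break;  // nothing further is implied
        }
        current = it->second;
    }
    return result;
}
```

Under this model, `expand("AVX2")` contains AVX2, AVX, SSE4.1, SSSE3, SSE3, and SSE2, while `expand("FMA4")` contains only FMA4, SSE3, and SSE2. Selecting FMA4 alone therefore does not bring in AVX, which is consistent with the earlier example defining both macros.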