Main vector types

The library has a number of types that correspond to various types of data that may be stored within a SIMD registers. The types may be categorized in two dimensions: the type of data stored within a single element (lane) of the wrapped SIMD register and the number of such elements wrapped by the type.

The following element types are supported:

signed integers: 8, 16, 32 and 64-bit wide
unsigned integers: 8, 16, 32 and 64-bit wide
floating-point numbers: 32 and 64-bit wide
integer masks: with elements 8, 16, 32 and 64-bits wide
floating-point masks: with elements 32 and 64-bits wide

Let's for the moment forget about masks, and why there are two different kinds of them; they will be described later.

The number of elements that may be contained within a vector type may be any power of two, which larger than certain minimum bound that is dependent on element type. Vectors containing 8, 16, 32 and 64-bit elements must have at least 16, 8, 4 and 2 of them respectively.

A vector may correspond to multiple native SIMD registers if it contains more elements than can be supported by a single register on that particular architecture. For example, an instance of int32x8 type maps to two instances of __m128i type on SSE2, but to a single instance of __m256i on AVX2. As a result, non-homogeneous SIMD instruction sets that for example support 128-bit SIMD registers for integer data and 256-bit registers for floating-point data may be supported without losing a bit of efficiency.

The actual physical layout of a vector type is undefined. In particular, this means that the user must use the library functions to store or load vectors from memory and also not depend on sizeof operator.

The following class templates are provided for non-mask types:

template<unsigned N, class Expr = void> class int8;
template<unsigned N, class Expr = void> class int16;
template<unsigned N, class Expr = void> class int32;
template<unsigned N, class Expr = void> class int64;
template<unsigned N, class Expr = void> class uint8;
template<unsigned N, class Expr = void> class uint16;
template<unsigned N, class Expr = void> class uint32;
template<unsigned N, class Expr = void> class uint64;
template<unsigned N, class Expr = void> class float32;
template<unsigned N, class Expr = void> class float64;

Here N is the number of elements within vector.

The Expr template parameter is used to support expression templates. Most of the user code will not use non-default values of it.

Masks

Masts are special vector types that are similar to their regular counterparts from user's perspective. The difference is that masks store one bit of information per element: either all bits are ones or zeroes. The physical layout is undefined similarly to the regular vector types. On certain instruction sets such as AVX512 each mask element occupies single physical bit, on others it is effectively a regular vector that stores either ones or zeroes in its elements.

The following class templates are provided for mask types:

template<unsigned N, class Expr = void> class mask_int8;
template<unsigned N, class Expr = void> class mask_int16;
template<unsigned N, class Expr = void> class mask_int32;
template<unsigned N, class Expr = void> class mask_int64;
template<unsigned N, class Expr = void> class mask_float32;
template<unsigned N, class Expr = void> class mask_float64;

Here N is the number of elements within mask vector and Expr is used to implement expression templates.

Different types are used for floating-point and integer masks not without a reason: on certain architectures integer and floating point operations are implemented in different processor "domains" with extra latency to pass data between them. Separate types are used to select instructions that operate in correct domain to avoid that extra latency.

Types
Operations
Bitwise operations
Floating-point operations
Integer operations
Memory access operations
Shuffle operations
Miscellaneous operations

Vector types
Type hierarchy
Vector expressions
Vector type promotion