simdpp::permute_zbytes16

template<unsigned N, class E1, class E2, class E3>
Ret<N,_DETAIL_> permute_zbytes16( const Vec1<N,E1>& a, const Mask<N,E3>& mask );

Permutes the elements of vector a according to a mask, optionally zeroing selected elements. The shuffling is done at 16-byte granularity: each output element can select only from elements in the same 16-byte lane of a.

Each byte within the mask defines which element to select for the output vector:

  • Bit 7 defines whether to zero the element. A value of 1 zeroes the element.
  • Bits 6-4 must be zero; otherwise the behavior is undefined.
  • Bits 3-0 give the index (0 to 15) of the source position within the same 16-byte lane of a.
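
The bit layout above can be illustrated with a scalar model of a single 16-byte lane. This is a minimal sketch in plain C++, not the library's implementation: the names Lane and permute_zbytes16_lane_model are hypothetical, and the assert only stands in for the undefined behavior that the real function has when bits 6-4 are set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one 16-byte lane (not part of libsimdpp).
using Lane = std::array<std::uint8_t, 16>;

Lane permute_zbytes16_lane_model(const Lane& a, const Lane& mask) {
    Lane r{};
    for (std::size_t i = 0; i < 16; ++i) {
        const std::uint8_t m = mask[i];
        // Bits 6-4 must be zero; the model rejects such masks outright.
        assert((m & 0x70) == 0);
        if (m & 0x80) {
            r[i] = 0;            // bit 7 set: the output byte is zeroed
        } else {
            r[i] = a[m & 0x0F];  // bits 3-0: source index within the lane
        }
    }
    return r;
}
```

For example, if a holds the values 100 to 115 and mask byte i is set to 15 - i, the model returns the lane in reverse order; setting any mask byte to 0x80 instead makes the corresponding output byte zero.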

Use make_shuffle_bytes16_mask() to create masks with known contents for shuffling elements of various sizes.
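
To show what a mask for wider elements might look like at the byte level, the sketch below expands per-element selectors into a 16-byte mask. It is a hypothetical helper written in plain C++, not make_shuffle_bytes16_mask() itself. Its name, its signature, and the use of a negative selector to request zeroing are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libsimdpp): expands one selector per
// element into one selector per byte for a single 16-byte lane.
// Assumption: a negative selector means "zero this element".
template<std::size_t ElemBytes>
std::array<std::uint8_t, 16>
expand_element_mask(const std::array<int, 16 / ElemBytes>& sel) {
    static_assert(ElemBytes == 1 || ElemBytes == 2 ||
                  ElemBytes == 4 || ElemBytes == 8,
                  "element size must be 1, 2, 4 or 8 bytes");
    std::array<std::uint8_t, 16> mask{};
    for (std::size_t e = 0; e < sel.size(); ++e) {
        for (std::size_t b = 0; b < ElemBytes; ++b) {
            if (sel[e] < 0) {
                mask[e * ElemBytes + b] = 0x80;  // bit 7 set: zero the byte
            } else {
                // The selector must name an element inside the lane.
                assert(static_cast<std::size_t>(sel[e]) < sel.size());
                // Byte b of the selected element; bits 6-4 stay clear.
                mask[e * ElemBytes + b] =
                    static_cast<std::uint8_t>(sel[e] * ElemBytes + b);
            }
        }
    }
    return mask;
}
```

Under these assumptions, expand_element_mask<4> applied to the selectors {3, -1, 0, 1} yields the bytes 12, 13, 14, 15, then four bytes of 0x80, then 0, 1, 2, 3 and 4, 5, 6, 7.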

The implementation behaves as if the following set of overloads is provided:

Ret                      Vec1                Vec2                Mask
promoted 8-bit vector    any 8-bit vector    any 8-bit vector    uint8
promoted 16-bit vector   any 16-bit vector   any 16-bit vector   uint16
promoted 32-bit vector   any 32-bit vector   any 32-bit vector   uint32
promoted 64-bit vector   any 64-bit vector   any 64-bit vector   uint64

The type of the return vector is governed by the expression promotion rules. The return type is a vector expression.

The vectors are shuffled at 16-byte granularity as follows: the vectors are partitioned into 128-bit lanes. The n-th lane of the return vector is governed by the n-th lane of the mask vector and contains elements only from the n-th lane of a.
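
The lane rule can be modelled for vectors longer than 16 bytes with a scalar loop over lanes. The sketch below is plain C++ and does not reflect how the library is implemented: the function name permute_zbytes16_model is hypothetical, and a std::vector of bytes stands in for a SIMD vector whose size is a multiple of 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model (not part of libsimdpp) of the lane rule.
std::vector<std::uint8_t>
permute_zbytes16_model(const std::vector<std::uint8_t>& a,
                       const std::vector<std::uint8_t>& mask) {
    assert(a.size() == mask.size() && a.size() % 16 == 0);
    std::vector<std::uint8_t> r(a.size(), 0);
    // Each 128-bit lane is processed independently of the others.
    for (std::size_t lane = 0; lane < a.size(); lane += 16) {
        for (std::size_t i = 0; i < 16; ++i) {
            const std::uint8_t m = mask[lane + i];
            assert((m & 0x70) == 0);  // bits 6-4 must be zero
            if (!(m & 0x80)) {
                // The index in bits 3-0 is relative to the start of the
                // current lane, so other lanes are never read.
                r[lane + i] = a[lane + (m & 0x0F)];
            }
        }
    }
    return r;
}
```

With a 32-byte input holding the values 0 to 31 and an all-zero mask, the first 16 output bytes are 0 and the last 16 are 16, because index 0 refers to the first byte of each lane rather than the first byte of the whole vector.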

Parameters

a - source vector
mask - mask vector

Return value

A vector expression evaluating to the permuted vector.

See also

permutes elements in a vector according to another vector
(function template)