simdpp::permute_bytes16
template<unsigned N, class E1, class E2, class E3>
Ret<N,_DETAIL_> permute_bytes16( const Vec1<N,E1>& a, const Mask<N,E3>& mask );
Permutes elements in vector a according to mask. The shuffling is done at 16-byte granularity: elements within each 16-byte lane can select only elements from the same 16-byte lane.
Each byte within the mask defines which element to select for the output vector:
- Bits 7-4 must be zero or the behavior is undefined.
- Bits 3-0 define the element to select within the same 16-byte lane of a.
Use make_shuffle_bytes16_mask() to create masks with known contents for shuffling elements of various sizes.
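The selector rules above can be illustrated with a scalar model of a single 16-byte lane. This is only a sketch for reasoning about results, not the library's implementation: `Lane`, `permute_lane_ref` and `reverse_lane_example` are hypothetical names that do not exist in libsimdpp, and the model assumes byte-sized elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type: one 16-byte lane viewed as plain bytes.
using Lane = std::array<std::uint8_t, 16>;

// Scalar model of one lane (hypothetical helper, not part of libsimdpp):
// output byte i is the source byte chosen by bits 3-0 of mask byte i.
inline Lane permute_lane_ref(const Lane& a, const Lane& mask)
{
    Lane out{};
    for (std::size_t i = 0; i < 16; ++i) {
        // Bits 7-4 must be zero; the real function leaves this case undefined,
        // so the model rejects it instead of guessing a result.
        assert((mask[i] & 0xF0) == 0);
        out[i] = a[mask[i] & 0x0F];
    }
    return out;
}

// Example: the mask 15, 14, ..., 0 reverses the bytes of the lane.
inline Lane reverse_lane_example()
{
    Lane a{}, mask{};
    for (std::size_t i = 0; i < 16; ++i) {
        a[i]    = static_cast<std::uint8_t>(100 + i);
        mask[i] = static_cast<std::uint8_t>(15 - i);
    }
    return permute_lane_ref(a, mask);
}
```

In this model the first output byte of `reverse_lane_example()` is the last source byte (115) and the last output byte is the first source byte (100).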
The implementation behaves as if the following set of overloads is provided:
Ret | Vec1 | Vec2 | Mask |
---|---|---|---|
promoted 8-bit vector | any 8-bit vector | any 8-bit vector | uint8 |
promoted 16-bit vector | any 16-bit vector | any 16-bit vector | uint16 |
promoted 32-bit vector | any 32-bit vector | any 32-bit vector | uint32 |
promoted 64-bit vector | any 64-bit vector | any 64-bit vector | uint64 |
The type of the return vector is governed by the expression promotion rules. The return type is a vector expression.
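For the overloads on wider elements, it can help to see how an element-level permutation maps onto byte selectors. The sketch below assumes that the 16-bit overload behaves like a byte shuffle in which element s occupies bytes 2*s and 2*s+1 of its lane; that layout is an assumption made for illustration, and `expand_u16_selectors` is a hypothetical helper, not a libsimdpp function. In real code, make_shuffle_bytes16_mask() is the documented way to build such masks.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: expands eight 16-bit element selectors (each 0..7)
// into sixteen byte selectors for one 16-byte lane.
// Assumption: element s occupies bytes 2*s and 2*s+1 of the lane.
inline std::array<std::uint8_t, 16>
expand_u16_selectors(const std::array<std::uint8_t, 8>& sel)
{
    std::array<std::uint8_t, 16> bytes{};
    for (std::size_t i = 0; i < 8; ++i) {
        bytes[2 * i]     = static_cast<std::uint8_t>(2 * sel[i]);     // low byte of element sel[i]
        bytes[2 * i + 1] = static_cast<std::uint8_t>(2 * sel[i] + 1); // high byte of element sel[i]
    }
    return bytes;
}
```

Under this assumption, reversing eight 16-bit elements with selectors 7, 6, ..., 0 yields the byte selectors 14, 15, 12, 13, ..., 0, 1, all of which keep bits 7-4 zero as required.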
The vectors are shuffled at 16-byte granularity as follows: the vectors are partitioned into 128-bit lanes. The n-th lane in the return vector is governed by the n-th lane of the mask vector. The n-th lane in the return vector will contain elements only from the n-th lane of the a vector.
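The lane partitioning described above can be sketched as a scalar model over vectors longer than 16 bytes. This is an illustrative model under the assumption of byte-sized elements, not the library's implementation; `permute_bytes16_ref` and `two_lane_example` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model over whole vectors (hypothetical helper, not part of libsimdpp).
// Both inputs must have the same size, a multiple of 16 bytes.
inline std::vector<std::uint8_t>
permute_bytes16_ref(const std::vector<std::uint8_t>& a,
                    const std::vector<std::uint8_t>& mask)
{
    assert(a.size() == mask.size() && a.size() % 16 == 0);
    std::vector<std::uint8_t> out(a.size());
    for (std::size_t lane = 0; lane < a.size(); lane += 16) {
        for (std::size_t i = 0; i < 16; ++i) {
            const std::uint8_t sel = mask[lane + i];
            assert((sel & 0xF0) == 0);       // bits 7-4 must be zero
            out[lane + i] = a[lane + sel];   // selection never leaves the lane
        }
    }
    return out;
}

// Example: a 32-byte vector holding 0..31, with every selector set to 0.
inline std::vector<std::uint8_t> two_lane_example()
{
    std::vector<std::uint8_t> a(32), mask(32, 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        a[i] = static_cast<std::uint8_t>(i);
    return permute_bytes16_ref(a, mask);
}
```

In `two_lane_example()`, selector 0 broadcasts the first byte of each lane within that lane only: the first 16 output bytes are 0 and the last 16 are 16, because the second lane cannot read from the first.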
Parameters
a | - | source vector |
mask | - | mask vector |
Return value
A vector expression evaluating to the shuffled vector.
See also
permutes or zeroes elements in a vector according to another vector (function template) |