libsimdpp
1.0
|
Functions | |
template<unsigned id> | |
uint8x16 | simdpp::insert (uint8x16 a, uint8_t x) |
Inserts an element into int8x16 vector at the position identified by id. More... | |
template<unsigned id> | |
uint16x8 | simdpp::insert (uint16x8 a, uint16_t x) |
Inserts an element into int16x8 vector at the position identified by id. More... | |
template<unsigned id> | |
uint32x4 | simdpp::insert (uint32x4 a, uint32_t x) |
Inserts an element into int32x4 vector at the position identified by id. More... | |
template<unsigned id> | |
uint64x2 | simdpp::insert (uint64x2 a, uint64_t x) |
Inserts an element into int64x2 vector at the position identified by id. More... | |
template<unsigned id> | |
float32x4 | simdpp::insert (float32x4 a, float x) |
Inserts an element into float32x4 vector at the position identified by id. More... | |
template<unsigned id> | |
float64x2 | simdpp::insert (float64x2 a, double x) |
Inserts an element into float64x2 vector at the position identified by id. More... | |
template<unsigned id> | |
uint8_t | simdpp::extract (uint8x16 a) |
Extracts the id-th element from int8x16 vector. More... | |
template<unsigned id> | |
int8_t | simdpp::extract (int8x16 a) |
Extracts the id-th element from int8x16 vector. More... | |
template<class E1 , class E2 > | |
uint8x32 | simdpp::combine (uint8< 16, E1 > a, uint8< 16, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
uint16x16 | simdpp::combine (uint16< 8, E1 > a, uint16< 8, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
uint32x8 | simdpp::combine (uint32< 4, E1 > a, uint32< 4, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
uint64x4 | simdpp::combine (uint64< 2, E1 > a, uint64< 2, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
int16x16 | simdpp::combine (int16< 8, E1 > a, int16< 8, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
int32x8 | simdpp::combine (int32< 4, E1 > a, int32< 4, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
int64x4 | simdpp::combine (int64< 2, E1 > a, int64< 2, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
float32x8 | simdpp::combine (float32< 4, E1 > a, float32< 4, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<class E1 , class E2 > | |
float64x4 | simdpp::combine (float64< 2, E1 > a, float64< 2, E2 > b) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
uint8< N *2 > | simdpp::combine (uint8< N, E1 > a1, uint8< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
uint16< N *2 > | simdpp::combine (uint16< N, E1 > a1, uint16< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
uint32< N *2 > | simdpp::combine (uint32< N, E1 > a1, uint32< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
uint64< N *2 > | simdpp::combine (uint64< N, E1 > a1, uint64< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
int8< N *2 > | simdpp::combine (int8< N, E1 > a1, int8< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
int16< N *2 > | simdpp::combine (int16< N, E1 > a1, int16< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
int32< N *2 > | simdpp::combine (int32< N, E1 > a1, int32< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
int64< N *2 > | simdpp::combine (int64< N, E1 > a1, int64< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
float32< N *2 > | simdpp::combine (float32< N, E1 > a1, float32< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
template<unsigned N, class E1 , class E2 > | |
float64< N *2 > | simdpp::combine (float64< N, E1 > a1, float64< N, E2 > a2) |
Combines two 128-bit vectors into a 256-bit vector. More... | |
Detailed Description
Function Documentation
uint8x32 simdpp::combine | ( | uint8< 16, E1 > | a, |
uint8< 16, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint16x16 simdpp::combine | ( | uint16< 8, E1 > | a, |
uint16< 8, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint32x8 simdpp::combine | ( | uint32< 4, E1 > | a, |
uint32< 4, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint64x4 simdpp::combine | ( | uint64< 2, E1 > | a, |
uint64< 2, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int16x16 simdpp::combine | ( | int16< 8, E1 > | a, |
int16< 8, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int32x8 simdpp::combine | ( | int32< 4, E1 > | a, |
int32< 4, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int64x4 simdpp::combine | ( | int64< 2, E1 > | a, |
int64< 2, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
float32x8 simdpp::combine | ( | float32< 4, E1 > | a, |
float32< 4, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
float64x4 simdpp::combine | ( | float64< 2, E1 > | a, |
float64< 2, E2 > | b | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint8<N*2> simdpp::combine | ( | uint8< N, E1 > | a1, |
uint8< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint16<N*2> simdpp::combine | ( | uint16< N, E1 > | a1, |
uint16< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint32<N*2> simdpp::combine | ( | uint32< N, E1 > | a1, |
uint32< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint64<N*2> simdpp::combine | ( | uint64< N, E1 > | a1, |
uint64< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int8<N*2> simdpp::combine | ( | int8< N, E1 > | a1, |
int8< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int16<N*2> simdpp::combine | ( | int16< N, E1 > | a1, |
int16< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int32<N*2> simdpp::combine | ( | int32< N, E1 > | a1, |
int32< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
int64<N*2> simdpp::combine | ( | int64< N, E1 > | a1, |
int64< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
float32<N*2> simdpp::combine | ( | float32< N, E1 > | a1, |
float32< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
float64<N*2> simdpp::combine | ( | float64< N, E1 > | a1, |
float64< N, E2 > | a2 | ||
) |
Combines two 128-bit vectors into a 256-bit vector.
- In AVX2 this intrinsic results in at least 1 instructions.
- In SSE2-AVX, NEON and ALTIVEC this intrinsic results in at least 0 instructions.
uint8_t simdpp::extract | ( | uint8x16 | a) |
Extracts the id-th element from int8x16 vector.
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 1-2 instructions.
- In SSE4.1-AVX this intrinsic results in at least 1 instructions.
- In ALTIVEC this intrinsic results in at least 2 instructions.
int8_t simdpp::extract | ( | int8x16 | a) |
Extracts the id-th element from int8x16 vector.
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 1-2 instructions.
- In SSE4.1-AVX this intrinsic results in at least 1 instructions.
- In ALTIVEC this intrinsic results in at least 2 instructions.
uint8x16 simdpp::insert | ( | uint8x16 | a, |
uint8_t | x | ||
) |
Inserts an element into int8x16 vector at the position identified by id.
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 4-5 instructions.
- In ALTIVEC this intrinsic results in at least 3 instructions.
uint16x8 simdpp::insert | ( | uint16x8 | a, |
uint16_t | x | ||
) |
Inserts an element into int16x8 vector at the position identified by id.
This function may have very high latency.
- In ALTIVEC this intrinsic results in at least 3 instructions.
uint32x4 simdpp::insert | ( | uint32x4 | a, |
uint32_t | x | ||
) |
Inserts an element into int32x4 vector at the position identified by id.
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 4 instructions.
- In ALTIVEC this intrinsic results in at least 3 instructions.
uint64x2 simdpp::insert | ( | uint64x2 | a, |
uint64_t | x | ||
) |
Inserts an element into int64x2 vector at the position identified by id.
This function may have very high latency.
- In SSE2, SSE3 and SSSE3 this intrinsic results in at least 2 instructions.
- In SSE4_1 this intrinsic results in at least 1 instructions.
- In SSE2_32bit, SSE3_32bit and SSSE3_32bit this intrinsic results in at least 4 instructions.
- In SSE4_1_32bit this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 3 instructions.
float32x4 simdpp::insert | ( | float32x4 | a, |
float | x | ||
) |
Inserts an element into float32x4 vector at the position identified by id.
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 4 instructions.
- In ALTIVEC this intrinsic results in at least 3 instructions.
float64x2 simdpp::insert | ( | float64x2 | a, |
double | x | ||
) |
Inserts an element into float64x2 vector at the position identified by id.
This function potentially
This function may have very high latency.
- In SSE2-SSSE3 this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 3 instructions.
Generated on Tue Apr 8 2014 03:14:34 for libsimdpp by
