libsimdpp
0.9.3
Functions
int128 | simdpp::load (int128 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an aligned memory location.
int256 | simdpp::load (int256 &a, const void *p)
float32x4 | simdpp::load (float32x4 &a, const float *p)
float32x8 | simdpp::load (float32x8 &a, const float *p)
float64x2 | simdpp::load (float64x2 &a, const double *p)
float64x4 | simdpp::load (float64x4 &a, const double *p)
basic_int8x16 | simdpp::load_u (basic_int8x16 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int16x8 | simdpp::load_u (basic_int16x8 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int32x4 | simdpp::load_u (basic_int32x4 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int64x2 | simdpp::load_u (basic_int64x2 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
float32x4 | simdpp::load_u (float32x4 &a, const float *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
float64x2 | simdpp::load_u (float64x2 &a, const double *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int8x32 | simdpp::load_u (basic_int8x32 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int16x16 | simdpp::load_u (basic_int16x16 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int32x8 | simdpp::load_u (basic_int32x8 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
basic_int64x4 | simdpp::load_u (basic_int64x4 &a, const void *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
float32x8 | simdpp::load_u (float32x8 &a, const float *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
float64x4 | simdpp::load_u (float64x4 &a, const double *p)
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
void | simdpp::load_packed2 (basic_int8x16 &a, basic_int8x16 &b, const void *p)
Loads 8-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int8x32 &a, basic_int8x32 &b, const void *p)
Loads 8-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int16x8 &a, basic_int16x8 &b, const void *p)
Loads 16-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int16x16 &a, basic_int16x16 &b, const void *p)
Loads 16-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int32x4 &a, basic_int32x4 &b, const void *p)
Loads 32-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int32x8 &a, basic_int32x8 &b, const void *p)
Loads 32-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int64x2 &a, basic_int64x2 &b, const void *p)
Loads 64-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
void | simdpp::load_packed2 (basic_int64x4 &a, basic_int64x4 &b, const void *p)
Loads 64-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
Function Documentation
int128 simdpp::load (int128 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an aligned memory location.
- 128-bit version:
p must be aligned to 16 bytes.
- 256-bit version:
p must be aligned to 32 bytes.
- In SSE2-SSE4.1, NEON and ALTIVEC this intrinsic results in at least 2 instructions.
- In AVX (integer vectors) this intrinsic results in at least 2 instructions.
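The alignment requirements above can be checked before calling the aligned load. The following is a minimal sketch using only the standard library; `is_aligned`, `buf128` and `buf256` are hypothetical names introduced for illustration and are not part of libsimdpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libsimdpp): returns true if p is a
// multiple of `alignment` bytes. `alignment` must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Example buffers that satisfy the documented requirements:
// 16-byte alignment for the 128-bit version, 32-byte for the 256-bit one.
alignas(16) static float buf128[4] = {1.0f, 2.0f, 3.0f, 4.0f};
alignas(32) static float buf256[8] = {1.0f, 2.0f, 3.0f, 4.0f,
                                      5.0f, 6.0f, 7.0f, 8.0f};
```

A pointer such as `buf256 + 1` is still aligned to the element size but no longer to 32 bytes, so it would not meet the precondition of the aligned load.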
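int256 simdpp::load (int256 &a, const void *p) | inline |
float32x4 simdpp::load (float32x4 &a, const float *p) | inline |
float32x8 simdpp::load (float32x8 &a, const float *p) | inline |
float64x2 simdpp::load (float64x2 &a, const double *p) | inline |
float64x4 simdpp::load (float64x4 &a, const double *p) | inline |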
void simdpp::load_packed2 (basic_int8x16 &a, basic_int8x16 &b, const void *p) | inline |
Loads 8-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+30) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+31) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+62) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+63) ]
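The de-interleaving described above can be modelled with a scalar loop. This is only a reference sketch of the 128-bit, 8-bit case, not libsimdpp's implementation; `Pair8x16` and `load_packed2_ref` are hypothetical names, and plain arrays stand in for the vector types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two 16-lane results, standing in for the two output vectors.
struct Pair8x16 {
    std::array<std::uint8_t, 16> a;
    std::array<std::uint8_t, 16> b;
};

// Scalar reference model: reads 32 packed bytes and splits them into
// even-indexed elements (a) and odd-indexed elements (b).
inline Pair8x16 load_packed2_ref(const std::uint8_t* p)
{
    assert(p != nullptr);
    Pair8x16 r{};
    for (std::size_t i = 0; i < 16; ++i) {
        r.a[i] = p[2 * i];      // *(p),   *(p+2), ..., *(p+30)
        r.b[i] = p[2 * i + 1];  // *(p+1), *(p+3), ..., *(p+31)
    }
    return r;
}
```

With an input holding the values 0 to 31, `a` receives the even values and `b` the odd ones.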
void simdpp::load_packed2 (basic_int8x32 &a, basic_int8x32 &b, const void *p) | inline |
Loads 8-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+30) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+31) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+62) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+63) ]
void simdpp::load_packed2 (basic_int16x8 &a, basic_int16x8 &b, const void *p) | inline |
Loads 16-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+14) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+15) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+30) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+31) ]
void simdpp::load_packed2 (basic_int16x16 &a, basic_int16x16 &b, const void *p) | inline |
Loads 16-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+14) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+15) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+30) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+31) ]
void simdpp::load_packed2 (basic_int32x4 &a, basic_int32x4 &b, const void *p) | inline |
Loads 32-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), *(p+6) ]
  b = [ *(p+1), *(p+3), *(p+5), *(p+7) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+14) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+15) ]
void simdpp::load_packed2 (basic_int32x8 &a, basic_int32x8 &b, const void *p) | inline |
Loads 32-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2), *(p+4), *(p+6) ]
  b = [ *(p+1), *(p+3), *(p+5), *(p+7) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), ... , *(p+14) ]
  b = [ *(p+1), *(p+3), *(p+5), ... , *(p+15) ]
void simdpp::load_packed2 (basic_int64x2 &a, basic_int64x2 &b, const void *p) | inline |
Loads 64-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2) ]
  b = [ *(p+1), *(p+3) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), *(p+6) ]
  b = [ *(p+1), *(p+3), *(p+5), *(p+7) ]
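The element offsets follow from the lane count: a vector of V bytes holds V / sizeof(T) lanes, so the two outputs together consume twice that many consecutive elements. The sketch below is a generic scalar reference under that assumption; `Deinterleaved` and `deinterleave2_ref` are hypothetical names, not libsimdpp API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Result pair; Lanes is the number of elements per output vector.
template<class T, std::size_t Lanes>
struct Deinterleaved {
    std::array<T, Lanes> a;
    std::array<T, Lanes> b;
};

// Generic scalar reference: VectorBytes is 16 (128-bit) or 32 (256-bit).
// Reads 2 * lanes consecutive elements starting at p.
template<class T, std::size_t VectorBytes>
Deinterleaved<T, VectorBytes / sizeof(T)> deinterleave2_ref(const T* p)
{
    static_assert(VectorBytes % sizeof(T) == 0,
                  "vector size must be a multiple of the element size");
    constexpr std::size_t lanes = VectorBytes / sizeof(T);
    assert(p != nullptr);
    Deinterleaved<T, lanes> r{};
    for (std::size_t i = 0; i < lanes; ++i) {
        r.a[i] = p[2 * i];      // even offsets: 0, 2, ..., 2*lanes - 2
        r.b[i] = p[2 * i + 1];  // odd offsets:  1, 3, ..., 2*lanes - 1
    }
    return r;
}
```

For 64-bit elements this gives 2 lanes in the 128-bit case (offsets up to 3) and 4 lanes in the 256-bit case (offsets up to 7).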
void simdpp::load_packed2 (basic_int64x4 &a, basic_int64x4 &b, const void *p) | inline |
Loads 64-bit values packed in pairs, de-interleaves them and stores the result into two vectors.
- 128-bit version:
- p must be aligned to 16 bytes.
  a = [ *(p), *(p+2) ]
  b = [ *(p+1), *(p+3) ]
- 256-bit version:
- p must be aligned to 32 bytes.
  a = [ *(p), *(p+2), *(p+4), *(p+6) ]
  b = [ *(p+1), *(p+3), *(p+5), *(p+7) ]
basic_int8x16 simdpp::load_u (basic_int8x16 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
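The memory-access rule above can be written as simple address arithmetic. The sketch below computes the byte range that may be touched for a given address; it assumes the aligned case touches exactly one vector-sized block (16 bytes for 128-bit vectors, 32 bytes for 256-bit vectors). `ByteRange` and `may_access` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Half-open address range [begin, end).
struct ByteRange {
    std::uintptr_t begin;
    std::uintptr_t end;
};

// Returns the range an unaligned load may access, following the rule in
// the text. vec_bytes is 16 (128-bit vector) or 32 (256-bit vector).
inline ByteRange may_access(std::uintptr_t addr, std::size_t vec_bytes)
{
    assert(vec_bytes == 16 || vec_bytes == 32);
    const std::uintptr_t n = static_cast<std::uintptr_t>(vec_bytes);
    const std::uintptr_t mask = n - 1;
    if ((addr & mask) == 0) {
        // Aligned: only the referenced block is accessed.
        return ByteRange{addr, addr + n};
    }
    // Unaligned: the enclosing aligned block of twice the vector size.
    const std::uintptr_t base = addr & ~mask;
    return ByteRange{base, base + 2 * n};
}
```

This matters when the data sits near the end of an allocation: an unaligned load may read bytes past the last element, up to the end of the enclosing block.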
basic_int16x8 simdpp::load_u (basic_int16x8 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int32x4 simdpp::load_u (basic_int32x4 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int64x2 simdpp::load_u (basic_int64x2 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
float32x4 simdpp::load_u (float32x4 &a, const float *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
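The difference between element-size alignment and vector-size alignment can be illustrated with a portable stand-in for an unaligned load of four floats. This sketch uses `std::memcpy` and plain arrays rather than libsimdpp types; `load_u_ref` is a hypothetical name and does not reproduce the over-read behaviour described above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Portable stand-in for an unaligned 4 x float load. The pointer only
// needs to be aligned to the element size, not to 16 bytes.
inline std::array<float, 4> load_u_ref(const float* p)
{
    assert(p != nullptr);
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(float) == 0);
    std::array<float, 4> v;
    std::memcpy(v.data(), p, sizeof(float) * 4);  // exactly 16 bytes are read
    return v;
}
```

Starting one element into a 16-byte aligned buffer gives a pointer that is valid for this kind of load but would violate the precondition of the aligned variant.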
float64x2 simdpp::load_u (float64x2 &a, const double *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int8x32 simdpp::load_u (basic_int8x32 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int16x16 simdpp::load_u (basic_int16x16 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int32x8 simdpp::load_u (basic_int32x8 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
basic_int64x4 simdpp::load_u (basic_int64x4 &a, const void *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
float32x8 simdpp::load_u (float32x8 &a, const float *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
float64x4 simdpp::load_u (float64x4 &a, const double *p) | inline |
Loads a 128-bit or 256-bit integer, 32-bit or 64-bit float vector from an unaligned memory location.
- 128-bit version:
p must be aligned to the element size. If p is aligned to 16 bytes only the referenced 16 byte block is accessed. Otherwise, memory within the smallest 16-byte aligned 32-byte block may be accessed.
- In ALTIVEC this intrinsic results in at least 4 instructions.
- 256-bit version:
p must be aligned to the element size. If p is aligned to 32 bytes, only the referenced 32-byte block is accessed. Otherwise, memory within the smallest 32-byte aligned 64-byte block may be accessed.
- In SSE2-SSE4.1 and NEON this intrinsic results in at least 2 instructions.
- In ALTIVEC this intrinsic results in at least 6 instructions.
Generated on Thu Oct 31 2013 04:08:51 for libsimdpp by Doxygen 1.8.3.1