libsimdpp  1.0
Dynamic dispatch

If several versions of the same code, each compiled for a different set of architectures, are to be included in the same executable, all such code must be placed in the SIMDPP_ARCH_NAMESPACE namespace. This macro evaluates to an identifier that is unique for each architecture.
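The role of the per-architecture namespace can be illustrated without the library. The following is only a minimal sketch under stated assumptions: DEMO_ARCH_NAMESPACE, arch_null and arch_sse2 are hypothetical names standing in for SIMDPP_ARCH_NAMESPACE and the identifiers it evaluates to, and redefining the macro by hand inside one file stands in for compiling the same source file several times with different compiler options.

```cpp
#include <string>

// Hypothetical stand-in for SIMDPP_ARCH_NAMESPACE. In a real build the
// identifier would follow from the architecture macros given on the
// compiler command line; here it is redefined by hand for illustration.
#define DEMO_ARCH_NAMESPACE arch_null
namespace DEMO_ARCH_NAMESPACE {
std::string arch_name() { return "null"; }
} // namespace DEMO_ARCH_NAMESPACE
#undef DEMO_ARCH_NAMESPACE

// "Second compilation" of the same function for another architecture.
#define DEMO_ARCH_NAMESPACE arch_sse2
namespace DEMO_ARCH_NAMESPACE {
std::string arch_name() { return "sse2"; }
} // namespace DEMO_ARCH_NAMESPACE
#undef DEMO_ARCH_NAMESPACE
```

Both definitions share the same name and signature, yet they end up as arch_null::arch_name and arch_sse2::arch_name, so they can coexist in one executable without a duplicate-symbol error.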

In addition to the above, the source file must not define any of the architecture selection macros; they must be supplied via the compiler options. The code for the NONE_NULL architecture must be linked into the resulting executable.

To use the dynamic dispatch mechanism, declare the function within the SIMDPP_ARCH_NAMESPACE namespace and then use one of the SIMDPP_MAKE_DISPATCHER_*** macros.

Dynamic dispatch example

The following example demonstrates the simplest usage of dynamic dispatch:

// test.h
void print_arch();

// test.cc
#include "test.h"
#include <simdpp/simd.h>
#include <iostream>

namespace SIMDPP_ARCH_NAMESPACE {

void print_arch()
{
    std::cout << static_cast<unsigned>(simdpp::this_compile_arch()) << '\n';
}

} // namespace SIMDPP_ARCH_NAMESPACE

SIMDPP_MAKE_DISPATCHER_VOID0(print_arch);

// main.cc
#include "test.h"

int main()
{
    print_arch();
}

# Makefile
CXXFLAGS = -std=c++11

test: main.o test_sse2.o test_sse3.o test_sse4_1.o test_null.o
	g++ $^ -lpthread -o test

main.o: main.cc
	g++ main.cc $(CXXFLAGS) -c -o main.o

# inclusion of NONE_NULL is mandatory
test_null.o: test.cc
	g++ test.cc -c $(CXXFLAGS) -o test_null.o

test_sse2.o: test.cc
	g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE2 -msse2 -o test_sse2.o

test_sse3.o: test.cc
	g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE3 -msse3 -o test_sse3.o

test_sse4_1.o: test.cc
	g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE4_1 -msse4.1 -o test_sse4_1.o

If compiled, the above example selects the "fastest" of the SSE2, SSE3 or SSE4.1 instruction sets, whichever is available on the target processor, and outputs an integer that identifies that instruction set.

Note that the object files must be linked directly into the executable. If static libraries are used, the linker may discard the static dispatcher registration code and break the mechanism. To prevent this behavior, -Wl,--whole-archive or an equivalent flag must be used.
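Why unreferenced registration code is fragile can be seen in a small self-contained sketch of the general self-registration pattern. This is not libsimdpp's actual implementation: Version, Registrar, registry and select_version are hypothetical names, and treating an architecture as a single ordered integer is a simplifying assumption made only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration only.
using ArchId = unsigned;        // assumption: a larger value means a "faster" instruction set
using VersionFn = ArchId (*)();

struct Version {
    ArchId required;  // instruction set level this version needs
    VersionFn fn;     // the function compiled for that level
};

// The function-local static is constructed on first use, so it is valid
// even when called from the constructors of other static objects.
inline std::vector<Version>& registry()
{
    static std::vector<Version> versions;
    return versions;
}

// Constructing a Registrar adds one version to the registry. Objects of
// this type are created during static initialization, before main() runs.
struct Registrar {
    Registrar(ArchId required, VersionFn fn)
    {
        registry().push_back({required, fn});
    }
};

namespace arch_null { ArchId which() { return 0; } }
namespace arch_sse2 { ArchId which() { return 1; } }
namespace arch_sse3 { ArchId which() { return 2; } }

// Nothing else in the program refers to these objects. If they were
// members of a static library, the linker could leave their object files
// out, and the corresponding versions would never be registered.
static Registrar reg_null(0, arch_null::which);
static Registrar reg_sse2(1, arch_sse2::which);
static Registrar reg_sse3(2, arch_sse3::which);

// Returns the registered version with the highest requirement that does
// not exceed what the processor supports.
VersionFn select_version(ArchId supported)
{
    const Version* best = nullptr;
    for (const Version& v : registry()) {
        if (v.required <= supported && (!best || v.required > best->required))
            best = &v;
    }
    assert(best && "a baseline version must always be registered");
    return best->fn;
}
```

With all three registrars linked in, select_version(1) returns the SSE2 variant and select_version(0) falls back to the baseline. If the linker dropped the object file containing a registrar, that version would silently disappear from the selection, which is the failure mode that the whole-archive flag is meant to prevent.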

CMake

For CMake users, cmake/SimdppMultiarch.cmake contains several useful functions:

  • simdpp_get_compilable_archs: checks what architectures are supported by the compiler.
  • simdpp_get_runnable_archs: checks what architectures are supported by both the compiler and the current processor.
  • simdpp_multiarch: given a list of architectures (possibly generated by simdpp_get_compilable_archs or simdpp_get_runnable_archs), automatically configures compilation of additional objects. The user only needs to add the returned list of source files to add_library or add_executable.

The above example may be built with a CMakeLists.txt as simple as the following:

cmake_minimum_required(VERSION 2.8.0)
project(test)
include(SimdppMultiarch)
simdpp_get_runnable_archs(RUNNABLE_ARCHS)
simdpp_multiarch(GEN_ARCH_FILES test.cc ${RUNNABLE_ARCHS})
add_executable(test main.cc ${GEN_ARCH_FILES})
target_link_libraries(test pthread)
set_target_properties(test PROPERTIES COMPILE_FLAGS "-std=c++11")