Dynamic dispatch

From cppreference.com

Using libsimdpp it's easy to produce binaries that support multiple instruction sets and automatically select the fastest available on target processor. The implementation of this consists of three parts:

  • All functions that need to be dispatched must be put in SIMDPP_ARCH_NAMESPACE, a special namespace whose name depends on the instruction set that is being compiled for. This is required for dynamic dispatcher to function and also helps to avoid ODR violations. For each function, one of the SIMDPP_MAKE_DISPATCHER_* must be invoked in namespace scope just outside the SIMDPP_ARCH_NAMESPACE namespace the actual function is in.
  • Production of several object files each of which contain compiled code for different target instruction set. The target instruction set is selected by predefining appropriate macros as described here.
  • Production of auxiliary dispatcher code that, on runtime, queries the system about the available instruction sets and based on that information selects the fastest implementation. In libsimdpp, the code is added to one of the mentioned object files by compiling it with additional predefined flags that tell the library that dispatcher code is needed and which instruction sets the user wants to support.

The additional predefined flags for dispatcher support are the following:

  • SIMDPP_EMIT_DISPATCHER - defining it tells the library that this object file should contain implementations of dispatcher functionality.
  • SIMDPP_USER_ARCH_INFO - tells the dispatcher how to acquire the information about the supported instruction sets. The macro must result in an expression that evaluates to a value of simdpp::Arch type.
  • SIMDPP_DISPATCH_ARCH1, SIMDPP_DISPATCH_ARCH2, SIMDPP_DISPATCH_ARCH3, ...: these tell the library for which instruction sets the user wants to dispatch. Each of these macros needs to be defined to a comma separated list of SIMDPP_ARCH_* values corresponding to the enabled instruction sets.

Example

test.h

void print_arch();

test.cc

#include "test.h"
#include <simdpp/simd.h>
#include <iostream>
#include <simdpp/dispatch/get_arch_gcc_builtin_cpu_supports.h>
#define SIMDPP_USER_ARCH_INFO ::simdpp::get_arch_gcc_builtin_cpu_supports()
 
namespace SIMDPP_ARCH_NAMESPACE {
 
void print_arch()
{
    std::cout << static_cast<unsigned>(simdpp::this_compile_arch()) << '\n';
}
 
} // namespace SIMDPP_ARCH_NAMESPACE
 
SIMDPP_MAKE_DISPATCHER_VOID0(print_arch);

main.cc

#include "test.h"
 
int main()
{
    print_arch();
}

Makefile

CXXFLAGS=""
 
test: main.o test_sse2.o test_sse3.o test_sse4_1.o test_null.o
    g++ $^ -o test
 
main.o: main.cc
    g++ main.cc $(CXXFLAGS) -c -o main.o
 
test_null.o: test.cc
    g++ test.cc -c $(CXXFLAGS) -DSIMDPP_EMIT_DISPATCHER \
        -DSIMDPP_DISPATCH_ARCH1=SIMDPP_ARCH_X86_SSE2 \
        -DSIMDPP_DISPATCH_ARCH2=SIMDPP_ARCH_X86_SSE3 \
        -DSIMDPP_DISPATCH_ARCH3=SIMDPP_ARCH_X86_SSE4_1 -o test_null.o
 
test_sse2.o: test.cc
    g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE2 -msse2 -o test_sse2.o
 
test_sse3.o: test.cc
    g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE3 -msse3 -o test_sse3.o
 
test_sse4_1.o: test.cc
    g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE4_1 -msse4.1 -o test_sse4_1.o