external: Add xpu library.
Refs #2490.
Activity
assigned to @f.uhlig
requested review from @f.uhlig
added Build System label
There is a problem with the dlopen call on macOS: the flag RTLD_DEEPBIND is not defined there. If I understand the documentation correctly, RTLD_DEEPBIND makes the loaded library search for symbols in its own local scope first, before searching the global scope. (https://man7.org/linux/man-pages//man3/dlmopen.3.html) In another discussion it was stated that this order is the default on macOS. (https://stackoverflow.com/questions/7203161/does-rtld-first-on-mac-do-the-job-of-rtld-deep-bind-on-linux) Unfortunately the documentation isn't really good. I would propose adding a preprocessor statement which, on macOS, creates the handle in the following way
handle = dlopen(name, RTLD_LAZY);
so the code block would become
#if defined __APPLE__
handle = dlopen(name, RTLD_LAZY);
#elif defined __linux__
handle = dlopen(name, RTLD_LAZY | RTLD_DEEPBIND);
#endif
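A possible variation on the block above, as a minimal sketch: since RTLD_DEEPBIND is a glibc extension, the check can be made on the macro itself instead of on the operating system, which also covers other platforms that lack the flag. The names library_load_flags and open_library are hypothetical helpers for illustration and are not part of xpu.

```cpp
#include <dlfcn.h>

// Hypothetical helper (not part of xpu): selects the dlopen flags.
// RTLD_DEEPBIND is a glibc extension, so test for the macro itself
// rather than for a specific operating system.
constexpr int library_load_flags() {
#if defined(RTLD_DEEPBIND)
    return RTLD_LAZY | RTLD_DEEPBIND;
#else
    return RTLD_LAZY;
#endif
}

// Hypothetical wrapper around dlopen. Returns nullptr on failure,
// in which case dlerror() describes the problem.
inline void* open_library(const char* name) {
    return dlopen(name, library_load_flags());
}
```

With this, the call site would read handle = open_library(name); on every platform. Whether plain RTLD_LAZY gives the same symbol lookup order on macOS as RTLD_DEEPBIND does on Linux is still an assumption that needs testing.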
- Resolved by Florian Uhlig
If I compile the code with the proposed fix, the compilation continues but then fails with the following error
external/xpu/src/xpu/driver/cpu/cpu_driver.cpp:75:32: error: use of undeclared identifier '_SC_AVPHYS_PAGES'
    *free = pagesize * sysconf(_SC_AVPHYS_PAGES);
sysconf is available on macOS, but the argument _SC_AVPHYS_PAGES is not known there. Currently I have no idea how to obtain this information on macOS. Could you explain what the function cpu_driver::meminfo should do?
I have created a merge request to your source code which at least compiles on my Mac. Since the changes are implemented using a preprocessor statement, I don't expect any problems with the Linux implementation. I did not test whether the code works. Do you have a test suite that I can try to run on macOS?
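Regarding the missing _SC_AVPHYS_PAGES: one way to get comparable numbers on macOS is the Mach call host_statistics64 for the free page count and the sysctl key hw.memsize for the total. The sketch below assumes that meminfo is meant to report free and total host memory in bytes (the failing line suggests at least the free part). query_meminfo is a hypothetical stand-in, not the actual xpu function.

```cpp
#include <cstddef>
#include <cstdint>
#include <unistd.h>
#if defined(__APPLE__)
#include <mach/mach.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical stand-in for cpu_driver::meminfo (assumed semantics:
// free and total physical memory in bytes). Returns false on failure.
bool query_meminfo(std::size_t* free_bytes, std::size_t* total_bytes) {
    const long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0) return false;
#if defined(__APPLE__)
    // Total physical memory in bytes.
    std::uint64_t memsize = 0;
    std::size_t len = sizeof(memsize);
    if (sysctlbyname("hw.memsize", &memsize, &len, nullptr, 0) != 0) return false;
    // Free page count from the Mach virtual memory statistics.
    vm_statistics64_data_t vm{};
    mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
    if (host_statistics64(mach_host_self(), HOST_VM_INFO64,
                          reinterpret_cast<host_info64_t>(&vm), &count) != KERN_SUCCESS)
        return false;
    *total_bytes = static_cast<std::size_t>(memsize);
    *free_bytes = static_cast<std::size_t>(vm.free_count) * static_cast<std::size_t>(pagesize);
#else
    // Linux/glibc: both values are available through sysconf.
    const long total_pages = sysconf(_SC_PHYS_PAGES);
    const long free_pages = sysconf(_SC_AVPHYS_PAGES);
    if (total_pages < 0 || free_pages < 0) return false;
    *total_bytes = static_cast<std::size_t>(total_pages) * static_cast<std::size_t>(pagesize);
    *free_bytes = static_cast<std::size_t>(free_pages) * static_cast<std::size_t>(pagesize);
#endif
    return true;
}
```

Note that free_count only covers pages that are completely unused. macOS keeps many reclaimable pages as inactive or cached, so this value may be noticeably smaller than what a user would call "available" memory.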
added 10 commits
- 17d88d29...f7ee6b9e - 9 commits from branch computing:master
- 24d12055 - external: Add xpu library.
- Resolved by Florian Uhlig
I fixed (worked around) the compiler errors concerning the math functions, but then ran into some linker errors on macOS. Maybe you have an idea what the problem could be.
[ 38%] Linking CXX shared library libTestKernels.dylib
cd /Users/uhlig/software/fair/cbm/cbmroot_git/external/xpu/build/test && /usr/local/Cellar/cmake/3.22.1/bin/cmake -E cmake_link_script CMakeFiles/TestKernels.dir/link.txt --verbose=1
/Library/Developer/CommandLineTools/usr/bin/c++ -Wall -Wextra -Werror -O3 -Xclang -fopenmp -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX11.1.sdk -mmacosx-version-min=10.15 -dynamiclib -Wl,-headerpad_max_install_names -o libTestKernels.dylib -install_name @rpath/libTestKernels.dylib CMakeFiles/TestKernels.dir/TestKernels.cpp.o
Undefined symbols for architecture x86_64:
  "xpu::detail::logger::write(char const*, ...)", referenced from:
      xpu::detail::action_runner<xpu::detail::kernel_tag, empty_kernel, xpu::no_smem, void (*)(xpu::no_smem&)>::call(float*, xpu::grid) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add_timing, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_float, sort_floats_smem, void (*)(sort_floats_smem&, float*, int, float*, float**)>::call(float*, xpu::grid, float*, int, float*, float**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_struct, sort_kv_smem, void (*)(sort_kv_smem&, key_value_t*, int, key_value_t*, key_value_t**)>::call(float*, xpu::grid, key_value_t*, int, key_value_t*, key_value_t**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge, xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge_single, xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      ...
  "xpu::detail::logger::instance()", referenced from:
      xpu::detail::action_runner<xpu::detail::kernel_tag, empty_kernel, xpu::no_smem, void (*)(xpu::no_smem&)>::call(float*, xpu::grid) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add_timing, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_float, sort_floats_smem, void (*)(sort_floats_smem&, float*, int, float*, float**)>::call(float*, xpu::grid, float*, int, float*, float**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_struct, sort_kv_smem, void (*)(sort_kv_smem&, key_value_t*, int, key_value_t*, key_value_t**)>::call(float*, xpu::grid, key_value_t*, int, key_value_t*, key_value_t**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge, xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge_single, xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      ...
  "xpu::sincos(float, float*, float*)", referenced from:
      void test_device_funcs::impl<xpu::no_smem>(xpu::no_smem&, variant*) in TestKernels.cpp.o
  "thread-local wrapper routine for xpu::detail::this_thread::grid_dim", referenced from:
      _.omp_outlined. in TestKernels.cpp.o
      _.omp_outlined..11 in TestKernels.cpp.o
      _.omp_outlined..14 in TestKernels.cpp.o
      _.omp_outlined..17 in TestKernels.cpp.o
      _.omp_outlined..20 in TestKernels.cpp.o
      _.omp_outlined..23 in TestKernels.cpp.o
      _.omp_outlined..26 in TestKernels.cpp.o
      ...
  "thread-local wrapper routine for xpu::detail::this_thread::block_idx", referenced from:
      void vector_add::impl<xpu::no_smem>(xpu::no_smem&, float const*, float const*, float*, int) in TestKernels.cpp.o
      void vector_add_timing::impl<xpu::no_smem>(xpu::no_smem&, float const*, float const*, float*, int) in TestKernels.cpp.o
      void sort_float::impl<sort_floats_smem>(sort_floats_smem&, float*, int, float*, float**) in TestKernels.cpp.o
      void sort_struct::impl<sort_kv_smem>(sort_kv_smem&, key_value_t*, int, key_value_t*, key_value_t**) in TestKernels.cpp.o
      void get_thread_idx::impl<xpu::no_smem>(xpu::no_smem&, int*) in TestKernels.cpp.o
      _.omp_outlined. in TestKernels.cpp.o
      _.omp_outlined..11 in TestKernels.cpp.o
      ...
  "___kmpc_for_static_fini", referenced from:
      _.omp_outlined. in TestKernels.cpp.o
      _.omp_outlined..11 in TestKernels.cpp.o
      _.omp_outlined..14 in TestKernels.cpp.o
      _.omp_outlined..17 in TestKernels.cpp.o
      _.omp_outlined..20 in TestKernels.cpp.o
      _.omp_outlined..23 in TestKernels.cpp.o
      _.omp_outlined..26 in TestKernels.cpp.o
      ...
  "___kmpc_for_static_init_4", referenced from:
      _.omp_outlined. in TestKernels.cpp.o
      _.omp_outlined..11 in TestKernels.cpp.o
      _.omp_outlined..14 in TestKernels.cpp.o
      _.omp_outlined..17 in TestKernels.cpp.o
      _.omp_outlined..20 in TestKernels.cpp.o
      _.omp_outlined..23 in TestKernels.cpp.o
      _.omp_outlined..26 in TestKernels.cpp.o
      ...
  "___kmpc_fork_call", referenced from:
      xpu::detail::action_runner<xpu::detail::kernel_tag, empty_kernel, xpu::no_smem, void (*)(xpu::no_smem&)>::call(float*, xpu::grid) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, vector_add_timing, xpu::no_smem, void (*)(xpu::no_smem&, float const*, float const*, float*, int)>::call(float*, xpu::grid, float const*, float const*, float*, int) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_float, sort_floats_smem, void (*)(sort_floats_smem&, float*, int, float*, float**)>::call(float*, xpu::grid, float*, int, float*, float**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, sort_struct, sort_kv_smem, void (*)(sort_kv_smem&, key_value_t*, int, key_value_t*, key_value_t**)>::call(float*, xpu::grid, key_value_t*, int, key_value_t*, key_value_t**) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge, xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 8, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      xpu::detail::action_runner<xpu::detail::kernel_tag, merge_single, xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t, void (*)(xpu::block_merge<float, 64, 1, (xpu::driver_t)0>::storage_t&, float const*, unsigned long, float const*, unsigned long, float*)>::call(float*, xpu::grid, float const*, unsigned long, float const*, unsigned long, float*) in TestKernels.cpp.o
      ...
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Dear @f.uhlig,
you have been identified as code owner of at least one file which was changed with this merge request.
Please check the changes and approve them or request changes.
added CodeOwners label
added 2 commits
added 2 commits
added 6 commits
- 45d43d92...79b3aeef - 5 commits from branch computing:master
- a244d714 - external: Add xpu library.
enabled an automatic merge when the pipeline for a244d714 succeeds