zfs/config/kernel-fpu.m4

132 lines
3.3 KiB
Plaintext
Raw Normal View History

dnl #
dnl # Handle differences in kernel FPU code.
Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381
2016-02-29 18:42:27 +00:00
dnl #
dnl # Kernel
dnl # 5.0: Wrappers have been introduced to save/restore the FPU state.
dnl # This change was made to the 4.19.38 and 4.14.120 LTS kernels.
dnl # HAVE_KERNEL_FPU_INTERNAL
dnl #
dnl # 4.2: Use __kernel_fpu_{begin,end}()
dnl # HAVE_UNDERSCORE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
dnl # Pre-4.2: Use kernel_fpu_{begin,end}()
dnl # HAVE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381
2016-02-29 18:42:27 +00:00
dnl #
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
dnl # N.B. The header check is performed before all other checks since it
dnl # depends on HAVE_KERNEL_FPU_API_HEADER being set in confdefs.h.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
AC_MSG_CHECKING([whether fpu headers are available])
Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381
2016-02-29 18:42:27 +00:00
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/api.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
],[
AC_MSG_RESULT(i387.h & xcr.h)
])
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
])
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
ZFS_LINUX_TEST_SRC([kernel_fpu], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#else
#include <asm/i387.h>
#include <asm/xcr.h>
#endif
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
], [
kernel_fpu_begin();
kernel_fpu_end();
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
], [], [$ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([__kernel_fpu], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#else
#include <asm/i387.h>
#include <asm/xcr.h>
#endif
], [
__kernel_fpu_begin();
__kernel_fpu_end();
], [], [$ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([fpu_internal], [
#if defined(__x86_64) || defined(__x86_64__) || \
defined(__i386) || defined(__i386__)
#if !defined(__x86)
#define __x86
#endif
#endif
#if !defined(__x86)
#error Unsupported architecture
#endif
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
#else
#include <asm/i387.h>
#include <asm/xcr.h>
#endif
#if !defined(XSTATE_XSAVE)
#error XSTATE_XSAVE not defined
#endif
#if !defined(XSTATE_XRESTORE)
#error XSTATE_XRESTORE not defined
#endif
],[
struct fpu *fpu = &current->thread.fpu;
union fpregs_state *st = &fpu->state;
struct fregs_state *fr __attribute__ ((unused)) = &st->fsave;
struct fxregs_state *fxr __attribute__ ((unused)) = &st->fxsave;
struct xregs_state *xr __attribute__ ((unused)) = &st->xsave;
])
])
AC_DEFUN([ZFS_AC_KERNEL_FPU], [
dnl #
dnl # Legacy kernel
dnl #
AC_MSG_CHECKING([whether kernel fpu is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([kernel_fpu_license],
[kernel_fpu_begin], [arch/x86/kernel/fpu/core.c], [
AC_MSG_RESULT(kernel_fpu_*)
AC_DEFINE(HAVE_KERNEL_FPU, 1,
[kernel has kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1,
[kernel exports FPU functions])
Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381
2016-02-29 18:42:27 +00:00
],[
dnl #
dnl # Linux 4.2 kernel
dnl #
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
ZFS_LINUX_TEST_RESULT_SYMBOL([__kernel_fpu_license],
[__kernel_fpu_begin],
[arch/x86/kernel/fpu/core.c arch/x86/kernel/i387.c], [
AC_MSG_RESULT(__kernel_fpu_*)
AC_DEFINE(HAVE_UNDERSCORE_KERNEL_FPU, 1,
[kernel has __kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1,
[kernel exports FPU functions])
],[
Perform KABI checks in parallel Reduce the time required for ./configure to perform the needed KABI checks by allowing kbuild to compile multiple test cases in parallel. This was accomplished by splitting each test's source code from the logic handling whether that code could be compiled or not. By introducing this split it's possible to minimize the number of times kbuild needs to be invoked. As importantly, it means all of the tests can be built in parallel. This does require a little extra care since we expect some tests to fail, so the --keep-going (-k) option must be provided otherwise some tests may not get compiled. Furthermore, since a failure during the kbuild modpost phase will result in an early exit; the final linking phase is limited to tests which passed the initial compilation and produced an object file. Once everything has been built the configure script proceeds as previously. The only significant difference is that it now merely needs to test for the existence of a .ko file to determine the result of a given test. This vastly speeds up the entire process. New test cases should use ZFS_LINUX_TEST_SRC to declare their test source code and ZFS_LINUX_TEST_RESULT to check the result. All of the existing kernel-*.m4 files have been updated accordingly, see config/kernel-current-time.m4 for a basic example. The legacy ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases but it's use is not encouraged. master (secs) patched (secs) ------------- ---------------- autogen.sh 61 68 configure 137 24 (~17% of current run time) make -j $(nproc) 44 44 make rpms 287 150 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8547 Closes #9132 Closes #9341 Conflicts: Makefile.am config/kernel-fpu.m4
2019-10-01 19:50:34 +00:00
ZFS_LINUX_TEST_RESULT([fpu_internal], [
AC_MSG_RESULT(internal)
AC_DEFINE(HAVE_KERNEL_FPU_INTERNAL, 1,
[kernel fpu internal])
],[
AC_MSG_RESULT(unavailable)
])
])
Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381
2016-02-29 18:42:27 +00:00
])
])