dnl #
dnl # Handle differences in kernel FPU code.
Support for vectorized algorithms on x86
This is initial support for x86 vectorized implementations of ZFS parity
and checksum algorithms.
For the compilation phase, the configure step checks whether the toolchain
supports the relevant instruction sets. Each implementation must ensure that
its code is not passed to the compiler if the relevant instruction set is not
supported. For this purpose, the following new defines are provided if the
instruction set is supported:
- HAVE_SSE,
- HAVE_SSE2,
- HAVE_SSE3,
- HAVE_SSSE3,
- HAVE_SSE4_1,
- HAVE_SSE4_2,
- HAVE_AVX,
- HAVE_AVX2.
To detect at runtime whether an instruction set can be used, the following
functions are provided in include/linux/simd_x86.h:
- zfs_sse_available()
- zfs_sse2_available()
- zfs_sse3_available()
- zfs_ssse3_available()
- zfs_sse4_1_available()
- zfs_sse4_2_available()
- zfs_avx_available()
- zfs_avx2_available()
- zfs_bmi1_available()
- zfs_bmi2_available()
These functions should be called once, on module load or initialization.
They are safe to use from user and kernel space.
If an implementation uses more than one instruction set, both compiler
and runtime support for all relevant instruction sets should be checked.
Kernel fpu methods:
- kfpu_begin()
- kfpu_end()
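
The compile-time and runtime checks described above can be combined as in
the following minimal sketch. It is not code from this commit: every
example_* name is hypothetical, and the zfs_avx2_available(), kfpu_begin()
and kfpu_end() definitions are empty stand-ins (assumptions) so that the
sketch builds outside the ZFS tree. The vectorized body is a placeholder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*example_sum_fn)(const uint8_t *, size_t);

/* Portable fallback that is always compiled. */
static uint64_t
example_sum_scalar(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return (sum);
}

#if defined(HAVE_AVX2)
/* Stand-ins (assumptions) for the helpers named in the commit message. */
static bool zfs_avx2_available(void) { return (false); }
static void kfpu_begin(void) { }
static void kfpu_end(void) { }

/* Only handed to the compiler when configure defined HAVE_AVX2. */
static uint64_t
example_sum_avx2(const uint8_t *buf, size_t len)
{
	uint64_t sum;

	kfpu_begin();
	sum = example_sum_scalar(buf, len);	/* placeholder for AVX2 code */
	kfpu_end();
	return (sum);
}
#endif

/* Intended to be called once, at module load or initialization. */
static example_sum_fn
example_sum_select(void)
{
#if defined(HAVE_AVX2)
	if (zfs_avx2_available())
		return (example_sum_avx2);
#endif
	return (example_sum_scalar);
}
```

Selecting the function pointer once keeps the availability check out of the
hot path, which matches the advice to call the zfs_*_available() functions
only at load or initialization time.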
Use __get_cpuid_max and __cpuid_count from <cpuid.h>.
Both gcc and clang support these. They also handle the ebx register
in case it is used for PIC code.
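
As an illustration of these two <cpuid.h> helpers, the sketch below queries
the AVX2 feature bit. It is an assumption-laden example rather than the
implementation in simd_x86.h: example_cpu_has_avx2() is a hypothetical name,
and it only reads the CPU feature bit. A complete availability check would
also need to confirm that the operating system saves the extended register
state.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: reports whether CPUID advertises AVX2. */
static bool
example_cpu_has_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 7 must exist before it can be queried. */
	if (__get_cpuid_max(0, NULL) < 7)
		return (false);

	/* Sub-leaf 0 of leaf 7 holds the structured extended features. */
	__cpuid_count(7, 0, eax, ebx, ecx, edx);
	(void) eax;
	(void) ecx;
	(void) edx;

	/* CPUID.(EAX=7,ECX=0):EBX bit 5 is the AVX2 flag. */
	return ((ebx & (1U << 5)) != 0);
#else
	/* Not an x86 build, so the instruction set cannot be present. */
	return (false);
#endif
}
```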
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes #4381
2016-02-29 18:42:27 +00:00
dnl #
dnl # Kernel
dnl # 5.0:	Wrappers have been introduced to save/restore the FPU state.
dnl #		This change was made to the 4.19.38 and 4.14.120 LTS kernels.
dnl #		HAVE_KERNEL_FPU_INTERNAL
dnl #
dnl # 4.2:	Use __kernel_fpu_{begin,end}()
dnl #		HAVE_UNDERSCORE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
dnl # Pre-4.2:	Use kernel_fpu_{begin,end}()
dnl #		HAVE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
dnl # N.B. The header check is performed before all other checks since it
dnl # depends on HAVE_KERNEL_FPU_API_HEADER being set in confdefs.h.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
	AC_MSG_CHECKING([whether fpu headers are available])
	ZFS_LINUX_TRY_COMPILE([
		#include <linux/module.h>
		#include <asm/fpu/api.h>
	],[
	],[
		AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
		    [kernel has asm/fpu/api.h])
		AC_MSG_RESULT(asm/fpu/api.h)
	],[
		AC_MSG_RESULT(i387.h & xcr.h)
	])
])

AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
	ZFS_LINUX_TEST_SRC([kernel_fpu], [
		#include <linux/types.h>
		#ifdef HAVE_KERNEL_FPU_API_HEADER
		#include <asm/fpu/api.h>
		#else
		#include <asm/i387.h>
		#include <asm/xcr.h>
		#endif
	], [
		kernel_fpu_begin();
		kernel_fpu_end();
	], [], [$ZFS_META_LICENSE])

	ZFS_LINUX_TEST_SRC([__kernel_fpu], [
		#include <linux/types.h>
		#ifdef HAVE_KERNEL_FPU_API_HEADER
		#include <asm/fpu/api.h>
		#else
		#include <asm/i387.h>
		#include <asm/xcr.h>
		#endif
	], [
		__kernel_fpu_begin();
		__kernel_fpu_end();
	], [], [$ZFS_META_LICENSE])

	ZFS_LINUX_TEST_SRC([fpu_internal], [
		#if defined(__x86_64) || defined(__x86_64__) || \
		    defined(__i386) || defined(__i386__)
		#if !defined(__x86)
		#define __x86
		#endif
		#endif

		#if !defined(__x86)
		#error Unsupported architecture
		#endif

		#include <linux/types.h>
		#ifdef HAVE_KERNEL_FPU_API_HEADER
		#include <asm/fpu/api.h>
		#include <asm/fpu/internal.h>
		#else
		#include <asm/i387.h>
		#include <asm/xcr.h>
		#endif

		#if !defined(XSTATE_XSAVE)
		#error XSTATE_XSAVE not defined
		#endif

		#if !defined(XSTATE_XRESTORE)
		#error XSTATE_XRESTORE not defined
		#endif
	],[
		struct fpu *fpu = &current->thread.fpu;
		union fpregs_state *st = &fpu->state;
		struct fregs_state *fr __attribute__ ((unused)) = &st->fsave;
		struct fxregs_state *fxr __attribute__ ((unused)) = &st->fxsave;
		struct xregs_state *xr __attribute__ ((unused)) = &st->xsave;
	])
])

AC_DEFUN([ZFS_AC_KERNEL_FPU], [
	dnl #
	dnl # Legacy kernel
	dnl #
	AC_MSG_CHECKING([whether kernel fpu is available])
	ZFS_LINUX_TEST_RESULT_SYMBOL([kernel_fpu_license],
	    [kernel_fpu_begin], [arch/x86/kernel/fpu/core.c], [
		AC_MSG_RESULT(kernel_fpu_*)
		AC_DEFINE(HAVE_KERNEL_FPU, 1,
		    [kernel has kernel_fpu_* functions])
		AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1,
		    [kernel exports FPU functions])
	],[
		dnl #
		dnl # Linux 4.2 kernel
		dnl #
		ZFS_LINUX_TEST_RESULT_SYMBOL([__kernel_fpu_license],
		    [__kernel_fpu_begin],
		    [arch/x86/kernel/fpu/core.c arch/x86/kernel/i387.c], [
			AC_MSG_RESULT(__kernel_fpu_*)
			AC_DEFINE(HAVE_UNDERSCORE_KERNEL_FPU, 1,
			    [kernel has __kernel_fpu_* functions])
			AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1,
			    [kernel exports FPU functions])
		],[
			ZFS_LINUX_TEST_RESULT([fpu_internal], [
				AC_MSG_RESULT(internal)
				AC_DEFINE(HAVE_KERNEL_FPU_INTERNAL, 1,
				    [kernel fpu internal])
			],[
				AC_MSG_RESULT(unavailable)
			])
		])
	])
])