Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Attila Fülöp	31b160f0a6	ICP: Improve AES-GCM performance Currently SIMD accelerated AES-GCM performance is limited by two factors: a. The need to disable preemption and interrupts and save the FPU state before using it and to do the reverse when done. Due to the way the code is organized (see (b) below) we have to pay this price twice for each 16 byte GCM block processed. b. Most processing is done in C, operating on single GCM blocks. The use of SIMD instructions is limited to the AES encryption of the counter block (AES-NI) and the Galois multiplication (PCLMULQDQ). This leads to the FPU not being fully utilized for crypto operations. To solve (a) we do crypto processing in larger chunks while owning the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced to make this chunk size tweakable. It defaults to 32 KiB. This step alone roughly doubles performance. (b) is tackled by porting and using the highly optimized openssl AES-GCM assembler routines, which do all the processing (CTR, AES, GMULT) in a single routine. Both steps together result in up to 32x reduction of the time spend in the en/decryption routines, leading up to approximately 12x throughput increase for large (128 KiB) blocks. Lastly, this commit changes the default encryption algorithm from AES-CCM to AES-GCM when setting the `encryption=on` property. Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-By: Jason King <jason.king@joyent.com> Reviewed-By: Tom Caputi <tcaputi@datto.com> Reviewed-By: Richard Laager <rlaager@wiktel.com> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #9749	2020-02-10 12:59:50 -08:00
Nathan Lewis	010d12474c	Add support for selecting encryption backend - Add two new module parameters to icp (icp_aes_impl, icp_gcm_impl) that control the crypto implementation. At the moment there is a choice between generic and aesni (on platforms that support it). - This enables support for AES-NI and PCLMULQDQ-NI on AMD Family 15h (bulldozer) and newer CPUs (zen). - Modify aes_key_t to track what implementation it was generated with as key schedules generated with various implementations are not necessarily interchangable. Reviewed by: Gvozden Neskovic <neskovic@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Nathaniel R. Lewis <linux.robotdude@gmail.com> Closes #7102 Closes #7103	2018-08-02 11:59:24 -07:00
Gvozden Neskovic	32ffaa3de5	Add support for AVX-512 family of instruction sets This patch adds compiler and runtime tests (user and kernel) for following instruction sets: avx512f, avx512cd, avx512er, avx512pf, avx512bw, avx512dq, avx512vl, avx512ifma, avx512vbmi. note: Linux support for AVX-512F (Foundation) instruction set started with linux v3.15 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4952	2016-08-16 14:10:33 -07:00
Gvozden Neskovic	fc0c72b167	Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381	2016-03-21 09:24:34 -07:00

4 Commits