zfs/module
Gvozden Neskovic 70b258fc96 Fletcher4 implementation using avx512f instruction set
Algorithm runs 8 parallel sums, consuming 8x uint32_t elements per
loop iteration. Size alignment of main fletcher4 methods is adjusted
accordingly. New implementation is called 'avx512f'.

Note: byteswap method can be implemented more efficiently when avx512bw hardware
becomes available. Currently, it is ~ 2x slower than native method.

Table shows result of full (native) fletcher4 calculation for different buffer size:

fletcher4   4KB     16KB    64KB    128KB   256KB   1MB     16MB
--------------------------------------------------------------------
[scalar]    1213    1228    1231    1231    1225    1200    1160
[sse2]      2374    2442    2459    2456    2462    2250    2220
[avx2]      4288    4753    4871    4893    4900    4050    3882
[avx512f]   5975    8445    9196    9221    9262    6307    5620

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4952
2016-08-16 14:11:14 -07:00
..
avl Fix uninitialized variable in avl_add() 2016-07-25 14:21:34 -07:00
icp icp: mark asm files with noexec stack 2016-08-12 09:51:26 -07:00
nvpair OpenZFS 7263 - deeply nested nvlist can overflow stack 2016-08-11 15:58:03 -07:00
unicode Build user-space with different gcc optimization levels 2016-08-09 14:40:35 -07:00
zcommon Fletcher4 implementation using avx512f instruction set 2016-08-16 14:11:14 -07:00
zfs Add tunable to ignore hole_birth 2016-08-15 09:52:56 -07:00
zpios Add large block support to zpios(1) benchmark 2015-09-22 09:13:20 -07:00
.gitignore module/.gitignore: Add *.dwo (#4580) 2016-05-02 09:07:04 -07:00
Makefile.in Illumos Crypto Port module added to enable native encryption in zfs 2016-07-20 10:43:30 -07:00