zfs/include
Gvozden Neskovic fc897b24b2 Rework of fletcher_4 module
- Benchmark memory block is increased to 128kiB to reflect real block sizes more
accurately. Measurements include all three stages needed for checksum generation,
i.e. `init()/compute()/fini()`. The inner loop is repeated multiple times to offset
overhead of time function.

- Fastest implementation selects native and byteswap methods independently in
benchmark. To support this new function pointers `init_byteswap()/fini_byteswap()`
are introduced.

- Implementation mutex lock is replaced by atomic variable.

- To save time, benchmark is not executed in userspace. Instead, highest supported
implementation is used for fastest. Default userspace selector is still 'cycle'.

- `fletcher_4_native/byteswap()` methods use incremental methods to finish
calculation if data size is not multiple of vector stride (currently 64B).

- Added `fletcher_4_native_varsize()` special purpose method for use when buffer size
is not known in advance. The method does not enforce 4B alignment on buffer size, and
will ignore last (size % 4) bytes of the data buffer.

- Benchmark `kstat` is changed to match the one of vdev_raidz. It now shows
throughput for all supported implementations (in B/s), native and byteswap,
as well as the code [fastest] is running.

Example of `fletcher_4_bench` running on `Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz`:
implementation   native         byteswap
scalar           4768120823     3426105750
sse2             7947841777     4318964249
ssse3            7951922722     6112191941
avx2             13269714358    11043200912
fastest          avx2           avx2

Example of `fletcher_4_bench` running on `Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz`:
implementation   native         byteswap
scalar           1291115967     1031555336
sse2             2539571138     1280970926
ssse3            2537778746     1080016762
avx2             4950749767     1078493449
avx512f          9581379998     4010029046
fastest          avx512f        avx512f

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4952
2016-08-16 14:11:55 -07:00
..
linux Add support for AVX-512 family of instruction sets 2016-08-16 14:10:33 -07:00
sys OpenZFS 5997 - FRU field not set during pool creation and never updated 2016-08-12 13:06:48 -07:00
Makefile.am Kernel header installation should respect --prefix 2014-10-28 09:37:06 -07:00
libnvpair.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
libuutil.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
libuutil_common.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
libuutil_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
libzfs.h OpenZFS 6314 - buffer overflow in dsl_dataset_name 2016-06-28 13:47:03 -07:00
libzfs_core.h Implement zfs_ioc_recv_new() for OpenZFS 2605 2016-06-28 13:47:03 -07:00
libzfs_impl.h OpenZFS 6314 - buffer overflow in dsl_dataset_name 2016-06-28 13:47:03 -07:00
zfeature_common.h Implement large_dnode pool feature 2016-06-24 13:13:21 -07:00
zfs_comutil.h Illumos #2882, #2883, #2900 2013-09-04 15:49:00 -07:00
zfs_deleg.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
zfs_fletcher.h Rework of fletcher_4 module 2016-08-16 14:11:55 -07:00
zfs_namecheck.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
zfs_prop.h Check the dataset type more rigorously when fetching properties. 2014-05-06 10:41:46 -07:00
zpios-ctl.h Add large block support to zpios(1) benchmark 2015-09-22 09:13:20 -07:00
zpios-internal.h Add large block support to zpios(1) benchmark 2015-09-22 09:13:20 -07:00