zfs/include/sys
Alexander Motin 891568c990
Split dmu_zfetch() speculation and execution parts
To make better predictions on parallel workloads dmu_zfetch() should
be called as early as possible to reduce possible request reordering.
In particular, it should be called before dmu_buf_hold_array_by_dnode()
calls dbuf_hold(), which may sleep waiting for indirect blocks, waking
up multiple threads same time on completion, that can significantly
reorder the requests, making the stream look like random.  But we
should not issue prefetch requests before the on-demand ones, since
they may get to the disks first despite the I/O scheduler, increasing
on-demand request latency.

This patch splits dmu_zfetch() into two functions: dmu_zfetch_prepare()
and dmu_zfetch_run().  The first can be executed as early as needed.
It only updates statistics and makes predictions without issuing any
I/Os.  The I/O issuance is handled by dmu_zfetch_run(), which can be
called later when all on-demand I/Os are already issued.  It even
tracks the activity of other concurrent threads, issuing the prefetch
only when _all_ on-demand requests are issued.

For many years it was a big problem for storage servers, handling
deeper request queues from their clients, having to either serialize
consequential reads to make ZFS prefetcher usable, or execute the
incoming requests as-is and get almost no prefetch from ZFS, relying
only on deep enough prefetch by the clients.  Benefits of those ways
varied, but neither was perfect.  With this patch deeper queue
sequential read benchmarks with CrystalDiskMark from Windows via
iSCSI to FreeBSD target show me much better throughput with almost
100% prefetcher hit rate, comparing to almost zero before.

While there, I also removed per-stream zs_lock as useless, completely
covered by parent zf_lock.  Also I reused zs_blocks refcount to track
zf_stream linkage of the stream, since I believe previous zs_fetch ==
NULL check in dmu_zfetch_stream_done() was racy.

Delete prefetch streams when they reach ends of files.  It saves up
to 1KB of RAM per file, plus reduces searches through the stream list.

Block data prefetch (speculation and indirect block prefetch is still
done since they are cheaper) if all dbufs of the stream are already
in DMU cache.  First cache miss immediately fires all the prefetch
that would be done for the stream by that time.  It saves some CPU
time if same files within DMU cache capacity are read over and over.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #11652
2021-03-19 22:56:11 -07:00
..
crypto Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
fm Avoid posting duplicate zpool events 2020-09-04 10:34:28 -07:00
fs Add "compatibility" property for zpool feature sets 2021-02-17 21:30:45 -08:00
lua FreeBSD: Reduce stack usage of Lua 2020-09-22 16:03:11 -07:00
sysevent Avoid installing kernel headers on FreeBSD 2020-06-27 17:40:14 -07:00
zstd zstd: track allocator statistics 2020-10-30 15:26:10 -07:00
Makefile.am Restore FreeBSD resource usage accounting 2021-02-19 22:34:33 -08:00
abd.h Make inline ABD predicates compatible with C++ 2021-02-15 10:15:50 -08:00
abd_impl.h allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
aggsum.h Reduce number of atomic_add() calls in aggsum 2020-02-06 13:21:06 -08:00
arc.h Implement memory and CPU hotplug 2020-12-10 14:09:23 -08:00
arc_impl.h dmu_zfetch: fix memory leak 2020-12-12 16:00:00 -08:00
avl.h Restore avl_update() calls and related functions 2020-06-03 09:49:32 -07:00
avl_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
bitops.h Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bpobj.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bptree.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
bqueue.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
btree.h Fix typos 2020-06-09 21:24:09 -07:00
dataset_kstats.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
dbuf.h Improve zfs receive performance with lightweight write 2020-12-11 10:26:02 -08:00
ddt.h Appease GCC sprintf warnings found on Fedora 32/GCC 10.0.1 2020-08-24 10:32:59 -07:00
dmu.h Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
dmu_impl.h Remove UIO_ZEROCOPY functions structures 2020-10-30 10:00:33 -07:00
dmu_objset.h Improve zfs receive performance with lightweight write 2020-12-11 10:26:02 -08:00
dmu_recv.h filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
dmu_redact.h Suppress cppcheck invalidSyntax warninigs 2021-03-05 17:56:35 -08:00
dmu_send.h Add 'zfs send --saved' flag 2020-01-10 10:16:58 -08:00
dmu_traverse.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_tx.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dmu_zfetch.h Split dmu_zfetch() speculation and execution parts 2021-03-19 22:56:11 -07:00
dnode.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_bookmark.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_crypt.h dmu_objset_from_ds must be called with dp_config_rwlock held 2020-03-12 10:55:02 -07:00
dsl_dataset.h implicit conversion from 'boolean_t' to 'ds_hold_flags_t' 2020-12-27 16:31:02 -08:00
dsl_deadlist.h Add fast path for zfs_ioc_space_snaps() handling of empty_bpobj 2019-08-20 11:34:52 -07:00
dsl_deleg.h Remove code for zfs remap 2019-06-24 16:44:01 -07:00
dsl_destroy.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_dir.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_pool.h Eliminate Linux specific inode usage from common code 2019-12-11 11:53:57 -08:00
dsl_prop.h Support inheriting properties in channel programs 2020-01-22 17:03:17 -08:00
dsl_scan.h Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
dsl_synctask.h Add upper bound for slop space calculation 2021-02-24 09:52:43 -08:00
dsl_userhold.h Illumos #3740 2013-11-04 11:17:48 -08:00
edonr.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
efi_partition.h Fix typos in include/ 2019-08-30 09:53:15 -07:00
frame.h Linux 5.10 compat: frame.h renamed objtool.h 2020-11-02 22:01:10 +00:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
metaslab.h Only examine best metaslabs on each vdev 2020-12-16 14:40:05 -08:00
metaslab_impl.h Make metaslab class rotor and aliquot per-allocator. 2020-12-15 10:55:44 -08:00
mmp.h Add zfs_multihost_interval tunable handler for FreeBSD 2020-06-23 13:32:42 -07:00
mntent.h Add FreeBSD required defines to mntent.h 2019-11-30 15:49:09 -08:00
mod.h Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
multilist.h Avoid extra taskq_dispatch() calls by DMU 2019-06-25 12:03:38 -07:00
note.h Update build system and packaging 2018-05-29 16:00:33 -07:00
nvpair.h FreeBSD: make adjustments for the standalone environment 2020-10-13 21:05:49 -07:00
nvpair_impl.h OpenZFS 9580 - Add a hash-table on top of nvlist to speed-up operations 2018-07-30 11:30:03 -07:00
objlist.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.h Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
qat.h QAT related bug fixes 2019-09-12 13:33:44 -07:00
range_tree.h Improve compatibility with C++ consumers 2020-06-06 12:54:04 -07:00
rrwlock.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
sa.h Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
sa_impl.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
skein.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
spa.h Checksum errors may not be counted 2021-02-19 22:33:15 -08:00
spa_boot.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
spa_checkpoint.h Serialize ZTHR operations to eliminate races 2019-01-13 10:09:46 -08:00
spa_checksum.h Implementation of AVX2 optimized Fletcher-4 2016-06-02 14:30:51 -07:00
spa_impl.h Add "compatibility" property for zpool feature sets 2021-02-17 21:30:45 -08:00
spa_log_spacemap.h Log Spacemap Project 2019-07-16 10:11:49 -07:00
space_map.h Extend zdb to print inconsistencies in livelists and metaslabs 2020-07-14 17:51:05 -07:00
space_reftree.h Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
sysevent.h OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
txg.h Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
txg_impl.h Fix typos in include/ 2019-08-30 09:53:15 -07:00
u8_textprep.h Throw const on some strings 2020-10-02 17:44:10 -07:00
u8_textprep_data.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uberblock.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
uberblock_impl.h MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
uio_impl.h Cleaning up uio headers 2021-02-20 20:16:50 -08:00
unique.h Illumos #3742 2013-11-04 10:55:25 -08:00
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev.h Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
vdev_disk.h Make struct vdev_disk_t be platform private 2020-06-16 11:43:33 -07:00
vdev_draid.h Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_file.h Add zfs_file_* interface, remove vnodes 2019-11-21 09:32:57 -08:00
vdev_impl.h Parallelize vdev_validate 2021-01-26 19:36:51 -08:00
vdev_indirect_births.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_initialize.h Add TRIM support 2019-03-29 09:13:20 -07:00
vdev_raidz.h Clean up RAIDZ/DRAID ereport code 2021-03-19 16:22:10 -07:00
vdev_raidz_impl.h Clean up RAIDZ/DRAID ereport code 2021-03-19 16:22:10 -07:00
vdev_rebuild.h Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_removal.h panic in removal_remap test on 4K devices 2019-06-13 13:12:39 -07:00
vdev_trim.h Trim L2ARC 2020-06-09 10:15:08 -07:00
xvattr.h Linux 4.18 compat: inode timespec -> timespec64 2018-06-19 21:51:18 -07:00
zap.h fat zap should prefetch when iterating 2019-06-12 13:13:09 -07:00
zap_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
zap_leaf.h Fix ENOSPC in "Handle zap_add() failures in ..." 2018-04-18 14:19:50 -07:00
zcp.h filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_set.h Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zfeature.h Revert "zhack: Add 'feature disable' command" 2016-05-17 11:52:07 -07:00
zfs_acl.h Return an error code from zfs_acl_chmod_setattr 2019-11-01 10:19:11 -07:00
zfs_bootenv.h zfs label bootenv should store data as nvlist 2020-09-15 15:42:27 -07:00
zfs_context.h Cleaning up uio headers 2021-02-20 20:16:50 -08:00
zfs_debug.h Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
zfs_delay.h Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_file.h Re-share zfsdev_getminor and zfs_onexit_fd_hold 2020-02-28 14:50:32 -08:00
zfs_fuid.h Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
zfs_ioctl.h FreeBSD: Clean up zfsdev_close to match Linux 2021-03-12 16:09:15 -08:00
zfs_ioctl_impl.h Make zc_nvlist_src_size limit tunable 2020-08-18 09:33:55 -07:00
zfs_onexit.h Remove deduplicated send/receive code 2020-04-23 10:06:57 -07:00
zfs_project.h Minor diff reduction with ZoF in include/sys 2019-11-27 11:11:03 -08:00
zfs_quota.h File incorrectly zeroed when receiving incremental stream that toggles -L 2020-06-09 10:41:01 -07:00
zfs_racct.h Restore FreeBSD resource usage accounting 2021-02-19 22:34:33 -08:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_refcount.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zfs_rlock.h Add a "try" operation for range locks 2020-07-06 11:53:31 -07:00
zfs_sa.h Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
zfs_stat.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_sysfs.h Fix in-kernel sysfs entries 2018-09-06 21:44:52 -07:00
zfs_vfsops.h Add 'zfs rename -u' to rename without remounting 2020-09-01 16:14:16 -07:00
zfs_vnops.h Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
zfs_znode.h Rename zfs_inode_update to zfs_znode_update_vfs 2021-02-09 11:17:29 -08:00
zil.h Fix zfs_get_data access to files with wrong generation 2021-03-19 22:53:31 -07:00
zil_impl.h make zil max block size tunable 2019-06-10 11:48:42 -07:00
zio.h Clean up RAIDZ/DRAID ereport code 2021-03-19 16:22:10 -07:00
zio_checksum.h Remove dependency on linear ABD 2017-03-29 12:24:51 -07:00
zio_compress.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
zio_crypt.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zio_impl.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
zio_priority.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
zrlock.h OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
zthr.h Introduce names for ZTHRs 2020-07-29 09:43:33 -07:00
zvol.h async zvol minor node creation interferes with receive 2020-02-03 09:33:14 -08:00
zvol_impl.h Fix zfs_get_data access to files with wrong generation 2021-03-19 22:53:31 -07:00