zfs/include/sys
Rob Norris e35088c93f ZIO: add "vdev tracing" facility; use it for ZIL flushing
A problem with zio_flush() is that it issues a flush ZIO to a top-level
vdev, which then recursively issues child flush ZIOs until the real leaf
devices are flushed. As usual, an error in a child ZIO results in the
parent ZIO erroring too, so if a leaf device has failed, its flush ZIO
will fail, and so will the entire flush operation.

This didn't matter when we used to ignore flush errors, but now that we
propagate them, the flush error propagates into the ZIL write ZIO. This
causes the ZIL to believe its write failed, and fall back to a full txg
wait. This still provides correct behaviour for zil_commit() callers
(e.g. fsync()), but it ruins performance.

We cannot simply skip flushing failed vdevs, because the associated
write may have succeeded before the vdev failed, which would give the
appearance that the write is fully flushed when it is not. Neither can
we issue a "syncing write" to the device (e.g. SCSI FUA), as this also
degrades performance.

The answer is that we must bind writes and flushes together in a way
such that we only flush the physical devices that we wrote to.

This adds a "vdev tracing" facility to ZIOs, zio_vdev_trace. When it is
enabled on a ZIO with ZIO_FLAG_VDEV_TRACE, upon successful completion
(in the _done handler) zio->io_vdev_trace_tree will have a list of
zio_vdev_trace_t objects, each describing a vdev that was involved in
the successful completion of the ZIO.
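
To illustrate the idea only, here is a minimal sketch of recording which
leaf vdevs took part in a successful write. It is not the code from this
commit: the names trace_rec_t, trace_set_t, trace_set_add() and
trace_child_done() are hypothetical, vdevs are identified by a plain
64-bit id, and a small sorted array stands in for the real tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	TRACE_MAX	16	/* hypothetical fixed capacity for this sketch */

/* One record per leaf vdev that took part in a successful write. */
typedef struct trace_rec {
	uint64_t tr_guid;	/* hypothetical id of the leaf vdev */
} trace_rec_t;

/* A set of trace records, kept sorted by id with no duplicates. */
typedef struct trace_set {
	trace_rec_t ts_recs[TRACE_MAX];
	size_t ts_count;
} trace_set_t;

static void
trace_set_init(trace_set_t *ts)
{
	assert(ts != NULL);
	ts->ts_count = 0;
}

/* Insert guid unless already present; returns false if the set is full. */
static bool
trace_set_add(trace_set_t *ts, uint64_t guid)
{
	size_t i = 0;

	/* Find the first slot whose id is not smaller than guid. */
	while (i < ts->ts_count && ts->ts_recs[i].tr_guid < guid)
		i++;
	if (i < ts->ts_count && ts->ts_recs[i].tr_guid == guid)
		return (true);
	if (ts->ts_count == TRACE_MAX)
		return (false);

	/* Shift the tail up by one to keep the array sorted. */
	for (size_t j = ts->ts_count; j > i; j--)
		ts->ts_recs[j] = ts->ts_recs[j - 1];
	ts->ts_recs[i].tr_guid = guid;
	ts->ts_count++;
	return (true);
}

/* Called when a child write completes; only successes are traced. */
static void
trace_child_done(trace_set_t *ts, uint64_t guid, int error)
{
	if (error == 0)
		(void) trace_set_add(ts, guid);
}

/* Linear membership check, sufficient for a sketch of this size. */
static bool
trace_set_contains(const trace_set_t *ts, uint64_t guid)
{
	for (size_t i = 0; i < ts->ts_count; i++) {
		if (ts->ts_recs[i].tr_guid == guid)
			return (true);
	}
	return (false);
}
```

With this shape, a write that reached vdevs 10 and 30 but failed on
vdev 20 leaves only 10 and 30 in the set, so a later flush never touches
the failed device.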

A companion function, zio_vdev_trace_flush(), is included, that issues a
flush ZIO to the child vdevs on the given trace tree.
zil_lwb_write_done() is updated to use this to bind ZIL writes and
flushes together.

The tracing facility is similar in many ways to the "deferred flushing"
facility inside the ZIL, to the point where it can replace it. Now, if
the flush should be deferred, the trace records from the writing ZIO are
captured and combined with any captured from previous writes. When it's
finally time to issue the flush, we issue it to the entire accumulated
set of traced vdevs.
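
The accumulate-then-flush pattern described above can be sketched as
follows. This is an illustration under stated assumptions, not the ZIL
code: flush_defer_t, flush_defer_merge() and flush_defer_issue() are
hypothetical names, vdevs are plain 64-bit ids, and the "flush" is a
caller-supplied callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	DEFER_MAX	32	/* hypothetical capacity for this sketch */

/* Vdevs traced by earlier writes whose flush has been deferred. */
typedef struct flush_defer {
	uint64_t fd_guids[DEFER_MAX];
	size_t fd_count;
} flush_defer_t;

/*
 * Merge the vdevs traced by one write into the accumulated set,
 * skipping duplicates. Returns the number of ids newly added.
 * Sketch limitation: ids past DEFER_MAX are skipped; real code
 * must never lose a vdev that still needs flushing.
 */
static size_t
flush_defer_merge(flush_defer_t *fd, const uint64_t *guids, size_t n)
{
	size_t added = 0;

	assert(fd != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < fd->fd_count; j++) {
			if (fd->fd_guids[j] == guids[i])
				break;
		}
		if (j == fd->fd_count && fd->fd_count < DEFER_MAX) {
			fd->fd_guids[fd->fd_count++] = guids[i];
			added++;
		}
	}
	return (added);
}

/* Hypothetical callback type standing in for issuing a flush ZIO. */
typedef void (*flush_fn_t)(uint64_t guid, void *arg);

/*
 * Issue one flush per accumulated vdev, then empty the set.
 * Returns the number of flushes issued.
 */
static size_t
flush_defer_issue(flush_defer_t *fd, flush_fn_t fn, void *arg)
{
	size_t issued;

	assert(fd != NULL && fn != NULL);
	issued = fd->fd_count;
	for (size_t i = 0; i < fd->fd_count; i++)
		fn(fd->fd_guids[i], arg);
	fd->fd_count = 0;
	return (issued);
}

/* Example callback: counts flushes instead of touching a device. */
static void
count_flush(uint64_t guid, void *arg)
{
	(void) guid;
	(*(size_t *)arg)++;
}
```

For example, merging a write that touched vdevs {1, 2} and then one that
touched {2, 3} leaves three ids accumulated, so the eventual flush is
issued three times rather than four.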

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
2024-08-17 14:43:36 +10:00
crypto icp: remove skein module 2024-05-31 15:13:39 -07:00
fm Add slow disk diagnosis to ZED 2024-02-08 09:19:52 -08:00
fs ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
lua autoconf: single-step includes 2022-05-10 10:18:51 -07:00
sysevent Teach zpool scrub to scrub only blocks in error log 2023-05-18 11:59:42 -07:00
zstd Unbreak zstd build on sparc64 2022-05-25 09:18:49 -07:00
abd.h abd: lift ABD zero scan from zio_compress_data() to abd_cmp_zero() 2024-08-09 14:30:26 -07:00
abd_impl.h abd: add page iterator 2024-03-25 16:50:35 -07:00
aggsum.h More aggsum optimizations 2021-06-07 09:02:47 -07:00
arc.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
arc_impl.h Several improvements to ARC shrinking (#16197) 2024-07-25 10:31:14 -07:00
asm_linkage.h Unify Assembler files between Linux and Windows 2023-01-17 11:09:19 -08:00
avl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
avl_impl.h Make sure avl_tree.avl_pad is not in kernel module (#16280) 2024-07-17 13:54:11 -07:00
bitmap.h Replace dead opensolaris.org license links 2023-03-14 14:44:01 -07:00
bitops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
blake3.h Update BLAKE3 for using the new impl handling 2023-03-02 13:52:27 -08:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bpobj.h Add explicit prefetches to bpobj_iterate(). 2023-07-21 11:50:48 -07:00
bptree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bqueue.h Batch enqueue/dequeue for bqueue 2023-01-10 13:39:22 -08:00
brt.h zdb: include cloned blocks in block statistics 2023-08-01 08:56:30 -07:00
brt_impl.h brt: lift internal definitions into _impl header 2023-11-27 13:34:43 -08:00
btree.h btree: Implement faster binary search algorithm 2023-05-26 10:03:12 -07:00
dataset_kstats.h Update the kstat dataset_name when renaming a zvol 2023-11-07 11:34:50 -08:00
dbuf.h Skip dnode handles use when not needed 2024-07-29 14:48:12 -07:00
ddt.h ddt: lookup and log stats 2024-08-16 12:03:51 -07:00
ddt_impl.h ddt: dedup log 2024-08-16 12:03:35 -07:00
dmu.h ddt: dedup log 2024-08-16 12:03:35 -07:00
dmu_impl.h ZIL: Avoid dbuf_read() before dmu_sync(). 2023-08-11 09:04:08 -07:00
dmu_objset.h Add prefetch property 2023-10-24 11:00:07 -07:00
dmu_recv.h nvpair: Constify string functions 2023-03-14 15:25:50 -07:00
dmu_redact.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_send.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_traverse.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_tx.h Add dmu_tx_hold_append() interface 2023-05-09 09:03:10 -07:00
dmu_zfetch.h Speculative prefetch for reordered requests 2024-04-08 15:13:27 -07:00
dnode.h dnode: allow storage class to be overridden by object type 2024-07-29 17:05:41 -07:00
dsl_bookmark.h Increase limit of redaction list by using spill block 2023-08-26 11:34:43 -07:00
dsl_crypt.h Allow block cloning across encrypted datasets 2023-12-05 11:03:48 -08:00
dsl_dataset.h Revert zfeature_active() to static 2023-02-28 14:03:52 -08:00
dsl_deadlist.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_deleg.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_destroy.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_dir.h Cleanup ->dd_space_towrite should be unsigned 2023-01-20 11:10:15 -08:00
dsl_pool.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_prop.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_scan.h ddt: add "flat phys" feature 2024-08-16 12:02:39 -07:00
dsl_synctask.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_userhold.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
edonr.h Remove unused constant EdonR256_BLOCK_BITSIZE 2023-03-22 08:39:48 -07:00
efi_partition.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
frame.h Linux 5.10 compat: frame.h renamed objtool.h 2020-11-02 22:01:10 +00:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
metaslab.h Selectable block allocators 2023-09-01 18:00:30 -07:00
metaslab_impl.h Reduce number of metaslab preload taskq threads. 2023-10-06 09:04:00 -07:00
mmp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
mntent.h Expose ZFS dataset case sensitivity setting via sb_opts 2022-07-14 10:38:16 -07:00
mod.h linux: module: weld all but spl.ko into zfs.ko 2022-04-20 13:28:24 -07:00
multilist.h L2ARC: Relax locking during write 2024-04-09 16:23:19 -07:00
nvpair.h nvpair: Use flexible array member for nvpair name strings 2023-03-14 15:25:55 -07:00
nvpair_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
objlist.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
range_tree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
rrwlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa_impl.h Fix sa_add_projid to lookup and update SA_ZPL_DXATTR (avoid DXATTR loss) (#16288) 2024-07-31 18:41:49 -07:00
sha2.h icp: remove unused SHA2 HMAC mechanisms 2024-05-31 15:13:30 -07:00
skein.h icp: remove digest entry points 2024-05-31 15:13:16 -07:00
spa.h ddt: add "flat phys" feature 2024-08-16 12:02:39 -07:00
spa_checkpoint.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_checksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_impl.h Sync AUX label during pool import 2024-08-08 15:16:46 -07:00
spa_log_spacemap.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_map.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_reftree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sysevent.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
txg.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
txg_impl.h Properly pad struct tx_cpu to cache line 2023-10-20 11:54:05 -07:00
u8_textprep.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
u8_textprep_data.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock_impl.h vdev probe to slow disk can stall mmp write checker 2024-04-29 14:35:53 -07:00
uio_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
unique.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev.h RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
vdev_disk.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_draid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_file.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_impl.h vdev probe to slow disk can stall mmp write checker 2024-04-29 14:35:53 -07:00
vdev_indirect_births.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_initialize.h Add the ability to uninitialize 2023-05-18 10:02:20 -07:00
vdev_raidz.h RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
vdev_raidz_impl.h Workaround UBSAN errors for variable arrays 2023-11-12 16:26:07 -08:00
vdev_rebuild.h Do not report bytes skipped by scan as issued. 2023-06-30 08:47:13 -07:00
vdev_removal.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
vdev_trim.h Fix short-lived txg caused by autotrim 2023-03-28 08:43:41 -07:00
xvattr.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zap.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
zap_impl.h ZAP: Massively switch to _by_dnode() interfaces 2024-03-25 14:58:50 -07:00
zap_leaf.h zap_leaf: make l_hash[] variable length to silence UBSAN 2024-04-03 16:38:18 -07:00
zcp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_set.h Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zfeature.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_acl.h Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
zfs_bootenv.h zfs label bootenv should store data as nvlist 2020-09-15 15:42:27 -07:00
zfs_chksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_context.h Linux: Report reclaimable memory to kernel as such (#16385) 2024-07-30 11:40:47 -07:00
zfs_debug.h zdb/ztest: send dbgmsg output to stderr 2024-05-14 09:49:00 -07:00
zfs_delay.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_file.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_fuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_impl.h Add generic implementation handling and SHA2 impl 2023-03-02 13:52:21 -08:00
zfs_ioctl.h Parallel pool import 2024-04-22 09:42:38 -07:00
zfs_ioctl_impl.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zfs_onexit.h zfs_onexit_add_cb: make action_handle point to a uintptr_t 2022-11-03 09:52:12 -07:00
zfs_project.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_quota.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_racct.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_refcount.h Switch refcount tracking from lists to AVL-trees. 2023-06-14 08:02:27 -07:00
zfs_rlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_stat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sysfs.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vfsops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vnops.h BRT: Fix FICLONE/FICLONERANGE shortened copy 2024-02-05 16:44:45 -08:00
zfs_znode.h ZIL: Cleanup sync and commit handling 2023-10-30 14:51:56 -07:00
zil.h ZIO: add "vdev tracing" facility; use it for ZIL flushing 2024-08-17 14:43:36 +10:00
zil_impl.h ZIO: add "vdev tracing" facility; use it for ZIL flushing 2024-08-17 14:43:36 +10:00
zio.h ZIO: add "vdev tracing" facility; use it for ZIL flushing 2024-08-17 14:43:36 +10:00
zio_checksum.h Don't emit cksum_{actual_expected} in ereport.fs.zfs.checksum events 2023-07-21 11:49:26 -07:00
zio_compress.h Skip memory allocation when compressing holes 2023-02-27 14:41:02 -08:00
zio_crypt.h Enable -Wwrite-strings 2022-06-29 14:08:54 -07:00
zio_impl.h zio: rename ZIO_TYPE_IOCTL to ZIO_TYPE_FLUSH 2024-04-11 17:17:23 -07:00
zio_priority.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
zrlock.h Pack zrlock_t by 8 bytes 2023-01-05 09:31:55 -08:00
zthr.h Avoid memory allocations in the ARC eviction thread 2022-01-21 10:28:13 -08:00
zvol.h zvol: fix delayed update to block device ro entry 2023-10-31 09:50:38 -07:00
zvol_impl.h zvol: ensure device minors are properly cleaned up 2024-08-06 12:08:14 -07:00