zfs/include/sys
Rob Norris e7966e581a zio_flush: propagate flush errors to the ZIL
Since the beginning, ZFS' "flush" operation has always ignored
errors[1]. Write errors are captured and dealt with, but if a write
succeeds but the subsequent flush fails, the operation as a whole will
appear to succeed[2].

In the end-of-transaction uberblock+label write+flush ceremony, it's
very difficult for this situation to occur. Since all devices are
written to, typically the first write will succeed, the first flush will
fail unobserved, but then the second write will fail, and the entire
transaction is aborted. It's difficult to imagine a real-world scenario
where all the writes in that sequence could succeed even as the flushes
are failing (understanding that the OS is still seeing hardware problems
and taking devices offline).

In the ZIL however, it's another story. Since only the write response is
checked, if that write succeeds but the flush then fails, the ZIL will
believe that it succeeds, and zil_commit() (and thus fsync()) will
return success rather than the "correct" behaviour of falling back into
txg_wait_synced()[3].

This commit fixes this by adding a simple flag to zio_flush() to
indicate whether or not the caller wants to receive flush errors. This
flag is enabled for ZIL calls. The existing zio chaining inside the ZIL
and the flush handler zil_lwb_flush_vdevs_done() already has all the
necessary support to properly handle a flush failure and fail the entire
zio chain. This causes zil_commit() to correct fall back to
txg_wait_synced() rather than returning success prematurely.

1. The ZFS birth commit (illumos/illumos-gate@fa9e4066f0) had support
   for flushing devices with write caches with the DKIOCFLUSHWRITECACHE
   ioctl. No errors are checked. The comment in `zil_flush_vdevs()` from
   from the time shows the thinking:

   /*
    * Wait for all the flushes to complete.  Not all devices actually
    * support the DKIOCFLUSHWRITECACHE ioctl, so it's OK if it fails.
    */

2. It's not entirely clear from the code history why this was acceptable
   for devices that _do_ have write caches. Our best guess is that this
   was an oversight: between the combination of hardware, pool topology
   and application behaviour required to hit this, it basically didn't
   come up.

3. Somewhat frustratingly, zil.c contains comments describing this exact
   behaviour, and further discussion in #12443 (September 2021). It
   appears that those involved saw the potential, but were looking at a
   different problem and so didn't have the context to recognise it for
   what it was.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
2024-08-17 14:42:45 +10:00
..
crypto icp: remove skein module 2024-05-31 15:13:39 -07:00
fm Add slow disk diagnosis to ZED 2024-02-08 09:19:52 -08:00
fs ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
lua autoconf: single-step includes 2022-05-10 10:18:51 -07:00
sysevent Teach zpool scrub to scrub only blocks in error log 2023-05-18 11:59:42 -07:00
zstd Unbreak zstd build on sparc64 2022-05-25 09:18:49 -07:00
abd.h abd: lift ABD zero scan from zio_compress_data() to abd_cmp_zero() 2024-08-09 14:30:26 -07:00
abd_impl.h abd: add page iterator 2024-03-25 16:50:35 -07:00
aggsum.h More aggsum optimizations 2021-06-07 09:02:47 -07:00
arc.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
arc_impl.h Several improvements to ARC shrinking (#16197) 2024-07-25 10:31:14 -07:00
asm_linkage.h Unify Assembler files between Linux and Windows 2023-01-17 11:09:19 -08:00
avl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
avl_impl.h Make sure avl_tree.avl_pad is not in kernel module (#16280) 2024-07-17 13:54:11 -07:00
bitmap.h Replace dead opensolaris.org license links 2023-03-14 14:44:01 -07:00
bitops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
blake3.h Update BLAKE3 for using the new impl handling 2023-03-02 13:52:27 -08:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bpobj.h Add explicit prefetches to bpobj_iterate(). 2023-07-21 11:50:48 -07:00
bptree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bqueue.h Batch enqueue/dequeue for bqueue 2023-01-10 13:39:22 -08:00
brt.h zdb: include cloned blocks in block statistics 2023-08-01 08:56:30 -07:00
brt_impl.h brt: lift internal definitions into _impl header 2023-11-27 13:34:43 -08:00
btree.h btree: Implement faster binary search algorithm 2023-05-26 10:03:12 -07:00
dataset_kstats.h Update the kstat dataset_name when renaming a zvol 2023-11-07 11:34:50 -08:00
dbuf.h Skip dnode handles use when not needed 2024-07-29 14:48:12 -07:00
ddt.h ddt: lookup and log stats 2024-08-16 12:03:51 -07:00
ddt_impl.h ddt: dedup log 2024-08-16 12:03:35 -07:00
dmu.h ddt: dedup log 2024-08-16 12:03:35 -07:00
dmu_impl.h ZIL: Avoid dbuf_read() before dmu_sync(). 2023-08-11 09:04:08 -07:00
dmu_objset.h Add prefetch property 2023-10-24 11:00:07 -07:00
dmu_recv.h nvpair: Constify string functions 2023-03-14 15:25:50 -07:00
dmu_redact.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_send.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_traverse.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_tx.h Add dmu_tx_hold_append() interface 2023-05-09 09:03:10 -07:00
dmu_zfetch.h Speculative prefetch for reordered requests 2024-04-08 15:13:27 -07:00
dnode.h dnode: allow storage class to be overridden by object type 2024-07-29 17:05:41 -07:00
dsl_bookmark.h Increase limit of redaction list by using spill block 2023-08-26 11:34:43 -07:00
dsl_crypt.h Allow block cloning across encrypted datasets 2023-12-05 11:03:48 -08:00
dsl_dataset.h Revert zfeature_active() to static 2023-02-28 14:03:52 -08:00
dsl_deadlist.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_deleg.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_destroy.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_dir.h Cleanup ->dd_space_towrite should be unsigned 2023-01-20 11:10:15 -08:00
dsl_pool.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_prop.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_scan.h ddt: add "flat phys" feature 2024-08-16 12:02:39 -07:00
dsl_synctask.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_userhold.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
edonr.h Remove unused constant EdonR256_BLOCK_BITSIZE 2023-03-22 08:39:48 -07:00
efi_partition.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
frame.h Linux 5.10 compat: frame.h renamed objtool.h 2020-11-02 22:01:10 +00:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
metaslab.h Selectable block allocators 2023-09-01 18:00:30 -07:00
metaslab_impl.h Reduce number of metaslab preload taskq threads. 2023-10-06 09:04:00 -07:00
mmp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
mntent.h Expose ZFS dataset case sensitivity setting via sb_opts 2022-07-14 10:38:16 -07:00
mod.h linux: module: weld all but spl.ko into zfs.ko 2022-04-20 13:28:24 -07:00
multilist.h L2ARC: Relax locking during write 2024-04-09 16:23:19 -07:00
nvpair.h nvpair: Use flexible array member for nvpair name strings 2023-03-14 15:25:55 -07:00
nvpair_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
objlist.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
range_tree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
rrwlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa_impl.h Fix sa_add_projid to lookup and update SA_ZPL_DXATTR (avoid DXATTR loss) (#16288) 2024-07-31 18:41:49 -07:00
sha2.h icp: remove unused SHA2 HMAC mechanisms 2024-05-31 15:13:30 -07:00
skein.h icp: remove digest entry points 2024-05-31 15:13:16 -07:00
spa.h ddt: add "flat phys" feature 2024-08-16 12:02:39 -07:00
spa_checkpoint.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_checksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_impl.h Sync AUX label during pool import 2024-08-08 15:16:46 -07:00
spa_log_spacemap.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_map.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_reftree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sysevent.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
txg.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
txg_impl.h Properly pad struct tx_cpu to cache line 2023-10-20 11:54:05 -07:00
u8_textprep.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
u8_textprep_data.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock_impl.h vdev probe to slow disk can stall mmp write checker 2024-04-29 14:35:53 -07:00
uio_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
unique.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev.h RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
vdev_disk.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_draid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_file.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_impl.h vdev probe to slow disk can stall mmp write checker 2024-04-29 14:35:53 -07:00
vdev_indirect_births.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_initialize.h Add the ability to uninitialize 2023-05-18 10:02:20 -07:00
vdev_raidz.h RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
vdev_raidz_impl.h Workaround UBSAN errors for variable arrays 2023-11-12 16:26:07 -08:00
vdev_rebuild.h Do not report bytes skipped by scan as issued. 2023-06-30 08:47:13 -07:00
vdev_removal.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
vdev_trim.h Fix short-lived txg caused by autotrim 2023-03-28 08:43:41 -07:00
xvattr.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zap.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
zap_impl.h ZAP: Massively switch to _by_dnode() interfaces 2024-03-25 14:58:50 -07:00
zap_leaf.h zap_leaf: make l_hash[] variable length to silence UBSAN 2024-04-03 16:38:18 -07:00
zcp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_set.h Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zfeature.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_acl.h Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
zfs_bootenv.h zfs label bootenv should store data as nvlist 2020-09-15 15:42:27 -07:00
zfs_chksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_context.h Linux: Report reclaimable memory to kernel as such (#16385) 2024-07-30 11:40:47 -07:00
zfs_debug.h zdb/ztest: send dbgmsg output to stderr 2024-05-14 09:49:00 -07:00
zfs_delay.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_file.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_fuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_impl.h Add generic implementation handling and SHA2 impl 2023-03-02 13:52:21 -08:00
zfs_ioctl.h Parallel pool import 2024-04-22 09:42:38 -07:00
zfs_ioctl_impl.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zfs_onexit.h zfs_onexit_add_cb: make action_handle point to a uintptr_t 2022-11-03 09:52:12 -07:00
zfs_project.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_quota.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_racct.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_refcount.h Switch refcount tracking from lists to AVL-trees. 2023-06-14 08:02:27 -07:00
zfs_rlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_stat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sysfs.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vfsops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vnops.h BRT: Fix FICLONE/FICLONERANGE shortened copy 2024-02-05 16:44:45 -08:00
zfs_znode.h ZIL: Cleanup sync and commit handling 2023-10-30 14:51:56 -07:00
zil.h zil: add stats for commit failure/fallback (#16315) 2024-07-25 16:53:59 -07:00
zil_impl.h ZIL: Improve next log block size prediction 2023-12-21 10:54:44 -08:00
zio.h zio_flush: propagate flush errors to the ZIL 2024-08-17 14:42:45 +10:00
zio_checksum.h Don't emit cksum_{actual_expected} in ereport.fs.zfs.checksum events 2023-07-21 11:49:26 -07:00
zio_compress.h Skip memory allocation when compressing holes 2023-02-27 14:41:02 -08:00
zio_crypt.h Enable -Wwrite-strings 2022-06-29 14:08:54 -07:00
zio_impl.h zio: rename ZIO_TYPE_IOCTL to ZIO_TYPE_FLUSH 2024-04-11 17:17:23 -07:00
zio_priority.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
zrlock.h Pack zrlock_t by 8 bytes 2023-01-05 09:31:55 -08:00
zthr.h Avoid memory allocations in the ARC eviction thread 2022-01-21 10:28:13 -08:00
zvol.h zvol: fix delayed update to block device ro entry 2023-10-31 09:50:38 -07:00
zvol_impl.h zvol: ensure device minors are properly cleaned up 2024-08-06 12:08:14 -07:00