zfs/module
Rob Norris 2724bcb3d6 zil: allow the ZIL to fail and restart independently of the pool
zil_commit() has always returned void, and thus, cannot fail. Everything
inside it assumed that if anything ever went wrong, it could fall back
on txg_wait_synced() until the txg covering the operations being flushed
from the ZIL has fully committed. This meant that if the pool failed and
failmode=continue was set, syncing operations like fsync() would still
block.

Unblocking zil_commit() means largely the same approach. The difficulty
is that the ZIL carries the record of uncommitted VFS operations (vs the
changed data), and attached to those, callbacks and cvs that will
release userspace callers once the data is on disk. So if we can't write
the ZIL, we also can't release those records until the data is on disk.

This wasn't a problem before, because the zil_commit() would block. If
we change zil_commit() to return error, we still need to track those
entries until the data they represent hits the disk. We also need to
accept new records; just because the ZIL fails may not necessarily mean
the pool itself is unavailable.

This commit reorganises the ZIL to allow zil_commit() to return failure.
If ZIL writes or flushes fail, the ZIL is moved into a "failed" state,
and no further writes are done; all zil_commit() calls are serviced by
the regular txg mechanism. Outstanding records (itx_ts) are held until
the main pool writes their associated txg out. The records are then
released. Once all records are cleared, the ZIL is reset and reopened.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
(cherry picked from commit af821006f6602261e690fe6635689cabdeefcadf)
2023-07-05 13:27:31 +00:00
..
avl Fix various typos 2021-04-07 13:27:11 -07:00
icp Fix functions without a prototype 2022-05-20 10:33:24 -07:00
lua Use fallthrough macro 2021-11-02 09:50:30 -07:00
nvpair Update `checkstyle` workflow env to ubuntu-20.04 2021-12-08 13:27:56 -08:00
os zil: allow the ZIL to fail and restart independently of the pool 2023-07-05 13:27:31 +00:00
spl Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
unicode Update `checkstyle` workflow env to ubuntu-20.04 2021-12-08 13:27:56 -08:00
zcommon Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU state 2022-02-16 17:58:55 -08:00
zfs zil: allow the ZIL to fail and restart independently of the pool 2023-07-05 13:27:31 +00:00
zstd module: zstd: check we don't leak symbols; regenerate symbol map 2022-05-16 15:48:21 -07:00
.gitignore Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
Kbuild.in Add a JSON equivalent to zpool-status(8) 2023-07-05 13:27:30 +00:00
Makefile.bsd Add a JSON equivalent to zpool-status(8) 2023-07-05 13:27:30 +00:00
Makefile.in Add support for $KERNEL_{CC,LD,LLVM} variables 2022-02-16 17:58:55 -08:00