zfs/os at 5a256eaed19980ad5508bc5a7b0a71a84aba0963 - zfs

freebsd	vnops: thread DMU_TX_ASSIGN_CONTINUE to a bunch of vnops	2023-07-05 13:27:30 +00:00
linux	zil: allow the ZIL to fail and restart independently of the pool	2023-07-05 13:27:31 +00:00

Rob Norris 2724bcb3d6 zil: allow the ZIL to fail and restart independently of the pool

zil_commit() has always returned void, and thus, cannot fail. Everything
inside it assumed that if anything ever went wrong, it could fall back
on txg_wait_synced() until the txg covering the operations being flushed
from the ZIL has fully committed. This meant that if the pool failed and
failmode=continue was set, syncing operations like fsync() would still
block.

Unblocking zil_commit() means largely the same approach. The difficulty
is that the ZIL carries the record of uncommitted VFS operations (as
opposed to the changed data), and attached to those records are the
callbacks and condition variables (cvs) that release userspace callers
once the data is on disk. So if we can't write the ZIL, we also can't
release those records until the data is on disk.

This wasn't a problem before, because zil_commit() would block. If
we change zil_commit() to return an error, we still need to track those
entries until the data they represent hits the disk. We also need to
accept new records; a ZIL failure does not necessarily mean that the
pool itself is unavailable.

This commit reorganises the ZIL to allow zil_commit() to return failure.
If ZIL writes or flushes fail, the ZIL is moved into a "failed" state,
and no further writes are done; all zil_commit() calls are serviced by
the regular txg mechanism. Outstanding records (itx_ts) are held until
the main pool writes their associated txg out. The records are then
released. Once all records are cleared, the ZIL is reset and reopened.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
(cherry picked from commit af821006f6602261e690fe6635689cabdeefcadf)

2023-07-05 13:27:31 +00:00
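
The failure handling described in the commit message can be pictured as a small state machine. The following is only a toy sketch under stated assumptions, not the OpenZFS implementation: every name in it (toy_log_t, toy_log_commit(), toy_log_txg_synced(), the write_ok flag, the fixed-size array) is hypothetical, and a successful log write is simplified to releasing the record immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; this is a model, not OpenZFS code. */
enum { TOY_MAX_HELD = 16 };

typedef enum { TOY_LOG_OK, TOY_LOG_FAILED } toy_log_state_t;

typedef struct {
    toy_log_state_t state;
    uint64_t held_txg[TOY_MAX_HELD]; /* txg of each record still held */
    size_t   nheld;
    unsigned restarts;               /* times the log was reset/reopened */
} toy_log_t;

void toy_log_init(toy_log_t *lg)
{
    lg->state = TOY_LOG_OK;
    lg->nheld = 0;
    lg->restarts = 0;
}

/*
 * Commit one record whose data belongs to `txg`. `write_ok` stands in
 * for the outcome of the log write or flush. Returns 0 if the record
 * reached the log, -1 if the caller must rely on the txg being synced.
 */
int toy_log_commit(toy_log_t *lg, uint64_t txg, bool write_ok)
{
    if (lg->state == TOY_LOG_OK && write_ok)
        return 0;

    /* Write failed, or the log has already failed: do no log writes. */
    lg->state = TOY_LOG_FAILED;
    assert(lg->nheld < TOY_MAX_HELD);
    lg->held_txg[lg->nheld++] = txg; /* hold until this txg is on disk */
    return -1;
}

/*
 * The main pool has written out everything up to `synced_txg`.
 * Release the covered records; once none remain, reset and reopen
 * the log. Returns the number of records released.
 */
size_t toy_log_txg_synced(toy_log_t *lg, uint64_t synced_txg)
{
    size_t kept = 0, released = 0;

    for (size_t i = 0; i < lg->nheld; i++) {
        if (lg->held_txg[i] <= synced_txg)
            released++;
        else
            lg->held_txg[kept++] = lg->held_txg[i];
    }
    lg->nheld = kept;

    if (lg->state == TOY_LOG_FAILED && lg->nheld == 0) {
        lg->state = TOY_LOG_OK;
        lg->restarts++;
    }
    return released;
}
```

In this model, a failed write flips the log into the failed state, later commits are still accepted but return an error and are held, and the log only returns to service after the last held record's txg has synced.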
