zil: ensure flush errors are received

It's possible for hardware to fail in such a way that the ZIL block
writes appear to succeed, but the flush fails.

Because flush errors were being ignored, the lwb chain would finish with
a zero error code, which would result in zil_commit() returning and thus
fsync() returning success to the caller, even though the data was not
recorded in the ZIL.
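
For context, the contract at stake can be seen from the caller's side. The
sketch below is illustrative only and is not part of this change:
durable_write() is a hypothetical helper that treats a zero return from
fsync() as its only evidence that the data is durable, which is why a
swallowed flush error is so damaging.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write buf to path and fsync it. Returns 0 only if
 * every step, including fsync(), succeeded; otherwise returns the errno of
 * the first failing call.
 */
int
durable_write(const char *path, const void *buf, size_t len)
{
	assert(path != NULL && buf != NULL);

	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return (errno);

	const char *p = buf;
	size_t left = len;
	int err = 0;
	while (left > 0) {
		ssize_t n = write(fd, p, left);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			err = errno;
			break;
		}
		p += n;
		left -= (size_t)n;
	}

	/* The caller's only durability signal: fsync() must return 0. */
	if (err == 0 && fsync(fd) != 0)
		err = errno;

	if (close(fd) != 0 && err == 0)
		err = errno;
	return (err);
}
```

If the filesystem reports success from fsync() while the flush actually
failed, a caller like this has no way to know it should retry or raise an
error.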

If the ZIL is on the main pool (no SLOG device), the pool would
typically suspend around the same time. If that happened before the txg
committed, those writes would be lost entirely: neither on the pool nor
in the ZIL.

zil_lwb_flush_vdevs_done() has the necessary code to deal with this
situation, but zio_flush() would never return failure, so that code
never saw the error. This change allows flushes to report failure, so a
failed ZIL write is no longer missed.
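
The effect of dropping the flag can be illustrated with a toy model. This
is a simplified sketch, not the real zio code: struct toy_io,
toy_io_done() and TOY_FLAG_DONT_PROPAGATE are hypothetical stand-ins for
zio_t, zio completion and ZIO_FLAG_DONT_PROPAGATE, and the parent here
simply keeps the first error it sees.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag, standing in for ZIO_FLAG_DONT_PROPAGATE. */
#define	TOY_FLAG_DONT_PROPAGATE	0x1

/* Toy stand-in for a zio: an error code, flags, and a parent link. */
struct toy_io {
	int error;
	int flags;
	struct toy_io *parent;
};

/*
 * Complete a child I/O with the given error. The error reaches the parent
 * only when the child was not created with TOY_FLAG_DONT_PROPAGATE.
 * Simplification: the parent keeps the first error it sees.
 */
void
toy_io_done(struct toy_io *child, int error)
{
	assert(child != NULL);
	child->error = error;
	if (error == 0 || child->parent == NULL)
		return;
	if (child->flags & TOY_FLAG_DONT_PROPAGATE)
		return;
	if (child->parent->error == 0)
		child->parent->error = error;
}

/* Returns the error the root sees when a flush child fails with EIO. */
int
toy_root_error_after_failed_flush(int flush_flags)
{
	struct toy_io root = { 0, 0, NULL };
	struct toy_io flush = { 0, flush_flags, &root };

	toy_io_done(&flush, EIO);
	return (root.error);
}
```

In this model, toy_root_error_after_failed_flush(TOY_FLAG_DONT_PROPAGATE)
returns 0 (the old behaviour: the root never learns of the failure), while
toy_root_error_after_failed_flush(0) returns EIO, which is the case the
done callback is written to handle.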

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
(cherry picked from commit d9db5dccc56b551d0bf66bc9022b6c19a659b7e1)
Rob Norris 2023-05-10 16:49:12 +10:00 committed by Geoff Amey
parent 8ec175d7e1
commit cdaf041d39
1 changed files with 5 additions and 14 deletions

@@ -1197,12 +1197,6 @@ zil_lwb_flush_vdevs_done(zio_t *zio)
 		 * includes ZIO errors from either this LWB's write or
 		 * flush, as well as any errors from other dependent LWBs
 		 * (e.g. a root LWB ZIO that might be a child of this LWB).
-		 *
-		 * With that said, it's important to note that LWB flush
-		 * errors are not propagated up to the LWB root ZIO.
-		 * This is incorrect behavior, and results in VDEV flush
-		 * errors not being handled correctly here. See the
-		 * comment above the call to "zio_flush" for details.
 		 */
 		zcw->zcw_zio_error = zio->io_error;
@@ -1315,15 +1309,12 @@ zil_lwb_write_done(zio_t *zio)
 		vdev_t *vd = vdev_lookup_top(spa, zv->zv_vdev);
 		if (vd != NULL) {
 			/*
-			 * The "ZIO_FLAG_DONT_PROPAGATE" is currently
-			 * always used within "zio_flush". This means,
-			 * any errors when flushing the vdev(s), will
-			 * (unfortunately) not be handled correctly,
-			 * since these "zio_flush" errors will not be
-			 * propagated up to "zil_lwb_flush_vdevs_done".
+			 * Issue DKIOCFLUSHWRITECACHE to all vdevs that have
+			 * been touched by writes for this or previous lwbs
+			 * that had their flushes deferred. Flush errors will
+			 * be delivered to zil_lwb_flush_vdevs_done().
 			 */
-			zio_flush(lwb->lwb_root_zio, vd,
-			    ZIO_FLAG_DONT_PROPAGATE);
+			zio_flush(lwb->lwb_root_zio, vd, 0);
 		}
 		kmem_free(zv, sizeof (*zv));
 	}