Handle unexpected errors in zil_lwb_commit() without ASSERT()
We tripped `ASSERT(error == ENOENT || error == EEXIST || error == EALREADY)` in `zil_lwb_commit()` at Klara when doing robustness testing of ZIL against drive power cycles. That assertion presumably exists because when this code was written, the only errors expected from here were EIO, ENOENT, EEXIST and EALREADY, with EIO having its own handling before the assertion. However, upon doing a manual depth first search traversal of the source tree, it turns out that a large number of unexpected errors are possible here. In theory, EINVAL and ENOSPC can come from dnode_hold_impl(). However, most unexpected errors originate in the block layer and come to us from zio_wait() in various ways. One way is ->zl_get_data() -> dmu_buf_hold() -> dbuf_read() -> zio_wait(). From vdev_disk.c on Linux alone, zio_wait() can return the unexpected errors ENXIO, ENOTSUP, EOPNOTSUPP, ETIMEDOUT, ENOSPC, ENOLINK, EREMOTEIO, EBADE, ENODATA, EILSEQ and ENOMEM This was only observed after what have been likely over 1000 test iterations, so we do not expect to reproduce this again to find out what the error code was. However, circumstantial evidence suggests that the error was ENXIO. When ENXIO or any other unexpected error occurs, the `fsync()` or equivalent operation that called zil_commit() will return success, when in fact, dirty data has not been committed to stable storage. This is a violation of the Single UNIX Specification. The code should be able to handle this and any other unknown error by calling `txg_wait_synced()`. In addition to changing the code to call txg_wait_synced() on unexpected errors instead of returning, we modify it to print information about unexpected errors to dmesg. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@klarasystems.com> Sponsored-By: Wasabi Technology, Inc. Closes #14532
This commit is contained in:
parent
25c4d1f032
commit
3a7c35119e
|
@ -1977,13 +1977,39 @@ cont:
|
|||
/* Zero any padding bytes in the last block. */
|
||||
memset((char *)dbuf + lrwb->lr_length, 0, dpad);
|
||||
|
||||
if (error == EIO) {
|
||||
/*
|
||||
* Typically, the only return values we should see from
|
||||
* ->zl_get_data() are 0, EIO, ENOENT, EEXIST or
|
||||
* EALREADY. However, it is also possible to see other
|
||||
* error values such as ENOSPC or EINVAL from
|
||||
* dmu_read() -> dnode_hold() -> dnode_hold_impl() or
|
||||
* ENXIO as well as a multitude of others from the
|
||||
* block layer through dmu_buf_hold() -> dbuf_read()
|
||||
* -> zio_wait(), as well as through dmu_read() ->
|
||||
* dnode_hold() -> dnode_hold_impl() -> dbuf_read() ->
|
||||
* zio_wait(). When these errors happen, we can assume
|
||||
* that neither an immediate write nor an indirect
|
||||
* write occurred, so we need to fall back to
|
||||
* txg_wait_synced(). This is unusual, so we print to
|
||||
* dmesg whenever one of these errors occurs.
|
||||
*/
|
||||
switch (error) {
|
||||
case 0:
|
||||
break;
|
||||
default:
|
||||
cmn_err(CE_WARN, "zil_lwb_commit() received "
|
||||
"unexpected error %d from ->zl_get_data()"
|
||||
". Falling back to txg_wait_synced().",
|
||||
error);
|
||||
zfs_fallthrough;
|
||||
case EIO:
|
||||
txg_wait_synced(zilog->zl_dmu_pool, txg);
|
||||
return (lwb);
|
||||
}
|
||||
if (error != 0) {
|
||||
ASSERT(error == ENOENT || error == EEXIST ||
|
||||
error == EALREADY);
|
||||
zfs_fallthrough;
|
||||
case ENOENT:
|
||||
zfs_fallthrough;
|
||||
case EEXIST:
|
||||
zfs_fallthrough;
|
||||
case EALREADY:
|
||||
return (lwb);
|
||||
}
|
||||
}
|
||||
|
|
Loading…
Reference in New Issue