zfs/module
Nathaniel Wesley Filardo 056a658dee
vdev_mirror: don't scrub/resilver devices that can't be read
This ensures that we don't accumulate checksum errors against offline or
unavailable devices but, more importantly, means that we don't
needlessly create DTL entries for offline devices that are already
up-to-date.

Consider a 3-way mirror, with disk A always online (and so always with
an empty DTL) and B and C only occasionally online.  When A & B resilver
with C offline, B's DTL will effectively be appended to C's due to these
spurious ZIOs even as the resilver empties B's DTL:

  * These ZIOs land in vdev_mirror_scrub_done() and flag an error

  * That flagged error causes vdev_mirror_io_done() to see
    unexpected_errors, so it issues a ZIO_TYPE_WRITE repair ZIO, which
    inherits ZIO_FLAG_SCAN_THREAD because zio_vdev_child_io() includes
    that flag in ZIO_VDEV_CHILD_FLAGS.

  * That ZIO fails, too, and eventually zio_done() gets its hands on it
    and calls vdev_stat_update().

  * vdev_stat_update() sees the error and this zio...

    * is not speculative,
    * is not due to EIO (but rather ENXIO, since the device is closed)
    * has an ->io_vd != NULL (specifically, the offline leaf device)
    * is a write
    * is for a txg != 0 (but rather the read block's physical birth txg)
    * has ZIO_FLAG_SCAN_THREAD asserted

  * So: vdev_stat_update() calls vdev_dtl_dirty() on the offline vdev.

Then, when A & C resilver with B offline, that story gets replayed and
C's DTL will be appended to B's.

In fact, one does not need this permanently-broken-mirror scenario to
induce badness: breaking a mirror with no DTLs and then scrubbing will
create DTLs for all offline devices.  These DTLs will persist until the
entire mirror is reassembled for the duration of the *resilver*, which,
incidentally, will not consider the devices with good data to be sources
of good data in the case of a read failure.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nathaniel Wesley Filardo <nwfilardo@gmail.com>
Closes #11930
2021-04-27 17:48:11 -07:00
..
avl Fix various typos 2021-04-02 18:52:15 -07:00
icp ICP: Silence objtool "stack pointer realignment" warnings 2021-04-17 13:11:18 -07:00
lua cppcheck: integrete cppcheck 2021-01-26 16:12:26 -08:00
nvpair Links in Source Files 2020-09-02 09:42:12 -07:00
os Drop "All rights reserved" from files by trasz@FreeBSD.org 2021-04-27 08:25:48 -07:00
spl Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
unicode Fix various typos 2021-04-02 18:52:15 -07:00
zcommon Fix AVX512BW Fletcher code on AVX512-but-not-BW machines 2021-04-26 12:42:42 -07:00
zfs vdev_mirror: don't scrub/resilver devices that can't be read 2021-04-27 17:48:11 -07:00
zstd Fix various typos 2021-04-02 18:52:15 -07:00
.gitignore Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
Kbuild.in Add zstd support to zfs 2020-08-20 10:30:06 -07:00
Makefile.bsd Restore FreeBSD resource usage accounting 2021-02-19 22:34:33 -08:00
Makefile.in FreeBSD module --enable-debug --enable-invariants 2021-03-05 12:16:41 -08:00