OpenZFS 8166 - zpool scrub thinks it repaired offline device
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Matthew Ahrens <mahrens@delphix.com>

If we do a scrub while a leaf device is offline (via "zpool offline"),
we will inadvertently clear the DTL (dirty time log) of the offline
device, even though it is still damaged. When the device comes back
online, we will incompletely resilver it, thinking that the scrub
repaired blocks written before the scrub was started. The incomplete
resilver can lead to data loss if there is a subsequent failure of a
different leaf device.

The fix is to never clear the DTL of offline devices. Note that if a
device is onlined while a scrub is in progress, the scrub will be
restarted.

The problem can be worked around by running "zpool scrub" after
"zpool online".

OpenZFS-issue: https://www.illumos.org/issues/8166
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/372
Closes #5806
Closes #6103
parent 36ccb9db43
commit 1e5f75ecbe
@@ -1799,6 +1799,9 @@ vdev_dtl_should_excise(vdev_t *vd)
 	ASSERT0(scn->scn_phys.scn_errors);
 	ASSERT0(vd->vdev_children);
 
+	if (vd->vdev_state < VDEV_STATE_DEGRADED)
+		return (B_FALSE);
+
 	if (vd->vdev_resilver_txg == 0 ||
 	    range_tree_space(vd->vdev_dtl[DTL_MISSING]) == 0)
 		return (B_TRUE);
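
The effect of the added check can be illustrated with a small standalone model. This is only a sketch, not the ZFS implementation: the model_* types, fields, and functions below are hypothetical stand-ins for vdev_t and its DTL, the enum models only the ordering that the new comparison relies on (an offline device ranks below a degraded one), and the part of vdev_dtl_should_excise() that lies outside the hunk is replaced by a conservative "return false".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified device states; only the ordering matters here. */
typedef enum {
	MODEL_STATE_OFFLINE,	/* e.g. after "zpool offline" */
	MODEL_STATE_DEGRADED,
	MODEL_STATE_HEALTHY
} model_state_t;

/* Hypothetical stand-in for a leaf vdev and its DTL_MISSING range tree. */
typedef struct {
	model_state_t state;
	unsigned children;	/* 0 for a leaf device */
	uint64_t resilver_txg;	/* 0 if the device is not being resilvered */
	uint64_t dtl_missing;	/* amount of data recorded as missing */
} model_vdev_t;

/*
 * Model of the decision shown in the hunk: may a scrub that finished
 * without errors clear this leaf's DTL?
 */
bool
model_dtl_should_excise(const model_vdev_t *vd)
{
	assert(vd->children == 0);	/* leaf devices only */

	/* The fix: never clear the DTL of a device below DEGRADED. */
	if (vd->state < MODEL_STATE_DEGRADED)
		return (false);

	if (vd->resilver_txg == 0 || vd->dtl_missing == 0)
		return (true);

	/* Assumption: the remaining resilver logic is not modeled. */
	return (false);
}

/* Model of scrub completion: clear the DTL only when it is allowed. */
void
model_scrub_done(model_vdev_t *vd)
{
	if (model_dtl_should_excise(vd))
		vd->dtl_missing = 0;
}
```

In this model, an offline leaf with a non-empty DTL keeps its DTL across model_scrub_done(), so a later resilver would still know which data is missing. Without the state check, the same device would take the "resilver_txg == 0" branch and have its DTL cleared, which corresponds to the bug described in the commit message.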