431083f75b
Address the following bugs in persistent error log: 1) Check nested clones, eg "fs->snap->clone->snap2->clone2". 2) When deleting files containing error blocks in those clones (from "clone" the example above), do not break the check chain. 3) When deleting files in the originating fs before syncing the errlog to disk, do not break the check chain. This happens because at the time of introducing the error block in the error list, we do not have its birth txg and the head filesystem. If the original file is deleted before the error list is synced to the error log (which is when we actually lookup the birth txg and the head filesystem), then we do not have access to this info anymore and break the check chain. The most prominent change is related to achieving (3). We expand the spa_error_entry_t structure to accommodate the newly introduced zbookmark_err_phys_t structure (containing the birth txg of the error block).Due to compatibility reasons we cannot remove the zbookmark_phys_t structure and we also need to place the new structure after se_avl, so it is not accounted for in avl_find(). Then we modify spa_log_error() to also provide the birth txg of the error block. With these changes in place we simplify the previously introduced function get_head_and_birth_txg() (now named get_head_ds()). We chose not to follow the same approach for the head filesystem (thus completely removing get_head_ds()) to avoid introducing new lock contentions. The stack sizes of nested functions (as measured by checkstack.pl in the linux kernel) are: check_filesystem [zfs]: 272 (was 912) check_clones [zfs]: 64 We also introduced two new tests covering the above changes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14633 |
||
---|---|---|
.. | ||
common.run | ||
freebsd.run | ||
linux.run | ||
longevity.run | ||
perf-regression.run | ||
sanity.run | ||
sunos.run |