Fault Management Logic for ZED
The integration of Fault Management Daemon (FMD) logic from illumos is being deployed in three phases. This logic is encapsulated in several software modules inside ZED.
ZED+FM Phase 1
All the Phase 1 work is in the current master branch. Phase 1 work includes:
- Add new paths to the persistent VDEV label for device matching.
- Add a disk monitor for generating disk-add and disk-change events.
- Add support for VDEV auto-online, auto-replace and auto-expand.
- Expand the statechange event to include all VDEV state transitions.
ZED+FM Phase 2 (WIP)
The phase 2 work primarily entails the Diagnosis Engine and the Retire Agent modules. It also includes infrastructure to support a crude FMD environment to host these modules. For additional information see the FMD Components in ZED and Implementation Notes sections below.
ZED+FM Phase 3
Future work will add additional functionality and will likely include:
- Add FMD module garbage collection (periodically call `fmd_module_gc()`).
- Add real module property retrieval (currently hard-coded in accessors).
- Additional diagnosis telemetry (like latency outliers and SMART data).
- Export FMD module statistics.
- Zedlet parallel execution and resiliency (add watchdog).
ZFS Fault Management Overview
The primary purpose of ZFS fault management is automated diagnosis
and isolation of VDEV faults. A fault is something we can associate
with an impact (e.g. loss of data redundancy) and a corrective action
(e.g. offline or replace a disk). A typical ZFS fault management stack
consists of error detectors (e.g. `zfs_ereport_post()`), a disk
monitor, a diagnosis engine and response agents.
After detecting a software error, the ZFS kernel module sends error events to the ZED user daemon which in turn routes the events to its internal FMA modules based on their event subscriptions. Likewise, if a disk is added or changed in the system, the disk monitor sends disk events which are consumed by a response agent.
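The subscription-based routing described above can be sketched as a table of class prefixes and receive callbacks. This is only an illustration and not the ZED code: the names `agent_sub_t`, `dispatch_event()` and `sketch_subs`, as well as the prefixes in the table, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical receive callback; a real agent would get the full event. */
typedef void (*agent_recv_fn)(const char *event_class);

/* Hypothetical subscription entry: a class prefix and who receives it. */
typedef struct agent_sub {
	const char *prefix;
	agent_recv_fn recv;
} agent_sub_t;

/* Counters so the example can observe which agent was called. */
static int diagnosis_seen;
static int retire_seen;

static void
diagnosis_recv(const char *event_class)
{
	(void) event_class;
	diagnosis_seen++;
}

static void
retire_recv(const char *event_class)
{
	(void) event_class;
	retire_seen++;
}

/* Illustrative table only; not the subscriptions ZED actually uses. */
static const agent_sub_t sketch_subs[] = {
	{ "ereport.fs.zfs.", diagnosis_recv },
	{ "sysevent.fs.zfs.vdev_remove", retire_recv },
};

/*
 * Deliver an event to every subscriber whose prefix matches the start of
 * the event class. Returns the number of deliveries made.
 */
static size_t
dispatch_event(const agent_sub_t *subs, size_t nsubs, const char *event_class)
{
	size_t delivered = 0;

	assert(subs != NULL && event_class != NULL);
	for (size_t i = 0; i < nsubs; i++) {
		size_t len = strlen(subs[i].prefix);

		if (strncmp(event_class, subs[i].prefix, len) == 0) {
			subs[i].recv(event_class);
			delivered++;
		}
	}
	return (delivered);
}
```

With this table, an `"ereport.fs.zfs.checksum"` event reaches only the diagnosis callback, and a class that matches no prefix is dropped.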
FMD Components in ZED
There are three FMD modules (aka agents) that are now built into ZED.
- A Diagnosis Engine module (`agents/zfs_diagnosis.c`)
- A Retire Agent module (`agents/zfs_retire.c`)
- A Disk Add Agent module (`agents/zfs_mod.c`)
To begin with, a Diagnosis Engine consumes per-vdev I/O and checksum ereports and feeds them into a Soft Error Rate Discrimination (SERD) algorithm which will generate a corresponding fault diagnosis when the tracked VDEV encounters N events in a given T time window. The initial N and T values for the SERD algorithm are estimates inherited from illumos (10 errors in 10 minutes).
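The "N events in a T time window" rule can be sketched as a small sliding-window counter. This is a minimal illustration under stated assumptions, not the engine in `fmd_serd.c`: the type `serd_sketch_t`, the functions `serd_sketch_init()` and `serd_sketch_record()`, and the `SERD_MAX_N` cap are hypothetical, timestamps are assumed to be non-decreasing, and the real engine's boundary conditions and post-fire behavior may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	SERD_MAX_N	32	/* hypothetical cap on N for this sketch */

/* Hypothetical sliding-window engine state. */
typedef struct serd_sketch {
	uint32_t n;			/* events needed to fire */
	uint64_t t_ns;			/* window length in nanoseconds */
	uint64_t stamps[SERD_MAX_N];	/* ring buffer of recent timestamps */
	uint32_t count;			/* valid entries in stamps[] */
	uint32_t head;			/* index of the oldest entry */
} serd_sketch_t;

static void
serd_sketch_init(serd_sketch_t *sp, uint32_t n, uint64_t t_ns)
{
	assert(sp != NULL && n > 0 && n <= SERD_MAX_N);
	sp->n = n;
	sp->t_ns = t_ns;
	sp->count = 0;
	sp->head = 0;
}

/*
 * Record one event at time now_ns. Returns true when the most recent
 * n events, including this one, all fall within a span of t_ns.
 */
static bool
serd_sketch_record(serd_sketch_t *sp, uint64_t now_ns)
{
	if (sp->count == sp->n) {
		/* Buffer is full: discard the oldest timestamp. */
		sp->head = (sp->head + 1) % sp->n;
		sp->count--;
	}
	sp->stamps[(sp->head + sp->count) % sp->n] = now_ns;
	sp->count++;

	if (sp->count < sp->n)
		return (false);
	return (now_ns - sp->stamps[sp->head] <= sp->t_ns);
}
```

For the inherited defaults of 10 errors in 10 minutes, such a sketch would be initialized as `serd_sketch_init(&s, 10, 600ULL * 1000000000ULL)`.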
In turn, a Retire Agent responds to diagnosed faults by isolating the faulty VDEV. It will notify the ZFS kernel module of the new VDEV state (degraded or faulted). The retire agent is also responsible for managing hot spares across all pools. When it encounters a device fault or a device removal it will replace the device with an appropriate spare if available.
Finally, a Disk Add Agent responds to events from a libudev disk
monitor (`EC_DEV_ADD` or `EC_DEV_STATUS`) and will online, replace or
expand the associated VDEV. This agent is also known as the `zfs_mod`
or Sysevent Loadable Module (SLM) on the illumos platform. The added
disk is matched to a specific VDEV using its device id, physical path
or VDEV GUID.
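Matching a newly added disk against a VDEV by those three identifiers might look like the sketch below. It is not taken from `zfs_mod.c`: the type `disk_ident_t`, the function `match_disk()`, and in particular the order in which the identifiers are tried are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identity record for a disk or a VDEV. */
typedef struct disk_ident {
	const char *devid;	/* device id, or NULL if unknown */
	const char *physpath;	/* physical path, or NULL if unknown */
	uint64_t guid;		/* VDEV GUID, or 0 if unknown */
} disk_ident_t;

typedef enum match_kind {
	MATCH_NONE,
	MATCH_DEVID,
	MATCH_PHYSPATH,
	MATCH_GUID
} match_kind_t;

/* Two strings match only if both are present and equal. */
static int
str_eq(const char *a, const char *b)
{
	return (a != NULL && b != NULL && strcmp(a, b) == 0);
}

/*
 * Report which identifier, if any, ties the disk to the VDEV.
 * The precedence (devid, then physical path, then GUID) is assumed.
 */
static match_kind_t
match_disk(const disk_ident_t *disk, const disk_ident_t *vdev)
{
	assert(disk != NULL && vdev != NULL);
	if (str_eq(disk->devid, vdev->devid))
		return (MATCH_DEVID);
	if (str_eq(disk->physpath, vdev->physpath))
		return (MATCH_PHYSPATH);
	if (disk->guid != 0 && disk->guid == vdev->guid)
		return (MATCH_GUID);
	return (MATCH_NONE);
}
```

Returning the kind of match rather than a plain boolean lets a caller treat a physical-path match (a different disk in the same slot) differently from a device-id match (the same disk returning).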
Note that the auto-replace feature (aka hot plug) is opt-in and you
must set the pool's `autoreplace` property to enable it. The new disk
will be matched to the corresponding leaf VDEV by physical location
and labeled with a GPT partition before replacing the original VDEV
in the pool.
Implementation Notes
- The FMD module API required for logic modules is emulated and implemented in the `fmd_api.c` and `fmd_serd.c` source files. This support includes module registration, memory allocation, module property accessors, basic case management, one-shot timers and SERD engines. For detailed information on the FMD module API, see the document "Fault Management Daemon Programmer's Reference Manual".
- The event subscriptions for the modules (located in a module specific configuration file on illumos) are currently hard-coded into the ZED `zfs_agent_dispatch()` function.
- The FMD modules are called one at a time from a single thread that consumes events queued to the modules. These events are sourced from the normal ZED events and also include events posted from the diagnosis engine and the libudev disk event monitor.
- The FMD code modules have minimal changes and were intentionally left as similar as possible to their upstream source files.
- The sysevent namespace in ZED differs from illumos. For example:
  - illumos uses `"resource.sysevent.EC_zfs.ESC_ZFS_vdev_remove"`
  - Linux uses `"sysevent.fs.zfs.vdev_remove"`
- The FMD Modules port was produced by Intel Federal, LLC under award number B609815 between the U.S. Department of Energy (DOE) and Intel Federal, LLC.