zfs/cmd/zed/agents
Brian Behlendorf 95401cb6f7 Enable remaining tests
Enable most of the remaining test cases which were previously
disabled.  The required fixes are as follows:

* cache_001_pos - No changes required.

* cache_010_neg - Updated to use losetup under Linux.  Loopback
  cache devices are allowed, ZVOLs as cache devices are not.
  Disabled until all the builders pass reliably.

* cachefile_001_pos, cachefile_002_pos, cachefile_003_pos,
  cachefile_004_pos - Set set_device_dir path in cachefile.cfg,
  updated CPATH1 and CPATH2 to reference unique files.

* zfs_clone_005_pos - Wait for udev to create volumes.

* zfs_mount_007_pos - Updated mount options to expected Linux names.

* zfs_mount_009_neg, zfs_mount_all_001_pos - No changes required.

* zfs_unmount_005_pos, zfs_unmount_009_pos, zfs_unmount_all_001_pos -
  Updated to expect -f to not unmount busy mount points under Linux.

* rsend_019_pos - Observed to occasionally take a long time on both
  32-bit systems and the kmemleak builder.

* zfs_written_property_001_pos - Switched sync(1) to sync_pool.

* devices_001_pos, devices_002_neg - Updated create_dev_file() helper
  for Linux.

* exec_002_neg.ksh - Fixed mmap_exec.c to preserve errno.  Updated
  test case to expect EPERM from Linux as described by mmap(2).

* grow_pool_001_pos - Added missing setup.ksh and cleanup.ksh
  scripts from OpenZFS.

* grow_replicas_001_pos.ksh - Added missing $SLICE_* variables.

* history_004_pos, history_006_neg, history_008_pos - Fixed by
  previous commits and were not enabled.  No changes required.

* zfs_allow_010_pos - Added missing spaces after assorted zfs
  commands in delegate_common.kshlib.

* inuse_* - Illumos dump device tests skipped.  Remaining test
  cases updated to correctly create required partitions.

* large_files_001_pos - Fixed largest_file.c to accept EINVAL
  as well as EFBIG as described in write(2).

* link_count_001 - Added nproc to required commands.

* umountall_001 - Updated to use umount -a.

* online_offline_001_* - Pull in OpenZFS change to file_trunc.c
  to make the '-c 0' option run the test in a loop.  Included
  online_offline.cfg file in all test cases.

* rename_dirs_001_pos - Updated to use the rename_dir test binary,
  pkill restricted to exact matches and total runtime reduced.

* slog_013_neg, write_dirs_002_pos - No changes required.

* slog_013_pos.ksh - Updated to use losetup under Linux.

* slog_014_pos.ksh - ZED will not be running, manually degrade
  the damaged vdev as expected.

* nopwrite_varying_compression, nopwrite_volume - Forced pool
  sync with sync_pool to ensure up to date property values.

* Fixed typos in ZED log messages.  Refactored zed_* helper
  functions to resolve all-syslog exit=1 errors in zedlog.

* zfs_copies_005_neg, zfs_get_004_pos, zpool_add_004_pos,
  zpool_destroy_001_pos, largest_pool_001_pos, clone_001_pos.ksh,
  clone_001_pos - Skip until layering pools on zvols is solid.

* largest_pool_001_pos - Limited to 7eb pool, the maximum
  supported size is 8eb-1 on Linux.

* zpool_expand_001_pos, zpool_expand_003_neg - Requires
  additional support from the ZED, updated skip reason.

* zfs_rollback_001_pos, zfs_rollback_002_pos - Properly clean up
  busy mount points under Linux between test loops.

* privilege_001_pos, privilege_003_pos, rollback_003_pos,
  threadsappend_001_pos - Skip with log_unsupported.

* snapshot_016_pos - No changes required.

* snapshot_008_pos - Increased LIMIT from 512K to 2M and added
  sync_pool to avoid false positives.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6128
2017-05-22 12:34:32 -04:00
README.md Add illumos FMD ZFS logic to ZED -- phase 2 2016-11-07 15:01:38 -08:00
fmd_api.c Add illumos FMD ZFS logic to ZED -- phase 2 2016-11-07 15:01:38 -08:00
fmd_api.h Add illumos FMD ZFS logic to ZED -- phase 2 2016-11-07 15:01:38 -08:00
fmd_serd.c Fix coverity defects: 154021 2016-11-08 14:34:52 -08:00
fmd_serd.h Add illumos FMD ZFS logic to ZED -- phase 2 2016-11-07 15:01:38 -08:00
zfs_agents.c Fix spelling 2017-01-03 11:31:18 -06:00
zfs_agents.h Add illumos FMD ZFS logic to ZED -- phase 2 2016-11-07 15:01:38 -08:00
zfs_diagnosis.c Fix undefined reference to `libzfs_fru_compare' 2017-03-23 18:24:09 -07:00
zfs_mod.c Enable remaining tests 2017-05-22 12:34:32 -04:00
zfs_retire.c Fix undefined reference to `libzfs_fru_compare' 2017-03-23 18:24:09 -07:00

README.md

Fault Management Logic for ZED

The integration of Fault Management Daemon (FMD) logic from illumos is being deployed in three phases. This logic is encapsulated in several software modules inside ZED.

ZED+FM Phase 1

All phase 1 work is in the current master branch. Phase 1 work includes:

  • Add new paths to the persistent VDEV label for device matching.
  • Add a disk monitor for generating disk-add and disk-change events.
  • Add support for automated VDEV auto-online, auto-replace and auto-expand.
  • Expand the statechange event to include all VDEV state transitions.

ZED+FM Phase 2 (WIP)

The phase 2 work primarily entails the Diagnosis Engine and the Retire Agent modules. It also includes infrastructure to support a crude FMD environment to host these modules. For additional information see the FMD Components in ZED and Implementation Notes sections below.

ZED+FM Phase 3

Future work will add additional functionality and will likely include:

  • Add FMD module garbage collection (periodically call fmd_module_gc()).
  • Add real module property retrieval (currently hard-coded in accessors).
  • Additional diagnosis telemetry (like latency outliers and SMART data).
  • Export FMD module statistics.
  • Zedlet parallel execution and resiliency (add watchdog).

ZFS Fault Management Overview

The primary purpose of ZFS fault management is automated diagnosis and isolation of VDEV faults. A fault is something we can associate with an impact (e.g. loss of data redundancy) and a corrective action (e.g. offline or replace a disk). A typical ZFS fault management stack consists of error detectors (e.g. zfs_ereport_post()), a disk monitor, a diagnosis engine and response agents.

After detecting a software error, the ZFS kernel module sends error events to the ZED user daemon which in turn routes the events to its internal FMA modules based on their event subscriptions. Likewise, if a disk is added or changed in the system, the disk monitor sends disk events which are consumed by a response agent.

FMD Components in ZED

There are three FMD modules (aka agents) that are now built into ZED.

  1. A Diagnosis Engine module (agents/zfs_diagnosis.c)
  2. A Retire Agent module (agents/zfs_retire.c)
  3. A Disk Add Agent module (agents/zfs_mod.c)

First, the Diagnosis Engine consumes per-vdev I/O and checksum ereports and feeds them into a Soft Error Rate Discrimination (SERD) algorithm, which generates a corresponding fault diagnosis when the tracked VDEV encounters N events within a time window T. The initial N and T values for the SERD algorithm are estimates inherited from illumos (10 errors in 10 minutes).
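The N-events-in-T rule can be illustrated with a small sliding-window counter. This is only a minimal sketch and not the API in fmd_serd.c: the names `serd_t`, `serd_init`, `serd_record` and `SERD_MAX_N` are hypothetical, and timestamps are assumed to be monotonically increasing nanoseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	SERD_MAX_N	32	/* hypothetical cap on N for this sketch */

/* Hypothetical sliding-window engine; not the real fmd_serd.c types. */
typedef struct serd {
	uint32_t n;			/* events required to fire (N) */
	uint64_t t_ns;			/* window length in nanoseconds (T) */
	uint64_t stamps[SERD_MAX_N];	/* ring holding up to N timestamps */
	uint32_t count;			/* timestamps currently held */
	uint32_t head;			/* index of the oldest timestamp */
} serd_t;

static void
serd_init(serd_t *s, uint32_t n, uint64_t t_ns)
{
	assert(n > 0 && n <= SERD_MAX_N);
	memset(s, 0, sizeof (*s));
	s->n = n;
	s->t_ns = t_ns;
}

/*
 * Record one event at time now_ns (assumed monotonic).
 * Returns true when N events fall within the last T nanoseconds.
 */
static bool
serd_record(serd_t *s, uint64_t now_ns)
{
	/* Discard events that have aged out of the window. */
	while (s->count > 0 && now_ns - s->stamps[s->head] > s->t_ns) {
		s->head = (s->head + 1) % s->n;
		s->count--;
	}
	/* Ring full: drop the oldest so the new event can be stored. */
	if (s->count == s->n) {
		s->head = (s->head + 1) % s->n;
		s->count--;
	}
	s->stamps[(s->head + s->count) % s->n] = now_ns;
	s->count++;
	return (s->count >= s->n);
}
```

With N = 10 and T = 10 minutes, the tenth error arriving inside the window makes `serd_record()` return true, which is the point at which a diagnosis engine would open a fault; errors spaced further apart than T never accumulate.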

In turn, a Retire Agent responds to diagnosed faults by isolating the faulty VDEV. It will notify the ZFS kernel module of the new VDEV state (degraded or faulted). The retire agent is also responsible for managing hot spares across all pools. When it encounters a device fault or a device removal it will replace the device with an appropriate spare if available.

Finally, a Disk Add Agent responds to events from a libudev disk monitor (EC_DEV_ADD or EC_DEV_STATUS) and will online, replace or expand the associated VDEV. This agent is also known as the zfs_mod or Sysevent Loadable Module (SLM) on the illumos platform. The added disk is matched to a specific VDEV using its device id, physical path or VDEV GUID.

Note that the auto-replace feature (aka hot plug) is opt-in and you must set the pool's autoreplace property to enable it. The new disk will be matched to the corresponding leaf VDEV by physical location and labeled with a GPT partition before replacing the original VDEV in the pool.
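How an added disk might be matched to a VDEV can be sketched as a lookup over the three identifiers named above. This is an illustration under stated assumptions, not the logic in zfs_mod.c: the `disk_ident_t` type and `match_vdev()` function are hypothetical, and the precedence shown (device id, then physical path, then GUID) is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identity record; field names are illustrative only. */
typedef struct disk_ident {
	const char *devid;	/* device id, or NULL if unknown */
	const char *physpath;	/* physical path, or NULL if unknown */
	uint64_t guid;		/* VDEV GUID, or 0 if unknown */
} disk_ident_t;

/* Two strings match only if both are present and equal. */
static int
same_str(const char *a, const char *b)
{
	return (a != NULL && b != NULL && strcmp(a, b) == 0);
}

/*
 * Return the index of the first VDEV matching the disk event, or -1.
 * Assumed precedence: device id, then physical path, then GUID.
 */
static int
match_vdev(const disk_ident_t *vdevs, size_t nvdevs, const disk_ident_t *ev)
{
	assert(ev != NULL);
	for (int pass = 0; pass < 3; pass++) {
		for (size_t i = 0; i < nvdevs; i++) {
			const disk_ident_t *v = &vdevs[i];
			int hit;

			if (pass == 0)
				hit = same_str(v->devid, ev->devid);
			else if (pass == 1)
				hit = same_str(v->physpath, ev->physpath);
			else
				hit = (ev->guid != 0 && v->guid == ev->guid);
			if (hit)
				return ((int)i);
		}
	}
	return (-1);
}
```

A replacement disk in the same slot carries a new device id but the same physical path, so in this sketch it would be found on the second pass.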

Implementation Notes

  • The FMD module API required for logic modules is emulated and implemented in the fmd_api.c and fmd_serd.c source files. This support includes module registration, memory allocation, module property accessors, basic case management, one-shot timers and SERD engines. For detailed information on the FMD module API, see the "Fault Management Daemon Programmer's Reference Manual".

  • The event subscriptions for the modules (located in a module specific configuration file on illumos) are currently hard-coded into the ZED zfs_agent_dispatch() function.

  • The FMD modules are called one at a time from a single thread that consumes events queued to the modules. These events are sourced from the normal ZED events and also include events posted from the diagnosis engine and the libudev disk event monitor.

  • The FMD code modules have minimal changes and were intentionally left as similar as possible to their upstream source files.

  • The sysevent namespace in ZED differs from illumos. For example:

    • illumos uses "resource.sysevent.EC_zfs.ESC_ZFS_vdev_remove"
    • Linux uses "sysevent.fs.zfs.vdev_remove"

  • The FMD modules port was produced by Intel Federal, LLC under award number B609815 between the U.S. Department of Energy (DOE) and Intel Federal, LLC.
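The hard-coded event subscriptions mentioned in the implementation notes can be pictured as a static table that maps event class patterns to the agents that receive them. The sketch below is not the body of zfs_agent_dispatch(): the table contents, the `AGENT_*` flags, the `agents_for_class()` helper and the "EC_DEV_*" placeholder class string are all assumptions chosen for illustration. It uses POSIX fnmatch() for pattern matching.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical agent flags, one bit per built-in module. */
enum {
	AGENT_DIAGNOSIS	= 1 << 0,
	AGENT_RETIRE	= 1 << 1,
	AGENT_DISK_ADD	= 1 << 2
};

typedef struct subscription {
	const char *pattern;	/* shell-style event class pattern */
	unsigned agents;	/* agents that receive matching events */
} subscription_t;

/* Illustrative subscriptions only; the real list lives in ZED. */
static const subscription_t subscriptions[] = {
	{ "ereport.fs.zfs.*",			AGENT_DIAGNOSIS },
	{ "sysevent.fs.zfs.vdev_remove",	AGENT_RETIRE },
	{ "EC_DEV_*",				AGENT_DISK_ADD }
};

/* Return the set of agents subscribed to the given event class. */
static unsigned
agents_for_class(const char *event_class)
{
	unsigned mask = 0;
	size_t n = sizeof (subscriptions) / sizeof (subscriptions[0]);

	assert(event_class != NULL);
	for (size_t i = 0; i < n; i++) {
		if (fnmatch(subscriptions[i].pattern, event_class, 0) == 0)
			mask |= subscriptions[i].agents;
	}
	return (mask);
}
```

A single dispatcher thread could call `agents_for_class()` for each queued event and then invoke the selected agents one at a time, in line with the single-threaded model described in the notes.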