From bf6c2ffe00e20e0d8a77fd776c2df433eff8a117 Mon Sep 17 00:00:00 2001
From: Richard Elling
Date: Thu, 6 Jun 2019 11:32:11 -0700
Subject: [PATCH] added ZFS events section

---
 Troubleshooting.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/Troubleshooting.md b/Troubleshooting.md
index 496fc40..103821c 100644
--- a/Troubleshooting.md
+++ b/Troubleshooting.md
@@ -22,4 +22,35 @@ Log files of interest: [Generic Kernel Log](#generic-kernel-log), [ZFS Kernel Mo
 Important information: if a kernel thread is stuck, then a backtrace of the stuck thread can be in the logs.
 In some cases, the stuck thread is not logged until the deadman timer expires.
 See also [debug tunables](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#debug)
+## ZFS Events
+ZFS uses an event-based messaging interface to communicate important events to
+other consumers running on the system. The ZFS Event Daemon (`zed`) is a userland daemon that
+listens for these events and processes them. `zed` is extensible, so you can write shell scripts
+or other programs that subscribe to events and take action. For example, the script usually
+installed at `/etc/zfs/zed.d/all-syslog.sh` writes a formatted event message to `syslog`.
+See the man page for `zed(8)` for more information.
+
+A history of events is also available via the `zpool events` command. This history begins at
+ZFS kernel module load and includes events from any pool. These events are stored in RAM and
+limited in count to a value determined by the kernel tunable [zfs_zevent_len_max](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_zevent_len_max).
+`zed` has an internal throttling mechanism to prevent overconsumption of system resources
+while processing ZFS events.
+
+More detailed information about events is observable using `zpool events -v`.
+The content of the verbose output is subject to change, depending on the event and the
+information available at the time of the event.
+
+Each event has a class identifier used for filtering event types. Commonly seen events are
+those related to pool management, with class `sysevent.fs.zfs.*`, including import, export,
+configuration updates, and `zpool history` updates.
+
+Events related to errors are reported as class `ereport.*`. These can be invaluable for
+troubleshooting. Some faults can cause multiple ereports as various layers of the software
+deal with the fault. For example, on a simple pool without parity protection, a faulty
+disk could cause an `ereport.io` during a read from the disk that results in an
+`ereport.fs.zfs.checksum` at the pool level. These events are also reflected by the error
+counters observed in `zpool status`.
+If you see checksum or read/write errors in `zpool status`, then there should be one or more
+corresponding ereports in the `zpool events` output.
+
 # DRAFT
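
The class-based filtering described in the ZFS Events section can be sketched with a small shell helper. This is a minimal sketch and not part of the patch: `count_ereports` is a hypothetical name, and it assumes that the non-verbose `zpool events` output places the event class in the last whitespace-separated column of each line. It tallies `ereport.*` classes so they can be compared against the error counters in `zpool status`.

```shell
#!/bin/sh
# count_ereports: hypothetical helper, not shipped with ZFS.
# Reads `zpool events`-style lines on stdin and prints "<count> <class>"
# for every event class that starts with "ereport.".
# Assumption: the event class is the last whitespace-separated field.
count_ereports() {
    awk '$NF ~ /^ereport\./ { n[$NF]++ }
         END { for (c in n) print n[c], c }' | sort -k2
}

# Possible use on a live system (requires the ZFS kernel module):
#   zpool events -H | count_ereports
```

Header lines and `sysevent.*` lines are skipped because their last field does not start with `ereport.`, so the helper can be fed the output with or without `-H`.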