added ZFS events section

Richard Elling 2019-06-06 11:32:11 -07:00
parent bd1bebc2ff
commit bf6c2ffe00
1 changed files with 31 additions and 0 deletions

@ -22,4 +22,35 @@ Log files of interest: [Generic Kernel Log](#generic-kernel-log), [ZFS Kernel Mo
Important information: if a kernel thread is stuck, then a backtrace of the stuck thread can be in the logs.
In some cases, the stuck thread is not logged until the deadman timer expires. See also [debug tunables](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#debug).

## ZFS Events
ZFS uses an event-based messaging interface for communication of important events to
other consumers running on the system. The ZFS Event Daemon (zed) is a userland daemon that
listens for these events and processes them. zed is extensible so you can write shell scripts
or other programs that subscribe to events and take action. For example, the script usually
installed at `/etc/zfs/zed.d/all-syslog.sh` writes a formatted event message to `syslog`.
See the `zed(8)` man page for more information.

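
As a sketch of the idea (the script itself is hypothetical, not one of the shipped zedlets), a minimal zedlet might react to checksum ereports like this. zed runs zedlets with event details exported in `ZEVENT_*` environment variables such as `ZEVENT_CLASS` and `ZEVENT_POOL`:

```shell
#!/bin/sh
# Hypothetical example zedlet: report checksum ereports.
# zed invokes zedlets with the event details available in
# ZEVENT_* environment variables (see zed(8)).

if [ "${ZEVENT_CLASS}" = "ereport.fs.zfs.checksum" ]; then
    echo "checksum error on pool ${ZEVENT_POOL:-unknown}"
fi
```

Scripts placed in `/etc/zfs/zed.d` and made executable are picked up by zed; anything a zedlet writes to stdout ends up in zed's log.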
A history of events is also available via the `zpool events` command. This history begins at
ZFS kernel module load and includes events from any pool. These events are stored in RAM and
limited in count to a value determined by the kernel tunable [zfs_zevent_len_max](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_zevent_len_max).
`zed` has an internal throttling mechanism to prevent overconsumption of system resources
when processing ZFS events.

More detailed information about events is observable using `zpool events -v`.
The contents of the verbose events are subject to change, based on the event and the
information available at the time of the event.

Each event has a class identifier used for filtering event types. Commonly seen events are
those related to pool management, with class `sysevent.fs.zfs.*`, including import, export,
configuration updates, and `zpool history` updates.

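
For example, the event stream can be narrowed to pool-management events by filtering on class; a simple sketch using standard tools:

```shell
# Follow the event stream (`zpool events -f` prints new events as they
# are posted) and keep only pool-management events by class prefix.
zpool events -f | grep 'sysevent\.fs\.zfs\.'
```
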
Events related to errors are reported with class `ereport.*`. These can be invaluable for
troubleshooting. Some faults can cause multiple ereports as various layers of the software
deal with the fault. For example, on a simple pool without parity protection, a faulty
disk could cause an `ereport.io` during a read from the disk that results in an
`ereport.fs.zfs.checksum` at the pool level. These events are also reflected by the error
counters observed in `zpool status`.

If you see checksum or read/write errors in `zpool status`, then there should be one or more
corresponding ereports in the `zpool events` output.

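
To correlate the two views, the per-device error counters can be extracted from `zpool status` output; a rough sketch (the pool name `tank` is an example, and the filter assumes the usual NAME/STATE/READ/WRITE/CKSUM config table layout):

```shell
# Print device name and CKSUM error count from the zpool status config
# table. The filter keeps only 5-column rows whose last field is numeric,
# which is a rough way to skip headers and other status lines.
zpool status tank | awk 'NF == 5 && $5 + 0 == $5 { print $1, $5 }'
```
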
# DRAFT