168 Commits

Author SHA1 Message Date
Chris Dunlap de6d197683 Fix io-spare.sh to work with disk vdevs
The "zpool status" output shows the full pathname for file-type vdevs,
but only the basename component for disk-type vdevs.  In commit
bee6665, the "basename" command was dropped from altering the vdev
name used when searching the "zpool status" output.  Consequently,
hot-disk sparing for disk vdevs broke since "zpool status" output
was now being searched for the full pathname to the disk vdev.

Parsing the "zpool status" output in this manner is rather brittle.
It would be preferable to search for the vdev based on its guid.
But until that happens, this commit adds back the "basename" command
to fix the vdev name breakage.
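
The actual fix lives in the io-spare.sh shell script. Purely as an illustration, the basename step can be sketched in C. The helper names here are hypothetical, and the substring match is deliberately as naive as the text-based search described above:

```c
#include <string.h>

/*
 * Return the last path component of a vdev path, e.g. "sda" for
 * "/dev/sda".  The result points into the caller's string.
 * (Hypothetical helper, not part of zed.)
 */
static const char *
vdev_basename(const char *path)
{
	const char *slash = strrchr(path, '/');

	return (slash != NULL ? slash + 1 : path);
}

/*
 * Return 1 if a line of "zpool status"-style output mentions the vdev.
 * Searching for the basename matches both disk vdevs (shown by
 * basename) and file vdevs (shown by full path, which contains the
 * basename).  A plain substring search is brittle: "sda" also
 * matches "sda1".
 */
static int
status_line_mentions_vdev(const char *line, const char *vdev_path)
{
	return (strstr(line, vdev_basename(vdev_path)) != NULL);
}
```

Matching on the vdev guid, as suggested above, would avoid this kind of false positive.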

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3310
2015-04-17 14:21:26 -07:00
Chunwei Chen aa506dcb3d Fix build error when make deb
After 53698a4, the following error occurs when running make deb.

  CCLD     zed
../../lib/libzfs/.libs/libzfs.so: undefined reference to `get_system_hostid'

Add libzpool.la to zed/Makefile.am to fix this.

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3080
2015-02-06 09:16:32 -08:00
Chris Dunlap 2c41df5bf8 Cleanup _zed_event_add_nvpair()
When _zed_event_add_var() was updated to be the common routine
for adding zedlet environment variables, an additional snprintf()
was added to the processing of each nvpair.  This commit changes
_zed_event_add_nvpair() to directly call _zed_event_add_var()
for nvpair non-array types, thereby removing a superfluous call to
snprintf().  For consistency, the helper functions for converting
nvpair array types are similarly adjusted to add variables.

The _zed_event_value_is_hex() and _zed_event_add_var() functions have
been moved up in the file since forward declarations are not used,
but no changes have been made to these functions.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3042
2015-01-30 14:46:44 -08:00
Chris Dunlap 854f30a91f Protect against adding duplicate strings in ZED
The zed_strings container stores strings in an AVL, but does not
check for duplicate strings being added.  Within the AVL, strings
are indexed by the string value itself.  avl_add() requires that the
node being added not already exist in the tree, and will assert()
if this is not the case.

This should not cause problems in practice.  ZED uses this container
in two places.  In zed_conf.c, it is used to store the names of
enabled zedlets as zed scans the zedlet directory listing; duplicate
entries cannot occur here since duplicate names cannot occur within
a directory.  In zed_event.c, it is used to store the environment
variables (as "NAME=VALUE" strings) that will be passed to zedlets;
duplicate strings here should never happen unless there is a bug
resulting in a duplicate nvpair or environment variable.

This commit protects against adding a duplicate to a zed_strings
container by first checking for the string being added, and removing
the previous entry should one exist.  This implements a "last one
wins" policy.

This commit also changes the prototype for zed_strings_add() to allow
the string key (by which it is indexed in the AVL) to differ from
the string value.  By adding zedlet environment variables using the
variable name as the key, multiple adds for the same variable name
will result in only the last value being stored.
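
A minimal sketch of the "last one wins" policy with a key that can differ from the value. It assumes a singly linked list in place of the AVL tree, and every name in it (str_set, str_set_add, and so on) is hypothetical rather than taken from zed_strings.c:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a linked list standing in for the AVL tree. */
struct str_node {
	struct str_node *next;
	char *key;
	char *val;
};

struct str_set {
	struct str_node *head;
};

/* Duplicate a string using malloc(); returns NULL on failure. */
static char *
str_dup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p != NULL)
		memcpy(p, s, n);
	return (p);
}

/* Remove the entry indexed by key; return 1 if one was removed. */
static int
str_set_remove(struct str_set *s, const char *key)
{
	struct str_node **pp, *n;

	for (pp = &s->head; (n = *pp) != NULL; pp = &n->next) {
		if (strcmp(n->key, key) == 0) {
			*pp = n->next;
			free(n->key);
			free(n->val);
			free(n);
			return (1);
		}
	}
	return (0);
}

/*
 * Add val indexed by key.  If key is NULL, the value is its own key
 * (an assumption of this sketch).  Any previous entry with the same
 * key is removed first, so the last add wins.
 */
static int
str_set_add(struct str_set *s, const char *key, const char *val)
{
	struct str_node *n = calloc(1, sizeof (*n));

	if (n == NULL)
		return (-1);
	n->key = str_dup(key != NULL ? key : val);
	n->val = str_dup(val);
	if (n->key == NULL || n->val == NULL) {
		free(n->key);
		free(n->val);
		free(n);
		return (-1);
	}
	(void) str_set_remove(s, n->key);
	n->next = s->head;
	s->head = n;
	return (0);
}

/* Return the value stored under key, or NULL if there is none. */
static const char *
str_set_get(const struct str_set *s, const char *key)
{
	const struct str_node *n;

	for (n = s->head; n != NULL; n = n->next) {
		if (strcmp(n->key, key) == 0)
			return (n->val);
	}
	return (NULL);
}
```

Adding "NAME=a" and then "NAME=b" under the key "NAME" leaves only "NAME=b" in the container.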

Finally, this commit routes all additions of zedlet environment
variables through the updated _zed_event_add_var().  This ensures
all zedlet environment variable names are properly converted.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3042
2015-01-30 14:46:17 -08:00
Chris Dunlap 8ac9b5e6b5 Cleanup struct zed_conf vars in zed_conf_destroy
Reset struct zed_conf file descriptors to -1 after close(),
and pointers to NULL after free().

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2756
2014-10-06 13:18:11 -07:00
Chris Dunlap 56697c4264 Obtain advisory lock on ZED PID file
ZED uses an advisory lock on its state file to protect against
multiple instances running concurrently.  However, work is planned
to move this state information into the kernel, and ZED will still
need to protect against starting multiple instances.

This commit adds an advisory lock on the PID file to protect against
starting multiple instances.  A lock failure can be overridden with
the "-f" (force) command-line option.  The advisory lock on the state
file is being retained for as long as the state information is stored
in the state file.
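
As a rough illustration of the technique, an advisory lock on a PID file can be taken with POSIX fcntl() record locking. This is a sketch under that assumption, not zed's actual code; pidfile_lock() is a hypothetical name, and the commit does not say which locking call zed uses:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open (creating if needed) the PID file at path, take an exclusive
 * advisory lock on the whole file, and write the caller's PID to it.
 * Returns the open fd, which must stay open for as long as the lock
 * is needed, or -1 if another process holds the lock or on error.
 */
static int
pidfile_lock(const char *path)
{
	struct flock fl;
	char buf[32];
	int fd, len;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return (-1);

	memset(&fl, 0, sizeof (fl));
	fl.l_type = F_WRLCK;		/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;			/* 0 means "to end of file" */

	/* F_SETLK fails immediately instead of blocking. */
	if (fcntl(fd, F_SETLK, &fl) < 0) {
		(void) close(fd);
		return (-1);
	}
	len = snprintf(buf, sizeof (buf), "%ld\n", (long)getpid());
	if (ftruncate(fd, 0) < 0 || write(fd, buf, (size_t)len) != len) {
		(void) close(fd);
		return (-1);
	}
	return (fd);
}
```

Note that fcntl() locks are per-process, so a second call from the same process succeeds; the lock only excludes other processes. A force option like "-f" would correspond to continuing despite a -1 return.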

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2756
2014-10-06 13:17:40 -07:00
Chris Dunlap dcca723ace Refer to ZED's scripts as ZEDLETs
The executables invoked by the ZED in response to a given zevent
have been generically referred to as "scripts".  By convention,
these scripts have aimed to be /bin/sh compatible for reasons of
portability and comprehensibility.  However, the ZED only requires
they be executable and (ideally) capable of reading environment
variables.  As such, these scripts are now referred to as ZEDLETs
(ZFS Event Daemon Linkage for Executable Tasks).

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2735
2014-09-25 13:54:17 -07:00
Chris Dunlap 8cb8cf91df Replace zed's use of malloc with calloc
When zed allocates memory via malloc(), it typically follows that
with a memset().  However, calloc() implementations can often perform
optimizations when zeroing memory:

https://stackoverflow.com/questions/2688466/why-mallocmemset-is-slower-than-calloc

This commit replaces zed's use of malloc() with calloc().
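
The pattern being replaced can be sketched as follows. The struct and function names are hypothetical and only illustrate the before/after shape of such an allocation:

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct example_conf {
	int fd;
	unsigned int flags;
	char *path;
};

/*
 * Before: p = malloc(sizeof (*p)); memset(p, 0, sizeof (*p));
 * After:  a single calloc() call, which returns zero-filled memory.
 */
static struct example_conf *
example_conf_create(void)
{
	struct example_conf *p = calloc(1, sizeof (*p));

	return (p);	/* NULL on allocation failure */
}
```

Fields that need a non-zero initial value (for example a file descriptor of -1) still have to be set explicitly after the allocation.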

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2736
2014-09-25 13:43:57 -07:00
Chris Dunlap bee6665b88 Fix zed io-spare.sh dash incompatibility
The zed's io-spare.sh script defines a vdev_status() function to query
the 'zpool status' output for obtaining the status of a specified vdev.
This function contains a small awk script that uses a parameter
expansion (${parameter/pattern/string}) supported in bash but not
in dash.  Under dash, this fails with a "Bad substitution" error.

This commit replaces the awk script with a (hopefully more portable)
sed script that has been tested under both bash and dash.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2536
2014-09-23 15:32:30 -07:00
Chris Dunlap 5043deaa40 Remove reverse indentation from zed comments.
Remove all occurrences of reverse indentation from zed comments for
consistency within the project code base.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2695
2014-09-22 12:17:53 -07:00
louwrentius bcd9624d0f Change delimiter for ZED email scripts
When the ZED_EMAIL_INTERVAL_SECS="3600" option is set in the zed.rc
configuration file, notification emails should be rate limited.

Rate limiting is accomplished by maintaining a colon-delimited state
file which includes the device name.  Unfortunately there are valid
device names which include a colon and therefore prevent the rate
limiting from working properly.  For this reason the delimiter has
been changed to a semicolon.

Signed-off-by: louwrentius <louwrentius@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Closes #2645
2014-09-02 14:18:54 -07:00
Chris Dunlap 8125fb7190 Cleanup zed logging
This is a set of minor cleanup changes related to zed logging:
- Remove the program identity prefix from messages written to stderr
  since systemd already prepends this output with the program name.
- Replace the copy of the program identity string with a ptr reference.
- Replace "pid" with "PID" for consistency in comments & strings.
- Rename the zed_log.c struct _ctx component "level" to "priority".
- Add the LOG_PID option for messages written to syslog.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2252
2014-09-02 14:18:53 -07:00
Chris Dunlap 5a8855b716 Fix race condition with zed pidfile creation
When the zed is started as a forking daemon (the default),
a race condition exists where the parent process can terminate before
the pidfile has been created by the grandchild process.  When invoked
as a Type=forking systemd service, this can result in the following:

  systemd[1]: Starting ZFS Event Daemon (zed)...
  systemd[1]: PID file /var/run/zed.pid not readable (yet?) after start.

This commit adds a daemonize pipe to allow the grandchild process to
signal the parent process that initialization is complete (and the
pidfile has been created).  The parent process will wait for this
notification before exiting.
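
The idea behind the pipe can be sketched in a few lines. For brevity this sketch uses a single fork() rather than the double fork that yields a grandchild, and all names in it are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for real initialization such as creating the pidfile. */
static void
example_init(void)
{
}

/*
 * Fork a child that runs init() and then writes one byte to a pipe.
 * The parent blocks in read() until that byte arrives, so it cannot
 * return (or exit) before initialization is complete.
 * Returns 0 if the child signalled completion, -1 otherwise.
 */
static int
start_and_wait_for_init(void (*init)(void))
{
	int fds[2];
	pid_t pid;
	ssize_t n;
	char c = 1;

	if (pipe(fds) < 0)
		return (-1);
	pid = fork();
	if (pid < 0) {
		(void) close(fds[0]);
		(void) close(fds[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: initialize, then notify the parent. */
		(void) close(fds[0]);
		init();
		if (write(fds[1], &c, 1) != 1)
			_exit(1);
		(void) close(fds[1]);
		_exit(0);	/* a real daemon would enter its main loop */
	}
	/* Parent: read() returns 0 if the child died without notifying. */
	(void) close(fds[1]);
	n = read(fds[0], &c, 1);
	(void) close(fds[0]);
	(void) waitpid(pid, NULL, 0);	/* reap the short-lived demo child */
	return (n == 1 ? 0 : -1);
}
```

Because the parent closes its copy of the write end, a child that exits before writing produces end-of-file on the read end instead of leaving the parent blocked forever.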

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2252
2014-09-02 14:18:53 -07:00
Tim Chase 603cb25ca5 zed needs libzfs_core
As of a recent group of Illumos/Delphix updates, zed needs libzfs_core
in order to resolve lzc_get_bookmarks() and likely other functions
going forward.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2534
2014-07-31 09:49:01 -07:00
Chris Dunlap 6ac770b196 Replace zed_file_create_dirs() with mkdirp()
When processing directory components starting from the root dir,
zed_file_create_dirs() contained a bug in checking the return value of
mkdir().  A typo was made, and the test for (mkdir_errno != EEXIST) was
erroneously written as (mkdir_errno == EEXIST).  If some of the leading
directory components already existed, this bug would cause the routine
to exit before creating the remaining directory components.

Instead of fixing the above mkdir_errno test, this commit replaces
zed_file_create_dirs() with mkdirp().  This cleanup was already
planned, and zed_file_create_dirs() only existed because I didn't
realize mkdirp() was already in tree at the time.
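
To make the bug concrete, here is a sketch of a component-by-component directory creation loop with the return value test written the intended way. It is neither the removed zed_file_create_dirs() nor the in-tree mkdirp(); create_dirs() is a hypothetical name:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Create every directory component of path, like "mkdir -p".
 * Returns 0 on success or -1 with errno set.  This sketch does not
 * verify that an already existing final component is a directory.
 */
static int
create_dirs(const char *path, mode_t mode)
{
	char buf[PATH_MAX];
	size_t len = strlen(path);
	char *p;

	if (len == 0 || len >= sizeof (buf)) {
		errno = EINVAL;
		return (-1);
	}
	memcpy(buf, path, len + 1);

	for (p = buf + 1; ; p++) {
		if (*p == '/' || *p == '\0') {
			char saved = *p;

			*p = '\0';
			/*
			 * An existing component is fine; only other
			 * errors are fatal.  Testing (errno == EEXIST)
			 * here would abort as soon as a leading
			 * component already existed.
			 */
			if (mkdir(buf, mode) < 0 && errno != EEXIST)
				return (-1);
			*p = saved;
			if (saved == '\0')
				break;
		}
	}
	return (0);
}
```

Calling it twice on the same path succeeds both times, which is exactly the case the inverted test broke.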

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2248
2014-04-09 13:32:54 -07:00
Chris Dunlap 518eba1492 Replace check for _POSIX_MEMLOCK w/ HAVE_MLOCKALL
zed supports a '-M' cmdline opt to lock all pages in memory via
mlockall().  The _POSIX_MEMLOCK define is checked to determine whether
this function is supported.  The current test assumes mlockall()
is supported if _POSIX_MEMLOCK is non-zero.  However, this test is
insufficient according to mlock(2) and sysconf(3).  If _POSIX_MEMLOCK
is -1, mlockall() is not supported; but if _POSIX_MEMLOCK is 0,
availability must be checked at runtime.

This commit adds an autoconf check for mlockall() to user.m4.  The zed
code block for mlockall() is now guarded with a test for HAVE_MLOCKALL.
If defined, mlockall() will be called and its runtime availability
checked via its return value.
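
The guard described above might look roughly like this. HAVE_MLOCKALL is the macro named in the commit; the function name and the ENOSYS fallback are assumptions of this sketch:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#ifdef HAVE_MLOCKALL
#include <sys/mman.h>
#endif

/*
 * Lock all current and future pages in memory if mlockall() was
 * found at configure time.  Runtime availability is determined by
 * the return value.  Returns 0 on success, -1 on failure.
 */
static int
lock_all_memory(void)
{
#ifdef HAVE_MLOCKALL
	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
		fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
		return (-1);
	}
	return (0);
#else
	/* Not detected at build time (fallback chosen for this sketch). */
	errno = ENOSYS;
	return (-1);
#endif
}
```

With this shape, the compile-time check only decides whether the call is attempted at all; a failure at runtime (for example due to insufficient privileges or resource limits) is still reported through the return value.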

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
2014-04-02 13:10:08 -07:00
Brian Behlendorf 904ea2763e Add automatic hot spare functionality
When a vdev starts getting I/O or checksum errors it is now
possible to automatically rebuild to a hot spare device.

To cleanly support this functionality in a shell script some
additional information was added to all zevent ereports which
include a vdev.  This covers both io and checksum zevents but
may be used by other scripts.

In the Illumos FMA solution the same information is required
but it is retrieved through the libzfs library interface.
Specifically the following members were added:

  vdev_spare_paths  - List of vdev paths for all hot spares.
  vdev_spare_guids  - List of vdev guids for all hot spares.
  vdev_read_errors  - Read errors for the problematic vdev.
  vdev_write_errors - Write errors for the problematic vdev.
  vdev_cksum_errors - Checksum errors for the problematic vdev.

By default the required hot spare scripts are installed but this
functionality is disabled.  To enable hot sparing uncomment the
ZED_SPARE_ON_IO_ERRORS and ZED_SPARE_ON_CHECKSUM_ERRORS settings in
the /etc/zfs/zed.d/zed.rc configuration file.

These scripts do not add support for the autoexpand property. At
a minimum this requires adding a new udev rule to detect when
a new device is added to the system.  It also requires that the
autoexpand policy be ported from Illumos, see:

  https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/syseventd/modules/zfs_mod/zfs_mod.c

Support for detecting the correct name of a vdev when it's not
a whole disk was added by Turbo Fredriksson.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Issue #2
2014-04-02 13:10:08 -07:00
Chris Dunlap 9e246ac3d8 Initial implementation of zed (ZFS Event Daemon)
zed monitors ZFS events.  When a zevent is posted, zed will run any
scripts that have been enabled for the corresponding zevent class.
Multiple scripts may be invoked for a given zevent.  The zevent
nvpairs are passed to the scripts as environment variables.

Events are processed synchronously by the single thread, and there is
no maximum timeout for script execution.  Consequently, a misbehaving
script can delay (or forever block) the processing of subsequent
zevents.  Plans are to address this in future commits.
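
A simplified sketch of this execution model: the script receives its data as "NAME=VALUE" environment strings, and the caller waits for it with no timeout. The function name and the ZEVENT_CLASS variable used in the usage note are hypothetical illustrations, not taken from this commit:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the executable at path with the given argument vector and an
 * environment made of "NAME=VALUE" strings, then block until it
 * exits.  Returns the exit status, or -1 on error or abnormal
 * termination.  There is no timeout: a script that never exits
 * blocks the caller forever.
 */
static int
run_script(const char *path, char *const argv[], char *const envp[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0) {
		(void) execve(path, argv, envp);
		_exit(127);	/* only reached if execve() failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

For instance, passing an environment containing "ZEVENT_CLASS=checksum" to /bin/sh lets the script branch on "$ZEVENT_CLASS". The blocking waitpid() is what makes a misbehaving script able to stall all later events.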

Initial scripts have been developed to log events to syslog
and send email in response to checksum/data/io errors and
resilver.finish/scrub.finish events.  By default, email will only
be sent if the ZED_EMAIL variable is configured in zed.rc (which is
serving as a config file of sorts until a proper configuration file
is implemented).

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
2014-04-02 13:10:03 -07:00