Solaris recently introduced the idea of drive topology because
where a drive is located does matter. I have already handled
this with udev/blkid integration under Linux so I'm hopeful
this case can simply be removed but for now I've just stubbed
out what is needed in libspl and commented out the rest here.
The upstream ZFS code has correctly moved to a faster native sha2
implementation. Unfortunately, under Linux that's going to be a little
problematic so we revert the code to the more portable version contained
in earlier ZFS releases. Using the native sha2 implementation in Linux
is possible but the API is slightly different in kernel version user
space depending on which libraries are used. Ideally, we need a fast
implementation of SHA256 which builds as part of ZFS this shouldn't be
that hard to do but it will take some effort.
This is a portability change which removes the dependence of the Solaris
thread library. All locations where Solaris thread API was used before
have been replaced with equivilant Solaris kernel style thread calls.
In user space the kernel style threading API is implemented in term of
the portable pthreads library. This includes all threads, mutexs,
condition variables, reader/writer locks, and taskqs.
The major change is removing the thread pool when importing devices.
This may be reintroduced at some point if needed, but it is added
complexity which has already been handled by blkid on modern Linux
systems. We only need to fallback to probing everything is /dev/
if you config file is toast and even then it only takes a few seconds.
Added print_timestamp() compatibility function, this will be needed
long term but it's a simply enough addition.
Added Solaris style label functions. This was done simply to aid in
the initial update to onnv_141. I'm hopeful that after more careful
inspection all of this can be removed and we can integrate with a
more Linux friendly Solution without breaking any compatibility.
Added several missing headers which are required by the updated
version of ZFS. As usual I just add empty headers if needed because
it's easier than tracking the change against the core ZFS code.
Added SEC, MILLISEC, MICROSEC defines if unavailable.
Added missing xuio structure and typedefs. I'm hopeful these can
be removed as well once we crack the zero-copy nut under Linux.
Almost exclusively this patch handled the addition of another char
array to the zfs_cmd_t structure. Unfortunately c90 doesn't allow
zero filling the entire struct with the '= { 0 };' shorthand.
While I would rather fix all the instances where something is shadowed
it complicates tracking the OpenSolaris code where they either don't
seem to care or have different conflicts. Anyway, this ends up being
more simply gratutous change than I care for.
These changes do not belong on linux-kernel-module since they
are tweaks to user space utilities. I'm reverting them from
this topic branch and will be moving them to a new topic branch
which can be used for just this sort of thing.
It turns out the zil allocates quite large buffers. This isn't
all the surprising but we need to suppress the warnings until
it's clear what to do about it.
Interestingly this has only been a problem on a clean RHEL6
install so I suspect the include was removed from one of the
standard system include headers. We should be including it
explicitly anyway since it's used in both of these .c files.
Fold in a minor fix from the removed linux-legacy branch to skip checking
character devices in /dev/ during import. Under Linux pread() may simply
block when reading from these devices causing the import to hang.
A change to the nvpair implementation should not have been made in
the libspl-topic branch. This patch fixes that accident by reverting
the change and providing the missing libspl header to allow the
proper building of nvpair_alloc_system.c without the need to modify it.
These changes were made when some development was still occuring
under RHEL4. Since then the needed fopendir() and atopen() APIs
have become commonly available under Linux. Since I have no
intention of supporting systems as old as RHEL4 we can safely
drop the topic branch.
This topic branch leverages the Solaris style FMA call points
in ZFS to create a user space visible event notification system
under Linux. This new system is called zevent and it unifies
all previous Solaris style ereports and sysevent notifications.
Under this Linux specific scheme when a sysevent or ereport event
occurs an nvlist describing the event is created which looks almost
exactly like a Solaris ereport. These events are queued up in the
kernel when they occur and conditionally logged to the console.
It is then up to a user space application to consume the events
and do whatever it likes with them.
To make this possible the existing /dev/zfs ABI has been extended
with two new ioctls which behave as follows.
* ZFS_IOC_EVENTS_NEXT
Get the next pending event. The kernel will keep track of the last
event consumed by the file descriptor and provide the next one if
available. If no new events are available the ioctl() will block
waiting for the next event. This ioctl may also be called in a
non-blocking mode by setting zc.zc_guid = ZEVENT_NONBLOCK. In the
non-blocking case if no events are available ENOENT will be returned.
It is possible that ESHUTDOWN will be returned if the ioctl() is
called while module unloading is in progress. And finally ENOMEM
may occur if the provided nvlist buffer is not large enough to
contain the entire event.
* ZFS_IOC_EVENTS_CLEAR
Clear are events queued by the kernel. The kernel will keep a fairly
large number of recent events queued, use this ioctl to clear the
in kernel list. This will effect all user space processes consuming
events.
The zpool command has been extended to use this events ABI with the
'events' subcommand. You may run 'zpool events -v' to output a
verbose log of all recent events. This is very similar to the
Solaris 'fmdump -ev' command with the key difference being it also
includes what would be considered sysevents under Solaris. You
may also run in follow mode with the '-f' option. To clear the
in kernel event queue use the '-c' option.
$ sudo cmd/zpool/zpool events -fv
TIME CLASS
May 13 2010 16:31:15.777711000 ereport.fs.zfs.config.sync
class = "ereport.fs.zfs.config.sync"
ena = 0x40982b7897700001
detector = (embedded nvlist)
version = 0x0
scheme = "zfs"
pool = 0xed976600de75dfa6
(end detector)
time = 0x4bec8bc3 0x2e5aed98
pool = "zpios"
pool_guid = 0xed976600de75dfa6
pool_context = 0x0
While the 'zpool events' command is handy for interactive debugging
it is not expected to be the primary consumer of zevents. This ABI
was primarily added to facilitate the addition of a user space
monitoring daemon. This daemon would consume all events posted by
the kernel and based on the type of event perform an action. For
most events simply forwarding them on to syslog is likely enough.
But this interface also cleanly allows for more sophisticated
actions to be taken such as generating an email for a failed drive
These changes lay some of the ground work for supporting something
similar to FMA event under Solaris. In particular these changes
add or modify the following areas.
First off an implementation of the gethrestime() function is added
to libspl. Secondly, the missing type processorid_t has been added.
And finally the lib/libspl/include/sys/fm/{protocol.h|util.h} stub
headers have been removed in favor of updating the full versions
in module/zfs/include/sys/fm/{protocol.h|util.h} to work cleanly
in both user and kernel space.