Commit Graph

256 Commits

Author SHA1 Message Date
Brian Behlendorf 343bcf823f Refresh autogen products with automake 1.11.1 2010-05-21 16:04:53 -07:00
Brian Behlendorf e3a272f273 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2010-05-21 15:37:10 -07:00
Brian Behlendorf 542052314e Merge branch 'linux-events' into refs/top-bases/linux-zfs-branch 2010-05-21 15:35:53 -07:00
Brian Behlendorf 1573fc9932 Cast to (const time_t *) to fix compiler type warning. 2010-05-21 15:06:25 -07:00
Brian Behlendorf 3193647679 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2010-05-21 10:51:52 -07:00
Brian Behlendorf 14152c7b2c Merge branch 'linux-user-util' into refs/top-bases/linux-zfs-branch 2010-05-21 10:49:47 -07:00
Brian Behlendorf 868e5de066 Add linux-user-util topic branch.
This topic branch contains required changes to the user space
utilities to allow them to integrate cleanly with Linux.
2010-05-21 10:47:59 -07:00
Brian Behlendorf 6e06d537fe Refresh autogen products 2010-05-14 13:36:31 -07:00
Brian Behlendorf 66c45ef1f4 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2010-05-14 12:59:08 -07:00
Brian Behlendorf feb723fa7d Merge branch 'linux-events' into refs/top-bases/linux-zfs-branch 2010-05-14 12:55:10 -07:00
Brian Behlendorf 7d30d5032a Merge branch 'linux-docs' into refs/top-bases/linux-zfs-branch 2010-05-14 12:53:12 -07:00
Brian Behlendorf 97d19a5e45 Add linux-events topic branch for zevent handling.
This topic branch leverages the Solaris style FMA call points
in ZFS to create a user space visible event notification system
under Linux.  This new system is called zevent and it unifies
all previous Solaris style ereports and sysevent notifications.

Under this Linux specific scheme when a sysevent or ereport event
occurs an nvlist describing the event is created which looks almost
exactly like a Solaris ereport.  These events are queued up in the
kernel when they occur and conditionally logged to the console.
It is then up to a user space application to consume the events
and do whatever it likes with them.

To make this possible the existing /dev/zfs ABI has been extended
with two new ioctls which behave as follows.

* ZFS_IOC_EVENTS_NEXT
Get the next pending event.  The kernel will keep track of the last
event consumed by the file descriptor and provide the next one if
available.  If no new events are available the ioctl() will block
waiting for the next event.  This ioctl may also be called in a
non-blocking mode by setting zc.zc_guid = ZEVENT_NONBLOCK.  In the
non-blocking case if no events are available ENOENT will be returned.
It is possible that ESHUTDOWN will be returned if the ioctl() is
called while module unloading is in progress.  And finally ENOMEM
may occur if the provided nvlist buffer is not large enough to
contain the entire event.

* ZFS_IOC_EVENTS_CLEAR
Clear all events queued by the kernel.  The kernel will keep a fairly
large number of recent events queued; use this ioctl to clear the
in-kernel list.  This will affect all user space processes consuming
events.
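
The error cases listed for ZFS_IOC_EVENTS_NEXT can be sketched as a
small decision function.  This is an illustration only: the enum and
function names below are hypothetical and do not come from the ZFS
source, and the ioctl() call itself is left out because it needs the
ZFS headers.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical consumer actions; these names are not from the ZFS source. */
enum zevent_action {
    ZEVENT_ACT_CONSUME,     /* the ioctl returned an event */
    ZEVENT_ACT_RETRY_LATER, /* non-blocking call, nothing pending */
    ZEVENT_ACT_GROW_BUFFER, /* nvlist buffer too small for the event */
    ZEVENT_ACT_STOP,        /* module unload in progress */
    ZEVENT_ACT_FAIL         /* any other error */
};

/*
 * Map the outcome of a ZFS_IOC_EVENTS_NEXT call to an action.
 * 'rc' is the ioctl() return value, 'err' the errno seen after it.
 */
enum zevent_action
zevent_next_action(int rc, int err)
{
    /* A failed call is expected to have set errno to a positive value. */
    assert(rc == 0 || err > 0);

    if (rc == 0)
        return (ZEVENT_ACT_CONSUME);

    switch (err) {
    case ENOENT:
        return (ZEVENT_ACT_RETRY_LATER);
    case ENOMEM:
        return (ZEVENT_ACT_GROW_BUFFER);
    case ESHUTDOWN:
        return (ZEVENT_ACT_STOP);
    default:
        return (ZEVENT_ACT_FAIL);
    }
}
```

A consumer would typically call this right after the ioctl(), passing
the saved errno, and loop while the result is ZEVENT_ACT_CONSUME.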

The zpool command has been extended to use this events ABI with the
'events' subcommand.  You may run 'zpool events -v' to output a
verbose log of all recent events.  This is very similar to the
Solaris 'fmdump -ev' command with the key difference being it also
includes what would be considered sysevents under Solaris.  You
may also run in follow mode with the '-f' option.  To clear the
in-kernel event queue use the '-c' option.

$ sudo cmd/zpool/zpool events -fv
TIME                        CLASS
May 13 2010 16:31:15.777711000 ereport.fs.zfs.config.sync
        class = "ereport.fs.zfs.config.sync"
        ena = 0x40982b7897700001
        detector = (embedded nvlist)
                version = 0x0
                scheme = "zfs"
                pool = 0xed976600de75dfa6
        (end detector)

        time = 0x4bec8bc3 0x2e5aed98
        pool = "zpios"
        pool_guid = 0xed976600de75dfa6
        pool_context = 0x0

While the 'zpool events' command is handy for interactive debugging
it is not expected to be the primary consumer of zevents.  This ABI
was primarily added to facilitate the addition of a user space
monitoring daemon.  This daemon would consume all events posted by
the kernel and based on the type of event perform an action.  For
most events simply forwarding them on to syslog is likely enough.
But this interface also cleanly allows for more sophisticated
actions to be taken such as generating an email for a failed drive.
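
One way such a daemon might choose an action per event class is a
prefix table with syslog as the default.  This is only a sketch under
assumptions: the handler names are hypothetical, and the
"ereport.fs.zfs.vdev." prefix is a placeholder for whatever classes a
real daemon would treat as a failed drive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handlers; a real daemon would define its own set. */
enum zevent_handler {
    ZEVENT_TO_SYSLOG, /* default: forward the event to syslog */
    ZEVENT_TO_EMAIL   /* escalate: generate an email */
};

struct zevent_rule {
    const char *prefix;          /* event class prefix to match */
    enum zevent_handler handler; /* action for matching events */
};

/* Placeholder policy; the prefix is an assumption, not from the commit. */
static const struct zevent_rule zevent_rules[] = {
    { "ereport.fs.zfs.vdev.", ZEVENT_TO_EMAIL },
};

/* Return the handler for an event class, defaulting to syslog. */
enum zevent_handler
zevent_pick_handler(const char *event_class)
{
    size_t i;

    assert(event_class != NULL);

    for (i = 0; i < sizeof (zevent_rules) / sizeof (zevent_rules[0]); i++) {
        const struct zevent_rule *r = &zevent_rules[i];

        if (strncmp(event_class, r->prefix, strlen(r->prefix)) == 0)
            return (r->handler);
    }
    return (ZEVENT_TO_SYSLOG);
}
```

With this table the "ereport.fs.zfs.config.sync" event shown above
would simply be forwarded to syslog.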
2010-05-14 12:40:44 -07:00
Brian Behlendorf 98d5d8bd50 Add missing include path for FMA aware zpool command. 2010-05-14 11:57:48 -07:00
Brian Behlendorf 7b34c839f9 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2010-04-23 11:09:04 -07:00
Brian Behlendorf aafdbe5d6e Check all partitions with check_file() even when no libblkid is found
When creating a new pool on a block device we need to check all the
partitions even if we don't have libblkid support.  In this case
we can't consult the blkid cache but we still can call check_file()
and attempt to read a valid label from each partition.

Additionally, the O_EXCL flag was removed because the device will
be opened multiple times and this was causing the check to fail.
The device is only opened read-only anyway so this is still safe.

$ sudo zpool create tank /dev/sdz
invalid vdev specification
use '-f' to override the following errors:
/dev/sdz1 is part of potentially active pool 'tank'
2010-04-23 10:59:31 -07:00
Brian Behlendorf 34edbcd956 Refresh autogen products 2010-03-26 15:57:19 -07:00
Brian Behlendorf ddb2e7a5c5 Refresh autogen products 2010-03-22 17:01:13 -07:00
Brian Behlendorf 02d15b4e4f Refresh autogen products 2010-03-08 10:57:16 -08:00
Brian Behlendorf 5786166a86 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2010-01-08 11:40:38 -08:00
Brian Behlendorf 60c9121dd2 Merge commit 'refs/top-bases/linux-user-disk' into linux-user-disk 2010-01-08 11:40:14 -08:00
Brian Behlendorf 889f0e5e30 Merge branch 'linux-docs' into refs/top-bases/linux-zfs-branch 2010-01-08 11:39:32 -08:00
Brian Behlendorf 303d9f010d Merge commit 'refs/top-bases/zfs-branch' into zfs-branch 2010-01-08 11:39:31 -08:00
Brian Behlendorf 6cb71e1dec Merge branch 'gcc-branch' into refs/top-bases/zfs-branch 2010-01-08 11:39:14 -08:00
Brian Behlendorf e69572c1b5 Merge branch 'gcc-c90' into refs/top-bases/gcc-branch 2010-01-08 11:39:00 -08:00
Brian Behlendorf 4cd8e49a69 Add .gitignore files to exclude build products 2010-01-08 11:35:17 -08:00
Brian Behlendorf 9b473082fa Refresh autogen products 2009-12-23 14:53:51 -08:00
Brian Behlendorf 840aa5356d Refresh autogen products 2009-11-20 12:14:59 -08:00
Brian Behlendorf 8a662b7de1 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2009-11-20 12:12:37 -08:00
Brian Behlendorf aebe6818a9 Linux ZVOL implementation; user-side changes
At last a useful user space interface for the Linux ZFS port arrives.
With the addition of the ZVOL, real ZFS based block devices are available
and can be compared head to head with Linux's MD and LVM block drivers.
The Linux ZVOL has not yet had any performance work done but from a user
perspective it should be functionally complete and behave like any other
Linux block device.

The ZVOL has so far been tested using zconfig.sh on the following x86_64
based platforms: FC11, CHAOS4, RHEL5, RHEL6, and SLES11.  However, more
testing is required to ensure everything is working as designed.

What follows is a somewhat detailed list of changes included in this
commit to make ZVOLs possible.  A few other issues were addressed in
the context of these changes which will also be mentioned.

* zvol_create_link_common() simplified to issue the ioctl to
create the device and then wait up to 10 seconds for it to appear.
The device will be created within a few milliseconds by udev under
/dev/<pool>/<volume>.  Note this naming convention is slightly
different than on Solaris but I feel is more Linuxy.

* Removed support for dump vdevs.  This concept is specific to Solaris
and does not map cleanly to Linux.  Under Linux generating system cores
is preferably done over the network via netdump, or alternately to a
block device via O_DIRECT.
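
The "issue the ioctl, then wait up to 10 seconds for the device to
appear" step could be sketched as a bounded polling loop.  This is not
the code from zvol_create_link_common(); the function name, the polling
interval and the use of stat() are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Poll until 'path' exists or roughly 'timeout_ms' milliseconds pass.
 * Returns 0 once the path is visible, -1 on timeout (errno = ETIMEDOUT)
 * or on any stat() error other than ENOENT.  Elapsed time is
 * approximated by counting the sleeps, which is enough for a sketch.
 */
int
wait_for_path(const char *path, long timeout_ms, long poll_ms)
{
    struct stat st;
    struct timespec delay;
    long waited = 0;

    assert(path != NULL && poll_ms > 0);

    delay.tv_sec = poll_ms / 1000;
    delay.tv_nsec = (poll_ms % 1000) * 1000000L;

    for (;;) {
        if (stat(path, &st) == 0)
            return (0);
        if (errno != ENOENT)
            return (-1);
        if (waited >= timeout_ms) {
            errno = ETIMEDOUT;
            return (-1);
        }
        (void) nanosleep(&delay, NULL);
        waited += poll_ms;
    }
}
```

For a link under /dev/<pool>/<volume> a caller might use a 10000 ms
timeout with a short polling interval, since udev is expected to create
the node within a few milliseconds.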
2009-11-20 12:00:08 -08:00
Brian Behlendorf 81c56431ae Refresh autogen products 2009-11-16 10:42:39 -08:00
Brian Behlendorf 6caa088ff3 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2009-11-15 16:11:08 -08:00
Brian Behlendorf e576375b9f Merge branch 'linux-have-zpl' into refs/top-bases/linux-zfs-branch 2009-11-15 16:11:05 -08:00
Brian Behlendorf e588ef08cb Revert contents of linux-have-zpl topic branch. 2009-11-15 16:06:10 -08:00
Brian Behlendorf 75b67634af Refresh autogen products. 2009-11-12 12:57:46 -08:00
Brian Behlendorf d34108ca05 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2009-10-27 15:04:33 -07:00
Brian Behlendorf 22c51d6136 Merge branch 'linux-user-disk' into refs/top-bases/linux-zfs-branch 2009-10-27 15:03:46 -07:00
Brian Behlendorf a9accbcb57 Always open using O_EXCL to ensure the device is not in use.
Allow partition tables on md devices but not dm- devices.
2009-10-27 14:58:12 -07:00
Brian Behlendorf e71166a384 Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2009-10-23 16:33:24 -07:00
Brian Behlendorf 1a42b319f6 Merge branch 'linux-user-disk' into refs/top-bases/linux-zfs-branch 2009-10-23 16:33:22 -07:00
Brian Behlendorf 29c9a2518c Properly handle block devices other than IDE and SCSI disks.
Based on the block device type we can expect a specific naming
convention.  With this in mind update efi_get_info() to be more
aware of the type when parsing out the partition number.  In
addition, be aware that not all block device types are partitionable.
Finally, when attempting to lookup a device partition by appending
the partition number to the whole device take into account the
kernel naming scheme.  If the last character of the device name
is a digit the partition will always be 'p#' instead of just '#'.
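
The naming rule in the last sentence can be sketched as follows.  The
function name and signature are hypothetical and this is not the code
from efi_get_info(); it only shows the 'p#' versus '#' decision.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the partition device name for whole-disk device 'disk'.
 * If the disk name ends in a digit the kernel uses a 'p' separator
 * (md0 -> md0p1), otherwise the number is appended directly
 * (sda -> sda1).  Returns 0 on success, -1 on bad input or if the
 * result does not fit in 'buf'.
 */
int
partition_path(const char *disk, unsigned int part, char *buf, size_t buflen)
{
    size_t len;
    const char *sep;
    int n;

    assert(disk != NULL && buf != NULL);

    len = strlen(disk);
    if (len == 0 || part == 0)
        return (-1);

    sep = isdigit((unsigned char)disk[len - 1]) ? "p" : "";
    n = snprintf(buf, buflen, "%s%s%u", disk, sep, part);

    return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}
```

For example, partition 2 of /dev/md0 becomes /dev/md0p2 while
partition 1 of /dev/sda becomes /dev/sda1.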
2009-10-23 16:25:16 -07:00
Brian Behlendorf f11e5e26e2 Refresh autogen products 2009-10-23 12:34:20 -07:00
Brian Behlendorf c13c22eaad Merge commit 'refs/top-bases/linux-configure-branch' into linux-configure-branch 2009-10-23 12:29:26 -07:00
Brian Behlendorf a227047d89 Merge commit 'refs/top-bases/linux-have-zpl' into linux-have-zpl
Conflicts:

	cmd/zfs/zfs_main.c
2009-10-23 12:29:02 -07:00
Brian Behlendorf e8e3a8ae70 Merge branch 'linux-user-disk' into refs/top-bases/linux-zfs-branch 2009-10-23 12:28:12 -07:00
Brian Behlendorf 8a34963bec Merge commit 'refs/top-bases/linux-user-disk' into linux-user-disk 2009-10-23 12:28:10 -07:00
Brian Behlendorf a56b8d337f Merge branch 'linux-docs' into refs/top-bases/linux-zfs-branch
Conflicts:

	cmd/zfs/zfs_main.c
2009-10-23 12:25:33 -07:00
Brian Behlendorf 74b67983f1 Merge commit 'refs/top-bases/zfs-branch' into zfs-branch 2009-10-23 12:24:39 -07:00
Brian Behlendorf edb22b6a3e Merge branch 'gcc-branch' into refs/top-bases/zfs-branch 2009-10-23 12:24:38 -07:00
Brian Behlendorf d8d360724d Merge branch 'gcc-uninit' into refs/top-bases/gcc-branch 2009-10-23 12:24:37 -07:00
Brian Behlendorf 24f3d6e49e Misc fixes based on testing with the dragon config.
In check_disk() we should only check the entire device if it
is not a whole disk.  If it is a whole disk with an EFI label on it,
it is possible that libblkid will misidentify the device as a
filesystem.  I had a case yesterday where 2 bytes in the EFI
GUID happened to be set to the right values such that libblkid
decided there was a minix filesystem there.  If it's a whole
device we look for an EFI label.

If we are able to read the backup EFI label from a device but
the primary is corrupt, then don't bother trying to stat
the partitions in /dev/.  The kernel will not create devices
using the backup label when the primary is damaged.

Add code to determine if we have a udev path instead of a
normal device path.  In this case use the -part# partition
naming scheme instead of the /dev/disk# scheme.  This is
important because we always want to access devices using
the full path provided at configuration time.

Re-added support for zpool_relabel_disk().  Now that we have
the full libefi library in place we do have access to this
functionality.

Lots of additional paranoia to ensure EFI labels are written
correctly.  These changes include:

1) Removing the O_NDELAY flag when opening a file descriptor
for libefi.  This flag should really only be used when you
do not intend to do any file IO.  Under Solaris only ioctl()s
were performed; under Linux we do perform reads and writes.

2) Use O_DIRECT to ensure any caching is bypassed while
writing or reading the EFI labels.  This change forces the
use of sector aligned memory buffers which are allocated
using posix_memalign().

3) Add additional efi_debug error messages to efi_ioctl().

4) While doing an fsync is good to ensure the EFI label is on
disk we can, and should, go one step further by issuing the
BLKFLSBUF ioctl().  This signals the kernel to instruct the
drive to flush its on-disk cache.

5) Because of some initial strangeness I observed in testing
with some flaky drives, be extra paranoid in zpool_label_disk().
After we've written the device without error, flushed the drive
caches, and correctly detected the new partitions created by the
kernel, additionally read back the EFI label from user
space to make sure it is intact and correct.  I don't think we
can ever be too careful here.
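
Item 2 above depends on sector-aligned buffers, since O_DIRECT
generally requires the buffer address and transfer length to be
aligned.  A minimal sketch of such an allocation follows.  The helper
name is hypothetical, the caller is assumed to know the sector size
(512 is only an example value), and this is not the libefi code itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a zeroed buffer suitable for O_DIRECT I/O: the address is
 * aligned to 'sector_size' and the length is rounded up to a whole
 * number of sectors.  The rounded length is stored in '*alloc_len'
 * when that pointer is not NULL.  Returns NULL on allocation failure;
 * release the buffer with free().
 */
void *
alloc_sector_buffer(size_t len, size_t sector_size, size_t *alloc_len)
{
    void *buf = NULL;
    size_t rounded;

    /* posix_memalign() wants a power of two that is >= sizeof (void *). */
    assert(sector_size >= sizeof (void *));
    assert((sector_size & (sector_size - 1)) == 0);
    assert(len > 0);

    rounded = (len + sector_size - 1) & ~(sector_size - 1);
    if (posix_memalign(&buf, sector_size, rounded) != 0)
        return (NULL);

    memset(buf, 0, rounded);
    if (alloc_len != NULL)
        *alloc_len = rounded;
    return (buf);
}
```

A 700 byte label request with 512 byte sectors would yield a 1024 byte
buffer whose address is a multiple of 512.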

NOTE: There was recently some concern expressed that writing EFI
labels from user space on Linux was not the right way to do this.
That instead two kernel ioctl()s should be used to create and
remove partitions.  After some investigation it's clear to me
using those ioctl()s would be a bad idea.  They in fact don't
actually write partition tables to the disk, they only create
the partition devices in the kernel.  So what you really want
to do is write the label out from user space, then prompt the
kernel to re-read the partition table from disk to create the partitions.
This is in fact exactly what newer versions of parted do.
2009-10-23 11:57:59 -07:00