Archive-Team/zfs

Commit Graph

Author	SHA1	Message	Date
James Lee	16a276f109	zfs-import: Perform verbatim import using cache file This change modifies the import service to use the default cache file to perform a verbatim import of pools at boot. This fixes code that searches all devices and imported all visible pools. Using the cache file is in keeping with the way ZFS has always worked, how Solaris, Illumos, FreeBSD, and systemd performs imports, and is how it is written in the man page (zpool(1M,8)): All pools in this cache are automatically imported when the system boots. Importantly, the cache contains important information for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle. Verbatim imports prevent rogue pools from being automatically imported and mounted where they shouldn't be. The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed explicitly by the administrator of the system. The old behavior of searching and importing all visible pools is preserved and can be switched on by heeding the warning and toggling the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs. Signed-off-by: James Lee <jlee@thestaticvoid.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3777 Closes #3526	2015-10-13 14:33:49 -07:00
Turbo Fredriksson	99108db0a8	Init script fixes * Fix regression - "OVERLAY_MOUNTS" should have been "DO_OVERLAY_MOUNTS". * Fix update-rc.d commands in postinst. Thanks to subzero79@GitHub. * Make sure a filesystem exists before trying to mount in mount_fs(). * Fix local variable usage. * Fix to read_mtab(): * Strip control characters (space - \040) from /proc/mounts GLOBALLY, not just first occurrence. * Don't replace unprintable characters ([/-. ]) for use in the variable name with underscore. No need, just remove them altogether. * Add check_boolean() to check if a user-configured option is set ('yes', 'Yes', 'YES' or any combination thereof) OR '1'. Anything else is considered 'unset'. * Add a ZFS_POOL_IMPORT to the default config. * This is a semicolon-separated list of pools to import ONLY. * This is intended for systems which have _a lot_ of pools (from a SAN for example) and it would be too many to put in the ZFS_POOL_EXCEPTIONS variable. * Add a config option "ZPOOL_IMPORT_OPTS" for adding additional options to "zpool import". * Add documentation and the chance of overriding the ZPOOL_CACHE variable in the config file. * Remove "sort" from find_pools() and setup_snapshot_booting(). Sometimes not available, and not really necessary. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ned Bass <bass6@llnl.gov> Issue #3816	2015-09-29 15:27:14 -07:00
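The option handling described in commit 99108db0a8 can be illustrated with a minimal Python sketch. The real helpers are shell functions in the init scripts; the Python function `parse_pool_list` and the exact handling of empty or missing values are assumptions made for illustration only.

```python
def check_boolean(value):
    """Return True when an option counts as 'set'.

    Per the commit message: 'yes' in any letter case, or '1'.
    Anything else (including a missing value) is treated as unset.
    """
    if value is None:
        return False
    return value.lower() == "yes" or value == "1"


def parse_pool_list(value):
    """Split a semicolon-separated pool list such as ZFS_POOL_IMPORT.

    Hypothetical helper: empty entries are dropped, and names are not
    stripped, since pool names may legitimately contain spaces.
    """
    return [name for name in value.split(";") if name]
```

A semicolon is a convenient separator here because a pool name such as `rpool 1` contains a space, which rules out whitespace-separated lists.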
yuina822	e75e501265	Fixed --signal typo Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #3773	2015-09-29 15:27:14 -07:00
yuina822	b9889021d4	Add extra_started_commands because reload function is not default Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #3773	2015-09-29 15:27:14 -07:00
SenH	1e17e910ea	Force create /run/sendsigs.omit.d link when starting zed Resolve the following error when restarting the zed by force creating the /run/sendsigs.omit.d/zed link. sudo /etc/init.d/zfs-zed restart * Stopping ZFS Event Daemon [ OK ] * Starting ZFS Event Daemon ln: failed to create symbolic link `/run/sendsigs.omit.d/zed': File exists Signed-off-by: SenH <sen@senhaerens.be> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3747	2015-09-08 09:45:34 -07:00
James Lee	3f1cc17c90	Reorder zfs-* services to allow /var on separate dataset ZED depends on /var. When /var is a separate dataset, it must be mounted before starting ZED. This change moves the zfs-zed service from starting first, to starting after zfs-mount, but before zfs-share. As discussed in issue #3513, ZED does not need to start first in order to consume events made during the zfs-import and zfs-mount services. The events will be queued and can be handled later in the boot process. ZED may, however, handle sharing in the future, so it should be started before the zfs-share service. This commit also stops the zfs-import service from writing temp files to /var/tmp on shutdown and it corrects the return code for the OpenRC service. Other OpenRC-specific changes noted in issue #3513 were reiterated in issue #3715 and committed in `da619f3`. Signed-off-by: James Lee <jlee@thestaticvoid.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3513	2015-09-02 09:16:39 -07:00
Richard Yao	da619f3a19	Some OpenRC dependency logic belongs in mount The dependencies for handling / on ZFS belong in the mount script, not the zed script. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3715	2015-08-30 10:06:59 -07:00
Turbo Fredriksson	48511ea645	Fix some minor issues with the SYSV init and initramfs scripts. These are some minor fixes to commits `2cac7f5f11` and `2a34db1bdb`. * Make sure to alien'ate the new initramfs rpm package as well! The rpm package is built correctly, but alien isn't run on it to create the deb. * Before copying file from COPY_FILE_LIST, make sure the DESTDIR/dir exists. * Include /lib/udev/vdev_id file in the initrd. * Because the initrd needs to use '/sbin/modprobe' instead of 'modprobe', we need to use this in load_module() as well. * Make sure that load_module() can be used more globally, instead of calling '/sbin/modprobe' all over the place. * Make sure that check_module_loaded() has a parameter - module to check. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3626	2015-07-24 15:05:33 -07:00
Turbo Fredriksson	47a4a6fd5f	Support parallel build trees (VPATH builds) Build products from an out of tree build should be written relative to the build directory. Sources should be referred to by their locations in the source directory. This is accomplished by adding the 'src' and 'obj' variables for the module Makefile.am, using relative paths to reference source files, and by setting VPATH when source files are not co-located with the Makefile. This enables the following: $ mkdir build $ cd build $ ../configure \ --with-spl=$HOME/src/git/spl/ \ --with-spl-obj=$HOME/src/git/spl/build $ make -s This change also has the advantage of resolving the following warning which is generated by modern versions of automake. Makefile.am:00: warning: source file 'xxx' is in a subdirectory, Makefile.am:00: but option 'subdir-objects' is disabled Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1082	2015-07-17 13:42:51 -07:00
Turbo Fredriksson	d6c9ff0a6b	Add /dev/mapper to the list of possible sources for pool devices. This is especially needed when using LUKS backed pools. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3536	2015-06-29 12:32:05 -07:00
Turbo Fredriksson	16421a1dc8	Additional SYSV init script fixes (3). * In read_mtab(), fix problems (!?) in the mounts file. It will record 'rpool 1' as 'rpool\0401' instead of 'rpool\00401' which seems to be the correct form (at least as far as 'printf' is concerned). Handle this using the external 'echo' command (and not the one built in to the shell) because the internal one would interpret the backslash code (incorrectly), giving us the wrong character instead. * Remove reregister_mounts() - no longer needed. * For Gentoo, the zfs_log_failure_msg() should use eend(), not eerror() (which requires an error message, which we don't have). Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3488 Closes #3509 Closes #3514	2015-06-25 11:56:47 -07:00
Turbo Fredriksson	216f9d04a6	Revert "Additional SYSV init script fixes." This reverts commit `036391c980`. Because #3509 came just after this commit was accepted and is related to the original problem the commit was supposed to fix, we need to solve the problem in another way. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-06-25 11:56:09 -07:00
Turbo Fredriksson	036391c980	Additional SYSV init script fixes. Use the 'mount' command instead of /proc/mounts to get a list of matching filesystems. This is because /proc/mounts reports a pool with a space 'rpool 1' as 'rpool\0401'. The space is encoded as 3-digit octal which is legal. However 'printf "%b"', which we use to filter out other illegal characters (such as slash, space etc) can't properly interpret this because it expects 4-digit octal. We get a different character instead of the space we expected. The correct value should have been 'rpool\00401' (note the additional leading zero). So use 'mount', which interprets all backslash-escapes correctly, instead. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3488	2015-06-17 13:30:03 -07:00
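The escaping problem discussed in commit 036391c980 can be illustrated with a small Python sketch that decodes the 3-digit octal escapes found in /proc/mounts fields. This is not how the init scripts solve it (they are shell scripts); the function name `decode_mounts_field` is hypothetical.

```python
import re

# /proc/mounts writes space, tab, newline and backslash as a backslash
# followed by exactly three octal digits (for example \040 for a space).
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def decode_mounts_field(field):
    """Replace every 3-digit octal escape with the character it encodes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)


# The pool name from the commit message: 'rpool\0401' is 'rpool',
# an escaped space, and then the literal digit '1'.
print(decode_mounts_field(r"rpool\0401"))
```

Consuming exactly three digits per escape is what keeps the trailing `1` separate from the `\040`, which is the step a 4-digit interpretation gets wrong.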
Turbo Fredriksson	4f38c25910	SYSV init script fixes. * Change the order of the function library check/load. Redhat based systems _can_ have a /lib/lsb/init-functions file (from the redhat-lsb-core package), but it's only partially what we can use. Instead, look for that file last, giving the script a chance to catch the 'real' distribution file. * Filter out dashes and dots in dataset name in read_mtab(). * Get rid of 'awk' entirely. This is usually in /usr, which might not be available. * Get rid of the 'find /dev/disk/by-' (find is on /usr, which might not be available). Instead use echo in a for loop. Rebuild scripts if any of the .in files changed. Move the sed part that filters out duplicates inside the check for valid variable. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3463 Closes #3457	2015-06-05 12:35:39 -07:00
Turbo Fredriksson	2a34db1bdb	Base init scripts for SYSV systems * Based on the init scripts included with Debian GNU/Linux, then take code from the already existing ones, trying to merge them into one set of scripts that will work for 'everyone' for better maintainability. * Add configurable variables to control the workings of the init scripts: * ZFS_INITRD_PRE_MOUNTROOT_SLEEP Set a sleep time before we load the module (used primarily by initrd scripts to allow for slower media (such as USB devices etc) to be availible before we load the zfs module). * ZFS_INITRD_POST_MODPROBE_SLEEP Set a timed sleep in the initrd to after the load of the zfs module. * ZFS_INITRD_ADDITIONAL_DATASETS To allow for mounting additional datasets in the initrd. Primarily used in initrd scripts to allow for when filesystem needed to boot (such as /usr, /opt, /var etc) isn't directly under the root dataset. * ZFS_POOL_EXCEPTIONS Exclude pools from being imported (in the initrd and/or init scripts). * ZFS_DKMS_ENABLE_DEBUG, ZFS_DKMS_ENABLE_DEBUG_DMU_TX, ZFS_DKMS_DISABLE_STRIP Set to control how dkms should build the dkms packages. * ZPOOL_IMPORT_PATH Set path(s) where "zpool import" should import pools from. This was previously the job of "USE_DISK_BY_ID" (which is still used for backwards compatibility) but was renamed to allow for better control of import path(s). * If old USE_DISK_BY_ID is set, but not new ZPOOL_IMPORT_PATH, then we set ZPOOL_IMPORT_PATH to sane defaults just to be on the safe side. * ZED_ARGS To allow for local options to zed without having to change the init script. * The import function, do_import(), imports pools by name instead of '-a' for better control of pools to import and from where. * If USE_DISK_BY_ID is set (for backwards compatibility), but isn't 'yes' then ignore it. * If pool(s) isn't found with a simple "zpool import" (seen it happen), try looking for them in /dev/disk/by-id (if it exists). 
Any duplicates (pools found with both commands) is filtered out. * IF we have found extra pool(s) this way, we must force USE_DISK_BY_ID so that the first, simple "zpool import $pool" is able to find it. * Fallback on importing the pool using the cache file (if it exists) only if 'simple' import (either with ZPOOL_IMPORT_PATH or the 'built in' defaults) didn't work. * The export function, do_export(), will export all pools imported, EXCEPT the root pool (if there is one). * ZED script from the Debian GNU/Linux packages added. * Refreshed ZED init script from behlendorf@5e7a660 to be portable so it may be used on both LSB and Redhat style systems. * If there is no pool(s) imported and zed successfully shut down, we will unload the zfs modules. * The function library file for the ZoL init script is installed as /etc/init.d/zfs-functions. * The four init scripts, the /etc/{defaults,sysconfig,conf.d}/zfs config file as well as the common function library is tagged as '%config(noreplace)' in the rpm rules file to make sure they are not replaced automatically if locally modifed. * Pitfals and workarounds: * If we're running from init, remove stale /etc/dfs/sharetab before importing pools in the zfs-import init script. * On Debian GNU/Linux, there's a 'sendsigs' script that will kill basically everything quite early in the shutdown phase and zed is/should be stopped much later than that. We don't want zed to be among the ones killed, so add the zed pid to list of pids for 'sendsigs' to ignore. * CentOS uses echo_success() and echo_failure() to print out status of command. These in turn uses "echo -n \0xx[etc]" to move cursor and choose colour etc. This doesn't work with the modified IFS variable we need to use in zfs-import for some reason, so work around that when we define zfs_log_{end,failure}_msg() for RedHat and derivative distributions. * All scripts passes ShellCheck (with one false positive in do_mount()). 
Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Richard Yao <ryao@gentoo.org> Reviewed by: Chris Dunlap <cdunlap@llnl.gov> Closes #2974 Closes #2107	2015-05-28 14:14:53 -07:00
DHE	9012354bf0	Rebuild init scripts on source file updates The resulting script is not removed by 'make clean' or rebuilt when the source files are changed. Users with long standing git trees may find their init script is out of date. Signed-off-by: DHE <git@dehacked.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3273	2015-04-14 13:26:49 -07:00
Hajo Möller	6184b3a6a0	Actually source /etc/sysconfig/zfs instead of /etc/default/zfs Signed-off-by: Hajo Möller <dasjoe@users.noreply.github.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3162	2015-03-09 17:13:04 -07:00
Chris Dunlap	0e86d309cc	Add ZED to zfs.redhat.in script This commit updates the zfs.redhat.in script to start/stop ZED. Signed-off-by: Chris Dunlap <cdunlap@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #3153	2015-03-05 14:07:04 -08:00
Brian Behlendorf	a7b9d0c3a0	Replace zfs.redhat.in with zfs.lsb.in init script This commit replaces the zfs.redhat.in init script with a slightly modified version of the existing zfs.lsb.in init script. This was done to minimize the functional differences between platforms. The lsb version of the script was chosen because it's heavily tested and provides the most functionality. Changes made for RHEL systems: * Configuration: /etc/default/zfs -> /etc/sysconfig/zfs * LSB functions: /lib/lsb/init-functions -> /etc/rc.d/init.d/functions * Logging: log_begin_msg/log_end_msg -> action Features in LSB which are now in RHEL: * USE_DISK_BY_ID=0 - Use the by-id names * VERBOSE_MOUNT=0 - Verbose mounts by default * DO_OVERLAY_MOUNTS=0 - Overlay mounts by default * MOUNT_EXTRA_OPTIONS=0 - Generic extra options Existing RHEL features which were removed: * Automatically mounting FSs on ZVOLs listed in /etc/fstab Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #3153	2015-03-04 11:33:07 -08:00
Derek Dai	7a870db1b9	Do not export pool to prevent cache from been removed Signed-off-by: Derek Dai <daiderek@gmail.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2353	2014-06-05 13:49:15 -07:00
Brian Behlendorf	51268f31a8	Remove SELinux enforcing check from init scripts The default SELinux policy for RHEL and Fedora has been updated to include ZFS in the list of filesystems which support xattrs. Therefore, there's no longer a need to detect this in the init scripts. References: https://bugzilla.redhat.com/show_bug.cgi?id=811532 https://bugzilla.redhat.com/show_bug.cgi?id=816543 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2166	2014-05-02 11:37:46 -07:00
Turbo Fredriksson	b79e1f1f27	Allow specifying '-o <opts>' in defaults/init script. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2103	2014-04-04 09:49:09 -07:00
Turbo Fredriksson	e37212f9a2	Support using overlay mounts in defaults/init script. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2103	2014-04-04 09:48:25 -07:00
Richard Yao	b42b812efb	Inform OpenRC that ZFS uses mtab p_l in #zfsonlinux reported that he had issues mounting filesystems that were resolved by adding rc_need="mtab" to /etc/init.d/zfs. Closer inspection revealed that we do have a race, but it is not clear how this race caused mounting to fail. What is clear is that this race should be fixed, so lets add the proper `use mtab` line to handle it. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2148	2014-03-04 11:54:44 -08:00
Turbo Fredriksson	8c091798f2	Add UNSHARING of filesystems and EXPORTING pools As a 'stop' action ensure the filesystem is unshared before it is unmounted, just in case. Additionally, export the pool so it may be cleanly imported by a different host. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2003	2014-01-07 09:48:04 -08:00
Turbo Fredriksson	c1ab64d393	Update init script to allow verbose mounts Allow verbose mounts to make it easier to monitor progress when mounting a large number of filesystems. This functionality is disabled by default. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1929	2013-12-06 10:59:35 -08:00
Turbo Fredriksson	fc220e9ea5	Update init script to allow /dev/disk/by-id import Many people prefer to use by-id at import time instead of using the cache file. This can be a much better solution than the cache file in some environments so we're adding some infrastructure to allow it. This functionality is disabled by default. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1929	2013-12-06 10:59:09 -08:00
Matthew Thode	760ec997df	Updating init scripts to have more robust grepping The previous pattern could accidentally match on things like 'real_root=ZFS=node02-zp00/ROOT/rootfs' due to the 'ZFS=no' substring. Signed-off-by: Matthew Thode <mthode@mthode.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1837	2013-11-08 10:55:20 -08:00
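The false match described in commit 760ec997df is easy to reproduce. The Python sketch below contrasts a plain substring test with a whole-token comparison; the function names and the choice to split the kernel command line on whitespace are assumptions for illustration, not the pattern the commit actually adopted.

```python
def zfs_disabled_substring(cmdline):
    """Naive check: true whenever 'ZFS=no' appears anywhere in the line."""
    return "ZFS=no" in cmdline


def zfs_disabled_token(cmdline):
    """Stricter check: true only when 'ZFS=no' is a complete parameter."""
    return any(token == "ZFS=no" for token in cmdline.split())


# The example from the commit message trips the naive check only.
cmdline = "real_root=ZFS=node02-zp00/ROOT/rootfs"
print(zfs_disabled_substring(cmdline), zfs_disabled_token(cmdline))
```

Comparing whole tokens avoids the accidental hit on `ZFS=node02...` while still recognising a standalone `ZFS=no` parameter.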
Richard Yao	9eaf0832ad	Improve OpenRC init script The current zfs OpenRC script's dependencies cause OpenRC to attempt to unmount ZFS filesystems at shutdown while things were still using them, which would fail. This is a cosmetic issue, but it should still be addressed. It probably does not affect systems where the rootfs is a legacy filesystem, but any system with the rootfs on ZFS needs to run the ZFS init script after the system is ready to shut down filesystems. OpenRC's shutdown process occurs in the reverse order of the startup process. Therefore running the ZFS shutdown procedure after filesystems are ready to be unmounted requires running the startup procedure before fstab. This patch changes the dependencies of the script to explicitly run before fstab at boot when the rootfs is ZFS and to run after fstab at boot whenever the rootfs is not ZFS. This should cover most use cases. The only cases not covered well by this are systems with legacy root filesystems where people want to configure fstab to mount a non-ZFS filesystem off a zvol and possibly also systems whose pools are stored on network block devices. The former requires that the ZFS script run before fstab, which could cause ZFS datasets to mount too early and appear under the fstab mount points. The latter requires that the ZFS script run after networking starts, which precludes the ability to store any system information on ZFS. An additional OpenRC script could be written to handle non-root pools on network block devices, but that will depend on user demand and developer time. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1479	2013-06-18 17:03:25 -07:00
Turbo Fredriksson	382c4e5184	Possibility to disable (not start) zfs at bootup. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1402	2013-04-24 16:18:44 -07:00
Brian Behlendorf	0da31cd6ca	Remove ARCH packaging The kernel modules are now available in the Arch User Repository (AUR) via zfs. Since their packaging is maintained and superior to ours it is being removed from the tree. https://wiki.archlinux.org/index.php/ZFS Now that various distributions are picking up the packages we should eventually be able to remove most of this infrastructure. Packaging belongs with the distributions not upstream. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Brian Behlendorf	ca8b5af89d	Remove autotools products Remove all of the generated autotools products from the repository and update the .gitignore files accordingly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #718	2012-08-27 11:47:44 -07:00
Etienne Dechamps	ee5fd0bb80	Set zvol discard_granularity to the volblocksize. Currently, zvols have a discard granularity set to 0, which suggests to the upper layer that discard requests of arbitrarily small size and alignment can be made efficiently. In practice however, ZFS does not handle unaligned discard requests efficiently: indeed, it is unable to free a part of a block. It will write zeros to the specified range instead, which is both useless and inefficient (see dnode_free_range). With this patch, zvol block devices expose volblocksize as their discard granularity, so the upper layer is aware that it's not supposed to send discard requests smaller than volblocksize. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #862	2012-08-07 14:55:31 -07:00
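What a discard granularity implies for callers can be sketched in Python: only whole blocks inside a requested byte range can actually be freed. This is an illustration of the arithmetic, not kernel or ZFS code, and the function `aligned_discard_range` is hypothetical.

```python
def aligned_discard_range(offset, length, volblocksize):
    """Return (start, length) of the whole blocks inside a byte range.

    The start is rounded up and the end rounded down to a multiple of
    volblocksize. Returns None when the range covers no complete block.
    """
    start = -(-offset // volblocksize) * volblocksize        # round up
    end = ((offset + length) // volblocksize) * volblocksize  # round down
    if end <= start:
        return None
    return start, end - start


# A 16 KiB discard starting at 4 KiB on an 8 KiB-block volume covers
# exactly one complete block.
print(aligned_discard_range(4096, 16384, 8192))
```

A request smaller than one block, such as 4 KiB at offset 0 with an 8 KiB block size, yields `None`, which is the case the commit tells upper layers to avoid.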
Richard Yao	739a1a82e0	Linux 3.5 compat, end_writeback() changed to clear_inode() The end_writeback() function was changed by moving the call to inode_sync_wait() earlier into evict(). This effectively changes the ordering of the sync but it does not impact the details of the zfs implementation. However, as part of this change end_writeback() was renamed to clear_inode() to reflect the new semantics. This change does impact us and clear_inode() now maps to end_writeback() for kernels prior to 3.5. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #784	2012-07-23 12:29:36 -07:00
Richard Yao	ea1fdf46e2	Linux 3.5 compat, iops->truncate_range() removed The vmtruncate_range() support has been removed from the kernel in favor of using the fallocate method in the file_operations table. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:32 -07:00
Richard Yao	756c3e5a9c	Linux 3.5 compat, eops->encode_fh() takes inodes The export_operations member ->encode_fh() has been updated to take both the child and parent inodes. This interface used to take the child dentry and a bool describing if the parent is needed. NOTE: While updating this code I noticed that we do not currently cleanly handle the case where we're passed a connectable parent. This code should be audited to make sure we're doing the right thing. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:23 -07:00
Etienne Dechamps	b5a28807cd	Move partition scanning from userspace to module. Currently, zpool online -e (dynamic vdev expansion) doesn't work on whole disks because we're invoking ioctl(BLKRRPART) from userspace while ZFS still has a partition open on the disk, which results in EBUSY. This patch moves the BLKRRPART invocation from the zpool utility to the module. Specifically, this is done just before opening the device in vdev_disk_open() which is called inside vdev_reopen(). This requires jumping through some hoops to get to the disk device from the partition device, and to make sure we can still open the partition after the BLKRRPART call. Note that this new code path is triggered on dynamic vdev expansion only; other actions, like creating a new pool, are unchanged and still call BLKRRPART from userspace. This change also depends on API changes which are available in 2.6.37 and later kernels. The build system has been updated to detect this, but there is no compatibility mode for older kernels. This means that online expansion will NOT be available in older kernels. However, it will still be possible to expand the vdev offline. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #808	2012-07-17 09:17:31 -07:00
Richard Yao	ba9b5428fd	Relicense zfs.gentoo.in from GPLv2 to 2-clause BSD As the Gentoo sys-fs/zfs maintainer, I receive license compatibility questions and at times, those questions can be harassing. I feel that the presence of the GPL in Gentoo's package metadata promotes such questions. zfs.gentoo.in is the only GPLv2 licensed file in ZFS, so I have taken the liberty of contacting all contributors to this file to request permission to relicense it. All of the contributors to this file have agreed to relicense it under the 2-clause BSD license. I have added their Signed-offs to this commit, in order of first contribution. Thank you everyone for being so understanding. Signed-off-by: devsk <devsku@gmail.com> Signed-off-by: Alexey Shvetsov <alexxy@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrew Tselischev <andrewtselischev@gmail.com> Signed-off-by: Zachary Bedell <zac@thebedells.org> Signed-off-by: Gunnar Beutner <gunnar@beutner.name> Signed-off-by: Kyle Fuller <inbox@kylefuller.co.uk> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Closes #819	2012-07-10 15:00:16 -07:00
Richard Yao	6a0936babc	Linux 3.4 compat, d_make_root() replaces d_alloc_root() torvalds/linux@adc0e91ab1 introduced d_make_root() as a replacement for d_alloc_root(). Further commits appear to have removed d_alloc_root() from the Linux source tree. This causes the following failure: error: implicit declaration of function 'd_alloc_root' [-Werror=implicit-function-declaration] To correct this we update the code to use the current d_make_root() interface for readability. Then we introduce an autotools check to determine if d_make_root() is available. If it isn't then we define some compatibility logic which uses the older d_alloc_root() interface. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #776	2012-06-11 10:04:49 -07:00
Brian Behlendorf	b39d3b9f7b	Linux 3.3 compat, iops->create()/mkdir()/mknod() The mode argument of iops->create()/mkdir()/mknod() was changed from an 'int' to a 'umode_t'. To prevent a compiler warning an autoconf check was added to detect the API change and then correctly set a zpl_umode_t typedef. There is no functional change. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #701	2012-04-30 12:52:38 -07:00
Richard Yao	2ce9d0ec61	Make Gentoo initscript use modinfo The -l parameter to modprobe has been removed from the latest upstream code and this change has entered Gentoo. Using modinfo as a substitute addresses this. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #636	2012-04-03 10:37:18 -07:00
Brian Behlendorf	1c5de20ae2	Add --enable-debug-dmu-tx configure option Allow rigorous (and expensive) tx validation to be enabled/disabled independently from the standard zfs debugging. When enabled these checks ensure that all txs are constructed properly and that a dbuf is never dirtied without taking the correct tx hold. This checking is particularly helpful when adding new dmu consumers like Lustre. However, for established consumers such as the zpl with no known outstanding tx construction problems this is just overhead. --enable-debug-dmu-tx / --disable-debug-dmu-tx - Enable/disable validation of each tx as it is constructed. By default validation is disabled due to performance concerns. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-23 12:25:17 -07:00
Brian Behlendorf	ebe7e575ea	Add .zfs control directory Add support for the .zfs control directory. This was accomplished by leveraging as much of the existing ZFS infrastructure as possible and updating it for Linux as required. The bulk of the core functionality is now all there with the following limitations. *) The .zfs/snapshot directory automount support requires a 2.6.37 or newer kernel. The exception is RHEL6.2 which has backported the d_automount patches. *) Creating/destroying/renaming snapshots with mkdir/rmdir/mv in the .zfs/snapshot directory works as expected. However, this functionality is only available to root until zfs delegations are finished. * mkdir - create a snapshot * rmdir - destroy a snapshot * mv - rename a snapshot The following issues are known deficiencies, but we expect them to be addressed by future commits. *) Add automount support for kernels older than 2.6.37. This should be possible using follow_link() which is what Linux did before. *) Accessing the .zfs/snapshot directory via NFS is not yet possible. The majority of the ground work for this is complete. However, finishing this work will require resolving some lingering integration issues with the Linux NFS kernel server. *) The .zfs/shares directory exists but no further smb functionality has yet been implemented. Contributions-by: Rohan Puri <rohan.puri15@gmail.com> Contributions-by: Andrew Barnes <barnes333@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #173	2012-03-22 13:03:47 -07:00
Brian Behlendorf	4b787d75c8	Cleanly support debug packages Allow a source rpm to be rebuilt with debugging enabled. This avoids the need to have to manually modify the spec file. By default debugging is still largely disabled. To enable specific debugging features use the following options with rpmbuild. '--with debug' - Enables ASSERTs # For example: $ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm Additionally, ZFS_CONFIG has been added to zfs_config.h for packages which build against these headers. This is critical to ensure both zfs and the dependent package are using the same prototype and structure definitions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 14:08:17 -08:00
Etienne Dechamps	30930fba21	Add support for DISCARD to ZVOLs. DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning. It allows ZVOL clients to discard (unmap, trim) block ranges from a ZVOL, thus optimizing disk space usage by allowing a ZVOL to shrink instead of just grow. We can't use zfs_space() or zfs_freesp() here, since these functions only work on regular files, not volumes. Fortunately we can use the low-level function dmu_free_long_range() which does exactly what we want. Currently the discard operation is not added to the log. That's not a big deal since losing discard requests cannot result in data corruption. It would however result in disk space usage higher than it should be. Thus adding log support to zvol_discard() is probably a good idea for a future improvement. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-09 16:19:38 -08:00
Etienne Dechamps	cb2d19010d	Support the fallocate() file operation. Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is supported, since it's the only one that matches the behavior of zfs_space(). This makes it pretty much useless in its current form, but it's a start. To support other flag combinations we would need to modify zfs_space() to make it more flexible, or emulate the desired functionality in zpl_fallocate(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #334	2012-02-09 16:19:32 -08:00
Etienne Dechamps	34037afe24	Improve ZVOL queue behavior. The Linux block device queue subsystem exposes a number of configurable settings described in Linux block/blk-settings.c. The defaults for these settings are tuned for hard drives, and are not optimized for ZVOLs. Proper configuration of these options would allow upper layers (I/O scheduler) to make better decisions about write merging and ordering. Detailed rationale: - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to handle writes of any size, so there's no reason to impose a limit. Let the upper layer decide. - max_segments and max_segment_size are set to unlimited. zvol_write() will copy the requests' contents into a dbuf anyway, so the number and size of the segments are irrelevant. Let the upper layer decide. - physical_block_size and io_opt are set to the ZVOL's block size. This has the potential to somewhat alleviate issue #361 for ZVOLs, by warning the upper layers that writes smaller than the volume's block size will be slow. - The NONROT flag is set to indicate this isn't a rotational device. Although the backing zpool might be composed of rotational devices, the resulting ZVOL often doesn't exhibit the same behavior due to the COW mechanisms used by ZFS. Setting this flag will prevent upper layers from making useless decisions (such as reordering writes) based on incorrect assumptions about the behavior of the ZVOL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	b18019d2d8	Fix synchronicity for ZVOLs. zvol_write() assumes that the write request must be written to stable storage if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed, "sync" does not mean what we think it means in the context of the Linux block layer. This is well explained in linux/fs.h: WRITE: A normal async write. Device will be plugged. WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down the hint that someone will be waiting on this IO shortly. WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush. WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on non-volatile media on completion. In other words, SYNC does not mean that the write must be on stable storage on completion. It just means that someone is waiting on us to complete the write request. Thus triggering a ZIL commit for each SYNC write request on a ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL users have no way to express that they actually want data to be written to stable storage, which means the ZIL is broken for ZVOLs. The request for stable storage is expressed by the FUA flag, so we must commit the ZIL after the write if the FUA flag is set. In addition, we must commit the ZIL before the write if the FLUSH flag is set. Also, we must inform the block layer that we actually support FLUSH and FUA. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Brian Behlendorf	47621f3d76	Linux 3.3 compat, sops->show_options() The second argument of sops->show_options() was changed from a 'struct vfsmount *' to a 'struct dentry *'. Add an autoconf check to detect the API change and then conditionally define the expected interface. In either case we are only interested in the zfs_sb_t. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #549	2012-02-03 10:02:01 -08:00
Brian Behlendorf	ab26409db7	Linux 3.1 compat, super_block->s_shrink The Linux 3.1 kernel has introduced the concept of per-filesystem shrinkers which are directly associated with a super block. Prior to this change there was one shared global shrinker. The zfs code relied on being able to call the global shrinker when the arc_meta_limit was exceeded. This would cause the VFS to drop references on a fraction of the dentries in the dcache. The ARC could then safely reclaim the memory used by these entries and honor the arc_meta_limit. Unfortunately, when per-filesystem shrinkers were added the old interfaces were made unavailable. This change adds support to use the new per-filesystem shrinker interface so we can continue to honor the arc_meta_limit. The major benefit of the new interface is that we can now target only the zfs filesystem for dentry and inode pruning. Thus we can minimize any impact on the caching of other filesystems. In the context of making this change several other important issues related to managing the ARC were addressed. They include: * The dnlc_reduce_cache() function which was called by the ARC to drop dentries for the Posix layer was replaced with a generic zfs_prune_t callback. The ZPL layer now registers a callback to drop these dentries, removing a layering violation which dates back to the Solaris code. This callback can also be used by other ARC consumers such as Lustre. arc_add_prune_callback() arc_remove_prune_callback() * The arc_reduce_dnlc_percent module option has been changed to arc_meta_prune for clarity. The dnlc functions are specific to Solaris's VFS and have already been largely eliminated. The replacement tunable now represents the number of bytes the prune callback will request when invoked. * Less aggressively invoke the prune callback. We used to call this whenever we exceeded the arc_meta_limit; however, that's not strictly correct since it results in overzealous reclaim of dentries and inodes. It is now only called once the arc_meta_limit is exceeded and every effort has been made to evict other data from the ARC cache. * More promptly manage exceeding the arc_meta_limit. When reading meta data into the cache, if a buffer could not be recycled, notify the arc_reclaim thread to invoke the required prune. * Added arcstat_prune kstat which is incremented when the ARC is forced to request that a consumer prune its cache. Remember this will only occur when the ARC has no other choice. If it can evict buffers safely without invoking the prune callback it will. * This change is also expected to resolve the unexpected collapses of the ARC cache. This would occur because, when just the arc_meta_limit was exceeded, reclaim pressure would be exerted on the arc_c value via arc_shrink(). This effectively shrunk the entire cache when really we just needed to reclaim meta data. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #466 Closes #292	2012-01-11 11:46:02 -08:00

85 Commits