Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Richard Yao	7d6e2adb4e	Remove blk_rq_bytes()/blk_rq_sectors autotools checks Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	f952eaa7ec	Remove blk_rq_pos() autotools check Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	f8c56b405d	Remove blk_fetch_request() autotools check Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	e8c6be131c	Remove blk_requeue_request() autotools check Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	dd6f9fe61b	Remove blk_end_request() autotools check. Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	65f340e725	Remove rq_is_sync() autotools check Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	9ddf9b8e15	Remove rq_for_each_segment() autotools check Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00
Richard Yao	37f9dac592	zvol processing should use struct bio Internally, zvols are files exposed through the block device API. This is intended to reduce overhead when things require block devices. However, the ZoL zvol code emulates a traditional block device in that it has a top half and a bottom half. This is an unnecessary source of overhead that does not exist on any other OpenZFS platform does this. This patch removes it. Early users of this patch reported double digit performance gains in IOPS on zvols in the range of 50% to 80%. Comments in the code suggest that the current implementation was done to obtain IO merging from Linux's IO elevator. However, the DMU already does write merging while arc_read() should implicitly merge read IOs because only 1 thread is permitted to fetch the buffer into ARC. In addition, commercial ZFSOnLinux distributions report that regular files are more performant than zvols under the current implementation, and the main consumers of zvols are VMs and iSCSI targets, which have their own elevators to merge IOs. Some minor refactoring allows us to register zfs_request() as our ->make_request() handler in place of the generic_make_request() function. This eliminates the layer of code that broke IO requests on zvols into a top half and a bottom half. This has several benefits: 1. No per zvol spinlocks. 2. No redundant IO elevator processing. 3. Interrupts are disabled only when actually necessary. 4. No redispatching of IOs when all taskq threads are busy. 5. Linux's page out routines will properly block. 6. Many autotools checks become obsolete. An unfortunate consequence of eliminating the layer that generic_make_request() is that we no longer calls the instrumentation hooks for block IO accounting. Those hooks are GPL-exported, so we cannot call them ourselves and consequently, we lose the ability to do IO monitoring via iostat. Since zvols are internally files mapped as block devices, this should be okay. Anyone who is willing to accept the performance penalty for the block IO layer's accounting could use the loop device in between the zvol and its consumer. Alternatively, perf and ftrace likely could be used. Also, tools like latencytop will still work. Tools such as latencytop sometimes provide a better view of performance bottlenecks than the traditional block IO accounting tools do. Lastly, if direct reclaim occurs during spacemap loading and swap is on a zvol, this code will deadlock. That deadlock could already occur with sync=always on zvols. Given that swap on zvols is not yet production ready, this is not a blocker. Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:30:24 -04:00
Richard Yao	97771edaca	Remove blk_queue_io_opt() autotools check This is needed for supporting kernels earlier than 2.6.30. Support for those kernels was dropped, so we can safely remove this check. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-09-01 09:33:18 -07:00
Richard Yao	3c119330a6	Remove blk_queue_physical_block_size() autotools check This is needed for supporting kernels earlier than 2.6.30. Support for those kernels was dropped, so we can safely remove this check. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-09-01 09:33:18 -07:00
Brian Behlendorf	278bee9319	Linux 3.18 compat: Snapshot auto-mounting Re-factor the .zfs/snapshot auto-mouting code to take in to account changes made to the upstream kernels. And to lay the groundwork for enabling access to .zfs snapshots via NFS clients. This patch makes the following core improvements. * All actively auto-mounted snapshots are now tracked in two global trees which are indexed by snapshot name and objset id respectively. This allows for fast lookups of any auto-mounted snapshot regardless without needing access to the parent dataset. * Snapshot entries are added to the tree in zfsctl_snapshot_mount(). However, they are now removed from the tree in the context of the unmount process. This eliminates the need complicated error logic in zfsctl_snapshot_unmount() to handle unmount failures. * References are now taken on the snapshot entries in the tree to ensure they always remain valid while a task is outstanding. * The MNT_SHRINKABLE flag is set on the snapshot vfsmount_t right after the auto-mount succeeds. This allows to kernel to unmount idle auto-mounted snapshots if needed removing the need for the zfsctl_unmount_snapshots() function. * Snapshots in active use will not be automatically unmounted. As long as at least one dentry is revalidated every zfs_expire_snapshot/2 seconds the auto-unmount expiration timer will be extended. * Commit torvalds/linux@bafc9b7 caused snapshots auto-mounted by ZFS to be immediately unmounted when the dentry was revalidated. This was a consequence of ZFS invaliding all snapdir dentries to ensure that negative dentries didn't mask new snapshots. This patch modifies the behavior such that only negative dentries are invalidated. This solves the issue and may result in a performance improvement. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3589 Closes #3344 Closes #3295 Closes #3257 Closes #3243 Closes #3030 Closes #2841	2015-08-31 13:54:39 -07:00
Chunwei Chen	17888ae30d	Add compatibility layer for {kmap,kunmap}_atomic Starting from linux-2.6.37, {kmap,kunmap}_atomic takes 1 argument instead of 2. We use zfs_{kmap,kunmap}_atomic as wrappers and always take 2 argument, but ignore the 2nd for newer kernel. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-08-24 10:13:25 -07:00
Chris Dunlop	302f31ffc7	Linux 4.1 compat: configure bdi_setup_and_register() Pull struct backing_dev_info off the stack: by linux-4.1 it's grown past our 1024 byte stack frame warning limit resulting in an incorrect configure result. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Closes #3671	2015-08-18 16:43:04 -07:00
Turbo Fredriksson	48511ea645	Fix some minor issues with the SYSV init and initramfs scripts. This is some minor fixes to commits `2cac7f5f11` and `2a34db1bdb`. * Make sure to alien'ate the new initramfs rpm package as well! The rpm package is build correctly, but alien isn't run on it to create the deb. * Before copying file from COPY_FILE_LIST, make sure the DESTDIR/dir exists. * Include /lib/udev/vdev_id file in the initrd. * Because the initrd needs to use '/sbin/modprobe' instead of 'modprobe', we need to use this in load_module() as well. * Make sure that load_module() can be used more globaly, instead of calling '/sbin/modprobe' all over the place. * Make sure that check_module_loaded() have a parameter - module to check. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3626	2015-07-24 15:05:33 -07:00
Turbo Fredriksson	47a4a6fd5f	Support parallel build trees (VPATH builds) Build products from an out of tree build should be written relative to the build directory. Sources should be referred to by their locations in the source directory. This is accomplished by adding the 'src' and 'obj' variables for the module Makefile.am, using relative paths to reference source files, and by setting VPATH when source files are not co-located with the Makefile. This enables the following: $ mkdir build $ cd build $ ../configure \ --with-spl=$HOME/src/git/spl/ \ --with-spl-obj=$HOME/src/git/spl/build $ make -s This change also has the advantage of resolving the following warning which is generated by modern versions of automake. Makefile.am:00: warning: source file 'xxx' is in a subdirectory, Makefile.am:00: but option 'subdir-objects' is disabled Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1082	2015-07-17 13:42:51 -07:00
Brian Behlendorf	bd29109f1a	Linux 4.2 compat: follow_link() / put_link() As of Linux 4.2 the kernel has completely retired the nameidata structure. One of the few remaining consumers of this interface were the follow_link() and put_link() callbacks. This patch adds the required checks to configure to detect the interface change and updates the functions accordingly. Migrating to the simple_follow_link() interface was considered but was decided against ironically due to the increased complexity. It also should be noted that the kernel follow_link() and put_link() interfaces changes several times after 4.1 and but before 4.2. This means there is a narrow range of kernel commits which never appear in an official tag of the Linux kernel which ZoL will not build. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Issue #3596	2015-07-17 09:18:16 -07:00
Brian Behlendorf	c2d17fd891	Disable gcc bool-compare warning As of gcc version 5.1.1 a new boolean comparison warning has been introduced. This warning is harmless but is triggered several places in the ZFS code base. Because warnings are promoted to errors when building with debugging enabled it is necessary to disable the warning when using versions of gcc which automatically enabling this check. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-07-13 12:55:26 -07:00
Turbo Fredriksson	2cac7f5f11	Initramfs scripts for ZoL. * Supports booting of a ZFS snapshot. Do this by cloning the snapshot into a dataset. If this, the resulting dataset, already exists, destroy it. Then mount it on root. * If snapshot does not exist, use base dataset (the part before '@') as boot filesystem instead. * If no snapshot is specified on the 'root=' kernel command line, but there is an '@', then get a list of snapshots below that filesystem and ask the user which to use. * Clone with 'mountpoint=none' and 'canmount=noauto' - we mount manually and explicitly. * For sub-filesystems, that doesn't have a mountpoint property set, we use the 'org.zol:mountpoint' to keep track of it's mountpoint. * Allow rollback of snapshots instead of clone it and boot from the clone. * Allow mounting a root- and subfs with mountpoint=legacy set * Allow mounting a filesystem which is using nativ encryption. * Support all currently used kernel command line arguments All the different distributions have their own standard on what to specify on the kernel command line to boot of a ZFS filesystem. * Extra options: * zfsdebug=(on,yes,1) Show extra debugging information * zfsforce=(on,yes,1) Force import the pool * rollback=(on,yes,1) Rollback (instead of clone) the snapshot * Only try to import pool if it haven't already been imported * This will negate the need to force import a pool that have not been exported cleanly. * Support exclusion of pools to import by setting ZFS_POOL_EXCEPTIONS in /etc/default/zfs. * Support additional configuration variable ZFS_INITRD_ADDITIONAL_DATASETS to mount additional filesystems not located under your root dataset. * Include /etc/modprobe.d/{zfs,spl}.conf in the initrd if it/they exist. * Include the udev rule to use by-vdev for pool imports. * Include the /etc/default/zfs file to the initrd. * Only try /dev/disk/by-* in the initrd if USE_DISK_BY_ID is set. * Use /dev/disk/by-vdev before anything. * Add /dev as a last ditch attempt. * Fallback to using the cache file if that exist if nothing else worked. * Use /sbin/modprobe instead of built-in (BusyBox) modprobe. This gets rid of the message "modprobe: can't load module zcommon". Thanx to pcoultha for finding this. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2116 Closes #2114	2015-07-08 18:14:34 -07:00
Brian Behlendorf	218b4e0a76	Add zfs_sb_prune_aliases() function For kernels which do not implement a per-suberblock shrinker, those older than Linux 3.1, the shrink_dcache_parent() function was used to attempt to reclaim dentries. This was found not be entirely reliable and could lead to performance issues on older kernels running meta-data heavy workloads. To address this issue a zfs_sb_prune_aliases() function has been added to implement this functionality. It relies on traversing the list of znodes for a filesystem and adding them to a private list with a reference held. The private list can then be safely walked outside the z_znodes_lock to prune dentires and drop the last reference so the inode can be freed. This provides the same synchronous behavior as the per-filesystem shrinker and has the advantage of depending on only long standing interfaces. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #3501	2015-06-22 10:22:49 -07:00
Matus Kral	57ae840077	Linux 4.1 compat: use read_iter() / write_iter() Linux 3.15 commit torvalds/linux@293bc98 introduced two new methods. The ->read_iter() and ->write_iter() methods were designed to replace the ->aio_read() and ->aio_write() interfaces. Both interfaces were preserved for several kernel releases in order to migrate all existing consumers to the new interfaces. But as of Linux 4.1 the legacy interface has been retired and the ZFS code must be updated to use the new interfaces. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3352	2015-06-18 12:06:59 -07:00
Tim Chase	90947b2357	3.12 compat, NUMA-aware per-superblock shrinker Kernels >= 3.12 have a NUMA-aware superblock shrinker which is used in ZoL by zfs_sb_prune(). This patch calls the shrinker for each on-line NUMA node in order that memory be freed for each one. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3495	2015-06-17 10:43:13 -07:00
Turbo Fredriksson	2a34db1bdb	Base init scripts for SYSV systems * Based on the init scripts included with Debian GNU/Linux, then take code from the already existing ones, trying to merge them into one set of scripts that will work for 'everyone' for better maintainability. * Add configurable variables to control the workings of the init scripts: * ZFS_INITRD_PRE_MOUNTROOT_SLEEP Set a sleep time before we load the module (used primarily by initrd scripts to allow for slower media (such as USB devices etc) to be availible before we load the zfs module). * ZFS_INITRD_POST_MODPROBE_SLEEP Set a timed sleep in the initrd to after the load of the zfs module. * ZFS_INITRD_ADDITIONAL_DATASETS To allow for mounting additional datasets in the initrd. Primarily used in initrd scripts to allow for when filesystem needed to boot (such as /usr, /opt, /var etc) isn't directly under the root dataset. * ZFS_POOL_EXCEPTIONS Exclude pools from being imported (in the initrd and/or init scripts). * ZFS_DKMS_ENABLE_DEBUG, ZFS_DKMS_ENABLE_DEBUG_DMU_TX, ZFS_DKMS_DISABLE_STRIP Set to control how dkms should build the dkms packages. * ZPOOL_IMPORT_PATH Set path(s) where "zpool import" should import pools from. This was previously the job of "USE_DISK_BY_ID" (which is still used for backwards compatibility) but was renamed to allow for better control of import path(s). * If old USE_DISK_BY_ID is set, but not new ZPOOL_IMPORT_PATH, then we set ZPOOL_IMPORT_PATH to sane defaults just to be on the safe side. * ZED_ARGS To allow for local options to zed without having to change the init script. * The import function, do_import(), imports pools by name instead of '-a' for better control of pools to import and from where. * If USE_DISK_BY_ID is set (for backwards compatibility), but isn't 'yes' then ignore it. * If pool(s) isn't found with a simple "zpool import" (seen it happen), try looking for them in /dev/disk/by-id (if it exists). Any duplicates (pools found with both commands) is filtered out. * IF we have found extra pool(s) this way, we must force USE_DISK_BY_ID so that the first, simple "zpool import $pool" is able to find it. * Fallback on importing the pool using the cache file (if it exists) only if 'simple' import (either with ZPOOL_IMPORT_PATH or the 'built in' defaults) didn't work. * The export function, do_export(), will export all pools imported, EXCEPT the root pool (if there is one). * ZED script from the Debian GNU/Linux packages added. * Refreshed ZED init script from behlendorf@5e7a660 to be portable so it may be used on both LSB and Redhat style systems. * If there is no pool(s) imported and zed successfully shut down, we will unload the zfs modules. * The function library file for the ZoL init script is installed as /etc/init.d/zfs-functions. * The four init scripts, the /etc/{defaults,sysconfig,conf.d}/zfs config file as well as the common function library is tagged as '%config(noreplace)' in the rpm rules file to make sure they are not replaced automatically if locally modifed. * Pitfals and workarounds: * If we're running from init, remove stale /etc/dfs/sharetab before importing pools in the zfs-import init script. * On Debian GNU/Linux, there's a 'sendsigs' script that will kill basically everything quite early in the shutdown phase and zed is/should be stopped much later than that. We don't want zed to be among the ones killed, so add the zed pid to list of pids for 'sendsigs' to ignore. * CentOS uses echo_success() and echo_failure() to print out status of command. These in turn uses "echo -n \0xx[etc]" to move cursor and choose colour etc. This doesn't work with the modified IFS variable we need to use in zfs-import for some reason, so work around that when we define zfs_log_{end,failure}_msg() for RedHat and derivative distributions. * All scripts passes ShellCheck (with one false positive in do_mount()). Signed-off-by: Turbo Fredriksson turbo@bayour.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Richard Yao <ryao@gentoo.org> Reviewed by: Chris Dunlap <cdunlap@llnl.gov> Closes #2974 Closes #2107	2015-05-28 14:14:53 -07:00
Turbo Fredriksson	01fcbec52d	The mount helper mount.zfs MUST be in /sbin (not '$sbindir'). Commit `60e9f69` added the --with-mounthelperdir option for Gentoo and in the process accidentally modified the default installation location. For security reasons mount(8) expects it to only be installed under /sbin. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3426	2015-05-18 16:54:36 -07:00
Tim Chase	e48533383b	Linux 2.6.36 compat, use REQ_FAILFAST_MASK and remove pre-2.6.36 support Commit `f4af6bb783` which added support for REQ_FAILFAST_MASK but the new autoconf test didn't use the same preprocessor macro name as the code did. The effect is that FAILFAST mode has not been enabled for ZoL in any post-2.6.35 kernel. Retire the HAVE_BIO_RW_FAILFAST interface used in pre-2.6.28 kernels. Raise an error condition if the FAILFAST interface can't be detected. Signed-off-by: Tim Chase <tim@onlight.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3386	2015-05-11 15:07:00 -07:00
Brian Behlendorf	ee2ca1db28	Add RHEL style kmod packages Provide a Redhat specific zfs-kmod.spec file which uses the old style kmods (not kmods2) packaging. By using the provided kmodtool script packages can be built which support weak modules. This allows for the kernel to be updated without having to rebuild the ZFS kernel modules. Packages for RHEL/Centos/SL/TOSS which use this spec file can by built as follows: $ ./configure --with-spec=redhat $ make rpms Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:41:48 -07:00
Brian Behlendorf	d820d2e9cf	Remove rpm/fedora directory Originally it was thought that custom spec files might be required for Fedora. Happily that has turns out not to be the case. Since this directory just contains symlinks to the generic spec files it can be removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:30:58 -07:00
Bill McGonigle	e023409500	Linux 4.0 compat: bdi_setup_and_register() __must_check Explicitly disable the unused by variable warnings by setting __attribute__((unused)) for bdi_setup_and_register(). This is required because the function is defined with the __must_check attribute. Signed-off-by: Bill McGonigle <bill-github.com-public1@bfccomputing.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3141	2015-03-16 10:56:26 -07:00
Brian Behlendorf	8c45def24a	Linux 4.0 compat: bdi_setup_and_register() The 'capabilities' argument which was passed to bdi_setup_and_register() has been removed. File systems should no longer pass BDI_CAP_MAP_COPY. For our purposes this means there are now three different interfaces which must be handled. A zpl_bdi_setup_and_register() wrapper function has been introduced to provide a single interface to the ZPL code. * 2.6.32 - 2.6.33, bdi_setup_and_register() is not exported. * 2.6.34 - 3.19, bdi_setup_and_register() takes 3 arguments. * 4.0 - x.y, bdi_setup_and_register() takes 2 arguments. I've also taken this opportunity to remove HAVE_BDI because kernels older then 2.6.32 are no longer supported. All kernels newer than this will have one of the above interfaces. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #3128	2015-03-03 10:49:45 -08:00
Jörg Thalheim	534759fad3	Linux 3.19 compat: file_inode was added struct access f->f_dentry->d_inode was replaced by accessor function file_inode(f) Signed-off-by: Joerg Thalheim <joerg@higgsboson.tk> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3084	2015-02-10 11:24:51 -08:00
Ned Bass	4e30e68caf	Don't use AC_LANG_SOURCE for conftest.h source Using AC_LANG_SOURCE with some versions of autoconf is problematic if the given source is to be written to a header file. Such versions assume the contents are to be written to conftest.c and generate shell code to that effect. The contents of the test program to detect support for Linux tracepoints were consequently malformed (containing the source for conftest.h) so the build system incorrectly disabled tracepoints support. Fix this in ZFS_LINUX_TRY_COMPILE_HEADER by passing the header source directly to ZFS_LINUX_COMPILE_IFELSE. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2953	2015-01-06 16:53:30 -08:00
Prakash Surya	0b39b9f96f	Swap DTRACE_PROBE* with Linux tracepoints This patch leverages Linux tracepoints from within the ZFS on Linux code base. It also refactors the debug code to bring it back in sync with Illumos. The information exported via tracepoints can be used for a variety of reasons (e.g. debugging, tuning, general exploration/understanding, etc). It is advantageous to use Linux tracepoints as the mechanism to export this kind of information (as opposed to something else) for a number of reasons: * A number of external tools can make use of our tracepoints "automatically" (e.g. perf, systemtap) * Tracepoints are designed to be extremely cheap when disabled * It's one of the "accepted" ways to export this kind of information; many other kernel subsystems use tracepoints too. Unfortunately, though, there are a few caveats as well: * Linux tracepoints appear to only be available to GPL licensed modules due to the way certain kernel functions are exported. Thus, to actually make use of the tracepoints introduced by this patch, one might have to patch and re-compile the kernel; exporting the necessary functions to non-GPL modules. * Prior to upstream kernel version v3.14-rc6-30-g66cc69e, Linux tracepoints are not available for unsigned kernel modules (tracepoints will get disabled due to the module's 'F' taint). Thus, one either has to sign the zfs kernel module prior to loading it, or use a kernel versioned v3.14-rc6-30-g66cc69e or newer. Assuming the above two requirements are satisfied, lets look at an example of how this patch can be used and what information it exposes (all commands run as 'root'): # list all zfs tracepoints available $ ls /sys/kernel/debug/tracing/events/zfs enable filter zfs_arc__delete zfs_arc__evict zfs_arc__hit zfs_arc__miss zfs_l2arc__evict zfs_l2arc__hit zfs_l2arc__iodone zfs_l2arc__miss zfs_l2arc__read zfs_l2arc__write zfs_new_state__mfu zfs_new_state__mru # enable all zfs tracepoints, clear the tracepoint ring buffer $ echo 1 > /sys/kernel/debug/tracing/events/zfs/enable $ echo 0 > /sys/kernel/debug/tracing/trace # import zpool called 'tank', inspect tracepoint data (each line was # truncated, they're too long for a commit message otherwise) $ zpool import tank $ cat /sys/kernel/debug/tracing/trace \| head -n35 # tracer: nop # # entries-in-buffer/entries-written: 1219/1219 #P:8 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr... z_rd_int/0-30156 [003] .... 91344.200611: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201173: zfs_arc__miss: hdr... z_rd_int/1-30157 [003] .... 91344.201756: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201795: zfs_arc__miss: hdr... z_rd_int/2-30158 [003] .... 91344.202099: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202126: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202130: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202134: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202146: zfs_arc__miss: hdr... z_rd_int/3-30159 [003] .... 91344.202457: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202484: zfs_arc__miss: hdr... z_rd_int/4-30160 [003] .... 91344.202866: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202891: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.203034: zfs_arc__miss: hdr... z_rd_iss/1-30149 [001] .... 91344.203749: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.203789: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.203878: zfs_arc__miss: hdr... z_rd_iss/3-30151 [001] .... 91344.204315: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.204332: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204337: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204352: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204356: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204360: zfs_arc__hit: hdr ... To highlight the kind of detailed information that is being exported using this infrastructure, I've taken the first tracepoint line from the output above and reformatted it such that it fits in 80 columns: lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr { dva 0x1:0x40082 birth 15491 cksum0 0x163edbff3a flags 0x640 datacnt 1 type 1 size 2048 spa 3133524293419867460 state_type 0 access 0 mru_hits 0 mru_ghost_hits 0 mfu_hits 0 mfu_ghost_hits 0 l2_hits 0 refcount 1 } bp { dva0 0x1:0x40082 dva1 0x1:0x3000e5 dva2 0x1:0x5a006e cksum 0x163edbff3a:0x75af30b3dd6:0x1499263ff5f2b:0x288bd118815e00 lsize 2048 } zb { objset 0 object 0 level -1 blkid 0 } For the specific tracepoint shown here, 'zfs_arc__miss', data is exported detailing the arc_buf_hdr_t (hdr), blkptr_t (bp), and zbookmark_t (zb) that caused the ARC miss (down to the exact DVA!). This kind of precise and detailed information can be extremely valuable when trying to answer certain kinds of questions. For anybody unfamiliar but looking to build on this, I found the XFS source code along with the following three web links to be extremely helpful: * http://lwn.net/Articles/379903/ * http://lwn.net/Articles/381064/ * http://lwn.net/Articles/383362/ I should also node the more "boring" aspects of this patch: * The ZFS_LINUX_COMPILE_IFELSE autoconf macro was modified to support a sixth paramter. This parameter is used to populate the contents of the new conftest.h file. If no sixth parameter is provided, conftest.h will be empty. * The ZFS_LINUX_TRY_COMPILE_HEADER autoconf macro was introduced. This macro is nearly identical to the ZFS_LINUX_TRY_COMPILE macro, except it has support for a fifth option that is then passed as the sixth parameter to ZFS_LINUX_COMPILE_IFELSE. These autoconf changes were needed to test the availability of the Linux tracepoint macros. Due to the odd nature of the Linux tracepoint macro API, a separate ".h" must be created (the path and filename is used internally by the kernel's define_trace.h file). * The HAVE_DECLARE_EVENT_CLASS autoconf macro was introduced. This is to determine if we can safely enable the Linux tracepoint functionality. We need to selectively disable the tracepoint code due to the kernel exporting certain functions as GPL only. Without this check, the build process will fail at link time. In addition, the SET_ERROR macro was modified into a tracepoint as well. To do this, the 'sdt.h' file was moved into the 'include/sys' directory and now contains a userspace portion and a kernel space portion. The dprintf and zfs_dbgmsg* interfaces are now implemented as tracepoint as well. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-11-17 11:13:55 -08:00
Marcel Wysocki	11662bf969	Add config/compile to config/.gitignore This file may be added by automake and therefore should be added to config/.gitignore. For the full list of possible auxiliary programs see the full automake documentation. http://www.gnu.org/software/automake/manual/automake.html#Auxiliary-Programs Signed-off-by: Marcel Wysocki <maci.stgn@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2848	2014-10-31 16:25:34 -07:00
Richard Yao	b76707027c	Make systemd-modules-load.service file directory configurable Installing outside of the prefix is not permissible under Gentoo Prefix. The package manager will cause the installation process to fail if/when it sees this. We could handle this by disabling systemd support on prefix because systemd does not check these paths, but the Gentoo Council decided that small files such as these should be installed. That means disabling systemd support on prefix is not an acceptable workaround. As a consequence, we need some way of control the directory into which these files are installed. Making this configurable increases our compliance with the freedesktop.org specification, which allows these files to be installed into /etc/modules-load.d: http://www.freedesktop.org/software/systemd/man/modules-load.d.html Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:41:14 -07:00
Richard Yao	60e9f69c97	Make directory into which mount.zfs is installed configurable Installing outside of the prefix is not permissible under Gentoo Prefix. The package manager will cause the installation process to fail if/when it sees this. I could script a workaround inside the ebuild, but it seemed to make more sense to make this more configurable. Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:40:59 -07:00
Richard Yao	d8d7826721	Search /usr/local/src for SPL Object Directory Since we changed the default location for the kernel headers to respect --prefix in the SPL, we must search that location to prevent user builds from breaking. Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:37:23 -07:00
Brian Behlendorf	e33045ee98	Make license compatibility checks consistent Apply the license specified in the META file to ensure the compatibility checks are all performed consistently. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2757	2014-10-17 14:58:38 -07:00
Brian Behlendorf	9ad656b2d0	Retire HAVE_IOCTL_* configure checks The HAVE_IOCTL_* configure checks were originally added for compatibility with an ancient version of glibc. This support and additional complexity is no longer needed and is therefore being removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #585	2014-08-28 07:45:54 -07:00
Brian Behlendorf	1139491da7	Revert "Disable GCCs aggressive loop optimization" This reverts commit `0f62f3f9ab`. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2010	2014-07-22 09:56:55 -07:00
Turbo Fredriksson	2ee4e7da90	Accept udev and dracut paths specified by ./configure There are two common locations where udev and dracut components are commonly installed. When building packages using the 'make rpm\|deb' targets check those common locations and pass them to rpmbuild. For non-standard configurations these values can be provided by the the following configure options: --with-udevdir=DIR install udev helpers [default=check] --with-udevruledir=DIR install udev rules [[UDEVDIR/rules.d]] --with-dracutdir=DIR install dracut helpers [default=check] When rebuilding using the source packages the per-distribution default values specified in the spec file will be used. This is the preferred way to build packages for a distribution but the ability to override the defaults is provided as a convenience. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2310 Closes #1680	2014-06-11 16:32:57 -07:00
Turbo Fredriksson	0f629346bb	Set LANG to a reasonable default (C) Set LANG=C before calling 'rpmbuild' to avoid rpmbuild failing on the translated date string in the changelog. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: zfsonlinux/spl#306	2014-06-10 16:46:21 -07:00
Turbo Fredriksson	69c7bdb6e7	Accept kernel source dir(s) specified by ./configure This adds ability to set the location of the kernel via defines when building from the spec files. This is useful when building against a kernel installed in a non-standard location. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1874	2014-06-05 13:46:49 -07:00
Turbo Fredriksson	c9b5cc8c00	Move the libraries into separate packages From day one the various ZFS libraries should have been placed in their own sub-packages. Primarily this allows for multiple major versions of the libraries to be concurrently installed. It also facilitates a smaller build environment by minimizing the required dependencies. The specific changes required to split the libraries from the utilities are as follows: * libzpool2, libnvpair1, libuutil1, and libzfs2 packages were added and contain the versioned shared libraries. The Fedora packaging guidelines discourage providing static libraries so they are not included in the packages. http://fedoraproject.org/wiki/Packaging:Guidelines#Packaging_Static_Libraries * The zfs-devel package was renamed libzfs2-devel and the new package obsoletes the old zfs-devel package. This package includes all the required headers for the libzpool2, libnvpair1, libuutil1, and libzfs2 libraries and their respective unversioned shared libraries. This package should eventually be split in to individual lib-devel packages but it will still take some work to cleanly separate them. Therefore the libzfs2-devel package provides the expected lib-devel packages so the all proper dependencies can still be created. http://fedoraproject.org/wiki/Packaging:Guidelines#Devel_Packages * Moved '/sbin/ldconfig' execution from the zfs packge to each of the new library packages as described by the packaging guidelines. http://fedoraproject.org/wiki/Packaging:Guidelines#Shared_Libraries * The /usr/share/doc/ files were moved in to the libzfs2-devel package. * Updated config/deb.am to be aware of the packaging changes. This ensures that 'deb-utils' make target converts all the resulting packages generated by the 'rpm-utils' target. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #2329 Closes: #2341 Issue: #2145	2014-06-02 13:43:20 -07:00
Brian Behlendorf	79aada6105	Restrict release number to META version When creating packages in a git repository the release number can be automatically set by 'git describe'. This normally works well but if your repository has newer tags which match the form NAME-VERSION* the release may be incorrectly calculated. To prevent this the match patten has been restricted to the contents of the META file, NAME-VERSION. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-05-29 19:31:50 -07:00
Richard Yao	3b4f425a5a	Refactor inode_owner_or_capable() autotools check We need inode_owner_or_capable() for ZFS file attributes in addition to xattrs, so it should go into its own file. This moves it into its own file and changes it to be more comprehensive. It will now fail if no known good API is detected. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1691	2014-05-01 10:06:49 -07:00
Chunwei Chen	b761912b34	Linux 3.14 compat: rq_for_each_segment in dmu_req_copy rq_for_each_segment changed from taking bio_vec * to taking bio_vec. We provide rq_for_each_segment4 which takes both. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:28:51 -07:00
Chunwei Chen	d4541210f3	Linux 3.14 compat: Immutable biovec changes in vdev_disk.c bi_sector, bi_size and bi_idx are moved from bio to bio->bi_iter. This patch creates BIO_BI_*(bio) macros to hide the differences. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:28:38 -07:00
Chunwei Chen	408ec0d2e1	Linux 3.14 compat: posix_acl_{create,chmod} posix_acl_{create,chmod} is changed to __posix_acl_{create_chmod} Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:27:03 -07:00
Chris Dunlap	518eba1492	Replace check for _POSIX_MEMLOCK w/ HAVE_MLOCKALL zed supports a '-M' cmdline opt to lock all pages in memory via mlockall(). The _POSIX_MEMLOCK define is checked to determine whether this function is supported. The current test assumes mlockall() is supported if _POSIX_MEMLOCK is non-zero. However, this test is insufficient according to mlock(2) and sysconf(3). If _POSIX_MEMLOCK is -1, mlockall() is not supported; but if _POSIX_MEMLOCK is 0, availability must be checked at runtime. This commit adds an autoconf check for mlockall() to user.m4. The zed code block for mlockall() is now guarded with a test for HAVE_MLOCKALL. If defined, mlockall() will be called and its runtime availability checked via its return value. Signed-off-by: Chris Dunlap <cdunlap@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2	2014-04-02 13:10:08 -07:00
Chris Dunlap	07917db990	Add defs for makefile installation dir vars Add macro definitions to AM_CPPFLAGS to propagate makefile installation directory variables for libexecdir, runstatedir, sbindir, and sysconfdir. https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html A corollary is that you should not use these variables except in makefiles. For instance, instead of trying to evaluate datadir in configure and hard-coding it in makefiles using e.g., 'AC_DEFINE_UNQUOTED([DATADIR], ["$datadir"], [Data directory.])', you should add -DDATADIR='$(datadir)' to your makefile's definition of CPPFLAGS (AM_CPPFLAGS if you are also using Automake). The runstatedir directory is for "installing data files which the programs modify while they run, that pertain to one specific machine, and which need not persist longer than the execution of the program". https://www.gnu.org/prep/standards/html_node/Directory-Variables.html It will be defined by autoconf 2.70 or later, and default to "$(localstatedir)/run". http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=commit;h=a197431414088a417b407b9b20583b2e8f7363bd Signed-off-by: Chris Dunlap <cdunlap@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2	2014-03-31 16:11:13 -07:00
Richard Yao	1de1488fdc	Linux 3.13 compat: Handle __must_check bdi_setup_and_register torvalds/linux@8077c0d983 added a __must_check to the bdi_setup_and_register(), which caused our autotools check to break. zfsonlinux/zfs@729210564a was intended to correct that, but it depended on -Wno-unused-result, which is unrecognized in older GCC versions. That commit has been reverted in favor of a solution that does not require -Wno-unused-result. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2102 Closes #2135	2014-03-24 11:10:06 -07:00
Richard Yao	6b6b8d1041	Revert "Properly ignore bdi_setup_and_register return value" Older GCC versions do not obey -Wno-unused-result. This reverts commit `729210564a` in favor of a solution that does not require -Wno-unused-result. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1906	2014-03-24 11:08:55 -07:00
Chunwei Chen	a15dac42df	config: compile test rather than run test When testing compiler flags, we only need to do compile test. Otherwise, configure will fail with "configure: error: cannot run test program while cross compiling" when cross compiling. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2191	2014-03-20 12:11:57 -07:00
Ralf Ertzinger	881f45c6a8	Add systemd unit files for ZFS startup This adds systemd unit files replacing the functionality offered by the SysV init script found in etc/init.d. It has been developed and tested on Fedora 19, Fedora 20 and openSuSE 13.1. Four unit files and one target are offered. zfs-import-cache.service: Import pools from /etc/zfs/zpool.cache. This unit will wait for udev to settle. zfs-import-scan.service: Import pools by scanning /dev/disk/by-id for zvols. This unit will only run if /etc/zfs/zpool.cache is not present. This unit will wait for udev to settle zfs-mount.service: Mount ZFS native filesystems. It contains a dependency to be loaded before local-fs.target. zfs-share.service: Share NFS/SMB filesystems. This unit contains a dependency that will cause it to be restarted whenever the smb or nfs-server unit is restarted, restoring the shares added. zfs.target: This target pulls in the other units in order to start ZFS. It's the only unit that can be enabled/disabled, all other services are static and pulled in by dependencies. It will honour zfs=off and zfs=no options on the kernel command line. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2108	2014-02-05 12:25:30 -08:00
Brian Behlendorf	0f62f3f9ab	Disable GCCs aggressive loop optimization GCC >+ 4.8's aggressive loop optimization breaks some of the iterators over the dn_blkptr[] pseudo-array in dnode_phys. Since dn_blkptr[] is defined as a single-element array, GCC believes an iterator can only access index 0 and will unroll the loop into a single iteration. One way to resolve the issue would be to cast the array to a pointer and fix all the iterators that might break. The only loop where it is known to cause a problem is this loop in dmu_objset_write_ready(): for (i = 0; i < dnp->dn_nblkptr; i++) bp->blk_fill += dnp->dn_blkptr[i].blk_fill; In the common case where dn_nblkptr is 3, the loop is only executed a single time and "i" is equal to 1 following the loop. The specific breakage caused by this problem is that the blk_fill of root block pointers wouldn't be set properly when more than one blkptr is in use (when no indrect blocks are needed). The simple reproducing sequence is: zpool create tank /tank.img zdb -ddddd tank 0 Notice that "fill=31", however, there are two L0 indirect blocks with "F=31" and "F=5". The fill count should be 36 rather than 31. This problem causes an assert to be hit in a simple "zdb tank" when built with --enable-debug. However, this approach was not taken because we need to be absolutely sure we catch all instances of this unwanted optimization. Therefore, the build system has been updated to detect if GCC supports the aggressive loop optimization. If it does the optimization will be explicitly disabled using the -fno-aggressive-loop-optimization option. Original-fix-by: Tim Chase <tim@chase2k.com> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2010 Closes #2051	2014-01-14 13:55:58 -08:00
Matthew Thode	11b9ec23b9	Add full SELinux support Four new dataset properties have been added to support SELinux. They are 'context', 'fscontext', 'defcontext' and 'rootcontext' which map directly to the context options described in mount(8). When one of these properties is set to something other than 'none'. That string will be passed verbatim as a mount option for the given context when the filesystem is mounted. For example, if you wanted the rootcontext for a filesystem to be set to 'system_u:object_r:fs_t' you would set the property as follows: $ zfs set rootcontext="system_u:object_r:fs_t" storage-pool/media This will ensure the filesystem is automatically mounted with that rootcontext. It is equivalent to manually specifying the rootcontext with the -o option like this: $ zfs mount -o rootcontext=system_u:object_r:fs_t storage-pool/media By default all four contexts are set to 'none'. Further information on SELinux contexts is detailed in mount(8) and selinux(8) man pages. Signed-off-by: Matthew Thode <prometheanfire@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #1504	2013-12-19 10:37:31 -08:00
Richard Yao	729210564a	Properly ignore bdi_setup_and_register return value This broke compilation against Linux 3.13 and GCC 4.7.3. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1906	2013-12-04 14:53:45 -08:00
DHE	fd23663000	Fix typos in commit `b83e3e48c9` There's a missing semicolon and equals sign in the first hunk of this commit in config/kernel-bdi.m4. This results in the test always failing. The effects were noticed when rrdtool, a tool which modifies files by mmap() and msync(), would have data never get saved to disk in spite of the files working while the mounted filesystem remains mounted. Signed-off-by: DHE <git@dehacked.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #1889	2013-11-20 15:24:39 -08:00
Massimo Maggi	023699cd62	Posix ACL Support This change adds support for Posix ACLs by storing them as an xattr which is common practice for many Linux file systems. Since the Posix ACL is stored as an xattr it will not overwrite any existing ZFS/NFSv4 ACLs which may have been set. The Posix ACL will also be non-functional on other platforms although it may be visible as an xattr if that platform understands SA based xattrs. By default Posix ACLs are disabled but they may be enabled with the new 'aclmode=noacl\|posixacl' property. Set the property to 'posixacl' to enable them. If ZFS/NFSv4 ACL support is ever added an appropriate acltype will be added. This change passes the POSIX Test Suite cleanly with the exception of xacl/00.t test 45 which is incorrect for Linux (Ext4 fails too). http://www.tuxera.com/community/posix-test-suite/ Signed-off-by: Massimo Maggi <me@massimo-maggi.eu> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #170	2013-10-29 14:54:26 -07:00
Richard Yao	1db7b9be75	Fix libblkid support libblkid support is dormant because the autotools check is broken and liblkid identifies ZFS vdevs as "zfs_member", not "zfs". We fix that with a few changes: First, we fix the libblkid autotools check to do a few things: 1. Make a 64MB file, which is the minimum size ZFS permits. 2. Make 4 fake uberblock entries to make libblkid's check succeed. 3. Return 0 upon success to make autotools use the success case. 4. Include stdlib.h to avoid implicit declration of free(). 5. Check for "zfs_member", not "zfs" 6. Make --with-blkid disable autotools check (avoids Gentoo sandbox violation) 7. Pass '-lblkid' correctly using LIBS not LDFLAGS. Second, we change the libblkid support to scan for "zfs_member", not "zfs". This makes --with-blkid work on Gentoo. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1751	2013-10-10 16:56:51 -07:00
Richard Yao	b83e3e48c9	Stop runtime pointer modifications in autotools checks `c38367c73f` was meant to eliminate runtime function pointer modifications in autotools checks because they were prone to false negatives on kernels hardened by the PaX project. Unfortunately, I missed the xattr_handler and super_block->s_bdi autotools checks. Recent changes to PaX constified xattr_handler->get/set, which lead me to discover this oversight. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1433	2013-09-13 13:30:55 -07:00
Richard Yao	0f37d0c8be	Linux 3.11 compat: fops->iterate() Commit torvalds/linux@2233f31aad replaced ->readdir() with ->iterate() in struct file_operations. All filesystems must now use the new ->iterate method. To handle this the code was reworked to use the new ->iterate interface. Care was taken to keep the majority of changes confined to the ZPL layer which is already Linux specific. However, minor changes were required to the common zfs_readdir() function. Compatibility with older kernels was accomplished by adding versions of the trivial dir_emit* helper functions. Also the various _readdir() functions were reworked in to wrappers which create a dir_context structure to pass to the new _iterate() functions. Unfortunately, the new dir_emit* functions prevent us from passing a private pointer to the filldir function. The xattr directory code leveraged this ability through zfs_readdir() to generate the list of xattr names. Since we can no longer use zfs_readdir() a simplified zpl_xattr_readdir() function was added to perform the same task. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1653 Issue #1591	2013-08-15 16:19:07 -07:00
Brian Behlendorf	dba1d70566	Fix arc_adapt() spinning in iterate_supers_type() The iterate_supers_type() function which was introduced in the 3.0 kernel was supposed to provide a safe way to call an arbitrary function on all super blocks of a specific type. Unfortunately, because a list_head was used a bug was introduced which made it possible for iterate_supers_type() to get stuck spinning on a super block which was just deactivated. This can occur because when the list head is removed from the fs_supers list it is reinitialized to point to itself. If the iterate_supers_type() function happened to be processing the removed list_head it will get stuck spinning on that list_head. The bug was fixed in the 3.3 kernel by converting the list_head to an hlist_node. However, to resolve the issue for existing 3.0 - 3.2 kernels we detect when a list_head is used. Then to prevent the spinning from occurring the .next pointer is set to the fs_supers list_head which ensures the iterate_supers_type() function will always terminate. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1045 Closes #861 Closes #790	2013-07-17 09:28:06 -07:00
Chris Dunlop	a1d9543a39	3.10 API change: block_device_operations->release() returns void Linux kernel commit torvalds/linux@db2a144 changed the return type of block_device_operations->release() to void. Detect the expected prototype and defined our callout accordingly. Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1494	2013-07-08 15:41:57 -07:00
Li Dongyang	802e7b5feb	Add SEEK_DATA/SEEK_HOLE to lseek()/llseek() The approach taken was the rework zfs_holey() as little as possible and then just wrap the code as needed to ensure correct locking and error handling. Tested with xfstests 285 and 286. All tests pass except for 7-9 of 285 which try to reserve blocks first via fallocate(2) and fail because fallocate(2) is not yet supported. Note that the filp->f_lock spinlock did not exist prior to Linux 2.6.30, but we avoid the need for autotools check by virtue of the fact that SEEK_DATA/SEEK_HOLE support was not added until Linux 3.1. An autoconf check was added for lseek_execute() which is currently a private function but the expectation is that it will be exported perhaps as early as Linux 3.11. Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1384	2013-07-02 09:24:43 -07:00
Carlos Alberto Lopez Perez	5165473737	Ensure --with-spl-timeout waits for spl_config.h and symvers The previous code was only waiting for the symver file. But the postinst target of the DKMS script for SPL will not only create the symvers file, but also the header spl_config.h. If we are waiting in the configure script of ZFS for the SPL symvers file, then we also need to wait for spl_config.h. Otherwise the configure script will abort because the spl_config.h is not yet available. On top of that, the function ZFS_AC_SPL_MODULE_SYMVERS is moved to the end of the function ZFS_AC_SPL to allow both checks share the with-spl-timeout parameter. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1431	2013-05-02 15:40:44 -07:00
Brian Behlendorf	e013670550	Set RPM_DEFINE_COMMON options When the kmod packaging was introduced the ability to pass the --enable-debug and --enable-dmu-tx options from configure all the way through to `make rpm\|deb` was accidenally lost. Update ZFS_AC_RPM to explicitlu set RPM_DEFINE_COMMON with these rpmbuild defines. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1402	2013-04-24 16:18:55 -07:00
Turbo Fredriksson	1a33036df9	Add --bump=0 to alien Preserve the release field when creating Debian packages. The --keep-version option was not used because it results in a failure when the git '<commit>_<hash>' syntax is used for the release. The '_' is a valid character for RPM packages but not for DEBs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Issue #1402 Issue #928	2013-04-24 16:18:53 -07:00
Turbo Fredriksson	d012ba3832	Support .nogitrelease file When building a custom release in a git tree provide the ability to prevent the release field from being overwritten by the `git describe` output. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1402	2013-04-24 16:18:49 -07:00
Brian Behlendorf	d17eeafbf0	Replace the ZFS_AC_META perl dependency with awk The only remaining perl dependency is part of the ZFS_AC_META macro. By eliminating this and replacing it with awk we can avoid the need to pull in perl to rebuild the packages. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1380	2013-04-02 16:05:45 -07:00
Jan Engelhardt	8c39262945	build: do not call boilerplate ourself Rationale see section 3.5 "Using `autoreconf' to Update `configure' Scripts" of the autoconf manual. http://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/autoreconf-Invocation.html Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:55:20 -07:00
Jan Engelhardt	ea0fcfc875	gitignore: anchor entries at their respective directory .ko is specific to module, .m4 to config, etc. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:50:17 -07:00
Jan Engelhardt	ecf76e3676	build: use CPPFLAGS -D and -I are preprocessor flags, so should preferably be in the appropriate variable. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:48:26 -07:00
Jan Engelhardt	4e95cc99b0	build: resolve orthographic and other grammatical errors Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:44:52 -07:00
Brian Behlendorf	f6fb7651a0	Use 'git describe' for working builds When building from an arbitrary commit in the git tree it's useful for the resulting packages to be uniquely identifiable. Therefore, the build system has been updated to detect if your compiling in git tree. If you are building in a git tree, and there are commits after the last annotated tag. Then the <id>-<hash> component of 'git describe' will be used to overwrite the 'Release:' field in the META file. The only tricky part is that to ensure the 'make dist' tarball is built using the correct release. A dist-hook was added to the top level make file to rewrite the META file using the correct release. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-22 14:52:01 -07:00
Brian Behlendorf	f3757573a6	Refresh RPM packaging Refresh the existing RPM packaging to conform to the 'Fedora Packaging Guidelines'. This includes adopting the kmods2 packaging standard which is used fod kmods distributed by rpmfusion for Fedora/RHEL. http://fedoraproject.org/wiki/Packaging:Guidelines http://rpmfusion.org/Packaging/KernelModules/Kmods2 While the spec files have been entirely rewritten from a user perspective the only major changes are: * The Fedora packages now have a build dependency on the rpmfusion repositories. The generic kmod packages also have a new dependency on kmodtool-1.22 but it is bundled with the source rpm so no additional packages are needed. * The kernel binary module packages have been renamed from zfs-modules-* to kmod-zfs-* as specificed by kmods2. * The is now a common kmod-zfs-devel-* package in addition to the per-kernel devel packages. The common package contains the development headers while the per-kernel package contains kernel specific build products. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1341	2013-03-18 15:33:17 -07:00
Brian Behlendorf	9b2af9a097	Configure --with-spl{-obj} auto-detect cleanup Because the install location for the spl/zfs-devel headers was changed we need to refresh the auto-detect code. Note that for packaging which already explicitly calls --with-spl{-obj} nothing has changed. The updated code is now structured like that in ZFS_AC_KERNEL and should be cleaner and easier to maintain. In addition, it's stricter about detecting a valid source and object directory. It requires: * The source directory contains the file 'spl.release' * The object directory contains the file 'spl_config.h' * The following paths will be checked. Notice the /var/lib/ and /usr/src paths require that the spl and zfs version be matched. This is done to prevent accidentally mixing releases. dnl # 1) /var/lib/dkms/spl/<version>/build dnl # 2) /usr/src/spl-<version>/<kernel-version> dnl # 3) /usr/src/spl-<version> dnl # 4) ../spl dnl # 5) /usr/src/kernels/<kernel-version> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-13 13:42:16 -07:00
Brian Behlendorf	0da31cd6ca	Remove ARCH packaging The kernel modules are now available in the Arch User Repository (AUR) via zfs. Since their packaging is maintained and superior to ours it is being removed from the tree. https://wiki.archlinux.org/index.php/ZFS Now that various distributions are picking up the packages we should eventually be able to remove most of this infrastructure. Packaging belongs with the distributions not upstream. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Brian Behlendorf	ffb21118ad	Add --with-dracutdir configure option The standard dracut directory has moved from /usr/share/dracut to /usr/lib/dracut. To ensure the dracut modules get installed in the correct location provide a --with-dracutdir configure option to set the path. The default install location has been updated to /usr/lib/dracut which is used by more current versions of Fedora. However, this default is overriden by the RPM packaging for consistency. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Richard Yao	c38367c73f	Eliminate runtime function pointer mods in autotools checks PaX/GrSecurity patched kernels implement a dialect of C that relies on a GCC plugin for enforcement. A basic idea in this dialect is that function pointers in structures should not change during runtime. This causes code that modifies function pointers at runtime to fail to compile in many instances. The autotools checks rely on whether or not small test cases compile against a given kernel. Some autotools checks assume some default case if other cases fail. When one of these autotools checks tests a PaX/GrSecurity patched kernel by modifying a function pointer at runtime, the default case will be used. Early detection of such situations is possible by relying on compiler warnings, which are compiler errors when --enable-debug is used. Unfortunately, very few people build ZFS with --enable-debug. The more common situation is that these issues manifest themselves as runtime failures in the form of NULL pointer exceptions. Previous patches that addressed such issues with PaX/GrSecurity compatibility largely relied on rewriting autotools checks to avoid runtime function pointer modification or the addition of PaX/GrSecurity specific checks. This patch takes the previous work to its logical conclusion by eliminating the use of runtime function pointer modification. This permits the removal of PaX-specific autotools checks in favor of ones that work across all supported kernels. This should resolve issues that were reported to occur with PaX/GrSecurity-patched Linux 3.7.5 kernels on Gentoo Linux. https://bugs.gentoo.org/show_bug.cgi?id=457176 We should be able to prevent future regressions in PaX/GrSecurity compatibility by ensuring that all changes to ZFSOnLinux avoid runtime function pointer modification. At the same time, this does not solve the issue of silent failures triggering default cases in the autotools check, which is what permitted these regressions to become runtime failures in the first place. This will need to be addressed in a future patch. Reported-by: Marcin Mirosław <bug@mejor.pl> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1300	2013-03-04 08:49:17 -08:00
Etienne Dechamps	d9b0ebbe82	Remove the bio_empty_barrier() check. To determine whether the kernel is capable of handling empty barrier BIOs, we check for the presence of the bio_empty_barrier() macro, which was introduced in 2.6.24. If this macro is defined, then we can flush disk vdevs; if it isn't, then flushing is disabled. Unfortunately, the bio_empty_barrier() macro was removed in 2.6.37, even though the kernel is still capable of handling empty barrier BIOs. As a result, flushing is effectively disabled on kernels >= 2.6.37, meaning that starting from this kernel version, zfs doesn't use barriers to guarantee on-disk data consistency. This is quite bad and can lead to potential data corruption on power failures. This patch fixes the issue by removing the configure check for bio_empty_barrier(), as we don't support kernels <= 2.6.24 anymore. Thanks to Richard Kojedzinszky for catching this nasty bug. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1318	2013-02-24 10:22:34 -08:00
Etienne Dechamps	d75af3c0eb	Use -Werror for all kernel configure tests. As a matter of fact, we're already using -Werror for most tests because of a bug in kernel-bio-empty-barrier.m4 which sets -Werror without reverting it afterwards. This meant that all tests which ran after this one was using -Werror. This patch simply makes it clear that we're using -Werror and makes the code more readable and more predictable. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1317	2013-02-24 10:20:28 -08:00
Brian Behlendorf	79c6e4c445	Remove NPTL_GUARD_WITHIN_STACK Commit `4b2f65b253` increased the user space stack by 4x to resolve certain stack overflows. As such it no longer makes sense to worry about a single extra page which might or might not be part of the process stack. There is now ample headroom for normal usage. By eliminating this configure check we are also resolving the following segfault which intentionally occurs at configure time and may be logged in dmesg. conftest[22156]: segfault at 7fbf18a47e48 ip 00000000004007fe sp 00007fbf18a4be50 error 6 in conftest[400000+1000] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-29 10:58:20 -08:00
Brian Behlendorf	2b7ab9d4d9	Linux 2.6.26 compat, lookup_bdev() It's doubtful many people were impacted by this but commit `6c28567` accidentally broke ZFS builds for 2.6.26 and earlier kernels. This commit depends on the lookup_bdev() function which exists in 2.6.26 but wasn't exported until 2.6.27. The availability of the function isn't critical so a wrapper is introduced which returns ERR_PTR(-ENOTSUP) when the function isn't defined. This will have the effect of causing zvol_is_zvol() to always fail for 2.6.26 kernels. This in turn means vdevs will always get opened concurrently which is good for normal usage. This will only become an issue if your using a zvol as a vdev in another pool. In which case you really should be using a newer kernel anyway. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1205	2013-01-28 15:35:00 -08:00
Brian Behlendorf	ee93035378	Use sb->s_d_op default dentry operations As of Linux 2.6.37 the right way to register custom dentry operations is to use the super block's ->s_d_op field. For older kernels they should be registered as part of the lookup operation. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1223	2013-01-18 15:04:23 -08:00
Ned Bass	f1a05fa114	Fix false ENOENT on snapshot control dentries Lookups in the snapshot control directory for an existing snapshot fail with ENOENT if an earlier lookup failed before the snapshot was created. This is because the earlier lookup causes a negative dentry to be cached which is never invalidated. The bug can be reproduced as follows (the second ls should succeed): $ ls /tank/.zfs/snapshot/s ls: cannot access /tank/.zfs/snapshot/s: No such file or directory $ zfs snap tank@s $ ls /tank/.zfs/snapshot/s ls: cannot access /tank/.zfs/snapshot/s: No such file or directory To remedy this, always invalidate cached dentries in the snapshot control directory. Since these entries never exist on disk there is no significant performance penalty for the extra lookups. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1192	2013-01-16 16:28:54 -08:00
Brian Behlendorf	e191b54ecf	Only use gcc -Wunused-but-set-variable when available Certain versions of gcc generate an 'unrecognized command line option' error message when -Wunused-but-set-variable is used unconditionally. This in turn can cause several of the autoconf tests to misdetect an interface. Now, the use of -Wunused-but-set-variable in the autoconf tests was introduced by commit `b9c59ec8` to address a gcc 4.6 compatibility problem. So we really only need to pass this option for version of gcc which are known to support it. Therefore, the tests have been updated to use the result of the existing ZFS_AC_CONFIG_ALWAYS_NO_UNUSED_BUT_SET_VARIABLE which determines if gcc supports this option. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1004	2013-01-10 16:09:39 -08:00
Brian Behlendorf	8780c53961	Update SAs when an inode is dirtied Revert the portion of commit `d3aa3ea` which always resulted in the SAs being update when an mmap()'ed file was closed. That change accidentally resulted in unexpected ctime updates which upset tools like git. That was always a horrible hack and I'm happy it will never make it in to a tagged release. The right fix is something I initially resisted doing because I was worried about the additional overhead. However, in hindsight the overhead isn't as bad as I feared. This patch implemented the sops->dirty_inode() callback which is unsurprisingly called when an inode is dirtied. We leverage this callback to keep the znode SAs strictly in sync with the inode. However, for now we're going to go slowly to avoid introducing any new unexpected issues by only updating the atime, mtime, and ctime. This will cover the callpath of most concern to us. ->filemap_page_mkwrite->file_update_time->update_time-> mark_inode_dirty_sync->__mark_inode_dirty->dirty_inode Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #764 Closes #1140	2012-12-14 12:18:54 -08:00
Brian Behlendorf	56a517ae3a	Verify --with-linux source directory exists Previously this check was only performed when ./configure was attempting to autodetect your kernel source directory. But we should also handle the case where --with-linux was provided and is obviously wrong. This way we catch the error before invoking make and compiling the source with an incorrect autoconf results. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/spl#162	2012-11-29 15:08:35 -08:00
Brian Behlendorf	2404b01499	Improve AF hard disk detection Use the bdev_physical_block_size() interface to determine the minimize write size which can be issued without incurring a read-modify-write operation. This is used to set the ashift correctly to prevent a performance penalty when using AF hard disks. Unfortunately, this interface isn't entirely reliable because it's not uncommon for disks to misreport this value. For this reason you may still need to manually set your ashift with: zpool create -o ashift=12 ... The solution to this in the upstream Illumos source was to add a white list of known offending drives. Maintaining such a list will be a burden, but it still may be worth doing if we can detect a large number of these drives. This should be considered as future work. Reported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #916	2012-11-15 11:06:14 -08:00
Richard Yao	95f5c63b47	Linux 3.6 compat, iops->mkdir() Use .mkdir instead of .create in 3.3 compatibility check. Linux 3.6 modifies inode_operations->create's function prototype. This causes an autotools Linux 3.3. compatibility check for a function prototype change in create, mkdir and mknode to fail. Since mkdir and mknode are unchanged, we modify the check to examine it instead. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 15:29:26 -07:00
Yuxuan Shui	558ef6d080	Linux 3.6 compat, iops->create() As of Linux commit ebfc3b49a7ac25920cb5be5445f602e51d2ea559 the struct nameidata is no longer passed to iops->create. Instead only the result of (inamedata->flags & LOOKUP_EXCL) is passed. ZFS like almost all Linux fileystems never made use of this so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 14:42:25 -07:00
Yuxuan Shui	8f195a908f	Linux 3.6 compat, iops->lookup() As of Linux commit 00cd8dd3bf95f2cc8435b4cac01d9995635c6d0b the struct nameidata is no longer passed to iops->lookup. Instead only the inamedata->flags are passed. ZFS like almost all Linux fileystems never made use of this so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:54 -07:00
Yuxuan Shui	3c20361075	Linux 3.6 compat, sget() As of Linux commit 9249e17fe094d853d1ef7475dd559a2cc7e23d42 the mount flags are now passed to sget() so they can be used when initializing a new superblock. ZFS never uses sget() in this fashion so we can simply pass a zero and add a zpl_sget() compatibility wrapper. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:48 -07:00
Brian Behlendorf	6d1d976b2c	Modify vdev_elevator_switch() to use elevator_change() As of Linux 2.6.36 an elevator_change() interface was added. This commit updates vdev_elevator_switch() to use this interface when available, otherwise it falls back to the usermodehelper method. Original-patch-by: foobarz <sysop@xeon.(none)> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #906	2012-10-03 13:31:44 -07:00
Cyril Plisko	393b44c711	Implement .commit_metadata hook for NFS export In order to implement synchronous NFS metadata semantics ZFS needs to provide the .commit_metadata hook. All it takes there is to make sure changes are committed to ZIL. Fortunately zfs_fsync() does just that, so simply calling it from zpl_commit_metadata() does the trick. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #969	2012-10-03 10:49:45 -07:00
Brian Behlendorf	cda4db408c	Revert "Improve AF hard disk detection" This reverts commit `395350c85d` which accidentally introduced issue #955. Pools using AF drives which were originally created with a sector size of 512 bytes will now be correctly detected to have physical sector size of 4096. This is desirable for a new pool, however for an existing pool abruptly changing the sector size causes problems. For this reason, this change is being reverted until the additional logic can be added to detect the existing pool case. Existing pools must use the ashift size stored in the label regardless of what the disk reports. This is critical for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #955	2012-09-11 16:33:49 -07:00
Brian Behlendorf	395350c85d	Improve AF hard disk detection Use the bdev_physical_block_size() interface to determine the minimize write size which can be issued without incurring a read-modify-write operation. This is used to set the ashift correctly to prevent a performance penalty when using AF hard disks. Unfortunately, this interface isn't entirely reliable because it's not uncommon for disks to misreport this value. For this reason you may still need to manually set your ashift with: zpool create -o ashift=12 ... The solution to this in the upstream Illumos source was to add a while list of known offending drives. Maintaining such a list will be a burden, but it still may be worth doing if we can detect a large number of these drives. This should be considered as future work. Reported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #916	2012-09-04 15:35:32 -07:00
Prakash Surya	f86373f5b2	Remove autoconf check for CONFIG_PREEMPT The autoconf macro which failed if CONFIG_PREEMPT was set in the kernel config was removed. With the inclusion of a few previous patches targeting support for preempt enabled kernels, it is now safe to run with this kernel config option enabled. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #83	2012-08-27 11:54:41 -07:00
Brian Behlendorf	ca8b5af89d	Remove autotools products Remove all of the generated autotools products from the repository and update the .gitignore files accordingly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #718	2012-08-27 11:47:44 -07:00
Richard Yao	074e72953c	Check kernel source directory for SPL ZFS fails to build when SPL is built into the kernel on unless --with-spl=/path/to/kernel/sources is specified. We fallback to the kernel sources directory when SPL is not found elsewhere to resolve that. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closed #896	2012-08-26 13:49:09 -07:00
Massimo Maggi	52cd92022e	Fix snapshot automounting with GrSecurity constify plugin. ./configure erroneously detects absence of dops->d_automount when built against a GrSecurity patched kernel. Summerized error message found in config.log: checking whether dops->d_automount() exists ... In function 'main': ... error: constified variable 'dops' cannot be local The "dops" variable cannot be a local variable, so it's moved to the global scope. This test also fails if the prototype of the dops->d_automount function pointer is changed. Signed-off-by: Massimo Maggi <massimo@mmmm.it> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Closes #884	2012-08-24 08:56:38 -07:00
Prakash Surya	26e08952e6	Support building a zfs-modules-dkms sub package This commit adds support for building a zfs-modules-dkms sub package built around Dynamic Kernel Module Support. This is to allow building packages using the DKMS infrastructure which is intended to ease the burden of kernel version changes, upgrades, etc. By default zfs-modules-dkms-* sub package will be built as part of the 'make rpm' target. Alternately, you can build only the DKMS module package using the 'make rpm-dkms' target. Examples: # To build packaged binaries as well as a dkms packages $ ./configure && make rpm # To build only the packaged binary utilities and dkms packages $ ./configure && make rpm-utils rpm-dkms Note: Only the RHEL 5/6, CHAOS 5, and Fedora distributions are supported for building the dkms sub package. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #535	2012-08-08 15:21:01 -07:00
Prakash Surya	5085d55817	Add '--with-spl-timeout' option When checking for the SPL Module.symvers file, a timeout can now be passed in which will pause the configure step while it waits for this file to be generated. By default, the configure behavior is unchanged as a timeout of 0 is used. If a positive number of seconds is passed, configure will wait that number of seconds for the Module.symvers file before moving on. The main motivation for this change was to support parallel execution of './configure && make' for the SPL and ZFS packages in preparation of supporting DKMS based packages. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-08 15:20:55 -07:00
Etienne Dechamps	ee5fd0bb80	Set zvol discard_granularity to the volblocksize. Currently, zvols have a discard granularity set to 0, which suggests to the upper layer that discard requests of arbirarily small size and alignment can be made efficiently. In practice however, ZFS does not handle unaligned discard requests efficiently: indeed, it is unable to free a part of a block. It will write zeros to the specified range instead, which is both useless and inefficient (see dnode_free_range). With this patch, zvol block devices expose volblocksize as their discard granularity, so the upper layer is aware that it's not supposed to send discard requests smaller than volblocksize. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #862	2012-08-07 14:55:31 -07:00
Etienne Dechamps	705741827a	When checking for symbol exports, try compiling. This patch adds a new autoconf function: ZFS_LINUX_TRY_COMPILE_SYMBOL. This new function does the following: - Call LINUX_TRY_COMPILE with the specified parameters. - If unsuccessful, return false. - If successful and we're configuring with --enable-linux-builtin, return true. - Else, call CHECK_SYMBOL_EXPORT with the specified parameters and return the result. All calls to CHECK_SYMBOL_EXPORT are converted to LINUX_TRY_COMPILE_SYMBOL so that the tests work even when configuring for builtin on a kernel which doesn't have loadable module support, or hasn't been built yet. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:42:57 -07:00
Etienne Dechamps	fc88a6dda9	Fake modpost stage for LINUX_COMPILE. Currently, when building a test case, we're compiling an entire Linux module from beginning to end. This includes the MODPOST stage, which generates a "conftest.mod.c" file with some boilerplate module declaration code. This poses a problem when configuring for built-in on kernels which have loadable module support disabled. In this case conftest.mod.c is referencing disabled code, resulting in a compilation failure, thus breaking the tests. This patch fixes the issue by faking the modpost stage when the --enable-linux-builtin option is provided. It does so by forcing the modpost command to be /bin/true, and using an empty conftest.mod.c file. The test module still compiles fine, although the result isn't loadable, but we don't really care at this point. Note it is important to preserve the modpost stage when building out of tree. The ZFS_AC_KERNEL_BLK_END_REQUEST, ZFS_AC_KERNEL_BLK_QUEUE_FLUSH, and ZFS_AC_KERNEL_BLK_RQ_BYTES configure checks all depend on it to identify GPL-only symbols. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:41:02 -07:00
Etienne Dechamps	319a99a3d4	Make configure builtin-aware. This patch adds a new option to configure: --enable-linux-builtin. When this option is used, the following happens: - Compilation of kernel modules is disabled. - A failure to find UTS_RELEASE is followed by a suggestion to run "make prepare" on the kernel source tree. This patch also adds a new test which tries to compile an empty module as a basic toolchain sanity test. If it fails and the option was specified, the error is followed by a suggestion to run "make scripts" on the kernel source tree. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:40:18 -07:00
Etienne Dechamps	b2c5198b19	Don't build packages that haven't been selected. Currently, when configure --with-config is used, selective compilation is only effective for the simple "make" case. Package builders (e.g. make rpm) still build everything (utils and modules). This patch fixes that. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:39:37 -07:00
Richard Yao	739a1a82e0	Linux 3.5 compat, end_writeback() changed to clear_inode() The end_writeback() function was changed by moving the call to inode_sync_wait() earlier in to evict(). This effecitvely changes the ordering of the sync but it does not impact the details of the zfs implementation. However, as part of this change end_writeback() was renamed to clear_inode() to reflect the new semantics. This change does impact us and clear_inode() now maps to end_writeback() for kernels prior to 3.5. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #784	2012-07-23 12:29:36 -07:00
Richard Yao	ea1fdf46e2	Linux 3.5 compat, iops->truncate_range() removed The vmtruncate_range() support has been removed from the kernel in favor of using the fallocate method in the file_operations table. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:32 -07:00
Richard Yao	756c3e5a9c	Linux 3.5 compat, eops->encode_fh() takes inodes The export_operations member ->encode_fh() has been updated to take both the child and parent inodes. This interface used to take the child dentry and a bool describing if the parent is needed. NOTE: While updating this code I noticed that we do not currently cleanly handle the case where we're passed a connectable parent. This code should be audited to make sure we're doing the right thing. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:23 -07:00
Richard Yao	ed3fc80048	Fix NULL pointer dereference on PaX/GRSecurity patched Linux 3.3 and later kernels Support for PaX/GRSecurity patched kernels was developed against Linux 3.2. Unfortunately, an autotools check introduced for a Linux 3.3 API fails on PaX/GRSecurity patched kernels. This causes the module to be built against the Linux 3.2 ABI, which results in a NULL pointer dereference at runtime. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Closes #794 Closes #809	2012-07-20 12:31:45 -07:00
Richard Yao	0a6b03d3b8	Fix build failures on PaX/GRSecurity patched kernels Gentoo Hardened kernels include the PaX/GRSecurity patches. They use a dialect of C that relies on a GCC plugin. In particular, struct file_operations has been marked do_const in the PaX/GRSecurity dialect, which causes GCC to consider all instances of it as const. This caused failures in the autotools checks and the ZFS source code. To address this, we modify the autotools checks to take into account differences between the PaX C dialect and the regular C dialect. We also modify struct zfs_acl's z_ops member to be a pointer to a function pointer table. Lastly, we modify zpl_put_link() to address a PaX change to the function prototype of nd_get_link(). This avoids compiler errors in the PaX/GRSecurity dialect. Note that the change in zpl_put_link() causes a warning that becomes a build failure when debugging is enabled. Fixing that warning requires ryao/spl@5ca50ef459. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #484	2012-07-17 09:22:43 -07:00
Etienne Dechamps	b5a28807cd	Move partition scanning from userspace to module. Currently, zpool online -e (dynamic vdev expansion) doesn't work on whole disks because we're invoking ioctl(BLKRRPART) from userspace while ZFS still has a partition open on the disk, which results in EBUSY. This patch moves the BLKRRPART invocation from the zpool utility to the module. Specifically, this is done just before opening the device in vdev_disk_open() which is called inside vdev_reopen(). This requires jumping through some hoops to get to the disk device from the partition device, and to make sure we can still open the partition after the BLKRRPART call. Note that this new code path is triggered on dynamic vdev expansion only; other actions, like creating a new pool, are unchanged and still call BLKRRPART from userspace. This change also depends on API changes which are available in 2.6.37 and latter kernels. The build system has been updated to detect this, but there is no compatibility mode for older kernels. This means that online expansion will NOT be available in older kernels. However, it will still be possible to expand the vdev offline. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #808	2012-07-17 09:17:31 -07:00
Richard Yao	6a0936babc	Linux 3.4 compat, d_make_root() replaces d_alloc_root() torvalds/linux@adc0e91ab1 introduced introduced d_make_root() as a replacement for d_alloc_root(). Further commits appear to have removed d_alloc_root() from the Linux source tree. This causes the following failure: error: implicit declaration of function 'd_alloc_root' [-Werror=implicit-function-declaration] To correct this we update the code to use the current d_make_root() interface for readability. Then we introduce an autotools check to determine if d_make_root() is available. If it isn't then we define some compatibility logic which used the older d_alloc_root() interface. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #776	2012-06-11 10:04:49 -07:00
Ned Bass	cac1f230e0	Improve CONFIG_DEBUG_LOCK_ALLOC error message The configure script error message for kernels built with CONFIG_DEBUG_LOCK_ALLOC may give the impression that the issue is strictly with license compliance. To avoid confusion add some words indicating that the linking stage will fail if the build continues. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #773	2012-06-11 09:28:04 -07:00
Brian Behlendorf	e5b8562277	Extend CONFIG_DEBUG_LOCK_ALLOC check The CONFIG_DEBUG_LOCK_ALLOC check at configure time was added to detect when mutex_lock() is defined as a GPL-only symbol. However, the check as written only inferred this from this configuration setting, it never actually checked. This change introduces that missing check to prevent false positives. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-06-01 08:51:56 -07:00
Brian Behlendorf	b39d3b9f7b	Linux 3.3 compat, iops->create()/mkdir()/mknod() The mode argument of iops->create()/mkdir()/mknod() was changed from an 'int' to a 'umode_t'. To prevent a compiler warning an autoconf check was added to detect the API change and then correctly set a zpl_umode_t typedef. There is no functional change. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #701	2012-04-30 12:52:38 -07:00
Brian Behlendorf	f47e1351db	Fix executable permissions Caught by lint, this permission change was accidentally introduced by commit `42cb3819f1`. Restore the correct permissions and while I'm at it add a missing whack-bang to config/ltmain.sh. lint: executable-not-elf-or-script: zpool_main.c zfs_main.c Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #620	2012-03-26 11:52:44 -07:00
Brian Behlendorf	1c5de20ae2	Add --enable-debug-dmu-tx configure option Allow rigorous (and expensive) tx validation to be enabled/disabled indepentantly from the standard zfs debugging. When enabled these checks ensure that all txs are constructed properly and that a dbuf is never dirtied without taking the correct tx hold. This checking is particularly helpful when adding new dmu consumers like Lustre. However, for established consumers such as the zpl with no known outstanding tx construction problems this is just overhead. --enable-debug-dmu-tx - Enable/disable validation of each tx as --disable-debug-dmu-tx it is constructed. By default validation is disabled due to performance concerns. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-23 12:25:17 -07:00
Brian Behlendorf	ebe7e575ea	Add .zfs control directory Add support for the .zfs control directory. This was accomplished by leveraging as much of the existing ZFS infrastructure as posible and updating it for Linux as required. The bulk of the core functionality is now all there with the following limitations. ) The .zfs/snapshot directory automount support requires a 2.6.37 or newer kernel. The exception is RHEL6.2 which has backported the d_automount patches. ) Creating/destroying/renaming snapshots with mkdir/rmdir/mv in the .zfs/snapshot directory works as expected. However, this functionality is only available to root until zfs delegations are finished. * mkdir - create a snapshot * rmdir - destroy a snapshot * mv - rename a snapshot The following issues are known defeciences, but we expect them to be addressed by future commits. ) Add automount support for kernels older the 2.6.37. This should be possible using follow_link() which is what Linux did before. ) Accessing the .zfs/snapshot directory via NFS is not yet possible. The majority of the ground work for this is complete. However, finishing this work will require resolving some lingering integration issues with the Linux NFS kernel server. *) The .zfs/shares directory exists but no futher smb functionality has yet been implemented. Contributions-by: Rohan Puri <rohan.puri15@gmail.com> Contributiobs-by: Andrew Barnes <barnes333@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #173	2012-03-22 13:03:47 -07:00
Richard Yao	76c2b24c61	Fix distribution detection Improve the distribution detection by moving the tests for distribution specific files first. The Ubuntu and Debian checks are left for last because they are the least likely to be unique. This is particularly true in the case of Debian since so many distributions are based on Debian. Since this is currently only used to identify the correct packaging method for this system the result in many instances is simply cosmetic. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-05 10:38:27 -08:00
Brian Behlendorf	4b787d75c8	Cleanly support debug packages Allow a source rpm to be rebuilt with debugging enabled. This avoids the need to have to manually modify the spec file. By default debugging is still largely disabled. To enable specific debugging features use the following options with rpmbuild. '--with debug' - Enables ASSERTs # For example: $ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm Additionally, ZFS_CONFIG has been added to zfs_config.h for packages which build against these headers. This is critical to ensure both zfs and the dependant package are using the same prototype and structure definitions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 14:08:17 -08:00
Etienne Dechamps	30930fba21	Add support for DISCARD to ZVOLs. DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning. It allows ZVOL clients to discard (unmap, trim) block ranges from a ZVOL, thus optimizing disk space usage by allowing a ZVOL to shrink instead of just grow. We can't use zfs_space() or zfs_freesp() here, since these functions only work on regular files, not volumes. Fortunately we can use the low-level function dmu_free_long_range() which does exactly what we want. Currently the discard operation is not added to the log. That's not a big deal since losing discard requests cannot result in data corruption. It would however result in disk space usage higher than it should be. Thus adding log support to zvol_discard() is probably a good idea for a future improvement. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-09 16:19:38 -08:00
Etienne Dechamps	cb2d19010d	Support the fallocate() file operation. Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is supported, since it's the only one that matches the behavior of zfs_space(). This makes it pretty much useless in its current form, but it's a start. To support other flag combinations we would need to modify zfs_space() to make it more flexible, or emulate the desired functionality in zpl_fallocate(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #334	2012-02-09 16:19:32 -08:00
Etienne Dechamps	34037afe24	Improve ZVOL queue behavior. The Linux block device queue subsystem exposes a number of configurable settings described in Linux block/blk-settings.c. The defaults for these settings are tuned for hard drives, and are not optimized for ZVOLs. Proper configuration of these options would allow upper layers (I/O scheduler) to take better decisions about write merging and ordering. Detailed rationale: - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to handle writes of any size, so there's no reason to impose a limit. Let the upper layer decide. - max_segments and max_segment_size are set to unlimited. zvol_write() will copy the requests' contents into a dbuf anyway, so the number and size of the segments are irrelevant. Let the upper layer decide. - physical_block_size and io_opt are set to the ZVOL's block size. This has the potential to somewhat alleviate issue #361 for ZVOLs, by warning the upper layers that writes smaller than the volume's block size will be slow. - The NONROT flag is set to indicate this isn't a rotational device. Although the backing zpool might be composed of rotational devices, the resulting ZVOL often doesn't exhibit the same behavior due to the COW mechanisms used by ZFS. Setting this flag will prevent upper layers from making useless decisions (such as reordering writes) based on incorrect assumptions about the behavior of the ZVOL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	b18019d2d8	Fix synchronicity for ZVOLs. zvol_write() assumes that the write request must be written to stable storage if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed, "sync" does not mean what we think it means in the context of the Linux block layer. This is well explained in linux/fs.h: WRITE: A normal async write. Device will be plugged. WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down the hint that someone will be waiting on this IO shortly. WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush. WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on non-volatile media on completion. In other words, SYNC does not mean that the write must be on stable storage on completion. It just means that someone is waiting on us to complete the write request. Thus triggering a ZIL commit for each SYNC write request on a ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL users have no way to express that they actually want data to be written to stable storage, which means the ZIL is broken for ZVOLs. The request for stable storage is expressed by the FUA flag, so we must commit the ZIL after the write if the FUA flag is set. In addition, we must commit the ZIL before the write if the FLUSH flag is set. Also, we must inform the block layer that we actually support FLUSH and FUA. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Brian Behlendorf	47621f3d76	Linux 3.3 compat, sops->show_options() The second argument of sops->show_options() was changed from a 'struct vfsmount ' to a 'struct dentry '. Add an autoconf check to detect the API change and then conditionally define the expected interface. In either case we are only interested in the zfs_sb_t. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #549	2012-02-03 10:02:01 -08:00
Brian Behlendorf	b40a77aefc	Add the release component to headers When the original build system code was added the release component was accidentally omited from the development header install path. This patch adds the missing path component so it's always clear exactly what release your compiling against. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-01-18 12:19:47 -08:00
Prakash Surya	58d956b085	Run ZFS_AC_PACMAN only if $VENDOR is "arch" Unfortunately, Arch's package manager `pacman` shares it's name with a popular arcade video game. Thus, in order to refrain from executing the video game when we mean to execute the package manager, ZFS_AC_PACMAN is now only run when $VENDOR is determined to be "arch". Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #517	2012-01-13 09:03:11 -08:00
Brian Behlendorf	166dd49de0	Linux 3.2 compat, security_inode_init_security() The security_inode_init_security() API has been changed to include a filesystem specific callback to write security extended attributes. This was done to support the initialization of multiple LSM xattrs and the EVM xattr. This change updates the code to use the new API when it's available. Otherwise it falls back to the previous implementation. In addition, the ZFS_AC_KERNEL_6ARGS_SECURITY_INODE_INIT_SECURITY autoconf test has been made more rigerous by passing the expected types. This is done to ensure we always properly the detect the correct form for the security_inode_init_security() API. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #516	2012-01-12 15:06:39 -08:00
Brian Behlendorf	ab26409db7	Linux 3.1 compat, super_block->s_shrink The Linux 3.1 kernel has introduced the concept of per-filesystem shrinkers which are directly assoicated with a super block. Prior to this change there was one shared global shrinker. The zfs code relied on being able to call the global shrinker when the arc_meta_limit was exceeded. This would cause the VFS to drop references on a fraction of the dentries in the dcache. The ARC could then safely reclaim the memory used by these entries and honor the arc_meta_limit. Unfortunately, when per-filesystem shrinkers were added the old interfaces were made unavailable. This change adds support to use the new per-filesystem shrinker interface so we can continue to honor the arc_meta_limit. The major benefit of the new interface is that we can now target only the zfs filesystem for dentry and inode pruning. Thus we can minimize any impact on the caching of other filesystems. In the context of making this change several other important issues related to managing the ARC were addressed, they include: * The dnlc_reduce_cache() function which was called by the ARC to drop dentries for the Posix layer was replaced with a generic zfs_prune_t callback. The ZPL layer now registers a callback to drop these dentries removing a layering violation which dates back to the Solaris code. This callback can also be used by other ARC consumers such as Lustre. arc_add_prune_callback() arc_remove_prune_callback() * The arc_reduce_dnlc_percent module option has been changed to arc_meta_prune for clarity. The dnlc functions are specific to Solaris's VFS and have already been largely eliminated already. The replacement tunable now represents the number of bytes the prune callback will request when invoked. * Less aggressively invoke the prune callback. We used to call this whenever we exceeded the arc_meta_limit however that's not strictly correct since it results in over zeleous reclaim of dentries and inodes. It is now only called once the arc_meta_limit is exceeded and every effort has been made to evict other data from the ARC cache. * More promptly manage exceeding the arc_meta_limit. When reading meta data in to the cache if a buffer was unable to be recycled notify the arc_reclaim thread to invoke the required prune. * Added arcstat_prune kstat which is incremented when the ARC is forced to request that a consumer prune its cache. Remember this will only occur when the ARC has no other choice. If it can evict buffers safely without invoking the prune callback it will. * This change is also expected to resolve the unexpect collapses of the ARC cache. This would occur because when exceeded just the arc_meta_limit reclaim presure would be excerted on the arc_c value via arc_shrink(). This effectively shrunk the entire cache when really we just needed to reclaim meta data. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #466 Closes #292	2012-01-11 11:46:02 -08:00
Prakash Surya	8eaa020b46	Move Arch Linux's VENDOR check above Ubuntu's If the lsb-release package is installed on an Arch Linux distribution, the configure step will incorrectly detect the running distribution as Ubuntu. This is a result of both distributions providing an /etc/lsb-release file, and the Ubuntu VENDOR check being performed first. Since the Arch Linux test check's for a file more specific to the Arch Linux distribution, moving Arch Linux's VENDOR check above Unbuntu's check provides a quick and easy solution. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-12-19 12:05:10 -08:00
Darik Horn	28eb9213d8	Linux 3.2 compat: set_nlink() Directly changing inode->i_nlink is deprecated in Linux 3.2 by commit SHA: bfe8684869601dacfcb2cd69ef8cfd9045f62170 Use the new set_nlink() kernel function instead. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #462	2011-12-16 20:02:52 -08:00
Prakash Surya	6ba3b44614	Add make rule for building Arch Linux packages Added the necessary build infrastructure for building packages compatible with the Arch Linux distribution. As such, one can now run: $ ./configure $ make pkg # Alternatively, one can run 'make arch' as well on the Arch Linux machine to create two binary packages compatible with the pacman package manager, one for the zfs userland utilities and another for the zfs kernel modules. The new packages can then be installed by running: # pacman -U $package.pkg.tar.xz In addition, source-only packages suitable for an Arch Linux chroot environment or remote builder can also be build using the 'sarch' make rule. NOTE: Since the source dist tarball is created on the fly from the head of the build tree, it's MD5 hash signature will be continually influx. As a result, the md5sum variable was intentionally omitted from the PKGBUILD files, and the '--skipinteg' makepkg option is used. This may or may not have any serious security implications, as the source tarball is not being downloaded from an outside source. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #491	2011-12-14 19:14:23 -08:00
Prakash Surya	b9c59ec83a	Fix configure tests to play nice with GCC 4.6 As of GCC 4.6, specific kernel 2.6.32 header files do not compile cleanly without warnings. One specific example of this is the arch/x86/include/asm/percpu.h file. Thus, a few of the configure tests were getting hung up on this and the '-Wno-unsued-but-set-variables' compile option had to be introduced. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #459	2011-11-29 16:14:25 -08:00
Prakash Surya	e89236fd28	In autoconf v2.68, AC_LANG_PROGRAM must be quoted This change updates the AC_LANG_PROGRAM autoconf macro invocations to be wrapped in quotes. As of autoconf version 2.68, the quotes are necessary to prevent warnings from appearing. Specifically, the autoconf v2.68 Forward Porting Notes specifies: It is important to note that you need to ensure that the call to AC_LANG_SOURCE is quoted and not expanded, otherwise that will cause the warning to appear nonetheless. Finally, because of the additional quoting we can drop the extra quotas used by the ZFS_AC_CONFIG_USER_STACK_GUARD autoconf check. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #464	2011-11-28 11:16:33 -08:00
Brian Behlendorf	adcd70bd1a	Linux 3.1 compat, fops->fsync() The Linux 3.1 kernel updated the fops->fsync() callback yet again. They now pass the requested range and delegate the responsibility for calling filemap_write_and_wait_range() to the callback. In addition imutex is no longer held by the caller and the callback is responsible for taking the lock if required. This commit updates the code to provide a zpl_fsync() function for the updated API. Implementations for the previous two APIs are also maintained for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #445	2011-11-10 10:03:08 -08:00
Brian Behlendorf	8c19f5b407	Suppress packaging warning Only under Ubuntu Lucid the rpm packaging step mistakenly adds the following files twice to the package because of the /lib naming convention. This is harmless but results in a warning which the buildot flags as a failure. Suppress this warning. warning: File listed twice: /lib/udev/rules.d warning: File listed twice: /lib/udev/rules.d/60-zpool.rules warning: File listed twice: /lib/udev/rules.d/60-zvol.rules warning: File listed twice: /lib/udev/rules.d/90-zfs.rules warning: File listed twice: /lib/udev/sas_switch_id warning: File listed twice: /lib/udev/zpool_id warning: File listed twice: /lib/udev/zvol_id Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-11-08 11:32:04 -08:00
Brian Behlendorf	5547c2f1bf	Simplify BDI integration Update the code to use the bdi_setup_and_register() helper to simplify the bdi integration code. The updated code now just registers the bdi during mount and destroys it during unmount. The only complication is that for 2.6.32 - 2.6.33 kernels the helper wasn't available so in these cases the zfs code must provide it. Luckily the bdi_setup_and_register() function is trivial. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #367	2011-11-08 10:19:03 -08:00
Prakash Surya	8366cd6a83	Convert 'if' statements to AS_IF in kernel.m4 The 'if' statements found in kernel.m4 were converted to use the portable alternative provided by autoconf, the AS_IF macro. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-06 13:20:48 -07:00
Prakash Surya	2984e0bb0c	Fix minor autoconf error message inconsistencies A few of the autoconf error messages were inconsistent with the rest of the build system. To be specific, the inconsistencies addressed by this commit are the following: * The second line of the error message for the CONFIG_PREEMPT check was missing it's third asterisk. * A few of the error messages were prefixed by two tabs, whereas the majority of error messages are only prefixed by a single tab. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-06 13:20:04 -07:00
Brian Behlendorf	9c4f40b894	Buildbot suppression rules The warnings listed in the suppression file will be suppressed and not flagged during regular buildbot builds. These warnings are expected, harmless, and can obscure real issues unless they are suppressed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-08-19 16:26:06 -07:00
Brian Behlendorf	ddd052aa83	Improve HAVE_EVICT_INODE check The hardened gentoo kernel defines all of the super block operation callbacks as const. This prevents the autoconf test from assigning the callback and results in a false negative. By moving the assignment in to the declaration we can avoid this issue and get a correct result for this patched kernel. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #296	2011-08-08 16:42:09 -07:00
Kyle Fuller	12d06bac9b	Move udev rules from /etc/udev to /lib/udev This change moves the default install location for the zfs udev rules from /etc/udev/ to /lib/udev/. The correct convention is for rules provided by a package to be installed in /lib/udev/. The /etc/udev/ directory is reserved for custom rules or local overrides. Additionally, this patch cleans up some abuse of the bindir install location by adding a udevdir and udevruledir install directories. This allows us to revert to the default bin install location. The udev install directories can be set with the following new options. --with-udevdir=DIR install udev helpers [EPREFIX/lib/udev] --with-udevruledir=DIR install udev rules [UDEVDIR/rules.d] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #356	2011-08-08 16:21:10 -07:00
Brian Behlendorf	76659dc110	Add backing_device_info per-filesystem For a long time now the kernel has been moving away from using the pdflush daemon to write 'old' dirty pages to disk. The primary reason for this is because the pdflush daemon is single threaded and can be a limiting factor for performance. Since pdflush sequentially walks the dirty inode list for each super block any delay in processing can slow down dirty page writeback for all filesystems. The replacement for pdflush is called bdi (backing device info). The bdi system involves creating a per-filesystem control structure each with its own private sets of queues to manage writeback. The advantage is greater parallelism which improves performance and prevents a single filesystem from slowing writeback to the others. For a long time both systems co-existed in the kernel so it wasn't strictly required to implement the bdi scheme. However, as of Linux 2.6.36 kernels the pdflush functionality has been retired. Since ZFS already bypasses the page cache for most I/O this is only an issue for mmap(2) writes which must go through the page cache. Even then adding this missing support for newer kernels was overlooked because there are other mechanisms which can trigger writeback. However, there is one critical case where not implementing the bdi functionality can cause problems. If an application handles a page fault it can enter the balance_dirty_pages() callpath. This will result in the application hanging until the number of dirty pages in the system drops below the dirty ratio. Without a registered backing_device_info for the filesystem the dirty pages will not get written out. Thus the application will hang. As mentioned above this was less of an issue with older kernels because pdflush would eventually write out the dirty pages. This change adds a backing_device_info structure to the zfs_sb_t which is already allocated per-super block. It is then registered when the filesystem mounted and unregistered on unmount. It will not be registered for mounted snapshots which are read-only. This change will result in flush-<pool> thread being dynamically created and destroyed per-mounted filesystem for writeback. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #174	2011-08-04 13:37:38 -07:00
Brian Behlendorf	0da7869690	Fix the configure CONFIG_* option detection The latest kernels no longer define AUTOCONF_INCLUDED which was being used to detect the new style autoconf.h kernel configure options. This results in the CONFIG_* checks always failing incorrectly for newer kernels. The fix for this is a simplification of the testing method. Rather than attempting to explicitly include to renamed config header. It is simpler to unconditionally include <linux/module.h> which must pick up the correctly named header. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #320	2011-07-22 15:07:16 -07:00
Kyle Fuller	615ab66d18	Provide a rc.d script for archlinux Unlike most other Linux distributions archlinux installs its init scripts in /etc/rc.d insead of /etc/init.d. This commit provides an archlinux rc.d script for zfs and extends the build infrastructure to ensure it get's installed in the correct place. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #322	2011-07-11 14:12:23 -07:00
Brian Behlendorf	2cf7f52bc4	Linux compat 2.6.39: mount_nodev() The .get_sb callback has been replaced by a .mount callback in the file_system_type structure. When using the new interface the caller must now use the mount_nodev() helper. Unfortunately, the new interface no longer passes the vfsmount down to the zfs layers. This poses a problem for the existing implementation because we currently save this pointer in the super block for latter use. It provides our only entry point in to the namespace layer for manipulating certain mount options. This needed to be done originally to allow commands like 'zfs set atime=off tank' to work properly. It also allowed me to keep more of the original Solaris code unmodified. Under Solaris there is a 1-to-1 mapping between a mount point and a file system so this is a fairly natural thing to do. However, under Linux they many be multiple entries in the namespace which reference the same filesystem. Thus keeping a back reference from the filesystem to the namespace is complicated. Rather than introduce some ugly hack to get the vfsmount and continue as before. I'm leveraging this API change to update the ZFS code to do things in a more natural way for Linux. This has the upside that is resolves the compatibility issue for the long term and fixes several other minor bugs which have been reported. This commit updates the code to remove this vfsmount back reference entirely. All modifications to filesystem mount options are now passed in to the kernel via a '-o remount'. This is the expected Linux mechanism and allows the namespace to properly handle any options which apply to it before passing them on to the file system itself. Aside from fixing the compatibility issue, removing the vfsmount has had the benefit of simplifying the code. This change which fairly involved has turned out nicely. Closes #246 Closes #217 Closes #187 Closes #248 Closes #231	2011-07-01 13:36:39 -07:00
Brian Behlendorf	5c03efc379	Linux compat 2.6.39: security_inode_init_security() The security_inode_init_security() function now takes an additional qstr argument which must be passed in from the dentry if available. Passing a NULL is safe when no qstr is available the relevant security checks will just be skipped. Closes #246 Closes #217 Closes #187	2011-07-01 12:40:08 -07:00

1 2 3 4 5 ...

291 Commits