Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Gvozden Neskovic	fc0c72b167	Support for vectorized algorithms on x86 This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #4381	2016-03-21 09:24:34 -07:00
Brian Behlendorf	6bb24f4dc7	Add the ZFS Test Suite Add the ZFS Test Suite and test-runner framework from illumos. This is a continuation of the work done by Turbo Fredriksson to port the ZFS Test Suite to Linux. While this work was originally conceived as a stand alone project integrating it directly with the ZoL source tree has several advantages: * Allows the ZFS Test Suite to be packaged in zfs-test package. * Facilitates easy integration with the CI testing. * Users can locally run the ZFS Test Suite to validate ZFS. This testing should ONLY be done on a dedicated test system because the ZFS Test Suite in its current form is destructive. * Allows the ZFS Test Suite to be run directly in the ZoL source tree enabled developers to iterate quickly during development. * Developers can easily add/modify tests in the framework as features are added or functionality is changed. The tests will then always be in sync with the implementation. Full documentation for how to run the ZFS Test Suite is available in the tests/README.md file. Warning: This test suite is designed to be run on a dedicated test system. It will make modifications to the system including, but not limited to, the following. * Adding new users * Adding new groups * Modifying the following /proc files: * /proc/sys/kernel/core_pattern * /proc/sys/kernel/core_uses_pid * Creating directories under / Notes: * Not all of the test cases are expected to pass and by default these test cases are disabled. The failures are primarily due to assumption made for illumos which are invalid under Linux. * When updating these test cases it should be done in as generic a way as possible so the patch can be submitted back upstream. Most existing library functions have been updated to be Linux aware, and the following functions and variables have been added. * Functions: * is_linux - Used to wrap a Linux specific section. * block_device_wait - Waits for block devices to be added to /dev/. * Variables: Linux Illumos * ZVOL_DEVDIR "/dev/zvol" "/dev/zvol/dsk" * ZVOL_RDEVDIR "/dev/zvol" "/dev/zvol/rdsk" * DEV_DSKDIR "/dev" "/dev/dsk" * DEV_RDSKDIR "/dev" "/dev/rdsk" * NEWFS_DEFAULT_FS "ext2" "ufs" * Many of the disabled test cases fail because 'zfs/zpool destroy' returns EBUSY. This is largely causes by the asynchronous nature of device handling on Linux and is expected, the impacted test cases will need to be updated to handle this. * There are several test cases which have been disabled because they can trigger a deadlock. A primary example of this is to recursively create zpools within zpools. These tests have been disabled until the root issue can be addressed. * Illumos specific utilities such as (mkfile) should be added to the tests/zfs-tests/cmd/ directory. Custom programs required by the test scripts can also be added here. * SELinux should be either is permissive mode or disabled when running the tests. The test cases should be updated to conform to a standard policy. * Redundant test functionality has been removed (zfault.sh). * Existing test scripts (zconfig.sh) should be migrated to use the framework for consistency and ease of testing. * The DISKS environment variable currently only supports loopback devices because of how the ZFS Test Suite expects partitions to be named (p1, p2, etc). Support must be added to generate the correct partition name based on the device location and name. * The ZFS Test Suite is part of the illumos code base at: https://github.com/illumos/illumos-gate/tree/master/usr/src/test Original-patch-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes #6 Closes #1534	2016-03-16 13:46:16 -07:00
Boris Protopopov	a0bd735adb	Add support for asynchronous zvol minor operations zfsonlinux issue #2217 - zvol minor operations: check snapdev property before traversing snapshots of a dataset zfsonlinux issue #3681 - lock order inversion between zvol_open() and dsl_pool_sync()...zvol_rename_minors() Create a per-pool zvol taskq for asynchronous zvol tasks. There are a few key design decisions to be aware of. * Each taskq must be single threaded to ensure tasks are always processed in the order in which they were dispatched. * There is a taskq per-pool in order to keep the pools independent. This way if one pool is suspended it will not impact another. * The preferred location to dispatch a zvol minor task is a sync task. In this context there is easy access to the spa_t and minimal error handling is required because the sync task must succeed. Support for asynchronous zvol minor operations address issue #3681. Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2217 Closes #3678 Closes #3681	2016-03-10 09:49:22 -08:00
Thijs Cramer	95003f7098	Updated paths to scan when importing zpool(s) Added by-partlabel and by-partuuid to the default device search path. Made made device names in by-label more preferable. Signed-off-by: Thijs Cramer <thijs.cramer@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3892	2016-03-09 10:41:23 -08:00
Brian Behlendorf	7d11e37e55	Require libblkid Historically libblkid support was detected as part of configure and optionally enabled. This was done because at the time support for detecting ZFS pool vdevs had just be added to libblkid and those updated packages were not yet part of many distributions. This is no longer the case and any reasonably current distribution will ship a version of libblkid which can detect ZFS pool vdevs. This patch makes libblkid mandatory at build time and libblkid the preferred method of scanning for ZFS pools. For distributions which include a modern version of libblkid there is no change in behavior. Explicitly scanning the default search paths is still supported and can be enabled with the '-s' command line option. Additionally making libblkid mandatory means that the 'zpool create' command can reliably detect if a specified device has an existing non-ZFS filesystem (ext4, xfs) and print a warning. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2448	2016-03-09 10:39:22 -08:00
smh	9f500936c8	FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information. The existing algorithm selects a preferred leaf vdev based on offset of the zio request modulo the number of members in the mirror. It assumes the devices are of equal performance and that spreading the requests randomly over both drives will be sufficient to saturate them. In practice this results in the leaf vdevs being under utilized. The new algorithm takes into the following additional factors: * Load of the vdevs (number outstanding I/O requests) * The locality of last queued I/O vs the new I/O request. Within the locality calculation additional knowledge about the underlying vdev is considered such as; is the device backing the vdev a rotating media device. This results in performance increases across the board as well as significant increases for predominantly streaming loads and for configurations which don't have evenly performing devices. The following are results from a setup with 3 Way Mirror with 2 x HD's and 1 x SSD from a basic test running multiple parrallel dd's. With pre-fetch disabled (vfs.zfs.prefetch_disable=1): == Stripe Balanced (default) == Read 15360MB using bs: 1048576, readers: 3, took 161 seconds @ 95 MB/s == Load Balanced (zfslinux) == Read 15360MB using bs: 1048576, readers: 3, took 297 seconds @ 51 MB/s == Load Balanced (locality freebsd) == Read 15360MB using bs: 1048576, readers: 3, took 54 seconds @ 284 MB/s With pre-fetch enabled (vfs.zfs.prefetch_disable=0): == Stripe Balanced (default) == Read 15360MB using bs: 1048576, readers: 3, took 91 seconds @ 168 MB/s == Load Balanced (zfslinux) == Read 15360MB using bs: 1048576, readers: 3, took 108 seconds @ 142 MB/s == Load Balanced (locality freebsd) == Read 15360MB using bs: 1048576, readers: 3, took 48 seconds @ 320 MB/s In addition to the performance changes the code was also restructured, with the help of Justin Gibbs, to provide a more logical flow which also ensures vdevs loads are only calculated from the set of valid candidates. The following additional sysctls where added to allow the administrator to tune the behaviour of the load algorithm: * vfs.zfs.vdev.mirror.rotating_inc * vfs.zfs.vdev.mirror.rotating_seek_inc * vfs.zfs.vdev.mirror.rotating_seek_offset * vfs.zfs.vdev.mirror.non_rotating_inc * vfs.zfs.vdev.mirror.non_rotating_seek_inc These changes where based on work started by the zfsonlinux developers: https://github.com/zfsonlinux/zfs/pull/1487 Reviewed by: gibbs, mav, will MFC after: 2 weeks Sponsored by: Multiplay References: https://github.com/freebsd/freebsd@5c7a6f5d https://github.com/freebsd/freebsd@31b7f68d https://github.com/freebsd/freebsd@e186f564 Performance Testing: https://github.com/zfsonlinux/zfs/pull/4334#issuecomment-189057141 Porting notes: - The tunables were adjusted to have ZoL-style names. - The code was modified to use ZoL's vd_nonrot. - Fixes were done to make cstyle.pl happy - Merge conflicts were handled manually - freebsd/freebsd@e186f564bc by my collegue Andriy Gapon has been included. It applied perfectly, but added a cstyle regression. - This replaces `556011dbec` entirely. - A typo "IO'a" has been corrected to say "IO's" - Descriptions of new tunables were added to man/man5/zfs-module-parameters.5. Ported-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4334	2016-02-26 11:24:35 -08:00
Richard Yao	d2f3e292dc	Add -gLp to zpool subcommands for alt vdev names The following options have been added to the zpool add, iostat, list, status, and split subcommands. The default behavior was not modified, from zfs(8). -g Display vdev GUIDs instead of the normal short device names. These GUIDs can be used in-place of device names for the zpool detach/off‐ line/remove/replace commands. -L Display real paths for vdevs resolving all symbolic links. This can be used to lookup the current block device name regardless of the /dev/disk/ path used to open it. -p Display full paths for vdevs instead of only the last component of the path. This can be used in conjunction with the -L flag. This behavior may also be enabled using the following environment variables. ZPOOL_VDEV_NAME_GUID ZPOOL_VDEV_NAME_FOLLOW_LINKS ZPOOL_VDEV_NAME_PATH This change is based on worked originally started by Richard Yao to add a -g option. Then extended by @ilovezfs to add a -L option for openzfsonosx. Those changes have been merged, re-factored, a -p option added and extended to all relevant zpool subcommands. Original-patch-by: Richard Yao <ryao@gentoo.org> Extended-by: ilovezfs <ilovezfs@icloud.com> Extended-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: ilovezfs <ilovezfs@icloud.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2011 Closes #4341	2016-02-25 11:58:39 -08:00
Brian Behlendorf	b6fcb792ca	Illumos 6414 - vdev_config_sync could be simpler 6414 vdev_config_sync could be simpler Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> References: https://www.illumos.org/issues/6414 https://github.com/illumos/illumos-gate/commit/eb5bb58 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com>	2016-01-28 12:44:39 -05:00
Brian Behlendorf	519129ff4e	Illumos 6815179, 6844191 6815179 zpool import with a large number of LUNs is too slow 6844191 zpool import, scanning of disks should be multi-threaded References: https://github.com/illumos/illumos-gate/commit/4f67d75 Porting notes: - This change was originally never ported to Linux due to it dependence on the thread pool interface. This patch solves that issue by switching the code to use the existing taskq implementation which provides the same basic functionality. However, in order for this to work properly thread_init() and thread_fini() must be called around to taskq consumer to perform the needed thread initialization. - The check_one_slice, nozpool_all_slices, and check_slices functions have been disabled for Linux. They are difficult, but possible, to implement for Linux due to how partitions are get names. Since this is only an optimization this code can be added at a latter date. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-22 09:39:46 -08:00
Matthew Ahrens	19d55079ae	Illumos 4950 - files sometimes can't be removed from a full filesystem 4950 files sometimes can't be removed from a full filesystem Reviewed by: Adam Leventhal <adam.leventhal@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: Boris Protopopov <bprotopopov@hotmail.com> Approved by: Dan McDonald <danmcd@omniti.com> References: https://www.illumos.org/issues/4950 https://github.com/illumos/illumos-gate/commit/4bb7380 Porting notes: - ZoL currently does not log discards to zvols, so the portion of this patch that modifies the discard logging to mark it as freeing space has been discarded. 2. may_delete_now had been removed from zfs_remove() in ZoL. It has been reintroduced. 3. We do not try to emulate vnodes, so the following lines are not valid on Linux: mutex_enter(&vp->v_lock); may_delete_now = vp->v_count == 1 && !vn_has_cached_data(vp); mutex_exit(&vp->v_lock); This has been replaced with: mutex_enter(&zp->z_lock); may_delete_now = atomic_read(&ip->i_count) == 1 && !(zp->z_is_mapped); mutex_exit(&zp->z_lock); Ported-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-21 16:59:30 -08:00
Brian Behlendorf	4967a3eb9d	Linux 4.5 compat: xattr list handler The registered xattr .list handler was simplified in the 4.5 kernel to only perform a permission check. Given a dentry for the file it must return a boolean indicating if the name is visible. This differs slightly from the previous APIs which also required the function to copy the name in to the provided list and return its size. That is now all the responsibility of the caller. This should be straight forward change to make to ZoL since we've always required the caller to make the copy. However, this was slightly complicated by the need to support 3 older APIs. Yes, between 2.6.32 and 4.5 there are 4 versions of this interface! Therefore, while the functional change in this patch is small it includes significant cleanup to make the code understandable and maintainable. These changes include: - Improved configure checks for .list, .get, and .set interfaces. - Interfaces checked from newest to oldest. - Strict checking for each possible known interface. - Configure fails when no known interface is available. - HAVE__XATTR_LIST renamed HAVE_XATTR_LIST_ for consistency with similar iops and fops configure checks. - POSIX_ACL_XATTR_{DEFAULT\|ACCESS} were removed forcing callers to move to their replacements, XATTR_NAME_POSIX_ACL_{DEFAULT\|ACCESS}. Compatibility wrapper were added for old kernels. - ZPL_XATTR_LIST_WRAPPER added which behaves the same as the existing ZPL_XATTR_{GET\|SET} WRAPPERs. Only the inode is guaranteed to be a valid pointer, passing NULL for the 'list' and 'name' variables is allowed and must be checked for. All .list functions were updated to use the wrapper to aid readability. - zpl_xattr_filldir() updated to use the .list function for its permission check which is consistent with the updated Linux 4.5 interface. If a .list function is registered it should return 0 to indicate a name should be skipped, if there is no registered function the name will be added. - Additional documentation from xattr(7) describing the correct behavior for each namespace was added before the relevant handlers. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Issue #4228	2016-01-20 11:36:56 -08:00
Josef 'Jeff' Sipek	bc89ac8479	Illumos 5045 - use atomic_{inc,dec}_* instead of atomic_add_* 5045 use atomic_{inc,dec}_* instead of atomic_add_* Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Robert Mustacchi <rm@joyent.com> References: https://www.illumos.org/issues/5045 https://github.com/illumos/illumos-gate/commit/1a5e258 Porting notes: - All changes to non-ZFS files dropped. - Changes to zfs_vfsops.c dropped because they were Illumos specific. Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4220	2016-01-15 15:38:36 -08:00
Joe Stein	82f6f6e654	Illumos 6298 - zfs_create_008_neg and zpool_create_023_neg 6298 zfs_create_008_neg and zpool_create_023_neg need to be updated for large block support Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> References: https://www.illumos.org/issues/6298 https://github.com/illumos/illumos-gate/commit/e9316f7 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4217	2016-01-15 15:38:35 -08:00
Richard Yao	546f38433a	SET_ERROR should print strings When debugging with tracepoints, we see string pointers: zfs 3017 [006] 8878.728915: zfs:zfs_set__error: ffffffffa0eec3fc:3013:ffffffffa0ebcd60(): error 0x2 ffffffffa0e1ca43 spa_open_common (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa0e1cbe3 spa_open (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa0e6f6ef zfs_ioc_stable (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa0e6f2a9 zfsdev_ioctl (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffff811909dd do_vfs_ioctl ([kernel.kallsyms]) ffffffff81190c41 sys_ioctl ([kernel.kallsyms]) ffffffff8156e2e9 system_call_fastpath ([kernel.kallsyms]) 7ff7d8be69c7 __GI___ioctl (/lib64/libc-2.19.so) 7ff7d90cac53 lzc_ioctl.constprop.3 (/lib64/libzfs_core.so.1.0.0) 636f695f637a6c00 [unknown] ([unknown]) Printing the actual strings is more convenient: zfs 3461 [001] 10599.847692: zfs:zfs_set__error: spa.c:3013:spa_open_common(): error 0x2 ffffffffa116ba43 spa_open_common (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa116bbe3 spa_open (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa11be8df zfs_ioc_stable (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffffa11be499 zfsdev_ioctl (/lib/modules/3.12.21-gentoo-r1/extra/zfs/zfs.ko) ffffffff811909dd do_vfs_ioctl ([kernel.kallsyms]) ffffffff81190c41 sys_ioctl ([kernel.kallsyms]) ffffffff8156e2e9 system_call_fastpath ([kernel.kallsyms]) 7f11b843c9c7 __GI___ioctl (/lib64/libc-2.19.so) 7f11b8920c53 lzc_ioctl.constprop.3 (/lib64/libzfs_core.so.1.0.0) 636f695f637a6c00 [unknown] ([unknown]) A few other tracepoints have strings as well, so switch to printing the actual string values at the same time. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4212	2016-01-15 15:38:35 -08:00
Richard Yao	b10695c8f1	Remove fastwrite mutex The fast write mutex is intended to protect accounting, but it is redundant because all accounting is performed through atomic operations. It also serializes all metaslab IO behind a mutex, which introduces a theoretical scaling regression that the Illumos developers did not like when we showed this to them. Removing it makes the selection of the metaslab_group lock free as it is on Illumos. The selection is not quite the same without the lock because the loop races with IO completions, but any imbalances caused by this are likely to be corrected by subsequent metaslab group selections. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3643	2016-01-15 15:38:35 -08:00
Brian Behlendorf	c96c36fa22	Fix zsb->z_hold_mtx deadlock The zfs_znode_hold_enter() / zfs_znode_hold_exit() functions are used to serialize access to a znode and its SA buffer while the object is being created or destroyed. This kind of locking would normally reside in the znode itself but in this case that's impossible because the znode and SA buffer may not yet exist. Therefore the locking is handled externally with an array of mutexs and AVLs trees which contain per-object locks. In zfs_znode_hold_enter() a per-object lock is created as needed, inserted in to the correct AVL tree and finally the per-object lock is held. In zfs_znode_hold_exit() the process is reversed. The per-object lock is released, removed from the AVL tree and destroyed if there are no waiters. This scheme has two important properties: 1) No memory allocations are performed while holding one of the z_hold_locks. This ensures evict(), which can be called from direct memory reclaim, will never block waiting on a z_hold_locks which just happens to have hashed to the same index. 2) All locks used to serialize access to an object are per-object and never shared. This minimizes lock contention without creating a large number of dedicated locks. On the downside it does require znode_lock_t structures to be frequently allocated and freed. However, because these are backed by a kmem cache and very short lived this cost is minimal. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4106	2016-01-15 15:33:45 -08:00
Brian Behlendorf	0720116d4d	Add zfs_object_mutex_size module option Add a zfs_object_mutex_size module option to facilitate resizing the the per-dataset znode mutex array. Increasing this value may help make the deadlock described in #4106 less common, but this is not a proper fix. This patch is primarily to aid debugging and analysis. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Issue #4106	2016-01-15 15:33:44 -08:00
Brian Behlendorf	d21f279a94	Illumos 3465, 3466, 3467, 3468, 3470, 3473 3465 ::walk ... \| ::<dcmd> misinterprets input as symbol names 3466 ::tsd should handle missing/NULL values better 3467 mdb_ctf_vread() could be more useful 3468 mdb enhancements for zfs development 3470 ::whatis does not print callers from KMF_LITE 3473 mdb_get_module() returns wrong module Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Dan McDonald <danmcd@nexenta.com> References: https://www.illumos.org/issues/3468 https://github.com/illumos/illumos-gate/commit/28e4da2 Porting notes: - The only portion of this patch which applies to ZoL is a small change to types used in the refcount structure. Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4216	2016-01-13 13:56:27 -08:00
Brian Behlendorf	89666a8e1c	Increase default user space stack size Under RHEL6/CentOS6 the default stack size must be increased to 32K to prevent overflowing the stack when running ztest. This isn't an issue for other distributions due to either the version of pthreads or perhaps the compiler. Doubling the stack size resolves the issue safely for all distribution and leaves us some headroom. $ sudo -E ztest -V -T 300 -f /var/tmp/ 5 vdevs, 7 datasets, 23 threads, 300 seconds... loading space map for vdev 0 of 1, metaslab 0 of 30 ... ... loading space map for vdev 0 of 1, metaslab 14 of 30 ... child died with signal 11 Exited ztest with error 3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4215	2016-01-13 13:55:12 -08:00
Will Andrews	e6cfd633be	Illumos 3749 - zfs event processing should work on R/O root filesystems 3749 zfs event processing should work on R/O root filesystems Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Approved by: Christopher Siden <christopher.siden@delphix.com> References: https://www.illumos.org/issues/3749 https://github.com/illumos/illumos-gate/commit/3cb69f7 Porting notes: - [include/sys/spa_impl.h] - `ffe9d38` Add generic errata infrastructure - `1421c89` Add visibility in to arc_read - [include/sys/fm/fs/zfs.h] - `2668527` Add linux events - `6283f55` Support custom build directories and move includes - [module/zfs/spa_config.c] - Updated spa_config_sync() to match illumos with the exception of a Linux specific block. Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-12 14:42:32 -08:00
Matthew Ahrens	d25b4497b7	Illumos 5141 - zfs minimum indirect block size is 4K 5141 zfs minimum indirect block size is 4K Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Richard Elling <richard.elling@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> References: https://www.illumos.org/issues/5141 https://github.com/illumos/illumos-gate/commit/e94f268 Porting notes: - GRUB -- GRand Unified Bootloader change wasn't merged (not applicable) Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-12 14:11:31 -08:00
Justin T. Gibbs	0eb21616fa	Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 6171 dsl_prop_unregister() slows down dataset eviction. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> References: https://www.illumos.org/issues/6171 https://github.com/illumos/illumos-gate/commit/03bad06 Porting notes: - Conflicts - `3558fd7` Prototype/structure update for Linux - `2cf7f52` Linux compat 2.6.39: mount_nodev() - `13fe019` Illumos #3464 - `241b541` Illumos 5959 - clean up per-dataset feature count code - dsl_prop_unregister() preserved until out of tree consumers like Lustre can transition to dsl_prop_unregister_all(). - Fixing 'space or tab at end of line' in include/sys/dsl_dataset.h Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-12 10:53:12 -08:00
Matthew Ahrens	7f60329a26	Illumos 5987 - zfs prefetch code needs work 5987 zfs prefetch code needs work Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Gordon Ross <gordon.ross@nexenta.com> References: https://www.illumos.org/issues/5987 zfs prefetch code needs work illumos/illumos-gate@cf6106c 5987 zfs prefetch code needs work Porting notes: - [module/zfs/dbuf.c] - `5f6d0b6` Handle block pointers with a corrupt logical size - [module/zfs/dmu_zfetch.c] - `c65aa5b` Fix gcc missing parenthesis warnings - `428870f` Update core ZFS code from build 121 to build 141. - `79c76d5` Change KM_PUSHPAGE -> KM_SLEEP - `b8d06fc` Switch KM_SLEEP to KM_PUSHPAGE - Account for ISO C90 - mixed declarations and code - warnings - Module parameters (new/changed): - Replaced zfetch_block_cap with zfetch_max_distance (Max bytes to prefetch per stream (default 8MB; 8 * 1024 * 1024)) - Preserved zfs_prefetch_disable as 'int' for consistency with existing Linux module options. - [include/sys/trace_arc.h] - Added new tracepoints - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__sync__wait__for__async); - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__demand__hit__predictive__prefetch); - [man/man5/zfs-module-parameters.5] - Updated man page Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-12 09:02:33 -08:00
Brian Behlendorf	b870c7e5f4	Revert "Illumos 3749 - zfs event processing should work on R/O root filesystems" This reverts commit `b47637ecdc` which introduced a regression in ztest. $ ./cmd/ztest/ztest -V 5 vdevs, 7 datasets, 23 threads, 300 seconds... * Error in `/rpool/home/behlendo/src/git/zfs/cmd/ztest/.libs/lt-ztest': double free or corruption (fasttop): 0x0000000000d339f0 *	2016-01-11 14:10:30 -08:00
Matthew Ahrens	1715493f38	Illumos 4929 - want prevsnap property 4929 want prevsnap property Reviewed by: Adam Leventhal <adam.leventhal@delphix.com> Reviewed by: Matt Amdur <matt.amdur@delphix.com> Reviewed by: Saso Kiselkov <skiselkov.ml@gmail.com> Reviewed by: Boris Protopopov <bprotopopov@hotmail.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Dan McDonald <danmcd@omniti.com> References: https://www.illumos.org/issues/4929 https://github.com/illumos/illumos-gate/commit/b461c74 Porting notes: - [include/sys/fs/zfs.h] - f67d70 Create an 'overlay' property - 11b9ec Add full SELinux support - [fs/zfs/dsl_dataset.c] - This increases the stack size of dsl_dataset_stats() but nothing has been changed until this is shown to be an issue. Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-11 11:58:26 -08:00
Matthew Ahrens	9867e8be2a	Illumos 4891 - want zdb option to dump all metadata 4891 want zdb option to dump all metadata Reviewed by: Sonu Pillai <sonu.pillai@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Garrett D'Amore <garrett@damore.org> We'd like a way for zdb to dump metadata in a machine-readable format, so that we can bring that back from a customer site for in-house diagnosis. Think of it as a crash dump for zpools, which can be used for post-mortem analysis of a malfunctioning pool References: https://www.illumos.org/issues/4891 https://github.com/illumos/illumos-gate/commit/df15e41 Porting notes: - [cmd/zdb/zdb.c] - `a5778ea` zdb: Introduce -V for verbatim import - In main() getopt 'opt' variable removed and the code was brought back in line with illumos. - [lib/libzpool/kernel.c] - `1e33ac1` Fix Solaris thread dependency by using pthreads - `f0e324f` Update utsname support - `4d58b69` Fix vn_open/vn_rdwr error handling - In vn_open() allocate 'dumppath' on heap instead of stack - Properly handle 'dump_fd == -1' error path - Free 'realpath' after added vn_dumpdir_code block Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-11 11:36:54 -08:00
Will Andrews	b47637ecdc	Illumos 3749 - zfs event processing should work on R/O root filesystems 3749 zfs event processing should work on R/O root filesystems Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Approved by: Christopher Siden <christopher.siden@delphix.com> References: https://www.illumos.org/issues/3749 https://github.com/illumos/illumos-gate/commit/3cb69f7 Porting notes: - [include/sys/spa_impl.h] - `ffe9d38` Add generic errata infrastructure - `1421c89` Add visibility in to arc_read - [include/sys/fm/fs/zfs.h] - `2668527` Add linux events - `6283f55` Support custom build directories and move includes - [module/zfs/spa_config.c] - Updated spa_config_sync() to match illumos with the exception of a Linux specific block. Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-11 09:23:37 -08:00
Paul Dagnelie	fcff0f35bd	Illumos 5960, 5925 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin= Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> References: https://www.illumos.org/issues/5960 https://www.illumos.org/issues/5925 https://github.com/illumos/illumos-gate/commit/a2cdcdd Porting notes: - [lib/libzfs/libzfs_sendrecv.c] - `b8864a2` Fix gcc cast warnings - `325f023` Add linux kernel device support - `5c3f61e` Increase Linux pipe buffer size on 'zfs receive' - [module/zfs/zfs_vnops.c] - `3558fd7` Prototype/structure update for Linux - `c12e3a5` Restructure zfs_readdir() to fix regressions - [module/zfs/zvol.c] - Function @zvol_map_block() isn't needed in ZoL - `9965059` Prefetch start and end of volumes - [module/zfs/dmu.c] - Fixed ISO C90 - mixed declarations and code - Function dmu_prefetch() 'int i' is initialized before the following code block (c90 vs. c99) - [module/zfs/dbuf.c] - `fc5bb51` Fix stack dbuf_hold_impl() - `9b67f60` Illumos 4757, 4913 - `34229a2` Reduce stack usage for recursive traverse_visitbp() - [module/zfs/dmu_send.c] - Fixed ISO C90 - mixed declarations and code - `b58986e` Use large stacks when available - `241b541` Illumos 5959 - clean up per-dataset feature count code - `77aef6f` Use vmem_alloc() for nvlists - `00b4602` Add linux kernel memory support Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-01-08 15:08:19 -08:00
Matthew Ahrens	37f8a8835a	Illumos 5746 - more checksumming in zfs send 5746 more checksumming in zfs send Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Approved by: Albert Lee <trisk@omniti.com> References: https://www.illumos.org/issues/5746 https://github.com/illumos/illumos-gate/commit/98110f0 https://github.com/zfsonlinux/zfs/issues/905 Porting notes: - Minor conflicts due to: - https://github.com/zfsonlinux/zfs/commit/2024041 - https://github.com/zfsonlinux/zfs/commit/044baf0 - https://github.com/zfsonlinux/zfs/commit/88904bb - Fix ISO C90 warnings (-Werror=declaration-after-statement) - arc_buf_t abuf; - dmu_buf_t bonus; - zio_cksum_t cksum_orig; - zio_cksum_t *cksump; - Fix format '%llx' format specifier warning - Align message in zstreamdump safe_malloc() with upstream Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3611	2015-12-30 14:24:14 -08:00
Ned Bass	43b4935e53	Prevent SA length overflow The function sa_update() accepts a 32-bit length parameter and assigns it to a 16-bit field in sa_bulk_attr_t, potentially truncating the passed-in value. This could lead to corrupt system attribute (SA) records getting written to the pool. Add a VERIFY to sa_update() to detect cases where overflow would occur. The SA length is limited to 16-bit values by the on-disk format defined by sa_hdr_phys_t. The function zfs_sa_set_xattr() is vulnerable to this bug if the unpacked nvlist of xattrs is less than 64k in size but the packed size is greater than 64k. Fix this by appropriately checking the size of the packed nvlist before calling sa_update(). Add error handling to zpl_xattr_set_sa() to keep the cached list of SA-based xattrs consistent with the data on disk. Lastly, zfs_sa_set_xattr() calls dmu_tx_abort() on an assigned transaction if sa_update() returns an error, but the DMU only allows unassigned transactions to be aborted. Wrap the sa_update() call in a VERIFY0, remove the transaction abort, and call dmu_tx_commit() unconditionally. This is consistent practice with other callers of sa_update(). Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #4150	2015-12-30 13:20:12 -08:00
Chris Williamson	23de906c72	Illumos 5745 - zfs set allows only one dataset property to be set at a time 5745 zfs set allows only one dataset property to be set at a time Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Reviewed by: Richard PALO <richard@NetBSD.org> Reviewed by: Steven Hartland <killing@multiplay.co.uk> Approved by: Rich Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/5745 https://github.com/illumos/illumos-gate/commit/3092556 Porting notes: - Fix the missing braces around initializer, zfs_cmd_t zc = {"\0"}; - Remove extra format argument in zfs_do_set() - Declare at the top: - zfs_prop_t prop; - nvpair_t elem; - nvpair_t next; - int i; - Additionally initialize: - int added_resv = 0; - zfs_prop_t prop = 0; - Assign 0 install of NULL for uint64_t types. - zc->zc_nvlist_conf = '\0'; - zc->zc_nvlist_src = '\0'; - zc->zc_nvlist_dst = '\0'; Ported-by: kernelOfTruth kerneloftruth@gmail.com Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3574	2015-12-29 16:59:26 -08:00
Olaf Faaland	e0553a74ad	Add lock types RW_NOLOCKDEP and MUTEX_NOLOCKDEP Both lock types were introduced in SPL to allow some locks to be taken/released with linux lockdep turned off. See SPL commit for details. Add the new lock types to zfs_context.h to allow user space compilation. Depends on SPL commit `692ae8d` SPL pull request refs/pull/480/head Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3895	2015-12-22 10:20:29 -08:00
Brian Behlendorf	6fe53787f3	Fix vdev_queue_aggregate() deadlock This deadlock may manifest itself in slightly different ways but at the core it is caused by a memory allocation blocking on file- system reclaim in the zio pipeline. This is normally impossible because zio_execute() disables filesystem reclaim by setting PF_FSTRANS on the thread. However, kmem cache allocations may still indirectly block on file system reclaim while holding the critical vq->vq_lock as shown below. To resolve this issue zio_buf_alloc_flags() is introduced which allocation flags to be passed. This can then be used in vdev_queue_aggregate() with KM_NOSLEEP when allocating the aggregate IO buffer. Since aggregating the IO is purely a performance optimization we want this to either succeed or fail quickly. Trying too hard to allocate this memory under the vq->vq_lock can negatively impact performance and result in this deadlock. * z_wr_iss zio_vdev_io_start vdev_queue_io -> Takes vq->vq_lock vdev_queue_io_to_issue vdev_queue_aggregate zio_buf_alloc -> Waiting on spl_kmem_cache process * z_wr_int zio_vdev_io_done vdev_queue_io_done mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss * txg_sync spa_sync dsl_pool_sync zio_wait -> Waiting on zio being handled by z_wr_int * spl_kmem_cache spl_cache_grow_work kv_alloc spl_vmalloc ... evict zpl_evict_inode zfs_inactive dmu_tx_wait txg_wait_open -> Waiting on txg_sync Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #3808 Closes #3867	2015-12-18 13:27:12 -08:00
Chunwei Chen	2727b9d3b6	Use uio for zvol_{read,write} Since uio now supports bvec, we can convert bio into uio and reuse dmu_{read,write}_uio. This way, we can remove some duplicate code. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4078	2015-12-15 16:21:43 -08:00
Chunwei Chen	24ef51f660	Use spa as key besides objsetid for snapentry objsetid is not unique across pool, so using it solely as key would cause panic when automounting two snapshot on different pools with the same objsetid. We fix this by adding spa pointer as additional key. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Issue #3948 Issue #3786 Issue #3887	2015-12-08 16:38:56 -08:00
Matthew Ahrens	241b541574	Illumos 5959 - clean up per-dataset feature count code 5959 clean up per-dataset feature count code Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/5959 https://github.com/illumos/illumos-gate/commit/ca0cc39 Porting notes: illumos code doesn't check for feature_get_refcount() returning ENOTSUP (which means feature is disabled) in zdb. zfsonlinux added a check in https://github.com/zfsonlinux/zfs/commit/784652c due to #3468. The check was reintroduced here. Ported-by: Witaut Bajaryn <vitaut.bayaryn@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3965	2015-12-04 14:20:20 -08:00
Brian Behlendorf	072484504f	Add zap_prefetch() interface Provide a generic interface to prefetch ZAP entries by name. This functionality is being added for external consumers such as Lustre. It is based of the existing zap_prefetch_uint64() version which is used by the deduplication code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #4061	2015-12-04 09:39:20 -08:00
ilovezfs	917b8c5cec	Ext4's typical GPT partition type not recognized Adding additional entries to the efi conversion array will help prevent the overwriting of the GPTs of disks with in-use file systems in more cases. Most notably, this adds partition type 8300 "Linux filesystem" (0FC63DAF-8483-4772-8E79-3D69D8477DE4), which is often used for ext4 and btrfs, among others. This commit itself does nothing to address the underlying problematic behavior that check_slice() isn't called on partitions of an unrecognized type, even when they contain a currently mounted file system. The additional entries were derived from these two resources: https://en.wikipedia.org/wiki/GUID_Partition_Table http://sourceforge.net/p/gptfdisk/code/ci/master/tree/parttypes.cc Signed-off-by: ilovezfs <ilovezfs@icloud.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4016	2015-12-04 09:27:00 -08:00
Yuri Pankov	fc80384923	Illumos 934 - FreeBSD's GPT not recognized Reviewed by: Alexander Eremin <alexander.r.eremin@gmail.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: Andrew Stormont <Andrew.Stormont@nexenta.com> Reviewed by: Richard Elling <richard.elling@richardelling.com> Approved by: Gordon Ross <gwr@nexenta.com> References: https://www.illumos.org/issues/934 https://github.com/illumos/illumos-gate/commit/e21ea67 Ported-by: ilovezfs <ilovezfs@icloud.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4016	2015-12-04 09:19:40 -08:00
Richard Yao	b22e279797	Only trigger SET_ERROR tracepoint event on error Currently, the SET_ERROR tracepoint triggers regardless of whether there is an error or not. On Illumos, SET_ERROR only triggers on an actual error, which is avoids irrelevant noise. Linux 2.6.38 added support for conditional tracepoints, so we modify SET_ERROR to use them when they are avaliable for functionality equivalent to the Illumos functionality. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4043	2015-12-02 17:16:41 -08:00
Chunwei Chen	61d482f7cd	Linux 4.4 compat: xattr operations takes xattr_handler The xattr_hander->{list,get,set} were changed to take a xattr_handler, and handler_flags argument was removed and should be accessed by handler->flags. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4021	2015-12-01 16:48:25 -08:00
Justin T. Gibbs	bc4501f75a	Illumos 6267 - dn_bonus evicted too early 6267 dn_bonus evicted too early Reviewed by: Richard Yao <ryao@gentoo.org> Reviewed by: Xin LI <delphij@freebsd.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/6267 https://github.com/illumos/illumos-gate/commit/d205810 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Ported-by: Ned Bass bass6@llnl.gov Issue #3865 Issue #3443	2015-10-13 14:12:02 -07:00
Lukas Wunner	784a7fe5d9	Linux 4.3 compat: bio_end_io_t / BIO_UPTODATE Commit torvalds/linux@4246a0b63b ("block: add a bi_error field to struct bio") dropped the error argument from bio_endio in favor of newly introduced bio->bi_error. This also replaces bio->bi_flags value BIO_UPTODATE. bio_endio was a 3 argument function until Linux 2.6.24, which made it a 2 argument function, and now the prototype has changed yet again to a 1 argument function. Support for pre 2.6.24 kernels was already dropped with `37f9dac592` ("zvol processing should use struct bio") which assumed the 2 argument version in zvol_request(). Remaining code to support the 3 argument version is hereby removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Lukas Wunner <lukas@wunner.de> Issue #3799	2015-09-25 12:44:54 -07:00
Don Brady	56b3986316	Add large block support to zpios(1) benchmark As part of the large block support effort, it makes sense to add support for large blocks to zpios(1). The specifying of a zfs block size for zpios is optional and will default to 128K if the block size is not specified. `zpios ... -S size \| --blocksize size ...` This will use size ZFS blocks for each test, specified as a comma delimited list with an optional unit suffix. The supported range is powers of two from 128K through 16M. A range of block sizes can be tested as follows: `-S 128K,256K,512K,1M` Example run below (non realistic results from a VM and output abbreviated for space) ``` --regioncount=750 --regionsize=8M --chunksize=1M --offset=4K --threaddelay=0 --cleanup --human-readable --verbose --cleanup --blocksize=128K,256K,512K,1M th-cnt rg-cnt rg-sz ch-sz blksz wr-data wr-bw rd-data rd-bw --------------------------------------------------------------------- 4 750 8m 1m 128k 5g 90.06m 5g 93.37m 4 750 8m 1m 256k 5g 79.71m 5g 99.81m 4 750 8m 1m 512k 5g 42.20m 5g 93.14m 4 750 8m 1m 1m 5g 35.51m 5g 89.36m 8 750 8m 1m 128k 5g 85.49m 5g 90.81m 8 750 8m 1m 256k 5g 61.42m 5g 99.24m 8 750 8m 1m 512k 5g 49.09m 5g 108.78m 16 750 8m 1m 128k 5g 86.28m 5g 88.73m 16 750 8m 1m 256k 5g 64.34m 5g 93.47m 16 750 8m 1m 512k 5g 68.84m 5g 124.47m 16 750 8m 1m 1m 5g 53.97m 5g 97.20m --------------------------------------------------------------------- ``` Signed-off-by: Don Brady <don.brady@intel.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3795 Closes #2071	2015-09-22 09:13:20 -07:00
Ned Bass	3af56fd95f	Honor xattr=sa dataset property ZFS incorrectly uses directory-based extended attributes even when xattr=sa is specified as a dataset property or mount option. Support to honor temporary mount options including "xattr" was added in commit `0282c4137e`. There are two issues with the mount option handling: * Libzfs has historically included "xattr" in its list of default mount options. This overrides the dataset property, so the dataset is always configured to use directory-based xattrs even when the xattr dataset property is set to off or sa. Address this by removing "xattr" from the set of default mount options in libzfs. * There was no way to enable system attribute-based extended attributes using temporary mount options. Add the mount options "saxattr" and "dirxattr" which enable the xattr behavior their names suggest. This approach has the advantages of mirroring the valid xattr dataset property values and following existing conventions for mount option names. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3787	2015-09-19 14:04:14 -07:00
Arne Jansen	4e0f33ffe0	Illumos 6214 - zpools going south 6214 zpools going south Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Saso Kiselkov <skiselkov.ml@gmail.com> References: https://www.illumos.org/issues/6214 http://cr.illumos.org/~webrev/sensille/6214_zpools_going_south/ Porting Notes: Reintroduce b_compress to the l2arc_buf_hdr_t. In commit `b9541d6` the compression flags were moved to the generic b_flags in the arc_buf_hdr_t. This is a problem because l2arc_compress_buf() may manipulate the compression flags and this can only be done safely under the hash lock which is not held. See Illumos 6214 for a detailed analysis of the race. HDR_GET_COMPRESS() macro was removed from arc_buf_info(). Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3757	2015-09-11 11:14:38 -07:00
Richard Yao	8198d18ca7	Reintroduce IO accounting on zvols on Linux 3.19+ zfsonlinux/zfs@e20cd6f7a8 caused us to lose IO accounting on zvols. When I originally wrote that last year, the symbols we needed to maintain IO accounting were GPL exported, but torvalds/linux@394ffa503b provided suitable symbols for restoring this functionality 4 months later. We can call them to restore the IO accounting on Linux 3.19 and later as well as any older kernels where that patch is backported. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3741	2015-09-09 09:29:24 -07:00
Brian Behlendorf	3b36f8319d	Add dbgmsg kstat Internally ZFS keeps a small log to facilitate debugging. By default the log is disabled, to enable it set zfs_dbgmsg_enable=1. The contents of the log can be accessed by reading the /proc/spl/kstat/zfs/dbgmsg file. Writing 0 to this proc file clears the log. $ echo 1 >/sys/module/zfs/parameters/zfs_dbgmsg_enable $ echo 0 >/proc/spl/kstat/zfs/dbgmsg $ zpool import tank $ cat /proc/spl/kstat/zfs/dbgmsg 1 0 0x01 -1 0 2492357525542 2525836565501 timestamp message 1441141408 spa=tank async request task=1 1441141408 txg 70 open pool version 5000; software version 5000/5; ... 1441141409 spa=tank async request task=32 1441141409 txg 72 import pool version 5000; software version 5000/5; ... 1441141414 command: lt-zpool import tank Note the zfs_dbgmsg() and dprintf() functions are both now mapped to the same log. As mentioned above the kernel debug log can be accessed though the /proc/spl/kstat/zfs/dbgmsg kstat. For user space consumers log messages are immediately written to stdout after applying the ZFS_DEBUG environment variable. $ ZFS_DEBUG=on ./cmd/ztest/ztest -V Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ned Bass <bass6@llnl.gov> Closes #3728	2015-09-04 16:08:14 -07:00
Brian Behlendorf	e20cd6f7a8	Merge branch 'zvol' Performance improvements for zvols. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3720	2015-09-04 13:14:21 -07:00
Richard Yao	7d6e2adb4e	Remove blk_rq_bytes()/blk_rq_sectors autotools checks Signed-off-by: Richard Yao <ryao@gentoo.org>	2015-09-04 15:37:24 -04:00

1 2 3 4 5 ...

434 Commits