Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Tim Chase	39d65926c9	4.10 compat - BIO flag changes and others [bio] The req_op enum was changed to req_opf. Update the "Linux 4.8 API" autotools checks to use an int to determine whether the various REQ_OP values are defined. This should work properly on kernels >= 4.8. [bio] bio_set_op_attrs() is now an inline function and can't be detected with #ifdef. Add a configure check to determine whether bio_set_op_attrs() is defined. Move the local definition of it from vdev_disk.c to blkdev_compat.h for consistency with other related compability shims. [bio] The read/write flags and their modifiers, including WRITE_FLUSH, WRITE_FUA and WRITE_FLUSH_FUA have been removed from fs.h. Add the new bio_set_flush() compatibility wrapper to replace VDEV_WRITE_FLUSH_FUA and set the flags appropriately for each supported kernel version. [vfs] The generic_readlink() function has been made static. If .readlink in inode_operations is NULL, generic_readlink() is used. [zol typo] Completely unrelated to 4.10 compat, fix a typo in the check for REQ_OP_SECURE_ERASE so that the proper macro is defined: s/HAVE_REQ_OP_SECURE_DISCARD/HAVE_REQ_OP_SECURE_ERASE/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #5499	2017-02-03 10:25:07 -08:00
Brian Behlendorf	a57228e51c	Reorder HAVE_BIO_RW_* checks The HAVE_BIO_RW_* #ifdef's must appear before REQ_* #ifdef's in the bio_is_flush() and bio_is_discard() macros. Linux 2.6.32 era kernels defined both of values and the HAVE_BIO_RW_* must be used in this case. This resulted in a panic in zconfig test 5. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951 Closes #4959	2017-02-03 10:25:03 -08:00
Brian Behlendorf	bea68ec5bf	Remove custom root pool import code Non-Linux OpenZFS implementations require additional support to be used a root pool. This code should simply be removed to avoid confusion and improve readability. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951	2017-02-03 10:24:59 -08:00
Tim Chase	88fa992878	Fix sync behavior for disk vdevs Prior to `b39c22b`, which was first generally available in the 0.6.5 release as `b39c22b`, ZoL never actually submitted synchronous read or write requests to the Linux block layer. This means the vdev_disk_dio_is_sync() function had always returned false and, therefore, the completion in dio_request_t.dr_comp was never actually used. In `b39c22b`, synchronous ZIO operations were translated to synchronous BIO requests in vdev_disk_io_start(). The follow-on commits `5592404` and `aa159af` fixed several problems introduced by `b39c22b`. In particular, `5592404` introduced the new flag parameter "wait" to __vdev_disk_physio() but under ZoL, since vdev_disk_physio() is never actually used, the wait flag was always zero so the new code had no effect other than to cause a bug in the use of the dio_request_t.dr_comp which was fixed by `aa159af`. The original rationale for introducing synchronous operations in `b39c22b` was to hurry certains requests through the BIO layer which would have otherwise been subject to its unplug timer which would increase the latency. This behavior of the unplug timer, however, went away during the transition of the plug/unplug system between kernels 2.6.32 and 2.6.39. To handle the unplug timer behavior on 2.6.32-2.6.35 kernels the BIO_RW_UNPLUG flag is used as a hint to suppress the plugging behavior. For kernels 2.6.36-2.6.38, the REQ_UNPLUG macro will be available and ise used for the same purpose. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4858	2017-02-03 10:24:54 -08:00
Chunwei Chen	c09af45f7b	Use set_cached_acl and forget_cached_acl when possible Originally, these two function are inline, so their usability is tied to posix_acl_release. However, since Linux 3.14, they became EXPORT_SYMBOL, so we can always use them. In this patch, we create an independent test for these two functions so we can use them when possible. Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:24:50 -08:00
Chunwei Chen	64c259c509	Batch free zpl_posix_acl_release Currently every calls to zpl_posix_acl_release will schedule a delayed task, and each delayed task will add a timer. This used to be fine except for possibly bad performance impact. However, in Linux 4.8, a new timer wheel implementation[1] is introduced. In this new implementation, the larger the delay, the less accuracy the timer is. So when we have a flood of timer from zpl_posix_acl_release, they will expire at the same time. Couple with the fact that task_expire will do linear search with lock held. This causes an extreme amount of contention inside interrupt and would actually lockup the system. We fix this by doing batch free to prevent a flood of delayed task. Every call to zpl_posix_acl_release will put the posix_acl to be freed on a lockless list. Every batch window, 1 sec, the zpl_posix_acl_free will fire up and free every posix_acl that passed the grace period on the list. This way, we only have one delayed task every second. [1] https://lwn.net/Articles/646950/ Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:24:45 -08:00
Neal Gompa (ニール・ゴンパ)	447040c31d	Process all systemd services through the systemd scriptlets This patch ensures that all systemd services are processed through the systemd scriptlets, so that services are properly configured per the preset file installed by the package. Without this, zfs.target is set, but none of the services are enabled per the preset file, meaning automounting filesystems and such won't work out of the box. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Neal Gompa <ngompa13@gmail.com> Closes #5356	2017-02-03 10:24:41 -08:00
tuxoko	734e235f67	Fix cred leak in zpl_fallocate_common This is caught by kmemleak when running compress_004_pos Reviewed-by: Tim Chase <tim@chase2k.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #5244 Closes #5330	2017-02-03 10:24:38 -08:00
Hajo Möller	ffcd0c5434	Fix lookup_bdev() on Ubuntu Ubuntu added support for checking inode permissions to lookup_bdev() in kernel commit 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee (merged in 4.4.0-6.21). Upstream bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1636517 This patch adds a test for Ubuntu's variant of lookup_bdev() to configure and calls the function in the correct way. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Hajo Möller <dasjoe@gmail.com> Closes #5336	2017-02-03 10:24:34 -08:00
LOLi	d2beed9116	Fix uninitialized variable snapprops_nvlist in zfs_receive_one The variable snapprops_nvlist was never initialized, so properties were not applied to the received snapshot. Additionally, add zfs_receive_013_pos.ksh script to ZFS test suite to exercise 'zfs receive' functionality for user properties. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #4338	2017-02-03 10:24:30 -08:00
Tim Chase	4c83fa9b87	Write issue taskq shouldn't be dynamic This is as much an upstream compatibility as it's a bit of a performance gain. The illumos taskq implemention doesn't allow a TASKQ_THREADS_CPU_PCT type to be dynamic and in fact enforces as much with an ASSERT. As to performance, if this taskq is dynamic, it can cause excessive contention on tq_lock as the threads are created and destroyed because it can see bursts of many thousands of tasks in a short time, particularly in heavy high-concurrency zvol write workloads. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #5236	2017-02-03 10:24:26 -08:00
Brian Behlendorf	cbf8713874	Use large stacks when available While stack size will vary by architecture it has historically defaulted to 8K on x86_64 systems. However, as of Linux 3.15 the default thread stack size was increased to 16K. These kernels are now the default in most non- enterprise distributions which means we no longer need to assume 8K stacks. This patch takes advantage of that fact by appropriately reverting stack conservation changes which were made to ensure stability. Changes which may have had a negative impact on performance for certain workloads. This also has the side effect of bringing the code slightly more in line with upstream. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #4059	2017-02-03 10:24:22 -08:00
Stian Ellingsen	dc3d6a6db1	Use env, not sh in zfsctl_snapshot_{,un}mount() Call mount and umount via /usr/bin/env instead of /bin/sh in zfsctl_snapshot_mount() and zfsctl_snapshot_unmount(). This change fixes a shell code injection flaw. The call to /bin/sh passed the mountpoint unescaped, only surrounded by single quotes. A mountpoint containing one or more single quotes would cause the command to fail or potentially execute arbitrary shell code. This change also provides compatibility with grsecurity patches. Grsecurity only allows call_usermodehelper() to use helper binaries in certain paths. /usr/bin/* is allowed, /bin/* is not.	2017-02-03 10:24:17 -08:00
Stian Ellingsen	d71db895a1	Fix use after free in zfsctl_snapshot_unmount()	2017-02-03 10:24:12 -08:00
tuxoko	42dae6d7a6	Linux 3.14 compat: assign inode->set_acl Linux 3.14 introduces inode->set_acl(). Normally, acl modification will come from setxattr, which will handle by the acl xattr_handler, and we already handles that well. However, nfsd will directly calls inode->set_acl or return error if it doesn't exists. Reviewed-by: Tim Chase <tim@chase2k.com> Reviewed-by: Massimo Maggi <me@massimo-maggi.eu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #5371 Closes #5375	2017-02-03 10:24:09 -08:00
Brian Behlendorf	f85c85ea06	Linux 4.9 compat: inode_change_ok() renamed setattr_prepare() In torvalds/linux@31051c8 the inode_change_ok() function was renamed setattr_prepare() and updated to take a dentry ratheri than an inode. Update the code to call the setattr_prepare() and add a wrapper function which call inode_change_ok() for older kernels. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:24:06 -08:00
Chunwei Chen	670508f080	Linux 4.9 compat: remove iops->{set,get,remove}xattr In Linux 4.9, torvalds/linux@fd50eca, iops->{set,get,remove}xattr and generic_{set,get,remove}xattr are removed. xattr operations will directly go through sb->s_xattr. Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:24:00 -08:00
Chunwei Chen	28172e8aa7	Linux 4.9 compat: iops->rename() wants flags In Linux 4.9, torvalds/linux@2773bf0, iops->rename() and iops->rename2() are merged together into iops->rename(), it now wants flags. Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:23:57 -08:00
tuxoko	c0716f13ef	Linux 4.7 compat: Fix deadlock during lookup on case-insensitive We must not use d_add_ci if the dentry already has the real name. Otherwise, d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait on itself causing deadlock. Tested-by: satmandu Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #5124 Closes #5141 Closes #5147 Closes #5148	2017-02-03 10:23:53 -08:00
DeHackEd	dbc95a682c	Kernel 4.9 compat: file_operations->aio_fsync removal Linux kernel commit 723c038475b78 removed this field. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: DHE <git@dehacked.net> Closes #5393	2017-02-03 10:23:50 -08:00
Chunwei Chen	20a0763746	Remove dir inode operations from zpl_inode_operations These operations are dir specific, there's no point putting them in zpl_inode_operations which is for regular files. Signed-off-by: Chunwei Chen <david.chen@osnexus.com>	2017-02-03 10:23:47 -08:00
Brian Behlendorf	e56852059f	Fix uninitialized variable in avl_add() Silence the following warning when compiling with gcc 5.4.0. Specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609. module/avl/avl.c: In function ‘avl_add’: module/avl/avl.c:647:2: warning: ‘where’ may be used uninitialized in this function [-Wmaybe-uninitialized] avl_insert(tree, new_node, where); Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2017-02-03 10:23:42 -08:00
Ned Bass	1f734a62ac	Prepare to release 0.6.5.8 META file and RPM release log updated. Signed-off-by: Ned Bass <bass6@llnl.gov>	2016-09-09 13:21:10 -07:00
Brian Behlendorf	ffddb4dfab	Fix gcc -Warray-bounds check for dump_object() in zdb As of gcc 6.1.1 20160621 (Red Hat 6.1.1-3) an array bounds warnings is detected in the zdb the dump_object() function. The analysis is correct but difficult to interpret because this is implemented as a macro. Rework the ZDB_OT_NAME in to a function and remove the case detected by gcc which is a side effect of the DMU_OT_IS_VALID() macro. zdb.c: In function ‘dump_object’: zdb.c:1931:288: error: array subscript is outside array bounds [-Werror=array-bounds] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Closes #4907	2016-09-09 13:21:10 -07:00
Brian Behlendorf	8fe1fb14cb	Handle block pointers with a corrupt logical size Commit `5f6d0b6` was originally added to gracefully handle block pointers with a damaged logical size. However, it incorrectly assumed that all passed arc_done_func_t could handle a NULL arc_buf_t. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4069 Closes #4080	2016-09-09 13:21:10 -07:00
Brian Behlendorf	bf8b4a9fd5	Linux 4.8 compat: Fix removal of bio->bi_rw member All users of bio->bi_rw have been replaced with compatibility wrappers. This allows the kernel specific logic to be abstracted away, and for each of the supported cases to be documented with the wrapper. The updated interfaces are as follows: * void blk_queue_set_write_cache(struct request_queue , bool, bool) boolean_t bio_is_flush(struct bio ) boolean_t bio_is_fua(struct bio ) boolean_t bio_is_discard(struct bio ) boolean_t bio_is_secure_erase(struct bio ) VDEV_WRITE_FLUSH_FUA Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951	2016-09-09 13:21:10 -07:00
Brian Behlendorf	39a78fe9d4	Linux 4.8 compat: posix_acl_valid() The posix_acl_valid() function has been updated to require a user namespace. Filesystem callers should normally provide the user_ns from the super block associcated with the ACL; the zpl_posix_acl_valid() wrapper has been added for this purpose. See https://github.com/torvalds/linux/commit/0d4d717f for complete details. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4922	2016-09-09 13:21:10 -07:00
Chunwei Chen	6ae0dbdc8a	Linux 4.8 compat: REQ_OP and bio_set_op_attrs() New REQ_OP_* definitions have been introduced to separate the WRITE, READ, and DISCARD operations from the flags. This included changing the encoding of bi_rw. It places REQ_OP_* in high order bits and other stuff in low order bits. This encoding is done through the new helper function bio_set_op_attrs. For complete details refer to: https://github.com/torvalds/linux/commit/f215082 https://github.com/torvalds/linux/commit/4e1b2d5 Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4892 Closes #4899	2016-09-09 13:21:10 -07:00
Brian Behlendorf	a0591c4370	Linux 4.8 compat: REQ_PREFLUSH The REQ_FLUSH flag was renamed REQ_PREFLUSH to avoid confusion with REQ_OP_FLUSH. See https://github.com/torvalds/linux/commit/28a8f0d3 for complete details. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4892 Issue #4899	2016-09-09 13:21:10 -07:00
Brian Behlendorf	68b8d22c6e	Linux 4.8 compat: submit_bio() The rw argument has been removed from submit_bio/submit_bio_wait. Callers are now expected to set bio->bi_rw instead of passing it in. See https://github.com/torvalds/linux/commit/4e49ea4a for complete details. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4892 Issue #4899	2016-09-09 13:21:09 -07:00
smh	3d824a8878	FreeBSD rS271776 - Persist vdev_resilver_txg changes Persist vdev_resilver_txg changes to avoid panic caused by validation vs a vdev_resilver_txg value from a previous resilver. Authored-by: smh <smh@FreeBSD.org> Ported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> OpenZFS-issue: https://www.illumos.org/issues/5154 FreeBSD-issue: https://reviews.freebsd.org/rS271776 FreeBSD-commit: https://github.com/freebsd/freebsd/commit/c3c60bf Closes #4790	2016-09-09 13:21:09 -07:00
GeLiXin	e5c02cbb03	Fix: Array bounds read in zprop_print_one_property() If the loop index i comes to (ZFS_GET_NCOLS - 1), the cbp->cb_columns[i + 1] actually read the data of cbp->cb_colwidths[0], which means the array subscript is above array bounds. Luckily the cbp->cb_colwidths[0] is always 0 and it seems we haven't looped enough times to exceed the array bounds so far, but it's really a secluded risk someday. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #5003	2016-09-09 13:21:09 -07:00
GeLiXin	e66b546cb7	Fix call zfs_get_name() with invalid parameter zfs_get_name() expects a parameter of type zfs_handle_t zhp , but gets an invalid parameter type of zfs_handle_t *zhp actually in libzfs_dataset_cmp(), which may trigger a coredump if called. libzfs_dataset_cmp() working normally so far, just because all the callers only give datasets of type ZFS_TYPE_FILESYSTEM to it, we compared their mountpoint and return, luckily. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4919	2016-09-09 13:21:09 -07:00
GeLiXin	c23686524f	Fix incorrect pool state after import Import a raidz pool which has a vdev with a bad label, zpool status shows the right state of the dev, but the wrong state of the pool. The pool state should be DEGRADED, not ONLINE. We examine the label in vdev_validate while in spa_load_impl, the bad label can be detected but doesn't propagate its state to the parent. There are other chances to propagate state in the following vdev_load if we failed to load DTL, but our pool is raidz1 which can tolerate a faulted disk. So we lost the last chance to correct the pool state. Propagate the leaf vdev's state to parent if its label was corrupted, as is done elsewhere in vdev_validate. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@intel.com> Closes #4948	2016-09-09 13:21:09 -07:00
GeLiXin	74acdfc682	Fix self-healing IO prior to dsl_pool_init() completion Async writes triggered by a self-healing IO may be issued before the pool finishes the process of initialization. This results in a NULL dereference of `spa->spa_dsl_pool` in vdev_queue_max_async_writes(). George Wilson recommended addressing this issue by initializing the passed `dsl_pool_t **` prior to dmu_objset_open_impl(). Since the caller is passing the `spa->spa_dsl_pool` this has the effect of ensuring it's initialized. However, since this depends on the caller knowing they must pass the `spa->spa_dsl_pool` an additional NULL check was added to vdev_queue_max_async_writes(). This guards against any future restructuring of the code which might result in dsl_pool_init() being called differently. Signed-off-by: GeLiXin <47034221@qq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4652	2016-09-09 13:21:09 -07:00
Paul Dagnelie	d9e1eec9a2	OpenZFS 6876 - Stack corruption after importing a pool with a too-long name Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for trouble. We should check every dataset on import, using a 1024 byte buffer and checking each time to see if the dataset's new name is longer than 256 bytes. OpenZFS-issue: https://www.illumos.org/issues/6876 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ca8674e	2016-09-09 13:21:09 -07:00
Matthew Ahrens	1421562a0d	OpenZFS 7263 - deeply nested nvlist can overflow stack nvlist_pack() and nvlist_unpack are implemented recursively, which can cause the stack to overflow with a deeply nested nvlist; i.e. an nvlist which contains an nvlist, which contains an nvlist, which... Unprivileged users can pass an nvlist to the kernel via certain ioctls on /dev/zfs, which the kernel will unpack without additional permission checking or validation. Therefore, an unprivileged user can cause the kernel's stack to overflow and panic. Ideally, these functions would be implemented non-recursively. As a quick fix, this patch limits the depth of the recursion and returns an error when attempting to pack and unpack a deeply-nested nvlist. Signed-off-by: Adam Leventhal <ahl@delphix.com> Signed-off-by: George Wilson <george.wilson@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Ported-by: Prakash Surya <prakash.surya@delphix.com> OpenZFS-issue: https://www.illumos.org/issues/7263 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0511d6d -	2016-09-09 13:21:09 -07:00
Chunwei Chen	58000c3ec7	Fix dbuf_stats_hash_table_data race Dropping DBUF_HASH_MUTEX when walking the hash list is unsafe. The dbuf can be freed at any time. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4846	2016-09-09 13:21:09 -07:00
Tim Chase	e871059bc4	Prevent null dereferences when accessing dbuf kstat In arc_buf_info(), the arc_buf_t may have no header. If not, don't try to fetch the arc buffer stats and instead just zero them. The null dereferences were observed while accessing the dbuf kstat with awk on a system in which millions of small files were being created in order to overflow the system's metadata limit. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4837	2016-09-09 13:21:09 -07:00
Chunwei Chen	91f81c42f0	fh_to_dentry should return ESTALE when generation mismatch When generation mismatch, it usually means the file pointed by the file handle was deleted. We should return ESTALE to indicate this. We return ENOENT in zfs_vget since zpl_fh_to_dentry will convert it to ESTALE. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4828	2016-09-09 13:21:09 -07:00
Chunwei Chen	2ab9247411	Don't allow accessing XATTR via export handle Allow accessing XATTR through export handle is a very bad idea. It would allow user to write whatever they want in fields where they otherwise could not. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4828	2016-09-09 13:21:09 -07:00
Chunwei Chen	af4e50750b	Fix out-of-bound access in zfs_fillpage The original code will do an out-of-bound access on pl[] during last iteration. ================================================================== BUG: KASAN: stack-out-of-bounds in zfs_getpage+0x14c/0x2d0 [zfs] Read of size 8 by task tmpfile/7850 page:ffffea00017c6dc0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0xffff8000000000() page dumped because: kasan: bad access detected CPU: 3 PID: 7850 Comm: tmpfile Tainted: G OE 4.6.0+ #3 ffff88005f1b7678 0000000006dbe035 ffff88005f1b7508 ffffffff81635618 ffff88005f1b7678 ffff88005f1b75a0 ffff88005f1b7590 ffffffff81313ee8 ffffea0001ae8dd0 ffff88005f1b7670 0000000000000246 0000000041b58ab3 Call Trace: [<ffffffff81635618>] dump_stack+0x63/0x8b [<ffffffff81313ee8>] kasan_report_error+0x528/0x560 [<ffffffff81278f20>] ? filemap_map_pages+0x5f0/0x5f0 [<ffffffff813144b8>] kasan_report+0x58/0x60 [<ffffffffc12250dc>] ? zfs_getpage+0x14c/0x2d0 [zfs] [<ffffffff81312e4e>] __asan_load8+0x5e/0x70 [<ffffffffc12250dc>] zfs_getpage+0x14c/0x2d0 [zfs] [<ffffffffc1252131>] zpl_readpage+0xd1/0x180 [zfs] [<ffffffff81353c3a>] SyS_execve+0x3a/0x50 [<ffffffff810058ef>] do_syscall_64+0xef/0x180 [<ffffffff81d0ee25>] entry_SYSCALL64_slow_path+0x25/0x25 Memory state around the buggy address: ffff88005f1b7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88005f1b7580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88005f1b7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 ^ ffff88005f1b7680: f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 ffff88005f1b7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4705 Issue #4708	2016-09-09 13:21:09 -07:00
Chunwei Chen	3602878ff7	Fix memleak in zpl_parse_options strsep() will advance tmp_mntopts, and will change it to NULL on last iteration. This will cause strfree(tmp_mntopts) to not free anything. unreferenced object 0xffff8800883976c0 (size 64): comm "mount.zfs", pid 3361, jiffies 4294931877 (age 1482.408s) hex dump (first 32 bytes): 72 77 00 73 74 72 69 63 74 61 74 69 6d 65 00 7a rw.strictatime.z 66 73 75 74 69 6c 00 6d 6e 74 70 6f 69 6e 74 3d fsutil.mntpoint= backtrace: [<ffffffff81810c4e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811f9cac>] __kmalloc+0x16c/0x250 [<ffffffffc065ce9b>] strdup+0x3b/0x60 [spl] [<ffffffffc080fad6>] zpl_parse_options+0x56/0x300 [zfs] [<ffffffffc080fe46>] zpl_mount+0x36/0x80 [zfs] [<ffffffff81222dc8>] mount_fs+0x38/0x160 [<ffffffff81240097>] vfs_kern_mount+0x67/0x110 [<ffffffff812428e0>] do_mount+0x250/0xe20 [<ffffffff812437d5>] SyS_mount+0x95/0xe0 [<ffffffff8181aff6>] entry_SYSCALL_64_fastpath+0x1e/0xa8 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4706 Issue #4708	2016-09-09 13:21:09 -07:00
Chunwei Chen	9f5f758d77	Fix arc_prune_task use-after-free arc_prune_task uses a refcount to protect arc_prune_t, but it doesn't prevent the underlying zsb from disappearing if there's a concurrent umount. We fix this by force the caller of arc_remove_prune_callback to wait for arc_prune_taskq to finish. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4687 Closes #4690	2016-09-09 13:21:09 -07:00
Chunwei Chen	d5b0e7fcf1	Fix get_zfs_sb race with concurrent umount Certain ioctl operations will call get_zfs_sb, which will holds an active count on sb without checking whether it's active or not. This will result in use-after-free. We fix this by using atomic_inc_not_zero to make sure we got an active sb. P1 P2 --- --- deactivate_locked_super(): s_active = 0 zfs_sb_hold() ->get_zfs_sb(): s_active = 1 ->zpl_kill_sb() -->zpl_put_super() --->zfs_umount() ---->zfs_sb_free(zsb) zfs_sb_rele(zsb) Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-09-09 13:21:09 -07:00
Chunwei Chen	ec9b8fae06	Kill zp->z_xattr_parent to prevent pinning zp->z_xattr_parent will pin the parent. This will cause huge issue when unlink a file with xattr. Because the unlinked file is pinned, it will never get purged immediately. And because of that, the xattr stuff will never be marked as unlinked. So the whole unlinked stuff will stay there until shrink cache or umount. This change partially reverts `e89260a`. This is safe because only the zp->z_xattr_parent optimization is removed, zpl_xattr_security_init() is still called from the zpl outside the inode lock. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Issue #4359 Issue #3508 Issue #4413 Issue #4827	2016-09-09 13:21:09 -07:00
Chunwei Chen	f7923f4ada	xattr dir doesn't get purged during iput We need to set inode->i_nlink to zero so iput will purge it. Without this, it will get purged during shrink cache or umount, which would likely result in deadlock due to zfs_zget waiting forever on its children which are in the dispose_list of the same thread. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Issue #4359 Issue #3508 Issue #4413 Issue #4827	2016-09-09 13:21:09 -07:00
Ned Bass	5acbedbbe8	Add ZIO_CHECKSUM_IS_ZERO The ZIO_CHECKSUM_IS_ZERO macro was added in master commit: `37f8a88` Illumos 5746 - more checksumming in zfs send That whole patch is not suitable for the release branch but some other backported patches on that macro. Signed-off-by: Ned Bass <bass6@llnl.gov>	2016-09-09 13:21:09 -07:00
Rich Ercolani	3a8e13688b	Add tunable to ignore hole_birth (enabled by default) Adds a module option which disables the hole_birth optimization which has been responsible for several recent bugs, including issue #4050. Original-patch: https://gist.github.com/pcd1193182/2c0cd47211f3aee623958b4698836c48 Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4833	2016-09-09 13:20:54 -07:00
Peng	4f96e68fad	Fix PANIC: metaslab_free_dva(): bad DVA X:Y:Z The following scenario can result in garbage in the dn_spill field. The db->db_blkptr must be set to NULL when DNODE_FLAG_SPILL_BLKPTR is clear to ensure the dn_spill field is cleared. Current txg = A. * A new spill buffer is created. Its dbuf is initialized with db_blkptr = NULL and it's dirtied. Current txg = B. * The spill buffer is modified. It's marked as dirty in this txg. * Additional changes make the spill buffer unnecessary because the xattr fits into the bonus buffer, so it's removed. The dbuf is undirtied in this txg, but it's still referenced and cannot be destroyed. Current txg = C. * Starts syncing of txg A * dbuf_sync_leaf() is called for the spill buffer. Since db_blkptr is NULL, dbuf_check_blkptr() is called. * The dbuf starts being written and it reaches the ready state (not done yet). * A new change makes the spill buffer necessary again. sa_build_layouts() ends up calling dbuf_find() to locate the dbuf. It finds the old dbuf because it has not been destroyed yet (it will be destroyed when the previous write is done and there are no more references). The old dbuf has db_blkptr != NULL. * txg A write is complete and the dbuf released. However it's still referenced, so it's not destroyed. Current txg = D. * Starts syncing of txg B * dbuf_sync_leaf() is called for the bonus buffer. Its contents are directly copied into the dnode, overwriting the blkptr area because, in txg B, the bonus buffer was big enough to hold the entire xattr. * At this point, the db_blkptr of the spill buffer used in txg C gets corrupted. Signed-off-by: Peng <peng.hse@xtaotech.com> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3937	2016-09-05 16:07:09 -07:00

1 2 3 4 5 ...

2041 Commits All Branches Search

2041 Commits

All Branches