Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Coleman Kane	b586ea5d93	linux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2 Linux 6.2 renamed the get_acl() operation to get_inode_acl() in the inode_operations struct. This should fix Issue #14323. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #14323 Closes #14331	2023-01-10 08:43:49 -08:00
Antonio Russo	138d2b29dd	Linux 6.1 compat: open inside tmpfile() commit `d27c81847b` upstream Linux 863f144 modified the .tmpfile interface to pass a struct file, rather than a struct dentry, and expect the tmpfile implementation to open inside of tmpfile(). This patch implements a configuration test that checks for this new API and appropriately sets a HAVE_TMPFILE_DENTRY flag that tracks this old API. Contingent on this flag, the appropriate API is implemented. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #14301 Closes #14343	2023-01-09 17:15:22 -08:00
Ameer Hamza	75fbe7eb99	skip permission checks for extended attributes zfs_zaccess_trivial() calls the generic_permission() to read xattr attributes. This causes deadlock if called from zpl_xattr_set_dir() context as xattr and the dent locks are already held in this scenario. This commit skips the permissions checks for extended attributes since the Linux VFS stack already checks it before passing us the control. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-01-05 11:10:28 -08:00
Ryan Moeller	fbbc375d43	FreeBSD: Fix potential boot panic with bad label vdev_geom_read_pool_label() can leave NULL in configs. Check for it and skip consistently when generating rootconf. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #14291 (cherry picked from commit `dc8c2f6158`)	2023-01-05 11:00:09 -08:00
Damian Szuberski	3d1e808096	Fix clang 13 compilation errors ``` os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function declared with 'warn_unused_result' attribute [-Werror,-Wunused-result] add_disk(zv->zv_zso->zvo_disk); ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~ zpl_xattr.c:1579:1: warning: no previous prototype for function 'zpl_posix_acl_release_impl' [-Wmissing-prototypes] ``` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #13551 (cherry picked from commit `9884319666`)	2022-12-01 12:39:44 -08:00
Richard Yao	957c3776f2	FreeBSD: Fix out of bounds read in zfs_ioctl_ozfs_to_legacy() There is an off by 1 error in the check. Fortunately, this function does not appear to be used in kernel space, despite being compiled as part of the kernel module. However, it is used in userspace. Callers of lzc_ioctl_fd() likely will crash if they attempt to use the unimplemented request number. This was reported by FreeBSD's coverity scan. Reported-by: Coverity (CID 1432059) Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #14135	2022-12-01 12:39:43 -08:00
Serapheim Dimitropoulos	85537f77a3	Expose zfs_vdev_open_timeout_ms as a tunable Some of our customers have been occasionally hitting zfs import failures in Linux because udevd doesn't create the by-id symbolic links in time for zpool import to use them. The main issue is that the systemd-udev-settle.service that zfs-import-cache.service and other services depend on is racy. There is also an openzfs issue filed (see https://github.com/openzfs/zfs/issues/10891) outlining the problem and potential solutions. With the proper solutions being significant in terms of complexity and the priority of the issue being low for the time being, this patch exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are experiencing this issue often can increase it as a workaround. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #14133	2022-12-01 12:39:43 -08:00
Pavel Snajdr	52e658edd7	Remove zpl_revalidate: fix snapshot rollback Open files, which aren't present in the snapshot, which is being roll-backed to, need to disappear from the visible VFS image of the dataset. Kernel provides d_drop function to drop invalid entry from the dcache, but inode can be referenced by dentry multiple dentries. The introduced zpl_d_drop_aliases function walks and invalidates all aliases of an inode. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pavel Snajdr <snajpa@snajpa.net> Closes #9600 Closes #14070	2022-12-01 12:39:42 -08:00
Richard Yao	c6d93d0a80	FreeBSD: Fix uninitialized pointer read in spa_import_rootpool() The FreeBSD project's coverity scans found this. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13923	2022-12-01 12:39:40 -08:00
Richard Yao	9f1691a964	Linux: Fix use-after-free in zfsvfs_create() Coverity reported that we pass a pointer to zfsvfs to `dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after a failure in zfsvfs_init(). We have nearly identical duplicate versions of this code for FreeBSD and Linux, but interestingly, the FreeBSD version of this code differs in such a way that it does not suffer from this bug. We remove the difference from the FreeBSD version to fix this bug. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13883	2022-12-01 12:39:40 -08:00
Coleman Kane	212ba9bd97	Linux 6.1 compat: change order of sys/mutex.h includes After Linux 6.1-rc1 came out, the build started failing to build a couple of the files in the linux spl code due to the mutex_init redefinition. Moving the sys/mutex.h include to a lower position within these two files appears to fix the problem. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #14040	2022-11-01 12:44:56 -07:00
Christian Schwarz	df000276b8	zfs_domount: fix double-disown of dataset / double-free of zfsvfs_t Before this patch, in zfs_domount, if zfs_root or d_make_root fails, we leave zfsvfs != NULL. This will lead to execution of the error handling `if` statement at the `out` label, and hence to a call to dmu_objset_disown and zfsvfs_free. However, zfs_umount, which we call upon failure of zfs_root and d_make_root already does dmu_objset_disown and zfsvfs_free. I suppose this patch rather adds to the brittleness of this part of the code base, but I don't want to invest more time in this right now. To add a regression test, we'd need some kind of fault injection facility for zfs_root or d_make_root, which doesn't exist right now. And even then, I think that regression test would be too closely tied to the implementation. To repro the double-disown / double-free, do the following: 1. patch zfs_root to always return an error 2. mount a ZFS filesystem Here's the stack trace you would see then: VERIFY3(ds->ds_owner == tag) failed (0000000000000000 == ffff9142361e8000) PANIC at dsl_dataset.c:1003:dsl_dataset_disown() Showing stack for process 28332 CPU: 2 PID: 28332 Comm: zpool Tainted: G O 5.10.103-1.nutanix.el7.x86_64 #1 Call Trace: dump_stack+0x74/0x92 spl_dumpstack+0x29/0x2b [spl] spl_panic+0xd4/0xfc [spl] dsl_dataset_disown+0xe9/0x150 [zfs] dmu_objset_disown+0xd6/0x150 [zfs] zfs_domount+0x17b/0x4b0 [zfs] zpl_mount+0x174/0x220 [zfs] legacy_get_tree+0x2b/0x50 vfs_get_tree+0x2a/0xc0 path_mount+0x2fa/0xa70 do_mount+0x7c/0xa0 __x64_sys_mount+0x8b/0xe0 do_syscall_64+0x38/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Co-authored-by: Christian Schwarz <christian.schwarz@nutanix.com> Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com> Closes #14025	2022-11-01 12:42:32 -07:00
Mark Johnston	4e3fecbdfd	FreeBSD: Fix a pair of bugs in zfs_fhtovp() - Add a zfs_exit() call in an error path, otherwise a lock is leaked. - Remove the fid_gen > 1 check. That appears to be Linux-specific: zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether the snapshot directory is mounted. On FreeBSD it fails, making snapshot dirs inaccessible via NFS. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Andriy Gapon <avg@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Fixes: `43dbf88178` ("FreeBSD: vfsops: use setgen for error case") Closes #14001 Closes #13974 (cherry picked from commit `ed566bf1cd`)	2022-10-26 14:59:25 -07:00
Mateusz Guzik	63d4838b4a	FreeBSD: handle V_PCATCH See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13910	2022-09-28 10:35:13 -07:00
Mateusz Guzik	eec942cc54	FreeBSD: catch up to 1400068 Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13909	2022-09-28 10:35:13 -07:00
Mateusz Guzik	2c8e3e4b28	FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK There is an ongoing effort to eliminate this feature. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13908	2022-09-28 10:35:13 -07:00
Richard Yao	55816c64da	FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}() When reviewing #13875, I noticed that our FreeBSD code has an issue where it converts from `int64_t` to `int` when calling `vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 << 36`, the int will be 0, since the low bits are 0. Even when some low bits are set, a value such as `((1 << 36) + 1)` would truncate to 1, which is wrong. There is protection against this on 32-bit platforms, but on 64-bit platforms, there is no check to protect us, so we add a check. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13882	2022-09-28 10:35:13 -07:00
Richard Yao	835e03682c	Linux: Fix uninitialized variable usage in zio_do_crypt_data() Coverity complained about this. An error from `hkdf_sha512()` before uio initialization will cause pointers to uninitialized memory to be passed to `zio_crypt_destroy_uio()`. This is a regression that was introduced by `cf63739191`. Interestingly, this never affected FreeBSD, since the FreeBSD version never had that patch ported. Since moving uio initialization to the top of this function would slow down the qat_crypt() path, we only move the `memset()` calls to the top of the function. This is sufficient to fix this problem. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Neal Gompa <ngompa@datto.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13944	2022-09-27 15:43:26 -07:00
Alexander Motin	44cec45f72	Improve too large physical ashift handling When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misanigned writes may be not as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. May be one day allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or explicitly told to, but not as an "optimization". There are some read-intensive NVMe SSDs that report Preferred Write Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #13798	2022-09-21 13:15:15 -07:00
Alexander Motin	b6ebf270eb	Apply arc_shrink_shift to ARC above arc_c_min It makes sense to free memory in smaller chunks when approaching arc_c_min to let other kernel subsystems to free more, since after that point we can't free anything. This also matches behavior on Linux, where to shrinker reported only the size above arc_c_min. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #13794	2022-09-13 17:59:10 -07:00
Brian Behlendorf	4063d7b6b4	Linux 5.20 compat: blk_cleanup_disk() As of the Linux 5.20 kernel blk_cleanup_disk() has been removed, all callers should use put_disk(). Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13728	2022-08-09 09:41:06 -07:00
Brian Behlendorf	58571ba447	Linux 5.20 compat: bdevname() As of the Linux 5.20 kernel bdevname() has been removed, all callers should use snprintf() and the "%pg" format specifier. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13728	2022-08-09 09:41:06 -07:00
наб	37430e8211	libtpool: -Wno-clobbered Also remove -Wno-unused-but-set-variable Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118 Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13110	2022-07-27 13:38:56 -07:00
Alexander Motin	881249de6f	FreeBSD: Improve crypto_dispatch() handling Handle crypto_dispatch() return values same as crp->crp_etype errors. On FreeBSD 12 many drivers returned same errors both ways, and lack of proper handling for the first ended up in assertion panic later. It was changed in FreeBSD 13, but there is no reason to not be safe. While there, skip waiting for completion, including locking and wakeup() call, for sessions on synchronous crypto drivers, such as typical aesni and software. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13563	2022-07-26 10:10:37 -07:00
Alexander Motin	415882d228	Avoid small buffer copying on write It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to decide whether to reallocate/copy the buffer, because the answer is OS-specific and depends on the buffer size. Instead of that use abd_size_alloc_linear(), moved into public header. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12425	2022-07-26 10:10:37 -07:00
Brian Behlendorf	fec407fb69	Linux 5.19 compat: aops->read_folio() As of the Linux 5.19 kernel the readpage() address space operation has been replaced by read_folio(). Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	7ae5ea8864	Linux 5.19 compat: blkdev_issue_secure_erase() Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure erase functionality from the blkdev_issue_discard() function. The blkdev_issue_secure_erase() must now be issued to issue a secure erase. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	048301b6dc	Linux 5.19 compat: bdev_max_secure_erase_sectors() Linux 5.19 commit torvalds/linux@44abff2c0 removed the blk_queue_secure_erase() helper function. The preferred interface is to now use the bdev_max_secure_erase_sectors() function to check for discard support. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	9ce5eb18ef	Linux 5.19 compat: bdev_max_discard_sectors() Linux 5.19 commit torvalds/linux@70200574cc removed the blk_queue_discard() helper function. The preferred interface is to now use the bdev_max_discard_sectors() function to check for discard support. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	5a639f0802	Linux 5.18 compat: bio_alloc() As for the Linux 5.18 kernel bio_alloc() expects a block_device struct as an argument. This removes the need for the bio_set_dev() compatibility code for 5.18 and newer kernels. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
hping	b28c0c4bf8	abd_os: remove redundant refcount creation for abd_children Refcount creation for abd_zero_scatter->abd_children is redundant in abd_alloc_zero_scatter, as it has been done in abd_init_struct. In addition, abd_children is undefined when ZFS_DEBUG is disabled, the reference of abd_children in abd_alloc_zero_scatter breaks build of libzpool when ZFS_DEBUG is disabled. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Ping Huang <huangping@smartx.com> Closes #13429	2022-05-20 10:33:24 -07:00
Aidan Harris	eee389ba2e	Fix functions without a prototype clang-15 emits the following error message for functions without a prototype: fs/zfs/os/linux/spl/spl-kmem-cache.c:1423:27: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Aidan Harris <me@aidanharr.is> Closes #13421	2022-05-20 10:33:24 -07:00
Mateusz Guzik	2c5c8bb0a6	FreeBSD: use zero_region instead of allocating a dedicated page Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13406	2022-05-20 10:33:24 -07:00
szubersk	756c3e085b	autoconf: Fail when __copy_from_user_inatomic is a non-GPL symbol A followup to `849c14e048` Fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009242 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #13389	2022-05-20 10:33:24 -07:00
Damian Szuberski	13b1f336d3	PPC get_user workaround Linux 5.12 PPC 5.12 get_user() and __copy_from_user_inatomic() inline helpers very indirectly include a reference to the GPL'd array mmu_feature_keys[] and fails to build. Workaround this by using copy_from_user() and throwing EFAULT for any calls to __copy_from_user_inatomic(). This is a workaround until a fix for Linux commit 7613f5a66becfd0e43a0f34de8518695888f5458 "powerpc/64s/kuap: Use mmu_has_feature()" is fully addressed. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Authored-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #11958 Closes #12590 Closes #13367	2022-05-20 10:33:24 -07:00
Brian Atkinson	60fc173251	Adding ZERO_PAGE detection On some architectures ZERO_PAGE is unavailable because it references a GPL exported symbol of empty_zero_page. Originally `e08b993` removed the call to PAGE_ZERO(0) for assignment to the abd_zero_page. However, a simple check can be done to avoid a kernel allocation and free for the abd_zero_page if ZERO_PAGE is available. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #13199	2022-05-20 10:33:24 -07:00
Ka Ho Ng	1f31889046	FreeBSD: Implement hole-punching support This adds supports for hole-punching facilities in the FreeBSD kernel starting from __FreeBSD_version 1400032. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ka Ho Ng <khng@FreeBSD.org> Sponsored-by: The FreeBSD Foundation Closes #12458	2022-05-17 11:15:29 -07:00
наб	ecec151c14	module: zfs: freebsd: fix unused, remove argsused Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12844	2022-05-02 15:42:58 -07:00
наб	a4f582f0b6	FreeBSD: remove unused variable Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #12899	2022-05-02 15:42:58 -07:00
наб	b8e1366ee6	zvol: remove unused variable Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12917	2022-05-02 15:42:58 -07:00
Brian Behlendorf	49c1346c10	Linux 5.18 compat: replace __set_page_dirty_nobuffers Replace __set_page_dirty_nobuffers with filemap_dirty_folio. Upstream-commit: 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache ") Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Authored-by: Satadru Pramanik <satadru@gmail.com> Signed-off-by: Satadru Pramanik <satadru@gmail.com> Closes #13325 Closes #13380	2022-04-28 15:17:38 -07:00
Brian Behlendorf	71cd3726c0	Fix O_APPEND for Linux 3.15 and older kernels When using a Linux kernel which predates the iov_iter interface the O_APPEND flag should be applied in zpl_aio_write() via the call to generic_write_checks(). The updated pos variable was incorrectly ignored resulting in the current offset being used. This issue should only realistically impact the RHEL/CentOS 7.x kernels which are based on Linux 3.10. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13370 Closes #13377	2022-04-28 15:15:28 -07:00
наб	642426095a	Linux 5.18 compat: kobj_type.default_attrs replaced with default_groups Upstream-commit: cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject: kobj_type: remove default_attrs") Upstream-commit: `0cdda2edb3` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13357	2022-04-25 10:00:09 -07:00
Alexander Motin	972637dc06	FreeBSD: Fix translation from ABD to physical pages. In hypothetical case of non-linear ABD with single segment, multiple to page size but not aligned to it, vdev_geom_fill_unmap_cb() could fill one page less into bio_ma array. I am not sure it is expoitable, but better to be safe than sorry. Reported-by: Mark Johnston <markj@FreeBSD.org> Signed-off-by: Alexander Motin <mav@FreeBSD.org> (cherry picked from commit `5352f85cdd`)	2022-04-21 16:59:09 -07:00
Rich Ercolani	c220771a47	Corrected oversight in ZERO_RANGE behavior It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do have differing semantics in some ways - in particular, one requires KEEP_SIZE, and the other does not. Also added a zero-range test to catch this, corrected a flaw that made the punch-hole test succeed vacuously, and a typo in file_write. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13329 Closes #13338	2022-04-21 16:58:07 -07:00
Brian Behlendorf	aa1c3c1d1d	Linux 5.17 compat: GENHD_FL_EXT_DEVT / GENHD_FL_NO_PART_SCAN As of the 5.17 kernel the GENHD_FL_EXT_DEVT flag has been removed and the GENHD_FL_NO_PART_SCAN flag renamed GENHD_FL_NO_PART. Update zvol_alloc() to set GENHD_FL_NO_PART for the newer kernels which is sufficient. The behavior for prior kernels remains unchanged. 1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT") 46e7eac6 ("block: rename GENHD_FL_NO_PART_SCAN to GENHD_FL_NO_PART") Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13294 Closes #13297	2022-04-20 13:44:19 -07:00
Mark Johnston	b7546f92ea	FreeBSD: Return Mach error codes from VOP_(GET\|PUT)PAGES FreeBSD's memory management system uses its own error numbers and gets confused when these VOPs return EIO. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reported-by: Peter Holm <pho@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #13311	2022-04-19 10:42:54 -07:00
Mark Johnston	e9cd90f6e5	FreeBSD: Parameterize ZFS_ENTER/ZFS_VERIFY_VP with an error code For legacy reasons, a couple of VOPs have to return error numbers that don't come from the usual errno namespace. To handle the cases where ZFS_ENTER or ZFS_VERIFY_ZP fail, we need to be able to override the default error return value of EIO. Extend the macros to permit this. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #13311	2022-04-19 10:42:54 -07:00
Riccardo Schirone	35ddd8ee2e	Linux 5.18 compat: use address_space_operations->readahead ->readpages was removed and replaced by ->readahead. Define zpl_readahead for kernels that don't have ->readpages. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Riccardo Schirone <rschirone91@gmail.com> Closes #13278	2022-04-06 13:15:27 -07:00
Riccardo Schirone	10a9f5fc47	Linux 5.18 compat: blkg_tryget is moved to private headers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Riccardo Schirone <rschirone91@gmail.com> Closes #13278	2022-04-06 13:15:27 -07:00
наб	9f7f704507	Linux 5.18 compat: replace genhd.h with blkdev.h includes blkdev.h includes genhd.h since dawn of upstream git, so this is globally safe Upstream-commit: 322cbb50de711814c42fb088f6d31901502c711a ("block: remove genhd.h") Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13251	2022-04-06 13:15:27 -07:00
наб	215a8255a9	Linux 5.18 compat: 4-argument bio_alloc() bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs) became bio_alloc(struct block_device *bdev, unsigned short nr_vecs, unsigned int opf, gfp_t gfp_mask) passing NULL/0 continues previous behaviour Upstream-commit: 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block: pass a block_device and opf to bio_alloc") Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13251	2022-04-06 13:15:27 -07:00
Ryan Moeller	a5a28723bd	FreeBSD: Use NDFREE_PNBUF if available NDF_ONLY_PNBUF has been removed from FreeBSD in favor of NDFREE_PNBUF. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #13277	2022-04-06 10:29:53 -07:00
Brian Behlendorf	847d03060f	Fix ACL checks for NFS kernel server This PR changes ZFS ACL checks to evaluate fsuid / fsgid rather than euid / egid to avoid accidentally granting elevated permissions to NFS clients. Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Andrew Walker <awalker@ixsystems.com> Co-authored-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #13221	2022-03-20 21:21:18 -07:00
Kyle Evans	421750672b	module: freebsd: avoid a taking a destroyed lock in zfs_zevent bits At shutdown time, we drain all of the zevents and set the ZEVENT_SHUTDOWN flag. On FreeBSD, we may end up calling zfs_zevent_destroy() after the zevent_lock has been destroyed while the sysevent thread is winding down; we observe ESHUTDOWN, then back out. Events have already been drained, so just inline the kmem_free call in sysevent_worker() to avoid the race, and document the assumption that zfs_zevent_destroy doesn't do anything else useful at that point. This fixes a panic that can occur at module unload time. Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Kyle Evans <kevans@FreeBSD.org> Closes #13220	2022-03-18 17:11:43 -07:00
Mateusz Guzik	275c756730	FreeBSD: add missing replay check to an assert in zfs_xvattr_set Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13219	2022-03-18 17:11:43 -07:00
Mark Johnston	b3427b18b1	zfs: Fix a deadlock between page busy and the teardown lock When rolling back a dataset, ZFS has to purge file data resident in the system page cache. To do this, it loops over all vnodes for the mountpoint and calls vn_pages_remove() to purge pages associated with the vnode's VM object. Each page is thus exclusively busied while the dataset's teardown write lock is held. When handling a page fault on a mapped ZFS file, FreeBSD's page fault handler busies newly allocated pages and then uses VOP_GETPAGES to fill them. The ZFS getpages VOP acquires the teardown read lock with vnode pages already busied. This represents a lock order reversal which can lead to deadlock. To break the deadlock, observe that zfs_rezget() need only purge those pages marked valid, and that pages busied by the page fault handler are, by definition, invalid. Furthermore, ZFS pages always transition from invalid to valid with the teardown lock held, and ZFS never creates partially valid pages. Thus, zfs_rezget() can use the new vn_pages_remove_valid() to skip over pages busied by the fault handler. PR: 258208 Tested by: pho Reviewed by: avg, sef, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32931 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #12828	2022-03-04 15:37:41 -08:00
Alexander Motin	0e2bb1a3ee	Really zero the zero page While switching abd_zero_buf allocation KPI I've missed the fact that kmem_zalloc() zeroed the allocation, while kmem_cache_alloc() does not. Add explicit bzero() after it. I don't think it should have caused real problems, but leaking one memory page content all over the pool is not good. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12569	2022-03-04 15:37:33 -08:00
Paul Dagnelie	7bd292e59b	Fix cpu hotplug atomic sleep issue We move the spinlock unlock before the thread creation. This should be safe because the thread creation code doesn't actually manipulate any taskq data structures; that's done by the thread once it's created. We also remove the assertion that the maxthreads is the current threads plus one; that assertion could fail if multiple hotplug events come in quick succession, and the first new taskq thread hasn't had a chance to start processing yet. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> eviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #12714	2022-02-15 21:52:45 -08:00
наб	94a4b7ec3d	module: zfs: fix unused, remove argsused Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12844	2022-02-16 17:58:56 -08:00
drowfx	bc99c809d5	Add dataset_kstats_update.. to mmap read/write paths This allows reads/writes caused by accesses to mmap files to be accounted correctly in the per-dataset kstats for both Linux and FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Matthias Blankertz <matthias@blankertz.org> Closes #12994 Closes #13044	2022-02-16 17:58:56 -08:00
George Amanakis	e257bd481b	Introduce a flag to skip comparing the local mac when raw sending Raw receiving a snapshot back to the originating dataset is currently impossible because of user accounting being present in the originating dataset. One solution would be resetting user accounting when raw receiving on the receiving dataset. However, to recalculate it we would have to dirty all dnodes, which may not be preferable on big datasets. Instead, we rely on the os_phys flag OBJSET_FLAG_USERACCOUNTING_COMPLETE to indicate that user accounting is incomplete when raw receiving. Thus, on the next mount of the receiving dataset the local mac protecting user accounting is zeroed out. The flag is then cleared when user accounting of the raw received snapshot is calculated. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #12981 Closes #10523 Closes #11221 Closes #11294 Closes #12594 Issue #11300	2022-02-04 16:14:56 -08:00
Finix1979	1009e60992	Linux <4.8 compat: submit_bio() rw arg When using the two argument version of submit_bio() in kernel's prior to 4.8 the first argument should be specified. It's used by block dump to report the bio direction. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Finix Yan <yancw@info2soft.com> Closes #13006	2022-02-04 08:33:52 -08:00
наб	4f6599416a	Linux 5.17 compat: PDE_DATA() renamed to pde_data() Upstream commit 359745d78351c6f5442435f81549f0207ece28aa ("proc: remove PDE_DATA() completely") Link: https://lore.kernel.org/all/20211124081956.87711-2-songmuchun@bytedance.com/T/#u Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13004 Closes #12989	2022-02-04 08:33:52 -08:00
наб	f42c126029	Linux 5.17 compat: dequeue_signal() takes a 4th argument Linux 5.17's dequeue_signal() takes an additional enum pid_type * output argument Upstream commit 5768d8906bc23d512b1a736c1e198aa833a6daa4 ("signal: Requeue signals in the appropriate queue") Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12989	2022-02-04 08:33:52 -08:00
наб	2ce06d93a8	Linux 5.17 compat: detect complete_and_exit() rename Linux 5.17 sees a rename from complete_and_exit() to kthread complete_and_exit() Upstream commit cead18552660702a4a46f58e65188fe5f36e9dfe ("exit: Rename complete_and_exit to kthread_complete_and_exit") Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12989	2022-02-04 08:33:52 -08:00
Rich Ercolani	8ef01afbfc	Add support for FALLOC_FL_ZERO_RANGE For us, I think it's always just FALLOC_FL_PUNCH_HOLE with a fake mustache on. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Coleman Kane <ckane@colemankane.org> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12975	2022-02-04 08:33:52 -08:00
Rich Ercolani	c31c1146b6	Linux 5.16 compat: Added add_disk check for return add_disk went from void to must-check int return. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Coleman Kane <ckane@colemankane.org> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12975	2022-02-04 08:33:52 -08:00
Ryan Moeller	af1630c883	FreeBSD: Fix zvol_cdev_open locking First open locking changes were correctly applied to zvol_geom_open but incorrectly applied to zvol_cdev_open, causing spa_namespace_lock to be held indefinitely. Make the first open locking in zvol_cdev_open match zvol_geom_open. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #13016	2022-02-03 15:28:01 -08:00
Ryan Moeller	1828b68a0b	FreeBSD: Fix zvol_*_open() locking These are the changes for FreeBSD corresponding to the changes made for Linux in #12863, see that PR for details. Changes from #12863 are applied for zvol_geom_open and zvol_cdev_open on FreeBSD. This also adds a check for the zvol dying which we had in zvol_geom_open but was missing in zvol_cdev_open. The check causes the open to fail early with ENXIO when we are in the middle of changing volmode. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12934	2022-02-03 15:28:01 -08:00
наб	36a91d6cef	FreeBSD: vfsops: use setgen for error case Fix from https://github.com/openzfs/zfs/pull/12844#discussion_r774179413 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12905	2022-02-03 15:28:01 -08:00
chrisrd	1259dc6e6a	zfs_prune: reset sc.nr_to_scan sc.nr_to_scan is an input to super_cache_clean (via shrinker->scan_objects), used to set the number of objects to scan in the various caches. However super_cache_scan also modifies sc.nr_to_scan, so when used in a loop we need to reset sc.nr_to_scan back to our desired nr_to_scan for the next iteration. Issue discovered and solution suggested by Tenzin Lhakhang @tlhakhan. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Issue #12433 Closes #12908	2022-02-03 15:28:01 -08:00
наб	5d8c081193	FreeBSD: fix unpropagated error When performing I/O on FreeBSD using a file based vdev ensure all errors encountered when reading/writing are propagated through the zio pipeline. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12904	2022-02-03 15:28:01 -08:00
Brian Behlendorf	9ec630ff2c	Fix zvol_open() lock inversion When restructuring the zvol_open() logic for the Linux 5.13 kernel a lock inversion was accidentally introduced. In the updated code the spa_namespace_lock is now taken before the zv_suspend_lock allowing the following scenario to occur: down_read <=== waiting for zv_suspend_lock zvol_open <=== holds spa_namespace_lock __blkdev_get blkdev_get_by_dev blkdev_open ... mutex_lock <== waiting for spa_namespace_lock spa_open_common spa_open dsl_pool_hold dmu_objset_hold_flags dmu_objset_hold dsl_prop_get dsl_prop_get_integer zvol_create_minor dmu_recv_end zfs_ioc_recv_impl <=== holds zv_suspend_lock via zvol_suspend() zfs_ioc_recv ... This commit resolves the issue by moving the acquisition of the spa_namespace_lock back to after the zv_suspend_lock which restores the original ordering. Additionally, as part of this change the error exit paths were simplified where possible. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12863	2022-02-03 15:28:01 -08:00
Alan Somers	4b2bac5fe9	FreeBSD: Update argument types for VOP_READDIR A recent commit to FreeBSD changed the type of vop_readdir_args.a_cookies to a uint64_t**. There is no functional impact to ZFS because ZFS only uses 32-bit cookies, which will be zero-extended to 64-bits by the existing code. `b214fcceac` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Alan Somers <asomers@gmail.com> Closes #12874	2022-02-03 15:28:01 -08:00
Ryan Moeller	913ae45218	FreeBSD: Provide correct file generation number va_seq was actually a thin veil over va_gen, so z_gen is a more appropriate value than z_seq to populate the field with. Drop the unnecessary compat obfuscation and provide the correct file generation number. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@freebsd.org> Closes #12851	2022-02-03 15:28:01 -08:00
Ryan Moeller	def73c0735	FreeBSD: Add vop_standard_writecount_nomsync https://cgit.freebsd.org/src/commit?id=3ffcfa599e29686cf2b3c1a6087408c37acaed78 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #12828	2021-12-13 13:23:07 -08:00
Ryan Moeller	effe984148	FreeBSD: Catch up with more VFS changes Unused thread argument was removed from NDINIT* https://cgit.freebsd.org/src/commit?id=7e1d3eefd410ca0fbae5a217422821244c3eeee4 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #12828	2021-12-13 13:23:01 -08:00
Mark Johnston	19337332cc	Fix several bugs in the FreeBSD rename VOP implementation - To avoid a use-after-free, zfsvfs->z_log needs to be loaded after the teardown lock is acquired with ZFS_ENTER(). - Avoid leaking vnode locks in zfs_rename_relock() and zfs_rename_() when the ZFS_ENTER() macros forces an early return. Refactor the rename implementation so that ZFS_ENTER() can be used safely. As a bonus, this lets us use the ZFS_VERIFY_ZP() macro instead of open-coding its implementation. Reported-by: Peter Holm <pho@FreeBSD.org> Tested-by: Peter Holm <pho@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Sponsored-by: The FreeBSD Foundation Closes #12717	2021-12-13 13:22:54 -08:00
Pawel Jakub Dawidek	b96737b83e	Remove (now unused) td argument from zfs_lookup() Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #12748	2021-12-13 13:22:47 -08:00
Mark Johnston	4b7bfcf8a0	Exit the teardown section later in rename on FreeBSD We have to hold the teardown lock while dereferencing zfsvfs->z_os and, I believe, when committing to the ZIL. Note that jumping to the "out" label, "error" is always non-zero. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #12704	2021-12-13 13:22:41 -08:00
Mark Johnston	07165ce540	Fix potential use-after-frees in FreeBSD getpages and setattr VOPs The objset object is reallocated during certain dataset operations, such as rollbacks, so the objset pointer must be loaded after acquiring the teardown lock. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #12704	2021-12-13 13:22:34 -08:00
Damian Szuberski	64e88992b6	Update `checkstyle` workflow env to ubuntu-20.04 - `checkstyle` workflow uses ubuntu-20.04 environment - improved `mancheck.sh` readability Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #12713	2021-12-08 13:27:56 -08:00
Coleman Kane	bef7c02c81	Linux 5.16: The blk-cgroup.h header is where struct blkcg_gq is defined The definition of struct blkcg_gq was moved into blk-cgroup.h, which is a header that's been in Linux since 2015. This is used by vdev_blkg_tryget() in module/os/linux/zfs/vdev_disk.c. Since the kernel for CentOS 7 and similar-generation releases doesn't have this header, its inclusion is guarded by a configure test. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #12819	2021-12-07 13:14:23 -08:00
Coleman Kane	ea61e07413	Linux 5.16: bio_set_dev is no longer a helper macro This change adds a confiugre check to determine if bio_set_dev is a helper macro or not. If not, then the attempt to override its internal call to bio_associate_blkg(), with a macro definition to our own version, is no longer possible, as the compiler won't use it when compiling the new inline function replacement implemented in the header. This change also creates a new vdev_bio_set_dev() function that performs the same work, and also performs the work implemented in vdev_bio_associate_blkg(), as it is the only thing calling that function in our code. Our custom vdev_bio_associate_blkg() is now only compiled if the bio_set_dev() is a macro in the Linux headers. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #12819	2021-12-07 13:14:23 -08:00
Coleman Kane	9519fe1ff8	Linux 5.16: type member of iov_iter renamed iter_type The iov_iter->type member was renamed iov_iter->iter_type. However, while looking into this, realized that in 2018 a iov_iter_type(*iov) accessor function was introduced. So if that is present, use it, otherwise fall back to trying the existing behavior of directly accessing type from iov_iter. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #12819	2021-12-07 13:14:23 -08:00
Coleman Kane	0c40ff56f2	Linux 5.16: block_device_operations->submit_bio now returns void The return type for the submit_bio member of struct block_device_operations was changed to no longer return a value. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #12819	2021-12-07 13:14:23 -08:00
Brian Behlendorf	16da688f25	Linux 5.13 compat: retry zvol_open() when contended Due to a possible lock inversion the zvol open call path on Linux needs to be able to retry in the case where the spa_namespace_lock cannot be acquired. For Linux 5.12 an older kernel this was accomplished by returning -ERESTARTSYS from zvol_open() to request that blkdev_get() drop the bdev->bd_mutex lock, reaquire it, then call the open callback again. However, as of the 5.13 kernel this behavior was removed. Therefore, for 5.12 and older kernels we preserved the existing retry logic, but for 5.13 and newer kernels we retry internally in zvol_open(). This should always succeed except in the case where a pool's vdev are layed on zvols, in which case it may fail. To handle this case vdev_disk_open() has been updated to retry when opening a device when -ERESTARTSYS is returned. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12301 Closes #12759	2021-12-06 12:22:57 -08:00
Coleman Kane	12d27e7134	Linux 5.16: wait_on_page_bit() no longer available to modules Instead, linux/pagemap.h offers a number of folio-specific functions to be called instead. In this case, module/os/linux/zfs/zfs_vnops_os.c wants to call wait_on_page_bit(pp, PG_writeback). This gets replaced with folio_wait_bit(folio_page(pp), PG_writeback). This change modifies the code to conditionally compile that if configure identifies th presence of the folio_wait_bit() function. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #12800	2021-12-06 12:22:38 -08:00
Brian Behlendorf	22b0891dbb	Linux 5.16 compat: submit_bio() The submit_bio() prototype has changed again. The version is 5.16 still only expects a single argument but the return type has changed to void. Since we never used the returned value before update the configure check to detect both single arg versions. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12725	2021-11-05 07:51:21 -07:00
Ryan Moeller	27d9c6ae2b	FreeBSD: Catch up with recent VFS changes cn_thread is always curthread. https://cgit.freebsd.org/src/commit?id=b4a58fbf640409a1e507d9f7b411c83a3f83a2f3 https://cgit.freebsd.org/src/commit?id=2b68eb8e1dbbdaf6a0df1c83b26f5403ca52d4c3 Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Alan Somers <asomers@gmail.com> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #12668	2021-11-02 13:48:54 -07:00
Brian Behlendorf	143476ce8d	Use fallthrough macro As of the Linux 5.9 kernel a fallthrough macro has been added which should be used to anotate all intentional fallthrough paths. Once all of the kernel code paths have been updated to use fallthrough the -Wimplicit-fallthrough option will because the default. To avoid warnings in the OpenZFS code base when this happens apply the fallthrough macro. Additional reading: https://lwn.net/Articles/794944/ Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12441	2021-11-02 09:50:30 -07:00
Brian Behlendorf	32512acbc0	Linux 5.15 compat: get_acl() Kernel commits 332f606b32b6 ovl: enable RCU'd ->get_acl() 0cad6246621b vfs: add rcu argument to ->get_acl() callback Added compatibility code to detect the new ->get_acl() interface and correctly handle the case where the new rcu argument is set. Reviewed-by: Coleman Kane <ckane@colemankane.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12548	2021-09-14 15:42:59 -07:00
Ryan Moeller	81611683c8	FreeBSD: Don't remove SA xattr if not SA znode We attempt to remove an existing SA xattr when setting a dir xattr, but this only makes sense if the znode has been upgraded to the SA format. Otherwise, we will hit an assert in zfs_sa_get_xattr. Make sure this is an SA znode before attempting to remove the SA xattr. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12514	2021-09-14 15:11:56 -07:00
Richard Yao	1655ce5619	Linux 4.11 compat: statx support Linux 4.11 added a new statx system call that allows us to expose crtime as btime. We do this by caching crtime in the znode to match how atime, ctime and mtime are cached in the inode. statx also introduced a new way of reporting whether the immutable, append and nodump bits have been set. It adds support for reporting compression and encryption, but the semantics on other filesystems is not just to report compression/encryption, but to allow it to be turned on/off at the file level. We do not support that. We could implement semantics where we refuse to allow user modification of the bit, but we would need to do a dnode_hold() in zfs_znode_alloc() to find out encryption/compression information. That would introduce locking that will have a minor (although unmeasured) performance cost. It also would be inferior to zdb, which reports far more detailed information. We therefore omit reporting of encryption/compression through statx in favor of recommending that users interested in such information use zdb. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #8507	2021-09-14 14:31:50 -07:00
Alexander Motin	a4862125b8	Remove b_pabd/b_rabd allocation from arc_hdr_alloc() When a header is allocated for full overwrite it is a waste of time to allocate b_pabd/b_rabd for it, since arc_write() will free them without ever being touched. If it is a read or a partial overwrite then arc_read() and arc_hdr_decrypt() allocate them explicitly. Reduced memory allocation in user threads also reduces ARC eviction throttling there, proportionally increasing it in ZIO threads, that is not good. To minimize or even avoid it introduce ARC allocation reserve, allowing certain arc_get_data_abd() callers to allocate a bit longer in situations where user threads will already throttle. Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12398	2021-09-14 14:31:50 -07:00
Allan Jude	24e51e3749	Restore FreeBSD sysctl processing for arc.min and arc.max Before OpenZFS 2.0, trying to set the FreeBSD sysctl vfs.zfs.arc_max to a disallowed value would return an error. Since the switch, it instead only generates WARN_IF_TUNING_IGNORED Keep the ability to set the sysctl's specifically to 0, even though that is less than the minimum, because some tests depend on this. Also lost, was the ability to set vfs.zfs.arc_max to a value less than the default vfs.zfs.arc_min at boot time. Restore this as well. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #12161	2021-09-14 14:31:01 -07:00
Ryan Moeller	744f3009fc	zfs: add missed dependency of zfs module on zlib Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Martin Matuska <mm@FreeBSD.org> Co-authored-by: Konstantin Belousov <kib@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> External-issue: https://reviews.freebsd.org/D31207 Closes #12442	2021-09-14 14:30:39 -07:00
Brian Behlendorf	32a971e749	Enable /proc/diskstats for zvols The /proc/diskstats accounting needs to be explicitly enabled for block devices which do not use multi-queue. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12440 Closes #12066	2021-09-14 14:30:13 -07:00
hedongzhang	ddb732e2c8	Modify checksum obtain method of QAT CpaDcGeneratefooter function that obtain the checksum code does not support the CPA_DC_STATELESS mode. So we get the adler32 chencksum of the end of the zlib from dc_results. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com> Signed-off-by: hedong.zhang <h_d_zhang@163.com> Closes #12343	2021-09-14 14:29:46 -07:00

1 2 3 4 5 ...

523 Commits