Archive-Team/zfs

Commit Graph

Author	SHA1	Message	Date
Arshad Hussain	6052060c13	Don't use hard-coded 'size' value in snprintf() This patch changes the passing of "size" to snprintf from hard-coded (openended) to sizeof(errbuf). This is bringing to standard with rest of the code where- ever 'errbuf' is used. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Closes #15003	2023-06-30 08:37:26 -07:00
Alexander Motin	eda32dca92	Fix remount when setting multiple properties. The previous code was checking zfs_is_namespace_prop() only for the last property on the list. If one was not "namespace", then remount wasn't called. To fix that move zfs_is_namespace_prop() inside the loop and remount if at least one of properties was "namespace". Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15000	2023-06-30 08:36:43 -07:00
vimproved	24554082bd	contrib: dracut: Conditionalize copying of libgcc_s.so.1 to glibc only The issue that this is designed to work around is only applicable to glibc, since it's caused by glibc's pthread_cancel() implementation using dlopen on libgcc_s.so.1 (and therefore not triggering dracut to include it in the initramfs). This commit adds an extra condition to the workaround that tests for glibc via "ldconfig -p | grep -qF 'libc.so.6'" (which should only be present on glibc systems). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Violet Purcell <vimproved@inventati.org> Closes #14992	2023-06-29 12:54:37 -07:00
Yuri Pankov	77a3bb1f47	spa.h: use IN_BASE instead of IN_FREEBSD_BASE Consistently get the proper default value for autotrim. Currently, only the kernel module is built with IN_FREEBSD_BASE, and libzfs get the wrong default value, leading to confusion and incorrect output when autotrim value was not set explicitly. Reviewed-by: Warner Losh <imp@bsdimp.com> Signed-off-by: Yuri Pankov <yuripv@FreeBSD.org> Closes #15016	2023-06-29 11:50:52 -07:00
Mateusz Piotrowski	62ace21a14	zdb: Add missing poolname to -C synopsis Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org> Sponsored-by: Klara Inc. Closes #15014	2023-06-29 10:54:43 -07:00
Alexander Motin	a9d6b0690b	ZIL: Fix another use-after-free. lwb->lwb_issued_txg can not be accessed after lwb_state is set to LWB_STATE_FLUSH_DONE and zl_lock is dropped, since the lwb may be freed by zil_sync(). We must save the txg number before that. This is similar to the `55b1842f92`, but as I see the bug is not new. It existed for quite a while, just was not triggered due to smaller race window. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14988 Closes #14999	2023-06-27 17:03:37 -07:00
Alexander Motin	b0cbc1aa9a	Use big transactions for small recordsize writes. When ZFS appends files in chunks bigger than recordsize, it borrows buffer from ARC and fills it before opening transaction. This is supposed to help in case of page faults to not hold transaction open indefinitely. The problem appears when recordsize is set lower than default 128KB. Since each block is committed in separate transaction, per-transaction overhead becomes significant, and what is even worse, active use of per-dataset and per-pool locks to protect space use accounting for each transaction badly hurts the code SMP scalability. The same transaction size limitation applies in case of file rewrite, but without even excuse of buffer borrowing. To address the issue, disable the borrowing mechanism if recordsize is smaller than default and the write request is 4x bigger than it. In such case writes up to 32MB are executed in single transaction, that dramatically reduces overhead and lock contention. Since the borrowing mechanism is not used for file rewrites, and it was never used by zvols, which seem to work fine, I don't think this change should create significant problems, partially because in addition to the borrowing mechanism there are also used pre-faults. My tests with 4/8 threads writing several files same time on datasets with 32KB recordsize in 1MB requests show reduction of CPU usage by the user threads by 25-35%. I would measure it in GB/s, but at that block size we are now limited by the lock contention of single write issue taskqueue, which is a separate problem we are going to work on. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14964	2023-06-27 17:00:30 -07:00
Laevos	bc9d0084ea	Remove unnecessary commas in zpool-create.8 Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Laevos <5572812+Laevos@users.noreply.github.com> Closes #15011	2023-06-27 16:58:32 -07:00
Alexander Motin	638a717d2a	Merge pull request #142 from truenas/NAS-122578-zfsd-exec-perm NAS-122578 / None / Make zfsd executable in order to run it from the rc.d script	2023-06-27 18:26:25 -04:00
Ameer Hamza	66066524df	Make zfsd executable in order to run it from the rc.d script Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-28 02:52:14 +05:00
Ameer Hamza	1e1523c2e5	Merge pull request #140 from truenas/zfs-2.2-build-fix linux 6.3 compat changes for truenas/zfs-2.2-release	2023-06-28 02:38:51 +05:00
Ameer Hamza	d00a585773	linux 6.3 compat changes for truenas/zfs-2.2-release Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-27 23:24:39 +05:00
Alexander Motin	8469b5aac0	Another set of vdev queue optimizations. Switch FIFO queues (SYNC/TRIM) and active queue of vdev queue from time-sorted AVL-trees to simple lists. AVL-trees are too expensive for such a simple task. To change I/O priority without searching through the trees, add io_queue_state field to struct zio. To not check number of queued I/Os for each priority add vq_cqueued bitmap to struct vdev_queue. Update it when adding/removing I/Os. Make vq_cactive a separate array instead of struct vdev_queue_class member. Together those allow to avoid lots of cache misses when looking for work in vdev_queue_class_to_issue(). Introduce deadline of ~0.5s for LBA-sorted queues. Before this I saw some I/Os waiting in a queue for up to 8 seconds and possibly more due to starvation. With this change I no longer see it. I had to slightly more complicate the comparison function, but since it uses all the same cache lines the difference is minimal. For a sequential I/Os the new code in vdev_queue_io_to_issue() actually often uses more simple avl_first(), falling back to avl_find() and avl_nearest() only when needed. Arrange members in struct zio to access only one cache line when searching through vdev queues. While there, remove io_alloc_node, reusing the io_queue_node instead. Those two are never used same time. Remove zfs_vdev_aggregate_trim parameter. It was disabled for 4 years since implemented, while still wasted time maintaining the offset-sorted tree of TRIM requests. Just remove the tree. Remove locking from txg_all_lists_empty(). It is racy by design, while 2 pair of locks/unlocks take noticeable time under the vdev queue lock. With these changes in my tests with volblocksize=4KB I measure vdev queue lock spin time reduction by 50% on read and 75% on write. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14925	2023-06-27 09:09:48 -07:00
Rich Ercolani	35a6247c5f	Add a delay to tearing down threads. It's been observed that in certain workloads (zvol-related being a big one), ZFS will end up spending a large amount of time spinning up taskqs only to tear them down again almost immediately, then spin them up again... I noticed this when I looked at what my mostly-idle system was doing and wondered how on earth taskq creation/destroy was a bunch of time... So I added a configurable delay to avoid it tearing down tasks the first time it notices them idle, and the total number of threads at steady state went up, but the amount of time being burned just tearing down/turning up new ones almost vanished. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #14938	2023-06-26 13:57:12 -07:00
Ameer Hamza	a52c8d49c6	Merge pull request #139 from truenas/truenas/zfs-2.2-testing Forward port truenas/zfs patches to upstream openzfs master	2023-06-22 16:30:04 +05:00
Ameer Hamza	5bf0d5db13	Bump changelog for 2.1.99 Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ameer Hamza	fc2b4d3458	Skip id-mapped tests for now due to nfsv4 acls incompatibility	2023-06-21 21:29:23 +05:00
Ameer Hamza	e3b5817448	Port latest zfsd changes from upstream FreeBSD Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ameer Hamza	06029e211c	Port TrueNAS contrib changes and adjust github workflows Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Andrew Walker	c16f99d389	Improve zpl_permission performance This function can be frequently called with MAY_EXEC|MAY_NOT_BLOCK during RCU path walk. Where possible we should try not to break out of it. In this case we check whether flag ZFS_NO_EXECS_DENIED is set and check mode (similar to fastexecute check in zfs_acl.c). Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ameer Hamza	f34365ed28	zfsd: add support for hotplugging spares If you remove an unused spare and then reinsert it, zfsd will now online it in all pools. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Umer Saleem	0b58a60509	Fix OpenZFS build issue for Debian Bookworm dkms package layout is changed in bookworm and splits into dh-dkms package. Debhelper in Bookworm is updated to use dh-sequence-dkms instead of dkms. GitHub Actions are updated to use Ubuntu 22.04 instead of Ubuntu 20.04, since dh-sequence-dkms is not available on Ubuntu 20.04. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>	2023-06-21 21:29:23 +05:00
Andrew Walker	33ec2c3e96	Simplify get/set NFS4 ACL (#113) This removes an extra memory allocation / free from the NFS4 ACL xattr handler. Initially this was written rather quickly in the alpha cycle of SCALE and implemented in a way to ensure that xattr was exactly matching format used internally in samba's vfs_acl_xattr module. Since this time a more efficient conversion between the Samba format and various other ones was added for the purpose of inclusion in the Kernel NFS server. This change simplifies conversion between internal NFS ACL and external xattr representation, but has no impact on userspace and kernel consumers of this xattr (format does not change). Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Andrew Walker	09a0c8a0ee	Fix ZFS_READONLY implementation on Linux (#121) MS-FSCC 2.6 is the governing document for DOS attribute behavior. It specifies the following: For a file, applications can read the file but cannot write to it or delete it. For a directory, applications cannot delete it, but applications can create and delete files from the directory. Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Umer Saleem	02af6c4175	Update CI workflow for native packages CI workflow now builds RPM converted Debian packages along with native debian packages. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ryan Moeller	6115cf6a76	SCALE: ignore wholedisk We never want to partition vdevs automatically from ZFS in SCALE. Ignore the wholedisk flag in SCALE and skip the tests that expect auto partitioning to work. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 21:29:23 +05:00
Umer Saleem	f4efe4ea92	Build packages with debug symbols With --enable-debuginfo configured, ZFS packages are built with debug symbols embedded into the binaries. Signed-off-by: Umer Saleem <usaleem@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ameer Hamza	f41d5dc6f1	Add kfpu entry to kbuild and suppress Cppcheck checks Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ryan Moeller	26b74065b9	Provide kfpu_begin/end from spl Jira: NAS-115648	2023-06-21 21:29:23 +05:00
Ryan Moeller	631adac5f6	initramfs: Skip lvm scan before boot pool import TrueNAS SCALE doesn't boot from pools on top of LVM, and the scan can take a significant amount of time on systems with a large number of disks. Skip the lvm commands in our local-top/zfs script. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 21:29:23 +05:00
Andrew	ac2420afb0	NAS-116836 / Force BSD semantics for group ownership if NFSV4ACL (#78) When a new file is created on FreeBSD it is given the group of the directory which contains it. On Linux it is given either the effective GID of the process (System V semantics) or the GID of the parent directory (BSD semantics). Since there is no hard-and-fast rule about creation semantics for NFSv4 ACLs on Linux, we should opt for what is least likely to break users' permissions on change from FreeBSD to Linux. Avoid actually setting the SGID bit on dirs unless it was explicitly set. Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ameer Hamza	c0d493822b	Fix ACL build errors on sync with openzfs/master Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 21:29:23 +05:00
Andrew	6bf8daf376	Add ability for xattr handler to "strip" NFSv4 ACL (#54) On Linux POSIX ACLs can be removed via rmxattr() for the relevant system xattrs. On FreeBSD a non-trivial ACL can be converted to one that is described by the mode with no loss of info via combination of acl_get_file(), acl_strip_np(), and acl_set_file(). Since there's no libc equivalent of these ops in Linux for NFSv4 ACLs, this commit makes this less error prone by handling entirely in ZFS. When user performs rmxattr() vfs_setxattr() is called with value of NULL and length of 0. Add special handling for this situation in the xattr handler for the NFSv4 ACL so that we generate a new ACL and zfs_acl_chmod() with the existing mode of file, then set the ACL. Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Andrew	6dc46c7d54	NAS-115465 / 22.12 / expose ZFS_ACL_TRIVIAL to users (#52) Add ACL_IS_TRIVIAL and ACL_IS_DIR flags as ACL-wide flags in the system.nfs4_acl_xdr generated on getxattr requests. These are non-RFC flags that are useful for userspace applications (especially the ACL_IS_TRIVIAL flag as it can be used to avoid relatively expensive ACL-related operations). Also add system.nfs4_acl_xdr to xattr results if ACL is not trivial. This duplicates POSIX ACL behavior where whether an ACL is set on a path can be determined via listxattr(). Since the ACL is not actually removed, we check whether the ZFS_ACL_TRIVIAL is set. If the flag is not set, then we omit the xattr name from the list. This allows users to determine whether ACL is trivial from listxattr(). Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 21:29:23 +05:00
Ryan Moeller	e5f1583a08	Make zpl_permission work with 5.12+ kernels The "permission" inode operation takes a new `struct user_namespace *` parameter starting in Linux 5.12. Add a configure check and adapt accordingly. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 02:51:24 +05:00
Ryan Moeller	e7904b8280	Switch to production builds for SCALE Jira: NAS-113186 Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 02:51:24 +05:00
Andrew Walker	8503a85e06	Fix access check when cred allows override of ACL Properly evaluate edge cases where user credential may grant capability to override DAC in various situations. Switch to using ns-aware checks rather than capable(). Expand optimization allow bypass of zfs_zaccess() in case of trivial ACL if MAY_OPEN is included in requested mask. This will be evaluated in generic_permission() check, which is RCU walk safe. This means that in most cases evaluating permissions on boot volume with NFSv4 ACLs will follow the fast path on checking inode permissions. Additionally, CAP_SYS_ADMIN is granted to nfsd process, and so override for this capability in access2 policy check is removed in favor of a simple check for fsid == 0. Checks for CAP_DAC_OVERRIDE and other override capabilities are kept as-is. Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 02:51:24 +05:00
Alexander Motin	4d8b67b164	Write /sys/kernel/wait_for_device_probe before import. The new sysfs attribute makes kernel to wait for all device probe to complete before return. Without it wait_for_udev call does not give any guaranties. Ticket: NAS-108200 Signed-off-by: Alexander Motin <mav@FreeBSD.org>	2023-06-21 02:51:24 +05:00
Ryan Moeller	c078b8660e	Make acltype=nfsv4 the default on Linux, too Now that we support NFSv4 ACLs on Linux, this can now be made the default across all platforms. Update the documentation and tests accordingly. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 02:51:24 +05:00
Ameer Hamza	3c72bef6bd	Adjust zfsd Makefiles for openzfs compatibility Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-06-21 02:51:15 +05:00
Ryan Moeller	35ca19b591	Add zfsd for FreeBSD Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 00:33:40 +05:00
Andrew	c6ba4a01f0	Implement NFSv41 ACLs through xattr This implements NFSv41 (RFC 5661) ACLs in a manner compatible with vfs_nfs4acl_xattr in Samba and nfs4xdr-acl-tools. There are three key areas of change in this commit: 1) NFSv4 ACL management through system.nfs4_acl_xdr xattr. Install an xattr handler for "system.nfs4_acl_xdr" that presents an xattr containing full NFSv41 ACL structures generated through rpcgen using specification from the Samba project. This xattr is used by userspace programs to read and set permissions. 2) add an i_op->permissions endpoint: zpl_permissions(). This is used by the VFS in Linux to determine whether to allow / deny an operation. Wherever possible, we try to avoid having to call zfs_access(). If kernel has NFSv4 patch for VFS, then perform more complete check of available access mask. 3) add capability-based overrides to secpolicy_vnode_access2() there are various situations in which ACL may need to be overridden based on capabilities. This logic is almost directly copied from Linux VFS. For instance, root needs to be able to always read / write ACLs (otherwise admin can get locked out from files). This commit was initially inspired by work from Paul B. Henson to implement NFSv4.0 (RFC3530) ACLs in ZFS on Linux. Key areas of divergence are as follows: - ACL specification, xattr format, xattr name - Addition of handling for NFSv4 masks from Linux VFS - Addition of ACL overrides based on capabilities Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 00:33:32 +05:00
Andrew Walker	5e1eba8718	Advertise support for large xattrs on TrueNAS SB_LARGEXATTR is used in TrueNAS SCALE to indicate to the kernel that the filesystem supports large-size xattrs (greater than 64KiB). This flag is used to evaluate whether to allow large xattr read or write requests (up to 2 MiB). Signed-off-by: Andrew Walker <awalker@ixsystems.com>	2023-06-21 00:33:25 +05:00
Waqar Ahmed	cfd08bedb2	Add action to build and push docker image on master update Signed-off-by: Waqar Ahmed <waqarahmedjoyia@live.com>	2023-06-21 00:33:20 +05:00
Andrew Walker	17d7f9de97	Add check for custom TrueNAS kernel Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 00:33:13 +05:00
Waqar Ahmed	fd31804abc	Add CI for building zfs package Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2023-06-21 00:33:06 +05:00
Matt Macy	ae78a23f75	Fix ZFS_DEBUG_MODIFY assert in arc_buf_try_copy_decompressed_data The assert does not account for the case where there is a single buffer in the chain that is decompressed and has a valid checksum. Signed-off-by: Matt Macy <mmacy@FreeBSD.org>	2023-06-21 00:32:59 +05:00
Ryan Moeller	23f878a89d	Add packaging bits for TrueNAS SCALE	2023-06-21 00:32:51 +05:00
Alexander Motin	8e8acabdca	Fix memory leak in zil_parse(). `482da24e2` missed arc_buf_destroy() calls on log parse errors, possibly leaking up to 128KB of memory per dataset during ZIL replay. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14987	2023-06-17 19:51:37 -07:00
George Amanakis	10e36e1761	Shorten arcstat_quiescence sleep time With the latest L2ARC fixes, 2 seconds is too long to wait for quiescence of arcstats like l2_size. Shorten this interval to avoid having the persistent L2ARC tests in ZTS prematurely terminated. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14981	2023-06-15 12:45:36 -07:00
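
The remount fix in commit eda32dca92 above comes down to checking every property in the list rather than only the last one. The following is a minimal sketch of that logic, written in Python for brevity rather than in the project's C. The names `NAMESPACE_PROPS`, `is_namespace_prop`, `needs_remount_old` and `needs_remount_fixed` are hypothetical, and the property set is an illustrative stand-in for whatever zfs_is_namespace_prop() actually accepts.

```python
# Hypothetical stand-in for zfs_is_namespace_prop(); the real list lives in
# libzfs and may differ from this illustrative set.
NAMESPACE_PROPS = {"atime", "exec", "readonly"}


def is_namespace_prop(name):
    """Return True if the property would require a remount to take effect."""
    return name in NAMESPACE_PROPS


def needs_remount_old(props):
    """Behaviour described as buggy: only the last property is checked."""
    last = None
    for name in props:
        last = name  # the real code applies each property here
    return last is not None and is_namespace_prop(last)


def needs_remount_fixed(props):
    """Behaviour after the fix: the check sits inside the loop, so one
    namespace property anywhere in the list is enough to trigger a remount."""
    remount = False
    for name in props:
        if is_namespace_prop(name):
            remount = True
    return remount
```

With the list `["atime", "compression"]`, the old variant reports that no remount is needed because the last entry is not a namespace property, while the fixed variant reports that one is.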

9098 Commits