Archive-Team/zfs

Commit Graph

Author	SHA1	Message	Date
Shaan Nobee	9e5a297de6	Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneously Page writebacks with WB_SYNC_NONE can take several seconds to complete since they wait for the transaction group to close before being committed. This is usually not a problem since the caller does not need to wait. However, if we're simultaneously doing a writeback with WB_SYNC_ALL (e.g via msync), the latter can block for several seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE writeback since it needs to wait for the transaction to complete and the PG_writeback bit to be cleared. This commit deals with 2 cases: - No page writeback is active. A WB_SYNC_ALL page writeback starts and even completes. But when it's about to check if the PG_writeback bit has been cleared, another writeback with WB_SYNC_NONE starts. The sync page writeback ends up waiting for the non-sync page writeback to complete. - A page writeback with WB_SYNC_NONE is already active when a WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up waiting for the WB_SYNC_NONE writeback. The fix works by carefully keeping track of active sync/non-sync writebacks and committing when beneficial. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shaan Nobee <sniper111@gmail.com> Closes #12662 Closes #12790	2023-06-05 10:59:02 -07:00
Don Brady	30dcddaec7	Refine special_small_blocks property validation When the special_small_blocks property is being set during a pool create it enforces a limit of 128KiB even if the pool's record size is larger. If the recordsize property is being set during a pool create, then use that value instead of the default SPA_OLD_MAXBLOCKSIZE value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <dev.fs.zfs@gmail.com> Closes #13815 Closes #14811	2023-05-27 18:23:33 -07:00
Paul Dagnelie	e1b3ab5f51	ZTS: send-c_volume is flaky We use block_device_wait to wait for the zvol block device to actually appear, and we log the result of the dd calls by using an intermediate file. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #14767	2023-05-27 18:21:49 -07:00
Brian Behlendorf	e97637d484	Add the ability to uninitialize zpool initialize functions well for touching every free byte...once. But if we want to do it again, we're currently out of luck. So let's add zpool initialize -u to clear it. Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12451 Closes #14873	2023-05-26 10:09:04 -07:00
Akash B	c2f0aaeb3c	Fix concurrent resilvers initiated at same time For draid vdevs it was possible to initiate both the sequential and healing resilver at same time. This fixes the following two scenarios. 1) There's a window where a sequential rebuild can be started via ZED even if a healing resilver has been scheduled. - This is fixed by adding additional check in spa_vdev_attach() for any scheduled resilver and return appropriate error code when a resilver is already in progress. 2) It was possible for zpool clear to start a healing resilver when it wasn't needed at all. This occurs because during a vdev_open() the device is presumed to be healthy not until the device is validated by vdev_validate() and it's set unavailable. However, by this point an async resilver will have already been requested if the DTL isn't empty. - This is fixed by cancelling the SPA_ASYNC_RESILVER request immediately at the end of vdev_reopen() when a resilver is unneeded. Finally, added a testcase in ZTS for verification. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #14881 Closes #14892	2023-05-26 10:07:19 -07:00
Brian Behlendorf	ecaf3ea3f2	ZTS: Minor fixes Backport two minor ZTS test case fixes from `63652e15` to resolve a few spurious failures. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2023-05-10 10:40:24 -07:00
David Hedberg	9b17d5a37d	Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole If we receive a DRR_FREEOBJECTS as the first entry in an object range, this might end up producing a hole if the freed objects were the only existing objects in the block. If the txg starts syncing before we've processed any following DRR_OBJECT records, this leads to a possible race where the backing arc_buf_t gets its psize set to 0 in the arc_write_ready() callback while still being referenced from a dirty record in the open txg. To prevent this, we insert a txg_wait_synced call if the first record in the range was a DRR_FREEOBJECTS that actually resulted in one or more freed objects. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: David Hedberg <david.hedberg@findity.com> Sponsored by: Findity AB Closes #11893 Closes #14358	2023-05-09 12:57:56 -07:00
Ameer Hamza	75ec145710	zpool import -m also removing spare and cache when log device is missing spa_import() relies on a pool config fetched by spa_try_import() for spare/cache devices. Import flags are not passed to spa_tryimport(), which makes it return early due to a missing log device and missing retrieving the cache device and spare eventually. Passing ZFS_IMPORT_MISSING_LOG to spa_tryimport() makes it fetch the correct configuration regardless of the missing log device. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #14794	2023-05-05 09:07:07 -07:00
Brian Behlendorf	c7db374ac6	Fix buffered/direct/mmap I/O race When a page is faulted in for memory mapped I/O the page lock may be dropped before it has been read and marked up to date. If a buffered read encounters such a page in mappedread() it must wait until the page has been updated. Failure to do so will result in a panic on debug builds and incorrect data on production builds. The critical part of this change is in mappedread() where pages which are not up to date are now handled. Additionally, it includes the following simplifications. - zfs_getpage() and zfs_fillpage() could be passed an array of pages. This could be more efficient if it was used but in practice only a single page was ever provided. These interfaces were simplified to acknowledge that. - update_pages() was modified to correctly set the PG_error bit on a page when it cannot be read by dmu_read(). - Setting PG_error and PG_uptodate was moved to zfs_fillpage() from zpl_readpage_common(). This is consistent with the handling in update_pages() and mappedread(). - Minor additional refactoring to comments and variable declarations to improve readability. - Add a test case to exercise concurrent buffered, direct, and mmap IO to the same file. - Reduce the mmap_sync test case default run time. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13608 Closes #14498	2023-04-21 13:12:35 -07:00
Damian Szuberski	07cc8ae46a	Removed Python 2 and Python 3.5- support Deprecation of Python versions below 3.6 gives opportunity to unify the build and install requirements for OpenZFS packages. The minimal supported Python version is 3.6 as this is the most recent Python package CentOS/RHEL 7 users can get. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #12925	2023-04-13 15:59:45 -07:00
Ameer Hamza	777c98ee52	Use setproctitle to report progress of zfs send This allows parsing of zfs send progress by checking the process title. Doing so requires some changes to the send code in libzfs_sendrecv.c; primarily these changes move some of the accounting around, to allow for the code to be verbose as normal, or set the process title. Unlike BSD, setproctitle() isn't standard in Linux; thus, borrowed it from libbsd with slight modifications. Authored-by: Sean Eric Fagan <sef@FreeBSD.org> Co-authored-by: Ryan Moeller <ryan@iXsystems.com> Co-authored-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #14376	2023-03-29 14:45:34 -07:00
Ameer Hamza	bd9a9a4e1a	zed: mark disks as REMOVED when they are removed ZED does not take any action for disk removal events if there is no spare VDEV available. Added zpool_vdev_remove_wanted() in libzfs and vdev_remove_wanted() in vdev.c to remove the VDEV through ZED on removal event. This means that if you are running zed and remove a disk, it will be properly marked as REMOVED. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-03-27 11:32:09 -07:00
George Amanakis	f806306ce0	Activate filesystem features only in syncing context When activating filesystem features after receiving a snapshot, do so only in syncing context. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14304 Closes #14252	2023-01-19 12:50:42 -08:00
наб	6af8e80310	fgrep -> grep -F Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2023-01-19 12:50:36 -08:00
наб	f8a124b104	egrep -> grep -E Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2023-01-19 12:50:25 -08:00
Antonio Russo	5371d8dae7	ZTS: close in mmapwrite.c commit `a7304ab9c1` upstream mmapwrite is used during the ZTS to identify issues with mmap-ed files. This helper program exercises this pathway by continuously writing to a file. `ee6bf97c7` modified the writing threads to terminate after a set amount of total data is written. This change allows standard program execution to reach the end of a writer thread without closing the file descriptor, introducing a resource "leak." This patch appeases resource leak analyses by close()-ing the file at the end of the thread. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #14353	2023-01-09 17:15:22 -08:00
Antonio Russo	a75af541cf	ZTS: limit mmapwrite file size commit `ee6bf97c77` upstream mmapwrite spawns several threads, all of which perform writes on a file for the purpose of testing the behavior of mmap(2)-ed files. One thread performs an mmap and a write to the beginning of that region, while the others perform regular writes after lseek(2)-ing the end of the file. Because these regular writes are set in a while (1) loop, they will write an unbounded amount of data to disk. The mmap_write_001_pos test script SIGKILLs them after 30 seconds, but on fast testbeds, this may be enough time to exhaust the available space in the filesystem, leading to spurious test failures. Instead, limit the total file size by checking that the lseek return value is no greater than 250 * 1024*1024 bytes, which is less than the default minimum vdev size defined in includes/default.cfg . This also includes part of `2a493a4c71`, which checks the return value of lseek. Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #14277 Closes #14345	2023-01-09 17:15:22 -08:00
Ameer Hamza	75fbe7eb99	skip permission checks for extended attributes zfs_zaccess_trivial() calls the generic_permission() to read xattr attributes. This causes deadlock if called from zpl_xattr_set_dir() context as xattr and the dent locks are already held in this scenario. This commit skips the permissions checks for extended attributes since the Linux VFS stack already checks it before passing us the control. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-01-05 11:10:28 -08:00
Ameer Hamza	d0f350c962	Allow receiver to override encryption properties in case of replication Currently, the receiver fails to override the encryption property for the plain replicated dataset with the error: "cannot receive incremental stream: encryption property 'encryption' cannot be set for incremental streams.". The problem is resolved by allowing the receiver to override the encryption property for plain replicated send. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2023-01-05 11:10:04 -08:00
szubersk	d50ce5c9ec	tests: mkfile: usage: () -> (void) Signed-off-by: szubersk <szuberskidamian@gmail.com>	2022-12-09 12:07:38 -08:00
George Amanakis	c8d2ab05e1	Fix setting the large_block feature after receiving a snapshot We are not allowed to dirty a filesystem when done receiving a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will not be set on that filesystem since the filesystem is not on dp_dirty_datasets, and a subsequent encrypted raw send will fail. Fix this by checking in dsl_dataset_snapshot_sync_impl() if the feature needs to be activated and do so if appropriate. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #13699 Closes #13782	2022-12-01 12:39:45 -08:00
наб	670d66e7a0	tests: cmd: draid: remove unused and undocumented -v Found with -Wunused-but-set-variable on Clang trunk Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13304	2022-12-01 12:39:44 -08:00
Rich Ercolani	fa7d572a8a	Handle and detect #13709 's unlock regression (#14161 ) In #13709, as in #11294 before it, it turns out that `63a26454` still had the same failure mode as when it was first landed as `d1d47691`, and fails to unlock certain datasets that formerly worked. Rather than reverting it again, let's add handling to just throw out the accounting metadata that failed to unlock when that happens, as well as a test with a pre-broken pool image to ensure that we never get bitten by this again. Fixes: #13709 Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2022-12-01 12:39:43 -08:00
Ameer Hamza	ca3a675c74	zed: Prevent special vdev to be replaced by hot spare Special vdevs should not be replaced by a hot spare. Log vdevs already support this, extending the functionality for special vdevs. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #14129	2022-11-07 13:36:57 -08:00
Attila Fülöp	cd1f023846	Deny receiving into encrypted datasets if the keys are not loaded (#14139 ) Commit `68ddc06b61` introduced support for receiving unencrypted datasets as children of encrypted ones but unfortunately got the logic upside down. This resulted in failing to deny receives of incremental sends into encrypted datasets without their keys loaded. If receiving a filesystem, the receive was done into a newly created unencrypted child dataset of the target. In case of volumes the receive made the target volume undeletable since a dataset was created below it, which we obviously can't handle. Incremental streams with embedded blocks are affected as well. We fix the broken logic to properly deny receives in such cases. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #13598 Closes #14055 Closes #14119	2022-11-04 11:07:29 -07:00
Akash B	7ac732b8d6	Add options to zfs redundant_metadata property Currently, additional/extra copies are created for metadata in addition to the redundancy provided by the pool(mirror/raidz/draid), due to this 2 times more space is utilized per inode and this decreases the total number of inodes that can be created in the filesystem. By setting redundant_metadata to none, no additional copies of metadata are created, hence can reduce the space consumed by the additional metadata copies and increase the total number of inodes that can be created in the filesystem. Additionally, this can improve file create performance due to the reduced amount of metadata which needs to be written. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #13680	2022-11-01 12:25:58 -07:00
Ameer Hamza	035e52f591	Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive For encrypted raw receive, objset creation is delayed until a call to dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be populated when calling zpl_earlier_version(). To correctly handle the ZFS_PROP_SHARESMB property for encrypted raw receive, this change delays setting the property. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #13878	2022-09-21 13:15:26 -07:00
Alexander Motin	44cec45f72	Improve too large physical ashift handling When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misaligned writes may not be as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. May be one day allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or explicitly told to, but not as an "optimization". There are some read-intensive NVMe SSDs that report Preferred Write Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #13798	2022-09-21 13:15:15 -07:00
Akash B	03fa3ef264	Add physical device size to SIZE column in 'zpool list -v' Add physical device size/capacity only for physical devices in 'zpool list -v' instead of displaying "-" in the SIZE column. This would make it easier to see the individual device capacity and to determine which spares are large enough to replace which devices. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #12561 Closes #13106	2022-09-15 10:23:01 -07:00
Tony Hutter	b1be0a5c15	ZTS: Fix zpool_expand_001_pos `zpool_expand_001_pos` was often failing due to not seeing autoexpand commands in the `zpool history`. During testing, I found this to be unreliable (sometimes the "online" wouldn't appear in `zpool history`) and unnecessary, as we could simply check that the pool increased in size. This commit revamps the test to check for the expanded pool size and corresponding new free space. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #13743	2022-09-13 17:58:03 -07:00
Paul Zuchowski	db5fd16f0b	Fix problem with zdb_objset_id test. Use large numbers for datasets with numeric names to avoid name and id collisions. Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>	2022-08-09 11:46:12 -07:00
Paul Zuchowski	fcbddc7f7c	Fix problem with zdb -d zdb -d <pool>/<objset ID> does not work when other command line arguments are included i.e. zdb -U <cachefile> -d <pool>/<objset ID> This change fixes the command line parsing to handle this situation. Also fix issue where zdb -r <dataset> <file> does not handle the root <dataset> of the pool. Introduce -N option to force <objset ID> to be interpreted as a numeric objsetID. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #12845 Closes #12944	2022-08-08 16:56:38 -07:00
Brian Behlendorf	98315be036	ZTS: Fix io_uring support check Not all Linux distribution kernels enable io_uring support by default. Update the run time check to verify that the booted kernel was built with CONFIG_IO_URING=y. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Co-authored-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13648 Closes #13685	2022-07-27 13:38:56 -07:00
Brian Behlendorf	3920d7f325	Scrub mirror children without BPs When scrubbing a raidz/draid pool, which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev. However, in the scenario where the DTL is wrong due to silent data corruption (say due to overwriting one child) and the scrub happens to read from a child with good data, then the other damaged mirror child will not be detected nor repaired. While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing. For system administrators this behavior is non-intuitive and in a worst case scenario could result in the only good copy of the data being unknowingly detached from the mirror. This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, then compare the data buffers from each child. They must all be identical, if not there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong a checksum error is logged against the replacing or sparing mirror vdev. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13555	2022-07-14 10:21:29 -07:00
Rich Ercolani	c220771a47	Corrected oversight in ZERO_RANGE behavior It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do have differing semantics in some ways - in particular, one requires KEEP_SIZE, and the other does not. Also added a zero-range test to catch this, corrected a flaw that made the punch-hole test succeed vacuously, and a typo in file_write. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13329 Closes #13338	2022-04-21 16:58:07 -07:00
наб	5a21214be8	zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Upstream-commit: `344bbc82e7` Closes #12829	2022-04-01 09:58:45 -07:00
Brian Behlendorf	145af480d3	Fix ENOSPC when unlinking multiple files from full pool When unlinking multiple files from a pool at 100% capacity, it was possible for ENOSPC to be returned after the first unlink. e.g. rm -f /mnt/fs/test1.0.0 /mnt/fs/test1.1.0 /mnt/fs/test1.2.0 rm: cannot remove '/mnt/fs/test1.1.0': No space left on device rm: cannot remove '/mnt/fs/test1.2.0': No space left on device After waiting for the pending deferred frees from the first unlink to be processed the remaining files can then be unlinked. This is caused by the quota limit in dsl_dir_tempreserve_impl() being temporarily decreased to the allocatable pool capacity less any deferred free space. This is resolved using the existing mechanism of returning ERESTART when over quota as long as we know enough space will shortly be available after processing the pending deferred frees. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13172	2022-03-08 11:46:03 -08:00
Brian Behlendorf	b3b6491ce9	ZTS: deadman_sync fix In the CI environment it's possible for events to be slightly delayed resulting in 4, instead of 5, events appearing in the log file. This isn't a problem and should be considered a success to avoid false positive test results. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12625	2022-03-07 15:17:49 -08:00
Brian Behlendorf	037434e4fc	ZTS: Fix import_devices_missing.ksh Related to commit `90b77a036`. Retry the `zpool export` if the pool is "busy" indicating there is a process accessing the mount point. This can happen after an import, allowing it to be retried will avoid spurious test failures. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13169	2022-03-02 11:27:05 -08:00
Brian Behlendorf	190516f0c5	ZTS: Retry in import_rewind_config_changed.ksh As explained by the disclaimer in the test case, "This test can fail since nothing guarantees that old MOS blocks aren't overwritten." This behavior is expected and correct, but results in a flaky test case which is problematic for the CI. The best we can do to resolve this is to retry the sub-test which failed when the MOS blocks have clearly been overwritten. When testing failures were rare enough that a single retry should normally be sufficient. However, we allow up to five for good measure. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13119	2022-03-02 11:25:35 -08:00
George Amanakis	bcddb18bae	Enable encrypted raw sending to pools with greater ashift Raw sending from pool1/encrypted with ashift=9 to pool2/encrypted with ashift=12 results in failure when mounting pool2/encrypted (Input/Output error). Notably, the opposite, raw sending from a greater ashift to a lower one does not fail. This happens because zio_compress_write() falsely checks only ZIO_FLAG_RAW_COMPRESS and not ZIO_FLAG_RAW_ENCRYPT which is also set in encrypted raw send streams. In this case it rounds up the psize and if not equal to the zio->io_size it modifies the block by zeroing out the extra bytes. Because this happens in a SA attr. registration object (type=46), the decryption fails upon mounting the filesystem, and zpool status falsely reports an error. Fix this by checking both ZIO_FLAG_RAW_COMPRESS and ZIO_FLAG_RAW_ENCRYPT before deciding whether to zero-pad a block. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #13067 Closes #13074	2022-02-23 16:47:37 -08:00
Brian Behlendorf	ccbe9efd6b	ZTS: Fix checkpoint_ro_rewind.ksh Related to commit `90b77a036`. Retry the `zpool export` if the pool is "busy" indicating there is a process accessing the mount point. This can happen after an import and allowing it to be retried will avoid spurious test failures. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13092	2022-02-16 17:58:56 -08:00
Brian Behlendorf	882bc4ad61	ZTS: Fix zpool_expand_001_pos The dRAID section of the zpool_expand_001_pos test would reliably fail because the calculated expansion size assumed the dRAID top-level vdev was created with a distributed spare. Create the vdev as expected to resolve the test failure. This test case flaw was accidentally caused by changing the default number of dRAID distributed spares from one to zero while dRAID was being developed. Additionally, remove zpool_expand_005_pos from the list of possible faulty tests. It appears to be passing consistently in my testing. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13091	2022-02-16 17:58:56 -08:00
Brian Behlendorf	f03cf651ec	ZTS: Fix zvol_misc_volmode test Changing volmode may need to remove minors, which could be open, so call udev_wait() before we "zfs set volmode=<value>". This ensures no udev process has the zvol open (i.e. blkid) and the kernel zvol_remove_minor_impl() function won't skip removing the in use device. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13075	2022-02-16 17:58:56 -08:00
Attila Fülöp	5c19af07d4	Receive checks should allow unencrypted child datasets dmu_recv_begin_check() unconditionally sets the DS_HOLD_FLAG_DECRYPT flag before calling dsl_dataset_hold_flags(). If the key on the receiving side isn't loaded or the send stream contains embedded blocks, the receive check fails for a stream which is perfectly valid and could be received without any problem. This seems like a remnant of the initial design, where unencrypted datasets below encrypted ones weren't allowed. Add a condition to set `DS_HOLD_FLAG_DECRYPT` only for encrypted datasets, modify an existing test to detect this regression and add a test for raw replication streams. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Co-authored-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #13033 Closes #13076	2022-02-16 17:58:55 -08:00
Attila Fülöp	3b52ccd7d7	Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU state Linux 5.16 moved XSTATE_XSAVE and XSTATE_XRESTORE out of our reach, so add our own XSAVE{,OPT,S} code and use it for Linux 5.16. Please note that this differs from previous behavior in that it won't handle exceptions created by XSAVE and XRSTOR. This is sensible for three reasons. - Exceptions during XSAVE and XRSTOR can only occur if the feature is not supported or enabled or the memory operand isn't aligned on a 64 byte boundary. If this happens something else went terribly wrong, and it may be better to stop execution. - Previously we just printed a warning and didn't handle the fault, this is arguable for the above reason. - All other *SAVE instructions also don't handle exceptions, so this at least aligns behavior. Finally add a test to catch such a regression in the future. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #13042 Closes #13059	2022-02-16 17:58:55 -08:00
наб	745a7f78da	Remove basename(1). Clean up/shorten some coreutils pipelines Basenames that remain, in cmd/zed/zed.d/statechange-led.sh: dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')") vdev=$(basename "$ZEVENT_VDEV_PATH") I don't wanna interfere with #11988 scripts/zfs-tests.sh: SINGLETESTFILE=$(basename "$SINGLETEST") tests/zfs-tests/tests/functional/cli_user/zfs_list/zfs_list.kshlib: ACTUAL=$(basename $dataset) ACTUAL=$(basename $dataset) tests/zfs-tests/tests/functional/cli_user/zpool_iostat/ zpool_iostat_-c_homedir.ksh: typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL") tests/zfs-tests/tests/functional/cli_user/zpool_iostat/ zpool_iostat_-c_searchpath.ksh: typeset CMD_1=$(basename "$SCRIPT_1") typeset CMD_2=$(basename "$SCRIPT_2") tests/zfs-tests/tests/functional/cli_user/zpool_status/ zpool_status_-c_homedir.ksh: typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL") tests/zfs-tests/tests/functional/cli_user/zpool_status/ zpool_status_-c_searchpath.ksh typeset CMD_1=$(basename "$SCRIPT_1") typeset CMD_2=$(basename "$SCRIPT_2") tests/zfs-tests/tests/functional/migration/migration.cfg: export BNAME=`basename $TESTFILE` tests/zfs-tests/tests/perf/perf.shlib: typeset logbase="$(get_perf_output_dir)/$(basename \ tests/zfs-tests/tests/perf/perf.shlib: typeset logbase="$(get_perf_output_dir)/$(basename \ These are potentially Of Directories, where basename is actually useful Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12652	2022-02-16 17:58:55 -08:00
Brian Behlendorf	cd0e238049	ZTS: Update enospc_002_pos test case The on-disk cost of creating a snapshot or bookmark is sufficiently low that it is difficult to make it reliably fail even when the pool is "full". In order to avoid false positives remove these two checks from the test case. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13060	2022-02-16 17:58:55 -08:00
Pawel Jakub Dawidek	3e27b589cf	Fix clearing set-uid and set-gid bits on a file when replying a write POSIX requires the set-uid and set-gid bits to be removed when an unprivileged user writes to a file and ZFS does that during normal operation. The problem arises when the write is stored in the ZIL and replayed. During replay we have no access to the original credentials of the process doing the write, so zfs_write() will be performed with the root credentials. When root is doing the write, the set-uid and set-gid bits are not removed from the file. To correct that, log a separate TX_SETATTR entry that removes those bits on the first write to such a file. Idea from: Christian Schwarz Add test for ZIL replay of setuid/setgid clearing. Improve various edge cases when clearing setid bits: - The setid bits can be re-added during a single write, so make sure to check for them on every chunk write. - Log TX_SETATTR record at most once per transaction group (if the setid bits keep coming back). - Move zfs_log_setattr() outside of zp->z_acl_lock. Reviewed-by: Dan McDonald <danmcd@joyent.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Christian Schwarz <me@cschwarz.com> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #13027	2022-02-16 17:58:55 -08:00
Akash B	9221ff1888	Add enumerated vdev names to 'zpool iostat -v' and 'zpool list -v' This commit adds enumerated names to disambiguate between the different vdevs. Previously only 'zpool status' showed enumerated vdev names; now 'zpool list -v' and 'zpool iostat -v' also show them. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #12510 Closes #13031	2022-02-16 17:58:55 -08:00

932 Commits