Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Alexander Motin	44cec45f72	Improve too large physical ashift handling When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misanigned writes may be not as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. May be one day allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or explicitly told to, but not as an "optimization". There are some read-intensive NVMe SSDs that report Preferred Write Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #13798	2022-09-21 13:15:15 -07:00
Akash B	03fa3ef264	Add physical device size to SIZE column in 'zpool list -v' Add physical device size/capacity only for physical devices in 'zpool list -v' instead of displaying "-" in the SIZE column. This would make it easier to see the individual device capacity and to determine which spares are large enough to replace which devices. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #12561 Closes #13106	2022-09-15 10:23:01 -07:00
Tony Hutter	b1be0a5c15	ZTS: Fix zpool_expand_001_pos `zpool_expand_001_pos` was often failing due to not seeing autoexpand commands in the `zpool history`. During testing, I found this to be unreliable (sometimes the "online" wouldn't appear in `zpool history`) and unnecessary, as we could simply check that the pool increased in size. This commit revamps the test to check for the expanded pool size and corresponding new free space. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #13743	2022-09-13 17:58:03 -07:00
Paul Zuchowski	db5fd16f0b	Fix problem with zdb_objset_id test. Use large numbers for datasets with numeric names to avoid name and id collisions. Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>	2022-08-09 11:46:12 -07:00
Paul Zuchowski	fcbddc7f7c	Fix problem with zdb -d zdb -d <pool>/<objset ID> does not work when other command line arguments are included i.e. zdb -U <cachefile> -d <pool>/<objset ID> This change fixes the command line parsing to handle this situation. Also fix issue where zdb -r <dataset> <file> does not handle the root <dataset> of the pool. Introduce -N option to force <objset ID> to be interpreted as a numeric objsetID. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #12845 Closes #12944	2022-08-08 16:56:38 -07:00
Brian Behlendorf	98315be036	ZTS: Fix io_uring support check Not all Linux distribution kernels enable io_uring support by default. Update the run time check to verify that the booted kernel was built with CONFIG_IO_URING=y. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Co-authored-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13648 Closes #13685	2022-07-27 13:38:56 -07:00
Brian Behlendorf	3920d7f325	Scrub mirror children without BPs When scrubbing a raidz/draid pool, which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev. However, in the scenario where the DTL is wrong due to silent data corruption (say due to overwriting one child) and the scrub happens to read from a child with good data, then the other damaged mirror child will not be detected nor repaired. While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing. For system administrations this behavior is non-intuitive and in a worst case scenario could result in the only good copy of the data being unknowingly detached from the mirror. This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, then compare the data buffers from each child. They must all be identical, if not there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong a checksum error is logged against the replacing or sparing mirror vdev. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13555	2022-07-14 10:21:29 -07:00
Rich Ercolani	c220771a47	Corrected oversight in ZERO_RANGE behavior It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do have differing semantics in some ways - in particular, one requires KEEP_SIZE, and the other does not. Also added a zero-range test to catch this, corrected a flaw that made the punch-hole test succeed vacuously, and a typo in file_write. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13329 Closes #13338	2022-04-21 16:58:07 -07:00
наб	5a21214be8	zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Upstream-commit: `344bbc82e7` Closes #12829	2022-04-01 09:58:45 -07:00
Brian Behlendorf	145af480d3	Fix ENOSPC when unlinking multiple files from full pool When unlinking multiple files from a pool at 100% capacity, it was possible for ENOSPC to be returned after the first unlink. e.g. rm -f /mnt/fs/test1.0.0 /mnt/fs/test1.1.0 /mnt/fs/test1.2.0 rm: cannot remove '/mnt/fs/test1.1.0': No space left on device rm: cannot remove '/mnt/fs/test1.2.0': No space left on device After waiting for the pending deferred frees from the first unlink to be processed the remaining files can then be unlinked. This is caused by the quota limit in dsl_dir_tempreserve_impl() being temporarily decreased to the allocatable pool capacity less any deferred free space. This is resolved using the existing mechanism of returning ERESTART when over quota as long as we know enough space will shortly be available after processing the pending deferred frees. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13172	2022-03-08 11:46:03 -08:00
Brian Behlendorf	b3b6491ce9	ZTS: deadman_sync fix In the CI environment it's possible for events to be slightly delayed resulting in 4, instead of 5, events appearing in the log file. This isn't a problem and should be considered a success to avoid false positive test results. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12625	2022-03-07 15:17:49 -08:00
Brian Behlendorf	037434e4fc	ZTS: Fix import_devices_missing.ksh Related to commit `90b77a036`. Retry the `zpool export` if the pool is "busy" indicating there is a process accessing the mount point. This can happen after an import, allowing it to be retried will avoid spurious test failures. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13169	2022-03-02 11:27:05 -08:00
Brian Behlendorf	190516f0c5	ZTS: Retry in import_rewind_config_changed.ksh As explained by the disclaimer in the test case, "This test can fail since nothing guarantees that old MOS blocks aren't overwritten." This behavior is expected and correct, but results in a flaky test case which is problematic for the CI. The best we can do to resolve this is to retry the sub-test which failed when the MOS blocks have clearly been overwritten. When testing failures were rare enough that a single retry should normally be sufficient. However, we allow up to five for good measure. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13119	2022-03-02 11:25:35 -08:00
Brian Behlendorf	e2fddf07bd	ZTS: Modify receive-o-x_props_override.ksh exception As previously noted in #12272 the receive-o-x_props_override.ksh test reliably fails on FreeBSD. Since we don't expect this test to pass move the exception from the "maybe" to "known" section. This way we don't retry the FAILED test when it is not expected to pass. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13167	2022-03-01 13:16:43 -08:00
Brian Behlendorf	4cb88d7fdc	ZTS: Move largest_pool_001_pos.ksh to Linux runfile On FreeBSD pools are not allowed to be created using vdevs which are backed by ZFS volumes. This configuration is not recommended for any supported platform, nevertheless the largest_pool_001_pos.ksh test case makes use of it as a convenience. This causes the test case to fail reliably on FreeBSD. The layout is still tolerated on Linux so only perform this test on Linux. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13166	2022-03-01 13:16:43 -08:00
George Amanakis	bcddb18bae	Enable encrypted raw sending to pools with greater ashift Raw sending from pool1/encrypted with ashift=9 to pool2/encrypted with ashift=12 results to failure when mounting pool2/encrypted (Input/Output error). Notably, the opposite, raw sending from a greater ashift to a lower one does not fail. This happens because zio_compress_write() falsely checks only ZIO_FLAG_RAW_COMPRESS and not ZIO_FLAG_RAW_ENCRYPT which is also set in encrypted raw send streams. In this case it rounds up the psize and if not equal to the zio->io_size it modifies the block by zeroing out the extra bytes. Because this happens in a SA attr. registration object (type=46), the decryption fails upon mounting the filesystem, and zpool status falsely reports an error. Fix this by checking both ZIO_FLAG_RAW_COMPRESS and ZIO_FLAG_RAW_ENCRYPT before deciding whether to zero-pad a block. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #13067 Closes #13074	2022-02-23 16:47:37 -08:00
Brian Behlendorf	ccbe9efd6b	ZTS: Fix checkpoint_ro_rewind.ksh Related to commit `90b77a036`. Retry the `zpool export` if the pool is "busy" indicating there is a process accessing the mount point. This can happen after an import and allowing it to be retried will avoid spurious test failures. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13092	2022-02-16 17:58:56 -08:00
Brian Behlendorf	882bc4ad61	ZTS: Fix zpool_expand_001_pos The dRAID section of the zpool_expand_001_pos test would reliably fail because the calculated expansion size assumed the dRAID top-level vdev was created with a distributed spare. Create the vdev as expected to resolve the test failure. This test case flaw was accidentally caused by changing the default number of dRAID distributed spares from one to zero while dRAID was being developed. Additionally, remove zpool_expand_005_pos from the list of possible faulty tests. It appears to be passing consistently in my testing. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13091	2022-02-16 17:58:56 -08:00
Brian Behlendorf	f03cf651ec	ZTS: Fix zvol_misc_volmode test Changing volmode may need to remove minors, which could be open, so call udev_wait() before we "zfs set volmode=<value>". This ensures no udev process has the zvol open (i.e. blkid) and the kernel zvol_remove_minor_impl() function won't skip removing the in use device. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13075	2022-02-16 17:58:56 -08:00
Attila Fülöp	5c19af07d4	Receive checks should allow unencrypted child datasets dmu_recv_begin_check() unconditionally sets the DS_HOLD_FLAG_DECRYPT flag before calling dsl_dataset_hold_flags(). If the key on the receiving side isn't loaded or the send stream contains embedded blocks, the receive check fails for a stream which is perfectly valid and could be received without any problem. This seems like a remnant of the initial design, where unencrypted datasets below encrypted ones weren't allowed. Add a condition to set `DS_HOLD_FLAG_DECRYPT` only for encrypted datasets, modify an existing test to detect this regression and add a test for raw replication streams. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Co-authored-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #13033 Closes #13076	2022-02-16 17:58:55 -08:00
Attila Fülöp	3b52ccd7d7	Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU state Linux 5.16 moved XSTATE_XSAVE and XSTATE_XRESTORE out of our reach, so add our own XSAVE{,OPT,S} code and use it for Linux 5.16. Please note that this differs from previous behavior in that it won't handle exceptions created by XSAVE an XRSTOR. This is sensible for three reasons. - Exceptions during XSAVE and XRSTOR can only occur if the feature is not supported or enabled or the memory operand isn't aligned on a 64 byte boundary. If this happens something else went terribly wrong, and it may be better to stop execution. - Previously we just printed a warning and didn't handle the fault, this is arguable for the above reason. - All other *SAVE instruction also don't handle exceptions, so this at least aligns behavior. Finally add a test to catch such a regression in the future. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #13042 Closes #13059	2022-02-16 17:58:55 -08:00
наб	745a7f78da	Remove basename(1). Clean up/shorten some coreutils pipelines Basenames that remain, in cmd/zed/zed.d/statechange-led.sh: dev=$(basename "$(echo "$therest" \| awk '{print $(NF-1)}')") vdev=$(basename "$ZEVENT_VDEV_PATH") I don't wanna interfere with #11988 scripts/zfs-tests.sh: SINGLETESTFILE=$(basename "$SINGLETEST") tests/zfs-tests/tests/functional/cli_user/zfs_list/zfs_list.kshlib: ACTUAL=$(basename $dataset) ACTUAL=$(basename $dataset) tests/zfs-tests/tests/functional/cli_user/zpool_iostat/ zpool_iostat_-c_homedir.ksh: typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL") tests/zfs-tests/tests/functional/cli_user/zpool_iostat/ zpool_iostat_-c_searchpath.ksh: typeset CMD_1=$(basename "$SCRIPT_1") typeset CMD_2=$(basename "$SCRIPT_2") tests/zfs-tests/tests/functional/cli_user/zpool_status/ zpool_status_-c_homedir.ksh: typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL") tests/zfs-tests/tests/functional/cli_user/zpool_status/ zpool_status_-c_searchpath.ksh typeset CMD_1=$(basename "$SCRIPT_1") typeset CMD_2=$(basename "$SCRIPT_2") tests/zfs-tests/tests/functional/migration/migration.cfg: export BNAME=`basename $TESTFILE` tests/zfs-tests/tests/perf/perf.shlib: typeset logbase="$(get_perf_output_dir)/$(basename \ tests/zfs-tests/tests/perf/perf.shlib: typeset logbase="$(get_perf_output_dir)/$(basename \ These are potentially Of Directories, where basename is actually useful Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12652	2022-02-16 17:58:55 -08:00
Brian Behlendorf	cd0e238049	ZTS: Update enospc_002_pos test case The on-disk cost of creating a snapshot or bookmark is sufficiently low that it is difficult to make it reliably fail even when the pool is "full". In order to avoid false positives remove these two checks from the test case. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13060	2022-02-16 17:58:55 -08:00
Pawel Jakub Dawidek	3e27b589cf	Fix clearing set-uid and set-gid bits on a file when replying a write POSIX requires that set-uid and set-gid bits to be removed when an unprivileged user writes to a file and ZFS does that during normal operation. The problem arrises when the write is stored in the ZIL and replayed. During replay we have no access to original credentials of the process doing the write, so zfs_write() will be performed with the root credentials. When root is doing the write set-uid and set-gid bits are not removed from the file. To correct that, log a separate TX_SETATTR entry that removed those bits on first write to such file. Idea from: Christian Schwarz Add test for ZIL replay of setuid/setgid clearing. Improve various edge cases when clearing setid bits: - The setid bits can be readded during a single write, so make sure to check for them on every chunk write. - Log TX_SETATTR record at most once per transaction group (if the setid bits are keep coming back). - Move zfs_log_setattr() outside of zp->z_acl_lock. Reviewed-by: Dan McDonald <danmcd@joyent.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Christian Schwarz <me@cschwarz.com> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #13027	2022-02-16 17:58:55 -08:00
Akash B	9221ff1888	Add enumerated vdev names to 'zpool iostat -v' and 'zpool list -v' This commit adds enumerated names to disambiguate between the different vdevs. Previously only 'zpool status' showed enumerated vdev names, now 'zpool list -v' and 'zpool iostat -v' also shows the enumerated vdev names. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #12510 Closes #13031	2022-02-16 17:58:55 -08:00
George Amanakis	72a82f312f	Report dnodes with faulty bonuslen In files created/modified before `4254acb` there may be a corruption of xattrs which is not reported during scrub and normal send/receive. It manifests only as an error when raw sending/receiving. This happens because currently only the raw receive path checks for discrepancies between the dnode bonus length and the spill pointer flag. In case we encounter a dnode whose bonus length is greater than the predicted one, we should report an error. Modify in this regard dnode_sync() with an assertion at the end, dump_dnode() to error out, dsl_scan_recurse() to report errors during a scrub, and zstream to report a warning when dumping. Also added a test to verify spill blocks are sent correctly in a raw send. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #12720 Closes #13014	2022-02-16 17:58:55 -08:00
Brian Behlendorf	8285e1b09d	ZTS: Fix enospc_002_pos.ksh again This is a follow up commit for `e03a41a60` which aimed to resolve this same test failure. The core "problem" here is that it takes very little space to perform a clone/snapshot/bookmark, which means if we want these commands to reliably fail the pool must truely have exhausted all free space. This commit increases the number of fill iterations to try and consume every block which we can. This still can't guarantee the clone/snapshot/bookmark will fail, but it significantly improves the odds. The exception was kept since it's still not a sure thing. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12903	2022-02-16 17:58:55 -08:00
Brian Behlendorf	c454e46336	ZTS: Fix rollback_003_pos.ksh Under Linux when rolling back a mounted filesystem negative dentries may not be dropped from the cache. This can result in an ENOENT being incorrectly returned on first access. Issuing a `df` before the unmount results in the negative dentries being invalidated and side steps the issue. This is solely a workaround for the test case on Linux and not correct behavior. The core issue of invalidating negative dentries needs to be handled with a kernel side change. This is being tracked as issue #6143. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12898 Issue #6143	2022-02-16 17:58:55 -08:00
Brian Behlendorf	306cccca27	Update zts-report.py with additional tests The following test cases may still occasionally fail and are being added to the "maybe" list for Linux until they can be updated to be entirely reliable. cli_root/zfs_rename/zfs_rename_002_pos.ksh cli_root/zpool_reopen/zpool_reopen_003_pos.ksh refreserv/refreserv_raidz These 6 tests consistently fail only on Fedora 31+, the failures are related to the kernel rescanning the partition table on loopback devices which is no longer reliable unless partprobe is used. In order to enable the Fedora bot by default they are also being added to the list until the tests can be updated. Any significant regression in functionality covered by these tests will still be detected by the FreeBSD builders. alloc_class/alloc_class_009_pos alloc_class/alloc_class_010_pos cli_root/zpool_expand/zpool_expand_001_pos cli_root/zpool_expand/zpool_expand_005_pos rsend/rsend_007_pos rsend/rsend_010_pos rsend/rsend_011_pos snapshot/rollback_003_pos Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10489	2022-02-16 17:58:55 -08:00
Rich Ercolani	4730c3f249	Exclude zvol_misc_volmode for now It keeps failing, on changes which aren't related at all. So until someone runs down why, I'd like it to stop being the sole reason for CI failures. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12733	2022-02-16 17:58:55 -08:00
Brian Behlendorf	4fea6a6737	ZTS: Add known exceptions Add the following test failures to the exception list for FreeBSD to ensure we notice new unexpected failures. pool_checkpoint/checkpoint_big_rewind pool_checkpoint/checkpoint_indirect And the following for Linux. zvol/zvol_misc/zvol_misc_snapdev Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12621 Issue #12622 Issue #12623 Closes #12624	2022-02-16 17:58:55 -08:00
Ryan Moeller	fc3230a781	ZTS: Minimize udev_wait in zvol_misc tests The zvol_misc tests, in particular zvol_misc_volmode, make use of a common udev_wait function to wait for zvol devices in /dev to quiesce on Linux. On other platforms this function currently only sleeps for one second before returning. This is insufficient, and zvol_misc_volmode has been flaky on FreeBSD as a result. Replace udev_wait with block_device_wait, passing through the optional device parameter where possible. Rearrange a few checks to strengthen the verifications we are making and avoid unnecessarily sleeping. We must keep udev_wait in a couple places to pass in Github CI workflows. Remove zvol_misc_volmode from the maybe failing tests on FreeBSD in zts-report.py. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12583	2022-02-16 17:58:55 -08:00
Ka Ho Ng	ed064ed596	ZTS: Enable punch-hole tests on FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ka Ho Ng <khng@FreeBSD.org> Sponsored-by: The FreeBSD Foundation Closes #12458	2022-02-16 17:58:55 -08:00
Brian Behlendorf	74bba85423	ZTS: Fix refreserv_raidz.ksh The rerefreserv_raidz test was failing on Linux because the sync being issued doesn't guarantee a pool sync. Switch to using the sync_pool function and remove the ZTS exception for Linux. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12897	2022-02-16 17:58:55 -08:00
Georgy Yakovlev	f22ebf8fa6	zfs-test/mmap_seek: fix build on musl The build on musl needs linux/fs.h for SEEK_DATA and friends, and sys/sysmacros.h for P2ROUNDUP. Add the needed headers. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org> Closes #12891	2022-02-16 17:58:55 -08:00
Brian Behlendorf	1fb5566a25	ZTS: speed up rsend tests With some minor tweaks several of rsend tests can be sped up considerably without significantly reducing test coverage. * send-c_verify_ratio: ~120s -> ~60s * send_realloc__files: ~330s -> ~65s For the send_realloc tests this also has the advantage of removing (most of) the linux/freebsd conditional logic. Note that for this test more passes, and thus more incremental send/recvs, are preferable to a larger number of files. Total run time of the rsend test group was reduced from roughly 20 to 11 minutes in an environment similar to what's used by the CI. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12876	2022-02-16 17:58:55 -08:00
Brian Behlendorf	be01ee8629	ZTS: rsend_007_pos failures The rsend_007_pos test reliably fails on Linux in the cleanup function. This is caused by an unmount error when attempting to recursively destroy the newly received datasets. Invoking `df` prior to the `zfs destroy` interestingly avoids the unmont error. Why this should matter is unclear and should be investigated. However, this minor tweak may allow us to remove the ZTS rsend exceptions. The subsequent rsend_010_pos and rsend_011_pos failures were a result of this initial failure. The other "maybe" failures I was unable to reproduce and have not been recently observed in the master branch. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #5665 Closes #6086 Closes #6087 Closes #6446 Closes #12876	2022-02-16 17:58:55 -08:00
наб	9cbc2ed20f	libzfs: add keylocation=https://, backed by fetch(3) or libcurl Add support for http and https to the keylocation properly to allow encryption keys to be fetched from the specified URL. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #9543 Closes #9947 Closes #11956	2022-02-16 17:58:37 -08:00
наб	9b185de6fa	ZTS: cli_root/zfs_load-key: add separate key files Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue: #11956 Closes #11976	2022-02-15 16:20:12 -08:00
Brian Behlendorf	fe8b0a33d4	ZTS: alloc_class.ksh must wait for the process to exit The alloc_class_* tests may fail on Linux with an EBUSY error if `zfs destroy` is run before the `dd` process has had a chance to terminate. Wait on the pid after the `kill -9` to make sure. When testing I didn't observe any failures for the alloc_class tests. Remove them from the exceptions list, the CI was used to verify the tests pass on all platforms. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12873	2022-02-10 11:05:07 -08:00
Rich Ercolani	d4794c8204	ZTS: Avoid piping send directly to /dev/null Unfortunately, #11445 means while we fail gracefully now, we still fail, unless people want to implement a complex workaround just to support /dev/null. So let's just use the cheap workaround in a test for now. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12872	2022-02-10 11:04:57 -08:00
Tony Hutter	29e05d5345	ZTS: Fix zpool_reopen_[1-5] on Fedora 35 The zpool_reopen_[1-5] tests are failing Fedora 35 with: zpool_reopen_001_pos.ksh[64]: log_must[67]: log_pos[270]: wait_for_resilver_end[98]: wait_for_action: line 71: func: is read only Renaming 'func' -> 'funct' fixes the issue. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #12871	2022-02-10 11:04:46 -08:00
George Amanakis	e257bd481b	Introduce a flag to skip comparing the local mac when raw sending Raw receiving a snapshot back to the originating dataset is currently impossible because of user accounting being present in the originating dataset. One solution would be resetting user accounting when raw receiving on the receiving dataset. However, to recalculate it we would have to dirty all dnodes, which may not be preferable on big datasets. Instead, we rely on the os_phys flag OBJSET_FLAG_USERACCOUNTING_COMPLETE to indicate that user accounting is incomplete when raw receiving. Thus, on the next mount of the receiving dataset the local mac protecting user accounting is zeroed out. The flag is then cleared when user accounting of the raw received snapshot is calculated. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #12981 Closes #10523 Closes #11221 Closes #11294 Closes #12594 Issue #11300	2022-02-04 16:14:56 -08:00
Brian Behlendorf	6575defc52	Verify dRAID empty sectors Verify that all empty sectors are zero filled before using them to calculate parity. Failure to do so can result in incorrect parity columns being generated and written to disk if the contents of an empty sector are non-zero. This was possible because the checksum only protects the data portions of the buffer, not the empty sector padding. This issue has been addressed by updating raidz_parity_verify() to check that all dRAID empty sectors are zero filled. Any sectors which are non-zero will be fixed, repair IO issued, and a checksum error logged. They can then be safely used to verify the parity. This specific type of damage is unlikely to occur since it requires a disk to have silently returned bad data, for an empty sector, while performing a scrub. However, if a pool were to have been damaged in this way, scrubbing the pool with this change applied will repair both the empty sector and parity columns as long as the data checksum is valid. Checksum errors will be reported in the `zpool status` output for any repairs which are made. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12857	2022-02-03 15:28:01 -08:00
Brian Behlendorf	6ed7d77b44	ZTS: import_rewind_device_replaced reliably fails The import_rewind_device_replaced.ksh test was never entirely reliable because it depends on MOS data not being overwritten. The MOS data is not protected by the snapshot so occasional failures were always expected. However, this test is now failing reliably on all platforms indicating something has changed in the code since the test was marked "maybe". Convert the test to a "known" failure until the root cause is identified and resolved. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12821	2021-12-08 13:28:09 -08:00
наб	ac9b1aa1bf	tests/file_check: remove unused variable Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12187	2021-12-06 13:52:34 -08:00
Brian Behlendorf	16da688f25	Linux 5.13 compat: retry zvol_open() when contended Due to a possible lock inversion the zvol open call path on Linux needs to be able to retry in the case where the spa_namespace_lock cannot be acquired. For Linux 5.12 an older kernel this was accomplished by returning -ERESTARTSYS from zvol_open() to request that blkdev_get() drop the bdev->bd_mutex lock, reaquire it, then call the open callback again. However, as of the 5.13 kernel this behavior was removed. Therefore, for 5.12 and older kernels we preserved the existing retry logic, but for 5.13 and newer kernels we retry internally in zvol_open(). This should always succeed except in the case where a pool's vdev are layed on zvols, in which case it may fail. To handle this case vdev_disk_open() has been updated to retry when opening a device when -ERESTARTSYS is returned. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12301 Closes #12759	2021-12-06 12:22:57 -08:00
John Wren Kennedy	e9ee57f682	Temporarily remove tests from sanity runfile With the addition of functionality to rerun failing tests, some tests that fail only sometimes still fail often enough to degrade the reliability of the sanity runs. Remove them from the runfile until they reliably pass. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: John Kennedy <john.kennedy@delphix.com> Closes #12814	2021-12-06 12:22:51 -08:00
Paul Dagnelie	d346361515	Add zfs-test facility to automatically rerun failing tests This was a project proposed as part of the Quality theme for the hackthon for the 2021 OpenZFS Developer Summit. The idea is to improve the usability of the automated tests that get run when a PR is created by having failing tests automatically rerun in order to make flaky tests less impactful. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #12740	2021-12-06 12:22:43 -08:00
Brian Behlendorf	ea0dda5999	Exclude zfs_copies_003_pos on Linux This test case may fail on 5.13 and newer Linux kernels if the /dev/zvol/ device is not created by udev. Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12301 Closes #12738	2021-11-12 16:20:15 -08:00

1 2 3 4 5 ...

1003 Commits