Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Rob Norris	ce782d0804	Linux 6.8 compat: update for new bdev access functions blkdev_get_by_path() and blkdev_put() have been replaced by bdev_open_by_path() and bdev_release(), which return a "handle" object with the bdev object itself inside. This adds detection for the new functions, and macros to handle the old and new forms consistently. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15805	2024-01-29 14:53:29 -08:00
Rob Norris	64afc4e66e	Linux 6.8 compat: make test functions static The kernel is now being compiled with -Wmissing-prototypes. Most of our test stub functions had no prototype, and failed to compile. Since they don't need to be visible anywhere else, just make them all static. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15805	2024-01-29 14:53:29 -08:00
Brian Behlendorf	621dfaff5c	Linux 6.7 compat: META Update the META file to reflect compatibility with the 6.7 kernel. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15833	2024-01-29 14:53:29 -08:00
Paul Dagnelie	ab653603f8	Don't assert mg_initialized due to device addition race During device removal stress tests, we noticed that we were tripping the assertion that mg_initialized was true. After investigation, it was determined that the mg in question was the embedded log metaslab group for a newly added vdev; the normal mg had been initialized (by metaslab_sync_reassess, via vdev_sync_done). However, because the spa config alloc lock is not held as writer across both calls to metaslab_sync_reassess, it is possible for an allocation to happen between the two metaslab_groups being initialized. Because the metaslab code doesn't check the group in question, just the vdev's main mg, it is possible to get past the initial check in vdev_allocatable and later fail due to the assertion. We simply remove the assertions. We could also consider locking the ALLOC lock around the reassess calls in vdev_sync_done, but that risks deadlocks. We could check the actual target mg in vdev_allocatable, but that risks racing with a passivation that comes in after that check but before the assertion. We still won't be able to actually allocate from the metaslab group if no metaslabs are ready, so this change shouldn't break anything. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <george.wilson@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15818	2024-01-29 14:53:29 -08:00
Chris Davidson	acc7cd8e99	Update man pages to time(1) from time(2) zpool-iostat.8: Updated time(2) -> time(1) to align to manual page zpool-list.8: Updated time(2) -> time(1) to align to manual page zpool-status.8: Updated time(2) -> time(1) to align to manual page zpool-wait.8: Update time(2) -> time(1) to align to manual page Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christopher Davidson <christopher.davidson@gmail.com> Closes #15823	2024-01-29 14:53:29 -08:00
Brian Behlendorf	dd0874cf7e	ZTS: Allow longer run time for zdb_args_pos The zdb_args_pos test may take slightly longer than 600 seconds to run on some of the CI builders. To prevent this from causing failures allow up to 1200 seconds for tests in this group. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15826	2024-01-29 14:53:29 -08:00
Andrew Innes	7cd666d54b	Move nodes into correct subgraphs Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Andrew Innes <andrew.c12@gmail.com> Closes #15828	2024-01-29 14:53:29 -08:00
Rob N	0606ce2055	zpool wait: print timestamp before the header list, status and iostat all display the -T timestamp before the header, but wait showed it after. Make it be like the others. Reported-by: Kyle Evans <kevans@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15825	2024-01-29 14:53:29 -08:00
Ameer Hamza	dd3a0a2715	Update vdev devid and physpath if changed between imports If devid or physpath for a vdev changes between imports, ensure it is updated to the new value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15816	2024-01-29 14:53:29 -08:00
Tino Reichardt	9ad150446f	ZTS: Update deprecated Github Action version numbers GitHub Actions is transitioning from Node 16 to Node 20. So we need to update these: - actions/checkout@v3 -> v4 - actions/download-artifact@v3 -> v4 - actions/upload-artifact@v3 -> v4 and some minor changes Update also the documentation of the testings workflow. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Andrew Innes <andrew.c12@gmail.com> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #15820	2024-01-29 14:53:29 -08:00
Richard Yao	9da745f5de	Switch to CodeQL to detect prohibited function use The LLVM/Clang developers pointed out that using the CPP to detect use of functions that our QA policies prohibit risks invoking undefined behavior. To resolve this, we configure CodeQL to detect forbidden function usage. Note that cpp in the context of CodeQL refers to C/C++, rather than the C PreProcessor, which C++ also uses. It really should have been written cxx, but that ship sailed a long time ago. This misuse of the term cpp is retained in the CodeQL configuration for consistency with upstream CodeQL. As a side benefit, verbose make no longer is a wall of text showing a bunch of CPP macros, which can make debugging slightly easier. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #15819 Closes #14134	2024-01-29 14:53:29 -08:00
Tino Reichardt	cfa29b9945	ZTS: Apply small changes for speeding up the tests The Github Action Runner got some new hardware metrics. We should use the provided and empty disk which is pre-mounted at /mnt now. Disk1: 89GiB -> rootfs + bootfs with ~80MB/s -> don't care Disk2: 64GiB -> /mnt with 420MB/s -> new testing ssd This commit will mount the new disk to /var/tmp and provide hopefully some speedups within our testings. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Andrew Innes <andrew.c12@gmail.com> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #15811	2024-01-29 14:53:29 -08:00
Val Packett	09a7961364	FreeBSD: Fix bootstrapping tools under Linux/musl musl libc has deprecated LFS64 aliases, so bootstrapping FreeBSD tools under musl distros has been failing with stat64 errors. Apply the aliases under non-glibc Linux to fix this problem. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Val Packett <val@packett.cool> Closes #15780	2024-01-29 14:53:29 -08:00
Tino Reichardt	276be5357c	linux spl: fix typo in top comment of spl-condvar.c Credential Implementation -> Condition Variables Implementation Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #15782	2024-01-29 14:53:29 -08:00
Lalufu	424d06a298	Make sure all necessary RPM path macros are defined When building (s)rpm files through the Makefile, a directory structure is created in /tmp to hold the various files. In case the user running the command has overridden some of the RPM path settings through their user profile (for example in `~/.rpmmacros`), these paths do not line up with the configuration, and the build fails. Make sure all paths used are properly defined. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ralf Ertzinger <ralf@skytale.net> Closes #15756	2024-01-29 14:53:29 -08:00
youzhongyang	6b64acc157	Make spl_kmem_cache size check consistent On Linux x86_64, kmem cache can have size up to 4M, however increasing spl_kmem_cache_slab_limit can lead to crash due to the size check inconsistency. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes #15757	2024-01-29 14:53:29 -08:00
Ameer Hamza	a2e71db664	Add path handling for aux vdevs in `label_path` If the AUX vdev is added using UUID, importing the pool falls back to opening the AUX vdev by disk name instead of UUID, due to the absence of path information for AUX vdevs. Since the AUX label now has path information, this PR adds path handling for it in `label_path`. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15737	2024-01-29 14:53:29 -08:00
Ameer Hamza	eb4a36bcef	Extend aux label to add path information Pool import logic uses vdev paths, so it makes sense to add path information on AUX vdev as well. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15737	2024-01-29 14:53:29 -08:00
Ameer Hamza	52cee9a3eb	fix: Uber block label not always found for aux vdevs When spare or l2cache (aux) vdev is added during pool creation, spa->spa_uberblock is not dumped until that point. Subsequently, the aux label is never synchronized after its initial creation, resulting in the uberblock label remaining undumped. The uberblock is crucial for lib_blkid in identifying the ZFS partition type. To address this issue, we now ensure sync of the uberblock label once if it's not dumped initially. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15737	2024-01-29 14:53:29 -08:00
Brian Behlendorf	2006ac1f4a	Fix "out of memory" error Drop the no_memory() call from zpool_in_use() when reading the label fails and instead return the error to the caller. This prevents a misleading "internal error: out of memory" error when the label can't be read. This will result in is_spare() returning B_FALSE instead of aborting, which is already safely handled. Furthermore, on Linux it's possible for EREMOTEIO to be returned by an NVMe device if the device has been low-level formatted and not rescanned. In this case we want to fall back to the legacy scanning method and read any of the labels we can. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #13538 Closes #15747	2024-01-29 14:53:29 -08:00
Benjamin Sherman	509526ad21	fix: preserve linux kmod signature in zfs-kmod rpm spec This change provides rpm spec macros to sign the zfs and spl kmods as the final step after the %install scriptlet. This is needed since the find-debuginfo.sh script strips out debug symbols plus signatures. Kernel module signing only occurs when the required files are present as typically required in the Linux source tree: - certs/signing_key.pem - certs/signing_key.x509 The method for overriding the default __spec_install_post macro is inspired by (and largely copied from) the Fedora kernel.spec. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Benjamin Sherman <benjamin@holyarmy.org> Closes #15744	2024-01-29 14:53:29 -08:00
Stefan Lendl	4db88c37cc	fix(mount): do not truncate shares not zfs mount When running zfs share -a, resetting the exports.d/zfs.exports makes sense to get a clean state. Truncating was also called with zfs mount, which would not populate the file again. Add test to verify shares persist after mount -a. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Stefan Lendl <s.lendl@proxmox.com> Closes #15607 Closes #15660	2024-01-29 14:53:29 -08:00
Mark Johnston	8b1c6db3d2	Fix a potential use-after-free in zfs_setsecattr() In general, VOPs must not load the "z_log" field until having called zfs_enter_verify_zp(). Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #15752	2024-01-29 14:53:29 -08:00
Mark Johnston	22e4f08c30	Linux: Defer loading the object set in zfs_setattr() We need to wait until after having done a zfs_enter() to load some fields from the zfsvfs structure. Otherwise a use-after-free is possible in the face of a concurrent rollback. Other functions in this file are careful to avoid this bug, I believe this is the only instance. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #15752	2024-01-29 14:53:29 -08:00
Rich Ercolani	7bccf98a73	Make zdb -R scale less poorly zdb -R with :d tries to use gzip decompression 9 times per size. There's absolutely no reason for that, they're all the same decompressor. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #15726	2024-01-29 14:53:29 -08:00
Rich Ercolani	4d4972ed98	Stop wasting time on malloc in snprintf_zstd_header Profiling zdb -vvvvv on datasets with a lot of zstd blocks, we find ourselves spending quite a lot of time on malloc/free, because we allocate a 16M abd each call, and never free it, so we're leaking 16M per call as well. This seems sub-optimal. So let's just keep the buffer around and reuse it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #15721	2024-01-29 14:53:29 -08:00
Pawel Jakub Dawidek	3425484eb9	Fix file descriptor leak on pool import. Descriptor leak can be easily reproduced by doing: # zpool import tank # sysctl kern.openfiles # zpool export tank; zpool import tank # sysctl kern.openfiles We were leaking four file descriptors on every import. Similar leak most likely existed when using file-based VDEVs. External-issue: https://reviews.freebsd.org/D43529 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15630	2024-01-26 13:38:25 -08:00
Brian Behlendorf	9e0304c363	ZTS: Apply zfs_bclone_enabled to bclone tests If block cloning is disabled by default then enable it when running the bclone tests. Follow up to #15529. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15796	2024-01-22 16:15:03 -08:00
Tino Reichardt	c1161e2851	fix: variable type with zfs-tests/cmd/clonefile.c Compiling on arm64 freebsd-13.2 and arm64 almalinux-8 brings currently this error: ``` CC tests/zfs-tests/cmd/clonefile.o tests/zfs-tests/cmd/clonefile.c:166:43: error: result of comparison of \ constant -1 with expression of type 'char' is always true \ [-Werror,-Wtautological-constant-out-of-range-compare] while ((c = getopt(argc, argv, "crfdq")) != -1) { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~ 1 error generated. gmake[2]: *** [Makefile:8675: tests/zfs-tests/cmd/clonefile.o] Error 1 ``` Fix: use correct variable type `int`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #15783	2024-01-19 12:28:02 -08:00
Pawel Jakub Dawidek	ef527958c6	Fix cloning into mmaped and cached file. If the destination file is mmaped and the mmaped region was already read, so it is cached, we need to update mmaped pages after successful clone using update_pages(). Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Pointed out by: Ka Ho Ng <khng@freebsd.org> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15772	2024-01-19 12:28:02 -08:00
Umer Saleem	d2f7b2e557	ZTS: Test for clone, mmap and write for block cloning For block cloning, if we mmap the cloned file and write from the map into the file, it triggers a panic in dbuf_redirty() on Linux. The same scenario causes data corruption on FreeBSD. Both these issues are fixed under PR#15656 and PR#15665. It would be good to add a test for this scenario in ZTS. The test program and issue was produced by @robn. Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15717	2024-01-19 12:28:02 -08:00
Brian Behlendorf	83c0ccc7cf	Enable block_cloning tests on FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15749	2024-01-19 12:28:02 -08:00
Pawel Jakub Dawidek	c16d103422	Block cloning tests. The test mostly focus on testing various corner cases. The tests take a long time to run, so for the common.run runfile we randomly select a hundred tests. To run all the bclone tests, bclone.run runfile should be used. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15631	2024-01-19 12:28:02 -08:00
Umer Saleem	f94a77951d	Test LWB buffer overflow for block cloning PR#15634 removes 128K into 2x68K LWB split optimization, since it was found to cause LWB buffer overflow while trying to write 128KB TX_CLONE_RANGE record with 1022 block pointers into 68KB buffer, with multiple VDEVs ZIL. This commit adds a test for this particular scenario by writing maximum-sized TX_CLONE_RANGE record with 1022 block pointers into 68KB buffer, with two SLOG devices. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15672	2024-01-19 12:28:02 -08:00
Ameer Hamza	d8b0b6032b	ZTS: Add test cases for block cloning replay Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15614	2024-01-19 12:28:02 -08:00
Ameer Hamza	387f003be3	ZTS: block_cloning: Use numeric sort for get_same_blocks Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15614	2024-01-19 12:28:02 -08:00
Kevin Jin	07cf973fe9	Autotrim High Load Average Fix Switch from cv_wait() to cv_wait_idle() in vdev_autotrim_wait_kick(), which should mitigate the high load average while waiting. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: jxdking <lostking2008@hotmail.com> Closes #15781	2024-01-18 11:33:29 -08:00
Rob N	2ecc2dfe42	Linux 6.7 compat: zfs_setattr fix atime update In `db4fc559c` I messed up and changed this bit of code to set the inode atime to an uninitialised value, when actually it was just supposed to be loading the atime from the inode to be stored in the SA. This changes it to what it should have been. Ensure times change by the right amount Previously, we only checked if the times changed at all, which missed a bug where the atime was being set to an undefined value. Now ensure the times change by two seconds (or thereabouts), ensuring we catch cases where we set the time to something bonkers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15762 Closes #15773	2024-01-17 08:59:28 -08:00
Alexander Motin	fc0a9f0cda	Merge pull request #207 from truenas/truenas/zfs-2.2.3-staging-2 Sync with upstream zfs-2.2.3-staging for Dragonfish BETA.1	2024-01-17 11:07:23 -05:00
Pawel Jakub Dawidek	21acb5a27c	Fix cloning into mmaped and cached file. If the destination file is mmaped and the mmaped region was already read, so it is cached, we need to update mmaped pages after successful clone using update_pages(). Pointed out by: Ka Ho Ng <khng@freebsd.org> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>	2024-01-17 19:17:17 +05:00
Alexander Motin	b8b3729242	ZIL: Improve next log block size prediction Track history in context of bursts, not individual log blocks. It allows to not blow away all the history by single large burst of many blocks, and at the same time allows optimizations covering multiple blocks in a burst and even predicted following burst. For each burst account its optimal block size and minimal first block size. Use those statistics from the last 8 bursts to predict first block size of the next burst. Remove predefined set of block sizes. Allocate any size we see fit, multiple of 4KB, as required by ZIL now. With compression enabled by default, ZFS already writes pretty random block sizes, so this should not surprise space allocator any more. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15635	2024-01-17 19:12:20 +05:00
Alexander Motin	84aa25240f	ZIL: Detect single-threaded workloads ... by checking that previous block is fully written and flushed. It allows to skip commit delays since we can give up on aggregation in that case. This removes zil_min_commit_timeout parameter, since for single-threaded workloads it is not needed at all, while on very fast devices even some multi-threaded workloads may get detected as single-threaded and still bypass the wait. To give multi-threaded workloads more aggregation chances increase zfs_commit_timeout_pct from 5 to 10%, as they should suffer less from additional latency. Also single-threaded workloads detection allows in perspective better prediction of the next block size. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15381	2024-01-17 19:11:33 +05:00
Alexander Motin	d3bf3a912f	dmu: Allow buffer fills to fail When ZFS overwrites a whole block, it does not bother to read the old content from disk. It is a good optimization, but if the buffer fill fails due to page fault or something else, the buffer ends up corrupted, neither keeping old content, nor getting the new one. On FreeBSD this is additionally complicated by page faults being blocked by VFS layer, always returning EFAULT on attempt to write from mmap()'ed but not yet cached address range. Normally it is not a big problem, since after original failure VFS will retry the write after reading the required data. The problem becomes worse in specific case when somebody tries to write into a file its own mmap()'ed content from the same location. In that situation the only copy of the data is getting corrupted on the page fault and the following retries only fixate the status quo. Block cloning makes this issue easier to reproduce, since it does not read the old data, unlike traditional file copy, that may work by chance. This patch provides the fill status to dmu_buf_fill_done(), that in case of error can destroy the corrupted buffer as if no write happened. One more complication in case of block cloning is that if error is possible during fill, dmu_buf_will_fill() must read the data via fall-back to dmu_buf_will_dirty(). It is required to allow in case of error restoring the buffer to a state after the cloning, not before it, that would happen if we just call dbuf_undirty(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15665	2024-01-17 19:10:47 +05:00
Alexander Motin	3009e11d3e	ZIO: Optimize zio_flush() - Generalize vdev_nowritecache handling by traversing through the VDEV tree and skipping children ZIOs where not supported. - Remove intermediate zio_null() in case of several VDEV children. - Remove children handling from zio_ioctl(). There are no other use cases for this code beside DKIOCFLUSHWRITECACHED, and would there be, I doubt they would so straightforward apply to all VDEV children. Comparing to removed previous optimization this should improve cases of redundant ZILs/SLOGs. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <george.wilson@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15515	2024-01-17 19:08:26 +05:00
Rob N	95ae656137	Linux 6.7 compat: zfs_setattr fix atime update In `db4fc559c` I messed up and changed this bit of code to set the inode atime to an uninitialised value, when actually it was just supposed to be loading the atime from the inode to be stored in the SA. This changes it to what it should have been. Ensure times change by the right amount Previously, we only checked if the times changed at all, which missed a bug where the atime was being set to an undefined value. Now ensure the times change by two seconds (or thereabouts), ensuring we catch cases where we set the time to something bonkers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15762 Closes #15773	2024-01-17 19:06:39 +05:00
Ameer Hamza	628e26fc0e	Merge branch 'zfs-2.2.3-staging' into truenas/zfs-2.2-release Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>	2024-01-17 19:05:00 +05:00
Shengqi Chen	9ecd112dc1	compact: workaround for GPL-only symbols on riscv from Linux 6.2 Since Linux 6.2, the implementation of flush_dcache_page on riscv references GPL-only symbol `PageHuge`, breaking the build of zfs. This patch uses existing mechanism to override flush_dcache_page, removing the call to `PageHuge`. According to comments in kernel, it is only used to do some check against HugeTLB pages, which only exist in userspace. ZFS uses flush_dcache_page only on kernel pages, thus this patch will not introduce any behaviour change. See also: torvalds/linux@d33deda, openzfs/zfs@589f59b Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shengqi Chen <harry-chen@outlook.com> Closes #14974 Closes #15627	2024-01-16 13:27:29 -08:00
Mark Johnston	a00231a3fc	spa: Let spa_taskq_param_get()'s addition of a newline be optional For FreeBSD sysctls, we don't want the extra newline, since the sysctl(8) utility will format strings appropriately. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reported-by: Peter Holm <pho@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #15719	2024-01-16 11:32:19 -08:00
Mark Johnston	9181e94f0b	spa: Fix FreeBSD sysctl handlers sbuf_cpy() resets the sbuf state, which is wrong for sbufs allocated by sbuf_new_for_sysctl(). In particular, this code triggers an assertion failure in sbuf_clear(). Simplify by just using sysctl_handle_string() for both reading and setting the tunable. Fixes: `6930ecbb7` ("spa: make read/write queues configurable") Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reported-by: Peter Holm <pho@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #15719	2024-01-16 11:32:19 -08:00
Rob Norris	3bd23fd78d	freebsd: fix compile for spa_taskq_read/spa_taskq_write params Missed in #15695, backporting #15675. Signed-off-by: Rob Norris <robn@despairlabs.com>	2024-01-16 11:32:19 -08:00


9097 Commits

All Branches