Archive-Team/zfs

Commit Graph

Author	SHA1	Message	Date
Brian Behlendorf	9fe3da9364	Improve resilver ETAs When resilvering the estimated time remaining is calculated using the average issue rate over the current pass. Where the current pass starts when a scan was started, or restarted, if the pool was exported/imported. For dRAID pools in particular this can result in wildly optimistic estimates since the issue rate will be very high while scanning when non-degraded regions of the pool are scanned. Once repair I/O starts being issued performance drops to a realistic number but the estimated performance is still significantly skewed. To address this we redefine a pass such that it starts after a scanning phase completes so the issue rate is more reflective of recent performance. Additionally, the zfs_scan_report_txgs module option can be set to reset the pass statistics more often. Reviewed-by: Akash B <akash-b@hpe.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #14410	2023-04-24 12:55:07 -07:00
наб	18edf7a3ba	contrib: dracut: fix race with root=zfs:dset when necessities required This had always worked in my testing, but a user on hardware reported this to happen 100%, and I reproduced it once with cold VM host caches. dracut-zfs-generator runs as a systemd generator, i.e. at Some Relatively Early Time; if root= is a fixed dataset, it tries to "solve [necessities] statically at generation time". If by that point zfs-import.target hasn't popped (because the import is taking a non-negligible amount of time for whatever reason), it'll see no children for the root dataset, and as such generate no mounts. This has never had any right to work. No-one caught this earlier because it's just that much more convenient to have root=zfs:AUTO, which orders itself properly. To fix this, always run zfs-nonroot-necessities.service; this additionally simplifies the implementation by: * making BOOTFS from zfs-env-bootfs.service be the real, canonical, root dataset name, not just "whatever the first bootfs is", and only set it if we're ZFS-booting * zfs-{rollback,snapshot}-bootfs.service can use this instead of re-implementing it * having zfs-env-bootfs.service also set BOOTFSFLAGS * this means the sysroot.mount drop-in can be fixed text * zfs-nonroot-necessities.service can also be constant and always enabled, because it's conditioned on BOOTFS being set There is no longer any code generated at run-time (the sysroot.mount drop-in is an unavoidable gratuitous cp). The flow of BOOTFS{,FLAGS} from zfs-env-bootfs.service to sysroot.mount is not noted explicitly in dracut.zfs(7), because (a) at some point it's just visual noise and (b) it's already ordered via d-p-m.s from z-i.t. Backport-of: `3399a30ee0` Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>	2023-04-18 10:10:45 -07:00
Ameer Hamza	777c98ee52	Use setproctitle to report progress of zfs send This allows parsing of zfs send progress by checking the process title. Doing so requires some changes to the send code in libzfs_sendrecv.c; primarily these changes move some of the accounting around, to allow for the code to be verbose as normal, or set the process title. Unlike BSD, setproctitle() isn't standard in Linux; thus, borrowed it from libbsd with slight modifications. Authored-by: Sean Eric Fagan <sef@FreeBSD.org> Co-authored-by: Ryan Moeller <ryan@iXsystems.com> Co-authored-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #14376	2023-03-29 14:45:34 -07:00
Tino Reichardt	3da577280a	Add colored output to zfs list Use a bold header row and colorize the AVAIL column based on the used space percentage of volume. We define these colors: - when > 80%, use yellow - when > 90%, use red Reviewed-by: WHR <msl0000023508@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #14621 Closes #14350	2023-03-28 14:13:33 -07:00
Tino Reichardt	433b9a89c4	Colorize zpool iostat output Use a bold header and colorize the space suffixes in iostat by order of magnitude like this: - K is green - M is yellow - G is red - T is lightblue - P is magenta - E is cyan - 0 space is colored gray Reviewed-by: WHR <msl0000023508@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #14621 Closes #14459	2023-03-28 14:13:24 -07:00
Mateusz Piotrowski	576d34cb11	Turn default_bs and default_ibs into ZFS_MODULE_PARAMs The default_bs and default_ibs tunables control the default block size and indirect block size. So far, default_bs and default_ibs were tunable only on FreeBSD, e.g., sysctl vfs.zfs.default_ibs Remove the FreeBSD-specific sysctl code and expose default_bs and default_ibs as tunables on both Linux and FreeBSD using ZFS_MODULE_PARAM. One of the use cases for changing the values of those tunables is to lower the indirect block size, which may improve performance of large directories (as discussed during the OpenZFS Leadership Meeting on 2022-08-16). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com> Sponsored-by: Wasabi Technology, Inc. Closes #14293	2023-03-07 13:58:36 -08:00
Alexander Motin	fd0893cf1f	Introduce minimal ZIL block commit delay Despite all optimizations, tests on actual hardware show that FreeBSD kernel can't sleep for less than ~2us. Similar tests on Linux show ~50us delay at least from nanosleep() (haven't tested inside kernel). It means that on very fast log device ZIL may not be able to satisfy zfs_commit_timeout_pct block commit timeout, increasing log latency more than desired. Handle that by introduction of zil_min_commit_timeout parameter, specifying minimal timeout value where additional delays to aggregate writes may be skipped. Also skip delays if the LWB is more than 7/8 full, that often happens if I/O sizes are constant and match one of LWB sizes. Both things are applied only if there were no already outstanding log blocks, that may indicate single-threaded workload, that by definition can not benefit from the commit delays. While there, add short time moving average to zl_last_lwb_latency to make it more stable. Tests of single-threaded 4KB writes to NVDIMM SLOG on FreeBSD show IOPS increase by 9% instead of expected 5%. For zfs_commit_timeout_pct of 1 there IOPS increase by 5.5% instead of expected 1%. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14418	2023-03-02 14:37:07 -08:00
Ethan Coe-Renner	9ef565a185	Add color output to zfs diff. This adds support to color zfs diff (in the style of git diff) conditional on the ZFS_COLOR environment variable. Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>	2023-01-19 12:50:36 -08:00
Rich Ercolani	e84a2ed7a8	Add workaround for broken Linux pipes Linux has an unresolved hang if you resize a pipe with bytes in it. Since there's no obvious way to detect this happening, added a workaround to disable resizing the pipe buffer if you set an environment variable. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13309	2023-01-05 10:47:25 -08:00
Serapheim Dimitropoulos	85537f77a3	Expose zfs_vdev_open_timeout_ms as a tunable Some of our customers have been occasionally hitting zfs import failures in Linux because udevd doesn't create the by-id symbolic links in time for zpool import to use them. The main issue is that the systemd-udev-settle.service that zfs-import-cache.service and other services depend on is racy. There is also an openzfs issue filed (see https://github.com/openzfs/zfs/issues/10891) outlining the problem and potential solutions. With the proper solutions being significant in terms of complexity and the priority of the issue being low for the time being, this patch exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are experiencing this issue often can increase it as a workaround. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #14133	2022-12-01 12:39:43 -08:00
Mateusz Guzik	c8d6a91a99	Bring per_txg_dirty_frees_percent back to 30 The current value causes significant artificial slowdown during mass parallel file removal, which can be observed both on FreeBSD and Linux when running real workloads. Sample results from Linux doing make -j 96 clean after an allyesconfig modules build: before: 4.14s user 6.79s system 48% cpu 22.631 total after: 4.17s user 6.44s system 153% cpu 6.927 total FreeBSD results in the ticket. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13932 Closes #13938	2022-11-01 12:32:40 -07:00
Akash B	7ac732b8d6	Add options to zfs redundant_metadata property Currently, additional/extra copies are created for metadata in addition to the redundancy provided by the pool(mirror/raidz/draid), due to this 2 times more space is utilized per inode and this decreases the total number of inodes that can be created in the filesystem. By setting redundant_metadata to none, no additional copies of metadata are created, hence can reduce the space consumed by the additional metadata copies and increase the total number of inodes that can be created in the filesystem. Additionally, this can improve file create performance due to the reduced amount of metadata which needs to be written. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #13680	2022-11-01 12:25:58 -07:00
Alexander Motin	33223cbc3c	Refactor Log Size Limit Original Log Size Limit implementation blocked all writes in case of limit reached until the TXG is committed and the log is freed. It caused huge delays and following speed spikes in application writes. This implementation instead smoothly throttles writes, using exactly the same mechanism as used for dirty data. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: jxdking <lostking2008@hotmail.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Issue #12284 Closes #13476	2022-09-26 14:55:27 -07:00
Richard Yao	b66f8d3c2b	Add zfs_btree_verify_intensity kernel module parameter I see a few issues in the issue tracker that might be aided by being able to turn this on. We have no module parameter for it, so I would like to add one. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13874	2022-09-21 13:15:51 -07:00
Alexander Motin	44cec45f72	Improve too large physical ashift handling When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misaligned writes may be not as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. May be one day allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or explicitly told to, but not as an "optimization". There are some read-intensive NVMe SSDs that report Preferred Write Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #13798	2022-09-21 13:15:15 -07:00
Kevin Jin	d05f3039f7	Add Module Parameter Regarding Log Size Limit zfs_wrlog_data_max The upper limit of TX_WRITE log data. Once it is reached, write operation is blocked, until log data is cleared out after txg sync. It only counts TX_WRITE log with WR_COPIED or WR_NEED_COPY. Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: jxdking <lostking2008@hotmail.com> Closes #12284	2022-09-21 16:12:14 -07:00
George Amanakis	8bd3dca9bf	Introduce a tunable to exclude special class buffers from L2ARC Special allocation class or dedup vdevs may have roughly the same performance as L2ARC vdevs. Introduce a new tunable to exclude those buffers from being cacheable on L2ARC. Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #11761 Closes #12285	2022-09-14 11:27:00 -07:00
Paul Zuchowski	fcbddc7f7c	Fix problem with zdb -d zdb -d <pool>/<objset ID> does not work when other command line arguments are included i.e. zdb -U <cachefile> -d <pool>/<objset ID> This change fixes the command line parsing to handle this situation. Also fix issue where zdb -r <dataset> <file> does not handle the root <dataset> of the pool. Introduce -N option to force <objset ID> to be interpreted as a numeric objsetID. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #12845 Closes #12944	2022-08-08 16:56:38 -07:00
Alexander Motin	884364ea85	More speculative prefetcher improvements - Make prefetch distance adaptive: up to 4MB prefetch doubles for every hit, same as before, but after that it grows by 1/8 every time the prefetch read does not complete in time to satisfy the demand. My tests show that 4MB is sufficient for wide NVMe pool to saturate single reader thread at 2.5GB/s, while new 64MB maximum allows the same thread to reach 1.5GB/s on wide HDD pool. Further distance increase may increase speed even more, but less dramatic and with higher latency. - Allow early reuse of inactive prefetch streams: streams that never saw hits can be reused immediately if there is a demand, while others can be reused after 1s of inactivity, starting with the oldest. After 2s of inactivity streams are deleted to free resources same as before. This allows by several times increase strided read performance on HDD pool in presence of simultaneous random reads, previously filling the zfetch_max_streams limit for seconds and so blocking most of prefetch. - Always issue intermediate indirect block reads with SYNC priority. Each of those reads if delayed for longer may delay up to 1024 other block prefetches, that may be not good for wide pools. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13452	2022-07-26 10:10:37 -07:00
Alexander Motin	6e1e90d64c	Improve mg_aliquot math When calculating mg_aliquot alike to #12046 use number of unique data disks in the vdev, not the total number of children vdev. Increase default value of the tunable from 512KB to 1MB to compensate. Before this change each disk in striped pool was getting 512KB of sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB. After this change in all the cases each disk should get 1MB. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13388	2022-07-26 10:10:37 -07:00
Alexander Motin	dd9c110ab5	Improve log spacemap load time Previous flushing algorithm limited only total number of log blocks to the minimum of 256K and 4x number of metaslabs in the pool. As result, system with 1500 disks with 1000 metaslabs each, touching several new metaslabs each TXG could grow spacemap log to huge size without much benefits. We've observed one of such systems importing pool for about 45 minutes. This patch improves the situation from five sides: - By limiting maximum period for each metaslab to be flushed to 1000 TXGs, that effectively limits maximum number of per-TXG spacemap logs to load to the same number. - By making flushing more smooth via accounting number of metaslabs that were touched after the last flush and actually need another flush, not just ms_unflushed_txg bump. - By applying zfs_unflushed_log_block_pct to the number of metaslabs that were touched after the last flush, not all metaslabs in the pool. - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in advance, making log spacemap load process for wide HDD pool CPU-bound, accelerating it by many times. - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing single-threaded by nature log processing time from ~10 to ~5 minutes. As further optimization we could skip bumping ms_unflushed_txg for metaslabs not touched since the last flush, but that would be an incompatible change, requiring new pool feature. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #12789	2022-07-26 10:10:37 -07:00
наб	2a64eeb6c7	man: zpool-import.8: -d -or -c Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13437	2022-05-10 13:36:37 -07:00
наб	1781ee703b	Add dracut.zfs.7 Thorough documentation with a dracut.bootup(7)-style flowchart, dracut.cmdline(7)-style cmdline listing, and per-file docs like the old README Upstream-commit: `e3fc330d6c` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13291	2022-05-06 12:01:48 -07:00
наб	361dc138b1	Document zfs inherit -S's interaction with noninheritable properties Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Upstream-commit: `92295af800` Closes #11894 Closes #13335	2022-04-21 11:09:35 -07:00
Brian Behlendorf	9f6943504a	Default to zfs_dmu_offset_next_sync=1 Strict hole reporting was previously disabled by default as a performance optimization. However, this has led to confusion over the expected behavior and a variety of workarounds being adopted by consumers of ZFS. Change the default behavior to always report holes and force the TXG sync. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Upstream-commit: `05b3eb6d23` Ref: #13261 Closes #12746	2022-04-01 09:59:47 -07:00
наб	fe6f2651f5	etc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]:// * etc/systemd/zfs-mount-generator: serialise The wins for a relatively normal workload are rather slim: real 0.02119s/0.00985s=2.15029x user 0.02130s/0.00346s=6.15560x sys 0.03858s/0.00643s=6.00062x wall-total 0.014518s/0.005925s=2.45009x wall-init 0.014518s/0.002457s=5.90684x wall-real 0.014518s/0.003467s=4.18668x But this is a big win on machines with a lot of datasets and expensive forks. For example, the gain on a VM on my work laptop with 900+ legacy-mount Docker datasets, the original gains from the C rewrite were only five-fold: real 0.516s/0.102s=5.05882x user 0.237s/0.143s=1.65734x sys 0.287s/0.100s=2.87x And this serial variant gains this back there as well: real 0.102s/0.008s=12.75x user 0.143s/0.007s=20.42857 sys 0.100s/0.001s=100x wall-total 0.09717s/0.00319s=30.40255x wall-init 0.00203s/0.00200s=1.015941x wall-real 0.09513s/0.00118s=80.02043x For a total of real 0.516s/0.008s=64.5x user 0.237s/0.007s=33.85714x sys 0.287s/0.001s=287x Suggested-by: Richard Laager <rlaager@wiktel.com> * etc/systemd/zfs-mount-generator: pull in network for keylocation=https Also simplify RequiresMountsFor= handling Ref: #11956 Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Upstream-commit: `4325de09cd` Closes #12138	2022-04-01 09:58:45 -07:00
наб	5a21214be8	zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Upstream-commit: `344bbc82e7` Closes #12829	2022-04-01 09:58:45 -07:00
наб	336c6d5f54	zfs-receive.8: properly unlight = in option setting Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13101	2022-02-16 17:58:56 -08:00
наб	4b3fbf3c16	zfs-receive.8: fix Op Fl x Ar encryption in running text Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13101	2022-02-16 17:58:56 -08:00
наб	d24bdf4ee4	zpool-import.8: WARNING should be emphasised Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13082	2022-02-16 17:58:56 -08:00
наб	11bd8cd002	zpool-import.8: newpool is Ar, not Sy Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13082	2022-02-16 17:58:56 -08:00
наб	a38e7bc922	zpoolprops.7: document leaked It's noted very scarcely in the code as it stands, indeed the only actual comment on this is /* * We have finished background destroying, but there is still * some space left in the dp_free_dir. Transfer this leaked * space to the dp_leak_dir. */ Introduced in `fbeddd60b7` ("Illumos 4390 - I/O errors can corrupt space map when deleting fs/vol"), which explains, alongside the references, that this can only happen with a corrupted pool Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13081	2022-02-16 17:58:56 -08:00
Zhu Chuang	d4e8dcf07e	Correct a typo in zfs-receive.8 Should be `-o keyformat=passphrase` instead of `-o -keyformat=passphrase` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chuang Zhu <chuang@melty.land> Closes #13072	2022-02-16 17:58:56 -08:00
Brian Behlendorf	7f4f461bcf	Clarify `failmode=wait` documentation Nowhere in the description of the failmode property does it clearly state how to bring a suspended pool back online. Add a few words to property description and the zpool-clear(8) man page. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12907 Closes #9395	2022-02-16 17:58:55 -08:00
chrisrd	5987838a3f	man: speling Fix spelling. Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Closes #12911	2022-02-16 17:58:55 -08:00
наб	efbed102f0	zfs-share.8: document -l flag Description stolen from zfs-mount.8 Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12067	2022-02-16 17:58:55 -08:00
наб	9cbc2ed20f	libzfs: add keylocation=https://, backed by fetch(3) or libcurl Add support for http and https to the keylocation property to allow encryption keys to be fetched from the specified URL. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #9543 Closes #9947 Closes #11956	2022-02-16 17:58:37 -08:00
D. Ebdrup	4d4f0d1a05	zfsprops.7: Add note about comma-separation This change primarily seeks to make implicit documentation explicit, as it is not outright stated that options should be comma-separated, nor is there a reason given for it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org> Closes #12579	2022-02-15 16:20:12 -08:00
Georgy Yakovlev	f471a0a0a7	systemd: add weekly and monthly scrub timers Timers can be enabled as follows: systemctl enable zfs-scrub-weekly@rpool.timer --now systemctl enable zfs-scrub-monthly@datapool.timer --now Each timer will pull in zfs-scrub@${poolname}.service, which is not schedule-specific. Added PERIODIC SCRUB section to zpool-scrub.8. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org> Closes #12193	2022-02-10 11:04:35 -08:00
Alexander Motin	786abf5321	Reduce number of arc_prune threads On FreeBSD vnode reclamation is single-threaded, protected by single global lock. Linux seems to be able to use a thread per mount point, but at this time it creates more harm than good. Reduce number of threads to 1, adding tunable in case somebody wants to try more. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Chris Dunlop <chris@onthe.net.au> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12896 Issue #9966	2022-02-03 15:28:01 -08:00
Brian Behlendorf	664d487a5d	Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency When using lseek(2) to report data/holes memory mapped regions of the file were ignored. This could result in incorrect results. To handle this zfs_holey_common() was updated to asynchronously writeback any dirty mmap(2) regions prior to reporting holes. Additionally, while not strictly required, the dn_struct_rwlock is now held over the dirty check to prevent the dnode structure from changing. This ensures that a clean dnode can't be dirtied before the data/hole is located. The range lock is now also taken to ensure the call cannot race with zfs_write(). Furthermore, the code was refactored to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #11900 Closes #12724	2021-11-05 08:08:55 -07:00
Sam Hathaway	e78d06f89b	zpool-remove.8: describe top-level vdev sector size limitation Document that top-level vdevs cannot be removed unless all top-level vdevs have the same sector size. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Sam Hathaway <sam@sam-hathaway.com> Closes #11339 Closes #12472	2021-09-14 14:32:16 -07:00
Gordon Bergling	5de6e4ec94	zfs.4: Fix typo s/compatiblity/compatibility/ Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Gordon Bergling <gbergling@googlemail.com> Closes #12464	2021-09-14 14:31:50 -07:00
George Melikov	c07ed69577	Man zpool-scrub.8: describe sequential scrub Describe sequential scrub and add examples of scrub status. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #12429	2021-09-14 14:29:46 -07:00
Václav Skála	898b1e173c	Add missing properties to zfs allow manpage Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Václav Skála <skala@vshosting.cz> Closes #12402	2021-09-14 13:08:19 -07:00
Rich Ercolani	056c273939	Correct zfs-send(8) on readonly sends zfs-send(8) claimed in the flags list you could use -pR when sending a readonly filesystem or volume. You cannot. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12336	2021-09-14 12:38:51 -07:00
Alexander Motin	45305a067f	Fix ARC ghost states eviction accounting arc_evict_hdr() returns number of evicted bytes in scope of specific state. For ghost states it does not mean the amount of really freed memory, but the logical buffer size. It is correct for the eviction process, but not for waking up threads waiting for ARC size reduction, as added in "Revise ARC shrinker algorithm" commit, causing premature wakeups while ARC is still overflowed, allowing even bigger overflow, plus processing overhead when next allocation will also get blocked, probably also for too short time. To fix that make arc_evict_hdr() also return the amount of really freed memory, which for the ghost states is only the header, and use it to update arc_evict_count instead. Originally I was thinking to not return it at all, since arc_get_data_impl() does not account for the headers, but decided that some slow allocation progress is better than long waits, reaching on my tests up to 100ms. To reduce negative latency effects of long time periods when reclaim thread can free little real memory, start reclamation process earlier, before we actually reached the overflow threshold, when we have to throttle new allocations. We can also do it without taking global arc_evict_lock, reducing the contention. Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12279	2021-09-14 12:38:05 -07:00
наб	4e0fff2e02	zgenhostid.8: revisit Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12212	2021-06-10 10:50:16 -07:00
наб	14973b917c	Consistentify miscellaneous style on remaining manpages Most notably this fixes the vdev_id(8) non-.Xrs in vdev_id.conf.5 Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12212	2021-06-10 10:50:16 -07:00
наб	a444efb6d7	Move properties, parameters, events, and concepts around manual sections The pages moved as follows: zpool-features.{5 => 7} spl{-module-parameters.5 => .4} zfs{-module-parameters.5 => .4} zfs-events.5 => into zpool-events.8 zfsconcepts.{8 => 7} zfsprops.{8 => 7} zpoolconcepts.{8 => 7} zpoolprops.{8 => 7} Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Co-authored-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org> Closes #12149 Closes #12212	2021-06-10 10:50:16 -07:00

780 Commits