Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Richard Yao	ab8d9c1783	Cleanup: 64-bit kernel module parameters should use fixed width types Various module parameters such as `zfs_arc_max` were originally `uint64_t` on OpenSolaris/Illumos, but were changed to `unsigned long` for Linux compatibility because Linux's kernel default module parameter implementation did not support 64-bit types on 32-bit platforms. This caused problems when porting OpenZFS to Windows because its LLP64 memory model made `unsigned long` a 32-bit type on 64-bit, which created the undesireable situation that parameters that should accept 64-bit values could not on 64-bit Windows. Upon inspection, it turns out that the Linux kernel module parameter interface is extensible, such that we are allowed to define our own types. Rather than maintaining the original type change via hacks to to continue shrinking module parameters on 32-bit Linux, we implement support for 64-bit module parameters on Linux. After doing a review of all 64-bit kernel parameters (found via the man page and also proposed changes by Andrew Innes), the kernel module parameters fell into a few groups: Parameters that were originally 64-bit on Illumos: * dbuf_cache_max_bytes * dbuf_metadata_cache_max_bytes * l2arc_feed_min_ms * l2arc_feed_secs * l2arc_headroom * l2arc_headroom_boost * l2arc_write_boost * l2arc_write_max * metaslab_aliquot * metaslab_force_ganging * zfetch_array_rd_sz * zfs_arc_max * zfs_arc_meta_limit * zfs_arc_meta_min * zfs_arc_min * zfs_async_block_max_blocks * zfs_condense_max_obsolete_bytes * zfs_condense_min_mapping_bytes * zfs_deadman_checktime_ms * zfs_deadman_synctime_ms * zfs_initialize_chunk_size * zfs_initialize_value * zfs_lua_max_instrlimit * zfs_lua_max_memlimit * zil_slog_bulk Parameters that were originally 32-bit on Illumos: * zfs_per_txg_dirty_frees_percent Parameters that were originally `ssize_t` on Illumos: * zfs_immediate_write_sz Note that `ssize_t` is `int32_t` on 32-bit and `int64_t` on 64-bit. It has been upgraded to 64-bit. Parameters that were `long`/`unsigned long` because of Linux/FreeBSD influence: * l2arc_rebuild_blocks_min_l2size * zfs_key_max_salt_uses * zfs_max_log_walking * zfs_max_logsm_summary_length * zfs_metaslab_max_size_cache_sec * zfs_min_metaslabs_to_flush * zfs_multihost_interval * zfs_unflushed_log_block_max * zfs_unflushed_log_block_min * zfs_unflushed_log_block_pct * zfs_unflushed_max_mem_amt * zfs_unflushed_max_mem_ppm New parameters that do not exist in Illumos: * l2arc_trim_ahead * vdev_file_logical_ashift * vdev_file_physical_ashift * zfs_arc_dnode_limit * zfs_arc_dnode_limit_percent * zfs_arc_dnode_reduce_percent * zfs_arc_meta_limit_percent * zfs_arc_sys_free * zfs_deadman_ziotime_ms * zfs_delete_blocks * zfs_history_output_max * zfs_livelist_max_entries * zfs_max_async_dedup_frees * zfs_max_nvlist_src_size * zfs_rebuild_max_segment * zfs_rebuild_vdev_limit * zfs_unflushed_log_txg_max * zfs_vdev_max_auto_ashift * zfs_vdev_min_auto_ashift * zfs_vnops_read_chunk_size * zvol_max_discard_blocks Rather than clutter the lists with commentary, the module parameters that need comments are repeated below. A few parameters were defined in Linux/FreeBSD specific code, where the use of ulong/long is not an issue for portability, so we leave them alone: * zfs_delete_blocks * zfs_key_max_salt_uses * zvol_max_discard_blocks The documentation for a few parameters was found to be incorrect: * zfs_deadman_checktime_ms - incorrectly documented as int * zfs_delete_blocks - not documented as Linux only * zfs_history_output_max - incorrectly documented as int * zfs_vnops_read_chunk_size - incorrectly documented as long * zvol_max_discard_blocks - incorrectly documented as ulong The documentation for these has been fixed, alongside the changes to document the switch to fixed width types. In addition, several kernel module parameters were percentages or held ashift values, so being 64-bit never made sense for them. They have been downgraded to 32-bit: * vdev_file_logical_ashift * vdev_file_physical_ashift * zfs_arc_dnode_limit_percent * zfs_arc_dnode_reduce_percent * zfs_arc_meta_limit_percent * zfs_per_txg_dirty_frees_percent * zfs_unflushed_log_block_pct * zfs_vdev_max_auto_ashift * zfs_vdev_min_auto_ashift Of special note are `zfs_vdev_max_auto_ashift` and `zfs_vdev_min_auto_ashift`, which were already defined as `uint64_t`, and passed to the kernel as `ulong`. This is inherently buggy on big endian 32-bit Linux, since the values would not be written to the correct locations. 32-bit FreeBSD was unaffected because its sysctl code correctly treated this as a `uint64_t`. Lastly, a code comment suggests that `zfs_arc_sys_free` is Linux-specific, but there is nothing to indicate to me that it is Linux-specific. Nothing was done about that. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Original-patch-by: Andrew Innes <andrew.c12@gmail.com> Original-patch-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13984 Closes #14004	2022-10-13 10:03:29 -07:00
Mateusz Guzik	eb9bec0a5d	Bring per_txg_dirty_frees_percent back to 30 The current value causes significant artificial slowdown during mass parallel file removal, which can be observed both on FreeBSD and Linux when running real workloads. Sample results from Linux doing make -j 96 clean after an allyesconfig modules build: before: 4.14s user 6.79s system 48% cpu 22.631 total after: 4.17s user 6.44s system 153% cpu 6.927 total FreeBSD results in the ticket. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13932 Closes #13938	2022-09-27 17:38:03 -07:00
Richard Yao	fdc2d30371	Cleanup: Specify unsignedness on things that should not be signed In #13871, zfs_vdev_aggregation_limit_non_rotating and zfs_vdev_aggregation_limit being signed was pointed out as a possible reason not to eliminate an unnecessary MAX(unsigned, 0) since the unsigned value was assigned from them. There is no reason for these module parameters to be signed and upon inspection, it was found that there are a number of other module parameters that are signed, but should not be, so we make them unsigned. Making them unsigned made it clear that some other variables in the code should also be unsigned, so we also make those unsigned. This prevents users from setting negative values that could potentially cause bad behaviors. It also makes the code slightly easier to understand. Mostly module parameters that deal with timeouts, limits, bitshifts and percentages are made unsigned by this. Any that are boolean are left signed, since whether booleans should be considered signed or unsigned does not matter. Making zfs_arc_lotsfree_percent unsigned caused a `zfs_arc_lotsfree_percent >= 0` check to become redundant, so it was removed. Removing the check was also necessary to prevent a compiler error from -Werror=type-limits. Several end of line comments had to be moved to their own lines because replacing int with uint_t caused us to exceed the 80 character limit enforced by cstyle.pl. The following were kept signed because they are passed to taskq_create(), which expects signed values and modifying the OpenSolaris/Illumos DDI is out of scope of this patch: * metaslab_load_pct * zfs_sync_taskq_batch_pct * zfs_zil_clean_taskq_nthr_pct * zfs_zil_clean_taskq_minalloc * zfs_zil_clean_taskq_maxalloc * zfs_arc_prune_task_threads Also, negative values in those parameters was found to be harmless. The following were left signed because either negative values make sense, or more analysis was needed to determine whether negative values should be disallowed: * zfs_metaslab_switch_threshold * zfs_pd_bytes_max * zfs_livelist_min_percent_shared zfs_multihost_history was made static to be consistent with other parameters. A number of module parameters were marked as signed, but in reality referenced unsigned variables. upgrade_errlog_limit is one of the numerous examples. In the case of zfs_vdev_async_read_max_active, it was already uint32_t, but zdb had an extern int declaration for it. Interestingly, the documentation in zfs.4 was right for upgrade_errlog_limit despite the module parameter being wrongly marked, while the documentation for zfs_vdev_async_read_max_active (and friends) was wrong. It was also wrong for zstd_abort_size, which was unsigned, but was documented as signed. Also, the documentation in zfs.4 incorrectly described the following parameters as ulong when they were int: * zfs_arc_meta_adjust_restarts * zfs_override_estimate_recordsize They are now uint_t as of this patch and thus the man page has been updated to describe them as uint. dbuf_state_index was left alone since it does nothing and perhaps should be removed in another patch. If any module parameters were missed, they were not found by `grep -r 'ZFS_MODULE_PARAM' \| grep ', INT'`. I did find a few that grep missed, but only because they were in files that had hits. This patch intentionally did not attempt to address whether some of these module parameters should be elevated to 64-bit parameters, because the length of a long on 32-bit is 32-bit. Lastly, it was pointed out during review that uint_t is a better match for these variables than uint32_t because FreeBSD kernel parameter definitions are designed for uint_t, whose bit width can change in future memory models. As a result, we change the existing parameters that are uint32_t to use uint_t. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Neal Gompa <ngompa@datto.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13875	2022-09-27 16:42:41 -07:00
Brian Behlendorf	505df8d133	Dynamically size dbuf hash mutex array Incorrectly sizing the array of hash locks used to protect the dbuf hash table can lead to contention and reduce performance. We could unconditionally allocate a larger array for the locks but it's wasteful, particularly for a low-memory system. Instead, dynamically allocate the array of locks and scale it based on total system memory. Additionally, add a new `dbuf_mutex_cache_shift` module option which can be used to override the hash lock array size. This is disabled by default (dbuf_mutex_hash_shift=0) and can only be set at module load time. The minimum target array size is set to 8192, this matches the current constant value. Note that the count of the dbuf hash table and count of the mutex array were added to the /proc/spl/kstat/zfs/dbufstats kstat. Finally, this change removes the _KERNEL conditional checks. These were not required since for the user space build there is no difference between the kmem and vmem interfaces. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13928	2022-09-22 12:59:56 -07:00
Tino Reichardt	eeca9d27d7	Add zfs_blake3_impl to zfs.4 The zfs module parameter zfs_blake3_impl got no manual page entry while adding BLAKE3 to OpenZFS. This commit adds the required notes about the parameter into zfs.4 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Co-authored-by: Ryan Moeller <ryan@freqlabs.com> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #13725	2022-09-16 14:25:53 -07:00
Richard Yao	b24d1c77f7	Add zfs_btree_verify_intensity kernel module parameter I see a few issues in the issue tracker that might be aided by being able to turn this on. We have no module parameter for it, so I would like to add one. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #13874	2022-09-15 16:22:33 -07:00
Mateusz Piotrowski	fa22ec569c	Use correct mdoc macros for arguments Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org> Closes #13890	2022-09-15 14:22:00 -07:00
Alexander Motin	37f6845c6f	Improve too large physical ashift handling When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misanigned writes may be not as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. May be one day allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or explicitly told to, but not as an "optimization". There are some read-intensive NVMe SSDs that report Preferred Write Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #13798	2022-09-08 10:30:53 -07:00
Andriy Gapon	ee9f3bca55	Add zfs.sync.snapshot_rename Only the single snapshot rename is provided. The recursive or more complex rename can be scripted. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Andriy Gapon <avg@FreeBSD.org> Closes #13802	2022-09-02 13:31:19 -07:00
Umer Saleem	a582d52993	Updates for snapshots_changed property Currently, snapshots_changed property is stored in dd_props_zapobj, due to which the property is assumed to be local. This causes a difference in behavior with respect to other readonly properties. This commit stores the snapshots_changed property in dd_object. Source is not set to local in this case, which makes it consistent with other readonly properties. This commit also updates the date string format to include seconds. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #13785	2022-08-24 14:20:43 -07:00
George Amanakis	0c4064d9a0	Fix zpool status in case of unloaded keys When scrubbing an encrypted filesystem with unloaded key still report an error in zpool status. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alek Pinchuk <apinchuk@axcient.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #13675 Closes #13717	2022-08-22 17:42:01 -07:00
Paul Dagnelie	17e212652d	Prevent zevent list from consuming all of kernel memory There are a couple changes included here. The first is to introduce a cap on the size the ZED will grow the zevent list to. One million entries is more than enough for most use cases, and if you are overflowing that value, the problem needs to be addressed another way. The value is also tunable, for those who want the limit to be higher or lower. The other change is to add a kernel module parameter that allows snapshot creation/deletion to be exempted from the history logging; for most workloads, having these things logged is valuable, but for some workloads it produces large quantities of log spam and isn't especially helpful. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Issue #13374 Closes #13753	2022-08-22 12:36:22 -07:00
George Melikov	fbc210fab2	Enable relatime by default Linux sets relatime on mount by default for any file system, but relatime=off in ZFS disables it explicitly. Let's be consistent with other file systems on Linux. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #13614	2022-08-12 14:20:25 -07:00
Umer Saleem	9681de4657	Add snapshots_changed as property Make dd_snap_cmtime property persistent across mount and unmount operations by storing in ZAP and restore the value from ZAP on hold into dd_snap_cmtime instead of updating it. Expose dd_snap_cmtime as 'snapshots_changed' property that provides a mechanism to quickly determine whether snapshot list for dataset has changed without having to mount a dataset or iterate the snapshot list. It specifies the time at which a snapshot for a dataset was last created or deleted. This allows us to be more efficient how often we query snapshots. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #13635	2022-08-02 16:45:30 -07:00
Alek P	e8cf3a4f76	Implement a new type of zfs receive: corrective receive (-c) This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes #9372	2022-07-28 15:52:46 -07:00
Tino Reichardt	1d3ba0bf01	Replace dead opensolaris.org license link The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #13619	2022-07-11 14:16:13 -07:00
Alan Somers	ccf89b39fe	Add a "zstream decompress" subcommand It can be used to repair a ZFS file system corrupted by ZFS bug #12762. Use it like this: zfs send -c <DS> \| \ zstream decompress <OBJECT>,<OFFSET>[,<COMPRESSION_ALGO>] ... \| \ zfs recv <DST_DS> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Alan Somers <asomers@gmail.com> Sponsored-by: Axcient Workaround for #12762 Closes #13256	2022-06-24 13:28:42 -07:00
Toomas Soome	83691bebf0	Use macros for quotes and such Use Dq,Pq/Po/Pc macros. illumos dumpadm is now in section 8. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #13586	2022-06-24 09:48:10 -07:00
Julian Brunner	482505fd42	Add weekly and monthly systemd timers for trimming On machines using systemd, trim timers can be enabled on a per-pool basis. Weekly and monthly timer units are provided. Timers can be enabled as follows: systemctl enable zfs-trim-weekly@rpool.timer --now systemctl enable zfs-trim-monthly@datapool.timer --now Each timer will pull in zfs-trim@${poolname}.service, which is not schedule-specific. The manpage zpool-trim has been updated accordingly. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Julian Brunner <julian.brunner@gmail.com> Closes #13544	2022-06-10 18:22:14 -07:00
Will Andrews	4ed5e25074	Add Linux namespace delegation support This allows ZFS datasets to be delegated to a user/mount namespace Within that namespace, only the delegated datasets are visible Works very similarly to Zones/Jailes on other ZFS OSes As a user: ``` $ unshare -Um $ zfs list no datasets available $ echo $$ 1234 ``` As root: ``` # zfs list NAME ZONED MOUNTPOINT containers off /containers containers/host off /containers/host containers/host/child off /containers/host/child containers/host/child/gchild off /containers/host/child/gchild containers/unpriv on /unpriv containers/unpriv/child on /unpriv/child containers/unpriv/child/gchild on /unpriv/child/gchild # zfs zone /proc/1234/ns/user containers/unpriv ``` Back to the user namespace: ``` $ zfs list NAME USED AVAIL REFER MOUNTPOINT containers 129M 47.8G 24K /containers containers/unpriv 128M 47.8G 24K /unpriv containers/unpriv/child 128M 47.8G 128M /unpriv/child ``` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Will Andrews <will.andrews@klarasystems.com> Signed-off-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com> Co-authored-by: Allan Jude <allan@klarasystems.com> Co-authored-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com> Sponsored-by: Buddy <https://buddy.works> Closes #12263	2022-06-10 09:51:46 -07:00
Tony Hutter	6f73d02168	zvol: Support blk-mq for better performance Add support for the kernel's block multiqueue (blk-mq) interface in the zvol block driver. blk-mq creates multiple request queues on different CPUs rather than having a single request queue. This can improve zvol performance with multithreaded reads/writes. This implementation uses the blk-mq interfaces on 4.13 or newer kernels. Building against older kernels will fall back to the older BIO interfaces. Note that you must set the `zvol_use_blk_mq` module param to enable the blk-mq API. It is disabled by default. In addition, this commit lets the zvol blk-mq layer process whole `struct request` IOs at a time, rather than breaking them down into their individual BIOs. This reduces dbuf lock contention and overhead versus the legacy zvol submit_bio() codepath. sequential dd to one zvol, 8k volblocksize, no O_DIRECT: legacy submit_bio() 292MB/s write 453MB/s read this commit 453MB/s write 885MB/s read It also introduces a new `zvol_blk_mq_chunks_per_thread` module parameter. This parameter represents how many volblocksize'd chunks to process per each zvol thread. It can be used to tune your zvols for better read vs write performance (higher values favor write, lower favor read). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #13148 Issue #12483	2022-06-09 08:10:38 -06:00
Tino Reichardt	985c33b132	Introduce BLAKE3 checksums as an OpenZFS feature This commit adds BLAKE3 checksums to OpenZFS, it has similar performance to Edon-R, but without the caveats around the latter. Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3 Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3 Short description of Wikipedia: BLAKE3 is a cryptographic hash function based on Bao and BLAKE2, created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real World Crypto. BLAKE3 is a single algorithm with many desirable features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE and BLAKE2, which are algorithm families with multiple variants. BLAKE3 has a binary tree structure, so it supports a practically unlimited degree of parallelism (both SIMD and multithreading) given enough input. The official Rust and C implementations are dual-licensed as public domain (CC0) and the Apache License. Along with adding the BLAKE3 hash into the OpenZFS infrastructure a new benchmarking file called chksum_bench was introduced. When read it reports the speed of the available checksum functions. On Linux: cat /proc/spl/kstat/zfs/chksum_bench On FreeBSD: sysctl kstat.zfs.misc.chksum_bench This is an example output of an i3-1005G1 test system with Debian 11: implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1196 1602 1761 1749 1762 1759 1751 skein-generic 546 591 608 615 619 612 616 sha256-generic 240 300 316 314 304 285 276 sha512-generic 353 441 467 476 472 467 426 blake3-generic 308 313 313 313 312 313 312 blake3-sse2 402 1289 1423 1446 1432 1458 1413 blake3-sse41 427 1470 1625 1704 1679 1607 1629 blake3-avx2 428 1920 3095 3343 3356 3318 3204 blake3-avx512 473 2687 4905 5836 5844 5643 5374 Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1840 2458 2665 2719 2711 2723 2693 skein-generic 870 966 996 992 1003 1005 1009 sha256-generic 415 442 453 455 457 457 457 sha512-generic 608 690 711 718 719 720 721 blake3-generic 301 313 311 309 309 310 310 blake3-sse2 343 1865 2124 2188 2180 2181 2186 blake3-sse41 364 2091 2396 2509 2463 2482 2488 blake3-avx2 365 2590 4399 4971 4915 4802 4764 Output on Debian 5.10.0-9-powerpc64le system: (POWER 9) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1213 1703 1889 1918 1957 1902 1907 skein-generic 434 492 520 522 511 525 525 sha256-generic 167 183 187 188 188 187 188 sha512-generic 186 216 222 221 225 224 224 blake3-generic 153 152 154 153 151 153 153 blake3-sse2 391 1170 1366 1406 1428 1426 1414 blake3-sse41 352 1049 1212 1174 1262 1258 1259 Output on Debian 5.10.0-11-arm64 system: (Pi400) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 487 603 629 639 643 641 641 skein-generic 271 299 303 308 309 309 307 sha256-generic 117 127 128 130 130 129 130 sha512-generic 145 165 170 172 173 174 175 blake3-generic 81 29 71 89 89 89 89 blake3-sse2 112 323 368 379 380 371 374 blake3-sse41 101 315 357 368 369 364 360 Structurally, the new code is mainly split into these parts: - 1x cross platform generic c variant: blake3_generic.c - 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512) - 2x assembly for ARMv8 (NEON converted from SSE2) - 2x assembly for PPC64-LE (POWER8 converted from SSE2) - one file for switching between the implementations Note the PPC64 assembly requires the VSX instruction set and the kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly. Reviewed-by: Felix Dörre <felix@dogcraft.de> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Closes #10058 Closes #12918	2022-06-08 15:55:57 -07:00
Rich Ercolani	bc8192cd5b	Corrected parameters for zstd early abort That'll teach me to try and recall them from the definition. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13519	2022-05-31 15:41:33 -07:00
Brian Behlendorf	d98a67a53a	Replace EXTRA_DIST with dist_noinst_DATA The EXTRA_DIST variable is ignored when used in the FALSE conditional of a Makefile.am. This results in the `make dist` target omitting these files from the generated tarball unless CONFIG_USER is defined. This issue can be avoided by switching to use the dist_noinst_DATA variable which is handled as expected by autoconf. This change also adds support for --with-config=dist as an alias for --with-config=srpm and updates the GitHub workflows to use it. Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13459 Closes #13505	2022-05-26 09:24:50 -07:00
Alexander Motin	6aa8c21a2a	More speculative prefetcher improvements - Make prefetch distance adaptive: up to 4MB prefetch doubles for every, hit same as before, but after that it grows by 1/8 every time the prefetch read does not complete in time to satisfy the demand. My tests show that 4MB is sufficient for wide NVMe pool to saturate single reader thread at 2.5GB/s, while new 64MB maximum allows the same thread to reach 1.5GB/s on wide HDD pool. Further distance increase may increase speed even more, but less dramatic and with higher latency. - Allow early reuse of inactive prefetch streams: streams that never saw hits can be reused immediately if there is a demand, while others can be reused after 1s of inactivity, starting with the oldest. After 2s of inactivity streams are deleted to free resources same as before. This allows by several times increase strided read performance on HDD pool in presence of simultaneous random reads, previously filling the zfetch_max_streams limit for seconds and so blocking most of prefetch. - Always issue intermediate indirect block reads with SYNC priority. Each of those reads if delayed for longer may delay up to 1024 other block prefetches, that may be not good for wide pools. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13452	2022-05-25 10:12:52 -07:00
Alexander Motin	84d0a03f3e	Refactor Log Size Limit Original Log Size Limit implementation blocked all writes in case of limit reached until the TXG is committed and the log is freed. It caused huge delays and following speed spikes in application writes. This implementation instead smoothly throttles writes, using exactly the same mechanism as used for dirty data. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: jxdking <lostking2008@hotmail.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Issue #12284 Closes #13476	2022-05-24 09:46:35 -07:00
Rich Ercolani	f375b23c02	Tiered early abort, zstd edition It turns out that "do LZ4 and zstd-1 both fail" is a great heuristic for "don't even bother trying higher zstd tiers". By way of illustration: $ cat /incompress \| mbuffer \| zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_normal summary: 39.8 GiByte in 3min 40.2sec - average of 185 MiB/s $ echo 3 \| sudo tee /sys/module/zzstd/parameters/zstd_lz4_pass 3 $ cat /incompress \| mbuffer -m 4G \| zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_patched summary: 39.8 GiByte in 48.6sec - average of 839 MiB/s $ sudo zfs list -p -o name,used,lused,ratio evenfaster/lowcomp_1M_zstd12_normal evenfaster/lowcomp_1M_zstd12_patched NAME USED LUSED RATIO evenfaster/lowcomp_1M_zstd12_normal 39549931520 42721221632 1.08 evenfaster/lowcomp_1M_zstd12_patched 39626399744 42721217536 1.07 $ python3 -c "print(39626399744 - 39549931520)" 76468224 $ I'll take 76 MB out of 42 GB for > 4x speedup. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13244	2022-05-24 09:43:22 -07:00
наб	38f4d99f76	linux: libzfs: simplify module-loaded check The short-path is now one access() call, we always modprobe zfs (ZFS_MODULE_LOADING which doesn't use the libzfs boolean parsing is gone), and we use a simple inotify IN_CREATE loop with a timerfd timeout rather than 10ms kernel-style polling There's one substantial difference: ZFS_MODULE_TIMEOUT=-1 now means "never give up", rather than "wait 10 minutes" Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13330	2022-05-18 12:51:42 -07:00
Mateusz Piotrowski	0c11a2738d	Fix typos in zfs-bookmark examples Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org> Closes #13456	2022-05-12 09:34:24 -07:00
наб	888914486e	ztest: take -B ./path/to/ztest, LD_LIBRARY_PATH=./path/lib:$L_L_P This changes the behaviour of -B from the illumos one which would, in the example in the manual, take just ./chroots/lenny; this, however, is more versatile, and scales much better for systems with ZFS in /usr/local, for example Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13411 Closes #1770	2022-05-11 10:33:12 -07:00
наб	93a8c85e7b	Move test-runner.1 into top-level man/ dist delta: +zfs-2.1.99/man/man1/test-runner.1 -zfs-2.1.99/tests/test-runner/man/ -zfs-2.1.99/tests/test-runner/man/test-runner.1 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:19:18 -07:00
наб	0f6c4fd00e	autoconf: use include directives instead of recursing down man Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:19:10 -07:00
наб	5cdca5b1da	autoconf: use include directives instead of recursing down cmd No installation diff, dist lost -zfs-2.1.99/cmd/fsck_zfs/fsck.zfs which was distributed erroneously, since it's generated Also clean gitrev on clean Also add -e 'any possible bashisms' to default checkbashisms flags, and fully parallelise it and shellcheck, and it works out-of-tree, too Also align the Release in the dist META file correctly Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:18:38 -07:00
Rich Ercolani	a30927f763	Add workaround for broken Linux pipes Linux has an unresolved hang if you resize a pipe with bytes in it. Since there's no obvious way to detect this happening, added a workaround to disable resizing the pipe buffer if you set an environment variable. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13309	2022-05-09 16:33:55 -07:00
наб	d1fd6dbf6d	man: zpool-import.8: -d -or -c Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13437	2022-05-09 16:03:17 -07:00
WHR	a894ae75b8	Fix incorrect use of unit prefix names in man pages Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com> Signed-off-by: WHR <msl0000023508@gmail.com> Closes #13363	2022-05-04 11:44:12 -07:00
Alexander Motin	c55b293287	Improve mg_aliquot math When calculating mg_aliquot alike to #12046 use number of unique data disks in the vdev, not the total number of children vdev. Increase default value of the tunable from 512KB to 1MB to compensate. Before this change each disk in striped pool was getting 512KB of sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB. After this change in all the cases each disk should get 1MB. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13388	2022-05-04 11:33:42 -07:00
Rich Ercolani	f2330bd156	Default zfs_max_recordsize to 16M Increase the default allowed maximum recordsize from 1M to 16M. As described in the zfs(4) man page, there are significant costs which need to be considered before using very large blocks. However, there are scenarios where they make good sense and it should no longer be necessary to artificially restrict their use behind a module option. Note that for 32-bit platforms we continue to leave this restriction in place due to the limited virtual address space available (256-512MB). On these systems only a handful of blocks could be cached at any one time severely impacting performance and potentially stability. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12830 Closes #13302	2022-04-28 15:12:24 -07:00
Alexander Motin	600a02b884	Improve log spacemap load time Previous flushing algorithm limited only total number of log blocks to the minimum of 256K and 4x number of metaslabs in the pool. As result, system with 1500 disks with 1000 metaslabs each, touching several new metaslabs each TXG could grow spacemap log to huge size without much benefits. We've observed one of such systems importing pool for about 45 minutes. This patch improves the situation from five sides: - By limiting maximum period for each metaslab to be flushed to 1000 TXGs, that effectively limits maximum number of per-TXG spacemap logs to load to the same number. - By making flushing more smooth via accounting number of metaslabs that were touched after the last flush and actually need another flush, not just ms_unflushed_txg bump. - By applying zfs_unflushed_log_block_pct to the number of metaslabs that were touched after the last flush, not all metaslabs in the pool. - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in advance, making log spacemap load process for wide HDD pool CPU-bound, accelerating it by many times. - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing single-threaded by nature log processing time from ~10 to ~5 minutes. As further optimization we could skip bumping ms_unflushed_txg for metaslabs not touched since the last flush, but that would be an incompatible change, requiring new pool feature. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #12789	2022-04-26 10:44:21 -07:00
George Amanakis	0409d33273	Improve zpool status output, list all affected datasets Currently, determining which datasets are affected by corruption is a manual process. The primary difficulty in reporting the list of affected snapshots is that since the error was initially found, the snapshot where the error originally occurred in, may have been deleted. To solve this issue, we add the ID of the head dataset of the original snapshot which the error was detected in, to the stored error report. Then any time a filesystem is deleted, the errors associated with it are deleted as well. Any time a clone promote occurs, we modify reports associated with the original head to refer to the new head. The stored error reports are identified by this head ID, the birth time of the block which the error occurred in, as well as some information about the error itself are also stored. Once this information is stored, we can find the set of datasets affected by an error by walking back the list of snapshots in the given head until we find one with the appropriate birth txg, and then traverse through the snapshots of the clone family, terminating a branch if the block was replaced in a given snapshot. Then we report this information back to libzfs, and to the zpool status command, where it is displayed as follows: pool: test state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A scan: scrub repaired 0B in 00:00:00 with 800 errors on Fri Dec 3 08:27:57 2021 config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 sdb ONLINE 0 0 1.58K errors: Permanent errors have been detected in the following files: test@1:/test.0.0 /test/test.0.0 /test/1clone/test.0.0 A new feature flag is introduced to mark the presence of this change, as well as promotion and backwards compatibility logic. This is an updated version of #9175. Rebase required fixing the tests, updating the ABI of libzfs, updating the man pages, fixing bugs, fixing the error returns, and updating the old on-disk error logs to the new format when activating the feature. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Co-authored-by: TulsiJain <tulsi.jain@delphix.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #9175 Closes #12812	2022-04-25 17:25:42 -07:00
наб	6625262c70	man: zfs-send.8: fix -X synopses and description Also clean up the horrendously verbose -X handling in zfs_main() Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13352	2022-04-25 12:46:22 -07:00
наб	e3fc330d6c	Add dracut.zfs.7 Thorough documentation with a dracut.bootup(7)-style flowchart, dracut.cmdline(7)-style cmdline listing, and per-file docs like the old README Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13291	2022-04-20 16:45:47 -07:00
наб	e37e7dd6a6	man: ... -> … again zfs-program.8 is left, but that's literal Lua syntax Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13255	2022-04-20 14:31:10 -07:00
наб	92295af800	Document zfs inherit -S's interaction with noninheritable properties Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11894 Closes #13335	2022-04-20 13:39:11 -07:00
szubersk	4620342a32	zfs(8): remove errors reported by mandoc 1.14.6 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #13315	2022-04-13 11:34:46 -07:00
szubersk	4f0be2bdb0	zfs(8): add requirements towards fs names Provide explicit requirements towards file system naming convention in OpenZFS man pages. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: szubersk <szuberskidamian@gmail.com> Mitigates #13310 Closes #13315	2022-04-13 11:34:28 -07:00
наб	95babbe573	scripts: cstyle: remove unused -h Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13285	2022-04-05 09:36:13 -07:00
наб	14ed1f2424	cstyle.1: note -g, $CI being equivalent to -g Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13285	2022-04-05 09:36:06 -07:00
наб	93f3a3517c	scripts: cstyle: remove unused -C Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13285	2022-04-05 09:35:46 -07:00
наб	7dc782e5c5	cstyle: remove unused -o Remove handling for allowing doxygen- and embedding in splint(?)-style comments. This functionality is unused by OpenZFS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13264	2022-03-30 15:37:28 -07:00
наб	b61595ff86	man: zfs-rename.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	4d4fa5215d	man: zfs-snapshot.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	bc2fe49a4f	man: zfs-promote.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	d842ca3ff2	man: zfs-create.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	3b887e485c	man: zfs-destroy.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	9bda775bd9	man: zfs-rollback.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	c6fd08797f	man: zfs-clone.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	52f3aebdb2	man: zfs-list.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	2057b47b91	man: zfs-send.8, zfs-receive.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	b7b3f19935	man: zfs-diff.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	1e164f89e5	man: zfs-bookmark.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:14 -07:00
наб	c522339fd7	man: zfs-set.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	575e20602c	man: zfs-allow.8: import examples from zfs.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	69b06a77b6	man: zpool-iostat.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	b36069ab88	man: zpool-remove.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	d8dd89acd1	man: zpool-upgrade.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	8ccb3fd6b1	man: zpool-import.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	4849523af0	man: zpool-export.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	4b660ffe7b	man: zpool-destroy.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	74181ca44c	man: zpool-list.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	c34eb6f7a1	man: zpool-add.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	9cdb067539	man: zpool-create.8: import examples from zpool.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228 Closes #13140	2022-03-28 10:13:13 -07:00
наб	a2550c9df9	man: Examples: use subsections instead of lists Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	1ebe4c482d	man: zpool.8: Examples: unmirrored -> non-redundant Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
наб	6571873f1b	man: zpool.8: Examples: use Pa for disks and scripts Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13228	2022-03-28 10:13:13 -07:00
George Melikov	018937884f	zpoolconcepts.7: fix comma typo Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #13239	2022-03-23 08:53:19 -07:00
Sean Eric Fagan	565089f592	Allow zfs send to exclude datasets Add support for a -exclude/-X option to `zfs send` to allow dataset hierarchies to be excluded. Snapshots can be excluded using a channel program; however, this can result in failures with 'zfs send -R'; this option allows them to be excluded. Fortunately, this required a change only to cmd/zfs/zfs_main.c, using the already-existing callback argument to zfs_send() that is currently unused. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Christian Schwarz <christian.schwarz@nutanix.com> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Sean Eric Fagan <kithrup@mac.com> Signed-off-by: Sean Eric Fagan <kithrup@mac.com> Closes #13158	2022-03-18 17:02:12 -07:00
наб	77cdc63d09	zfs-list.8: mention -S after -s it calls back to Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12996	2022-03-15 15:14:46 -07:00
наб	497bf14a4a	zdb.8: cleanup Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13093	2022-03-03 10:44:49 -08:00
Rich Ercolani	56fa4aa96e	Default to ON for compression A simple change, but so many tests break with it, and those are the majority of this. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13078	2022-03-03 10:43:38 -08:00
Jitendra Patidar	361a7e8211	log xattr=sa create/remove/update to ZIL As such, there are no specific synchronous semantics defined for the xattrs. But for xattr=on, it does log to ZIL and zil_commit() is done, if sync=always is set on dataset. This provides sync semantics for xattr=on with sync=always set on dataset. For the xattr=sa implementation, it doesn't log to ZIL, so, even with sync=always, xattrs are not guaranteed to be synced before xattr call returns to caller. So, xattr can be lost if system crash happens, before txg carrying xattr transaction is synced. This change adds xattr=sa logging to ZIL on xattr create/remove/update and xattrs are synced to ZIL (zil_commit() done) for sync=always. This makes xattr=sa behavior similar to xattr=on. Implementation notes: The actual logging is fairly straight-forward and does not warrant additional explanation. However, it has been 14 years since we last added new TX types to the ZIL [1], hence this is the first time we do it after the introduction of zpool features. Therefore, here is an overview of the feature activation and deactivation workflow: 1. The feature must be enabled. Otherwise, we don't log the new record type. This ensures compatibility with older software. 2. The feature is activated per-dataset, since the ZIL is per-dataset. 3. If the feature is enabled and dataset is not for zvol, any append to the ZIL chain will activate the feature for the dataset. Likewise for starting a new ZIL chain. 4. A dataset that doesn't have a ZIL chain has the feature deactivated. We ensure (3) by activating on the first zil_commit() after the feature was enabled. Since activating the features requires waiting for txg sync, the first zil_commit() after enabling the feature will be slower than usual. The downside is that this is really a conservative approximation: even if we never append a 'TX_SETSAXATTR' to the ZIL chain, we pay the penalty for feature activation. The upside is that the user is in control of when we pay the penalty, i.e., upon enabling the feature. We ensure (4) by hooking into zil_sync(), where ZIL destroy actually happens. One more piece on feature activation, since it's spread across multiple functions: zil_commit() zil_process_commit_list() if lwb == NULL // first zil_commit since zil_open zil_create() if no log block pointer in ZIL header: if feature enabled and not active: // CASE 1 enable, COALESCE txg wait with dmu_tx that allocated the log block else // log block was allocated earlier than this zil_open if feature enabled and not active: // CASE 2 enable, EXPLICIT txg wait else // already have an in-DRAM LWB if feature enabled and not active: // this happens when we enable the feature after zil_create // CASE 3 enable, EXPLICIT txg wait [1] `da6c28aaf6` Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Christian Schwarz <christian.schwarz@nutanix.com> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com> Closes #8768 Closes #9078	2022-02-22 13:06:43 -08:00
наб	7454ca413c	zpoo-features.7: raidz -> RAID-Z near dRAID Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:56:21 -08:00
наб	1f4376eb01	zpool-features.7: never-return-enabled consistency Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:56:14 -08:00
наб	58a4848132	zpool-features.7: zfs sendstreams Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:56:06 -08:00
наб	6f763af270	zpool-features.7: spurious line break in enabled_txg Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:58 -08:00
наб	2d232ca806	man: full stop at EOL Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:51 -08:00
наб	6985944e2f	man: -based when -based Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:44 -08:00
наб	a737b415d6	man: IO -> I/O; I/Os -> I/O operations again Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:37 -08:00
наб	aafb89016b	man: final RAIDZ -> RAID-Z Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:29 -08:00
наб	6bf6f5196b	man: final VDEV -> vdev Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13116	2022-02-22 10:55:18 -08:00
Ryan Moeller	5c0061345b	Cross-platform xattr user namespace compatibility ZFS on Linux originally implemented xattr namespaces in a way that is incompatible with other operating systems. On illumos, xattrs do not have namespaces. Every xattr name is visible. FreeBSD has two universally defined namespaces: EXTATTR_NAMESPACE_USER and EXTATTR_NAMESPACE_SYSTEM. The system namespace is used for protected FreeBSD-specific attributes such as MAC labels and pnfs state. These attributes have the namespace string "freebsd:system:" prefixed to the name in the encoding scheme used by ZFS. The user namespace is used for general purpose user attributes and obeys normal access control mechanisms. These attributes have no namespace string prefixed, so xattrs written on illumos are accessible in the user namespace on FreeBSD, and xattrs written to the user namespace on FreeBSD are accessible by the same name on illumos. Linux has several xattr namespaces. On Linux, ZFS encodes the namespace in the xattr name for every namespace, including the user namespace. As a consequence, an xattr in the user namespace with the name "foo" is stored by ZFS with the name "user.foo" and therefore appears on FreeBSD and illumos to have the name "user.foo" rather than "foo". Conversely, none of the xattrs written on FreeBSD or illumos are accessible on Linux unless the name happens to be prefixed with one of the Linux xattr namespaces, in which case the namespace is stripped from the name. This makes xattrs entirely incompatible between Linux and other platforms. We want to make the encoding of user namespace xattrs compatible across platforms. A critical requirement of this compatibility is for xattrs from existing pools from FreeBSD and illumos to be accessible by the same names in the user namespace on Linux. It is also necessary that existing pools with xattrs written by Linux retain access to those xattrs by the same names on Linux. Making user namespace xattrs from Linux accessible by the correct names on other platforms is important. The handling of other namespaces is not required to be consistent. Add a fallback mechanism for listing and getting xattrs to treat xattrs as being in the user namespace if they do not match a known prefix. Do not allow setting or getting xattrs with a name that is prefixed with one of the namespace names used by ZFS on supported platforms. Allow choosing between legacy illumos and FreeBSD compatibility and legacy Linux compatibility with a new tunable. This facilitates replication and migration of pools between hosts with different compatibility needs. The tunable controls whether or not to prefix the namespace to the name. If the xattr is already present with the alternate prefix, remove it so only the new version persists. By default the platform's existing convention is used. Reviewed-by: Christian Schwarz <christian.schwarz@nutanix.com> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11919	2022-02-15 16:35:30 -08:00
наб	0c8aab9586	zfs-receive.8: properly unlight = in option setting Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13101	2022-02-14 13:07:50 -08:00
наб	4631bf7bfb	zfs-receive.8: fix Op Fl x Ar encryption in running text Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13101	2022-02-14 13:07:45 -08:00
наб	4ea4a46c27	zpool-import.8: WARNING should be emphasised Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13082	2022-02-11 11:50:54 -08:00
наб	a9e62ec368	zpool-import.8: newpool is Ar, not Sy Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13082	2022-02-11 11:50:14 -08:00
наб	17fd8f6b0a	zpoolprops.7: document leaked It's noted very scarcely in the code as it stands, indeed the only actual comment on this is /* * We have finished background destroying, but there is still * some space left in the dp_free_dir. Transfer this leaked * space to the dp_leak_dir. */ Introduced in `fbeddd60b7` ("Illumos 4390 - I/O errors can corrupt space map when deleting fs/vol"), which explains, alongside the references, that this can only happen with a corrupted pool Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13081	2022-02-11 11:49:07 -08:00
Zhu Chuang	0a222408f1	Correct a typo in zfs-receive.8 Should be `-o keyformat=passphrase` instead of `-o -keyformat=passphrase` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chuang Zhu <chuang@melty.land> Closes #13072	2022-02-11 11:47:35 -08:00
наб	d9fdba124d	tests: prune remaining xargs(1), add missing zfs-project -c0 note -c0 suppresses diagnoses ‒ it's not just -c but with NULs; cf. http://build.zfsonlinux.org/builders/Debian%2010%20x86_64%20%28TEST%29/builds/10605/steps/shell_4/logs/log search for the second "zfs project -s -p" instance Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12979	2022-01-26 11:30:09 -08:00
Paul Zuchowski	5a4d282f55	Fix problem with zdb -d zdb -d <pool>/<objset ID> does not work when other command line arguments are included i.e. zdb -U <cachefile> -d <pool>/<objset ID> This change fixes the command line parsing to handle this situation. Also fix issue where zdb -r <dataset> <file> does not handle the root <dataset> of the pool. Introduce -N option to force <objset ID> to be interpreted as a numeric objsetID. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #12845 Closes #12944	2022-01-20 10:28:55 -07:00
Brian Behlendorf	7adc190098	Clarify `failmode=wait` documentation Nowhere in the description of the failmode property does it clearly state how to bring a suspended pool back online. Add a few words to property description and the zpool-clear(8) man page. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12907 Closes #9395	2022-01-14 15:30:07 -08:00
наб	12bd322dde	man: zfs.4: miscellaneous cleanup Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12941	2022-01-13 11:09:03 -08:00
chrisrd	0175272f64	man: speling Fix spelling. Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Closes #12911	2022-01-06 11:00:01 -08:00
Manoj Joseph	6b2e32019e	Long opts for zdb This change introduces long options for zdb. It updates the usage message as well to include the long options. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Manoj Joseph <manoj.joseph@delphix.com> Closes #12818	2022-01-06 10:54:32 -08:00
Alexander Motin	462217d1c2	Reduce number of arc_prune threads On FreeBSD vnode reclamation is single-threaded, protected by single global lock. Linux seems to be able to use a thread per mount point, but at this time it creates more harm than good. Reduce number of threads to 1, adding tunable in case somebody wants to try more. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Chris Dunlop <chris@onthe.net.au> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12896 Issue #9966	2021-12-22 17:07:13 -08:00
наб	cf65c33c9c	zfs-share.8: document -l flag Description stolen from zfs-mount.8 Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12067	2021-12-17 12:52:26 -08:00
Georgy Yakovlev	2300621dc7	systemd: add weekly and monthly scrub timers Timers can be enabled as follows: systemctl enable zfs-scrub-weekly@rpool.timer --now systemctl enable zfs-scrub-monthly@datapool.timer --now Each timer will pull in zfs-scrub@${poolname}.service, which is not schedule-specific. Added PERIODIC SCRUB section to zpool-scrub.8. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org> Closes #12193	2021-12-16 11:47:22 -08:00
наб	344bbc82e7	zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12829	2021-12-13 15:49:40 -08:00
Brian Behlendorf	05b3eb6d23	Default to zfs_dmu_offset_next_sync=1 Strict hole reporting was previously disabled by default as a performance optimization. However, this has lead to confusion over the expected behavior and a variety of workarounds being adopted by consumers of ZFS. Change the default behavior to always report holes and force the TXG sync. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12746	2021-11-30 10:38:09 -08:00
maxz	cfc62062ae	Replace wrong occurrences of `affect` by `effect` in the man pages Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Max Zettlmeißl <max@zettlmeissl.de> Closes #12784	2021-11-30 10:28:57 -08:00
Rich Ercolani	c5ccbbdcf6	Allow printing special vdev metaslab groups Sometimes, we'd like to know info about the metaslab groups on special vdevs too. So let's make -MM do something useful. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12750	2021-11-30 10:26:45 -08:00
наб	4325de09cd	etc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]:// * etc/systemd/zfs-mount-generator: serialise The wins for a relatively normal workload are rather slim: real 0.02119s/0.00985s=2.15029x user 0.02130s/0.00346s=6.15560x sys 0.03858s/0.00643s=6.00062x wall-total 0.014518s/0.005925s=2.45009x wall-init 0.014518s/0.002457s=5.90684x wall-real 0.014518s/0.003467s=4.18668x But this is a big win on machines with a lot of datasets and expensive forks. For example, the gain on a VM on my work laptop with 900+ legacy-mount Docker datasets, the original gains from the C rewrite were only five-fold: real 0.516s/0.102s=5.05882x user 0.237s/0.143s=1.65734x sys 0.287s/0.100s=2.87x And this serial variant gains this back there as well: real 0.102s/0.008s=12.75x user 0.143s/0.007s=20.42857 sys 0.100s/0.001s=100x wall-total 0.09717s/0.00319s=30.40255x wall-init 0.00203s/0.00200s=1.015941x wall-real 0.09513s/0.00118s=80.02043x For a total of real 0.516s/0.008s=64.5x user 0.237s/0.007s=33.85714x sys 0.287s/0.001s=287x Suggested-by: Richard Laager <rlaager@wiktel.com> * etc/systemd/zfs-mount-generator: pull in network for keylocation=https Also simplify RequiresMountsFor= handling Ref: #11956 Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12138	2021-11-30 09:29:50 -07:00
Allan Jude	2a673e76a9	Vdev Properties Feature Add properties, similar to pool properties, to each vdev. This makes use of the existing per-vdev ZAP that was added as part of device evacuation/removal. A large number of read-only properties are exposed, many of the members of struct vdev_t, that provide useful statistics. Adds support for read-only "removing" vdev property. Adds the "allocating" property that defaults to "on" and can be set to "off" to prevent future allocations from that top-level vdev. Supports user-defined vdev properties. Includes support for properties.vdev in SYSFS. Co-authored-by: Allan Jude <allan@klarasystems.com> Co-authored-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #11711	2021-11-30 07:46:25 -07:00
pstef	5f64bf7fde	Fix typo in zpool.8 Update zpool.8 to avoid parseltongue. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Piotr P. Stefaniak <pstef@freebsd.org> Closes #12763	2021-11-29 10:52:42 -08:00
Rich Ercolani	269b5dadcf	Enable edonr in FreeBSD The code is integrated, builds fine, runs fine, there's not really any reason not to. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12735	2021-11-16 12:40:10 -07:00
George Amanakis	c9d62d1356	Introduce a tunable to exclude special class buffers from L2ARC Special allocation class or dedup vdevs may have roughly the same performance as L2ARC vdevs. Introduce a new tunable to exclude those buffers from being cacheable on L2ARC. Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #11761 Closes #12285	2021-11-11 12:52:16 -08:00
Fedor Uporov	d04b5c9e87	zhack: Add repair label option In case if all label checksums will be invalid on any vdev, the pool will become unimportable. The zhack with newly added cli options could be used to restore label checksums and make pool importable again. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com> Closes #2510 Closes #12686	2021-11-11 11:26:18 -08:00
Brian Behlendorf	de198f2d95	Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency When using lseek(2) to report data/holes memory mapped regions of the file were ignored. This could result in incorrect results. To handle this zfs_holey_common() was updated to asynchronously writeback any dirty mmap(2) regions prior to reporting holes. Additionally, while not strictly required, the dn_struct_rwlock is now held over the dirty check to prevent the dnode structure from changing. This ensures that a clean dnode can't be dirtied before the data/hole is located. The range lock is now also taken to ensure the call cannot race with zfs_write(). Furthermore, the code was refactored to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #11900 Closes #12724	2021-11-07 14:27:44 -07:00
D. Ebdrup	28401f6668	zfsprops.7: Add note about comma-separation This change primarily seeks to make implicit documentation explicit, as it is not outright stated that options should be comma-separated, nor is there a reason given for it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org> Closes #12579	2021-10-29 16:30:44 -07:00
felixdoerre	6cb5e1e759	libshare: nfs: pass through ipv6 addresses in bracket notation Recognize when the host part of a sharenfs attribute is an ipv6 Literal and pass that through without modification. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Felix Dörre <felix@dogcraft.de> Closes: #11171 Closes #11939 Closes: #1894	2021-10-20 10:40:00 -07:00
Rich Ercolani	543ab04cc3	Document additional -c caveat One might expect "send data as it is on disk, and cannot trigger compression changes" to imply "does not attempt to compress data that was not compressed on the sender." One would be mistaken. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12570	2021-10-05 11:48:17 -07:00
Trevor Bautista	00888c0898	Extend zpool-iostat to account for ZIO_PRIORITY_REBUILD (#12319 ) Previously, zpool-iostat did not display any data regarding rebuild I/Os in either the latency/size histograms (-w/-l/-r) or the queue data (-q). This fix essentially utilizes the existing infrastructure for tracking rebuild queue data and displays this data in the proper places within zpool-iostat's output. Signed-off-by: Trevor Bautista <tbautista@newmexicoconsortium.org> Signed-off-by: Trevor Bautista <tbautista@lanl.gov> Co-authored-by: Trevor Bautista <tbautista@newmexicoconsortium.org> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2021-08-26 11:26:49 -07:00
Sam Hathaway	2e9e2c8ff5	zpool-remove.8: describe top-level vdev sector size limitation Document that top-level vdevs cannot be removed unless all top-level vdevs have the same sector size. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Sam Hathaway <sam@sam-hathaway.com> Closes #11339 Closes #12472	2021-08-23 14:59:18 -07:00
Gordon Bergling	0f402668f9	zfs.4: Fix typo s/compatiblity/compatibility/ Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Gordon Bergling <gbergling@googlemail.com> Closes #12464	2021-08-17 11:01:07 -06:00
Alexander Motin	72f0521aba	Increase default volblocksize from 8KB to 16KB Many things has changed since previous default was set many years ago. Nowadays 8KB does not allow adequate compression or even decent space efficiency on many of pools due to 4KB disk physical block rounding, especially on RAIDZ and DRAID. It effectively limits write throughput to only 2-3GB/s (250-350K blocks/s) due to sync thread, allocation, vdev queue and other block rate bottlenecks. It keeps L2ARC expensive despite many optimizations and dedup just unrealistic. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Closes #12406	2021-08-17 09:59:46 -06:00
George Melikov	e1870be450	Man zpool-scrub.8: describe sequential scrub Describe sequential scrub and add examples of scrub status. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #12429	2021-08-05 15:30:28 -06:00
Václav Skála	31c41aea9c	Add missing properties to zfs allow manpage Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Václav Skála <skala@vshosting.cz> Closes #12402	2021-07-26 12:44:01 -07:00
Kevin Jin	a7bd20e309	Add Module Parameter Regarding Log Size Limit * Add Module Parameters Regarding Log Size Limit zfs_wrlog_data_max The upper limit of TX_WRITE log data. Once it is reached, write operation is blocked, until log data is cleared out after txg sync. It only counts TX_WRITE log with WR_COPIED or WR_NEED_COPY. Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: jxdking <lostking2008@hotmail.com> Closes #12284	2021-07-20 09:40:24 -06:00
Rich Ercolani	b7ec530233	Correct zfs-send(8) on readonly sends zfs-send(8) claimed in the flags list you could use -pR when sending a readonly filesystem or volume. You cannot. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12336	2021-07-16 13:58:01 -06:00
Alexander Motin	f7de776da2	Fix ARC ghost states eviction accounting arc_evict_hdr() returns number of evicted bytes in scope of specific state. For ghost states it does not mean the amount of really freed memory, but the logical buffer size. It is correct for the eviction process, but not for waking up threads waiting for ARC size reduction, as added in "Revise ARC shrinker algorithm" commit, causing premature wakeups while ARC is still overflowed, allowing even bigger overflow, plus processing overhead when next allocation will also get blocked, probably also for too short time. To fix that make arc_evict_hdr() also return the amount of really freed memory, which for the ghost states is only the header, and use it to update arc_evict_count instead. Originally I was thinking to not return it at all, since arc_get_data_impl() does not account for the headers, but decided that some slow allocation progress is better than long waits, reaching on my tests up to 100ms. To reduce negative latency effects of long time periods when reclaim thread can free little real memory, start reclamation process earlier, before we actually reached the overflow threshold, when we have to throttle new allocations. We can also do it without taking global arc_evict_lock, reducing the contention. Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #12279	2021-07-13 09:41:59 -06:00
наб	4b7ed6a286	zgenhostid.8: revisit Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12212	2021-06-09 14:36:03 -07:00
наб	1b37cc1abe	Consistentify miscellaneous style on remaining manpages Most notably this fixes the vdev_id(8) non-.Xrs in vdev_id.conf.5 Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12212	2021-06-09 14:35:53 -07:00
наб	2badb3457a	Move properties, parameters, events, and concepts around manual sections The pages moved as follows: zpool-features.{5 => 7} spl{-module-parameters.5 => .4} zfs{-module-parameters.5 => .4} zfs-events.5 => into zpool-events.8 zfsconcepts.{8 => 7} zfsprops.{8 => 7} zpoolconcepts.{8 => 7} zpoolprops.{8 => 7} Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Co-authored-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org> Closes #12149 Closes #12212	2021-06-09 14:35:30 -07:00
наб	b0f3e8a6eb	man: use one Makefile, use OpenZFS for .Os The prevailing style is to use either nothing, or the originating organisational umbrella (here: OpenZFS), and these aren't Linux manpages This also deduplicates the substitution code, and makes adding/removing sexions simpler in future Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12212	2021-06-09 14:34:47 -07:00
наб	2d815d955e	Modernise/fix/rewrite unlinted manpages zpool-destroy.8: flatten, fix description zfs-wait.8: flatten, fix description, use list for events zpool-reguid.8: flatten, fix description zpool-history.8: flatten, fix description zpool-export.8: flatten, fix description, remove -f "unmount" reference AFAICT no such command exists even in Illumos (as of today, anyway), and we definitely don't call it zpool-labelclear.8: flatten, fix description zpool-features.5: modernise spl-module-parameters.5: modernise zfs-mount-generator.8: rewrite zfs-module-parameters.5: modernise Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12169	2021-06-07 13:41:54 -06:00
наб	f84fe3fc87	Lint most manpages Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12129	2021-06-04 12:48:31 -07:00
наб	f645d4416f	zfs-module-parameters.5: remove nonexistent parameters zfs_arc_overflow_shift was never a parameter: `ca0bf58d65` ("Illumos 5497 - lock contention on arcs_mtx") is the only result in git log -Soverflow_shift, and it wasn't exposed then, nor is it now zfs_read_chunk_size was renamed to zfs_vnops_read_chunk_size in `e53d678d4a` ("Share zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD") zio_decompress_fail_fraction was never a parameter: it was added in `c3bd3fb4ac` ("OpenZFS 9403 - assertion failed in arc_buf_destroy()") as a developer aid for setting in zdb, but it's a dangerous test tunable and has no place in public documentation, (not to mention that it obviously doesn't work): > Although this did uncover a few low priority issues, this unfortuantely also causes ztest to ASSERT in many locations where the code is working correctly since it is designed to fail on IO errors. Developers can manually set this variable with the '-o' option to find and debug issues. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12157	2021-06-02 12:53:22 -07:00
наб	ace760a0b4	spl-module-parameters.5: remove spl_kmem_cache_{expire,obj_per_slab_min} Both were removed in `4fbdb10c7b` ("remove kmem_cache module parameter KMC_EXPIRE_AGE") Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12157	2021-06-02 12:52:32 -07:00
наб	d44449025c	zfs-events.5: modernise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:26 -07:00
наб	c3fb6653b5	vdev_id.conf.5: modernise Also yeet pci_slot since it doesn't seem to exist? Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:21 -07:00
наб	56454b0391	man: use Nm/Cm/Fl consistently Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:19 -07:00
наб	1f3cbcfcc5	zed.8: modernise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:14 -07:00
наб	3bcbb6af16	cstyle.1: modernise Also remove note about the OS/Net consolidation, now the illumos gate Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:07 -07:00
наб	76b8f7cf53	vdev_id.8: modernise, note scsi topology Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:01 -07:00
наб	5aace42ce7	zhack.1: modernise The spacing on zhack feature stat pool is a bit iffy(?) Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:22:00 -07:00
наб	2c9c5bc859	zpool_influxdb.8: modernise Also rip out the section about potentially including in the OpenZFS distribution and simplify -e description Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:59 -07:00
наб	3d00fbf9a0	zinject.8: modernise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:58 -07:00
наб	3365d2cf71	raidz_test.1: modernise Also re-add articles left out by the slav who wrote this Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:56 -07:00
наб	defa5a0ae5	zpoolprops.8: fix spacing in ashift Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:55 -07:00
наб	2b42a1e57e	fsck.zfs.8: modernise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:54 -07:00
наб	639871363a	arcstat.1: modernise Also slim down the description a tad Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12125	2021-05-29 20:21:53 -07:00

1 2 3 4 5 ...

965 Commits