Archive-Team/zfs

Commit Graph

Author	SHA1	Message	Date
Rob Norris	802c258fc1	compress: add "slack" compression options Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
Allan Jude	066532da51	Add module parameter to block 0 byte writes Some hardware has issues when issued a write of 0 bytes. Add a new module parameter, zio_suppress_zero_writes, that when enabled (default) will just complete these I/Os without sending them to the hardware. Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	91d6b61268	json: Define PRId64 and PRIu64 on FreeBSD On FreeBSD, these types are long instead of long long.	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	95d6d8d32f	json: Drop problematic casts in nvlist_to_json() The NVP_NAME() macro requires its argument to be castable to char *. The compiler complains if const char * is provided instead.	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	6ee35af1a4	zil: Drop an unnecessary if statement We already check for error != 0 earlier and return if true. The compiler error here is a false positive.	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	d744cdb77c	json: null_filter(): Use __maybe_unused The function fails to compile with -Wself-assign.	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	9c2c6124be	zpool: Provide GUID to zpool-reguid(8) with -g This commit extends the zpool-reguid(8) command with a -g flag, which allows the user to specify the GUID to set. Sponsored-by: Wasabi Technology, Inc. Sponsored-by: Klara Inc.	2023-07-05 13:27:31 +00:00
Allan Jude	9c9eed9737	Make zpool clear reset the removed flag on vdevs Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
Allan Jude	41b06f70c6	Make zpool clear reset the removed flag on vdevs Signed-off-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Richard Yao <richard.yao@klarasystems.com>	2023-07-05 13:27:31 +00:00
Fred Weigel	b6a9054a0e	Fix checkstyle for zil.c Returns are to be parenthesized	2023-07-05 13:27:31 +00:00
Fred Weigel	eb3607bcec	Fixes for Wasabi json endpoint Corrects status output.	2023-07-05 13:27:31 +00:00
Fred Weigel	cf5a6fbc82	Change 5 char tag limit to 255 Changes 5 character maximum tag to 255 characters.	2023-07-05 13:27:31 +00:00
Fred Weigel	6ccb1a75af	Klara update for json Fix checkstyle indicated errors, source format fixes Signed-off-by: Fred Weigel <fred.weigel@klarasystems.com>	2023-07-05 13:27:31 +00:00
Allan Jude	2284c4d200	Add module parameter to block 0 byte writes Some hardware has issues when issued a write of 0 bytes. Add a new module parameter, zio_suppress_zero_writes, that when enabled (default) will just complete these I/Os without sending them to the hardware. Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
Rob Norris	f882884358	btree: fix double-free in zfs_btree_remove_idx We applied 03c0ee94b to fix two use-after-free cases, backporting `13f2b8fb9` from upstream. Unfortunately that patch seems to have been misapplied, introducing a double-free in one of them. This commit fixes that. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>	2023-07-05 13:27:31 +00:00
Rob Norris	88149e0873	zil_create: don't try to deallocate a block we never allocated (cherry picked from commit 8a35cfdcdd62ffc47e7628616f0dcb2ef172cf4b)	2023-07-05 13:27:31 +00:00
Rob Norris	5a256eaed1	zil_close: don't try to deallocate on-disk blocks If we're force-exporting or failed then there's no guarantee the IO will get anywhere. If it's a clean shutdown then that's actually the lead block and it'll be sorted out during replay or next txg. (cherry picked from commit 01e04a4eef7811a31a6258c99d0cc51217732758)	2023-07-05 13:27:31 +00:00
Allan Jude	11d3cff47b	Normalize the endpoint name Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
fredw	43b705c787	stats_version: 2, scan_stats added even if never done. pass_scrub_scrub_spent_paused is now pass_scrub_spent_paused. stats is stats.json Signed-off-by: Allan Jude <allan@klarasystems.com>	2023-07-05 13:27:31 +00:00
Mateusz Piotrowski	3828f754f1	json_stats.c: Rename the stats file to "status.json"	2023-07-05 13:27:31 +00:00
Rob Norris	2724bcb3d6	zil: allow the ZIL to fail and restart independently of the pool zil_commit() has always returned void, and thus, cannot fail. Everything inside it assumed that if anything ever went wrong, it could fall back on txg_wait_synced() until the txg covering the operations being flushed from the ZIL has fully committed. This meant that if the pool failed and failmode=continue was set, syncing operations like fsync() would still block. Unblocking zil_commit() means largely the same approach. The difficulty is that the ZIL carries the record of uncommitted VFS operations (vs the changed data), and attached to those, callbacks and cvs that will release userspace callers once the data is on disk. So if we can't write the ZIL, we also can't release those records until the data is on disk. This wasn't a problem before, because the zil_commit() would block. If we change zil_commit() to return error, we still need to track those entries until the data they represent hits the disk. We also need to accept new records; just because the ZIL fails may not necessarily mean the pool itself is unavailable. This commit reorganises the ZIL to allow zil_commit() to return failure. If ZIL writes or flushes fail, the ZIL is moved into a "failed" state, and no further writes are done; all zil_commit() calls are serviced by the regular txg mechanism. Outstanding records (itx_ts) are held until the main pool writes their associated txg out. The records are then released. Once all records are cleared, the ZIL is reset and reopened. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit af821006f6602261e690fe6635689cabdeefcadf)	2023-07-05 13:27:31 +00:00
Rob Norris	cdaf041d39	zil: ensure flush errors are received It's possible for a hardware failure to occur in a way that the ZIL block writes appear to succeed, but the flush fails. Because flush errors were being ignored, the lwb chain would finish with a zero error code, which would result in zil_commit() returning and thus fsync() returning success to the caller, even though the data was not recorded in the ZIL. If the ZIL is on the main pool (no SLOG device) it would typically suspend around the same time. If that happened before the txg committed, then those writes are now totally lost - not on the pool, not in the ZIL. zil_lwb_flush_vdevs_done() has the necessary code to deal with this situation, but zio_flush() would never return failure, so it never saw it. This just allows flushes to report failure, and now we never miss a failed ZIL write. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit d9db5dccc56b551d0bf66bc9022b6c19a659b7e1)	2023-07-05 13:27:31 +00:00
Rob Norris	8ec175d7e1	zio_flush: require caller to decide if errors should propagate Ignoring flush errors makes it possible for callers to never know that their writes didn't succeed, and allows writes to be lost if the pool fails. This commit gives zio_flush() a flag argument, and updates the call sites to pass ZIO_FLAG_DONT_PROPAGATE to it. Thus, this commit does not change any behaviour, but opens the floor for further changes to allow those callers to handle flush failures sensibly. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 6d0deb8a5a0c3d6bbc69d9625d55fc776bb98ea3)	2023-07-05 13:27:31 +00:00
Rob Norris	589cea17a9	dmu_tx_wait: handle pool suspension when failmode=continue Let txg_wait_synced_tx fail, so the caller can retry. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit d560d64dbdf853d8fb9e18fc7570bd309091b2e4)	2023-07-05 13:27:30 +00:00
Rob Norris	7b7af8ba02	vnops: thread DMU_TX_ASSIGN_CONTINUE to a bunch of vnops These are ones that I'm reasonably sure connect to a real syscall and have a reasonable error response. I've left stuff like `dirty_inode`, `zfs_inactive`, etc, which are internal kernel housekeeping things, as well as anything that looks like it belongs to zvols, ioctls, admin commands, etc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 39c2801c611e27b521d716fea8f771307820362e)	2023-07-05 13:27:30 +00:00
Rob Norris	aea007e336	dmu: add DMU_TX_ASSIGN_CONTINUE flag This is like DMU_TX_ASSIGN_NOSUSPEND, but only when failmode=continue, and returning EIO if the pool is suspended. It's designed to be easy to use from syscalls and similar without the ceremony of checking for EAGAIN and failmode every time. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 6bed8644dd2afa0e39727e9e90642479c2416521)	2023-07-05 13:27:30 +00:00
Rob Norris	48a48059c7	dmu: rename dmu_tx_assign flags Their names clash with those for txg_wait_synced_tx, and they aren't directly compatible, leading to confusion. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 1f0fb1dae7c1e84de3b39e669e09b8b3d5b80b87)	2023-07-05 13:27:30 +00:00
Rob Norris	b0d75996ba	zio: don't report suspend IOs if the pool is already suspended This can happen if the pool suspended and then new IO is issued which then fails too. This doesn't change behaviour, just silences the noise. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 3fa696404fb40205ed631538c62ec1a54d8ee6cd)	2023-07-05 13:27:30 +00:00
Rob Norris	3aea149bf8	linux: reject syncing ops if the filesystem is unmounting The kernel can call these during unmount, so we have to handle them directly to prevent any further IO being issued. zfs_fsync reorganised slightly to not set up zfs_fsyncer_key until after the teardown lock is acquired, just in case we don't get it. Signed-off-by: Rob Norris <rob.norris@klarasystems.com> (cherry picked from commit 900c26570ddcdd1d3ca135e6aee5df6456f6bfd6)	2023-07-05 13:27:30 +00:00
Mariusz Zaborski	40a9efd0e8	zfs: support force exporting pools This is primarily of use when a pool has lost its disk, while the user doesn't care about any pending (or otherwise) transactions. Implement various control methods to make this feasible: - txg_wait can now take a NOSUSPEND flag, in which case the caller will be alerted if their txg can't be committed. This is primarily of interest for callers that would normally pass TXG_WAIT, but don't want to wait if the pool becomes suspended, which allows unwinding in some cases, specifically when one is attempting a non-forced export. Without this, the non-forced export would preclude a forced export by virtue of holding the namespace lock indefinitely. - txg_wait also returns failure for TXG_WAIT users if a pool is actually being force exported. Adjust most callers to tolerate this. - spa_config_enter_flags now takes a NOSUSPEND flag to the same effect. - DMU objset initiator which may be set on an objset being forcibly exported / unmounted. - SPA export initiator may be set on a pool being forcibly exported. - DMU send/recv now use an interruption mechanism which relies on the SPA export initiator being able to enumerate datasets and closing any send/recv streams, causing their EINTR paths to be invoked. - ZIO now has a cancel entry point, which tells all suspended zios to fail, and which suppresses the failures for non-CANFAIL users. - metaslab, etc. cleanup, which consists of simply throwing away any changes that were not able to be synced out. - Linux specific: introduce a new tunable, zfs_forced_export_unmount_enabled, which allows the filesystem to remain in a modified 'unmounted' state upon exiting zpl_umount_begin, to achieve parity with FreeBSD and illumos, which have VFS-level support for yanking filesystems out from under users. However, this only helps when the user is actively performing I/O, while not sitting on the filesystem. In particular, this allows test #3 below to pass on Linux. - Add basic logic to zpool to indicate a force-exporting pool, instead of crashing due to lack of config, etc. Add tests which cover the basic use cases: - Force export while a send is in progress - Force export while a recv is in progress - Force export while POSIX I/O is in progress This change modifies the libzfs ABI: - New ZPOOL_STATUS_FORCE_EXPORTING zpool_status_t enum value. - New field libzfs_force_export for libzfs_handle. Signed-off-by: Will Andrews <will@firepipe.net> Signed-off-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Catalogics, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #3461 (cherry picked from commit 852e633772217d779a63e8c46fe3c5f81dd8960e)	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	f65b59c5e5	module/zfs/Makefile.in: Add jprint.o and json_stats.o	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	dcf745c378	Remove remaining bits of zpool addlog and ZFS_IOC_ADD_LOG	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	676d1dcc8c	json_stats.c: Do not print value of vs_noalloc The vs_noalloc member of the vdev_stat structure was implemented in `2a673e76a9`. It is not available in ZFS 2.1.5, so code using it needs to be disabled.	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	bcde0da8e4	json_stats.c: Move variable declarations out of a switch statement This patch fixes the following compilation error: ``` ../../module/zfs/json_stats.c: In function ‘nvlist_to_json’: ../../module/zfs/json_stats.c:92:4: error: a label can only be part of a statement and a declaration is not a statement uint64_t *u = (uint64_t *)p; ^~~~~~~~ ../../module/zfs/json_stats.c:102:4: error: a label can only be part of a statement and a declaration is not a statement nvlist_t *a = (nvlist_t *)p; ^~~~~~~~ ```	2023-07-05 13:27:30 +00:00
Fred Weigel	747c7bbcf6	Add a JSON equivalent to zpool-status(8) This is a squashed commit of the commits from 03a64568f318c696b9e4be19429e72b446c97462 to 1c64f0c8832b34bfa82645125351d6c62815ae21 developed by Fred Weigel. Usage: cat /proc/spl/kstat/zfs/POOLNAME/stats The following changes have been applied during the rebase of the patches on top of the 2.1.5 branch: - Drop ZFS_IOC_ADD_LOG. This ioctl was introduced to support introducing messages into the ZFS kernel log. It was used for debugging during development. The implementation of this debugging feature made `zpool addlog` output messages to /proc/spl/kstat/zfs/dbgmsg. The messages could later be retrieved with `zdbgmsg show`. - Change the fmgw.c entry in lib/libzpool/Makefile.am to json_stats.c. The fmgw.c file has already been renamed to json_stats.c in other places. Co-authored-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com> (cherry picked from commit 75f3395d7fc0c93c02c8a8e792515f3e821aa05a)	2023-07-05 13:27:30 +00:00
Richard Yao	18ae26747c	Fix use-after-free in btree code Coverity static analysis found these. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Neal Gompa <ngompa@datto.com> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #10989 Closes #13861 (cherry picked from commit `13f2b8fb92`)	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	2f327a2457	Turn default_bs and default_ibs into ZFS_MODULE_PARAMs The default_bs and default_ibs tunables control the default block size and indirect block size. So far, default_bs and default_ibs were tunable only on FreeBSD, e.g., sysctl vfs.zfs.default_ibs Remove the FreeBSD-specific sysctl code and expose default_bs and default_ibs as tunables on both Linux and FreeBSD using ZFS_MODULE_PARAM. One of the use cases for changing the values of those tunables is to lower the indirect block size, which may improve performance of large directories (as discussed during the OpenZFS Leadership Meeting on 2022-08-16). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com> Sponsored-by: Wasabi Technology, Inc. Closes #14293 (cherry picked from commit `926715b9fc`)	2023-07-05 13:27:30 +00:00
Mateusz Piotrowski	3790dc2485	Add tunable to allow changing micro ZAP's max size This change turns `MZAP_MAX_BLKSZ` into a `ZFS_MODULE_PARAM()` called `zap_micro_max_size`. As a result, we can experiment with different micro ZAP sizes to improve directory size scaling. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Mateusz Piotrowski <mateuszpiotrowski@klarasystems.com> Co-authored-by: Toomas Soome <toomas.soome@klarasystems.com> Signed-off-by: Mateusz Piotrowski <mateuszpiotrowski@klarasystems.com> Sponsored-by: Wasabi Technology, Inc. Closes #14292 (cherry picked from commit `a4b21eadec`)	2023-07-05 13:27:30 +00:00
Ryan Moeller	403d4bc66e	FreeBSD: Silence clang unused-but-set-variable Quick and dirty build fix for warnings being treated as errors. Signed-off-by: Ryan Moeller <ryan@iXsystems.com>	2022-06-15 11:27:28 -07:00
Alexander Motin	6ff89fe126	Improve sorted scan memory accounting Since we use two B-trees q_exts_by_size and q_exts_by_addr, we should count 2x sizeof (range_seg_gap_t) per node. And since average B-tree memory efficiency is about 75%, we should increase it to 3x. Previous code under-counted up to 30% of the memory usage. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13537	2022-06-15 11:23:49 -07:00
Rich Ercolani	cc565f557b	Corrected edge case in uncompressed ARC->L2ARC handling I genuinely don't know why this didn't come up before, but adding the LZ4 early abort pointed out this flaw, in which we're allocating a buffer of one size, and then telling the compressor that we're handing it buffers of a different size, which may be Very Different - say, allocating 512b and then telling it the inputs are 128k. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #13375	2022-06-14 18:10:21 -07:00
Alexander Motin	338188562b	Remove wrong assertion in log spacemap It is typical, but not generally true that if log summary has more blocks it must also have unflushed metaslabs. Normally with metaslabs flushed in order it works, but there are known exceptions, such as device removal or metaslab being loaded during its flush attempt. Before `600a02b884` if spa_flush_metaslabs() hit loading metaslab it usually stopped (unless memlimit is also exceeded), but now it may flush more metaslabs, just skipping that particular one. This increased chances of assertion to fire when the skipped metaslab is flushed on next iteration if all other metaslabs in that summary entry are already flushed out of order. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #13486 Closes #13513	2022-06-06 16:57:56 -07:00
Brian Behlendorf	fec407fb69	Linux 5.19 compat: aops->read_folio() As of the Linux 5.19 kernel the readpage() address space operation has been replaced by read_folio(). Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	7ae5ea8864	Linux 5.19 compat: blkdev_issue_secure_erase() Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure erase functionality from the blkdev_issue_discard() function. blkdev_issue_secure_erase() must now be called to issue a secure erase. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	048301b6dc	Linux 5.19 compat: bdev_max_secure_erase_sectors() Linux 5.19 commit torvalds/linux@44abff2c0 removed the blk_queue_secure_erase() helper function. The preferred interface is to now use the bdev_max_secure_erase_sectors() function to check for discard support. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	9ce5eb18ef	Linux 5.19 compat: bdev_max_discard_sectors() Linux 5.19 commit torvalds/linux@70200574cc removed the blk_queue_discard() helper function. The preferred interface is to now use the bdev_max_discard_sectors() function to check for discard support. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
Brian Behlendorf	5a639f0802	Linux 5.18 compat: bio_alloc() As of the Linux 5.18 kernel bio_alloc() expects a block_device struct as an argument. This removes the need for the bio_set_dev() compatibility code for 5.18 and newer kernels. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13515	2022-06-01 14:24:49 -07:00
hping	b28c0c4bf8	abd_os: remove redundant refcount creation for abd_children Refcount creation for abd_zero_scatter->abd_children is redundant in abd_alloc_zero_scatter, as it has been done in abd_init_struct. In addition, abd_children is undefined when ZFS_DEBUG is disabled, the reference of abd_children in abd_alloc_zero_scatter breaks build of libzpool when ZFS_DEBUG is disabled. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Ping Huang <huangping@smartx.com> Closes #13429	2022-05-20 10:33:24 -07:00
Aidan Harris	eee389ba2e	Fix functions without a prototype clang-15 emits the following error message for functions without a prototype: fs/zfs/os/linux/spl/spl-kmem-cache.c:1423:27: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Aidan Harris <me@aidanharr.is> Closes #13421	2022-05-20 10:33:24 -07:00
Mateusz Guzik	2c5c8bb0a6	FreeBSD: use zero_region instead of allocating a dedicated page Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #13406	2022-05-20 10:33:24 -07:00
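
Commit bcde0da8e4 in the table above quotes the compiler error "a label can only be part of a statement and a declaration is not a statement". In C before C23, a `case` label must be followed by a statement, and a declaration is not one. The sketch below illustrates the general fix that the commit title describes, moving the declarations above the `switch`. It is not the code from json_stats.c: the names `kind_t`, `KIND_U64`, `KIND_I32`, and `decode_value` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical tag type for this sketch; not taken from json_stats.c. */
typedef enum { KIND_U64, KIND_I32 } kind_t;

/*
 * Decode the value behind an untyped pointer according to its tag.
 * The pointer variables are declared before the switch, so every case
 * label is followed by a statement (an assignment), not a declaration.
 */
int64_t
decode_value(kind_t kind, const void *p)
{
	const uint64_t *u;
	const int32_t *i;

	switch (kind) {
	case KIND_U64:
		/*
		 * Writing "const uint64_t *u = p;" directly after the
		 * label is what older compilers reject.
		 */
		u = p;
		return ((int64_t)*u);
	case KIND_I32:
		i = p;
		return ((int64_t)*i);
	}
	/* Unknown tag: this sketch simply returns -1. */
	return (-1);
}
```

An alternative with the same effect is to wrap each case body in its own `{ ... }` block, since a compound statement is a statement and may contain declarations.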

3609 Commits