Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Anton Gubarkov	bc371b2806	vdev_id: Return an error if config file is not found Signed-off-by: Anton Gubarkov <anton.gubarkov@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>	2021-09-14 14:36:32 -07:00
Matthew Ahrens	8a969f3e2d	Read past end of argv array in zpool_do_import() `zpool_do_import()` passes `argv[0]`, (optionally) `argv[1]`, and `pool_specified` to `import_pools()`. If `pool_specified==FALSE`, the `argv[]` arguments are not used. However, these values may be off the end of the `argv[]` array, so loading them could dereference unmapped memory. This error is reported by the asan build: ``` ================================================================= ==6003==ERROR: AddressSanitizer: heap-buffer-overflow READ of size 8 at 0x6030000004a8 thread T0 #0 0x562a078b50eb in zpool_do_import zpool_main.c:3796 #1 0x562a078858c5 in main zpool_main.c:10709 #2 0x7f5115231bf6 in __libc_start_main #3 0x562a07885eb9 in _start 0x6030000004a8 is located 0 bytes to the right of 24-byte region allocated by thread T0 here: #0 0x7f5116ac6b40 in __interceptor_malloc #1 0x562a07885770 in main zpool_main.c:10699 #2 0x7f5115231bf6 in __libc_start_main ``` This commit passes NULL for these arguments if they are off the end of the `argv[]` array. Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #12339	2021-09-14 13:08:53 -07:00
George Melikov	f8c2e91db5	zpool_influxdb: fix -Werror=stringop-truncation Use strlcpy instead of problematic strncpy Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #12344	2021-09-14 12:39:17 -07:00
Rich Ercolani	056c273939	Correct zfs-send(8) on readonly sends zfs-send(8) claimed in the flags list you could use -pR when sending a readonly filesystem or volume. You cannot. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12336	2021-09-14 12:38:51 -07:00
Justin Gottula	dab147d65a	Use substantially more robust program exit status logic in zvol_id Currently, there are several places in zvol_id where the program logic returns particular errno values, or even particular ioctl return values, as the program exit status, rather than a straightforward system of explicit zero on success and explicit nonzero value(s) on failure. This is problematic for multiple reasons. One particularly interesting problem that can arise, is that if any of these values happens to have all 8 least significant bits unset (i.e., it is a positive or negative multiple of 256), then although the C program sees a nonzero int value (presumed to be a failure exit status), the actual exit status as seen by the system is only the bottom 8 bits of that integer: zero. This can happen in practice, and I have encountered it myself. In a particularly weird situation, the zvol_open code in the zfs kernel module was behaving in such a manner that it caused the open() syscall to fail and for errno to be set to a kernel-private value (ERESTARTSYS, which happens to be defined as 512). It turns out that 512 is evenly divisible by 256; or, in other words, its least significant 8 bits are all-zero. So even though zvol_id believed it was returning a nonzero (failure) exit status of 512, the system modulo'd that value by 256, resulting in the actual exit status visible by other programs being 0! This actually-zero (non-failure) exit status caused problems: udev believed that the program was operating successfully, when in fact it was attempting to indicate failure via a nonzero exit status integer. Combined with another problem, this led to the creation of nonsense symlinks for zvol dev nodes by udev. Let's get rid of all this problematic logic, and simply return EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if anything went wrong. Additionally, let's clarify some of the variable names (error is similar to errno, etc) and clean up the overall program flow a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302	2021-09-14 12:23:38 -07:00
Justin Gottula	7138fe7205	Print zvol_id error messages to stderr rather than stdout The zvol_id program is invoked by udev, via a PROGRAM key in the 60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/* symlink paths paths that should be generated for each opaquely named /dev/zd* dev node. The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment containing the %c substitution, to collect the program's stdout and then "paste" it directly into the name of the symlink(s) to be created. Unfortunately, as currently written, zvol_id outputs both its intended output (a single string representing the symlink path that should be created to refer to the name of the dataset whose /dev/zd* path is given) AND its error messages (if any) to stdout. When processing PROGRAM keys (and others, such as IMPORT{program}), udev uses only the data written to stdout for functional purposes. Any data written to stderr is used solely for the purposes of logging (if udev's log_level is set to debug). The unintended consequence of this is as follows: if zvol_id encounters an error condition; and then udev fails to halt processing of the current rule (either because zvol_id didn't return a nonzero exit status, or because the PROGRAM key in the rule wasn't written properly to result in a "non-match" condition that would stop the current rule on a nonzero exit); then udev will create a space-delimited list of symlink names derived directly from the words of the error message string! I've observed this exact behavior on my own system, in a situation where the open() syscall on /dev/zd* dev nodes was failing sporadically (for reasons that aren't especially relevant here). Because the open() call failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to stdout and then exited. The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume name like pool/foo/bar, this would ordinarily expand to SYMLINK+="zvol/pool/foo/bar pool/foo/bar" and would cause symlinks to be created like this: /dev/zvol/pool/foo/bar -> /dev/zd736 /dev/pool/foo/bar -> /dev/zd736 But because of the combination of error messages being printed to stdout, and the udev syntax freely accepting a space-delimited sequence of names in this context, the error message string "Unable to open device file: /dev/zd736\n" in reality expanded to SYMLINK+="zvol/Unable to open device file: /dev/zd736" which caused the following symlinks to actually be created: /dev/zvol/Unable -> /dev/zd736 /dev/to -> /dev/zd736 /dev/open -> /dev/zd736 /dev/device -> /dev/zd736 /dev/file: -> /dev/zd736 /dev//dev/zd736 -> /dev/zd736 (And, because multiple zvols had open() syscall errors, multiple zvols attempted to claim several of those symlink names, resulting in numerous udev errors and timeouts and general chaos.) This commit rectifies all this silliness by simply printing error messages to stderr, as Dennis Ritchie originally intended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Justin Gottula <justin@jgottula.com> Closes #12302	2021-09-14 12:23:38 -07:00
Ryan Moeller	d6c2b89032	ZED: Match added disk by pool/vdev GUID if found (#12217 ) This enables ZED to auto-online vdevs that are not wholedisk managed by ZFS. Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2021-09-14 12:10:44 -07:00
Laurențiu Nicola	0584ad8f94	zed: fix sending emails (#12292 ) Commit `6fc3099` broke the quoting when invoking the mail program, revert that change. Signed-off-by: Laurențiu Nicola <lnicola@dend.ro> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2021-06-29 13:15:21 -07:00
Rich Ercolani	5e89181544	Annotated dprintf as printf-like ZFS loves using %llu for uint64_t, but that requires a cast to not be noisy - which is even done in many, though not all, places. Also a couple places used %u for uint64_t, which were promoted to %llu. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12233	2021-06-24 13:12:36 -07:00
наб	9a865b7fb7	libspl: implement atomics in terms of atomics This replaces the generic libspl atomic.c atomics implementation with one based on builtin gcc atomics. This functionality was added as an experimental feature in gcc 4.4. Today even CentOS 7 ships with gcc 4.8 as the default compiler we can make this the default. Furthermore, the builtin atomics are as good or better than our hand-rolled implementation so it's reasonable to drop that custom code. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11904 Closes #12252 Closes #12244	2021-06-21 21:48:31 -07:00
Rich Ercolani	2e158b0e0b	Added error for writing to /dev/ on Linux Starting in Linux 5.10, trying to write to /dev/{null,zero} errors out. Prefer to inform people when this happens rather than hoping they guess what's wrong. Reviewed-by: Antonio Russo <aerusso@aerusso.net> Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes: #11991	2021-06-10 10:50:16 -07:00
наб	a444efb6d7	Move properties, parameters, events, and concepts around manual sections The pages moved as follows: zpool-features.{5 => 7} spl{-module-parameters.5 => .4} zfs{-module-parameters.5 => .4} zfs-events.5 => into zpool-events.8 zfsconcepts.{8 => 7} zfsprops.{8 => 7} zpoolconcepts.{8 => 7} zpoolprops.{8 => 7} Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Co-authored-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org> Closes #12149 Closes #12212	2021-06-10 10:50:16 -07:00
Brian Behlendorf	1180d61152	Fix minor shellcheck 0.7.2 warnings The first warning of a misspelling is a false positive, so we annotate the script accordingly. As for the x-prefix warnings update the check to use the conventional '[ -z <string> ]' syntax. all-syslog.sh:46:47: warning: Possible misspelling: ZEVENT_ZIO_OBJECT may not be assigned, but ZEVENT_ZIO_OBJSET is. [SC2153] make_gitrev.sh:53:6: note: Avoid x-prefix in comparisons as it no longer serves a purpose [SC2268] man-dates.sh:10:7: note: Avoid x-prefix in comparisons as it no longer serves a purpose [SC2268] Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12208	2021-06-10 10:50:16 -07:00
наб	6ce97bb4a2	zed.d/history_event-zfs-list-cacher.sh.in: parallelise, simplify This: (a) improves the error log message, (b) locks per pool instead of globally, (c) locks the actual output file instead of /var/lock/zfs-list, which would otherwise linger there forever (well, still will, but you can remove it and it won't come back), and (d) preserves attributes of the output file instead of reverting them to 0:0 644 It is imperative that the previous commit ("zed-functions.sh: zed_lock(): don't truncate lock") be included in any series that contains this one Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-09 13:06:39 -07:00
наб	27d3cc6cd3	zed.d/all-debug.sh: simplify By locking the log file itself, we can omit arduous rebinding and explicit umask setting, but, perhaps more importantly, avoid permanently littering /var/lock/ with zed.debug.log.lock we will never delete It is imperative that the previous commit ("zed-functions.sh: zed_lock(): don't truncate lock") be included in any series that contains this one Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-09 13:06:09 -07:00
наб	d6a0cecab1	zed-functions.sh: zed_lock(): don't truncate lock By appending instead of truncating, we can lock on any file (with write permissions) instead of only dedicated lock files, since the locking process itself no longer alters the file in any way Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-09 13:06:05 -07:00
Serapheim Dimitropoulos	a377bde727	Livelist logic should handle dedup blkptrs Update the logic to handle the dedup-case of consecutive FREEs in the livelist code. The logic still ensures that all the FREE entries are matched up with a respective ALLOC by keeping a refcount for each FREE blkptr that we encounter and ensuring that this refcount gets to zero by the time we are done processing the livelist. zdb -y no longer panics when encountering double frees Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #11480 Closes #12177	2021-06-09 13:05:34 -07:00
Rich Ercolani	cc9dae2404	Added another missed case to arc_summary3 It turns out that sometimes, evidently only when run inside the ZTS handler, arc_summary3 \| head > /dev/null will die with ENOTCONN, and ruin the test run. Added handling for that. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12160	2021-06-09 13:05:34 -07:00
grembo	1fb18c4bde	FreeBSD boot code reminder after zpool upgrade There used to be a warning after upgrading a zpool in FreeBSD, so users won't forget to update the boot loader that pool is booted from. This change brings this warning back, but only if the bootfs property is set on the pool, which should be sufficient for the vast majority of FreeBSD installations. People running something custom are most likely aware of what to do after an upgrade in their specific environment. Functionality is implemented in an OS specific helper function. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Co-authored-by: Michael Gmelin <grembo@FreeBSD.org> Signed-off-by: Michael Gmelin <grembo@FreeBSD.org> Closes #12099 Closes #12104	2021-06-09 13:05:34 -07:00
наб	91bb2e91bd	Turn checkbashisms into a make target make_gitrev.sh actually breaks checkbashisms' parser, which /insists/ that the end-of-line " is actually a string start Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12101	2021-06-09 13:05:34 -07:00
наб	132240507d	Turn shellcheck into a normal make target. Fix new files it caught This checks every file it checked (and a few more), but explicitly instead of "if it works it works" best-effort (which wasn't that good anyway) Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #10512 Closes #12101	2021-06-09 13:05:34 -07:00
наб	dbfbc1e524	zstream: force-install zstreamdump link Accidentally introduced by commit `dd00925e8d`. Force-install the zstreamdump link, this is a supported configuration and the install should not fail if it needs to overwrite an existing file. Also cd to work around some funny platforms as noted in AC_PROG_LN_S doc Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12143	2021-06-09 13:05:34 -07:00
Manoj Joseph	c3c65e3cfb	long options for ztest This change introduces long options for ztest. It builds the usage message as well as the long_options array from a single table. It also adds #defines for the default values. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Manoj Joseph <manoj.joseph@delphix.com> Closes #12117	2021-06-09 13:05:34 -07:00
наб	0458070928	zstreamdump: replace with link to zstream zstreamdump(8) was in quite a bad state, and the wrapper didn't work if invoked without /sbin in $PATH Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12015	2021-06-08 14:48:58 -07:00
наб	c1a64be6d4	zgenhostid: use argument path directly Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-08 14:47:05 -07:00
наб	e40ffed021	Trim excess shellcheck annotations. Widen to all non-Korn scripts Before, make shellcheck checked scripts/{commitcheck,make_gitrev,man-dates,paxcheck,zfs-helpers,zfs, zfs-tests,zimport,zloop}.sh cmd/zed/zed.d/{{all-debug,all-syslog,data-notify,generic-notify, resilver_finish-start-scrub,scrub_finish-notify, statechange-led,statechange-notify,trim_finish-notify, zed-functions}.sh,history_event-zfs-list-cacher.sh.in} cmd/zpool/zpool.d/{dm-deps,iostat,lsblk,media,ses,smart,upath} now it also checks contrib/dracut/{02zfsexpandknowledge/module-setup, 90zfs/{export-zfs,parse-zfs,zfs-needshutdown, zfs-load-key,zfs-lib,module-setup, mount-zfs,zfs-generator}}.sh.in cmd/zed/zed.d/{pool_import-led,vdev_attach-led, resilver_finish-notify,vdev_clear-led}.sh contrib/initramfs/{zfsunlock,hooks/zfs.in,scripts/local-top/zfs} tests/zfs-tests/tests/perf/scripts/prefetch_io.sh scripts/common.sh.in contrib/bpftrace/zfs-trace.sh autogen.sh Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-08 14:46:31 -07:00
наб	59d91b4d10	Fix SC2181 ("[ $?") outside tests/ Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-06-08 14:45:03 -07:00
наб	c5f268fa14	mount.zfs.8: match to reality; zfsprops.8: add missing temporary options Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12111	2021-05-27 22:31:57 -07:00
Rich Ercolani	ecebf770c1	Correct flaws in arc_summary[23] and their test. The change correctly handles BrokenPipeError and improves the associated tests. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12037 Closes #12036	2021-05-27 22:31:57 -07:00
vermavipinkumar	d6bedbbc44	Propagate vdev state due to invalid label corruption Propagate vdev child state to parents on invalid label Add VDEV_AUX_BAD_LABEL to print_import_config() Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Co-authored-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Signed-off-by: Vipin Kumar Verma <vipin.verma@hpe.com> Closes #12088	2021-05-27 22:31:57 -07:00
наб	214ae461f1	zpool: vdev_run_cmd(): don't free undefined pointers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11993	2021-05-10 12:21:12 -07:00
наб	b1dd6351bb	Replace ZoL with OpenZFS where applicable Afterward, git grep ZoL matches: * README.md: * [ZoL Site](https://zfsonlinux.org) - Correct * etc/default/zfs.in:# ZoL userland configuration. - Changing this would induce a needless upgrade-check, if the user has modified the configuration; this can be updated the next time the defaults change * module/zfs/dmu_send.c: * ZoL < 0.7 does not handle [...] - Before 0.7 is ZoL, so fair enough Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #11956	2021-05-10 12:16:46 -07:00
наб	0bb2a48ee6	zed: protect against wait4()/fork() races to the global PID table This can be very easily triggered by adding a sleep(1) before the wait4() on a PID-starved system: the reaper thread would wait for a child before its entry appeared, letting old entries accumulate: Invoking "all-debug.sh" eid=3021 pid=391 Finished "(null)" eid=0 pid=391 time=0.002432s exit=0 Invoking "all-syslog.sh" eid=3021 pid=336 Finished "(null)" eid=0 pid=336 time=0.002432s exit=0 Invoking "history_event-zfs-list-cacher.sh" eid=3021 pid=347 Invoking "all-debug.sh" eid=3022 pid=349 Finished "history_event-zfs-list-cacher.sh" eid=3021 pid=347 time=0.001669s exit=0 Finished "(null)" eid=0 pid=349 time=0.002404s exit=0 Invoking "all-syslog.sh" eid=3022 pid=370 Finished "(null)" eid=0 pid=370 time=0.002427s exit=0 Invoking "history_event-zfs-list-cacher.sh" eid=3022 pid=391 avl_find(tree, new_node, &where) == NULL ASSERT at ../../module/avl/avl.c:641:avl_add() Thread 1 "zed" received signal SIGABRT, Aborted. By employing this wider lock, we atomise [wait, remove] and [fork, add]: slowing down the reaper thread now just causes some zombies to accumulate until it can get to them Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11963 Closes #11965	2021-05-10 12:13:53 -07:00
наб	08d2e39719	zed.d/zed-functions.sh: fix zed_guid_to_pool() on dash Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11935 Closes #11954	2021-05-10 12:12:20 -07:00
наб	f87f41009f	zed.d/history_event-zfs-list-cacher.sh: no grep for snapshot detection Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11935	2021-05-10 12:12:04 -07:00
наб	918e5c6fca	zed.d/*-notify.sh: use mktemp instead of generating temp path manually Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11935	2021-05-10 12:11:44 -07:00
наб	9c5256a2fe	zed.d/pool_import-led.sh: fix for current zpool scripts Also minor clean-up with folding state_to_val() into a case, unrolling the lesser-available seq into numbers, ignoring vdev states we don't care about, and documentation comments Signed-off-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11934 Closes #11935	2021-05-10 12:11:33 -07:00
Arshad Hussain	910b43310f	vdev_id: variable not getting expanded under map_slot() Under function map_slot() variable passed as args were not getting properly substituted or expanded. This patch fixes the substitution issue. Reviewed-by: Niklas Edmundsson <nikke@acc.umu.se> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Closes #11951 Closes #11959	2021-05-10 12:08:18 -07:00
Toomas Soome	96f04062e4	zdb: ASSERT issues when DEBUG is not defined If zdb is not built with DEBUG mode, the ASSERT macros will be eliminated. This will leave vim defined, but not used (gcc warning) and checkpoint spacemap validation loop will do nothing. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #11932	2021-05-10 12:07:09 -07:00
наб	8fd65351a7	zed: protect against wait4()/fork() races to the launched process tree As soon as wait4() returns, fork() can immediately return with the same PID, and race to lock _launched_processes_lock, then try to add the new (duplicate) PID to _launched_processes, which asserts By locking before wait4(), we ensure, that, given that same unfortunate scheduling, _launched_processes_lock cannot be locked by the spawner before we pop the process in the reaper, and only afterward will it be added This moves where the reaper idles when there are children from the wait4() to the pause(), locking for the duration of that single syscall in both the no-children and running-children cases; the impact of this is one to two syscalls (depending on _launched_processes_lock state) per loop Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11924 Closes #11928	2021-04-22 14:53:51 -07:00
наб	55cf5a255a	zed: set O_CLOEXEC on persistent fds, remove closefrom() from pre-exec Also don't dup /dev/null over stdio if daemonised Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11891	2021-04-19 15:12:41 -07:00
Yuri Pankov	7bf3959601	Fix vdev health padding in zpool list -v Do not (incorrectly, right instead left) pad health string itself, it will be taken care of when printing property value below. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Yuri Pankov <yuripv@FreeBSD.org> Closes #11899	2021-04-14 13:23:08 -07:00
Colm	1f3de97374	Improvements to the 'compatibility' property Several improvements to the operation of the 'compatibility' property: 1) Improved handling of unrecognized features: Change the way unrecognized features in compatibility files are handled. * invalid features in files under /usr/share/zfs/compatibility.d only get a warning (as these may refer to future features not yet in the library), * invalid features in files under /etc/zfs/compatibility.d get an error (as these are presumed to refer to the current system). 2) Improved error reporting from zpool_load_compat. Note: slight ABI change to zpool_load_compat for better error reporting. 3) compatibility=legacy inhibits all 'zpool upgrade' operations. 4) Detect when features are enabled outside current compatibility set * zpool set compatibility=foo <-- print a warning * zpool set feature@xxx=enabled <-- error * zpool status <-- indicate this state Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Colm Buckley <colm@tuatha.org> Closes #11861	2021-04-14 13:23:08 -07:00
pablofsf	07d64c07e0	Allow zfs to send replication streams with missing snapshots A tentative implementation and discussion was done in #5285. According to it a send --skip-missing\|-s flag has been added. In a replication stream, when there are snapshots missing in the hierarchy, if -s is provided print a warning and ignore dataset (and its children) instead of throwing an error Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pablo Correa Gómez <ablocorrea@hotmail.com> Closes #11710	2021-04-14 13:19:50 -07:00
наб	afe2a5ca12	zvol_wait: properly handle zvol_volmode sysctl being 3/none Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	2a667acfcb	zfs_ids_to_path: print correct wrong values Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	5f3f4def83	zfs_ids_to_path: the -v comes after the executable name Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	b3bb604a4c	arc_summary3: just read /s/m/{mod}/version instead of spawning cat Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	2ae2892235	zvol_wait: fix for zvols with spaces in name, optimise list_zvols() would happily, for zvols with spaces in their names, assign the second half to volmode, &c., so use a normal read and set IFS to a tab instead of using 4 separate AWK processes(?) Similarly, in filter_out_deleted_zvols(), run zfs(8) once and use the output directly instead of spawning a zfs(8) process per zvol Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	9d47853c74	zstreamdump: exec zstream dump Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11859	2021-04-14 13:19:50 -07:00
наб	bb8db9d927	zed: untangle _zed_conf_parse_path() Dunno, maybe it's just me, but the previous style was /really/ confusing Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	32cc3f0837	zed: don't malloc() global zed_conf instance, optimise zed_conf layout It's all of 40 bytes with 4-byte pointers and 64 with 8-byte ones (previously 44 and 88, respectively) ‒ there's no reason it can't live on the stack Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	f96dbd7a29	zed: remove zed_conf::{min,max}_events and ZED_{MIN,MAX}_EVENTS No users, fields marked "reserved for future use", macros defined to 0 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	c30fef2523	zed: remove zed_conf::syslog_facility No users, nobody sets it, main() hard-codes LOG_DAEMON, which is the only correct value for this Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	487cd2df11	zed: _zed_conf_display_help(): be consistent about what got_err means Users passed in EXIT_SUCCESS and EXIT_FAILURE, despite it being a bool Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	5a674a1b42	zed: untangle -h option listing Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	52f65648e0	zed: print out licence string as one big chunk Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11860	2021-04-14 13:19:49 -07:00
наб	55780d8ec0	zed: only go up to current limit in close_from() fallback Consider the following strace log: prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=10241024}) = 0 dup2(0, 30) = 30 dup2(0, 300) = 300 dup2(0, 3000) = -1 EBADF (Bad file descriptor) dup2(0, 30000) = -1 EBADF (Bad file descriptor) dup2(0, 300000) = -1 EBADF (Bad file descriptor) prlimit64(0, RLIMIT_NOFILE, {rlim_cur=10241024, rlim_max=1024*1024}, NULL) = 0 dup2(0, 30) = 30 dup2(0, 300) = 300 dup2(0, 3000) = 3000 dup2(0, 30000) = 30000 dup2(0, 300000) = 300000 Even a privileged process needs to bump its rlimit before being able to use fds higher than rlim_cur. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	ea30225fdb	zed: replace zed_file_write_n() with write(2), purge it We set SA_RESTART early on, which will prevent EINTRs (indeed, to the point of needing to clear it in the reaper, since it interferes with pause(2)), which is the only error zed_file_write_n() actually handled (plus, the pid write is no bigger than 12 bytes anyway) Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	018560b153	zed: merge all _NOT_IMPLEMENTED_ events These events should currently never be generated. Also untag _zed_event_add_nvpair() from merge with zpool_do_events_nvprint() ‒ they serve different purposes (machine, usually script vs human consumption) and format the output differently as it stands Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	48b60cffda	zed: remove unused zed_file_read_n() Same deal as zed_file_close_on_exec() Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	cb97db792e	zed: bump zfs_zevent_len_max if we miss any events Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	718ee43362	zed.8: don't pretend an unprivileged user could change the script owner And add a note on /why/ ZEDLETs need to be owned by root Quoth chown(2), Linux man-pages project: Only a privileged process (Linux: one with the CAP_CHOWN capability) may change the owner of a file. Quoth chown(2), FreeBSD: [EPERM] The operation would change the ownership, but the effective user ID is not the super-user. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	01219379cf	zed: purge all mentions of a configuration file There simply isn't a need for one, since the flags the daemon takes are all short (mostly just toggles) and administrative in nature, and are therefore better served by the age-old tradition of sourcing an environment file and preparing the cmdline in the init-specific handler itself, if needed at all Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	0a51083e39	zed: implement close_from() in terms of /proc/self/fd, if available /dev/fd on Darwin Consider the following strace output: prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=1024*1024}) = 0 Yes, that is well over a million file descriptors! This reduces the ZED start-up time from "at least a second" to "instantaneous", and, under strace, from "don't even try" to "usable" by simple virtue of doing five syscalls instead of over a million; in most cases the main loop does nothing Recent Linuxes (5.8+) have close_range(2) for this, but that's an overoptimisation (and libcs don't have wrappers for it yet) This is also run by the ZEDLET pre-exec. Compare: Finished "all-syslog.sh" eid=13 pid=6717 time=1.027100s exit=0 Finished "history_event-zfs-list-cacher.sh" eid=13 pid=6718 time=1.046923s exit=0 to Finished "all-syslog.sh" eid=12 pid=4834 time=0.001836s exit=0 Finished "history_event-zfs-list-cacher.sh" eid=12 pid=4835 time=0.001346s exit=0 lol Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	b73e40a5ad	zed: print combined system/user time after ZEDLET death Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11834	2021-04-14 13:19:49 -07:00
наб	fa991f2a47	zed: allow limiting concurrent jobs 200ms time-out is relatively long, but if we already hit the cap, then we'll likely be able to spawn multiple new jobs when we wake up Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11807	2021-04-14 13:19:49 -07:00
наб	dbf8baf8dd	zed: remove unused zed_file_close_on_exec() The FIXME comment was there since the initial implementation in 2014, there are no users Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11807	2021-04-14 13:19:49 -07:00
наб	9160c441a4	zed: use separate reaper thread and collect ZEDLETs asynchronously Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11807	2021-04-14 13:19:49 -07:00
наб	84c1652a0a	zed: set names for all threads Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11807	2021-04-14 13:19:48 -07:00
наб	d09e1c1343	libzutil: zfs_isnumber(): return false if input empty zpool list, which is the only user, would mistakenly try to parse the empty string as the interval in this case: $ zpool list "a" cannot open 'a': no such pool $ zpool list "" interval cannot be zero usage: <usage string follows> which is now symmetric with zpool get: $ zpool list "" cannot open '': name must begin with a letter Avoid breaking the "interval cannot be zero" string. There simply isn't a need for this, and it's user-facing. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11841 Closes #11843	2021-04-07 13:27:42 -07:00
Andrea Gelmini	ca7af7f675	Fix various typos Correct an assortment of typos throughout the code base. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11774	2021-04-07 13:27:11 -07:00
наб	dc52c0d725	fsck.zfs: implement 4/8 exit codes as suggested in manpage Update the fsck.zfs helper to bubble up some already-known-about errors if they are detected in the pool. health=degraded => 4/"Filesystem errors left uncorrected" health=faulted && dataset in /etc/fstab => 8/"Operational error" pool not found => 8/"Operational error" everything else => 0 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11806	2021-04-07 13:24:18 -07:00
Mike Swanson	bfdd001679	Add compatibility file sets (ZoL 0.6.1, 0.6.4, OpenZFS 2.1) ZoL 0.6.1 introduced feature flags with the three features that all implementations at the time were guaranteed to have. 0.6.4 introduced a few more until 0.6.5 added two after that. OpenZFS 2.1 added the dRAID feature. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mike Swanson <mikeonthecomputer@gmail.com> Closes #11818	2021-04-07 13:24:08 -07:00
наб	38280c3526	zed: reap child after killing on time-out When a child process is killed waitpid() must be called on the pid the reap the zombie process. Update BUGS section to reflect reality by replacing "zedlets aren't time limited with "zedlets can be interrupted". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11769 Closes #11798	2021-03-26 14:21:00 -07:00
Andrea Gelmini	8a915ba1f6	Removed duplicated includes Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11775	2021-03-22 12:34:58 -07:00
Chunwei Chen	296a4a369b	Fix zfs_get_data access to files with wrong generation If TX_WRITE is create on a file, and the file is later deleted and a new directory is created on the same object id, it is possible that when zil_commit happens, zfs_get_data will be called on the new directory. This may result in panic as it tries to do range lock. This patch fixes this issue by record the generation number during zfs_log_write, so zfs_get_data can check if the object is valid. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #10593 Closes #11682	2021-03-19 22:53:31 -07:00
Matthew Ahrens	330c6c0523	Clean up RAIDZ/DRAID ereport code The RAIDZ and DRAID code is responsible for reporting checksum errors on their child vdevs. Checksum errors represent events where a disk returned data or parity that should have been correct, but was not. In other words, these are instances of silent data corruption. The checksum errors show up in the vdev stats (and thus `zpool status`'s CKSUM column), and in the event log (`zpool events`). Note, this is in contrast with the more common "noisy" errors where a disk goes offline, in which case ZFS knows that the disk is bad and doesn't try to read it, or the device returns an error on the requested read or write operation. RAIDZ/DRAID generate checksum errors via three code paths: 1. When RAIDZ/DRAID reconstructs a damaged block, checksum errors are reported on any children whose data was not used during the reconstruction. This is handled in `raidz_reconstruct()`. This is the most common type of RAIDZ/DRAID checksum error. 2. When RAIDZ/DRAID is not able to reconstruct a damaged block, that means that the data has been lost. The zio fails and an error is returned to the consumer (e.g. the read(2) system call). This would happen if, for example, three different disks in a RAIDZ2 group are silently damaged. Since the damage is silent, it isn't possible to know which three disks are damaged, so a checksum error is reported against every child that returned data or parity for this read. (For DRAID, typically only one "group" of children is involved in each io.) This case is handled in `vdev_raidz_cksum_finish()`. This is the next most common type of RAIDZ/DRAID checksum error. 3. If RAIDZ/DRAID is not able to reconstruct a damaged block (like in case 2), but there happens to be additional copies of this block due to "ditto blocks" (i.e. multiple DVA's in this blkptr_t), and one of those copies is good, then RAIDZ/DRAID compares each sector of the data or parity that it retrieved with the good data from the other DVA, and if they differ then it reports a checksum error on this child. This differs from case 2 in that the checksum error is reported on only the subset of children that actually have bad data or parity. This case happens very rarely, since normally only metadata has ditto blocks. If the silent damage is extensive, there will be many instances of case 2, and the pool will likely be unrecoverable. The code for handling case 3 is considerably more complicated than the other cases, for two reasons: 1. It needs to run after the main raidz read logic has completed. The data RAIDZ read needs to be preserved until after the alternate DVA has been read, which necessitates refcounts and callbacks managed by the non-raidz-specific zio layer. 2. It's nontrivial to map the sections of data read by RAIDZ to the correct data. For example, the correct data does not include the parity information, so the parity must be recalculated based on the correct data, and then compared to the parity that was read from the RAIDZ children. Due to the complexity of case 3, the rareness of hitting it, and the minimal benefit it provides above case 2, this commit removes the code for case 3. These types of errors will now be handled the same as case 2, i.e. the checksum error will be reported against all children that returned data or parity. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11735	2021-03-19 16:22:10 -07:00
George Wilson	0936981d86	zpool import cachefile improvements Importing a pool using the cachefile is ideal to reduce the time required to import a pool. However, if the devices associated with a pool in the cachefile have changed, then the import would fail. This can easily be corrected by doing a normal import which would then read the pool configuration from the labels. The goal of this change is make importing using a cachefile more resilient and auto-correcting. This is accomplished by having the cachefile import logic automatically fallback to reading the labels of the devices similar to a normal import. The main difference between the fallback logic and a normal import is that the cachefile import logic will only look at the device directories that were originally used when the cachefile was populated. Additionally, the fallback logic will always import by guid to ensure that only the pools in the cachefile would be imported. External-issue: DLPX-71980 Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Wilson <gwilson@delphix.com> Closes #11716	2021-03-12 15:42:27 -08:00
Tony Hutter	4fdbd43450	vdev_id: Create symlinks even if no /dev/mapper/ vdev_id uses the /dev/mapper/ symlinks to resolve a UUID to a dm name (like dm-1). However on some multipath setups, there is no /dev/mapper/ entry for the UUID at the time vdev_id is called by udev. However, this isn't necessarily needed, as we may be able to resolve the dm name from the $DEVNAME that udev passes us (like DEVNAME="/dev/dm-1"). This patch tries to resolve the dm name from $DEVNAME first, before falling back to looking in /dev/mapper/. This fixed an issue where the by-vdev names weren't reliably showing up on one of our nodes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #11698	2021-03-08 08:43:30 -08:00
Brian Behlendorf	e7a06356c1	Suppress cppcheck invalidSyntax warninigs For some reason cppcheck 1.90 is generating an invalidSyntax warning when the BF64_SET macro is used in the zstream source. The same warning is not reported by cppcheck 2.3, nor is their any evident problem with the expanded macro. This appears to be an issue with this version of cppcheck. This commit annotates the source to suppress the warning. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11700	2021-03-05 17:56:35 -08:00
Thomas Lamprecht	fd1c366f82	zpool: use tab to intend continuation from removal status Bring the output of the removal status in line with the other "fields" that zpool status outputs, and thus allows an parser to easier detect this as continuation of the 'remove:' output. Before: remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...] 776K memory used for removed device mappings Now: remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...] 776K memory used for removed device mappings Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Closes #11674	2021-03-05 12:15:35 -08:00
Martin Matuška	03ef8f09e1	Add missing checks for unsupported features After `35ec517` it has become possible to import ZFS pools witn an active org.illumos:edonr feature on FreeBSD, leading to a panic. In addition, "zpool status" reported all pools without edonr as upgradable and "zpool upgrade -v" reported edonr in the list of upgradable features. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #11653	2021-02-27 17:16:02 -08:00
Tony Hutter	3ee4e6d8b7	vdev_id: Fix partition regular expression Given a DM device name, the old vdev_id script would extract any text after a 'p' as the partition number. It then appends "-part" + the partition number to the name, giving a by-vdev name like "L0-part5". This works fine if the DM name is like 'dm-2p5', but doesn't work if the DM name is a multipath name like "mpatha". In those cases it incorrectly matches the 'p' in "mpatha", giving by-vdev names like "L0-partatha". This patch fixes the issue by making the partition regex match stricter. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #11637	2021-02-24 09:58:46 -08:00
Ryan Moeller	94fa1c3d96	Force symlink creation for zpool.d compat links gmake install fails when zpool.d compat links already exist. Force the symlinks to be recreated if already present. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11633	2021-02-24 09:49:59 -08:00
Christian Schwarz	52cb284f7b	ztest: propagate -o to the zdb child process I think this is the behavior that most users expect. Future work: have a separate flag, e.g., -O, to specify separate set_global_vars for the zdb child than for the ztest children. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11602	2021-02-19 22:45:16 -08:00
Christian Schwarz	2801e68d1b	ztest: fix -o by calling set_global_var in child processes Without set_global_var() in the child processes the -o option provides little use. Before this change set_global_var() was called as a side-effect of getopt processing which only happens for the parent ztest process. This change limits the set of options that can be set and makes them available to the child through ztest_shared_opts_t. Future work: support arbitrary option count and length. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11602	2021-02-19 22:45:10 -08:00
Andriy Gapon	778869fa13	Fix report_mount_progress never calling set_progress_header That happens because of an off-by-one mistake. share_mount_one_cb() calls report_mount_progress(current=sm_done) after having incremented sm_done by one. Then report_mount_progress() increments the parameter again. It appears that that logic became obsolete after commit `a10d50f999`, parallel zfs mount. On FreeBSD I observe that zfs mount -a -v prints, for example, (null): (201/248) That happens because set_progress_header() is never called. With this change the output becomes correct: Mounting ZFS filesystems: (209/248) Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andriy Gapon <avg@FreeBSD.org> Closes #11607	2021-02-18 13:53:05 -08:00
Colm	658fb8020f	Add "compatibility" property for zpool feature sets Property to allow sets of features to be specified; for compatibility with specific versions / releases / external systems. Influences the behavior of 'zpool upgrade' and 'zpool create'. Initial man page changes and test cases included. Brief synopsis: zpool create -o compatibility=off\|legacy\|file[,file...] pool vdev... compatibility = off : disable compatibility mode (enable all features) compatibility = legacy : request that no features be enabled compatibility = file[,file...] : read features from specified files. Only features present in all files will be enabled on the resulting pool. Filenames may be absolute, or relative to /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc checked first). Only affects zpool create, zpool upgrade and zpool status. ABI changes in libzfs: * New function "zpool_load_compat" to load and parse compat sets. * Add "zpool_compat_status_t" typedef for compatibility parse status. * Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum * Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum An initial set of base compatibility sets are included in cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is modified to install these in $pkgdatadir/compatibility.d and to create symbolic links to a reasonable set of aliases. Reviewed-by: ericloewe Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Colm Buckley <colm@tuatha.org> Closes #11468	2021-02-17 21:30:45 -08:00
Brian Behlendorf	35ec51796f	FreeBSD: disable edonr in zfs_mod_supported_feature() Rather than conditionally compiling out the edonr code for FreeBSD update zfs_mod_supported_feature() to indicate this feature is unsupported. This ensures that all spa features are defined on every platform, even if they are not supported. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11605 Issue #11468	2021-02-17 08:14:51 -08:00
José Luis Salvador Rufo	aef1830f93	Support uClibc for the tests compilations There are two issues that don't allow ZFS to be compiled using uClibc. `backtrace()`, and `program_invocation_short_name` as a `const`. This patch adds uClibc to the conditionals in the same way there are already for Glibc for `backtrace()`; and removes the external param `program_invocation_short_name` because its only used here for the whole project. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com> Closes #11600	2021-02-16 21:51:46 -08:00
Arshad Hussain	677cdf7865	vdev_id: Support daisy-chained JBODs in multipath mode Within function sas_handler() userspace commands like '/usr/sbin/multipath' have been replaced with sourcing device details from within sysfs which reduced a significant amount of overhead and processing time. Multiple JBOD enclosures and their order are sourced from the bsg driver (/sys/class/enclosure) to isolate chassis top-level expanders, which are then dynamically indexed based on host channel of the multipath subordinate disk member device being processed. Additionally added a "mixed" mode for slot identification for environments where a ZFS server system may contain SAS disk slots where there is no expander (direct connect to HBA) while an attached external JBOD with an expander have different slot identifier methods. How Has This Been Tested? ~~~~~~~~~~~~~~~~~~~~~~~~~ Testing was performed on a AMD EPYC based dual-server high-availability multipath environment with multiple HBAs per ZFS server and four SAS JBODs. The two primary JBODs were multipath/cross-connected between the two ZFS-HA servers. The secondary JBODs were daisy-chained off of the primary JBODs using aligned SAS expander channels (JBOD-0 expanderA--->JBOD-1 expanderA, JBOD-0 expanderB--->JBOD-1 expanderB, etc). Pools were created, exported and re-imported, imported globally with 'zpool import -a -d /dev/disk/by-vdev'. Low level udev debug outputs were traced to isolate and resolve errors. Result: ~~~~~~~ Initial testing of a previous version of this change showed how reliance on userspace utilities like '/usr/sbin/multipath' and '/usr/bin/lsscsi' were exacerbated by increasing numbers of disks and JBODs. With four 60-disk SAS JBODs and 240 disks the time to process a udevadm trigger was 3 minutes 30 seconds during which nearly all CPU cores were above 80% utilization. By switching reliance on userspace utilities to sysfs in this version, the udevadm trigger processing time was reduced to 12.2 seconds and negligible CPU load. This patch also fixes few shellcheck complains. Reviewed-by: Gabriel A. Devenyi <gdevenyi@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Jeff Johnson <jeff.johnson@aeoncomputing.com> Signed-off-by: Jeff Johnson <jeff.johnson@aeoncomputing.com> Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Closes #11526	2021-02-09 13:04:09 -08:00
nssrikanth	32366649d3	Fixed issue with processing of EC_dev_remove event The pool guid and vdev guid received by zfs_agent_post_event(), which calls zfs_retire_recv(), are normally non-zero. However, later in this same method they may be unconditionally reset to zero by the code which is intended to handle multipath, spare and l2arc vdevs. This will result in the EC_dev_remove not being handled. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>\ Co-authored-by: Vipin Kumar Verma <vipin.verma@hpe.com> Signed-off-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Closes #11564	2021-02-05 08:30:50 -08:00
nssrikanth	cf0a6dd3ed	Added extra check to replace Faulted VDEV with Distributed Spare In ZED zfs_retire agent added a check to handle Distributed Spare replacement for Faulted VDEV also. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Vipin Kumar Verma <vipin.verma@hpe.com> Signed-off-by: Mark Maybee <mark.maybee@hpe.com> Closes #11354 Closes #11355	2021-01-28 17:00:26 -08:00
Allan Jude	393e69241e	Add zdb -r <dataset> <object-id \| file> <output> While you can use zdb -R poolname vdev:offset:[<lsize>/]<psize>[:flags] to extract individual DVAs from a vdev, it would be handy for be able copy an entire file out of the pool. Given a file or object number, add support to copy the contents to a file. Useful for debugging and recovery. Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #11027	2021-01-27 21:36:01 -08:00
Brian Behlendorf	0e6c493fec	cppcheck: integrete cppcheck In order for cppcheck to perform a proper analysis it needs to be aware of how the sources are compiled (source files, include paths/files, extra defines, etc). All the needed information is available from the Makefiles and can be leveraged with a generic cppcheck Makefile target. So let's add one. Additional minor changes: * Removing the cppcheck-suppressions.txt file. With cppcheck 2.3 and these changes it appears to no longer be needed. Some inline suppressions were also removed since they appear not to be needed. We can add them back if it turns out they're needed for older versions of cppcheck. * Added the ax_count_cpus m4 macro to detect at configure time how many processors are available in order to run multiple cppcheck jobs. This value is also now used as a replacement for nproc when executing the kernel interface checks. * "PHONY =" line moved in to the Rules.am file which is included at the top of all Makefile.am's. This is just convenient becase it allows us to use the += syntax to add phony targets. * One upside of this integration worth mentioning is it now allows `make cppcheck` to be run in any directory to check that subtree. * For the moment, cppcheck is not run against the FreeBSD specific kernel sources. The cppcheck-FreeBSD target will need to be implemented and testing on FreeBSD to support this. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11508	2021-01-26 16:12:26 -08:00
Brian Behlendorf	7454d2bb8d	cppcheck: zpool_main.c possible null pointer dereference Explicitly check for NULL to satisfy cppcheck that "val" can never be NULL when passed to printf(). This looks like a false positive since is_blank_str() can never take the false conditional branch when passed a NULL. But there's no harm in adding the extra check. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11508	2021-01-26 16:11:46 -08:00
Arshad Hussain	e2870fb24a	vdev_id: Add error message when $CONFIG is missing It was observed that vdev_id exists silently when the $CONFIG file is missing. This patch adds error message in case vdev_id is called without default $CONFIG or '-c'. This makes end user observe the exit message more easily. Before Patch: ~~~~~~~~~~~~~ $ ./cmd/vdev_id/vdev_id $ After Patch: ~~~~~~~~~~~~ $ ./cmd/vdev_id/vdev_id Error: Config file "/etc/zfs/vdev_id.conf" not found $ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Closes #11498	2021-01-23 15:52:29 -08:00
Brian Behlendorf	76e1f78d4b	Only add supported features during pool creation When creating a pool only features supported by both user and kernel space should be enabled. Furthermore, improve the error messages when attempting to create, or add, a dRAID vdev when the dRAID feature is not supported by the kernel modules. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11492	2021-01-22 09:47:06 -08:00
Matthew Ahrens	aa755b3549	Set aside a metaslab for ZIL blocks Mixing ZIL and normal allocations has several problems: 1. The ZIL allocations are allocated, written to disk, and then a few seconds later freed. This leaves behind holes (free segments) where the ZIL blocks used to be, which increases fragmentation, which negatively impacts performance. 2. When under moderate load, ZIL allocations are of 128KB. If the pool is fairly fragmented, there may not be many free chunks of that size. This causes ZFS to load more metaslabs to locate free segments of 128KB or more. The loading happens synchronously (from zil_commit()), and can take around a second even if the metaslab's spacemap is cached in the ARC. All concurrent synchronous operations on this filesystem must wait while the metaslab is loading. This can cause a significant performance impact. 3. If the pool is very fragmented, there may be zero free chunks of 128KB or more. In this case, the ZIL falls back to txg_wait_synced(), which has an enormous performance impact. These problems can be eliminated by using a dedicated log device ("slog"), even one with the same performance characteristics as the normal devices. This change sets aside one metaslab from each top-level vdev that is preferentially used for ZIL allocations (vdev_log_mg, spa_embedded_log_class). From an allocation perspective, this is similar to having a dedicated log device, and it eliminates the above-mentioned performance problems. Log (ZIL) blocks can be allocated from the following locations. Each one is tried in order until the allocation succeeds: 1. dedicated log vdevs, aka "slog" (spa_log_class) 2. embedded slog metaslabs (spa_embedded_log_class) 3. other metaslabs in normal vdevs (spa_normal_class) The space required for the embedded slog metaslabs is usually between 0.5% and 1.0% of the pool, and comes out of the existing 3.2% of "slop" space that is not available for user data. On an all-ssd system with 4TB storage, 87% fragmentation, 60% capacity, and recordsize=8k, testing shows a ~50% performance increase on random 8k sync writes. On even more fragmented systems (which hit problem #3 above and call txg_wait_synced()), the performance improvement can be arbitrarily large (>100x). Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11389	2021-01-21 15:12:54 -08:00

1 2 3 4 5 ...

1190 Commits