Compare commits

...

346 Commits

Author SHA1 Message Date
Brian Behlendorf d9f49627c1 ZTS: Waiting for zvols to be available
The ZTS block_device_wait helper function should use -e when waiting
for a file to appear since it will be either a block special device
or a symlink.  This didn't cause any failures but when a device path
was specified the function would wait longer than needed.

Additionally update the most flakey test cases to pass the file path
to block_device_wait to try and improve the test reliability.  The
udev behavior on Fedora in particular can result in frequent false
positives.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12515
2021-08-31 10:30:21 -07:00
Anton Gubarkov 5eaecd36af vdev_id: Return an error if config file is not found
Signed-off-by: Anton Gubarkov <anton.gubarkov@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-08-31 10:30:21 -07:00
Ryan Moeller f738158014 Correct checking bdev_check_media_change message
We're not looking for bdev_disk_changed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12492
2021-08-31 10:30:21 -07:00
Tony Hutter 2b7855ae6b Make 'zpool labelclear -f' work on offlined disks
This patch allows you to clear the label on offlined disks in an active
pool with `-f`.  Previously, labelclear wouldn't let you do that.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12511
2021-08-31 10:30:21 -07:00
Brian Behlendorf 9b418a96b3 Linux 5.14 compat: META
Increase the Linux-Maximum version in the META file to 5.14.
All of the required compatibility patches have been merged
and the 5.14 kernel is close to being officially released.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-08-31 10:30:21 -07:00
Alexander Motin c02004475a Optimize small random numbers generation
In all places except two spa_get_random() is used for small values,
and the consumers do not require well seeded high quality values.
Switch those two exceptions directly to random_get_pseudo_bytes()
and optimize spa_get_random(), renaming it to random_in_range(),
since it is not related to SPA or ZFS in general.

On FreeBSD directly map random_in_range() to new prng32_bounded() KPI
added in FreeBSD 13.  On Linux and in user-space just reduce the type
used to uint32_t to avoid more expensive 64bit division.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12183
2021-08-31 10:30:21 -07:00
Sam Hathaway 9faf3cf6fa zpool-remove.8: describe top-level vdev sector size limitation
Document that top-level vdevs cannot be removed unless all top-level
vdevs have the same sector size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Sam Hathaway <sam@sam-hathaway.com>
Closes #11339
Closes #12472
2021-08-31 10:30:21 -07:00
Mark Johnston 8fc1a04860 Initialize parity blocks before RAID-Z reconstruction benchmarking
benchmark_raidz() allocates a row to benchmark parity calculation and
reconstruction.  In the latter case, the parity blocks are left
uninitialized, leading to reports from KMSAN.

Initialize parity blocks to 0xAA as we do for the data earlier in the
function.  This does not affect the selected RAID-Z implementation on
any of several systems tested.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12473
2021-08-31 10:30:21 -07:00
Ryan Moeller 213bb01b2f ZTS: Add tests for creation time
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12432
2021-08-31 10:30:21 -07:00
Richard Yao 3c72042c6f Linux 4.11 compat: statx support
Linux 4.11 added a new statx system call that allows us to expose crtime
as btime. We do this by caching crtime in the znode to match how atime,
ctime and mtime are cached in the inode.

statx also introduced a new way of reporting whether the immutable,
append and nodump bits have been set. It adds support for reporting
compression and encryption, but the semantics on other filesystems is
not just to report compression/encryption, but to allow it to be turned
on/off at the file level. We do not support that.

We could implement semantics where we refuse to allow user modification
of the bit, but we would need to do a dnode_hold() in zfs_znode_alloc()
to find out encryption/compression information. That would introduce
locking that will have a minor (although unmeasured) performance cost.
It also would be inferior to zdb, which reports far more detailed
information. We therefore omit reporting of encryption/compression
through statx in favor of recommending that users interested in such
information use zdb.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #8507
2021-08-31 10:30:21 -07:00
Gordon Bergling a923a9ef2e zfs.4: Fix typo s/compatiblity/compatibility/
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Gordon Bergling <gbergling@googlemail.com>
Closes #12464
2021-08-31 10:30:21 -07:00
Alexander Motin 2cf8f7195c Remove b_pabd/b_rabd allocation from arc_hdr_alloc()
When a header is allocated for full overwrite it is a waste of time
to allocate b_pabd/b_rabd for it, since arc_write() will free them
without ever being touched.  If it is a read or a partial overwrite
then arc_read() and arc_hdr_decrypt() allocate them explicitly.

Reduced memory allocation in user threads also reduces ARC eviction
throttling there, proportionally increasing it in ZIO threads, that
is not good.  To minimize or even avoid it introduce ARC allocation
reserve, allowing certain arc_get_data_abd() callers to allocate a
bit longer in situations where user threads will already throttle.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12398
2021-08-31 10:30:21 -07:00
Alexander Motin 28aefb20f3 Optimize arc_l2c_only lists assertions
It is very expensive and not informative to call multilist_is_empty()
for each arc_change_state() on debug builds to check for impossible.
Instead implement special index function for arc_l2c_only->arcs_list,
multilists, panicking on any attempt to use it.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12421
2021-08-31 10:30:21 -07:00
Alexander Motin 772f58cccd Fix/improve dbuf hits accounting
Instead of clearing stats inside arc_buf_alloc_impl() do it inside
arc_hdr_alloc() and arc_release().  It fixes statistics being wiped
every time a new dbuf is filled from the ARC.

Remove b_l1hdr.b_l2_hits. L2ARC hits are accounted at b_l2hdr.b_hits.
Since the hits are accounted under hash lock, replace atomics with
simple increments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12422
2021-08-31 10:30:21 -07:00
Alexander Motin c066ab1763 Avoid vq_lock drop in vdev_queue_aggregate()
vq_lock is already too congested for two more operations per I/O.
Instead of dropping and reacquiring it inside vdev_queue_aggregate()
delegate the zio_vdev_io_bypass() and zio_execute() calls for parent
I/Os to callers, that drop the lock any way to execute the new I/O.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12297
2021-08-31 10:30:21 -07:00
Alexander Motin 72175a5076 Use more atomics in refcounts
Use atomic_load_64() for zfs_refcount_count() to prevent torn reads
on 32-bit platforms.  On 64-bit ones it should not change anything.

When built with ZFS_DEBUG but running without tracking enabled use
atomics instead of mutexes same as for builds without ZFS_DEBUG.
Since rc_tracked can't change live we can check it without lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12420
2021-08-31 10:30:21 -07:00
Ryan Moeller bed2062818 ZTS: Avoid unset $tmpdir in redacted_panic
The redacted_send tests make use of a $tmpdir variable, except in
redacted_send/redacted_panic the variable is never defined.

Use $TEST_BASE_DIR instead.

Clean up the stream file after the test.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12455
2021-08-31 10:30:21 -07:00
Allan Jude de669cfa0c Restore FreeBSD sysctl processing for arc.min and arc.max
Before OpenZFS 2.0, trying to set the FreeBSD sysctl vfs.zfs.arc_max
to a disallowed value would return an error.
Since the switch, it instead only generates WARN_IF_TUNING_IGNORED

Keep the ability to set the sysctl's specifically to 0, even though
that is less than the minimum, because some tests depend on this.

Also lost, was the ability to set vfs.zfs.arc_max to a value less
than the default vfs.zfs.arc_min at boot time. Restore this as well.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #12161
2021-08-31 10:30:21 -07:00
Ryan Moeller 9b611300c9 zfs: add missed dependency of zfs module on zlib
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Martin Matuska <mm@FreeBSD.org>
Co-authored-by: Konstantin Belousov <kib@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
External-issue: https://reviews.freebsd.org/D31207
Closes #12442
2021-08-31 10:30:21 -07:00
Ryan Moeller 332a0bb494 Add zfs.sh -r flag to reload modules
zfs.sh already can load and unload, so why not both?

This is convenient when developing changes to the module and you want
to rapidly make some changes, rebuild the module, reload the module,
and test the changes.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12450
2021-08-31 10:30:21 -07:00
Ryan Moeller 768ce47a1f Fix usage of find in tests/Makefile.am
The path is not optional on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12453
2021-08-31 10:30:21 -07:00
Tony Nguyen cdca09901c Run arc_evict thread at higher priority
Run arc_evict thread at higher priority, nice=0, to give it more CPU
time which can improve performance for workload with high ARC evict
activities.

On mixed read/write and sequential read workloads, I've seen between
10-40% better performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #12397
2021-08-31 10:30:21 -07:00
Rich Ercolani 1b34ef29be Make get_key_material_file fail more verbosely
It turns out, there are a lot of possible reasons for fopen to fail.
Let's share which reason we failed for today.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12410
2021-08-31 10:30:21 -07:00
Brian Behlendorf c4a02a6829 Enable /proc/diskstats for zvols
The /proc/diskstats accounting needs to be explicitly enabled
for block devices which do not use multi-queue.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12440
Closes #12066
2021-08-31 10:30:21 -07:00
George Melikov f5160b1c0d Man zpool-scrub.8: describe sequential scrub
Describe sequential scrub and add examples of scrub status.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12429
2021-08-31 10:30:21 -07:00
hedongzhang 44c1eed38f Modify checksum obtain method of QAT
CpaDcGeneratefooter function that obtain the checksum code
does not support the CPA_DC_STATELESS mode. So we get the
adler32 chencksum of the end of the zlib from dc_results.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: hedong.zhang <h_d_zhang@163.com>
Closes #12343
2021-08-31 10:30:21 -07:00
Mark Johnston 469d457093 Allow disabling of unmapped I/O on FreeBSD
We have a tunable which permits one to disable the use of unmapped I/O
for the buffer cache.  Respect it in ZFS as well.  This is useful for
KMSAN, which cannot easily maintain shadow state for unmapped pages.

No functional change intended, as unmapped I/O is permitted by default
and there's no real reason to disable it in practice except for
debugging.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12446
2021-08-31 10:30:21 -07:00
Alexander Motin 480d4f7b0d Avoid small buffer copying on write
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead of that use
abd_size_alloc_linear(), moved into public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12425
2021-08-31 10:30:21 -07:00
John Wren Kennedy e0890987ef Assorted fixes for the performance tests
- Bail out early if we're running the perf tests and forget to
  specify disks.
- Allow perf tests to run with any number of disks.
- Remove weekly vs. nightly settings
- Move variables with common values to perf.shlib
- Use zinject to clear the ARC over export/import
- Fix dbuf cache size calculation

When the meaning of `dbuf_cache_max_bytes` changed, the performance
test that covers the dbuf cache started to fail. The test would try to
write files for the test using the max possible size of the cache,
inevitably filling the pool and failing. This change uses
`dbuf_cache_shift` to correctly calculate the dbuf cache size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12408
2021-08-31 10:30:21 -07:00
Matthew Ahrens aa3b9bd715 Read past end of argv array in zpool_do_import()
`zpool_do_import()` passes `argv[0]`, (optionally) `argv[1]`, and
`pool_specified` to `import_pools()`.  If `pool_specified==FALSE`, the
`argv[]` arguments are not used.  However, these values may be off the
end of the `argv[]` array, so loading them could dereference unmapped
memory.  This error is reported by the asan build:

```
=================================================================
==6003==ERROR: AddressSanitizer: heap-buffer-overflow
READ of size 8 at 0x6030000004a8 thread T0
    #0 0x562a078b50eb in zpool_do_import zpool_main.c:3796
    #1 0x562a078858c5 in main zpool_main.c:10709
    #2 0x7f5115231bf6 in __libc_start_main
    #3 0x562a07885eb9 in _start

0x6030000004a8 is located 0 bytes to the right of 24-byte region
allocated by thread T0 here:
    #0 0x7f5116ac6b40 in __interceptor_malloc
    #1 0x562a07885770 in main zpool_main.c:10699
    #2 0x7f5115231bf6 in __libc_start_main
```

This commit passes NULL for these arguments if they are off the end
of the `argv[]` array.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #12339
2021-08-31 10:30:21 -07:00
Václav Skála 051658e639 Add missing properties to zfs allow manpage
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Václav Skála <skala@vshosting.cz>
Closes #12402
2021-08-31 10:30:21 -07:00
George Amanakis d0aadfed27 Fixes in persistent L2ARC
In l2arc_add_vdev() first decide whether the device is eligible for
L2ARC rebuild or whole device trim and then add it to the list of cache
devices. Otherwise l2arc_feed_thread() might already start writing on
the device invalidating previous content as l2ad_hand = l2ad_start.
However l2arc_rebuild_vdev() needs the device present in the cache
device list to figure out its l2arc_dev_t. Fix this by moving most of
l2arc_rebuild_vdev() in a new function l2arc_rebuild_dev() which does
not need to search in the cache device list.

In contrast to l2arc_add_vdev() we do not have to worry about
l2arc_feed_thread() invalidating previous content when onlining a
cache device. The device parameters (l2ad*) are not cleared when
offlining the device and writing new buffers will not invalidate
all previous content. In worst case only buffers that have not had
their log block written to the device will be lost.

Retire persist_l2arc_00{4,5,8} tests since they cover code already
covered by the remaining ones. Test persist_l2arc_006 is renamed to
persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005.

Fix a typo in persist_l2arc_004, and remove an assertion that is not
always true from l2arc_arcstats_pos. Also update an assertion in
persist_l2arc_005 and explain why in a comment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12365
2021-08-31 10:30:21 -07:00
Alexander Motin 1c0a02d79d Add comment on metaslab_class_throttle_reserve() locking
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Issue #12314
Closes #12419
2021-08-31 10:30:21 -07:00
Mark Johnston bf113ee0c8 Initialize dn_next_type[] in the dnode constructor
It seems nothing ensures that this array is zeroed when a dnode is
freshly allocated, so in principle it retains the values from the
previous allocation.  In practice it seems to be the case that the
fields should end up zeroed, but we can zero the field anyway for
consistency.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-08-31 10:30:21 -07:00
Mark Johnston 1d2eccb67a Zero pad bytes following TX_WRITE log data
When logging a TX_WRITE record in the case where file data has to be
copied from the DMU, we pad the log record size to a multiple of 8
bytes.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-08-31 10:30:21 -07:00
Mark Johnston 951aef3cd5 Zero pad bytes when allocating a ZIL record
When allocating a record, we round up the allocation size to a multiple
of 8.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-08-31 10:30:21 -07:00
Mark Johnston 992569cab9 Initialize all fields in zfs_log_xvattr()
When logging TX_SETATTR, we could otherwise fail to initialize part of
the corresponding ZIL record depending on which fields are present in
the xvattr.  Initialize the creation time and the AV scan timestamp to
zero so that uninitialized bytes are not written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-08-31 10:30:21 -07:00
Mark Johnston e79cd79cd9 Initialize "autoreplace" in spa_ld_get_props()
spa_prop_find() may fail to find the specified property, in which case
it suppresses ENOENT from zap_lookup().  In this case, the return value
is left uninitialized, so spa_autoreplace was being initialized using an
uninitialized stack variable.

This was found using KMSAN.  It appears to be a regression from commit
9eb7b46ed0, which removed the initialization of "autoreplace" from the
definition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-08-31 10:30:21 -07:00
Coleman Kane 06b98ba564 Linux 5.14 compat: explicity assign set_page_dirty
Kernel 5.14 introduced a change where set_page_dirty of
struct address_space_operations is no longer implicitly set to
__set_page_dirty_buffers(), which ended up resulting in a NULL
pointer deref in the kernel when it is attempted to be called.
This change sets .set_page_dirty in the structure to
__set_page_dirty_nobuffers(), which was introduced with the
related patch set. The breaking change was introduce in commit
0af573780b0b13fceb7fabd49dc1b073cee9a507 to torvalds/linux.git.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12427
2021-08-31 10:30:21 -07:00
Rich Ercolani d753118a82 Fix unfortunate NULL in spa_update_dspace
After 1325434b, we can in certain circumstances end up calling
spa_update_dspace with vd->vdev_mg NULL, which ends poorly during
vdev removal.

So let's not do that further space adjustment when we can't.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12380
Closes #12428
2021-08-31 10:30:21 -07:00
Brian Behlendorf 89d1d38162 Linux 5.14 compat: blk_alloc_disk()
In Linux 5.14, blk_alloc_queue is no longer exported, and its usage
has been superseded by blk_alloc_disk, which returns a gendisk struct
from which we can still retrieve the struct request_queue* that is
needed in the one place where it is used. This also replaces the call
to alloc_disk(minors), and minors is now set via struct member
assignment.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12362
Closes #12409
2021-08-31 10:30:21 -07:00
Ryan Moeller 522365481b zloop: Add a max iterations option, use default run/pass times
It is useful to have control over the number of iterations of zloop so
we can easily produce "x core dumps found *in y iterations*" metrics.

Using random values for run/pass times doesn't improve coverage in a
meaningful way.

Randomizing run time could be seen as a compromise between running a
greater variety of shorter tests versus a smaller variety of longer
tests within a fixed time span.  However, it is not desirable when
running a fixed number of iterations.

Pass time already incorporates randomness within ztest.

Either parameter can be passed to ztest explicitly if the defaults are
not satisfactory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12411
2021-08-31 10:30:21 -07:00
Alexander Motin d4326d2500 FreeBSD: Ignore make_dev_s() errors
Since errors returned by zvol_create_minor_impl() are ignored by the
common code, it is more convenient to ignore make_dev_s() errors there.
It allows, for example, to get device created for the zvol after later
rename instead of having it further stuck in half-created state.
zvol_rename_minor() already ignores those errors.

While there, switch from MAXPHYS to maxphys in FreeBSD 13+.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12375
2021-08-31 10:30:21 -07:00
Jorgen Lundman 7341b40b71 Remove old orig_fd variable from zfs send
Possibly required in the past, but is currently fills no purpose.
Ordinarily such tiny cleanup is not generally worth it, however
on the macOS port, in a future commit, we do unspeakable things to the
"fd" for send/recv, and it would be easier to only have to deal with
one "fd" instead of two.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12404
2021-08-31 10:30:21 -07:00
Alexander Motin fa9b58809e Optimize allocation throttling
Remove mc_lock use from metaslab_class_throttle_*().  The math there
is based on refcounts and so atomic, so the only race possible there
is between zfs_refcount_count() and zfs_refcount_add().  But in most
cases metaslab_class_throttle_reserve() is called with the allocator
lock held, which covers the race.  In cases where the lock is not
held, GANG_ALLOCATION() or METASLAB_MUST_RESERVE are set, and so we
do not use zfs_refcount_count().  And even if we assume some other
non-existing scenario, the worst that may happen from this race is
few more I/Os get to allocation earlier, that is not a problem.

Move locks and data of different allocators into different cache
lines to avoid false sharing.  Group spa_alloc_* arrays together
into single array of aligned struct spa_alloc spa_allocs.  Align
struct metaslab_class_allocator.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12314
2021-08-31 10:30:21 -07:00
George Melikov a14f18a77a CI: generate ABI files if changed
So commit author can just download them as
artifacts and commit.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12379
2021-08-31 10:30:21 -07:00
Alexander Motin a633ee7da1 Minor ARC optimizations
Remove unneeded global, practically constant, state pointer variables
(arc_anon, arc_mru, etc.), replacing them with macros of real state
variables addresses (&ARC_anon, &ARC_mru, etc.).

Change ARC_EVICT_ALL from -1ULL to UINT64_MAX, not requiring special
handling in inner loop of ARC reclamation.  Respectively change bytes
argument of arc_evict_state() from int64_t to uint64_t.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12348
2021-08-31 10:30:21 -07:00
Jorgen Lundman 3bb93002a1 dmu_redact.c does not call bqueue_destroy
Ensure all calls to bqueue_init() has a corresponding call to bqueue_destroy()

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12118
2021-08-31 10:30:21 -07:00
Alexander 5c59562c87 A few fixes of callback typecasting (for the upcoming ClangCFI)
* zio: avoid callback typecasting
* zil: avoid zil_itxg_clean() callback typecasting
* zpl: decouple zpl_readpage() into two separate callbacks
* nvpair: explicitly declare callbacks for xdr_array()
* linux/zfs_nvops: don't use external iput() as a callback
* zcp_synctask: don't use fnvlist_free() as a callback
* zvol: don't use ops->zv_free() as a callback for taskq_dispatch()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #12260
2021-08-31 10:30:21 -07:00
Ryan Moeller 4c5a408713 Remove unused fields from zvol_task_t
We don't use or need the pool name or value source in the zvol tasks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12361
2021-08-31 10:30:21 -07:00
Alexander Motin a5f3da53b3 FreeBSD: Switch from MAXPHYS to maxphys on FreeBSD 13+
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12378
2021-08-31 10:30:21 -07:00
George Melikov 67cc223a91 zpool_influxdb: fix -Werror=stringop-truncation
Use strlcpy instead of problematic strncpy

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12344
2021-08-31 10:30:21 -07:00
Rich Ercolani be29f88951 Correct zfs-send(8) on readonly sends
zfs-send(8) claimed in the flags list you could use -pR when sending
a readonly filesystem or volume. You cannot.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12336
2021-08-31 10:30:21 -07:00
Alexander Motin d2fd664710 Introduce dsl_dir_diduse_transfer_space()
Most of dsl_dir_diduse_space() and dsl_dir_transfer_space() CPU time
is a dd_lock overhead and time spent in dmu_buf_will_dirty(). Calling
them one after another is a waste of time and even more contention.
Doing that twice for each rewritten block within dbuf_write_done()
via dsl_dataset_block_kill() and dsl_dataset_block_born() created one
of the biggest CPU overheads in case of small blocks rewrite.

dsl_dir_diduse_transfer_space() combines functionality of these two
functions for cases where it is needed, but without double overhead,
practically for the cost of dsl_dir_diduse_space() or even cheaper.

While there, optimize dsl_dir_phys() calls in dsl_dir_diduse_space()
and dsl_dir_transfer_space().  It seems Clang detects some aliasing
there, repeating dd->dd_dbuf->db_data dereference multiple times,
increasing dd_lock scope and contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Author: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12300
2021-08-31 10:30:21 -07:00
наб 43849d6f4a config/libatomic: require -latomic iff atomic.c doesn't link w/o it
In absence of LTO, and dynamic libatomic, la.so ends up in the needs
section of every toolchain executable; some consider this an issue.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12345
Closes #12359
2021-08-31 10:30:21 -07:00
Rich Ercolani c57b38bab1 Tinker with slop space accounting with dedup
* Tinker with slop space accounting with dedup

Do not include the deduplicated space usage in the slop space
reservation, it leads to surprising outcomes.

* Update spa_dedup_dspace sometimes

Sometimes, we get into spa_get_slop_space() with
spa_dedup_dspace=~0ULL, AKA "unset", while spa_dspace is correctly set.

So call the code to update it before we use it if we hit that case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12271
2021-08-31 10:30:21 -07:00
Alexander Motin d0c3d10a52 Fix ARC ghost states eviction accounting
arc_evict_hdr() returns number of evicted bytes in scope of specific
state.  For ghost states it does not mean the amount of really freed
memory, but the logical buffer size.  It is correct for the eviction
process, but not for waking up threads waiting for ARC size reduction,
as added in "Revise ARC shrinker algorithm" commit, causing premature
wakeups while ARC is still overflowed, allowing even bigger overflow,
plus processing overhead when next allocation will also get blocked,
probably also for too short time.

To fix that make arc_evict_hdr() also return the amount of really
freed memory, which for the ghost states is only the header, and use
it to update arc_evict_count instead.  Originally I was thinking to
not return it at all, since arc_get_data_impl() does not account for
the headers, but decided that some slow allocation progress is better
than long waits, reaching on my tests up to 100ms.

To reduce negative latency effects of long time periods when reclaim
thread can free little real memory, start reclamation process earlier,
before we actually reached the overflow threshold, when we have to
throttle new allocations.  We can also do it without taking global
arc_evict_lock, reducing the contention.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12279
2021-08-31 10:30:21 -07:00
Brian Behlendorf 44d3beb23d Update bug report template
- Remove the "SPL Version" line, the repositories have been merged
  since the 0.8 release and we no longer need to ask about this.

- Simply ask for the kernel version / patch level and add a hint
  about how to get this information on Linux and FreeBSD.

- Remove "Status: Triage Needed" from the template, in practice
  we really haven't been using this label so let's step setting it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #12340
2021-08-31 10:30:21 -07:00
George Wilson 28a15aec7c file reference counts can get corrupted
Callers of zfs_file_get and zfs_file_put can corrupt the reference
counts for the file structure resulting in a panic or a soft lockup.
When zfs send/recv runs, it will add a reference count to the
open file, and begin to send or recv the stream. If the file descriptor
is closed, then when dmu_recv_stream() or dmu_send() return we will
call zfs_file_put to remove the reference we placed on the file
structure. Unfortunately, because zfs_file_put() uses the file
descriptor to lookup the file structure, it may end up finding that
the file descriptor table no longer contains the file struct, thus
leaking the file structure. Or it might end up finding a file
descriptor for a different file and blindly updating its reference
counts. Other failure modes probably exists.

This change reworks the zfs_file_[get|put] interface to not rely
on the file descriptor but instead pass the zfs_file_t pointer around.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-issue: DLPX-76119
Closes #12299
2021-08-31 10:30:21 -07:00
Jorgen Lundman e44e8ecb93 dprintf_dnode: strcpy -> strlcpy
Missed a couple of strcpy() in earlier commit, this is only used with
--enable-debug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12311
2021-08-31 10:30:21 -07:00
Jorgen Lundman 4bcd45fd0e Replace strchrnul() with strrchr()
Could have gone either way with this one, either adding it to
macOS/Windows SPL, or returning it to "classic" usage with strrchr().
Since the new special way isn't really used, and only used once,
we have this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes  #12312
2021-08-31 10:30:21 -07:00
Alexander Motin e64528c33f FreeBSD: Use unmapped I/O for scattered/gang ABD buffers
Many FreeBSD disk drivers support "unmapped" I/O mode, when data
buffer represented not with a virtually contiguous KVA-mapped address
range, but with a list of physical memory pages.  Originally it was
designed to do I/O from buffers without KVA mapping (unmapped).  But
moving virtual addresses out of equation allows us to operate even
non-contiguous data buffers with one condition: all buffer discon-
tinuities must be aligned to memory page borders.

Doing I/O to capable GEOM device this patch traverses through non-
linear ABD buffers, validating the chunks borders.  If the condition
is met, it supplies GEOM with the list of original physical memory
pages instead of copying the data into temporary contiguous buffer.
On capable hardware on pools with ashift=12 and default ABD chunk of
4KB it should handle all the I/O without additional memory copying.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12320
2021-08-31 10:30:21 -07:00
Alexander Motin 21d7ba2a13 FreeBSD: Hardcode abd_chunk_size to PAGE_SIZE
It makes no sense to set it below PAGE_SIZE, since it increases all
overheads and makes returning memory to OS problematic.  It makes no
sense to set it above PAGE_SIZE, since such allocations and especially
frees are too expensive and cause KVA fragmentation to benefit from
fewer chunks.  After that it makes no sense to keep more complicated
math here.

What may have sense though is just a tunable border between linear and
scatter ABDs, previously also controlled by this tunable.  Retain that
functionality by taking abd_scatter_min_size tunable from Linux, just
with different default value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12328
2021-08-31 10:30:21 -07:00
Alexander Motin 3b3b893def Move gethrtime() calls out of vdev queue lock
This dramatically reduces the lock contention on systems with slower
(non-TSC) timecounters.  With TSC the difference is minimal, but since
this lock is pretty congested, any improvement counts.  Plus I don't
see any reason to do it under the lock other than the latency of the
lock itself, which this change actually reduces.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12281
2021-08-31 10:30:21 -07:00
Justin Gottula 37e70b1814 Use substantially more robust program exit status logic in zvol_id
Currently, there are several places in zvol_id where the program logic
returns particular errno values, or even particular ioctl return values,
as the program exit status, rather than a straightforward system of
explicit zero on success and explicit nonzero value(s) on failure.

This is problematic for multiple reasons. One particularly interesting
problem that can arise, is that if any of these values happens to have
all 8 least significant bits unset (i.e., it is a positive or negative
multiple of 256), then although the C program sees a nonzero int value
(presumed to be a failure exit status), the actual exit status as seen
by the system is only the bottom 8 bits of that integer: zero.

This can happen in practice, and I have encountered it myself. In a
particularly weird situation, the zvol_open code in the zfs kernel
module was behaving in such a manner that it caused the open() syscall
to fail and for errno to be set to a kernel-private value (ERESTARTSYS,
which happens to be defined as 512). It turns out that 512 is evenly
divisible by 256; or, in other words, its least significant 8 bits are
all-zero. So even though zvol_id believed it was returning a nonzero
(failure) exit status of 512, the system modulo'd that value by 256,
resulting in the actual exit status visible by other programs being 0!
This actually-zero (non-failure) exit status caused problems: udev
believed that the program was operating successfully, when in fact it
was attempting to indicate failure via a nonzero exit status integer.
Combined with another problem, this led to the creation of nonsense
symlinks for zvol dev nodes by udev.

Let's get rid of all this problematic logic, and simply return
EXIT_SUCCESS (0) is everything went fine, and EXIT_FAILURE (1) if
anything went wrong.

Additionally, let's clarify some of the variable names (error is similar
to errno, etc) and clean up the overall program flow a bit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-08-31 10:30:21 -07:00
Justin Gottula 8d21d1c033 Print zvol_id error messages to stderr rather than stdout
The zvol_id program is invoked by udev, via a PROGRAM key in the
60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/*
symlink paths paths that should be generated for each opaquely named
/dev/zd* dev node.

The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment
containing the %c substitution, to collect the program's stdout and then
"paste" it directly into the name of the symlink(s) to be created.

Unfortunately, as currently written, zvol_id outputs both its intended
output (a single string representing the symlink path that should be
created to refer to the name of the dataset whose /dev/zd* path is
given) AND its error messages (if any) to stdout.

When processing PROGRAM keys (and others, such as IMPORT{program}), udev
uses only the data written to stdout for functional purposes. Any data
written to stderr is used solely for the purposes of logging (if udev's
log_level is set to debug).

The unintended consequence of this is as follows: if zvol_id encounters
an error condition; and then udev fails to halt processing of the
current rule (either because zvol_id didn't return a nonzero exit
status, or because the PROGRAM key in the rule wasn't written properly
to result in a "non-match" condition that would stop the current rule on
a nonzero exit); then udev will create a space-delimited list of symlink
names derived directly from the words of the error message string!

I've observed this exact behavior on my own system, in a situation where
the open() syscall on /dev/zd* dev nodes was failing sporadically (for
reasons that aren't especially relevant here). Because the open() call
failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to
stdout and then exited.

The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume
name like pool/foo/bar, this would ordinarily expand to
   SYMLINK+="zvol/pool/foo/bar pool/foo/bar"
and would cause symlinks to be created like this:
   /dev/zvol/pool/foo/bar -> /dev/zd736
   /dev/pool/foo/bar      -> /dev/zd736

But because of the combination of error messages being printed to
stdout, and the udev syntax freely accepting a space-delimited sequence
of names in this context, the error message string
   "Unable to open device file: /dev/zd736\n"
in reality expanded to
   SYMLINK+="zvol/Unable to open device file: /dev/zd736"
which caused the following symlinks to actually be created:
   /dev/zvol/Unable -> /dev/zd736
   /dev/to          -> /dev/zd736
   /dev/open        -> /dev/zd736
   /dev/device      -> /dev/zd736
   /dev/file:       -> /dev/zd736
   /dev//dev/zd736  -> /dev/zd736

(And, because multiple zvols had open() syscall errors, multiple zvols
attempted to claim several of those symlink names, resulting in numerous
udev errors and timeouts and general chaos.)

This commit rectifies all this silliness by simply printing error
messages to stderr, as Dennis Ritchie originally intended.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-08-31 10:30:21 -07:00
Justin Gottula 771e6462bb Udev rules: use match (==) rather than assign (=) for PROGRAM
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM
key is really a match key, not an assign key. The internal logic used by
udev to decide whether a PROGRAM key "matched" or not (which determines
whether the remainder of the rule is evaluated) depends on whether the
operator was OP_MATCH (==) or OP_NOMATCH (!=). [1]

The man page claims that '"=", ":=", and "+=" have the same effect as
"=="' for PROGRAM keys. And, after a brief perusal, the udev source code
does seem to confirm that operators other than OP_MATCH (==) or
OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's
not entirely clear that this is definitely the case: anecdotal testing
seems to indicate that when OP_ASSIGN (=) is used, the program's exit
status is disregarded and the remainder of the rule is processed
regardless of whether it was, in fact, a successful exit.

The bottom line here is that, if zvol_id hits some snag and returns a
nonzero exit status, then we almost certainly do NOT want to continue on
with the rule and use whatever the stdout contents may have been to
mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure
and use the match (==) operator explicitly, to eliminate any possibility
that udev might do the wrong thing, and ensure that a nonzero exit
status will definitely short-circuit the rest of the rule, bypassing the
SYMLINK+= assignments.

[1]
udev,
 file src/udev/udev-rules.c,
  func udev_rule_apply_token_to_event,
   switch case TK_M_PROGRAM if r != 0 (nonzero exit status):
      return token->op == OP_NOMATCH;
   switch case TK_M_PROGRAM if r == 0 (zero exit status):
      return token->op == OP_MATCH;
   func retval 0 => key is considered to have matched
   func retval 1 => key is considered to have NOT matched

[2]
udev,
 file src/udev/udev-rules.c,
  func parse_token,
   at func start:
      bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH);
   in else-if case streq(key, "PROGRAM"):
      if (!is_match) op = OP_MATCH;

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-08-31 10:30:21 -07:00
Justin Gottula a3960439aa Udev rules: replace deprecated $tempnode with $devnode
The $tempnode substitution is so old that it's not even mentioned in the
man page anymore. It is still technically supported by udev, but with
plenty of "deprecated" comments surrounding it.

The preferred modern equivalent of $tempnode is $devnode (or
alternatively, %N).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-08-31 10:30:21 -07:00
Justin Gottula 8145b62c47 Udev rules: use non-ancient comma syntax
This file is old as dirt. It's entirely possible that commas were
optional in udev back at that time. But they're definitely supposed to
be there nowadays.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-08-31 10:30:21 -07:00
Alexander Motin 917f13a7c2 Compact dbuf/buf hashes and lock arrays
With default dbuf cache size of 1/32 of ARC, it makes no sense to have
hash table of the same size (or even bigger on Linux).  Reduce it to
1/8 of ARC's one, still leaving some slack, assuming higher I/O rate
via dbuf cache than via ARC.

Remove padding from ARC hash locks array.  The idea behind padding
is to avoid false sharing between locks.  It would have sense if
there would be a limited number of very busy locks.  But since we
have no limit on the number, using the same memory for more locks we
can achieve even lower lock contention with the same false sharing,
or we can use less memory for the same contention level.

Reduce number of hash locks from 8192 to 2048.  The number is still
big enough to not cause contention, but reduced memory size improves
cache hit rate for mutex_tryenter() in ARC eviction thread, saving
about 1% of the thread time.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12289
2021-08-31 10:30:21 -07:00
Jorgen Lundman 802eff14eb Fix abd leak, kmem_free correct size of abd_t
Fix a leak of abd_t that manifested mostly when using
raidzN with at least as many columns as N (e.g. a
four-disk raidz2 but not a three-disk raidz2).
Sufficiently heavy raidz use would eventually run a system
out of memory.

Additionally:

* Switch abd_cache arena to FIRSTFIT, which empirically
improves perofrmance.

* Make abd_chunk_cache more performant and debuggable.

* Allocate the abd_zero_buf from abd_chunk_cache rather
than the heap.

* Don't try to reap non-existent qcaches in abd_cache arena.

* KM_PUSHPAGE->KM_SLEEP when allocating chunks from their
own arena

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Co-authored-by: Sean Doran <smd@use.net>
Closes #12295
2021-08-31 10:30:21 -07:00
Jorgen Lundman ccf3122a4b Upstream: dmu_zfetch_stream_fini leaks refcount
dmu_zfetch_stream_fini() is missing calls to destroy the refcounts,
leaking them and the mutex inside.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12294
2021-08-31 10:30:21 -07:00
Alexander Motin c794ccdcb2 Remove refcount from spa_config_*()
The only reason for spa_config_*() to use refcount instead of simple
non-atomic (thanks to scl_lock) variable for scl_count is tracking,
hard disabled for the last 8 years.  Switch to simple int scl_count
reduces the lock hold time by avoiding atomic, plus makes structure
fit into single cache line, reducing the locks contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12287
2021-08-31 10:30:21 -07:00
Ryan Moeller 307f7472a5 ZED: Match added disk by pool/vdev GUID if found (#12217)
This enables ZED to auto-online vdevs that are not wholedisk managed by
ZFS.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2021-08-31 10:30:21 -07:00
Kevin Bowling ec6a6e8691
Detect HAVE_LARGE_STACKS at compile time (#12384)
Move HAVE_LARGE_STACKS definitions to header and set when appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Kevin Bowling <kbowling@FreeBSD.org>
Closes #12350
2021-08-19 12:17:08 -07:00
Brian Behlendorf 4f92fe0f5c Tag 2.1.0
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-07-02 11:04:34 -07:00
Brian Behlendorf 508fff0e4b Tag 2.1.0-rc8
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-06-29 13:19:19 -07:00
Brian Behlendorf d3942963ee Linux 5.13 compat: META
Increase the Linux-Maximum version in the META file to 5.13.
All of the required compatibility patches have been merged
and the 5.13 kernel has been officially released.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-06-29 13:18:58 -07:00
Laurențiu Nicola 0584ad8f94 zed: fix sending emails (#12292)
Commit 6fc3099 broke the quoting when invoking the mail program, revert
that change.

Signed-off-by: Laurențiu Nicola <lnicola@dend.ro>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2021-06-29 13:15:21 -07:00
Alexander Motin ea47857090 Avoid 64bit division in multilist index functions
The number of sublists in a multilist is relatively small. We dont need
64 bits to calculate an index. 32 bits is sufficient and makes the
code more efficient.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> 
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12288
2021-06-29 13:15:04 -07:00
Michal Vasilek 4ebda5d4d3 Fix plymouth passphrase prompt with dracut
plymouth --command splits the command on spaces which means
that zfs-load-key was getting the filesystem name enclosed
in single quotes (since 13c59bb76) and failing. This commit
fixes it by piping the password directly to the command
similar to how it's done in other scripts (initramfs,
dracut without plymouth).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Michal Vasilek <michal@vasilek.cz>
Related-to: #9193
Related-to: #9202
Closes #12147
2021-06-29 13:14:49 -07:00
Rich Ercolani 57ce66d293 Fix build with KASAN
The stock zstd code expects some helpers from ASAN if present.
This works fine in userland, but in kernel, KASAN also gets detected,
and lacks those helpers. So let's make some empty substitutes for
that case.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12232
2021-06-29 13:14:42 -07:00
Alexander Motin da6c288dfc Help compiller optimize out abd_verify()
While abd_verify() does nothing when built without debug, compiler
can't optimize it out by itself due to calls to external list_*()
and abd_verify_scatter().  This commit makes it explicit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12280
2021-06-29 13:14:20 -07:00
Brian Behlendorf aee26af277 Update cache file when setting compatibility property
Unlike most other properties the 'compatibility' property is stored
in the pool config object and not the DMU_OT_POOL_PROPS object.

This had the advantage that the compatibility information is available
without needing to fully import the pool (it can be read with zdb).
However, this means we need to make sure to update both the copy of
the config in the MOS and the cache file.  This wasn't being done.

This commit adds a call to spa_async_request() to ensure the copy of
the config in the cache file gets updated as well as the one stored
in the pool.  This same change is made for the 'comment' property
which suffers from the same inconsistency.

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12261 
Closes #12276
2021-06-24 14:33:51 -07:00
Paul Dagnelie 620389d32f Fix flag copying in resume case
A couple flags weren't being copied in the case where we're doing size
estimation on a resume.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes: #12266
2021-06-24 13:14:04 -07:00
jumbi77 5d7f47d828 zfs_metaslab_mem_limit should be 25 instead of 75
According to current zfs man page zfs_metaslab_mem_limit should be
25 instead of 75.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: jumbi77@users.noreply.github.com
Closes #12273
2021-06-24 13:13:56 -07:00
Rich Ercolani 977540168c Stop using "zstreamdump" in tests/
zstreamdump was replaced with "zstream dump"; let's stop using the
old name, compat symlink or no.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12277
2021-06-24 13:13:49 -07:00
Jonathon 1143d3d29d Update libera webchat client URL
Libera have made a webchat client available. This change builds on #12127.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
Closes #12251
2021-06-24 13:13:44 -07:00
Attila Fülöp 088712793e gcc 11 cleanup
Compiling with gcc 11.1.0 produces three new warnings.
Change the code slightly to avoid them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12130
Closes #12188
Closes #12237
2021-06-24 13:13:40 -07:00
Brian Behlendorf e0886c96a8 ZTS: Add known exceptions
The receive-o-x_props_override test case reliably fails on the
FreeBSD main builders (but not on Linux), until the root cause is
understood add this test to the FreeBSD exception list.

On Linux the alloc_class_012_pos test case may occasionally fail.
This is a known false positive which has also been added to the
Linux exception list until the test can be made entirely reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12272
2021-06-24 13:12:52 -07:00
Rich Ercolani 5e89181544 Annotated dprintf as printf-like
ZFS loves using %llu for uint64_t, but that requires a cast to not 
be noisy - which is even done in many, though not all, places.
Also a couple places used %u for uint64_t, which were promoted
to %llu. 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12233
2021-06-24 13:12:36 -07:00
Antonio Russo dfbda2465f Revert Consolidate arc_buf allocation checks
This reverts commit 13fac09868.

Per the discussion in #11531, the reverted commit---which intended only
to be a cleanup commit---introduced a subtle, unintended change in
behavior.

Care was taken to partially revert and then reapply 10b3c7f5e4
which would otherwise have caused a conflict.  These changes were
squashed in to this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Suggested-by: @chrisrd
Suggested-by: robn@despairlabs.com
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11531 
Closes #12227
2021-06-24 13:12:25 -07:00
Alexander Motin 6b239d1757 Use wmsum for arc, abd, dbuf and zfetch statistics. (#12172)
wmsum was designed exactly for cases like these with many updates
and rare reads.  It allows to completely avoid atomic operations on
congested global variables.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12172
2021-06-24 13:10:59 -07:00
наб 9a865b7fb7 libspl: implement atomics in terms of atomics
This replaces the generic libspl atomic.c atomics implementation
with one based on builtin gcc atomics.  This functionality was added
as an experimental feature in gcc 4.4.  Today even CentOS 7 ships
with gcc 4.8 as the default compiler we can make this the default.

Furthermore, the builtin atomics are as good or better than our
hand-rolled implementation so it's reasonable to drop that custom code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11904
Closes #12252
Closes #12244
2021-06-21 21:48:31 -07:00
George Amanakis 3de7aeb68a Avoid deadlock when removing L2ARC devices under I/O
In case we have I/O and try to remove an L2ARC device a deadlock might
occur. arc_read()->zio_read()->zfs_blkptr_verify() waits for SCL_VDEV
to be dropped while holding the hash_lock. However, spa_l2cache_load()
holds SCL_ALL and waits for the hash_lock in l2arc_evict().

Fix this by moving zfs_blkptr_verify() to the top top arc_read() before
the hash_lock is taken. Verify the block pointer and return a checksum
error if damaged rather than halting the system, by using
BLK_VERIFY_LOG instead of BLK_VERIFY_HALT.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12054
2021-06-17 13:35:29 -07:00
наб ff64096e75 systemd: import: expand $ZPOOL_IMPORT_OPTS correctly
Turns out $ZPOOL_IMPORT_OPTS expands in a shell-like fashion,
yielding 'import' '-aN' '-o' 'cachefile=none' for an unset variable,
and 'import' '-aN' '-o' 'cachefile=none' 'word1' 'word2' for a
white-spaced one, but ${ZPOOL_IMPORT_OPTS} expands like "${Z_I_O}"
would in a shell, yielding 'import' '-aN' '-o' 'cachefile=none' ''
(empty) and 'import' '-aN' '-o' 'cachefile=none' 'word1 word2' (spaced)

Fixes eec5ba113e "dracut: 90zfs: respect
zfs_force=1 on systemd systems"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes: #12231
2021-06-15 16:56:53 -07:00
Matthew Ahrens 7e2212990b vdev_draid_min_asize() ignores reserved space
vdev_draid_min_asize() returns the minimum size of a child vdev.  This
is used when determining if a disk is big enough to replace a child.
It's also used by zdb to determine how big of a child to make to test
replacement.

vdev_draid_min_asize() says that the child’s asize has to be at least
1/Nth of the entire draid’s asize, which is the same logic as raidz.
However, this contradicts the code in vdev_draid_open(), which
calculates the draid’s asize based on a reduced child size:

  An additional 32MB of scratch space is reserved at the end of each
  child for use by the dRAID expansion feature

So the problem is that you can replace a draid disk with one that’s
vdev_draid_min_asize(), but it actually needs to be larger to accommodate
the additional 32MB.  The replacement is allowed and everything works at
first (since the reserved space is at the end, and we don’t try to use
it yet), but when you try to close and reopen the pool,
vdev_draid_open() calculates a smaller asize for the draid, because of
the smaller leaf, which is not allowed.

I think the confusion is that vdev_draid_min_asize() is correctly
returning the amount of required *allocatable* space in a leaf, but the
actual *size* of the leaf needs to be at least 32MB more than that.
ztest_vdev_attach_detach() assumes that it can attach that size of
device, and it actually can (the kernel/libzpool accepts it), but it
then later causes zdb to not be able to open the pool.

This commit changes vdev_draid_min_asize() to return the required size
of the leaf, not the size that draid will make available to the metaslab
allocator.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11459
Closes #12221
2021-06-15 16:56:42 -07:00
Paul Zuchowski bd83c1e0c6 Do not hash unlinked inodes
In zfs_znode_alloc we always hash inodes.  If the
znode is unlinked, we do not need to hash it.  This
fixes the problem where zfs_suspend_fs is doing zrele
(iput) in an async fashion, and zfs_resume_fs unlinked
drain processing will try to hash an inode that could
still be hashed, resulting in a panic.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9741
Closes #11223
Closes #11648
Closes #12210
2021-06-15 16:56:19 -07:00
Rich Ercolani a416b29e16 Added uncompress requirement
Having an old enough version of "file" and no "uncompress" program
installed can cause rpmbuild as root to crash and mangle rpmdb.

So let's add a build dependency for RPM-based systems.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12071
Closes: #12168
2021-06-15 16:55:49 -07:00
Brian Behlendorf af4b6f7dab ZTS: Add zfs_clone_livelist_dedup.ksh to Makefile.am
Commit 86b5f4c12 added a new zfs_clone_livelist_dedup.ksh test case
but didn't include it in the Makefile.am.  This results in the test
not being included in the dist tarball so it's never run by the CI.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #12224
2021-06-15 16:55:31 -07:00
Brian Behlendorf c3b60ededa Tag 2.1.0-rc7
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-06-10 10:50:16 -07:00
Alexander Motin 57196f8ae9 Re-embed multilist_t storage
This commit partially reverts changes to multilists in PR 7968
(multi-threaded spa-sync()) and adds some cache line alignments to
separate read-only multilists and heavily modified refcount's to
different cache lines.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-by: iXsystems, Inc.
Closes #12158
2021-06-10 10:50:16 -07:00
наб 8dc540ae16 dracut: 90zfs: respect zfs_force=1 on systemd systems
On systemd systems provide an environment generator in order
to respect the zfs_force=1 kernel command line option.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11403
Closes #12195
2021-06-10 10:50:16 -07:00
Alexander Motin efdfb14fc8 Remove pool io kstats
This mostly reverts "3537 want pool io kstats" commit of 8 years ago.

From one side this code using pool-wide locks became pretty bad for
performance, creating significant lock contention in I/O pipeline.
From another, there are more efficient ways now to obtain detailed
statistics, while this statistics is illumos-specific and much less
usable on Linux and FreeBSD, reported only via procfs/sysctls.

This commit does not remove KSTAT_TYPE_IO implementation, that may
be removed later together with already unused KSTAT_TYPE_INTR and
KSTAT_TYPE_TIMER.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12212
2021-06-10 10:50:16 -07:00
Rich Ercolani 2e158b0e0b Added error for writing to /dev/ on Linux
Starting in Linux 5.10, trying to write to /dev/{null,zero} errors out.
Prefer to inform people when this happens rather than hoping they guess
what's wrong.

Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes:  #11991
2021-06-10 10:50:16 -07:00
наб 4cf3e48a3b libzfs: format safety
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116
2021-06-10 10:50:16 -07:00
наб 4e0fff2e02 zgenhostid.8: revisit
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212
2021-06-10 10:50:16 -07:00
наб 14973b917c Consistentify miscellaneous style on remaining manpages
Most notably this fixes the vdev_id(8) non-.Xrs in vdev_id.conf.5

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212
2021-06-10 10:50:16 -07:00
наб a444efb6d7 Move properties, parameters, events, and concepts around manual sections
The pages moved as follows:
  zpool-features.{5 => 7}
  spl{-module-parameters.5 => .4}
  zfs{-module-parameters.5 => .4}
  zfs-events.5 => into zpool-events.8
  zfsconcepts.{8 => 7}
  zfsprops.{8 => 7}
  zpoolconcepts.{8 => 7}
  zpoolprops.{8 => 7}

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Co-authored-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org>
Closes #12149
Closes #12212
2021-06-10 10:50:16 -07:00
наб 0bef46e6d5 man: use one Makefile, use OpenZFS for .Os
The prevailing style is to use either nothing, or the originating
organisational umbrella (here: OpenZFS), and these aren't Linux manpages

This also deduplicates the substitution code, and makes adding/removing
sexions simpler in future

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12212
2021-06-10 10:50:16 -07:00
Brian Behlendorf 1180d61152 Fix minor shellcheck 0.7.2 warnings
The first warning of a misspelling is a false positive, so we annotate
the script accordingly.  As for the x-prefix warnings update the check
to use the conventional '[ -z <string> ]' syntax.

all-syslog.sh:46:47: warning: Possible misspelling: ZEVENT_ZIO_OBJECT
    may not be assigned, but ZEVENT_ZIO_OBJSET is. [SC2153]
make_gitrev.sh:53:6: note: Avoid x-prefix in comparisons as it no
    longer serves a purpose [SC2268]
man-dates.sh:10:7: note: Avoid x-prefix in comparisons as it no
    longer serves a purpose [SC2268]

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12208
2021-06-10 10:50:16 -07:00
наб 6ce97bb4a2 zed.d/history_event-zfs-list-cacher.sh.in: parallelise, simplify
This:
  (a) improves the error log message,
  (b) locks per pool instead of globally,
  (c) locks the actual output file instead of /var/lock/zfs-list,
      which would otherwise linger there forever (well, still will,
      but you can remove it and it won't come back), and
  (d) preserves attributes of the output file
      instead of reverting them to 0:0 644

It is imperative that the previous commit
("zed-functions.sh: zed_lock(): don't truncate lock")
be included in any series that contains this one

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-09 13:06:39 -07:00
наб 27d3cc6cd3 zed.d/all-debug.sh: simplify
By locking the log file itself, we can omit arduous rebinding and
explicit umask setting, but, perhaps more importantly, avoid permanently
littering /var/lock/ with zed.debug.log.lock we will never delete

It is imperative that the previous commit
("zed-functions.sh: zed_lock(): don't truncate lock")
be included in any series that contains this one

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-09 13:06:09 -07:00
наб d6a0cecab1 zed-functions.sh: zed_lock(): don't truncate lock
By appending instead of truncating, we can lock on any file (with write
permissions) instead of only dedicated lock files, since the locking
process itself no longer alters the file in any way

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-09 13:06:05 -07:00
Alan Somers 2c3d7283b4 libzfs: On FreeBSD, use MNT_NOWAIT with getfsstat
`getfsstat(2)` is used to retrieve the list of mounted file systems,
which libzfs uses when fetching properties like mountpoint, atime,
setuid, etc.  The `mode` parameter may be `MNT_NOWAIT`, which uses
information in the VFS's cache, or `MNT_WAIT`, which effectively does a
`statfs` on every single mounted file system in order to fetch the most
up-to-date information.  As far as I can tell, the only fields that
libzfs cares about are the filesystem's name, mountpoint, fstypename,
and mount flags.  Those things are always updated on mount and unmount,
so they will always be accurate in the VFS's mount cache except in two
circumstances:

1) When a file system is busy unmounting
2) When a ZFS file system changes the value of a mount-overridable
   property like atime or setuid, but doesn't remount the file system.
   Right now that only happens when the property is changed by an
   unprivileged user who has delegated authority to change the property
   but not to mount the dataset.  But perhaps libzfs could choose to do
   it for other reasons in the future.

Switching to `MNT_NOWAIT` will greatly improve speed with no downside,
as long as we explicitly update the mount cache whenever we change a
mount-overridable property.

For comparison, Illumos gets this information using the native
`getmntany` and `getmntent` functions, which also use cached
information.  The illumos function that would refresh the cache,
`resetmnttab`, is never called by libzfs.

And on GNU/Linux, `getmntany` and `getmntent` don't even communicate
with the kernel directly.  They simply parse the file they are given,
which is usually /etc/mtab or /proc/mounts.  Perhaps the implementation
of /proc/mounts is synchronous, ala MNT_WAIT; I don't know.

Sponsored-by:	Axcient
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Closes: #12091
2021-06-09 13:05:34 -07:00
наб d7e6f293da Modernise/fix/rewrite unlinted manpages
zpool-destroy.8: flatten, fix description
zfs-wait.8: flatten, fix description, use list for events
zpool-reguid.8: flatten, fix description
zpool-history.8: flatten, fix description
zpool-export.8: flatten, fix description, remove -f "unmount" reference
  AFAICT no such command exists even in Illumos (as of today, anyway),
  and we definitely don't call it
zpool-labelclear.8: flatten, fix description
zpool-features.5: modernise
spl-module-parameters.5: modernise
zfs-mount-generator.8: rewrite
zfs-module-parameters.5: modernise

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12169
2021-06-09 13:05:34 -07:00
Rich Ercolani 2f23f0f940 Force --enable-debug on FreeBSD if INVARIANTS is set
There's already logic to force INVARIANTS on for building if it's
present in the running kernel; however, not having DEBUG enabled
when DEBUG and INVARIANTS are can cause strange panics.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12185
Closes #12163
2021-06-09 13:05:34 -07:00
Serapheim Dimitropoulos a377bde727 Livelist logic should handle dedup blkptrs
Update the logic to handle the dedup-case of consecutive
FREEs in the livelist code. The logic still ensures that
all the FREE entries are matched up with a respective
ALLOC by keeping a refcount for each FREE blkptr that we
encounter and ensuring that this refcount gets to zero
by the time we are done processing the livelist.

zdb -y no longer panics when encountering double frees

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11480
Closes #12177
2021-06-09 13:05:34 -07:00
Alexander Motin e76373de7b More aggsum optimizations
- Avoid atomic_add() when updating as_lower_bound/as_upper_bound.
Previous code was excessively strong on 64bit systems while not
strong enough on 32bit ones.  Instead introduce and use real
atomic_load() and atomic_store() operations, just an assignments
on 64bit machines, but using proper atomics on 32bit ones to avoid
torn reads/writes.

 - Reduce number of buckets on large systems.  Extra buckets not as
much improve add speed, as hurt reads.  Unlike wmsum for aggsum
reads are still important.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12145
2021-06-09 13:05:34 -07:00
наб b05ae1a82a libzfs: write_inuse_diffs_one: format strerror() with "%s"
Fixes 50353dbd ("Let zfs diff be more  permissive") which accidentally
introduced a build warning.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12197
2021-06-09 13:05:34 -07:00
наб b12a6d961c i-t: don't try to import from empty cache
Chases 7c64ee9e77
 ("zfs-import-{cache,scan}: change condition to FileNotEmpty")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12108
2021-06-09 13:05:34 -07:00
наб 11b165cc9b Use %%/* instead of awk -F/ {print $1} to strip datasets
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12108
2021-06-09 13:05:34 -07:00
наб 2dde9202d9 dracut: 90zfs: zfs-load-key: don't load unencrypted bootfs' keylocation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11800
Closes #12108
2021-06-09 13:05:34 -07:00
наб 998035d534 dracut: 90zfs: module-setup: try /lib*/libgcc_s.so*, relax /u/l/gcc path
SUSE stores the library at /lib64/libgcc_s.so.1 (/lib/libgcc_s.so.1 for
i686 glibc), which is in the search path

Also relax the /usr/lib path to catch systems similar to SUSE
(/usr/lib64/gcc/x86_64-suse-linux/10/libgcc_s.so) but without
the top-level lib64

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11750
Closes #12108
2021-06-09 13:05:34 -07:00
Rich Ercolani 64e38df237 Let zfs diff be more permissive
In the current world, `zfs diff` will die on certain kinds of errors
that come up on ordinary, not-mangled filesystems - like EINVAL,
which can come from a file with multiple hardlinks having the one
whose name is referenced deleted.

Since it should always be safe to continue, let's relax about all
error codes - still print something for most, but don't immediately
abort when we encounter them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12072
2021-06-09 13:05:34 -07:00
jharmening d3dddbaa20 FreeBSD: incorporate changes to the VFS_QUOTACTL(9) KPI
VFS_QUOTACTL(9) has been updated to allow each filesystem to indicate
whether it has changed the busy state of the mount.  The filesystem
may still assume that its .vfs_quotactl entrypoint is always called
with the mount busied, but only needs to unbusy the mount (and clear
*mp_busy) if it does something that actually requires the mount to be
unbusied.  It no longer needs to blindly copy-paste the UFS protocol
for calling vfs_unbusy(9) for the Q_QUOTAOFF and Q_QUOTAON commands.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jason Harmening <jason.harmening@gmail.com>
Closes #12052
2021-06-09 13:05:34 -07:00
Ryan Moeller e298695809 Fix error check in nvlist_print_json_string
Move check for errors from mbrtowc() into the loop.  The error values
are not actually negative, so we don't break out of the loop when they
are encountered.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12175
Closes #12176
2021-06-09 13:05:34 -07:00
наб c650ceb64d Lint most manpages
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12129
2021-06-09 13:05:34 -07:00
наб 13c9a41140 mancheck: accept lints, accept lint overrides
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12129
2021-06-09 13:05:34 -07:00
Brian Behlendorf af80160f48 Linux: Set spl_kmem_cache_slab_limit when page size !4K
For small objects the kernel's slab implementation is very fast and
space efficient. However, as the allocation size increases to
require multiple pages performance suffers. The SPL kmem cache
allocator was designed to better handle these large allocation
sizes. Therefore, on Linux the kmem_cache_* compatibility wrappers
prefer to use the kernel's slab allocator for small objects and
the custom SPL kmem cache allocator for larger objects.

This logic was effectively disabled for all architectures using
a non-4K page size which caused all kmem caches to only use the
SPL implementation. Functionally this is fine, but the SPL code
which calculates the target number of objects per-slab does not
take in to account that __vmalloc() always returns page-aligned
memory. This can result in a massive amount of wasted space when
allocating tiny objects on a platform using large pages (64k).

To resolve this issue we set the spl_kmem_cache_slab_limit cutoff
to 16K for all architectures. 

This particular change does not attempt to update the logic used
to calculate the optimal number of pages per slab. This remains
an issue which should be addressed in a future change.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12152
Closes #11429
Closes #11574
Closes #12150
2021-06-09 13:05:34 -07:00
Colm db9048741b A couple of small style cleanups
In `zpool_load_compat()`:

  * initialize `l_features[]` with a loop rather than a static
    initializer.

  * don't redefine system constants; use private names instead

Rationale here:

When an array is initialized using a static {foo}, only the specified
members are initialized to the provided values, the rest are
initialized to zero. While B_FALSE is of course zero, it feels
unsafe to rely on this being true forever, so I'm inclined to sacrifice
a few microseconds of runtime here and initialize using a loop.

When looking for the correct combination of system constants to use
(in open() and mmap()), I prefer to use private constants rather than
redefining system ones; due to the small chance that the system
ones might be referenced later in the file. So rather than defining
O_PATH and MAP_POPULATE, I use distinct constant names.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Colm Buckley <colm@tuatha.org>
Closes #12156
2021-06-09 13:05:34 -07:00
наб b1b1faf0f1 zfs-module-parameters.5: remove nonexistent parameters
zfs_arc_overflow_shift was never a parameter:
ca0bf58d65 ("Illumos 5497 - lock
contention on arcs_mtx") is the only result in
git log -Soverflow_shift, and it wasn't exposed then, nor is it now

zfs_read_chunk_size was renamed to zfs_vnops_read_chunk_size in
e53d678d4a ("Share zfs_fsync, zfs_read,
zfs_write, et al between Linux and FreeBSD")

zio_decompress_fail_fraction was never a parameter: it was added in
c3bd3fb4ac ("OpenZFS 9403 - assertion
failed in arc_buf_destroy()") as a developer aid for setting in zdb, but
it's a dangerous test tunable and has no place in public documentation,
(not to mention that it obviously doesn't work):
> Although this did uncover a few low priority issues, this
  unfortuantely also causes ztest to ASSERT in many locations where the
  code is working correctly since it is designed to fail on IO errors.
  Developers can manually set this variable with the '-o' option to find
  and debug issues.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12157
2021-06-09 13:05:34 -07:00
наб 2fe8060cee spl-module-parameters.5: remove spl_kmem_cache_{expire,obj_per_slab_min}
Both were removed in 4fbdb10c7b ("remove
kmem_cache module parameter KMC_EXPIRE_AGE")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12157
2021-06-09 13:05:34 -07:00
Rich Ercolani 6e01b47a28 Quick fixes for two ZTS failures
On FreeBSD 14, these two tests started erroring out like the
objects they're attempting to examine don't exist.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12165
2021-06-09 13:05:34 -07:00
Rich Ercolani cc9dae2404 Added another missed case to arc_summary3
It turns out that sometimes, evidently only when run inside the
ZTS handler, arc_summary3 | head > /dev/null will die with ENOTCONN,
and ruin the test run.

Added handling for that.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12160
2021-06-09 13:05:34 -07:00
grembo 1fb18c4bde FreeBSD boot code reminder after zpool upgrade
There used to be a warning after upgrading a zpool in FreeBSD, so users
won't forget to update the boot loader that pool is booted from.

This change brings this warning back, but only if the bootfs property
is set on the pool, which should be sufficient for the vast majority of
FreeBSD installations. People running something custom are most likely
aware of what to do after an upgrade in their specific environment.

Functionality is implemented in an OS specific helper function.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Michael Gmelin <grembo@FreeBSD.org>
Signed-off-by: Michael Gmelin <grembo@FreeBSD.org>
Closes #12099
Closes #12104
2021-06-09 13:05:34 -07:00
Rich Ercolani a0c055cfd3 Remove iov_iter_advance() for iter_write
The additional iter advance is incorrect, as copy_from_iter() has
already done the right thing.  This will result in the following
warning being printed to the console as of the 5.12 kernel.

    Attempted to advance past end of bvec iter

This change should have been included with #11378 when a
similar change was made on the read side.

Suggested-by: @siebenmann
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Issue #11378
Closes #12041
Closes #12155
2021-06-09 13:05:34 -07:00
наб 91bb2e91bd Turn checkbashisms into a make target
make_gitrev.sh actually breaks checkbashisms' parser,
which /insists/ that the end-of-line " is actually a string start

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12101
2021-06-09 13:05:34 -07:00
наб 132240507d Turn shellcheck into a normal make target. Fix new files it caught
This checks every file it checked (and a few more),
but explicitly instead of "if it works it works" best-effort
(which wasn't that good anyway)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #10512
Closes #12101
2021-06-09 13:05:34 -07:00
наб d6f8f41c21 udev/rules.d: .gitignore: glob all rules
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12101
2021-06-09 13:05:34 -07:00
наб 2f5c39137a i-t: don't suggest zpool-import with altroot to /root
This *will fail* when remounted by the real root

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12148
2021-06-09 13:05:34 -07:00
наб 90d0023005 i-t: let rootdelay= set $ZFS_INITRD_PRE_MOUNTROOT_SLEEP
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Ref: https://github.com/openzfs/zfs/issues/11420#issuecomment-850338673
Closes #11663
Closes #12148
2021-06-09 13:05:34 -07:00
наб dbfbc1e524 zstream: force-install zstreamdump link
Accidentally introduced by commit dd00925e8d.

Force-install the zstreamdump link, this is a supported configuration
and the install should not fail if it needs to overwrite an existing
file.

Also cd to work around some funny platforms as noted in AC_PROG_LN_S doc

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12143
2021-06-09 13:05:34 -07:00
наб a61c502907 Widen mancheck to all of man and test-runner
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб f2e890ddfa test-runner.1: modernise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 71f6e6480e zfs-events.5: modernise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб a1b1007e95 vdev_id.conf.5: modernise
Also yeet pci_slot since it doesn't seem to exist?

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 01918aa3f4 man: use Nm/Cm/Fl consistently
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб e1ae0c4b99 zed.8: modernise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 475ee3bc2b cstyle.1: modernise
Also remove note about the OS/Net consolidation, now the illumos gate

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб f6ce806298 vdev_id.8: modernise, note scsi topology
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб ea034765b4 zhack.1: modernise
The spacing on zhack    feature stat    pool is a bit iffy(?)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 9d7c10388a zpool_influxdb.8: modernise
Also rip out the section about potentially including in the OpenZFS
distribution and simplify -e description

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб f67a920a6d zinject.8: modernise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 5c7c2b1301 raidz_test.1: modernise
Also re-add articles left out by the slav who wrote this

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 5d08b105a2 zpoolprops.8: fix spacing in ashift
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 55b6d4a3d8 fsck.zfs.8: modernise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб a0372059b1 arcstat.1: modernise
Also slim down the description a tad

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 68a310dc45 ztest.1: modernise
I fixed a few typos, but avoided changing anything beyond that;
the sould of the document should be preserved

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
наб 8ed04625db zgenhostid.8: use single-line indent macro for single-line examples
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-06-09 13:05:34 -07:00
Manoj Joseph c3c65e3cfb long options for ztest
This change introduces long options for ztest. It builds the usage
message as well as the long_options array from a single table. It also
adds #defines for the default values.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Manoj Joseph <manoj.joseph@delphix.com>
Closes #12117
2021-06-09 13:05:34 -07:00
Paul Dagnelie eb9d335009 Don't direct to freenode in issue template
While Libera doesn't yet have a webchat client, we should at least
direct them to the right network. Once a webchat client is available,
we can direct them to it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12127
2021-06-09 13:05:34 -07:00
Alexander Motin 85c43508f3 Introduce write-mostly sums
wmsum counters are a reduced version of aggsum counters, optimized for
write-mostly scenarios.  They do not provide optimized read functions,
but instead allow much cheaper add function.  The primary usage is
infrequently read statistic counters, not requiring exact precision.

The Linux implementation is directly mapped into percpu_counter KPI.
The FreeBSD implementation is directly mapped into counter(9) KPI.
In user-space due to lack of better implementation mapped to aggsum.

Unfortunately neither Linux percpu_counter nor FreeBSD counter(9)
provide sufficient functionality to completelly replace aggsum, so
it still remains to be used for several hot counters.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12114
2021-06-09 13:05:34 -07:00
Alexander Motin 09c0a8fd1a Improve scrub maxinflight_bytes math.
Previously, ZFS scaled maxinflight_bytes based on total number of
disks in the pool.  A 3-wide mirror was receiving a queue depth of 3
disks, which it should not, since it reads from all the disks inside.
For wide raidz the situation was slightly better, but still a 3-wide
raidz1 received a depth of 3 disks instead of 2.

The new code counts only unique data disks, i.e. 1 disk for mirrors
and non-parity disks for raidz/draid.  For draid the math is still
imperfect, since vdev_get_nparity() returns number of parity disks
per group, not per vdev, but still some better than it was.

This should slightly reduce scrub influence on payload for some pool
topologies by avoiding excessive queuing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored-By:	iXsystems, Inc.
Closing #12046
2021-06-08 14:50:57 -07:00
наб ec3b25825e etc/systemd/zfs-mount-generator: output tweaks
git-diff--w-dirty, but:
  * zfs-load-key-$DSET.service -> zfs-load-key@$DSET.service
  * flattened set -eu into other /bin/sh flags
  * simpler (for 1 2 3 vs while [ counter ]; counter+=1) prompt loop
  * exec $ZFS where applicable

Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11915
Closes #11917
2021-06-08 14:50:42 -07:00
наб 0382362ce0 etc/systemd/zfs-mount-generator: rewrite in C
A plain rewrite of the shell version, and generates identical
units, save for replacing some empty lines with nothing, having fewer
meaningless spaces in After=s and different spacing in the lock scripts,
for a clean git diff -w

This is a gain of anywhere from 0m0.336s vs 0m0.022s (15.27x)
to 0m0.202s vs 0m0.006s (33.67x), depending on the hardware,
a.k.a. from "absolutely unusable" to "perfectly fine"

This also properly deals with canmount=noauto units across multiple
pools

See PR for detailed timings (of an early version) and diffs

Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11915
Closes #11917
2021-06-08 14:50:38 -07:00
наб 0458070928 zstreamdump: replace with link to zstream
zstreamdump(8) was in quite a bad state,
and the wrapper didn't work if invoked without /sbin in $PATH

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12015
2021-06-08 14:48:58 -07:00
наб 1fc5f8cbfd d/zfsutils.zfs.init derivatives: shellcheck, fix header
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-08 14:47:24 -07:00
наб 263b3d64ab contrib/bash_completion.d: fix obvious shellcheck problems
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-08 14:47:13 -07:00
наб c1a64be6d4 zgenhostid: use argument path directly
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-08 14:47:05 -07:00
наб e40ffed021 Trim excess shellcheck annotations. Widen to all non-Korn scripts
Before, make shellcheck checked
  scripts/{commitcheck,make_gitrev,man-dates,paxcheck,zfs-helpers,zfs,
           zfs-tests,zimport,zloop}.sh
  cmd/zed/zed.d/{{all-debug,all-syslog,data-notify,generic-notify,
                 resilver_finish-start-scrub,scrub_finish-notify,
                 statechange-led,statechange-notify,trim_finish-notify,
                 zed-functions}.sh,history_event-zfs-list-cacher.sh.in}
  cmd/zpool/zpool.d/{dm-deps,iostat,lsblk,media,ses,smart,upath}
now it also checks
  contrib/dracut/{02zfsexpandknowledge/module-setup,
                  90zfs/{export-zfs,parse-zfs,zfs-needshutdown,
                         zfs-load-key,zfs-lib,module-setup,
                         mount-zfs,zfs-generator}}.sh.in
  cmd/zed/zed.d/{pool_import-led,vdev_attach-led,
                 resilver_finish-notify,vdev_clear-led}.sh
  contrib/initramfs/{zfsunlock,hooks/zfs.in,scripts/local-top/zfs}
  tests/zfs-tests/tests/perf/scripts/prefetch_io.sh
  scripts/common.sh.in
  contrib/bpftrace/zfs-trace.sh
  autogen.sh

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-08 14:46:31 -07:00
наб 59d91b4d10 Fix SC2181 ("[ $?") outside tests/
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12042
2021-06-08 14:45:03 -07:00
наб d53a6969c1 i-t: rewrite hooks
This produces a leaner image, doesn't fail if zdb doesn't exist,
properly handles hostnameless systems, doesn't mention crypto modules
for no reason, doesn't add useless empty executable in hopes an
eight-year-old PR is merged, uses i-t builtins for all copies

Also optimize the checkbashisms filter to spawn one (or a few) awks
instead of one per regular file and remove initramfs/hooks therefrom due
to a command -v false positive

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12017
2021-06-08 14:44:35 -07:00
наб 019739b6e9 dracut/90/module-setup: mainly shellcheck cleanup
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11956
2021-06-08 14:43:06 -07:00
Brian Behlendorf 7d9f3ef0ef Tag 2.1.0-rc6
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-05-28 09:01:47 -07:00
Armin Wehrfritz 0a01f36b69 RPM: Explicitly set the required min/max kernel version for the DKMS package
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Armin Wehrfritz <dkxls23@gmail.com>
Closes #12124
2021-05-28 09:01:31 -07:00
Rich Ercolani 9e06df8ab0 Minor fix to configure on s390x
configure on s390x has a key check fail with an error about
a variable being used uninitialized. So let's initialize it.

Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12126
2021-05-28 09:01:24 -07:00
Rich Ercolani 57e3b9c3cc Bend zpl_set_acl to permit the new userns* parameter
Just like #12087, the set_acl signature changed with all the bolted-on
*userns parameters, which disabled set_acl usage, and caused #12076.

Turn zpl_set_acl into zpl_set_acl and zpl_set_acl_impl, and add a
new configure test for the new version.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12076
Closes #12093
2021-05-27 22:31:57 -07:00
Rich Ercolani 605b901ca6 Reinstate the old zpool read label logic as a fallback
In case of AIO failure, we should probably fallback to the old
behavior and still work.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12032
Closes #12040
2021-05-27 22:31:57 -07:00
наб c5f268fa14 mount.zfs.8: match to reality; zfsprops.8: add missing temporary options
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12111
2021-05-27 22:31:57 -07:00
наб bc7792cad2 mount.zfs.8: modernise
No changes to the text itself

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12111
2021-05-27 22:31:57 -07:00
наб f3222062c0 zfsprops.8: remove nbmand-not-used-on-Linux and pointer to mount(8)
Linux man-pages' mount(8) points at fcntl(2), as does mount(2),
and support for it is little-used, deprecated, and configurable
since 4.5.

As far as I can tell, FreeBSD doesn't support nbmand at all ‒
mandatory locks are mostly dead

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12111
2021-05-27 22:31:57 -07:00
наб 6316086b72 Various Linux kABI cosmetics
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12103
2021-05-27 22:31:57 -07:00
наб 7cdd4dd33b linux: don't fall through to 3-arg vfs_getattr
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12103
2021-05-27 22:31:57 -07:00
Alexander Motin a7bb2dab4e FreeBSD: Update dataset_kstats for zvols in dev mode
Previous commit added accounting for geom mode, but not for dev.
In geom mode we actually have GEOM statistics, while in dev mode
additional accounting actually makes more sense.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12097
2021-05-27 22:31:57 -07:00
Rich Ercolani ecebf770c1 Correct flaws in arc_summary[23] and their test.
The change correctly handles BrokenPipeError and improves the
associated tests.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12037
Closes #12036
2021-05-27 22:31:57 -07:00
Alexander Motin aa4a84e616 FreeBSD: avoid memory allocation in arc_prune_async
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Closes #12049
2021-05-27 22:31:57 -07:00
Alexander Motin cc1c7b0171 FreeBSD: Retry OCF ENOMEM errors.
ZFS does not expect transient errors from crypto.  For read they are
counted as checksum errors, while for write end up in panic.  To not
panic on random low memory conditions retry ENOMEM errors in the OCF
wrapper function.

While there remove unneeded timeout and priority from msleep().

External-issue: https://reviews.freebsd.org/D30339
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12077
2021-05-27 22:31:57 -07:00
Rich Ercolani fa7ee48e10 Add note for printing all dbgmsg entries on FreeBSD
I looked for a bit, and couldn't find any documentation on
how to print all logged dbgmsg entries, just messages since
the DTrace probe started, until @allanjude kindly pointed me
toward the sysctl.

So let's add that note where the DTrace probe is mentioned for
FreeBSD, so other people can find it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12113
2021-05-27 22:31:57 -07:00
vermavipinkumar d6bedbbc44 Propagate vdev state due to invalid label corruption
Propagate vdev child state to parents on invalid label
Add VDEV_AUX_BAD_LABEL to print_import_config()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Co-authored-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com>
Signed-off-by: Vipin Kumar Verma <vipin.verma@hpe.com>
Closes #12088
2021-05-27 22:31:57 -07:00
Rich Ercolani ab6717cba6 Update tmpfile() existence detection
Linux changed the tmpfile() signature again in torvalds/linux@6521f89,
which in turn broke our HAVE_TMPFILE detection in configure.

Update that macro to include the new case, and change the signature of
zpl_tmpfile as appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12060
Closes: #12087
2021-05-27 22:31:56 -07:00
Brian Behlendorf 33a06f27e6 Fix dRAID sequential resilver silent damage handling
This change addresses two distinct scenarios which are possible
when performing a sequential resilver to a dRAID pool with vdevs
that contain silent unknown damage. Which in this circumstance
took the form of the devices being intentionally overwritten with
zeros. However, it could also result from a device returning incorrect
data while a sequential resilver was in progress.

Scenario 1) A sequential resilver is performed while all of the
dRAID vdevs are ONLINE and there is silent damage present on the
vdev being resilvered. In this case, nothing will be repaired
by vdev_raidz_io_done_reconstruct_known_missing() because
rc->rc_error isn't set on any of the raid columns. To address
this vdev_draid_io_start_read() has been updated to always mark
the resilvering column as ESTALE for sequential resilver IO.

Scenario 2) Multiple columns contain silent damage for the same
block and a sequential resilver is performed. In this case it's
impossible to generate the correct data from parity unless all of
the damaged columns are being sequentially resilvered (and thus
only good data is used to generate parity). This is as expected
and there's nothing which can be done about it. However, we need
to be careful not to make to situation worse. Since we can't
verify the data is actually good without a checksum, we must
only repair the devices which are being sequentially resilvered.
Otherwise, an incorrect repair to a device which previously
contained good data could effectively lock in the damage and
make reconstruction impossible. A check for this was added to
vdev_raidz_io_done_verified() along with a new test case.

Lastly, this change updates the redundancy_draid_spare1 and
redundancy_draid_spare3 test cases to be more representative
of normal dRAID replacement operation.  Specifically, what we
care about is that the scrub run after a sequential resilver
does not find additional blocks which need repair.  This would
indicate the sequential resilver failed to rebuild a section of
one of the devices. Note also the tests were switched to using
the verify_pool() function which still checks for checksum errors.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12061
2021-05-27 22:31:56 -07:00
Lauri Tirkkonen 7ad8fc5407 zfs-allow.8: mention 'bookmark' permission
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi>
Closes #12064
2021-05-27 22:31:56 -07:00
Rich Ercolani 272b178d52 Simple change to fix building in recent environments
Renamed _fini too for symmetry.

Suggested-by: @ensch
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12059
Closes: #11987
Closes: #12056
2021-05-27 22:31:56 -07:00
Alexander Motin 8f3584292f Scale worker threads and taskqs with number of CPUs
While use of dynamic taskqs allows to reduce number of idle threads,
hardcoded 8 taskqs of each kind is a big overkill for small systems,
complicating CPU scheduling, increasing I/O reorder, etc, while
providing no real locking benefits, just not needed there.

On another side, 12*8 worker threads per kind are able to overload
almost any system nowadays.  For example, pool of several fast SSDs
with SHA256 checksum makes system barely responsive during scrub, or
with dedup enabled barely responsive during large file deletion.

To address both problems this patch introduces ZTI_SCALE macro, alike
to ZTI_BATCH, but with multiple taskqs, depending on number of CPUs,
to be used in places where lock scalability is needed, while request
ordering is not so much.  The code is made to create new taskq for
~6 worker threads (less for small systems, but more for very large)
up to 80% of CPU cores (previous 75% was not good for rounding down).
Both number of threads and threads per taskq are now tunable in case
somebody really wants to use all of system power for ZFS.

While obviously some benchmarks show small peak performance reduction
(not so big really, especially on systems with SMT, where use of the
second threads does not give as much performance as the first ones),
they also show dramatic latency reduction and much more smooth user-
space operation in case of high CPU usage by ZFS.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #11966
2021-05-27 22:31:56 -07:00
Brian Behlendorf 289b1393a4 ZTS: Increase redundancy test timeout
The redundancy_draid.ksh and redundancy_raidz.ksh tests were updated
by commit 93c8e91fe to additionally verify self-healing.  This
additional check increased the run time which can now occasionally
exceed the default maximum timeout in the CI environment.  To prevent
this from causing failures increase the default timeout for the
redundancy test cases.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12043
2021-05-27 22:12:48 -07:00
Paul Zuchowski ec75f80a50 Fix dmu_recv_stream test for resumable
Use dsl_dataset_has_resume_receive_state()
not dsl_dataset_is_zapified() to check if
stream is resumable.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12034
2021-05-27 22:12:40 -07:00
Ryan Moeller c2c02e490f FreeBSD: Use SET_ERROR to trace xattr name errors
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11997
2021-05-27 22:12:26 -07:00
Ryan Moeller 793dffb04e FreeBSD: Don't force xattr mount option
The kernel will use the xattr property by default when not overridden
by a mount option.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11997
2021-05-27 22:12:19 -07:00
Brian Behlendorf faa5673982 Revert "Fix raw sends on encrypted datasets when copying back snapshots"
Commit d1d4769 takes into account the encryption key version to
decide if the local_mac could be zeroed out. However, this could lead
to failure mounting encrypted datasets created with intermediate
versions of ZFS encryption available in master between major releases.
In order to prevent this situation revert d1d4769 pending a more
comprehensive fix which addresses the mount failure case.

Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11294
Issue #12025
Issue #12300
Closes #12033
2021-05-27 22:10:13 -07:00
наб ade8e4b7d6 Widen mancheck target to all pages, fix them
mandoc: ./man/man8/zfs-mount-generator.8.in:188:2:
        ERROR: skipping end of block that is not open: RE
mandoc: ./man/man8/zfs_ids_to_path.8:38:2:
        ERROR: skipping unknown macro: .LP
mandoc: ./man/man8/zfs_ids_to_path.8:48:2:
        ERROR: inserting missing end of block: Sh breaks Bl
mandoc: ./man/man8/zfs-wait.8:69:2:
        ERROR: skipping end of block that is not open: El
mandoc: ./man/man8/zfs-program.8:460:2:
        ERROR: inserting missing end of block: It breaks Bd
mandoc: ./man/man8/zfs-mount-generator.8:188:2:
        ERROR: skipping end of block that is not open: RE
mandoc: ./man/man8/zstream.8:43:2:
        ERROR: skipping unknown macro: .LP
mandoc: ./man/man8/zstream.8:107:2:
        ERROR: inserting missing end of block: Sh breaks Bl
mandoc: ./man/man8/zstream.8:107:2:
        ERROR: inserting missing end of block: Sh breaks Bl
make: *** [Makefile:1529: mancheck] Error 1

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12017
2021-05-27 22:09:51 -07:00
Brian Behlendorf 1a7bad542b ZTS: Add known exceptions
The following seven tests been observed to occasionally fail during
CI testing.  This commit adds them to the list of known somewhat
flaky test cases.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12023
2021-05-27 22:09:32 -07:00
Coleman Kane 17351a79e2 linux 5.13 compat: bdevops->revalidate_disk() removed
Linux kernel commit 0f00b82e5413571ed225ddbccad6882d7ea60bc7 removes the
revalidate_disk() handler from struct block_device_operations. This
caused a regression, and this commit eliminates the call to it and the
assignment in the block_device_operations static handler assignment
code, when configure identifies that the kernel doesn't support that
API handler.

Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11967 
Closes #11977
2021-05-27 22:09:26 -07:00
наб 1cb517aebd module/zfs: remove zfs_zevent_console and zfs_zevent_cols
zfs_zevent_console committed multiple printk()s per line without
properly continuing them ‒ a single event could easily be fragmented
across over thirty lines, making it useless for direct application

zfs_zevent_cols exists purely to wrap the output from zfs_zevent_console

The niche this was supposed to fill can be better served by something
akin to the all-syslog ZEDLET

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #7082 
Closes #11996
2021-05-27 22:09:19 -07:00
Brian Behlendorf cb2e336038 Tag 2.1.0-rc5
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-05-10 15:13:37 -07:00
наб 381a0ca1e8 libzfs: zfs_asprintf(): don't return undefined pointer
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:38 -07:00
наб f7a69b93c6 libzfsbootenv: lzbe_set_boot_device(): don't free undefined pointer
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:31 -07:00
наб 8f6e2b5485 zfs_get_enclosure_sysfs_path(): don't free undefined pointer
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:23 -07:00
наб 14b56624c8 zfs_get_enclosure_sysfs_path(): don't leak dev path
Also always free tmp2 at the end

Before:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==8947== Memcheck, a memory error detector
==8947== Using Valgrind-3.14.0 and LibVEX
==8947== Command: ./blergh
==8947==
(null)
==8947==
==8947== HEAP SUMMARY:
==8947==     in use at exit: 23 bytes in 1 blocks
==8947==   total heap usage: 3 allocs, 2 frees, 1,147 bytes allocated
==8947==
==8947== 23 bytes in 1 blocks are definitely lost in loss record 1 of 1
==8947==    at 0x483577F: malloc (vg_replace_malloc.c:299)
==8947==    by 0x48D74B7: vasprintf (vasprintf.c:73)
==8947==    by 0x48B7833: asprintf (asprintf.c:35)
==8947==    by 0x401258: zfs_get_enclosure_sysfs_path
                         (zutil_device_path_os.c:191)
==8947==    by 0x401482: main (blergh.c:107)
==8947==
==8947== LEAK SUMMARY:
==8947==    definitely lost: 23 bytes in 1 blocks
==8947==    indirectly lost: 0 bytes in 0 blocks
==8947==      possibly lost: 0 bytes in 0 blocks
==8947==    still reachable: 0 bytes in 0 blocks
==8947==         suppressed: 0 bytes in 0 blocks
==8947==
==8947== For counts of detected and suppressed errors, rerun with: -v
==8947== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

nabijaczleweli@tarta:~/uwu$ sed -n 191p zutil_device_path_os.c
        tmpsize = asprintf(&tmp1, "/sys/block/%s/device", dev_name);

After:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==9512== Memcheck, a memory error detector
==9512== Using Valgrind-3.14.0 and LibVEX
==9512== Command: ./blergh
==9512==
(null)
==9512==
==9512== HEAP SUMMARY:
==9512==     in use at exit: 0 bytes in 0 blocks
==9512==   total heap usage: 3 allocs, 3 frees, 1,147 bytes allocated
==9512==
==9512== All heap blocks were freed -- no leaks are possible
==9512==
==9512== For counts of detected and suppressed errors, rerun with: -v
==9512== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:18 -07:00
наб 214ae461f1 zpool: vdev_run_cmd(): don't free undefined pointers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:12 -07:00
наб 12ed5275d1 libzfs: zpool_load_compat(): don't free undefined pointers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:06 -07:00
наб 133fd00930 libzfs: zpool_load_compat(): open feature file cloexec
As a bonus, this also passes the open flags into the open flags instead
of the mode (it worked by accident because O_RDONLY is 0),
correctly detects a failed map,
and prefaults the entire file since we're always writing to every page

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-05-10 12:21:00 -07:00
illiliti 74256266ff copy-builtin: posix conformance
This commits contains changes to allow running `copy-builtin` without
bash + some minor improvements.

changed shebang to /bin/sh
added -f option to `set` to globally disable unneeded globbing
replaced all `echo` commands within add_after() with `printf`
alternative to avoid possible issues with options (-neE)
dropped non-portable superfluous `readlink` command
replaced superfluous `true` command with `:` builtin alternative
replaced non-portable `--recursive` option of `cp` command with `-R`
alternative
dropped non-portable `local` keyword

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: illiliti <illiliti@protonmail.com>
Closes #12004
2021-05-10 12:18:54 -07:00
Brian Behlendorf 2085a5f992 Fix dRAID self-healing short columns
When dRAID performs a normal read operation only the data columns
in the raid map are read from disk.  This is enough information to
calculate the checksum, verify it, and return the needed data to the
application.  It's only in the event of a checksum failure that the
additional parity and any empty columns must be read since they are
required for parity reconstruction.

Reading these additional columns is handled by vdev_raidz_read_all()
which calls vdev_draid_map_alloc_empty() to expand the raid_map_t
and submit IOs for the missing columns.  This all works correctly,
but it fails to account for any "short" columns.  These are data
columns which are padded with a empty skip sector at the end.
Since that empty sector is not needed for a normal read it's not
read when columns is first read from disk.  However, like the parity
and empty columns the skip sector is needed to perform reconstruction.

The fix is to mark any "short" columns as never being read by clearing
the rc_tried flag when expanding the raid_map_t.  This will cause
the entire column to re-read from disk in the event of a checksum
failure allowing the self-healing functionality to repair the block.

Note that this only effects the self-healing feature because when
scrubbing a pool the parity, data, and empty columns are all read
initially to verify their contents.  Furthermore, only blocks which
contain "short" columns would be effected, and only when the memory
backing the skip sector wasn't already zeroed out.

This change extends the existing redundancy_raidz.ksh test case to
verify self-healing (as well as resilver and scrub).  Then applies
the same test case to dRAID with a slightly modified version of
the test script called redundancy_draid.ksh.  The unused variable
combrec was also removed from both test cases.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12010
2021-05-10 12:18:36 -07:00
наб b1dd6351bb Replace ZoL with OpenZFS where applicable
Afterward, git grep ZoL matches:
  * README.md:  * [ZoL Site](https://zfsonlinux.org)
  - Correct
  * etc/default/zfs.in:# ZoL userland configuration.
  - Changing this would induce a needless upgrade-check,
    if the user has modified the configuration;
    this can be updated the next time the defaults change
  * module/zfs/dmu_send.c:   * ZoL < 0.7 does not handle [...]
  - Before 0.7 is ZoL, so fair enough

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11956
2021-05-10 12:16:46 -07:00
Ryan Moeller cb18cf6b0a FreeBSD: Remove !FreeBSD ifdef'd code
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11994
2021-05-10 12:16:39 -07:00
Ryan Moeller 6c25218c7e Clean up use of zfs_log_create in zfs_dir
zfs_log_create returns void, so there is no reason to cast its return
value to void at the call site.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11994
2021-05-10 12:16:32 -07:00
наб 0bb2a48ee6 zed: protect against wait4()/fork() races to the global PID table
This can be very easily triggered by adding a sleep(1) before
the wait4() on a PID-starved system: the reaper thread would wait
for a child before its entry appeared, letting old entries accumulate:

  Invoking "all-debug.sh" eid=3021 pid=391
  Finished "(null)" eid=0 pid=391 time=0.002432s exit=0
  Invoking "all-syslog.sh" eid=3021 pid=336
  Finished "(null)" eid=0 pid=336 time=0.002432s exit=0
  Invoking "history_event-zfs-list-cacher.sh" eid=3021 pid=347
  Invoking "all-debug.sh" eid=3022 pid=349
  Finished "history_event-zfs-list-cacher.sh" eid=3021 pid=347
                                              time=0.001669s exit=0
  Finished "(null)" eid=0 pid=349 time=0.002404s exit=0
  Invoking "all-syslog.sh" eid=3022 pid=370
  Finished "(null)" eid=0 pid=370 time=0.002427s exit=0
  Invoking "history_event-zfs-list-cacher.sh" eid=3022 pid=391
  avl_find(tree, new_node, &where) == NULL
  ASSERT at ../../module/avl/avl.c:641:avl_add()
  Thread 1 "zed" received signal SIGABRT, Aborted.

By employing this wider lock, we atomise [wait, remove] and [fork, add]:
slowing down the reaper thread now just causes some zombies
to accumulate until it can get to them

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11963
Closes #11965
2021-05-10 12:13:53 -07:00
Alyssa Ross f15ec889a9 Return required size when encode_fh size too small
Quoting <linux/exportfs.h>:

> encode_fh() should return the fileid_type on success and on error
> returns 255 (if the space needed to encode fh is greater than
> @max_len*4 bytes). On error @max_len contains the minimum size (in 4
> byte unit) needed to encode the file handle.

ZFS was not setting max_len in the case where the handle was too
small.  As a result of this, the `t_name_to_handle_at.c' example in
name_to_handle_at(2) did not work on ZFS.

zfsctl_fid() will itself set max_len if called with a fid that is too
small, so if we give zfs_fid() that behavior as well, the fix is quite
easy: if the handle is too small, just use a zero-size fid instead of
the handle.

Tested by running t_name_to_handle_at on a normal file, a directory, a
.zfs directory, and a snapshot.

Thanks-to: Puck Meerburg <puck@puckipedia.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Closes #11995
2021-05-10 12:13:45 -07:00
Alexander Motin 0db64e9036 Simplify/fix dnode_move() for dn_zfetch
Previous code tried to keep prefetch streams while moving dnode.  But
it was at least not updating per-stream zs_fetchback pointers, causing
use-after-free on next access.  Instead of that I see much easier and
cleaner to just drop old prefetch state and start new from scratch.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #11936
Closes #11998
2021-05-10 12:13:29 -07:00
Ryan Moeller 85071b2ff5 FreeBSD: Initialize/destroy zp->z_lock
zp->z_lock is used in shared code for protecting projid and scantime.
We don't exercise these paths much if at all on FreeBSD, so have been
lucky enough not to have issues with the uninitialized locks so far.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #12003
2021-05-10 12:13:12 -07:00
Rich Ercolani 1a0ce3d5ef Updated zfs_dbgmsg_enable documentation to be more accurate
Changed the default specified for zfs_dbgmsg_enable, added
clarification of interaction with zfs_flags.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #11984
Closes #11986
2021-05-10 12:13:03 -07:00
наб 08d2e39719 zed.d/zed-functions.sh: fix zed_guid_to_pool() on dash
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11935
Closes #11954
2021-05-10 12:12:20 -07:00
наб f87f41009f zed.d/history_event-zfs-list-cacher.sh: no grep for snapshot detection
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11935
2021-05-10 12:12:04 -07:00
наб 918e5c6fca zed.d/*-notify.sh: use mktemp instead of generating temp path manually
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11935
2021-05-10 12:11:44 -07:00
наб 9c5256a2fe zed.d/pool_import-led.sh: fix for current zpool scripts
Also minor clean-up with folding state_to_val() into a case,
unrolling the lesser-available seq into numbers,
ignoring vdev states we don't care about,
and documentation comments

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11934
Closes #11935
2021-05-10 12:11:33 -07:00
наб 39bcdd8d74 libzutil: fix dm_get_underlying_path() return if not a DM device
For example, this would happily return "/dev/(null)" for /dev/sda1

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11935
2021-05-10 12:10:20 -07:00
Ryan Moeller 075dcad351 ZTS: Fix xattr_002_neg passing too soon
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11970
2021-05-10 12:09:42 -07:00
Ryan Moeller 5701e393b7 FreeBSD: Prune some unneeded definitions
IS_XATTRDIR is never used.
v_count is only used in two places, one immediately followed by the
use of the real name, v_usecount.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #11973
2021-05-10 12:09:34 -07:00
Arshad Hussain 910b43310f vdev_id: variable not getting expanded under map_slot()
Under function map_slot() variable passed as args
were not getting properly substituted or expanded.
This patch fixes the substitution issue.

Reviewed-by: Niklas Edmundsson <nikke@acc.umu.se>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #11951 
Closes #11959
2021-05-10 12:08:18 -07:00
Nathaniel Wesley Filardo 4ac37f8b2e vdev_mirror: don't scrub/resilver devices that can't be read
This ensures that we don't accumulate checksum errors against offline or
unavailable devices but, more importantly, means that we don't
needlessly create DTL entries for offline devices that are already
up-to-date.

Consider a 3-way mirror, with disk A always online (and so always with
an empty DTL) and B and C only occasionally online.  When A & B resilver
with C offline, B's DTL will effectively be appended to C's due to these
spurious ZIOs even as the resilver empties B's DTL:

  * These ZIOs land in vdev_mirror_scrub_done() and flag an error

  * That flagged error causes vdev_mirror_io_done() to see
    unexpected_errors, so it issues a ZIO_TYPE_WRITE repair ZIO, which
    inherits ZIO_FLAG_SCAN_THREAD because zio_vdev_child_io() includes
    that flag in ZIO_VDEV_CHILD_FLAGS.

  * That ZIO fails, too, and eventually zio_done() gets its hands on it
    and calls vdev_stat_update().

  * vdev_stat_update() sees the error and this zio...

    * is not speculative,
    * is not due to EIO (but rather ENXIO, since the device is closed)
    * has an ->io_vd != NULL (specifically, the offline leaf device)
    * is a write
    * is for a txg != 0 (but rather the read block's physical birth txg)
    * has ZIO_FLAG_SCAN_THREAD asserted

  * So: vdev_stat_update() calls vdev_dtl_dirty() on the offline vdev.

Then, when A & C resilver with B offline, that story gets replayed and
C's DTL will be appended to B's.

In fact, one does not need this permanently-broken-mirror scenario to
induce badness: breaking a mirror with no DTLs and then scrubbing will
create DTLs for all offline devices.  These DTLs will persist until the
entire mirror is reassembled for the duration of the *resilver*, which,
incidentally, will not consider the devices with good data to be sources
of good data in the case of a read failure.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nathaniel Wesley Filardo <nwfilardo@gmail.com>
Closes #11930
2021-05-10 12:08:07 -07:00
Brian Behlendorf 6f2012c1dc zfs.spec.in: remove post ldconfig scriptlets
In Fedora 28 the packaging guidelines were changed such that ldconfig
should no longer be called in either the %post or %postun scriptlets.
Instead the new compatibility macros %ldconfig_post, %ldconfig_postun,
and %ldocnfig_scriptlets should be used.

Since we only currently support Fedora 31 and newer, we could drop
%post or %postun scriptlets entirely according to the guidelines.
However, since we also use the same spec file for CentOS / RHEL
it's convenient to call the macros which are available starting
with CentOS / RHEL 8.  For CentOS / RHEL 7 we must still call
ldconfig in the traditional way.

https://fedoraproject.org/wiki/Changes/Removing_ldconfig_scriptlets

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11931
2021-05-10 12:07:16 -07:00
Toomas Soome 96f04062e4 zdb: ASSERT issues when DEBUG is not defined
If zdb is not built with DEBUG mode, the ASSERT macros will be
eliminated.

This will leave vim defined, but not used (gcc warning) and
checkpoint spacemap validation loop will do nothing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #11932
2021-05-10 12:07:09 -07:00
Brian Behlendorf a7a579e6e1 ZTS: Add known exceptions
Both the zpool_initialize_import_export and checkpoint_discard_busy
test cases a known to occasionally fail.  Add them to the list of
known possible failures and reference the appropriate issue on the
tracker.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11949
2021-05-10 12:06:31 -07:00
Martin Matuška 463d7e1a61 Drop "All rights reserved" from files by trasz@FreeBSD.org
This obeys the change in freebsd/freebsd-src@bce7ee9d4

External-issue: https://reviews.freebsd.org/D26980
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #11947
2021-05-10 12:06:21 -07:00
Prawn 6c1a7be11e receive: don't fail inheriting (-x) properties on wrong dataset type
Receiving datasets while blanket inheriting properties like zfs 
receive -x mountpoint can generally be desirable, e.g. to avoid 
unexpected mounts on backup hosts.

Currently this will fail to receive zvols due to the mountpoint 
property being applicable to filesystems only.  This limitation 
currently requires operators to special-case their minds and tools 
for zvols.

This change gets rid of this limitation for inherit (-x) by
Spiting up the dataset type handling: Warnings for inheriting (-x), 
errors for overriding (-o).

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #11416
Closes #11840
Closes #11864
2021-05-10 12:06:11 -07:00
Mateusz Guzik 1b7d883eaa FreeBSD: damage control racing .. lookups in face of mkdir/rmdir
External-issue: https://reviews.freebsd.org/D29769
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11926
2021-05-10 12:05:49 -07:00
Romain Dolbeau e5c01296ff Fix AVX512BW Fletcher code on AVX512-but-not-BW machines
Introduce a specific valid function for avx512f+avx512bw (instead 
of checking only for avx512f).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: Romain Dolbeau <romain@dolbeau.org>
Closes #11937
Closes #11938
2021-05-10 12:05:36 -07:00
Brian Behlendorf 64b83a0bf1 Tag 2.1.0-rc4
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-04-22 14:54:35 -07:00
наб 8fd65351a7 zed: protect against wait4()/fork() races to the launched process tree
As soon as wait4() returns, fork() can immediately return with the same
PID, and race to lock _launched_processes_lock, then try to add the new
(duplicate) PID to _launched_processes, which asserts

By locking before wait4(), we ensure, that, given that same
unfortunate scheduling, _launched_processes_lock cannot be locked by the
spawner before we pop the process in the reaper, and only afterward will
it be added

This moves where the reaper idles when there are children from the
wait4() to the pause(), locking for the duration of that single syscall
in both the no-children and running-children cases; the impact of this
is one to two syscalls (depending on _launched_processes_lock state)
per loop

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11924
Closes #11928
2021-04-22 14:53:51 -07:00
Daniel Stevenson 79d9f663b0 Fixed incorrect man page reference in zfsprops(8)
The special_small_blocks section directed readers to zpool(8) for
documentation on special allocation classes, while they are actually
documented in zpoolconcepts(8).

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Stevenson <daniel@dstev.net>
Closes #11918
2021-04-21 10:26:57 -07:00
наб 478b8ec8f2 etc/systemd/zfs-mount-generator: don't fail if no cached pools
If $FSLIST exists but is empty, the generator fails with
  sort: cannot read: '/etc/zfs/zfs-list.cache/*':
  No such file or directory

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11915
2021-04-19 15:22:58 -07:00
наб 219acd907b freebsd/libshare: nfs: make nfs_is_shared() thread-safe
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11886
2021-04-19 15:22:58 -07:00
наб cc33149e5a libshare: nfs: don't leak nfs_lock_fd when lock fails
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11886
2021-04-19 15:22:58 -07:00
Brian Behlendorf 40abc63c40 ZTS: Improve redundancy test scripts
- Add additional logging to provide more information about why the
  test failed.  This including logging more of the individual commands
  and the contents and differences of the record files on failure.

- Updated get_vdevs() to properly exclude all top-level vdevs
  including raidz3 and draid[1-3].

- Replaced gnudd with dd.  This is the only remaining place in the
  test suite gnudd is used and it shouldn't be needed.

- The refill_test_env function expects the pool as the first argument
  but never sets the pool variable.

- Only fill the test pools to 50% of capacity instead of 75% to help
  speed up the tests.

- Fix replace_missing_devs() calculation, MINDEVSIZE should be
  MINVDEVSIZE.

- Fix damage_devs() so it overwrites almost all of the device so
  we're guaranteed to damage filesystem blocks.

- redundancy_stripe.ksh should not use log_mustnot to check if the
  pool is healthy since the return value may be misinterpreted.
  Just perform a normal conditional check and log the failure.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11906
2021-04-19 15:22:58 -07:00
Attila Fülöp cdc27fd061 ICP: Silence objtool "stack pointer realignment" warnings
Objtool requires the use of a DRAP register while aligning the
stack. Since a DRAP register is a gcc concept and we are
notoriously low on registers in the crypto code, it's not worth
the effort to mimic gcc generated stack realignment.

We simply silence the warning by adding the offending object files
to OBJECT_FILES_NON_STANDARD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #6950
Closes #11914
2021-04-19 15:22:58 -07:00
наб 121a34e36b libzfs: refresh property cache after inheriting userprop
This matches what happens when inheriting a system property

Consider the following program:
	int main() {
		void *zhp = libzfs_init();
		void *dataset = zfs_open(zhp, "zest/__test", 1);

		printf("before:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");

		zfs_prop_inherit(dataset, "xyz.nabijaczleweli:test", 0);
		printf("after:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");

		zfs_refresh_properties(dataset);
		printf("refreshed:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");
	}

And the output before:
	# zfs set xyz.nabijaczleweli:test=hehe zest/__test
	# ./a.out
	before:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	after:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	refreshed:

As compared to the output after:
	# zfs set xyz.nabijaczleweli:test=hehe zest/__test
	# ./a.out
	before:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	after:
	refreshed:

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11064
Closes #11911
2021-04-19 15:22:58 -07:00
наб dfa2beddea libzfs: don't mark prompt+raw as retriable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11911
Closes #11031
2021-04-19 15:22:58 -07:00
Mateusz Guzik 14a1980b35 Combine zio caches if possible
This deduplicates 2 sets of caches which use the same allocation size.

Memory savings fluctuate a lot, one sample result is FreeBSD running
"make buildworld" saving ~180MB RAM in reduced page count associated
with zio caches.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11877
2021-04-19 15:22:58 -07:00
наб 9747310cc1 contrib/dracut: 90: zfs-{rollback,snapshot}-bootfs: use @sbindir@
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11898
2021-04-19 15:22:57 -07:00
наб 0b10994cea contrib/i-t: properly mount root's children with spaces
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11898
2021-04-19 15:22:57 -07:00
наб 692387343f contrib/dracut: 90: mount essential datasets under root
This partly mirrors what the i-t script does (though that mounts all
children, recursively) ‒ /etc, /usr, /lib*, and /bin are all essential,
if present, to successfully invoke the real init, which will then mount
everything else it might need in the right order

The following extreme-case set-up boots w/o issues now:
  /               zoot            zfs  rw,relatime,xattr,noacl
  ├─/etc          zoot/etc        zfs  rw,relatime,xattr,noacl
  ├─/usr          zoot/usr        zfs  rw,relatime,xattr,noacl
  │ └─/usr/local  zoot/usr/local  zfs  rw,relatime,xattr,noacl
  ├─/var          zoot/var        zfs  rw,relatime,xattr,noacl
  │ ├─/var/lib    zoot/var/lib    zfs  rw,relatime,xattr,noacl
  │ ├─/var/log    zoot/var/log    zfs  rw,relatime,xattr,posixacl
  │ ├─/var/cache  zoot/var/cache  zfs  rw,relatime,xattr,noacl
  │ └─/var/tmp    zoot/var/tmp    zfs  rw,relatime,xattr,noacl
  ├─/home         zoot/home       zfs  rw,relatime,xattr,noacl
  │ └─/home/nab   zoot/home/nab   zfs  rw,relatime,xattr,noacl
  ├─/boot         zoot/boot       zfs  rw,relatime,xattr,noacl
  ├─/root         zoot/home/root  zfs  rw,relatime,xattr,noacl
  ├─/opt          zoot/opt        zfs  rw,relatime,xattr,noacl
  └─/srv          zoot/srv        zfs  rw,relatime,xattr,noacl

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11898
2021-04-19 15:22:57 -07:00
наб 8d869cd840 contrib/dracut: 90: generator: only log to kmsg if debug set on cmdline
"debug" is also used by systemd itself, and there's really no reason for
the generator to write this much garbage by default

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11898
2021-04-19 15:22:57 -07:00
наб ae8dd6676b contrib/dracut: 02: don't spill device names across multiple lines
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11898
2021-04-19 15:22:57 -07:00
Paul Zuchowski d938c6ee27 Fix crash in zio_done error reporting
Fix NULL pointer dereference when reporting
checksum error for gang block in zio_done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #11872
Closes #11896
2021-04-19 15:22:57 -07:00
Ryan Moeller dc4d55268c zfs-send(8): Restore sorting of flags
Before #11710 the flags in zfs-send(8) were sorted.
Restore order and bump the date.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11905
2021-04-19 15:22:57 -07:00
наб bdcf0cca10 linux/spl: proc: use global table_{min,max} values instead of local ones
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11879
2021-04-19 15:22:57 -07:00
наб 7be4669a65 linux/libspl: gethostid: read from /proc/sys/kernel/spl/hostid, simplify
Fixes get_system_hostid() if it was set via the aforementioned sysctl
and simplifies the code a bit.  The kernel and user-space must agree,
after all.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11879
2021-04-19 15:22:57 -07:00
наб e5c4f86e7a linux/spl: base proc_dohostid() on proc_dostring()
This fixes /proc/sys/kernel/spl/hostid on kernels with mainline commit
32927393dc1ccd60fb2bdc05b9e8e88753761469 ("sysctl: pass kernel pointers
to ->proc_handler") ‒ 5.7-rc1 and up

The access_ok() check in copy_to_user() in proc_copyout_string() would
always fail, so all userspace reads and writes would fail with EINVAL

proc_dostring() strips only the final new-line,
but simple_strtoul() doesn't actually need a back-trimmed string ‒
writing "012345678   \n" is still allowed, as is "012345678zupsko", &c.

This alters what happens when an invalid value is written ‒
previously it'd get set to what-ever simple_strtoul() returned
(probably 0, thereby resetting it to default), now it does nothing

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11878
Closes #11879
2021-04-19 15:22:57 -07:00
Jitendra Patidar 4c925936e3 ZFS traverse_visitbp optimization to limit prefetch
Traversal code, traverse_visitbp() does visit blocks recursively.
Indirect (Non L0) Block of size 128k could contain, 1024 block pointers
of 128 bytes. In case of full traverse OR incremental traverse, where
all blocks were modified, it could traverse large number of blocks
pointed by indirect. Traversal code does issue prefetch of blocks
traversed below indirect. This could result into large number of
async reads queued on vdev queue. So, account for prefetch issued for
blocks pointed by indirect and limit max prefetch in one go.

Module Param:
zfs_traverse_indirect_prefetch_limit: Limit of prefetch while traversing
an indirect block.

Local counters:
prefetched: Local counter to account for number prefetch done.
pidx: Index for which next prefetch to be issued.
ptidx: Index at which next prefetch to be triggered.

Keep "ptidx" somewhere in the middle of blocks prefetched, so that
blocks prefetch read gets the enough time window before their demand
read is issued.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com>
Closes #11802 
Closes #11803
2021-04-19 15:22:57 -07:00
наб 15d3470c2e ZTS: add zed_fd_spill to verify the fds ZEDLETs inherit
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11891
2021-04-19 15:12:45 -07:00
наб 55cf5a255a zed: set O_CLOEXEC on persistent fds, remove closefrom() from pre-exec
Also don't dup /dev/null over stdio if daemonised

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11891
2021-04-19 15:12:41 -07:00
Paul Dagnelie d682e20ba4 Add SIGSTOP and SIGTSTP handling to issig
This change adds SIGSTOP and SIGTSTP handling to the issig function; 
this mirrors its behavior on Solaris. This way, long running kernel 
tasks can be stopped with the appropriate signals. Note that doing 
so with ctrl-z on the command line doesn't return control of the tty 
to the shell, because tty handling is done separately from stopping 
the process. That can be future work, if people feel that it is a 
necessary addition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Issue #810 
Issue #10843 
Closes #11801
2021-04-19 15:12:33 -07:00
Brian Behlendorf 9c470dc6c0 Fix 'make checkbashisms` warnings
The awk command used by the checkbashisms target incorrectly
adds the escape character before the ! and # characters.  This
results in the following warnings because these characters do not
need to be escaped.

    awk: cmd. line:1: warning: regexp escape sequence
        `\!' is not a known regexp operator
    awk: cmd. line:1: warning: regexp escape sequence
        `\#' is not a known regexp operator

Remove the unneeded escape character before ! and #.

Valid escape sequences are:

    https://www.gnu.org/software/gawk/manual/html_node/Escape-Sequences.html

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11902
2021-04-19 15:12:28 -07:00
Brian Behlendorf 7616fc7971 Tag 2.1.0-rc3
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-04-14 14:50:46 -07:00
Yuri Pankov 7bf3959601 Fix vdev health padding in zpool list -v
Do not (incorrectly, right instead left) pad health string itself,
it will be taken care of when printing property value below.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Yuri Pankov <yuripv@FreeBSD.org>
Closes #11899
2021-04-14 13:23:08 -07:00
Brian Behlendorf 25cd7b520d Obsolete earlier packages due to version bump
Follow up to d5ef91af which adds a missing 'obsoletes' for the
libzfs-devel package.

Add a comment to the zfs.spec file as a reminder that previous
versions of the package should be marked as obsolete.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11844 
Closes #11895
2021-04-14 13:23:08 -07:00
наб 4a30a0b947 linux/libspl: getextmntent(): don't leak mnttab FILE*
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11868
2021-04-14 13:23:08 -07:00
наб ecd6ad66b0 libzfs: zfs_mount_at(): load key for encryption root if MS_CRYPT
zfs_crypto_load_key() only works on encryption roots,
and zfs mount -la would fail if it encounters a datasets that
is sorted before their encroots.

To trigger:
  truncate -s 40G /tmp/test
  dd if=/dev/urandom of=/tmp/k bs=128 count=1 status=none
  zpool create -O encryption=on -O keylocation=file:///tmp/k \
               -O keyformat=passphrase test /tmp/test
  zfs create -o mountpoint=/a test/a
  zfs create -o mountpoint=/b test/b
  zfs umount test
  zfs unload-key test
  zfs mount -la

The final mount errored out with:
  Key load error: Keys must be loaded for
    encryption root of 'test/a' (test).
  Key load error: Keys must be loaded for
    encryption root of 'test/b' (test).

And only /test was mounted

This technically breaks the libzfs API, but the previous behavior was
decidedly a bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11870 
Closes #11875
2021-04-14 13:23:08 -07:00
Mateusz Guzik 2aa0d643fd FreeBSD: use vnlru_free_vfsops if available
Fixes issues when zfs is used along with other filesystems.

External-issue: https://cgit.freebsd.org/src/commit/?id=e9272225e6bed840b00eef1c817b188c172338ee
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11881
2021-04-14 13:23:08 -07:00
Mateusz Guzik a6b82cc0bb FreeBSD: add missing seqc write begin/end around zfs_acl_chown_setattr
It happens to trip over an assert but does not matter for correctness at
this time. Done for future proofing.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11884
2021-04-14 13:23:08 -07:00
Mateusz Guzik 4568b5cfba FreeBSD: add support for lockless symlink lookup
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11883
2021-04-14 13:23:08 -07:00
наб ffc2e74f79 .gitmodules: link to openzfs github repository
Update the test images link to reference the openzfs github repository.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11868
2021-04-14 13:23:08 -07:00
Prawn 8b03fce289 cmd/zfs receive: allow dry-run (-n) to check property args
zfs recv -n does not report some errors it could.  The code to bail 
out of the receive if in dry-run mode came a little early, skipping 
validation of cmdprops (recv -x and -o) among others.  Move the
check down to enable these additional checks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #11862
2021-04-14 13:23:08 -07:00
Colm 1f3de97374 Improvements to the 'compatibility' property
Several improvements to the operation of the 'compatibility' property:

1) Improved handling of unrecognized features:
Change the way unrecognized features in compatibility files are handled.

 * invalid features in files under /usr/share/zfs/compatibility.d
   only get a warning (as these may refer to future features not yet in
   the library),
 * invalid features in files under /etc/zfs/compatibility.d
   get an error (as these are presumed to refer to the current system).

2) Improved error reporting from zpool_load_compat.
Note: slight ABI change to zpool_load_compat for better error reporting.

3) compatibility=legacy inhibits all 'zpool upgrade' operations.

4) Detect when features are enabled outside current compatibility set
   * zpool set compatibility=foo <-- print a warning
   * zpool set feature@xxx=enabled <-- error
   * zpool status <-- indicate this state

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Colm Buckley <colm@tuatha.org>
Closes #11861
2021-04-14 13:23:08 -07:00
Brian Behlendorf 7de1797cee ZTS: fix removal_condense_export test case
It's been observed in the CI that the required 25% of obsolete bytes
in the mapping can be to high a threshold for this test resulting in
condensing never being triggered and a test failure.  To prevent these
failures make the existing zfs_condense_indirect_obsolete_pct tuning
available so the obsolete percentage can be reduced from 25% to 5%
during this test.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11869
2021-04-14 13:23:08 -07:00
Brian Behlendorf 44ccb1d20f Update libzfs.abi for zfs_send() change
Commit 099fa7e4 intentionally modified the libzfs ABI.  However, it
failed to include an update for the libzfs.abi file.  This commit
resolves the `make checkabi` warning due to that omission.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11710
2021-04-14 13:23:07 -07:00
pstef dfd3015048 Balance parentheses in parameter descriptions
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Piotr Paweł Stefaniak <pstef@freebsd.org>
Closes #11882
2021-04-14 13:23:07 -07:00
Brian Behlendorf 4a9b29375e ZTS: Add known exceptions
The fault/auto_spare_shared, l2arc/persist_l2arc_007_pos, and
alloc_class/alloc_class_013_pos test cases are not entirely reliable
and may occasionally fail resulting in a false positive in the CI.
Add these tests to known list of possible failures until they can
be made 100% reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11890
2021-04-14 13:23:07 -07:00
наб 2453d0263d lib/: set O_CLOEXEC on all fds
As found by
  git grep -E '(open|setmntent|pipe2?)\(' |
    grep -vE '((zfs|zpool)_|fd|dl|lzc_re|pidfile_|g_)open\('

FreeBSD's pidfile_open() says nothing about the flags of the files it
opens, but we can't do anything about it anyway; the implementation does
open all files with O_CLOEXEC

Consider this output with zpool.d/media appended with
"pid=$$; (ls -l /proc/$pid/fd > /dev/tty)":
  $ /sbin/zpool iostat -vc media
  lrwx------ 0 -> /dev/pts/0
  l-wx------ 1 -> 'pipe:[3278500]'
  l-wx------ 2 -> /dev/null
  lrwx------ 3 -> /dev/zfs
  lr-x------ 4 -> /proc/31895/mounts
  lrwx------ 5 -> /dev/zfs
  lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media
vs
  $ ./zpool iostat -vc vendor,upath,iostat,media
  lrwx------ 0 -> /dev/pts/0
  l-wx------ 1 -> 'pipe:[3279887]'
  l-wx------ 2 -> /dev/null
  lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:23:07 -07:00
наб ab88e9e264 libzfs{,_core}: set O_CLOEXEC on persistent (ZFS_DEV and MNTTAB) fds
These were fd 3, 4, and 5 by the time zfs change-key hit
execute_key_fob()

glibc appends "e" to setmntent() mode, but musl's just returns fopen()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:23:07 -07:00
наб 54aaed5c45 libzfs: zfs_crypto_create() requires a new key by definition: set newkey
This changes the password prompt for new encryption roots from
  Enter passphrase:
  Re-enter passphrase:
to
  Enter new passphrase:
  Re-enter new passphrase:
which makes more sense and is more consistent with "new passphrase"
now always meaning "come up with something" and plain "passphrase"
"remember that thing"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:23:07 -07:00
наб aa52015d45 libzfs_crypto.c: remove unused key_locator enum
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:23:07 -07:00
наб 5faeaa1365 zfprops(8): fix spacing in jailed= arguments
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:19:50 -07:00
наб 17792783d0 zfs-[un]jail(8): fix "zfs-jail [un]jail" leftovers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-04-14 13:19:50 -07:00
pablofsf 07d64c07e0 Allow zfs to send replication streams with missing snapshots
A tentative implementation and discussion was done in #5285.
According to it a send --skip-missing|-s flag has been added.
In a replication stream, when there are snapshots missing in
the hierarchy, if -s is provided print a warning and ignore
dataset (and its children) instead of throwing an error

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pablo Correa Gómez <ablocorrea@hotmail.com>
Closes #11710
2021-04-14 13:19:50 -07:00
Olaf Faaland f8631d0fe0 kmod-zfs should obsolete kmod-spl as well as spl-kmod
Without this Obsoletes, using packages built --with-spec=redhat, an
upgrade from zfs-0.7 to zfs-2.x does not cause the kmod-spl-0.7 package
to be removed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #11865
2021-04-14 13:19:50 -07:00
наб afe2a5ca12 zvol_wait: properly handle zvol_volmode sysctl being 3/none
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб 2a667acfcb zfs_ids_to_path: print correct wrong values
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб 5f3f4def83 zfs_ids_to_path: the -v comes after the executable name
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб 8bc955f546 contrib/bpftrace: exec bpftrace, remove useless cat
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб b3bb604a4c arc_summary3: just read /s/m/{mod}/version instead of spawning cat
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб 2ae2892235 zvol_wait: fix for zvols with spaces in name, optimise
list_zvols() would happily, for zvols with spaces in their names,
assign the second half to volmode, &c., so use a normal read
and set IFS to a tab instead of using 4 separate AWK processes(?)

Similarly, in filter_out_deleted_zvols(), run zfs(8) once and use the
output directly instead of spawning a zfs(8) process per zvol

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
наб 9d47853c74 zstreamdump: exec zstream dump
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11859
2021-04-14 13:19:50 -07:00
TerraTech 30f5b2fbe2 zpl_inode.c: Fix SMACK interoperability
SMACK needs to have the ZFS dentry security field setup before
SMACK's d_instantiate() hook is called as it requires functioning
'__vfs_getxattr()' calls to properly set the labels.

Fxes:
1) file instantiation properly setting the object label to the
   subject's label
2) proper file labeling in a transmutable directory

Functions Updated:
1) zpl_create()
2) zpl_mknod()
3) zpl_mkdir()
4) zpl_symlink()

External-issue: https://github.com/cschaufler/smack-next/issues/1
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: TerraTech <TerraTech@users.noreply.github.com>
Closes #11646 
Closes #11839
2021-04-14 13:19:50 -07:00
Ryan Moeller ad34215364 ZTS: Improve cleanup in removal_with_export
Kill the removal operation on every platform, not just Linux.
The test has been fixed and is now stable on FreeBSD.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11856
2021-04-14 13:19:49 -07:00
Rich Ercolani 1eee2c7b27 Added check for broken alien version
Added a check for alien 8.95.{1,2,3}, which is known to fail to 
generate debs 100% of the time, and instead print out a message 
informing the developer that it's known to be broken and linking 
them to more information.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #11848 
Closes #11850
2021-04-14 13:19:49 -07:00
Brian Behlendorf f17d146ca6 Use dsl_scan_setup_check() to setup a scrub
When a rebuild completes it will automatically schedule a follow up
scrub to verify all of the block checksums.  Before setting up the
scrub execute the counterpart dsl_scan_setup_check() function to
confirm the scrub can be started.  Prior to this change we'd only
check vdev_rebuild_active() which isn't as comprehensive, and using
the check function keeps all of this logic in one place.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11849
2021-04-14 13:19:49 -07:00
Tino Reichardt a35cadcf14 Fix double sha1/sha1.o line in module/icp/Makefile.in
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #11852
2021-04-14 13:19:49 -07:00
Ryan Moeller 72a9750ac8 ZTS: Tests using zhack may fail on FreeBSD
As described in #11854, zhack is occasionally segfaulting on FreeBSD. 
Debugging this is proving to be tricky. To avoid false positives in
the CI add entries for the tests that use zhack in zts-report to 
accept that they may occasionally fail on FreeBSD.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Issue #11854
Closes #11855
2021-04-14 13:19:49 -07:00
Ryan Moeller 7822c01eb6 Ratelimit deadman zevents as with delay zevents
Just as delay zevents can flood the zevent pipe when a vdev becomes
unresponsive, so do the deadman zevents.

Ratelimit deadman zevents according to the same tunable as for delay
zevents.

Enable deadman tests on FreeBSD and add a test for deadman event
ratelimiting. 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11786
2021-04-14 13:19:49 -07:00
Marcin Skarbek 96e15d29fa Add kmodtool fix to detect different System.map location
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Marcin Skarbek <git@skarbek.name>
Closes #7807 
Closes #11836
2021-04-14 13:19:49 -07:00
наб bb8db9d927 zed: untangle _zed_conf_parse_path()
Dunno, maybe it's just me, but the previous style was /really/ confusing

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб 32cc3f0837 zed: don't malloc() global zed_conf instance, optimise zed_conf layout
It's all of 40 bytes with 4-byte pointers and 64 with 8-byte ones
(previously 44 and 88, respectively) ‒
there's no reason it can't live on the stack

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб f96dbd7a29 zed: remove zed_conf::{min,max}_events and ZED_{MIN,MAX}_EVENTS
No users, fields marked "reserved for future use", macros defined to 0

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб c30fef2523 zed: remove zed_conf::syslog_facility
No users, nobody sets it, main() hard-codes LOG_DAEMON, which is the
only correct value for this

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб 487cd2df11 zed: _zed_conf_display_help(): be consistent about what got_err means
Users passed in EXIT_SUCCESS and EXIT_FAILURE, despite it being a bool

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб 5a674a1b42 zed: untangle -h option listing
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб 52f65648e0 zed: print out licence string as one big chunk
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11860
2021-04-14 13:19:49 -07:00
наб 55780d8ec0 zed: only go up to current limit in close_from() fallback
Consider the following strace log:
  prlimit64(0, RLIMIT_NOFILE,
            NULL, {rlim_cur=1024, rlim_max=1024*1024}) = 0
  dup2(0, 30)                         = 30
  dup2(0, 300)                        = 300
  dup2(0, 3000)                       = -1 EBADF (Bad file descriptor)
  dup2(0, 30000)                      = -1 EBADF (Bad file descriptor)
  dup2(0, 300000)                     = -1 EBADF (Bad file descriptor)
  prlimit64(0, RLIMIT_NOFILE,
            {rlim_cur=1024*1024, rlim_max=1024*1024}, NULL) = 0
  dup2(0, 30)                         = 30
  dup2(0, 300)                        = 300
  dup2(0, 3000)                       = 3000
  dup2(0, 30000)                      = 30000
  dup2(0, 300000)                     = 300000

Even a privileged process needs to bump its rlimit before being able
to use fds higher than rlim_cur.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб b3a7e6e7f3 zed.8: the Diagnosis Engine is implemented
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб ea30225fdb zed: replace zed_file_write_n() with write(2), purge it
We set SA_RESTART early on, which will prevent EINTRs (indeed, to the
point of needing to clear it in the reaper, since it interferes with
pause(2)), which is the only error zed_file_write_n() actually handled
(plus, the pid write is no bigger than 12 bytes anyway)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб 018560b153 zed: merge all _NOT_IMPLEMENTED_ events
These events should currently never be generated.

Also untag _zed_event_add_nvpair() from merge with
zpool_do_events_nvprint() ‒ they serve different purposes (machine,
usually script vs human consumption) and format the output differently
as it stands

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб 48b60cffda zed: remove unused zed_file_read_n()
Same deal as zed_file_close_on_exec()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб cb97db792e zed: bump zfs_zevent_len_max if we miss any events
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб 718ee43362 zed.8: don't pretend an unprivileged user could change the script owner
And add a note on /why/ ZEDLETs need to be owned by root

Quoth chown(2), Linux man-pages project:
  Only a privileged process (Linux: one with the CAP_CHOWN capability)
  may change the owner of a file.

Quoth chown(2), FreeBSD:
     [EPERM]  The operation would change the ownership,
              but the effective user ID is not the super-user.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб 01219379cf zed: purge all mentions of a configuration file
There simply isn't a need for one, since the flags the daemon takes
are all short (mostly just toggles) and administrative in nature,
and are therefore better served by the age-old tradition of sourcing an
environment file and preparing the cmdline in the init-specific handler
itself, if needed at all

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб 0a51083e39 zed: implement close_from() in terms of /proc/self/fd, if available
/dev/fd on Darwin

Consider the following strace output:
  prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=1024*1024}) = 0

Yes, that is well over a million file descriptors!

This reduces the ZED start-up time from "at least a second" to
"instantaneous", and, under strace, from "don't even try" to "usable"
by simple virtue of doing five syscalls instead of over a million;
in most cases the main loop does nothing

Recent Linuxes (5.8+) have close_range(2) for this, but that's an
overoptimisation (and libcs don't have wrappers for it yet)

This is also run by the ZEDLET pre-exec. Compare:
  Finished "all-syslog.sh" eid=13 pid=6717 time=1.027100s exit=0
  Finished "history_event-zfs-list-cacher.sh" eid=13 pid=6718 time=1.046923s exit=0
to
  Finished "all-syslog.sh" eid=12 pid=4834 time=0.001836s exit=0
  Finished "history_event-zfs-list-cacher.sh" eid=12 pid=4835 time=0.001346s exit=0
lol

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб b73e40a5ad zed: print combined system/user time after ZEDLET death
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-04-14 13:19:49 -07:00
наб fa991f2a47 zed: allow limiting concurrent jobs
200ms time-out is relatively long, but if we already hit the cap,
then we'll likely be able to spawn multiple new jobs when we wake up

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11807
2021-04-14 13:19:49 -07:00
наб dbf8baf8dd zed: remove unused zed_file_close_on_exec()
The FIXME comment was there since the initial implementation in 2014,
there are no users

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11807
2021-04-14 13:19:49 -07:00
наб 9160c441a4 zed: use separate reaper thread and collect ZEDLETs asynchronously
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11807
2021-04-14 13:19:49 -07:00
наб 84c1652a0a zed: set names for all threads
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11807
2021-04-14 13:19:48 -07:00
Brian Behlendorf 7474d775c7 Tag 2.1.0-rc2
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-04-07 13:30:30 -07:00
Olaf Faaland 506ffdf648 fix misplaced quotes in kmod-preamble
rpm/redhat/zfs-kmod.spec.in has a typo in the shell code that
creates the kmod-preamble file.  This typo results in the
preamble file having the wrong name,

./SOURCES/kmod-preamblenObsoletes

and missing the Obsoletes clause that has become part of the name.

Because the filename is incorrect, the built package does not have
"obsoletes" or "conflicts" set.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #11851
2021-04-07 13:29:10 -07:00
Brian Behlendorf 0bf91beab9 Obsolete earlier packages due to version bump
In order for package managers such as dnf to upgrade cleanly after
the package SONAME bump the obsolete package names must be known.
Update the new packages to correctly obsolete the old ones.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11844
Closes #11847
2021-04-07 13:29:06 -07:00
наб 04344d5bc3 i-t: don't brokenly set the scheduler for root pool vdev's disks
This effectively reverts
  4fc411f7a3 (part of #6807) and
  f6fbe25664 (#9042) ‒
the code itself and latter PR cite symmetry with whole-disk-vdev
behaviour (presumably because rootfs vdevs are rarely whole disks),
but the code is broken for NVME devices (indeed, it'd strip the
controller number instead of the (potential) partition number, turning
"nvme0n1p1" into "nvmen1p1", which would then subsequently fail the
sysfs existence check); it could be fixed to handle those (and any
others) rather easily by dereferencing /sys/class/block/$devname,
but this isn't the place for setting this ‒ as noted in the commit that
removed setting the scheduler by default
(9e17e6f254) ‒ use an udev rule

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11838
2021-04-07 13:28:56 -07:00
наб 46aec6d439 i-t: fix root=zfs:AUTO
IFS= would break loops in import_pool(), which would fault
any automatic import

Additionally $ZFS_BOOTFS from cmdline would interfere with find_rootfs()

If many pools were present, same thing could happen across multiple
find_rootfs() runs, so bail out early and clean up in error path

Suggested-by: @nachtgeist
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11278
Closes #11838
2021-04-07 13:28:52 -07:00
matt-fidd 1bb4b5a5ae zfs get -p only outputs 3 columns if "clones" property is empty
get_clones_string currently returns an empty string for filesystem
snapshots which have no clones. This breaks parsable `zfs get` output as
only three columns are output, instead of 4.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Fiddaman <github@m.fiddaman.uk>
Co-authored-by: matt <matt@fiddaman.net>
Closes #11837
2021-04-07 13:28:14 -07:00
наб 71a3487a89 zpool-features.5: remove "booting not possible with this feature"s
The exact limitations on what features are supported when booting
vary considerably depending on the environment.  In order to minimize
confusion avoid categorical statements which assume GRUB2 is being 
used.  The supported GRUB2 features are covered earlier in this man 
page for easy reference.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11842
2021-04-07 13:28:06 -07:00
George Melikov 2aed1ab13d man: fix wrong .Xr macros usages
In addition, html doc will have working hyperlinks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11845
2021-04-07 13:27:57 -07:00
наб d09e1c1343 libzutil: zfs_isnumber(): return false if input empty
zpool list, which is the only user, would mistakenly try to parse the
empty string as the interval in this case:

  $ zpool list "a"
  cannot open 'a': no such pool
  $ zpool list ""
  interval cannot be zero
  usage: <usage string follows>
which is now symmetric with zpool get:
  $ zpool list ""
  cannot open '': name must begin with a letter

Avoid breaking the  "interval cannot be zero" string.
There simply isn't a need for this, and it's user-facing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11841 
Closes #11843
2021-04-07 13:27:42 -07:00
Brian Behlendorf e2f5074c0f ZTS: pool_checkpoint improvements
The pool_checkpoint tests may incorrectly fail because several of
them invoke zdb for an imported pool.  In this scenario it's not
unexpected for zdb to fail if the pool is modified.  To resolve
this these zdb checks are now done after the pool has been exported.

Additionally, the default cleanup functions assumed the pool would
be imported when they were run.  If this was not the case they're
exit early and fail to cleanup all of the test state causing
subsequent tests to fail.  Add a check to only destroy the pool
when it is imported.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11832
2021-04-07 13:27:28 -07:00
Andrea Gelmini ca7af7f675 Fix various typos
Correct an assortment of typos throughout the code base.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #11774
2021-04-07 13:27:11 -07:00
наб 35cce6ea63 bash_completion.d: always call zfs/zpool binaries directly
/dev/zfs is 0:0 666 on most systems, so the [ -w /dev/zfs ] check always
succeeds, but if zfs isn't in $PATH (e.g. when completing from
"/sbin/zfs list" on a regular account) this can lead to error spew like

  nabijaczleweli@szarotka:~$ /sbin/zfs list bash: zfs: command not found
  @ bash: zfs: command not found

We only do read-only commands, and quite general ones at that,
so there's no need to elevate one way or another.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11828
2021-04-07 13:27:06 -07:00
Brian Behlendorf d539b77934 Add RELEASES.md file
Document the project's policy regarding publishing and maintaining
official OpenZFS releases.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11821
2021-04-07 13:26:58 -07:00
Ryan Moeller 202e7545dc ZTS: inheritance/inherit_001_pos is flaky
Add inheritance/inherit_001_pos to the maybe fails on FreeBSD list.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11830
2021-04-07 13:26:35 -07:00
Ryan Moeller 003f2d04b6 FreeBSD: Fix stable/12 after AT_BENEATH removal
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11827
2021-04-07 13:25:20 -07:00
Brian Behlendorf ec311430e2 Bump libzfs.so and libzpool.so versions
Bump the library versions as advised by the libtool guidelines.

https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html

Two new functions were added but no existing functions were changed,
so we increase the version and the age (version:revision:age).

Added functions (2):
- boolean_t zpool_is_draid_spare(const char *);
- zpool_compat_status_t zpool_load_compat(const char *,
      boolean_t *, char *, char *);

Additionally bump the libzpool.so version information.  This library
is for internal use but we still want to update the version to track
major changes to the interfaces.

The libzfsbootenv, libuutil, libnvpair and libzfs_core libraries
have not been updated.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11817
2021-04-07 13:25:13 -07:00
Ryan Moeller 895d39aa83 Allow pool names that look like Solaris disk names
Nothing bad happens if a prefix of your pool name matches a disk name.
This is a bit of a silly restriction at this point.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11781 
Closes #11813
2021-04-07 13:24:46 -07:00
Ryan Moeller 7f789e150f Don't scale zfs_zevent_len_max by CPU count
The lower bound for this scaling to too low and the upper bound is too
high.  Use a fixed default length of 512 instead, which is a reasonable
value on any system.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11822
2021-04-07 13:24:38 -07:00
Ryan Moeller 917f4b334c Atomically check and set dropped zevent count
ratelimit_dropped isn't protected by a lock and is expected to
be updated atomically.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11822
2021-04-07 13:24:33 -07:00
Brian Behlendorf 6613ea3b33 CI: Increase free space in workflow
Recently we've been running out of free space in the ubuntu 20.04
environment resulting in test failures.  This appears to be caused
by a change in the default available free space and not because of
any change in OpenZFS. Try and avoid this failure by applying a
suggested workaround which removes some unnecessary files.

https://github.com/actions/virtual-environments/issues/2840

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11826
2021-04-07 13:24:28 -07:00
Brian Atkinson e783fa0925 Fixing m4 iops rename check
The configure check for iops->rename wanting flags was missing the
AC_MSG_CHECKING() so it would just print yes without saying what was
being checked.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #11825
2021-04-07 13:24:22 -07:00
наб dc52c0d725 fsck.zfs: implement 4/8 exit codes as suggested in manpage
Update the fsck.zfs helper to bubble up some already-known-about 
errors if they are detected in the pool.

health=degraded => 4/"Filesystem errors left uncorrected"
health=faulted && dataset in /etc/fstab => 8/"Operational error"
pool not found => 8/"Operational error"
everything else => 0

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11806
2021-04-07 13:24:18 -07:00
Mike Swanson bfdd001679 Add compatibility file sets (ZoL 0.6.1, 0.6.4, OpenZFS 2.1)
ZoL 0.6.1 introduced feature flags with the three features that all
implementations at the time were guaranteed to have.  0.6.4 introduced
a few more until 0.6.5 added two after that.  OpenZFS 2.1 added the
dRAID feature.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mike Swanson <mikeonthecomputer@gmail.com>
Closes #11818
2021-04-07 13:24:08 -07:00
578 changed files with 19686 additions and 20934 deletions

View File

@ -2,7 +2,7 @@
name: Bug report
about: Create a report to help us improve OpenZFS
title: ''
labels: 'Type: Defect, Status: Triage Needed'
labels: 'Type: Defect'
assignees: ''
---
@ -25,14 +25,16 @@ Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Kernel Version |
Architecture |
ZFS Version |
SPL Version |
OpenZFS Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
Command to find OpenZFS version:
zfs version
Commands to find kernel version:
uname -r # Linux
freebsd-version -r # FreeBSD
-->
### Describe the problem you're observing

View File

@ -10,5 +10,5 @@ contact_links:
url: https://lists.freebsd.org/mailman/listinfo/freebsd-fs
about: Get community support for OpenZFS on FreeBSD
- name: OpenZFS on IRC
url: https://webchat.freenode.net/#openzfs
url: https://web.libera.chat/#openzfs
about: Use IRC to get community support for OpenZFS

View File

@ -32,5 +32,19 @@ jobs:
run: |
make lint
- name: CheckABI
id: CheckABI
run: |
make checkabi
- name: StoreABI
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
make storeabi
- name: Prepare artifacts
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
path: abi_files.tar

View File

@ -44,6 +44,12 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf "/usr/local/share/boost"
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G

View File

@ -40,6 +40,12 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf "/usr/local/share/boost"
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G -r sanity

View File

@ -45,7 +45,7 @@ jobs:
run: |
sudo mkdir -p $TEST_DIR
# run for 20 minutes to have a total runner time of 30 minutes
sudo /usr/share/zfs/zloop.sh -t 1200 -l -m1
sudo /usr/share/zfs/zloop.sh -t 1200 -l -m1 -- -T 120 -P 60
- name: Prepare artifacts
if: failure()
run: |

2
.gitmodules vendored
View File

@ -1,3 +1,3 @@
[submodule "scripts/zfs-images"]
path = scripts/zfs-images
url = https://github.com/zfsonlinux/zfs-images
url = https://github.com/openzfs/zfs-images

4
META
View File

@ -2,9 +2,9 @@ Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.0
Release: rc1
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.11
Linux-Maximum: 5.14
Linux-Minimum: 3.10

View File

@ -1,3 +1,5 @@
include $(top_srcdir)/config/Shellcheck.am
ACLOCAL_AMFLAGS = -I config
SUBDIRS = include
@ -6,7 +8,7 @@ SUBDIRS += rpm
endif
if CONFIG_USER
SUBDIRS += etc man scripts lib tests cmd contrib
SUBDIRS += man scripts lib tests cmd etc contrib
if BUILD_LINUX
SUBDIRS += udev
endif
@ -26,8 +28,8 @@ endif
AUTOMAKE_OPTIONS = foreign
EXTRA_DIST = autogen.sh copy-builtin
EXTRA_DIST += config/config.awk config/rpm.am config/deb.am config/tgz.am
EXTRA_DIST += META AUTHORS COPYRIGHT LICENSE NEWS NOTICE README.md
EXTRA_DIST += CODE_OF_CONDUCT.md
EXTRA_DIST += AUTHORS CODE_OF_CONDUCT.md COPYRIGHT LICENSE META NEWS NOTICE
EXTRA_DIST += README.md RELEASES.md
EXTRA_DIST += module/lua/README.zfs module/os/linux/spl/README.md
# Include all the extra licensing information for modules
@ -123,17 +125,8 @@ cstyle:
filter_executable = -exec test -x '{}' \; -print
PHONY += shellcheck
shellcheck:
@if type shellcheck > /dev/null 2>&1; then \
shellcheck --exclude=SC1090 --exclude=SC1117 --format=gcc \
$$(find ${top_srcdir}/scripts/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zed/zed.d/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zpool/zpool.d/* \
-type f ${filter_executable}); \
else \
echo "skipping shellcheck because shellcheck is not installed"; \
fi
SHELLCHECKDIRS = cmd contrib etc scripts tests
SHELLCHECKSCRIPTS = autogen.sh
PHONY += checkabi storeabi
checkabi: lib
@ -142,40 +135,9 @@ checkabi: lib
storeabi: lib
$(MAKE) -C lib storeabi
PHONY += checkbashisms
checkbashisms:
@if type checkbashisms > /dev/null 2>&1; then \
checkbashisms -n -p -x \
$$(find ${top_srcdir} \
-name '.git' -prune \
-o -name 'build' -prune \
-o -name 'tests' -prune \
-o -name 'config' -prune \
-o -name 'zed-functions.sh*' -prune \
-o -name 'zfs-import*' -prune \
-o -name 'zfs-mount*' -prune \
-o -name 'zfs-zed*' -prune \
-o -name 'smart' -prune \
-o -name 'paxcheck.sh' -prune \
-o -name 'make_gitrev.sh' -prune \
-o -name '90zfs' -prune \
-o -type f ! -name 'config*' \
! -name 'libtool' \
-exec sh -c 'awk "NR==1 && /\#\!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
else \
echo "skipping checkbashisms because checkbashisms is not installed"; \
fi
PHONY += mancheck
mancheck:
@if type mandoc > /dev/null 2>&1; then \
find ${top_srcdir}/man/man8 -type f -name 'zfs.8' \
-o -name 'zpool.8' -o -name 'zdb.8' \
-o -name 'zgenhostid.8' | \
xargs mandoc -Tlint -Werror; \
else \
echo "skipping mancheck because mandoc is not installed"; \
fi
${top_srcdir}/scripts/mancheck.sh ${top_srcdir}/man ${top_srcdir}/tests/test-runner/man
if BUILD_LINUX
stat_fmt = -c '%A %n'

37
RELEASES.md Normal file
View File

@ -0,0 +1,37 @@
OpenZFS uses the MAJOR.MINOR.PATCH versioning scheme described here:
* MAJOR - Incremented at the discretion of the OpenZFS developers to indicate
a particularly noteworthy feature or change. An increase in MAJOR number
does not indicate any incompatible on-disk format change. The ability
to import a ZFS pool is controlled by the feature flags enabled on the
pool and the feature flags supported by the installed OpenZFS version.
Increasing the MAJOR version is expected to be an infrequent occurrence.
* MINOR - Incremented to indicate new functionality such as a new feature
flag, pool/dataset property, zfs/zpool sub-command, new user/kernel
interface, etc. MINOR releases may introduce incompatible changes to the
user space library APIs (libzfs.so). Existing user/kernel interfaces are
considered to be stable to maximize compatibility between OpenZFS releases.
Additions to the user/kernel interface are backwards compatible.
* PATCH - Incremented when applying documentation updates, important bug
fixes, minor performance improvements, and kernel compatibility patches.
The user space library APIs and user/kernel interface are considered to
be stable. PATCH releases for a MAJOR.MINOR are published as needed.
Two release branches are maintained for OpenZFS, they are:
* OpenZFS LTS - A designated MAJOR.MINOR release with periodic PATCH
releases that incorporate important changes backported from newer OpenZFS
releases. This branch is intended for use in environments using an
LTS, enterprise, or similarly managed kernel (RHEL, Ubuntu LTS, Debian).
Minor changes to support these distribution kernels will be applied as
needed. New kernel versions released after the OpenZFS LTS release are
not supported. LTS releases will receive patches for at least 2 years.
The current LTS release is OpenZFS 2.1.
* OpenZFS current - Tracks the newest MAJOR.MINOR release. This branch
includes support for the latest OpenZFS features and recently releases
kernels. When a new MINOR release is tagged the previous MINOR release
will no longer be maintained (unless it is an LTS release). New MINOR
releases are planned to occur roughly annually.

View File

@ -1,10 +1,15 @@
SUBDIRS = zfs zpool zdb zhack zinject zstream zstreamdump ztest
include $(top_srcdir)/config/Shellcheck.am
SUBDIRS = zfs zpool zdb zhack zinject zstream ztest
SUBDIRS += fsck_zfs vdev_id raidz_test zfs_ids_to_path
SUBDIRS += zpool_influxdb
CPPCHECKDIRS = zfs zpool zdb zhack zinject zstream ztest
CPPCHECKDIRS += raidz_test zfs_ids_to_path zpool_influxdb
# TODO: #12084: SHELLCHECKDIRS = fsck_zfs vdev_id zpool
SHELLCHECKDIRS = fsck_zfs zpool
if USING_PYTHON
SUBDIRS += arcstat arc_summary dbufstat
endif
@ -12,6 +17,7 @@ endif
if BUILD_LINUX
SUBDIRS += mount_zfs zed zgenhostid zvol_id zvol_wait
CPPCHECKDIRS += mount_zfs zed zgenhostid zvol_id
SHELLCHECKDIRS += zed
endif
PHONY = cppcheck

View File

@ -102,18 +102,6 @@ show_tunable_descriptions = False
alternate_tunable_layout = False
def handle_Exception(ex_cls, ex, tb):
if ex is IOError:
if ex.errno == errno.EPIPE:
sys.exit()
if ex is KeyboardInterrupt:
sys.exit()
sys.excepthook = handle_Exception
def get_Kstat():
"""Collect information on the ZFS subsystem from the /proc virtual
file system. The name "kstat" is a holdover from the Solaris utility
@ -1137,48 +1125,55 @@ def main():
global alternate_tunable_layout
try:
opts, args = getopt.getopt(
sys.argv[1:],
"adp:h", ["alternate", "description", "page=", "help"]
)
except getopt.error as e:
sys.stderr.write("Error: %s\n" % e.msg)
usage()
sys.exit(1)
args = {}
for opt, arg in opts:
if opt in ('-a', '--alternate'):
args['a'] = True
if opt in ('-d', '--description'):
args['d'] = True
if opt in ('-p', '--page'):
args['p'] = arg
if opt in ('-h', '--help'):
usage()
sys.exit(0)
Kstat = get_Kstat()
alternate_tunable_layout = 'a' in args
show_tunable_descriptions = 'd' in args
pages = []
if 'p' in args:
try:
pages.append(unSub[int(args['p']) - 1])
except IndexError:
sys.stderr.write('the argument to -p must be between 1 and ' +
str(len(unSub)) + '\n')
opts, args = getopt.getopt(
sys.argv[1:],
"adp:h", ["alternate", "description", "page=", "help"]
)
except getopt.error as e:
sys.stderr.write("Error: %s\n" % e.msg)
usage()
sys.exit(1)
else:
pages = unSub
zfs_header()
for page in pages:
page(Kstat)
sys.stdout.write("\n")
args = {}
for opt, arg in opts:
if opt in ('-a', '--alternate'):
args['a'] = True
if opt in ('-d', '--description'):
args['d'] = True
if opt in ('-p', '--page'):
args['p'] = arg
if opt in ('-h', '--help'):
usage()
sys.exit(0)
Kstat = get_Kstat()
alternate_tunable_layout = 'a' in args
show_tunable_descriptions = 'd' in args
pages = []
if 'p' in args:
try:
pages.append(unSub[int(args['p']) - 1])
except IndexError:
sys.stderr.write('the argument to -p must be between 1 and ' +
str(len(unSub)) + '\n')
sys.exit(1)
else:
pages = unSub
zfs_header()
for page in pages:
page(Kstat)
sys.stdout.write("\n")
except IOError as ex:
if (ex.errno == errno.EPIPE):
sys.exit(0)
raise
except KeyboardInterrupt:
sys.exit(0)
if __name__ == '__main__':

View File

@ -42,6 +42,13 @@ import os
import subprocess
import sys
import time
import errno
# We can't use env -S portably, and we need python3 -u to handle pipes in
# the shell abruptly closing the way we want to, so...
import io
if isinstance(sys.__stderr__.buffer, io.BufferedWriter):
os.execv(sys.executable, [sys.executable, "-u"] + sys.argv)
DESCRIPTION = 'Print ARC and other statistics for OpenZFS'
INDENT = ' '*8
@ -161,21 +168,11 @@ elif sys.platform.startswith('linux'):
# The original arc_summary called /sbin/modinfo/{spl,zfs} to get
# the version information. We switch to /sys/module/{spl,zfs}/version
# to make sure we get what is really loaded in the kernel
command = ["cat", "/sys/module/{0}/version".format(request)]
req = request.upper()
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
version = info.stdout.strip()
else:
info = subprocess.check_output(command, universal_newlines=True)
version = info.strip()
return version
try:
with open("/sys/module/{}/version".format(request)) as f:
return f.read().strip()
except:
return "(unknown)"
def get_descriptions(request):
"""Get the descriptions of the Solaris Porting Layer (SPL) or the
@ -231,6 +228,29 @@ elif sys.platform.startswith('linux'):
return descs
def handle_unraisableException(exc_type, exc_value=None, exc_traceback=None,
err_msg=None, object=None):
handle_Exception(exc_type, object, exc_traceback)
def handle_Exception(ex_cls, ex, tb):
if ex_cls is KeyboardInterrupt:
sys.exit()
if ex_cls is BrokenPipeError:
# It turns out that while sys.exit() triggers an exception
# not handled message on Python 3.8+, os._exit() does not.
os._exit(0)
if ex_cls is OSError:
if ex.errno == errno.ENOTCONN:
sys.exit()
raise ex
if hasattr(sys,'unraisablehook'): # Python 3.8+
sys.unraisablehook = handle_unraisableException
sys.excepthook = handle_Exception
def cleanup_line(single_line):
"""Format a raw line of data from /proc and isolate the name value

1
cmd/fsck_zfs/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
/fsck.zfs

View File

@ -1 +1,6 @@
include $(top_srcdir)/config/Substfiles.am
include $(top_srcdir)/config/Shellcheck.am
dist_sbin_SCRIPTS = fsck.zfs
SUBSTFILES += $(dist_sbin_SCRIPTS)

View File

@ -1,9 +0,0 @@
#!/bin/sh
#
# fsck.zfs: A fsck helper to accommodate distributions that expect
# to be able to execute a fsck on all filesystem types. Currently
# this script does nothing but it could be extended to act as a
# compatibility wrapper for 'zpool scrub'.
#
exit 0

44
cmd/fsck_zfs/fsck.zfs.in Executable file
View File

@ -0,0 +1,44 @@
#!/bin/sh
#
# fsck.zfs: A fsck helper to accommodate distributions that expect
# to be able to execute a fsck on all filesystem types.
#
# This script simply bubbles up some already-known-about errors,
# see fsck.zfs(8)
#
if [ "$#" = "0" ]; then
echo "Usage: $0 [options] dataset…" >&2
exit 16
fi
ret=0
for dataset in "$@"; do
case "$dataset" in
-*)
continue
;;
*)
;;
esac
pool="${dataset%%/*}"
case "$(@sbindir@/zpool list -Ho health "$pool")" in
DEGRADED)
ret=$(( ret | 4 ))
;;
FAULTED)
awk '!/^([[:space:]]*#.*)?$/ && $1 == "'"$dataset"'" && $3 == "zfs" {exit 1}' /etc/fstab || \
ret=$(( ret | 8 ))
;;
"")
# Pool not found, error printed by zpool(8)
ret=$(( ret | 8 ))
;;
*)
;;
esac
done
exit "$ret"

View File

@ -185,10 +185,11 @@ main(int argc, char **argv)
break;
case 'h':
case '?':
(void) fprintf(stderr, gettext("Invalid option '%c'\n"),
optopt);
if (optopt)
(void) fprintf(stderr,
gettext("Invalid option '%c'\n"), optopt);
(void) fprintf(stderr, gettext("Usage: mount.zfs "
"[-sfnv] [-o options] <dataset> <mountpoint>\n"));
"[-sfnvh] [-o options] <dataset> <mountpoint>\n"));
return (MOUNT_USAGE);
}
}

View File

@ -1 +1,3 @@
include $(top_srcdir)/config/Shellcheck.am
dist_udev_SCRIPTS = vdev_id

View File

@ -140,15 +140,17 @@ Usage: vdev_id [-h]
-p number of phy's per switch port [default=$PHYS_PER_PORT]
-h show this summary
EOF
exit 0
exit 1
# exit with error to avoid processing usage message by a udev rule
}
map_slot() {
LINUX_SLOT=$1
CHANNEL=$2
MAPPED_SLOT=$(awk '$1 == "slot" && $2 == "${LINUX_SLOT}" && \
$4 ~ /^${CHANNEL}$|^$/ { print $3; exit}' $CONFIG)
MAPPED_SLOT=$(awk -v linux_slot="$LINUX_SLOT" -v channel="$CHANNEL" \
'$1 == "slot" && $2 == linux_slot && \
($4 ~ "^"channel"$" || $4 ~ /^$/) { print $3; exit}' $CONFIG)
if [ -z "$MAPPED_SLOT" ] ; then
MAPPED_SLOT=$LINUX_SLOT
fi
@ -163,7 +165,7 @@ map_channel() {
case $TOPOLOGY in
"sas_switch")
MAPPED_CHAN=$(awk -v port="$PORT" \
'$1 == "channel" && $2 == ${PORT} \
'$1 == "channel" && $2 == port \
{ print $3; exit }' $CONFIG)
;;
"sas_direct"|"scsi")
@ -727,7 +729,7 @@ done
if [ ! -r "$CONFIG" ] ; then
echo "Error: Config file \"$CONFIG\" not found"
exit 0
exit 1
fi
if [ -z "$DEV" ] && [ -z "$ENCLOSURE_MODE" ] ; then

View File

@ -162,12 +162,6 @@ static int dump_bpobj_cb(void *arg, const blkptr_t *bp, boolean_t free,
dmu_tx_t *tx);
typedef struct sublivelist_verify {
/* all ALLOC'd blkptr_t in one sub-livelist */
zfs_btree_t sv_all_allocs;
/* all FREE'd blkptr_t in one sub-livelist */
zfs_btree_t sv_all_frees;
/* FREE's that haven't yet matched to an ALLOC, in one sub-livelist */
zfs_btree_t sv_pair;
@ -226,29 +220,68 @@ typedef struct sublivelist_verify_block {
static void zdb_print_blkptr(const blkptr_t *bp, int flags);
typedef struct sublivelist_verify_block_refcnt {
/* block pointer entry in livelist being verified */
blkptr_t svbr_blk;
/*
* Refcount gets incremented to 1 when we encounter the first
* FREE entry for the svfbr block pointer and a node for it
* is created in our ZDB verification/tracking metadata.
*
* As we encounter more FREE entries we increment this counter
* and similarly decrement it whenever we find the respective
* ALLOC entries for this block.
*
* When the refcount gets to 0 it means that all the FREE and
* ALLOC entries of this block have paired up and we no longer
* need to track it in our verification logic (e.g. the node
* containing this struct in our verification data structure
* should be freed).
*
* [refer to sublivelist_verify_blkptr() for the actual code]
*/
uint32_t svbr_refcnt;
} sublivelist_verify_block_refcnt_t;
static int
sublivelist_block_refcnt_compare(const void *larg, const void *rarg)
{
const sublivelist_verify_block_refcnt_t *l = larg;
const sublivelist_verify_block_refcnt_t *r = rarg;
return (livelist_compare(&l->svbr_blk, &r->svbr_blk));
}
static int
sublivelist_verify_blkptr(void *arg, const blkptr_t *bp, boolean_t free,
dmu_tx_t *tx)
{
ASSERT3P(tx, ==, NULL);
struct sublivelist_verify *sv = arg;
char blkbuf[BP_SPRINTF_LEN];
sublivelist_verify_block_refcnt_t current = {
.svbr_blk = *bp,
/*
* Start with 1 in case this is the first free entry.
* This field is not used for our B-Tree comparisons
* anyway.
*/
.svbr_refcnt = 1,
};
zfs_btree_index_t where;
sublivelist_verify_block_refcnt_t *pair =
zfs_btree_find(&sv->sv_pair, &current, &where);
if (free) {
zfs_btree_add(&sv->sv_pair, bp);
/* Check if the FREE is a duplicate */
if (zfs_btree_find(&sv->sv_all_frees, bp, &where) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp,
free);
(void) printf("\tERROR: Duplicate FREE: %s\n", blkbuf);
if (pair == NULL) {
/* first free entry for this block pointer */
zfs_btree_add(&sv->sv_pair, &current);
} else {
zfs_btree_add_idx(&sv->sv_all_frees, bp, &where);
pair->svbr_refcnt++;
}
} else {
/* Check if the ALLOC has been freed */
if (zfs_btree_find(&sv->sv_pair, bp, &where) != NULL) {
zfs_btree_remove_idx(&sv->sv_pair, &where);
} else {
if (pair == NULL) {
/* block that is currently marked as allocated */
for (int i = 0; i < SPA_DVAS_PER_BP; i++) {
if (DVA_IS_EMPTY(&bp->blk_dva[i]))
break;
@ -263,16 +296,16 @@ sublivelist_verify_blkptr(void *arg, const blkptr_t *bp, boolean_t free,
&svb, &where);
}
}
}
/* Check if the ALLOC is a duplicate */
if (zfs_btree_find(&sv->sv_all_allocs, bp, &where) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp,
free);
(void) printf("\tERROR: Duplicate ALLOC: %s\n", blkbuf);
} else {
zfs_btree_add_idx(&sv->sv_all_allocs, bp, &where);
/* alloc matches a free entry */
pair->svbr_refcnt--;
if (pair->svbr_refcnt == 0) {
/* all allocs and frees have been matched */
zfs_btree_remove_idx(&sv->sv_pair, &where);
}
}
}
return (0);
}
@ -280,32 +313,22 @@ static int
sublivelist_verify_func(void *args, dsl_deadlist_entry_t *dle)
{
int err;
char blkbuf[BP_SPRINTF_LEN];
struct sublivelist_verify *sv = args;
zfs_btree_create(&sv->sv_all_allocs, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_all_frees, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_pair, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_pair, sublivelist_block_refcnt_compare,
sizeof (sublivelist_verify_block_refcnt_t));
err = bpobj_iterate_nofree(&dle->dle_bpobj, sublivelist_verify_blkptr,
sv, NULL);
zfs_btree_clear(&sv->sv_all_allocs);
zfs_btree_destroy(&sv->sv_all_allocs);
zfs_btree_clear(&sv->sv_all_frees);
zfs_btree_destroy(&sv->sv_all_frees);
blkptr_t *e;
sublivelist_verify_block_refcnt_t *e;
zfs_btree_index_t *cookie = NULL;
while ((e = zfs_btree_destroy_nodes(&sv->sv_pair, &cookie)) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), e, B_TRUE);
(void) printf("\tERROR: Unmatched FREE: %s\n", blkbuf);
char blkbuf[BP_SPRINTF_LEN];
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf),
&e->svbr_blk, B_TRUE);
(void) printf("\tERROR: %d unmatched FREE(s): %s\n",
e->svbr_refcnt, blkbuf);
}
zfs_btree_destroy(&sv->sv_pair);
@ -614,10 +637,14 @@ mv_populate_livelist_allocs(metaslab_verify_t *mv, sublivelist_verify_t *sv)
/*
* [Livelist Check]
* Iterate through all the sublivelists and:
* - report leftover frees
* - report double ALLOCs/FREEs
* - report leftover frees (**)
* - record leftover ALLOCs together with their TXG [see Cross Check]
*
* (**) Note: Double ALLOCs are valid in datasets that have dedup
* enabled. Similarly double FREEs are allowed as well but
* only if they pair up with a corresponding ALLOC entry once
* we our done with our sublivelist iteration.
*
* [Spacemap Check]
* for each metaslab:
* - iterate over spacemap and then the metaslab's entries in the
@ -5932,7 +5959,8 @@ zdb_leak_init_prepare_indirect_vdevs(spa_t *spa, zdb_cb_t *zcb)
vdev_metaslab_group_create(vd);
VERIFY0(vdev_metaslab_init(vd, 0));
vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
vdev_indirect_mapping_t *vim __maybe_unused =
vd->vdev_indirect_mapping;
uint64_t vim_idx = 0;
for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
@ -7041,7 +7069,7 @@ verify_checkpoint_vdev_spacemaps(spa_t *checkpoint, spa_t *current)
for (uint64_t c = ckpoint_rvd->vdev_children;
c < current_rvd->vdev_children; c++) {
vdev_t *current_vd = current_rvd->vdev_child[c];
ASSERT3P(current_vd->vdev_checkpoint_sm, ==, NULL);
VERIFY3P(current_vd->vdev_checkpoint_sm, ==, NULL);
}
}

View File

@ -1,8 +1,10 @@
include $(top_srcdir)/config/Rules.am
include $(top_srcdir)/config/Shellcheck.am
AM_CFLAGS += $(LIBUDEV_CFLAGS) $(LIBUUID_CFLAGS)
SUBDIRS = zed.d
SHELLCHECKDIRS = $(SUBDIRS)
sbin_PROGRAMS = zed
@ -43,7 +45,7 @@ zed_LDADD = \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la \
$(abs_top_builddir)/lib/libuutil/libuutil.la
zed_LDADD += -lrt $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDADD += -lrt $(LIBATOMIC_LIBS) $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDFLAGS = -pthread
EXTRA_DIST = agents/README.md

View File

@ -392,6 +392,7 @@ zfs_agent_init(libzfs_handle_t *zfs_hdl)
list_destroy(&agent_events);
zed_log_die("Failed to initialize agents");
}
pthread_setname_np(g_agents_tid, "agents");
}
void

View File

@ -640,6 +640,27 @@ devid_iter(const char *devid, zfs_process_func_t func, boolean_t is_slice)
return (data.dd_found);
}
/*
* Given a device guid, find any vdevs with a matching guid.
*/
static boolean_t
guid_iter(uint64_t pool_guid, uint64_t vdev_guid, const char *devid,
zfs_process_func_t func, boolean_t is_slice)
{
dev_data_t data = { 0 };
data.dd_func = func;
data.dd_found = B_FALSE;
data.dd_pool_guid = pool_guid;
data.dd_vdev_guid = vdev_guid;
data.dd_islabeled = is_slice;
data.dd_new_devid = devid;
(void) zpool_iter(g_zfshdl, zfs_iter_pool, &data);
return (data.dd_found);
}
/*
* Handle a EC_DEV_ADD.ESC_DISK event.
*
@ -663,15 +684,18 @@ static int
zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
{
char *devpath = NULL, *devid;
uint64_t pool_guid = 0, vdev_guid = 0;
boolean_t is_slice;
/*
* Expecting a devid string and an optional physical location
* Expecting a devid string and an optional physical location and guid
*/
if (nvlist_lookup_string(nvl, DEV_IDENTIFIER, &devid) != 0)
return (-1);
(void) nvlist_lookup_string(nvl, DEV_PHYS_PATH, &devpath);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &pool_guid);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &vdev_guid);
is_slice = (nvlist_lookup_boolean(nvl, DEV_IS_PART) == 0);
@ -682,12 +706,16 @@ zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
* Iterate over all vdevs looking for a match in the following order:
* 1. ZPOOL_CONFIG_DEVID (identifies the unique disk)
* 2. ZPOOL_CONFIG_PHYS_PATH (identifies disk physical location).
*
* For disks, we only want to pay attention to vdevs marked as whole
* disks or are a multipath device.
* 3. ZPOOL_CONFIG_GUID (identifies unique vdev).
*/
if (!devid_iter(devid, zfs_process_add, is_slice) && devpath != NULL)
(void) devphys_iter(devpath, devid, zfs_process_add, is_slice);
if (devid_iter(devid, zfs_process_add, is_slice))
return (0);
if (devpath != NULL && devphys_iter(devpath, devid, zfs_process_add,
is_slice))
return (0);
if (vdev_guid != 0)
(void) guid_iter(pool_guid, vdev_guid, devid, zfs_process_add,
is_slice);
return (0);
}
@ -918,6 +946,7 @@ zfs_slm_init()
return (-1);
}
pthread_setname_np(g_zfs_tid, "enum-pools");
list_create(&g_device_list, sizeof (struct pendingdev),
offsetof(struct pendingdev, pd_node));

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -60,8 +60,8 @@ _setup_sig_handlers(void)
zed_log_die("Failed to initialize sigset");
sa.sa_flags = SA_RESTART;
sa.sa_handler = SIG_IGN;
sa.sa_handler = SIG_IGN;
if (sigaction(SIGPIPE, &sa, NULL) < 0)
zed_log_die("Failed to ignore SIGPIPE");
@ -75,6 +75,10 @@ _setup_sig_handlers(void)
sa.sa_handler = _hup_handler;
if (sigaction(SIGHUP, &sa, NULL) < 0)
zed_log_die("Failed to register SIGHUP handler");
(void) sigaddset(&sa.sa_mask, SIGCHLD);
if (pthread_sigmask(SIG_BLOCK, &sa.sa_mask, NULL) < 0)
zed_log_die("Failed to block SIGCHLD");
}
/*
@ -212,22 +216,20 @@ _finish_daemonize(void)
int
main(int argc, char *argv[])
{
struct zed_conf *zcp;
struct zed_conf zcp;
uint64_t saved_eid;
int64_t saved_etime[2];
zed_log_init(argv[0]);
zed_log_stderr_open(LOG_NOTICE);
zcp = zed_conf_create();
zed_conf_parse_opts(zcp, argc, argv);
if (zcp->do_verbose)
zed_conf_init(&zcp);
zed_conf_parse_opts(&zcp, argc, argv);
if (zcp.do_verbose)
zed_log_stderr_open(LOG_INFO);
if (geteuid() != 0)
zed_log_die("Must be run as root");
zed_conf_parse_file(zcp);
zed_file_close_from(STDERR_FILENO + 1);
(void) umask(0);
@ -235,32 +237,32 @@ main(int argc, char *argv[])
if (chdir("/") < 0)
zed_log_die("Failed to change to root directory");
if (zed_conf_scan_dir(zcp) < 0)
if (zed_conf_scan_dir(&zcp) < 0)
exit(EXIT_FAILURE);
if (!zcp->do_foreground) {
if (!zcp.do_foreground) {
_start_daemonize();
zed_log_syslog_open(LOG_DAEMON);
}
_setup_sig_handlers();
if (zcp->do_memlock)
if (zcp.do_memlock)
_lock_memory();
if ((zed_conf_write_pid(zcp) < 0) && (!zcp->do_force))
if ((zed_conf_write_pid(&zcp) < 0) && (!zcp.do_force))
exit(EXIT_FAILURE);
if (!zcp->do_foreground)
if (!zcp.do_foreground)
_finish_daemonize();
zed_log_msg(LOG_NOTICE,
"ZFS Event Daemon %s-%s (PID %d)",
ZFS_META_VERSION, ZFS_META_RELEASE, (int)getpid());
if (zed_conf_open_state(zcp) < 0)
if (zed_conf_open_state(&zcp) < 0)
exit(EXIT_FAILURE);
if (zed_conf_read_state(zcp, &saved_eid, saved_etime) < 0)
if (zed_conf_read_state(&zcp, &saved_eid, saved_etime) < 0)
exit(EXIT_FAILURE);
idle:
@ -269,24 +271,24 @@ idle:
* successful.
*/
do {
if (!zed_event_init(zcp))
if (!zed_event_init(&zcp))
break;
/* Wait for some time and try again. tunable? */
sleep(30);
} while (!_got_exit && zcp->do_idle);
} while (!_got_exit && zcp.do_idle);
if (_got_exit)
goto out;
zed_event_seek(zcp, saved_eid, saved_etime);
zed_event_seek(&zcp, saved_eid, saved_etime);
while (!_got_exit) {
int rv;
if (_got_hup) {
_got_hup = 0;
(void) zed_conf_scan_dir(zcp);
(void) zed_conf_scan_dir(&zcp);
}
rv = zed_event_service(zcp);
rv = zed_event_service(&zcp);
/* ENODEV: When kernel module is unloaded (osx) */
if (rv == ENODEV)
@ -294,13 +296,13 @@ idle:
}
zed_log_msg(LOG_NOTICE, "Exiting");
zed_event_fini(zcp);
zed_event_fini(&zcp);
if (zcp->do_idle && !_got_exit)
if (zcp.do_idle && !_got_exit)
goto idle;
out:
zed_conf_destroy(zcp);
zed_conf_destroy(&zcp);
zed_log_fini();
exit(EXIT_SUCCESS);
}

View File

@ -1,5 +1,6 @@
include $(top_srcdir)/config/Rules.am
include $(top_srcdir)/config/Substfiles.am
include $(top_srcdir)/config/Shellcheck.am
EXTRA_DIST += README
@ -51,3 +52,6 @@ install-data-hook:
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
# False positive: 1>&"${ZED_FLOCK_FD}" looks suspiciously similar to a >&filename bash extension
CHECKBASHISMS_IGNORE = -e 'should be >word 2>&1' -e '&"$${ZED_FLOCK_FD}"'

View File

@ -12,15 +12,11 @@
zed_exit_if_ignoring_this_event
lockfile="$(basename -- "${ZED_DEBUG_LOG}").lock"
zed_lock "${ZED_DEBUG_LOG}"
{
printenv | sort
echo
} 1>&"${ZED_FLOCK_FD}"
zed_unlock "${ZED_DEBUG_LOG}"
umask 077
zed_lock "${lockfile}"
exec >> "${ZED_DEBUG_LOG}"
printenv | sort
echo
exec >&-
zed_unlock "${lockfile}"
exit 0

View File

@ -42,6 +42,7 @@ fi
msg="${msg} delay=$((ZEVENT_ZIO_DELAY / 1000000))ms"
# list the bookmark data together
# shellcheck disable=SC2153
[ -n "${ZEVENT_ZIO_OBJSET}" ] && \
msg="${msg} bookmark=${ZEVENT_ZIO_OBJSET}:${ZEVENT_ZIO_OBJECT}:${ZEVENT_ZIO_LEVEL}:${ZEVENT_ZIO_BLKID}"

View File

@ -25,7 +25,7 @@ zed_rate_limit "${rate_limit_tag}" || exit 3
umask 077
note_subject="ZFS ${ZEVENT_SUBCLASS} error for ${ZEVENT_POOL} on $(hostname)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
note_pathname="$(mktemp)"
{
echo "ZFS has detected a data error:"
echo

View File

@ -31,7 +31,7 @@ umask 077
pool_str="${ZEVENT_POOL:+" for ${ZEVENT_POOL}"}"
host_str=" on $(hostname)"
note_subject="ZFS ${ZEVENT_SUBCLASS} event${pool_str}${host_str}"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
note_pathname="$(mktemp)"
{
echo "ZFS has posted the following event:"
echo

View File

@ -3,9 +3,8 @@
# Track changes to enumerated pools for use in early-boot
set -ef
FSLIST_DIR="@sysconfdir@/zfs/zfs-list.cache"
FSLIST_TMP="@runstatedir@/zfs-list.cache.new"
FSLIST="${FSLIST_DIR}/${ZEVENT_POOL}"
FSLIST="@sysconfdir@/zfs/zfs-list.cache/${ZEVENT_POOL}"
FSLIST_TMP="@runstatedir@/zfs-list.cache@${ZEVENT_POOL}"
# If the pool specific cache file is not writeable, abort
[ -w "${FSLIST}" ] || exit 0
@ -14,20 +13,20 @@ FSLIST="${FSLIST_DIR}/${ZEVENT_POOL}"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
[ "$ZEVENT_SUBCLASS" != "history_event" ] && exit 0
zed_check_cmd "${ZFS}" sort diff grep
zed_check_cmd "${ZFS}" sort diff
# If we are acting on a snapshot, we have nothing to do
printf '%s' "${ZEVENT_HISTORY_DSNAME}" | grep '@' && exit 0
[ "${ZEVENT_HISTORY_DSNAME%@*}" = "${ZEVENT_HISTORY_DSNAME}" ] || exit 0
# We obtain a lock on zfs-list to avoid any simultaneous writes.
# We lock the output file to avoid simultaneous writes.
# If we run into trouble, log and drop the lock
abort_alter() {
zed_log_msg "Error updating zfs-list.cache!"
zed_unlock zfs-list
zed_log_msg "Error updating zfs-list.cache for ${ZEVENT_POOL}!"
zed_unlock "${FSLIST}"
}
finished() {
zed_unlock zfs-list
zed_unlock "${FSLIST}"
trap - EXIT
exit 0
}
@ -37,7 +36,7 @@ case "${ZEVENT_HISTORY_INTERNAL_NAME}" in
;;
export)
zed_lock zfs-list
zed_lock "${FSLIST}"
trap abort_alter EXIT
echo > "${FSLIST}"
finished
@ -63,7 +62,7 @@ case "${ZEVENT_HISTORY_INTERNAL_NAME}" in
;;
esac
zed_lock zfs-list
zed_lock "${FSLIST}"
trap abort_alter EXIT
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec\
@ -79,7 +78,7 @@ PROPS="name,mountpoint,canmount,atime,relatime,devices,exec\
sort "${FSLIST_TMP}" -o "${FSLIST_TMP}"
# Don't modify the file if it hasn't changed
diff -q "${FSLIST_TMP}" "${FSLIST}" || mv "${FSLIST_TMP}" "${FSLIST}"
diff -q "${FSLIST_TMP}" "${FSLIST}" || cat "${FSLIST_TMP}" > "${FSLIST}"
rm -f "${FSLIST_TMP}"
finished

View File

@ -41,7 +41,7 @@ fi
umask 077
note_subject="ZFS ${ZEVENT_SUBCLASS} event for ${ZEVENT_POOL} on $(hostname)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
note_pathname="$(mktemp)"
{
echo "ZFS has finished a ${action}:"
echo

View File

@ -1,21 +1,21 @@
#!/bin/sh
#
# Turn off/on the VDEV's enclosure fault LEDs when the pool's state changes.
# Turn off/on vdevs' enclosure fault LEDs when their pool's state changes.
#
# Turn the VDEV's fault LED on if it becomes FAULTED, DEGRADED or UNAVAIL.
# Turn the LED off when it's back ONLINE again.
# Turn a vdev's fault LED on if it becomes FAULTED, DEGRADED or UNAVAIL.
# Turn its LED off when it's back ONLINE again.
#
# This script run in two basic modes:
#
# 1. If $ZEVENT_VDEV_ENC_SYSFS_PATH and $ZEVENT_VDEV_STATE_STR are set, then
# only set the LED for that particular VDEV. This is the case for statechange
# only set the LED for that particular vdev. This is the case for statechange
# events and some vdev_* events.
#
# 2. If those vars are not set, then check the state of all VDEVs in the pool
# 2. If those vars are not set, then check the state of all vdevs in the pool
# and set the LEDs accordingly. This is the case for pool_import events.
#
# Note that this script requires that your enclosure be supported by the
# Linux SCSI enclosure services (ses) driver. The script will do nothing
# Linux SCSI Enclosure services (SES) driver. The script will do nothing
# if you have no enclosure, or if your enclosure isn't supported.
#
# Exit codes:
@ -59,6 +59,10 @@ check_and_set_led()
file="$1"
val="$2"
if [ -z "$val" ]; then
return 0
fi
if [ ! -e "$file" ] ; then
return 3
fi
@ -66,11 +70,11 @@ check_and_set_led()
# If another process is accessing the LED when we attempt to update it,
# the update will be lost so retry until the LED actually changes or we
# timeout.
for _ in $(seq 1 5); do
for _ in 1 2 3 4 5; do
# We want to check the current state first, since writing to the
# 'fault' entry always causes a SES command, even if the
# current state is already what you want.
current=$(cat "${file}")
read -r current < "${file}"
# On some enclosures if you write 1 to fault, and read it back,
# it will return 2. Treat all non-zero values as 1 for
@ -85,27 +89,29 @@ check_and_set_led()
else
break
fi
done
done
}
state_to_val()
{
state="$1"
if [ "$state" = "FAULTED" ] || [ "$state" = "DEGRADED" ] || \
[ "$state" = "UNAVAIL" ] ; then
echo 1
elif [ "$state" = "ONLINE" ] ; then
echo 0
fi
case "$state" in
FAULTED|DEGRADED|UNAVAIL)
echo 1
;;
ONLINE)
echo 0
;;
esac
}
# process_pool ([pool])
# process_pool (pool)
#
# Iterate through a pool (or pools) and set the VDEV's enclosure slot LEDs to
# the VDEV's state.
# Iterate through a pool and set the vdevs' enclosure slot LEDs to
# those vdevs' state.
#
# Arguments
# pool: Optional pool name. If not specified, iterate though all pools.
# pool: Pool name.
#
# Return
# 0 on success, 3 on missing sysfs path
@ -113,19 +119,22 @@ state_to_val()
process_pool()
{
pool="$1"
# The output will be the vdevs only (from "grep '/dev/'"):
#
# U45 ONLINE 0 0 0 /dev/sdk 0
# U46 ONLINE 0 0 0 /dev/sdm 0
# U47 ONLINE 0 0 0 /dev/sdn 0
# U50 ONLINE 0 0 0 /dev/sdbn 0
#
ZPOOL_SCRIPTS_AS_ROOT=1 $ZPOOL status -c upath,fault_led "$pool" | grep '/dev/' | (
rc=0
# Lookup all the current LED values and paths in parallel
#shellcheck disable=SC2016
cmd='echo led_token=$(cat "$VDEV_ENC_SYSFS_PATH/fault"),"$VDEV_ENC_SYSFS_PATH",'
out=$($ZPOOL status -vc "$cmd" "$pool" | grep 'led_token=')
#shellcheck disable=SC2034
echo "$out" | while read -r vdev state read write chksum therest; do
while read -r vdev state _ _ _ therest; do
# Read out current LED value and path
tmp=$(echo "$therest" | sed 's/^.*led_token=//g')
vdev_enc_sysfs_path=$(echo "$tmp" | awk -F ',' '{print $2}')
current_val=$(echo "$tmp" | awk -F ',' '{print $1}')
# Get dev name (like 'sda')
dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
vdev_enc_sysfs_path=$(realpath "/sys/class/block/$dev/device/enclosure_device"*)
current_val=$(echo "$therest" | awk '{print $NF}')
if [ "$current_val" != "0" ] ; then
current_val=1
@ -137,36 +146,27 @@ process_pool()
fi
if [ ! -e "$vdev_enc_sysfs_path/fault" ] ; then
#shellcheck disable=SC2030
rc=1
rc=3
zed_log_msg "vdev $vdev '$file/fault' doesn't exist"
continue;
continue
fi
val=$(state_to_val "$state")
if [ "$current_val" = "$val" ] ; then
# LED is already set correctly
continue;
continue
fi
if ! check_and_set_led "$vdev_enc_sysfs_path/fault" "$val"; then
rc=1
rc=3
fi
done
#shellcheck disable=SC2031
if [ "$rc" = "0" ] ; then
return 0
else
# We didn't see a sysfs entry that we wanted to set
return 3
fi
exit "$rc"; )
}
if [ -n "$ZEVENT_VDEV_ENC_SYSFS_PATH" ] && [ -n "$ZEVENT_VDEV_STATE_STR" ] ; then
# Got a statechange for an individual VDEV
# Got a statechange for an individual vdev
val=$(state_to_val "$ZEVENT_VDEV_STATE_STR")
vdev=$(basename "$ZEVENT_VDEV_PATH")
check_and_set_led "$ZEVENT_VDEV_ENC_SYSFS_PATH/fault" "$val"

View File

@ -37,7 +37,7 @@ fi
umask 077
note_subject="ZFS device fault for pool ${ZEVENT_POOL_GUID} on $(hostname)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
note_pathname="$(mktemp)"
{
if [ "${ZEVENT_VDEV_STATE_STR}" = "FAULTED" ] ; then
echo "The number of I/O errors associated with a ZFS device exceeded"

View File

@ -19,7 +19,7 @@ zed_check_cmd "${ZPOOL}" || exit 9
umask 077
note_subject="ZFS ${ZEVENT_SUBCLASS} event for ${ZEVENT_POOL} on $(hostname)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
note_pathname="$(mktemp)"
{
echo "ZFS has finished a trim:"
echo

View File

@ -126,10 +126,8 @@ zed_lock()
# Obtain a lock on the file bound to the given file descriptor.
#
eval "exec ${fd}> '${lockfile}'"
err="$(flock --exclusive "${fd}" 2>&1)"
# shellcheck disable=SC2181
if [ $? -ne 0 ]; then
eval "exec ${fd}>> '${lockfile}'"
if ! err="$(flock --exclusive "${fd}" 2>&1)"; then
zed_log_err "failed to lock \"${lockfile}\": ${err}"
fi
@ -165,9 +163,7 @@ zed_unlock()
fi
# Release the lock and close the file descriptor.
err="$(flock --unlock "${fd}" 2>&1)"
# shellcheck disable=SC2181
if [ $? -ne 0 ]; then
if ! err="$(flock --unlock "${fd}" 2>&1)"; then
zed_log_err "failed to unlock \"${lockfile}\": ${err}"
fi
eval "exec ${fd}>&-"
@ -267,7 +263,7 @@ zed_notify_email()
-e "s/@SUBJECT@/${subject}/g")"
# shellcheck disable=SC2086
eval "${ZED_EMAIL_PROG}" ${ZED_EMAIL_OPTS} < "${pathname}" >/dev/null 2>&1
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS} < "${pathname}" >/dev/null 2>&1
rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "$(basename "${ZED_EMAIL_PROG}") exit=${rv}"
@ -367,7 +363,7 @@ zed_notify_pushbullet()
#
# Notification via Slack Webhook <https://api.slack.com/incoming-webhooks>.
# The Webhook URL (ZED_SLACK_WEBHOOK_URL) identifies this client to the
# Slack channel.
# Slack channel.
#
# Requires awk, curl, and sed executables to be installed in the standard PATH.
#
@ -511,10 +507,8 @@ zed_guid_to_pool()
return
fi
guid=$(printf "%llu" "$1")
if [ -n "$guid" ] ; then
$ZPOOL get -H -ovalue,name guid | awk '$1=='"$guid"' {print $2}'
fi
guid="$(printf "%u" "$1")"
$ZPOOL get -H -ovalue,name guid | awk '$1 == '"$guid"' {print $2; exit}'
}
# zed_exit_if_ignoring_this_event

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -15,11 +15,6 @@
#ifndef ZED_H
#define ZED_H
/*
* Absolute path for the default zed configuration file.
*/
#define ZED_CONF_FILE SYSCONFDIR "/zfs/zed.conf"
/*
* Absolute path for the default zed pid file.
*/
@ -35,16 +30,6 @@
*/
#define ZED_ZEDLET_DIR SYSCONFDIR "/zfs/zed.d"
/*
* Reserved for future use.
*/
#define ZED_MAX_EVENTS 0
/*
* Reserved for future use.
*/
#define ZED_MIN_EVENTS 0
/*
* String prefix for ZED variables passed via environment variables.
*/

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -32,43 +32,26 @@
#include "zed_strings.h"
/*
* Return a new configuration with default values.
* Initialise the configuration with default values.
*/
struct zed_conf *
zed_conf_create(void)
void
zed_conf_init(struct zed_conf *zcp)
{
struct zed_conf *zcp;
memset(zcp, 0, sizeof (*zcp));
zcp = calloc(1, sizeof (*zcp));
if (!zcp)
goto nomem;
/* zcp->zfs_hdl opened in zed_event_init() */
/* zcp->zedlets created in zed_conf_scan_dir() */
zcp->syslog_facility = LOG_DAEMON;
zcp->min_events = ZED_MIN_EVENTS;
zcp->max_events = ZED_MAX_EVENTS;
zcp->pid_fd = -1;
zcp->zedlets = NULL; /* created via zed_conf_scan_dir() */
zcp->state_fd = -1; /* opened via zed_conf_open_state() */
zcp->zfs_hdl = NULL; /* opened via zed_event_init() */
zcp->zevent_fd = -1; /* opened via zed_event_init() */
zcp->pid_fd = -1; /* opened in zed_conf_write_pid() */
zcp->state_fd = -1; /* opened in zed_conf_open_state() */
zcp->zevent_fd = -1; /* opened in zed_event_init() */
if (!(zcp->conf_file = strdup(ZED_CONF_FILE)))
goto nomem;
zcp->max_jobs = 16;
if (!(zcp->pid_file = strdup(ZED_PID_FILE)))
goto nomem;
if (!(zcp->zedlet_dir = strdup(ZED_ZEDLET_DIR)))
goto nomem;
if (!(zcp->state_file = strdup(ZED_STATE_FILE)))
goto nomem;
return (zcp);
nomem:
zed_log_die("Failed to create conf: %s", strerror(errno));
return (NULL);
if (!(zcp->pid_file = strdup(ZED_PID_FILE)) ||
!(zcp->zedlet_dir = strdup(ZED_ZEDLET_DIR)) ||
!(zcp->state_file = strdup(ZED_STATE_FILE)))
zed_log_die("Failed to create conf: %s", strerror(errno));
}
/*
@ -79,9 +62,6 @@ nomem:
void
zed_conf_destroy(struct zed_conf *zcp)
{
if (!zcp)
return;
if (zcp->state_fd >= 0) {
if (close(zcp->state_fd) < 0)
zed_log_msg(LOG_WARNING,
@ -102,10 +82,6 @@ zed_conf_destroy(struct zed_conf *zcp)
zcp->pid_file, strerror(errno));
zcp->pid_fd = -1;
}
if (zcp->conf_file) {
free(zcp->conf_file);
zcp->conf_file = NULL;
}
if (zcp->pid_file) {
free(zcp->pid_file);
zcp->pid_file = NULL;
@ -122,7 +98,6 @@ zed_conf_destroy(struct zed_conf *zcp)
zed_strings_destroy(zcp->zedlets);
zcp->zedlets = NULL;
}
free(zcp);
}
/*
@ -132,46 +107,52 @@ zed_conf_destroy(struct zed_conf *zcp)
* otherwise, output to stderr and exit with a failure status.
*/
static void
_zed_conf_display_help(const char *prog, int got_err)
_zed_conf_display_help(const char *prog, boolean_t got_err)
{
struct opt { const char *o, *d, *v; };
FILE *fp = got_err ? stderr : stdout;
int w1 = 4; /* width of leading whitespace */
int w2 = 8; /* width of L-justified option field */
struct opt *oo;
struct opt iopts[] = {
{ .o = "-h", .d = "Display help" },
{ .o = "-L", .d = "Display license information" },
{ .o = "-V", .d = "Display version information" },
{},
};
struct opt nopts[] = {
{ .o = "-v", .d = "Be verbose" },
{ .o = "-f", .d = "Force daemon to run" },
{ .o = "-F", .d = "Run daemon in the foreground" },
{ .o = "-I",
.d = "Idle daemon until kernel module is (re)loaded" },
{ .o = "-M", .d = "Lock all pages in memory" },
{ .o = "-P", .d = "$PATH for ZED to use (only used by ZTS)" },
{ .o = "-Z", .d = "Zero state file" },
{},
};
struct opt vopts[] = {
{ .o = "-d DIR", .d = "Read enabled ZEDLETs from DIR.",
.v = ZED_ZEDLET_DIR },
{ .o = "-p FILE", .d = "Write daemon's PID to FILE.",
.v = ZED_PID_FILE },
{ .o = "-s FILE", .d = "Write daemon's state to FILE.",
.v = ZED_STATE_FILE },
{ .o = "-j JOBS", .d = "Start at most JOBS at once.",
.v = "16" },
{},
};
fprintf(fp, "Usage: %s [OPTION]...\n", (prog ? prog : "zed"));
fprintf(fp, "\n");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-h",
"Display help.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-L",
"Display license information.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-V",
"Display version information.");
for (oo = iopts; oo->o; ++oo)
fprintf(fp, " %*s %s\n", -8, oo->o, oo->d);
fprintf(fp, "\n");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-v",
"Be verbose.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-f",
"Force daemon to run.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-F",
"Run daemon in the foreground.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-I",
"Idle daemon until kernel module is (re)loaded.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-M",
"Lock all pages in memory.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-P",
"$PATH for ZED to use (only used by ZTS).");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-Z",
"Zero state file.");
for (oo = nopts; oo->o; ++oo)
fprintf(fp, " %*s %s\n", -8, oo->o, oo->d);
fprintf(fp, "\n");
#if 0
fprintf(fp, "%*c%*s %s [%s]\n", w1, 0x20, -w2, "-c FILE",
"Read configuration from FILE.", ZED_CONF_FILE);
#endif
fprintf(fp, "%*c%*s %s [%s]\n", w1, 0x20, -w2, "-d DIR",
"Read enabled ZEDLETs from DIR.", ZED_ZEDLET_DIR);
fprintf(fp, "%*c%*s %s [%s]\n", w1, 0x20, -w2, "-p FILE",
"Write daemon's PID to FILE.", ZED_PID_FILE);
fprintf(fp, "%*c%*s %s [%s]\n", w1, 0x20, -w2, "-s FILE",
"Write daemon's state to FILE.", ZED_STATE_FILE);
for (oo = vopts; oo->o; ++oo)
fprintf(fp, " %*s %s [%s]\n", -8, oo->o, oo->d, oo->v);
fprintf(fp, "\n");
exit(got_err ? EXIT_FAILURE : EXIT_SUCCESS);
@ -183,20 +164,14 @@ _zed_conf_display_help(const char *prog, int got_err)
static void
_zed_conf_display_license(void)
{
const char **pp;
const char *text[] = {
"The ZFS Event Daemon (ZED) is distributed under the terms of the",
" Common Development and Distribution License (CDDL-1.0)",
" <http://opensource.org/licenses/CDDL-1.0>.",
"",
printf(
"The ZFS Event Daemon (ZED) is distributed under the terms of the\n"
" Common Development and Distribution License (CDDL-1.0)\n"
" <http://opensource.org/licenses/CDDL-1.0>.\n"
"\n"
"Developed at Lawrence Livermore National Laboratory"
" (LLNL-CODE-403049).",
"",
NULL
};
for (pp = text; *pp; pp++)
printf("%s\n", *pp);
" (LLNL-CODE-403049).\n"
"\n");
exit(EXIT_SUCCESS);
}
@ -231,16 +206,19 @@ _zed_conf_parse_path(char **resultp, const char *path)
if (path[0] == '/') {
*resultp = strdup(path);
} else if (!getcwd(buf, sizeof (buf))) {
zed_log_die("Failed to get current working dir: %s",
strerror(errno));
} else if (strlcat(buf, "/", sizeof (buf)) >= sizeof (buf)) {
zed_log_die("Failed to copy path: %s", strerror(ENAMETOOLONG));
} else if (strlcat(buf, path, sizeof (buf)) >= sizeof (buf)) {
zed_log_die("Failed to copy path: %s", strerror(ENAMETOOLONG));
} else {
if (!getcwd(buf, sizeof (buf)))
zed_log_die("Failed to get current working dir: %s",
strerror(errno));
if (strlcat(buf, "/", sizeof (buf)) >= sizeof (buf) ||
strlcat(buf, path, sizeof (buf)) >= sizeof (buf))
zed_log_die("Failed to copy path: %s",
strerror(ENAMETOOLONG));
*resultp = strdup(buf);
}
if (!*resultp)
zed_log_die("Failed to copy path: %s", strerror(ENOMEM));
}
@ -251,8 +229,9 @@ _zed_conf_parse_path(char **resultp, const char *path)
void
zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
{
const char * const opts = ":hLVc:d:p:P:s:vfFMZI";
const char * const opts = ":hLVd:p:P:s:vfFMZIj:";
int opt;
unsigned long raw;
if (!zcp || !argv || !argv[0])
zed_log_die("Failed to parse options: Internal error");
@ -262,7 +241,7 @@ zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
while ((opt = getopt(argc, argv, opts)) != -1) {
switch (opt) {
case 'h':
_zed_conf_display_help(argv[0], EXIT_SUCCESS);
_zed_conf_display_help(argv[0], B_FALSE);
break;
case 'L':
_zed_conf_display_license();
@ -270,9 +249,6 @@ zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
case 'V':
_zed_conf_display_version();
break;
case 'c':
_zed_conf_parse_path(&zcp->conf_file, optarg);
break;
case 'd':
_zed_conf_parse_path(&zcp->zedlet_dir, optarg);
break;
@ -303,31 +279,30 @@ zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
case 'Z':
zcp->do_zero = 1;
break;
case 'j':
errno = 0;
raw = strtoul(optarg, NULL, 0);
if (errno == ERANGE || raw > INT16_MAX) {
zed_log_die("%lu is too many jobs", raw);
} if (raw == 0) {
zed_log_die("0 jobs makes no sense");
} else {
zcp->max_jobs = raw;
}
break;
case '?':
default:
if (optopt == '?')
_zed_conf_display_help(argv[0], EXIT_SUCCESS);
_zed_conf_display_help(argv[0], B_FALSE);
fprintf(stderr, "%s: %s '-%c'\n\n", argv[0],
"Invalid option", optopt);
_zed_conf_display_help(argv[0], EXIT_FAILURE);
fprintf(stderr, "%s: Invalid option '-%c'\n\n",
argv[0], optopt);
_zed_conf_display_help(argv[0], B_TRUE);
break;
}
}
}
/*
* Parse the configuration file into the configuration [zcp].
*
* FIXME: Not yet implemented.
*/
void
zed_conf_parse_file(struct zed_conf *zcp)
{
if (!zcp)
zed_log_die("Failed to parse config: %s", strerror(EINVAL));
}
/*
* Scan the [zcp] zedlet_dir for files to exec based on the event class.
* Files must be executable by user, but not writable by group or other.
@ -335,8 +310,6 @@ zed_conf_parse_file(struct zed_conf *zcp)
*
* Return 0 on success with an updated set of zedlets,
* or -1 on error with errno set.
*
* FIXME: Check if zedlet_dir and all parent dirs are secure.
*/
int
zed_conf_scan_dir(struct zed_conf *zcp)
@ -452,8 +425,6 @@ zed_conf_scan_dir(struct zed_conf *zcp)
int
zed_conf_write_pid(struct zed_conf *zcp)
{
const mode_t dirmode = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH;
const mode_t filemode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
char buf[PATH_MAX];
int n;
char *p;
@ -481,7 +452,7 @@ zed_conf_write_pid(struct zed_conf *zcp)
if (p)
*p = '\0';
if ((mkdirp(buf, dirmode) < 0) && (errno != EEXIST)) {
if ((mkdirp(buf, 0755) < 0) && (errno != EEXIST)) {
zed_log_msg(LOG_ERR, "Failed to create directory \"%s\": %s",
buf, strerror(errno));
goto err;
@ -491,7 +462,7 @@ zed_conf_write_pid(struct zed_conf *zcp)
*/
mask = umask(0);
umask(mask | 022);
zcp->pid_fd = open(zcp->pid_file, (O_RDWR | O_CREAT), filemode);
zcp->pid_fd = open(zcp->pid_file, O_RDWR | O_CREAT | O_CLOEXEC, 0644);
umask(mask);
if (zcp->pid_fd < 0) {
zed_log_msg(LOG_ERR, "Failed to open PID file \"%s\": %s",
@ -528,7 +499,7 @@ zed_conf_write_pid(struct zed_conf *zcp)
errno = ERANGE;
zed_log_msg(LOG_ERR, "Failed to write PID file \"%s\": %s",
zcp->pid_file, strerror(errno));
} else if (zed_file_write_n(zcp->pid_fd, buf, n) != n) {
} else if (write(zcp->pid_fd, buf, n) != n) {
zed_log_msg(LOG_ERR, "Failed to write PID file \"%s\": %s",
zcp->pid_file, strerror(errno));
} else if (fdatasync(zcp->pid_fd) < 0) {
@ -556,7 +527,6 @@ int
zed_conf_open_state(struct zed_conf *zcp)
{
char dirbuf[PATH_MAX];
mode_t dirmode = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH;
int n;
char *p;
int rv;
@ -578,7 +548,7 @@ zed_conf_open_state(struct zed_conf *zcp)
if (p)
*p = '\0';
if ((mkdirp(dirbuf, dirmode) < 0) && (errno != EEXIST)) {
if ((mkdirp(dirbuf, 0755) < 0) && (errno != EEXIST)) {
zed_log_msg(LOG_WARNING,
"Failed to create directory \"%s\": %s",
dirbuf, strerror(errno));
@ -596,7 +566,7 @@ zed_conf_open_state(struct zed_conf *zcp)
(void) unlink(zcp->state_file);
zcp->state_fd = open(zcp->state_file,
(O_RDWR | O_CREAT), (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH));
O_RDWR | O_CREAT | O_CLOEXEC, 0644);
if (zcp->state_fd < 0) {
zed_log_msg(LOG_WARNING, "Failed to open state file \"%s\": %s",
zcp->state_file, strerror(errno));

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -20,43 +20,39 @@
#include "zed_strings.h"
struct zed_conf {
unsigned do_force:1; /* true if force enabled */
unsigned do_foreground:1; /* true if run in foreground */
unsigned do_memlock:1; /* true if locking memory */
unsigned do_verbose:1; /* true if verbosity enabled */
unsigned do_zero:1; /* true if zeroing state */
unsigned do_idle:1; /* true if idle enabled */
int syslog_facility; /* syslog facility value */
int min_events; /* RESERVED FOR FUTURE USE */
int max_events; /* RESERVED FOR FUTURE USE */
char *conf_file; /* abs path to config file */
char *pid_file; /* abs path to pid file */
int pid_fd; /* fd to pid file for lock */
char *zedlet_dir; /* abs path to zedlet dir */
zed_strings_t *zedlets; /* names of enabled zedlets */
char *state_file; /* abs path to state file */
int state_fd; /* fd to state file */
libzfs_handle_t *zfs_hdl; /* handle to libzfs */
int zevent_fd; /* fd for access to zevents */
zed_strings_t *zedlets; /* names of enabled zedlets */
char *path; /* custom $PATH for zedlets to use */
int pid_fd; /* fd to pid file for lock */
int state_fd; /* fd to state file */
int zevent_fd; /* fd for access to zevents */
int16_t max_jobs; /* max zedlets to run at one time */
boolean_t do_force:1; /* true if force enabled */
boolean_t do_foreground:1; /* true if run in foreground */
boolean_t do_memlock:1; /* true if locking memory */
boolean_t do_verbose:1; /* true if verbosity enabled */
boolean_t do_zero:1; /* true if zeroing state */
boolean_t do_idle:1; /* true if idle enabled */
};
struct zed_conf *zed_conf_create(void);
void zed_conf_init(struct zed_conf *zcp);
void zed_conf_destroy(struct zed_conf *zcp);
void zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv);
void zed_conf_parse_file(struct zed_conf *zcp);
int zed_conf_scan_dir(struct zed_conf *zcp);
int zed_conf_write_pid(struct zed_conf *zcp);
int zed_conf_open_state(struct zed_conf *zcp);
int zed_conf_read_state(struct zed_conf *zcp, uint64_t *eidp, int64_t etime[]);
int zed_conf_write_state(struct zed_conf *zcp, uint64_t eid, int64_t etime[]);
#endif /* !ZED_CONF_H */

View File

@ -72,6 +72,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PATH, strval);
if (nvlist_lookup_string(nvl, DEV_IDENTIFIER, &strval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_IDENTIFIER, strval);
if (nvlist_lookup_boolean(nvl, DEV_IS_PART) == B_TRUE)
zed_log_msg(LOG_INFO, "\t%s: B_TRUE", DEV_IS_PART);
if (nvlist_lookup_string(nvl, DEV_PHYS_PATH, &strval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
@ -379,6 +381,7 @@ zed_disk_event_init()
return (-1);
}
pthread_setname_np(g_mon_tid, "udev monitor");
zed_log_msg(LOG_INFO, "zed_disk_event_init");
return (0);

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -15,7 +15,7 @@
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <libzfs.h> /* FIXME: Replace with libzfs_core. */
#include <libzfs_core.h>
#include <paths.h>
#include <stdarg.h>
#include <stdio.h>
@ -54,7 +54,7 @@ zed_event_init(struct zed_conf *zcp)
zed_log_die("Failed to initialize libzfs");
}
zcp->zevent_fd = open(ZFS_DEV, O_RDWR);
zcp->zevent_fd = open(ZFS_DEV, O_RDWR | O_CLOEXEC);
if (zcp->zevent_fd < 0) {
if (zcp->do_idle)
return (-1);
@ -96,6 +96,47 @@ zed_event_fini(struct zed_conf *zcp)
libzfs_fini(zcp->zfs_hdl);
zcp->zfs_hdl = NULL;
}
zed_exec_fini();
}
static void
_bump_event_queue_length(void)
{
int zzlm = -1, wr;
char qlen_buf[12] = {0}; /* parameter is int => max "-2147483647\n" */
long int qlen;
zzlm = open("/sys/module/zfs/parameters/zfs_zevent_len_max", O_RDWR);
if (zzlm < 0)
goto done;
if (read(zzlm, qlen_buf, sizeof (qlen_buf)) < 0)
goto done;
qlen_buf[sizeof (qlen_buf) - 1] = '\0';
errno = 0;
qlen = strtol(qlen_buf, NULL, 10);
if (errno == ERANGE)
goto done;
if (qlen <= 0)
qlen = 512; /* default zfs_zevent_len_max value */
else
qlen *= 2;
if (qlen > INT_MAX)
qlen = INT_MAX;
wr = snprintf(qlen_buf, sizeof (qlen_buf), "%ld", qlen);
if (pwrite(zzlm, qlen_buf, wr, 0) < 0)
goto done;
zed_log_msg(LOG_WARNING, "Bumping queue length to %ld", qlen);
done:
if (zzlm > -1)
(void) close(zzlm);
}
/*
@ -136,10 +177,7 @@ zed_event_seek(struct zed_conf *zcp, uint64_t saved_eid, int64_t saved_etime[])
if (n_dropped > 0) {
zed_log_msg(LOG_WARNING, "Missed %d events", n_dropped);
/*
* FIXME: Increase max size of event nvlist in
* /sys/module/zfs/parameters/zfs_zevent_len_max ?
*/
_bump_event_queue_length();
}
if (nvlist_lookup_uint64(nvl, "eid", &eid) != 0) {
zed_log_msg(LOG_WARNING, "Failed to lookup zevent eid");
@ -211,7 +249,7 @@ _zed_event_value_is_hex(const char *name)
*
* All environment variables in [zsp] should be added through this function.
*/
static int
static __attribute__((format(printf, 5, 6))) int
_zed_event_add_var(uint64_t eid, zed_strings_t *zsp,
const char *prefix, const char *name, const char *fmt, ...)
{
@ -586,8 +624,6 @@ _zed_event_add_string_array(uint64_t eid, zed_strings_t *zsp,
* Convert the nvpair [nvp] to a string which is added to the environment
* of the child process.
* Return 0 on success, -1 on error.
*
* FIXME: Refactor with cmd/zpool/zpool_main.c:zpool_do_events_nvprint()?
*/
static void
_zed_event_add_nvpair(uint64_t eid, zed_strings_t *zsp, nvpair_t *nvp)
@ -686,23 +722,11 @@ _zed_event_add_nvpair(uint64_t eid, zed_strings_t *zsp, nvpair_t *nvp)
_zed_event_add_var(eid, zsp, prefix, name,
"%llu", (u_longlong_t)i64);
break;
case DATA_TYPE_NVLIST:
_zed_event_add_var(eid, zsp, prefix, name,
"%s", "_NOT_IMPLEMENTED_"); /* FIXME */
break;
case DATA_TYPE_STRING:
(void) nvpair_value_string(nvp, &str);
_zed_event_add_var(eid, zsp, prefix, name,
"%s", (str ? str : "<NULL>"));
break;
case DATA_TYPE_BOOLEAN_ARRAY:
_zed_event_add_var(eid, zsp, prefix, name,
"%s", "_NOT_IMPLEMENTED_"); /* FIXME */
break;
case DATA_TYPE_BYTE_ARRAY:
_zed_event_add_var(eid, zsp, prefix, name,
"%s", "_NOT_IMPLEMENTED_"); /* FIXME */
break;
case DATA_TYPE_INT8_ARRAY:
_zed_event_add_int8_array(eid, zsp, prefix, nvp);
break;
@ -730,9 +754,11 @@ _zed_event_add_nvpair(uint64_t eid, zed_strings_t *zsp, nvpair_t *nvp)
case DATA_TYPE_STRING_ARRAY:
_zed_event_add_string_array(eid, zsp, prefix, nvp);
break;
case DATA_TYPE_NVLIST:
case DATA_TYPE_BOOLEAN_ARRAY:
case DATA_TYPE_BYTE_ARRAY:
case DATA_TYPE_NVLIST_ARRAY:
_zed_event_add_var(eid, zsp, prefix, name,
"%s", "_NOT_IMPLEMENTED_"); /* FIXME */
_zed_event_add_var(eid, zsp, prefix, name, "_NOT_IMPLEMENTED_");
break;
default:
errno = EINVAL;
@ -912,10 +938,7 @@ zed_event_service(struct zed_conf *zcp)
if (n_dropped > 0) {
zed_log_msg(LOG_WARNING, "Missed %d events", n_dropped);
/*
* FIXME: Increase max size of event nvlist in
* /sys/module/zfs/parameters/zfs_zevent_len_max ?
*/
_bump_event_queue_length();
}
if (nvlist_lookup_uint64(nvl, "eid", &eid) != 0) {
zed_log_msg(LOG_WARNING, "Failed to lookup zevent eid");
@ -953,8 +976,7 @@ zed_event_service(struct zed_conf *zcp)
_zed_event_add_time_strings(eid, zsp, etime);
zed_exec_process(eid, class, subclass,
zcp->zedlet_dir, zcp->zedlets, zsp, zcp->zevent_fd);
zed_exec_process(eid, class, subclass, zcp, zsp);
zed_conf_write_state(zcp, eid, etime);

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -18,17 +18,53 @@
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
#include <sys/avl.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include "zed_exec.h"
#include "zed_file.h"
#include "zed_log.h"
#include "zed_strings.h"
#define ZEVENT_FILENO 3
struct launched_process_node {
avl_node_t node;
pid_t pid;
uint64_t eid;
char *name;
};
static int
_launched_process_node_compare(const void *x1, const void *x2)
{
pid_t p1;
pid_t p2;
assert(x1 != NULL);
assert(x2 != NULL);
p1 = ((const struct launched_process_node *) x1)->pid;
p2 = ((const struct launched_process_node *) x2)->pid;
if (p1 < p2)
return (-1);
else if (p1 == p2)
return (0);
else
return (1);
}
static pthread_t _reap_children_tid = (pthread_t)-1;
static volatile boolean_t _reap_children_stop;
static avl_tree_t _launched_processes;
static pthread_mutex_t _launched_processes_lock = PTHREAD_MUTEX_INITIALIZER;
static int16_t _launched_processes_limit;
/*
* Create an environment string array for passing to execve() using the
* NAME=VALUE strings in container [zsp].
@ -79,20 +115,26 @@ _zed_exec_create_env(zed_strings_t *zsp)
*/
static void
_zed_exec_fork_child(uint64_t eid, const char *dir, const char *prog,
char *env[], int zfd)
char *env[], int zfd, boolean_t in_foreground)
{
char path[PATH_MAX];
int n;
pid_t pid;
int fd;
pid_t wpid;
int status;
struct launched_process_node *node;
sigset_t mask;
struct timespec launch_timeout =
{ .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000, };
assert(dir != NULL);
assert(prog != NULL);
assert(env != NULL);
assert(zfd >= 0);
while (__atomic_load_n(&_launched_processes_limit,
__ATOMIC_SEQ_CST) <= 0)
(void) nanosleep(&launch_timeout, NULL);
n = snprintf(path, sizeof (path), "%s/%s", dir, prog);
if ((n < 0) || (n >= sizeof (path))) {
zed_log_msg(LOG_WARNING,
@ -100,101 +142,179 @@ _zed_exec_fork_child(uint64_t eid, const char *dir, const char *prog,
prog, eid, strerror(ENAMETOOLONG));
return;
}
(void) pthread_mutex_lock(&_launched_processes_lock);
pid = fork();
if (pid < 0) {
(void) pthread_mutex_unlock(&_launched_processes_lock);
zed_log_msg(LOG_WARNING,
"Failed to fork \"%s\" for eid=%llu: %s",
prog, eid, strerror(errno));
return;
} else if (pid == 0) {
(void) sigemptyset(&mask);
(void) sigprocmask(SIG_SETMASK, &mask, NULL);
(void) umask(022);
if ((fd = open("/dev/null", O_RDWR)) != -1) {
if (in_foreground && /* we're already devnulled if daemonised */
(fd = open("/dev/null", O_RDWR | O_CLOEXEC)) != -1) {
(void) dup2(fd, STDIN_FILENO);
(void) dup2(fd, STDOUT_FILENO);
(void) dup2(fd, STDERR_FILENO);
}
(void) dup2(zfd, ZEVENT_FILENO);
zed_file_close_from(ZEVENT_FILENO + 1);
execle(path, prog, NULL, env);
_exit(127);
}
/* parent process */
node = calloc(1, sizeof (*node));
if (node) {
node->pid = pid;
node->eid = eid;
node->name = strdup(prog);
avl_add(&_launched_processes, node);
}
(void) pthread_mutex_unlock(&_launched_processes_lock);
__atomic_sub_fetch(&_launched_processes_limit, 1, __ATOMIC_SEQ_CST);
zed_log_msg(LOG_INFO, "Invoking \"%s\" eid=%llu pid=%d",
prog, eid, pid);
}
/* FIXME: Timeout rogue child processes with sigalarm? */
static void
_nop(int sig)
{}
/*
* Wait for child process using WNOHANG to limit
* the time spent waiting to 10 seconds (10,000ms).
*/
for (n = 0; n < 1000; n++) {
wpid = waitpid(pid, &status, WNOHANG);
if (wpid == (pid_t)-1) {
if (errno == EINTR)
continue;
zed_log_msg(LOG_WARNING,
"Failed to wait for \"%s\" eid=%llu pid=%d",
prog, eid, pid);
break;
} else if (wpid == 0) {
struct timespec t;
static void *
_reap_children(void *arg)
{
struct launched_process_node node, *pnode;
pid_t pid;
int status;
struct rusage usage;
struct sigaction sa = {};
/* child still running */
t.tv_sec = 0;
t.tv_nsec = 10000000; /* 10ms */
(void) nanosleep(&t, NULL);
continue;
}
(void) sigfillset(&sa.sa_mask);
(void) sigdelset(&sa.sa_mask, SIGCHLD);
(void) pthread_sigmask(SIG_SETMASK, &sa.sa_mask, NULL);
if (WIFEXITED(status)) {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d exit=%d",
prog, eid, pid, WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d sig=%d/%s",
prog, eid, pid, WTERMSIG(status),
strsignal(WTERMSIG(status)));
(void) sigemptyset(&sa.sa_mask);
sa.sa_handler = _nop;
sa.sa_flags = SA_NOCLDSTOP;
(void) sigaction(SIGCHLD, &sa, NULL);
for (_reap_children_stop = B_FALSE; !_reap_children_stop; ) {
(void) pthread_mutex_lock(&_launched_processes_lock);
pid = wait4(0, &status, WNOHANG, &usage);
if (pid == 0 || pid == (pid_t)-1) {
(void) pthread_mutex_unlock(&_launched_processes_lock);
if (pid == 0 || errno == ECHILD)
pause();
else if (errno != EINTR)
zed_log_msg(LOG_WARNING,
"Failed to wait for children: %s",
strerror(errno));
} else {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d status=0x%X",
prog, eid, (unsigned int) status);
memset(&node, 0, sizeof (node));
node.pid = pid;
pnode = avl_find(&_launched_processes, &node, NULL);
if (pnode) {
memcpy(&node, pnode, sizeof (node));
avl_remove(&_launched_processes, pnode);
free(pnode);
}
(void) pthread_mutex_unlock(&_launched_processes_lock);
__atomic_add_fetch(&_launched_processes_limit, 1,
__ATOMIC_SEQ_CST);
usage.ru_utime.tv_sec += usage.ru_stime.tv_sec;
usage.ru_utime.tv_usec += usage.ru_stime.tv_usec;
usage.ru_utime.tv_sec +=
usage.ru_utime.tv_usec / (1000 * 1000);
usage.ru_utime.tv_usec %= 1000 * 1000;
if (WIFEXITED(status)) {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d "
"time=%llu.%06us exit=%d",
node.name, node.eid, pid,
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d "
"time=%llu.%06us sig=%d/%s",
node.name, node.eid, pid,
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
WTERMSIG(status),
strsignal(WTERMSIG(status)));
} else {
zed_log_msg(LOG_INFO,
"Finished \"%s\" eid=%llu pid=%d "
"time=%llu.%06us status=0x%X",
node.name, node.eid,
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
(unsigned int) status);
}
free(node.name);
}
break;
}
/*
* kill child process after 10 seconds
*/
if (wpid == 0) {
zed_log_msg(LOG_WARNING, "Killing hung \"%s\" pid=%d",
prog, pid);
(void) kill(pid, SIGKILL);
(void) waitpid(pid, &status, 0);
return (NULL);
}
void
zed_exec_fini(void)
{
struct launched_process_node *node;
void *ck = NULL;
if (_reap_children_tid == (pthread_t)-1)
return;
_reap_children_stop = B_TRUE;
(void) pthread_kill(_reap_children_tid, SIGCHLD);
(void) pthread_join(_reap_children_tid, NULL);
while ((node = avl_destroy_nodes(&_launched_processes, &ck)) != NULL) {
free(node->name);
free(node);
}
avl_destroy(&_launched_processes);
(void) pthread_mutex_destroy(&_launched_processes_lock);
(void) pthread_mutex_init(&_launched_processes_lock, NULL);
_reap_children_tid = (pthread_t)-1;
}
/*
* Process the event [eid] by synchronously invoking all zedlets with a
* matching class prefix.
*
* Each executable in [zedlets] from the directory [dir] is matched against
* the event's [class], [subclass], and the "all" class (which matches
* all events). Every zedlet with a matching class prefix is invoked.
* Each executable in [zcp->zedlets] from the directory [zcp->zedlet_dir]
* is matched against the event's [class], [subclass], and the "all" class
* (which matches all events).
* Every zedlet with a matching class prefix is invoked.
* The NAME=VALUE strings in [envs] will be passed to the zedlet as
* environment variables.
*
* The file descriptor [zfd] is the zevent_fd used to track the
* The file descriptor [zcp->zevent_fd] is the zevent_fd used to track the
* current cursor location within the zevent nvlist.
*
* Return 0 on success, -1 on error.
*/
int
zed_exec_process(uint64_t eid, const char *class, const char *subclass,
const char *dir, zed_strings_t *zedlets, zed_strings_t *envs, int zfd)
struct zed_conf *zcp, zed_strings_t *envs)
{
const char *class_strings[4];
const char *allclass = "all";
@ -203,9 +323,22 @@ zed_exec_process(uint64_t eid, const char *class, const char *subclass,
char **e;
int n;
if (!dir || !zedlets || !envs || zfd < 0)
if (!zcp->zedlet_dir || !zcp->zedlets || !envs || zcp->zevent_fd < 0)
return (-1);
if (_reap_children_tid == (pthread_t)-1) {
_launched_processes_limit = zcp->max_jobs;
if (pthread_create(&_reap_children_tid, NULL,
_reap_children, NULL) != 0)
return (-1);
pthread_setname_np(_reap_children_tid, "reap ZEDLETs");
avl_create(&_launched_processes, _launched_process_node_compare,
sizeof (struct launched_process_node),
offsetof(struct launched_process_node, node));
}
csp = class_strings;
if (class)
@ -221,11 +354,13 @@ zed_exec_process(uint64_t eid, const char *class, const char *subclass,
e = _zed_exec_create_env(envs);
for (z = zed_strings_first(zedlets); z; z = zed_strings_next(zedlets)) {
for (z = zed_strings_first(zcp->zedlets); z;
z = zed_strings_next(zcp->zedlets)) {
for (csp = class_strings; *csp; csp++) {
n = strlen(*csp);
if ((strncmp(z, *csp, n) == 0) && !isalpha(z[n]))
_zed_exec_fork_child(eid, dir, z, e, zfd);
_zed_exec_fork_child(eid, zcp->zedlet_dir,
z, e, zcp->zevent_fd, zcp->do_foreground);
}
}
free(e);

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -17,9 +17,11 @@
#include <stdint.h>
#include "zed_strings.h"
#include "zed_conf.h"
void zed_exec_fini(void);
int zed_exec_process(uint64_t eid, const char *class, const char *subclass,
const char *dir, zed_strings_t *zedlets, zed_strings_t *envs,
int zevent_fd);
struct zed_conf *zcp, zed_strings_t *envs);
#endif /* !ZED_EXEC_H */

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -12,73 +12,17 @@
* You may not use this file except in compliance with the license.
*/
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include "zed_file.h"
#include "zed_log.h"
/*
* Read up to [n] bytes from [fd] into [buf].
* Return the number of bytes read, 0 on EOF, or -1 on error.
*/
ssize_t
zed_file_read_n(int fd, void *buf, size_t n)
{
unsigned char *p;
size_t n_left;
ssize_t n_read;
p = buf;
n_left = n;
while (n_left > 0) {
if ((n_read = read(fd, p, n_left)) < 0) {
if (errno == EINTR)
continue;
else
return (-1);
} else if (n_read == 0) {
break;
}
n_left -= n_read;
p += n_read;
}
return (n - n_left);
}
/*
* Write [n] bytes from [buf] out to [fd].
* Return the number of bytes written, or -1 on error.
*/
ssize_t
zed_file_write_n(int fd, void *buf, size_t n)
{
const unsigned char *p;
size_t n_left;
ssize_t n_written;
p = buf;
n_left = n;
while (n_left > 0) {
if ((n_written = write(fd, p, n_left)) < 0) {
if (errno == EINTR)
continue;
else
return (-1);
}
n_left -= n_written;
p += n_written;
}
return (n);
}
/*
* Set an exclusive advisory lock on the open file descriptor [fd].
* Return 0 on success, 1 if a conflicting lock is held by another process,
@ -160,6 +104,13 @@ zed_file_is_locked(int fd)
return (lock.l_pid);
}
#if __APPLE__
#define PROC_SELF_FD "/dev/fd"
#else /* Linux-compatible layout */
#define PROC_SELF_FD "/proc/self/fd"
#endif
/*
* Close all open file descriptors greater than or equal to [lowfd].
* Any errors encountered while closing file descriptors are ignored.
@ -167,51 +118,24 @@ zed_file_is_locked(int fd)
void
zed_file_close_from(int lowfd)
{
const int maxfd_def = 256;
int errno_bak;
struct rlimit rl;
int maxfd;
int errno_bak = errno;
int maxfd = 0;
int fd;
DIR *fddir;
struct dirent *fdent;
errno_bak = errno;
if (getrlimit(RLIMIT_NOFILE, &rl) < 0) {
maxfd = maxfd_def;
} else if (rl.rlim_max == RLIM_INFINITY) {
maxfd = maxfd_def;
if ((fddir = opendir(PROC_SELF_FD)) != NULL) {
while ((fdent = readdir(fddir)) != NULL) {
fd = atoi(fdent->d_name);
if (fd > maxfd && fd != dirfd(fddir))
maxfd = fd;
}
(void) closedir(fddir);
} else {
maxfd = rl.rlim_max;
maxfd = sysconf(_SC_OPEN_MAX);
}
for (fd = lowfd; fd < maxfd; fd++)
(void) close(fd);
errno = errno_bak;
}
/*
* Set the CLOEXEC flag on file descriptor [fd] so it will be automatically
* closed upon successful execution of one of the exec functions.
* Return 0 on success, or -1 on error.
*
* FIXME: No longer needed?
*/
int
zed_file_close_on_exec(int fd)
{
int flags;
if (fd < 0) {
errno = EBADF;
return (-1);
}
flags = fcntl(fd, F_GETFD);
if (flags == -1)
return (-1);
flags |= FD_CLOEXEC;
if (fcntl(fd, F_SETFD, flags) == -1)
return (-1);
return (0);
}

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@ -18,10 +18,6 @@
#include <sys/types.h>
#include <unistd.h>
ssize_t zed_file_read_n(int fd, void *buf, size_t n);
ssize_t zed_file_write_n(int fd, void *buf, size_t n);
int zed_file_lock(int fd);
int zed_file_unlock(int fd);
@ -30,6 +26,4 @@ pid_t zed_file_is_locked(int fd);
void zed_file_close_from(int fd);
int zed_file_close_on_exec(int fd);
#endif /* !ZED_FILE_H */

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).

View File

@ -3,7 +3,7 @@
*
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).

View File

@ -317,7 +317,7 @@ get_usage(zfs_help_t idx)
case HELP_SEND:
return (gettext("\tsend [-DnPpRvLecwhb] [-[i|I] snapshot] "
"<snapshot>\n"
"\tsend [-nvPLecw] [-i snapshot|bookmark] "
"\tsend [-DnvPLecw] [-i snapshot|bookmark] "
"<filesystem|volume|snapshot>\n"
"\tsend [-DnPpvLec] [-i bookmark|snapshot] "
"--redact <bookmark> <snapshot>\n"
@ -728,6 +728,32 @@ finish_progress(char *done)
pt_header = NULL;
}
/* This function checks if the passed fd refers to /dev/null or /dev/zero */
#ifdef __linux__
static boolean_t
is_dev_nullzero(int fd)
{
struct stat st;
fstat(fd, &st);
return (major(st.st_rdev) == 1 && (minor(st.st_rdev) == 3 /* null */ ||
minor(st.st_rdev) == 5 /* zero */));
}
#endif
static void
note_dev_error(int err, int fd)
{
#ifdef __linux__
if (err == EINVAL && is_dev_nullzero(fd)) {
(void) fprintf(stderr,
gettext("Error: Writing directly to /dev/{null,zero} files"
" on certain kernels is not currently implemented.\n"
"(As a workaround, "
"try \"zfs send [...] | cat > /dev/null\")\n"));
}
#endif
}
static int
zfs_mount_and_share(libzfs_handle_t *hdl, const char *dataset, zfs_type_t type)
{
@ -4376,6 +4402,7 @@ zfs_do_send(int argc, char **argv)
struct option long_options[] = {
{"replicate", no_argument, NULL, 'R'},
{"skip-missing", no_argument, NULL, 's'},
{"redact", required_argument, NULL, 'd'},
{"props", no_argument, NULL, 'p'},
{"parsable", no_argument, NULL, 'P'},
@ -4394,7 +4421,7 @@ zfs_do_send(int argc, char **argv)
};
/* check options */
while ((c = getopt_long(argc, argv, ":i:I:RDpvnPLeht:cwbd:S",
while ((c = getopt_long(argc, argv, ":i:I:RsDpvnPLeht:cwbd:S",
long_options, NULL)) != -1) {
switch (c) {
case 'i':
@ -4411,6 +4438,9 @@ zfs_do_send(int argc, char **argv)
case 'R':
flags.replicate = B_TRUE;
break;
case 's':
flags.skipmissing = B_TRUE;
break;
case 'd':
redactbook = optarg;
break;
@ -4568,11 +4598,23 @@ zfs_do_send(int argc, char **argv)
err = zfs_send_saved(zhp, &flags, STDOUT_FILENO,
resume_token);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
zfs_close(zhp);
return (err != 0);
} else if (resume_token != NULL) {
return (zfs_send_resume(g_zfs, &flags, STDOUT_FILENO,
resume_token));
err = zfs_send_resume(g_zfs, &flags, STDOUT_FILENO,
resume_token);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
return (err);
}
if (flags.skipmissing && !flags.replicate) {
(void) fprintf(stderr,
gettext("skip-missing flag can only be used in "
"conjunction with replicate\n"));
usage(B_FALSE);
}
/*
@ -4616,6 +4658,8 @@ zfs_do_send(int argc, char **argv)
err = zfs_send_one(zhp, fromname, STDOUT_FILENO, &flags,
redactbook);
zfs_close(zhp);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
return (err != 0);
}
@ -4692,6 +4736,7 @@ zfs_do_send(int argc, char **argv)
nvlist_free(dbgnv);
}
zfs_close(zhp);
note_dev_error(errno, STDOUT_FILENO);
return (err != 0);
}

View File

@ -35,7 +35,7 @@ libzfs_handle_t *g_zfs;
static void
usage(int err)
{
fprintf(stderr, "Usage: [-v] zfs_ids_to_path <pool> <objset id> "
fprintf(stderr, "Usage: zfs_ids_to_path [-v] <pool> <objset id> "
"<object id>\n");
exit(err);
}
@ -63,11 +63,11 @@ main(int argc, char **argv)
uint64_t objset, object;
if (sscanf(argv[1], "%llu", (u_longlong_t *)&objset) != 1) {
(void) fprintf(stderr, "Invalid objset id: %s\n", argv[2]);
(void) fprintf(stderr, "Invalid objset id: %s\n", argv[1]);
usage(2);
}
if (sscanf(argv[2], "%llu", (u_longlong_t *)&object) != 1) {
(void) fprintf(stderr, "Invalid object id: %s\n", argv[3]);
(void) fprintf(stderr, "Invalid object id: %s\n", argv[2]);
usage(3);
}
if ((g_zfs = libzfs_init()) == NULL) {
@ -76,7 +76,7 @@ main(int argc, char **argv)
}
zpool_handle_t *pool = zpool_open(g_zfs, argv[0]);
if (pool == NULL) {
fprintf(stderr, "Could not open pool %s\n", argv[1]);
fprintf(stderr, "Could not open pool %s\n", argv[0]);
libzfs_fini(g_zfs);
return (5);
}

View File

@ -36,8 +36,6 @@
#include <time.h>
#include <unistd.h>
static void usage(void);
static void
usage(void)
{
@ -60,12 +58,11 @@ int
main(int argc, char **argv)
{
/* default file path, can be optionally set by user */
char path[PATH_MAX] = "/etc/hostid";
const char *path = "/etc/hostid";
/* holds converted user input or lrand48() generated value */
unsigned long input_i = 0;
int opt;
int pathlen;
int force_fwrite = 0;
while ((opt = getopt_long(argc, argv, "fo:h?", 0, 0)) != -1) {
switch (opt) {
@ -73,14 +70,7 @@ main(int argc, char **argv)
force_fwrite = 1;
break;
case 'o':
pathlen = snprintf(path, sizeof (path), "%s", optarg);
if (pathlen >= sizeof (path)) {
fprintf(stderr, "%s\n", strerror(EOVERFLOW));
exit(EXIT_FAILURE);
} else if (pathlen < 1) {
fprintf(stderr, "%s\n", strerror(EINVAL));
exit(EXIT_FAILURE);
}
path = optarg;
break;
case 'h':
case '?':
@ -118,7 +108,7 @@ main(int argc, char **argv)
if (force_fwrite == 0 && stat(path, &fstat) == 0 &&
S_ISREG(fstat.st_mode)) {
fprintf(stderr, "%s: %s\n", path, strerror(EEXIST));
exit(EXIT_FAILURE);
exit(EXIT_FAILURE);
}
/*
@ -137,7 +127,7 @@ main(int argc, char **argv)
}
/*
* we need just 4 bytes in native endianess
* we need just 4 bytes in native endianness
* not using sethostid() because it may be missing or just a stub
*/
uint32_t hostid = input_i;

View File

@ -1,4 +1,5 @@
include $(top_srcdir)/config/Rules.am
include $(top_srcdir)/config/Shellcheck.am
AM_CFLAGS += $(LIBBLKID_CFLAGS) $(LIBUUID_CFLAGS)
@ -146,6 +147,10 @@ dist_zpoolcompat_DATA = \
compatibility.d/openzfsonosx-1.9.3 \
compatibility.d/openzfs-2.0-freebsd \
compatibility.d/openzfs-2.0-linux \
compatibility.d/openzfs-2.1-freebsd \
compatibility.d/openzfs-2.1-linux \
compatibility.d/zol-0.6.1 \
compatibility.d/zol-0.6.4 \
compatibility.d/zol-0.6.5 \
compatibility.d/zol-0.7 \
compatibility.d/zol-0.8

View File

@ -0,0 +1,34 @@
# Features supported by OpenZFS 2.1 on FreeBSD
allocation_classes
async_destroy
bookmark_v2
bookmark_written
bookmarks
device_rebuild
device_removal
draid
embedded_data
empty_bpobj
enabled_txg
encryption
extensible_dataset
filesystem_limits
hole_birth
large_blocks
large_dnode
livelist
log_spacemap
lz4_compress
multi_vdev_crash_dump
obsolete_counts
project_quota
redacted_datasets
redaction_bookmarks
resilver_defer
sha512
skein
spacemap_histogram
spacemap_v2
userobj_accounting
zpool_checkpoint
zstd_compress

View File

@ -0,0 +1,35 @@
# Features supported by OpenZFS 2.1 on Linux
allocation_classes
async_destroy
bookmark_v2
bookmark_written
bookmarks
device_rebuild
device_removal
draid
edonr
embedded_data
empty_bpobj
enabled_txg
encryption
extensible_dataset
filesystem_limits
hole_birth
large_blocks
large_dnode
livelist
log_spacemap
lz4_compress
multi_vdev_crash_dump
obsolete_counts
project_quota
redacted_datasets
redaction_bookmarks
resilver_defer
sha512
skein
spacemap_histogram
spacemap_v2
userobj_accounting
zpool_checkpoint
zstd_compress

View File

@ -0,0 +1,4 @@
# Features supported by ZFSonLinux v0.6.1
async_destroy
empty_bpobj
lz4_compress

View File

@ -0,0 +1,10 @@
# Features supported by ZFSonLinux v0.6.4
async_destroy
bookmarks
embedded_data
empty_bpobj
enabled_txg
extensible_dataset
hole_birth
lz4_compress
spacemap_histogram

View File

@ -101,3 +101,18 @@ check_sector_size_database(char *path, int *sector_size)
{
return (0);
}
void
after_zpool_upgrade(zpool_handle_t *zhp)
{
char bootfs[ZPOOL_MAXPROPLEN];
if (zpool_get_prop(zhp, ZPOOL_PROP_BOOTFS, bootfs,
sizeof (bootfs), NULL, B_FALSE) == 0 &&
strcmp(bootfs, "-") != 0) {
(void) printf(gettext("Pool '%s' has the bootfs "
"property set, you might need to update\nthe boot "
"code. See gptzfsboot(8) and loader.efi(8) for "
"details.\n"), zpool_get_name(zhp));
}
}

View File

@ -405,3 +405,8 @@ check_device(const char *path, boolean_t force,
return (error);
}
void
after_zpool_upgrade(zpool_handle_t *zhp)
{
}

View File

@ -53,7 +53,7 @@ get_filename_from_dir()
num_files=$(find "$dir" -maxdepth 1 -type f | wc -l)
mod=$((pid % num_files))
i=0
find "$dir" -type f -printf "%f\n" | while read -r file ; do
find "$dir" -type f -printf '%f\n' | while read -r file ; do
if [ "$mod" = "$i" ] ; then
echo "$file"
break
@ -62,17 +62,14 @@ get_filename_from_dir()
done
}
script=$(basename "$0")
script="${0##*/}"
if [ "$1" = "-h" ] ; then
echo "$helpstr" | grep "$script:" | tr -s '\t' | cut -f 2-
exit
fi
smartctl_path=$(command -v smartctl)
# shellcheck disable=SC2015
if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ] || [ -n "$samples" ] ; then
if [ -b "$VDEV_UPATH" ] && PATH="/usr/sbin:$PATH" command -v smartctl > /dev/null || [ -n "$samples" ] ; then
if [ -n "$samples" ] ; then
# cat a smartctl output text file instead of running smartctl
# on a vdev (only used for developer testing).
@ -80,7 +77,7 @@ if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ] || [ -n "$samples" ] ; then
echo "file=$file"
raw_out=$(cat "$samples/$file")
else
raw_out=$(eval "sudo $smartctl_path -a $VDEV_UPATH")
raw_out=$(sudo smartctl -a "$VDEV_UPATH")
fi
# What kind of drive are we? Look for the right line in smartctl:
@ -231,11 +228,11 @@ esac
with_vals=$(echo "$out" | grep -E "$scripts")
if [ -n "$with_vals" ]; then
echo "$with_vals"
without_vals=$(echo "$scripts" | tr "|" "\n" |
without_vals=$(echo "$scripts" | tr '|' '\n' |
grep -v -E "$(echo "$with_vals" |
awk -F "=" '{print $1}')" | awk '{print $0"="}')
else
without_vals=$(echo "$scripts" | tr "|" "\n" | awk '{print $0"="}')
without_vals=$(echo "$scripts" | tr '|' '\n' | awk '{print $0"="}')
fi
if [ -n "$without_vals" ]; then

View File

@ -494,19 +494,25 @@ vdev_run_cmd(vdev_cmd_data_t *data, char *cmd)
/* Setup our custom environment variables */
rc = asprintf(&env[1], "VDEV_PATH=%s",
data->path ? data->path : "");
if (rc == -1)
if (rc == -1) {
env[1] = NULL;
goto out;
}
rc = asprintf(&env[2], "VDEV_UPATH=%s",
data->upath ? data->upath : "");
if (rc == -1)
if (rc == -1) {
env[2] = NULL;
goto out;
}
rc = asprintf(&env[3], "VDEV_ENC_SYSFS_PATH=%s",
data->vdev_enc_sysfs_path ?
data->vdev_enc_sysfs_path : "");
if (rc == -1)
if (rc == -1) {
env[3] = NULL;
goto out;
}
/* Run the command */
rc = libzfs_run_process_get_stdout_nopath(cmd, argv, env, &lines,
@ -525,8 +531,7 @@ out:
/* Start with i = 1 since env[0] was statically allocated */
for (i = 1; i < ARRAY_SIZE(env); i++)
if (env[i] != NULL)
free(env[i]);
free(env[i]);
}
/*

View File

@ -32,6 +32,7 @@
* Copyright (c) 2017, Intel Corporation.
* Copyright (c) 2019, loli10K <ezomori.nozomu@gmail.com>
* Copyright (c) 2021, Colm Buckley <colm@tuatha.org>
* Copyright [2021] Hewlett Packard Enterprise Development LP
*/
#include <assert.h>
@ -532,7 +533,7 @@ usage(boolean_t requested)
(void) fprintf(fp, "YES disabled | enabled | active\n");
(void) fprintf(fp, gettext("\nThe feature@ properties must be "
"appended with a feature name.\nSee zpool-features(5).\n"));
"appended with a feature name.\nSee zpool-features(7).\n"));
}
/*
@ -786,7 +787,7 @@ add_prop_list(const char *propname, char *propval, nvlist_t **props,
if (poolprop) {
const char *vname = zpool_prop_to_name(ZPOOL_PROP_VERSION);
const char *fname =
const char *cname =
zpool_prop_to_name(ZPOOL_PROP_COMPATIBILITY);
if ((prop = zpool_name_to_prop(propname)) == ZPOOL_PROP_INVAL &&
@ -811,16 +812,19 @@ add_prop_list(const char *propname, char *propval, nvlist_t **props,
}
/*
* compatibility property and version should not be specified
* at the same time.
* if version is specified, only "legacy" compatibility
* may be requested
*/
if ((prop == ZPOOL_PROP_COMPATIBILITY &&
strcmp(propval, ZPOOL_COMPAT_LEGACY) != 0 &&
nvlist_exists(proplist, vname)) ||
(prop == ZPOOL_PROP_VERSION &&
nvlist_exists(proplist, fname))) {
(void) fprintf(stderr, gettext("'compatibility' and "
"'version' properties cannot be specified "
"together\n"));
nvlist_exists(proplist, cname) &&
strcmp(fnvlist_lookup_string(proplist, cname),
ZPOOL_COMPAT_LEGACY) != 0)) {
(void) fprintf(stderr, gettext("when 'version' is "
"specified, the 'compatibility' feature may only "
"be set to '" ZPOOL_COMPAT_LEGACY "'\n"));
return (2);
}
@ -1065,7 +1069,7 @@ zpool_do_add(int argc, char **argv)
free(vname);
}
}
/* And finaly the spares */
/* And finally the spares */
if (nvlist_lookup_nvlist_array(poolnvroot, ZPOOL_CONFIG_SPARES,
&sparechild, &sparechildren) == 0 && sparechildren > 0) {
hadspare = B_TRUE;
@ -1211,6 +1215,26 @@ zpool_do_remove(int argc, char **argv)
return (ret);
}
/*
* Return 1 if a vdev is active (being used in a pool)
* Return 0 if a vdev is inactive (offlined or faulted, or not in active pool)
*
* This is useful for checking if a disk in an active pool is offlined or
* faulted.
*/
static int
vdev_is_active(char *vdev_path)
{
int fd;
fd = open(vdev_path, O_EXCL);
if (fd < 0) {
return (1); /* cant open O_EXCL - disk is active */
}
close(fd);
return (0); /* disk is inactive in the pool */
}
/*
* zpool labelclear [-f] <vdev>
*
@ -1320,9 +1344,23 @@ zpool_do_labelclear(int argc, char **argv)
case POOL_STATE_ACTIVE:
case POOL_STATE_SPARE:
case POOL_STATE_L2CACHE:
/*
* We allow the user to call 'zpool offline -f'
* on an offlined disk in an active pool. We can check if
* the disk is online by calling vdev_is_active().
*/
if (force && !vdev_is_active(vdev))
break;
(void) fprintf(stderr, gettext(
"%s is a member (%s) of pool \"%s\"\n"),
"%s is a member (%s) of pool \"%s\""),
vdev, zpool_pool_state_to_name(state), name);
if (force) {
(void) fprintf(stderr, gettext(
". Offline the disk first to clear its label."));
}
printf("\n");
ret = 1;
goto errout;
@ -1674,6 +1712,7 @@ zpool_do_create(int argc, char **argv)
* - enable_pool_features (ie: no '-d' or '-o version')
* - it's supported by the kernel module
* - it's in the requested feature set
* - warn if it's enabled but not in compat
*/
for (spa_feature_t i = 0; i < SPA_FEATURES; i++) {
char propname[MAXPATHLEN];
@ -1687,6 +1726,14 @@ zpool_do_create(int argc, char **argv)
if (strcmp(propval, ZFS_FEATURE_DISABLED) == 0)
(void) nvlist_remove_all(props,
propname);
if (strcmp(propval,
ZFS_FEATURE_ENABLED) == 0 &&
!requested_features[i])
(void) fprintf(stderr, gettext(
"Warning: feature \"%s\" enabled "
"but is not in specified "
"'compatibility' feature set.\n"),
feat->fi_uname);
} else if (
enable_pool_features &&
feat->fi_zfs_mod_supported &&
@ -2367,6 +2414,10 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
(void) printf(gettext("all children offline"));
break;
case VDEV_AUX_BAD_LABEL:
(void) printf(gettext("invalid label"));
break;
default:
(void) printf(gettext("corrupted data"));
break;
@ -2509,6 +2560,10 @@ print_import_config(status_cbdata_t *cb, const char *name, nvlist_t *nv,
(void) printf(gettext("all children offline"));
break;
case VDEV_AUX_BAD_LABEL:
(void) printf(gettext("invalid label"));
break;
default:
(void) printf(gettext("corrupted data"));
break;
@ -2717,8 +2772,10 @@ show_import(nvlist_t *config, boolean_t report_error)
case ZPOOL_STATUS_FEAT_DISABLED:
printf_color(ANSI_BOLD, gettext("status: "));
printf_color(ANSI_YELLOW, gettext("Some supported and "
"requested features are not enabled on the pool.\n"));
printf_color(ANSI_YELLOW, gettext("Some supported "
"features are not enabled on the pool.\n\t"
"(Note that they may be intentionally disabled "
"if the\n\t'compatibility' property is set.)\n"));
break;
case ZPOOL_STATUS_COMPATIBILITY_ERR:
@ -2728,6 +2785,13 @@ show_import(nvlist_t *config, boolean_t report_error)
"property.\n"));
break;
case ZPOOL_STATUS_INCOMPATIBLE_FEAT:
printf_color(ANSI_BOLD, gettext("status: "));
printf_color(ANSI_YELLOW, gettext("One or more features "
"are enabled on the pool despite not being\n"
"requested by the 'compatibility' property.\n"));
break;
case ZPOOL_STATUS_UNSUP_FEAT_READ:
printf_color(ANSI_BOLD, gettext("status: "));
printf_color(ANSI_YELLOW, gettext("The pool uses the following "
@ -3734,9 +3798,10 @@ zpool_do_import(int argc, char **argv)
return (1);
}
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
err = import_pools(pools, props, mntopts, flags,
argc >= 1 ? argv[0] : NULL,
argc >= 2 ? argv[1] : NULL,
do_destroyed, pool_specified, do_all, &idata);
/*
* If we're using the cachefile and we failed to import, then
@ -3756,9 +3821,10 @@ zpool_do_import(int argc, char **argv)
nvlist_free(pools);
pools = zpool_search_import(g_zfs, &idata, &libzfs_config_ops);
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
err = import_pools(pools, props, mntopts, flags,
argc >= 1 ? argv[0] : NULL,
argc >= 2 ? argv[1] : NULL,
do_destroyed, pool_specified, do_all, &idata);
}
error:
@ -4965,8 +5031,8 @@ get_interval_count(int *argcp, char **argv, float *iv,
if (*end == '\0' && errno == 0) {
if (interval == 0) {
(void) fprintf(stderr, gettext("interval "
"cannot be zero\n"));
(void) fprintf(stderr, gettext(
"interval cannot be zero\n"));
usage(B_FALSE);
}
/*
@ -4996,8 +5062,8 @@ get_interval_count(int *argcp, char **argv, float *iv,
if (*end == '\0' && errno == 0) {
if (interval == 0) {
(void) fprintf(stderr, gettext("interval "
"cannot be zero\n"));
(void) fprintf(stderr, gettext(
"interval cannot be zero\n"));
usage(B_FALSE);
}
@ -6000,7 +6066,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
break;
case ZPOOL_PROP_HEALTH:
width = 8;
snprintf(propval, sizeof (propval), "%-*s", (int)width, str);
(void) strlcpy(propval, str, sizeof (propval));
break;
default:
zfs_nicenum_format(value, propval, sizeof (propval), format);
@ -8055,7 +8121,8 @@ status_callback(zpool_handle_t *zhp, void *data)
(reason == ZPOOL_STATUS_OK ||
reason == ZPOOL_STATUS_VERSION_OLDER ||
reason == ZPOOL_STATUS_FEAT_DISABLED ||
reason == ZPOOL_STATUS_COMPATIBILITY_ERR)) {
reason == ZPOOL_STATUS_COMPATIBILITY_ERR ||
reason == ZPOOL_STATUS_INCOMPATIBLE_FEAT)) {
if (!cbp->cb_allpools) {
(void) printf(gettext("pool '%s' is healthy\n"),
zpool_get_name(zhp));
@ -8238,7 +8305,7 @@ status_callback(zpool_handle_t *zhp, void *data)
printf_color(ANSI_YELLOW, gettext("Enable all features using "
"'zpool upgrade'. Once this is done,\n\tthe pool may no "
"longer be accessible by software that does not support\n\t"
"the features. See zpool-features(5) for details.\n"));
"the features. See zpool-features(7) for details.\n"));
break;
case ZPOOL_STATUS_COMPATIBILITY_ERR:
@ -8254,6 +8321,18 @@ status_callback(zpool_handle_t *zhp, void *data)
ZPOOL_DATA_COMPAT_D ".\n"));
break;
case ZPOOL_STATUS_INCOMPATIBLE_FEAT:
printf_color(ANSI_BOLD, gettext("status: "));
printf_color(ANSI_YELLOW, gettext("One or more features "
"are enabled on the pool despite not being\n\t"
"requested by the 'compatibility' property.\n"));
printf_color(ANSI_BOLD, gettext("action: "));
printf_color(ANSI_YELLOW, gettext("Consider setting "
"'compatibility' to an appropriate value, or\n\t"
"adding needed features to the relevant file in\n\t"
ZPOOL_SYSCONF_COMPAT_D " or " ZPOOL_DATA_COMPAT_D ".\n"));
break;
case ZPOOL_STATUS_UNSUP_FEAT_READ:
printf_color(ANSI_BOLD, gettext("status: "));
printf_color(ANSI_YELLOW, gettext("The pool cannot be accessed "
@ -8713,6 +8792,11 @@ upgrade_version(zpool_handle_t *zhp, uint64_t version)
verify(nvlist_lookup_uint64(config, ZPOOL_CONFIG_VERSION,
&oldversion) == 0);
char compat[ZFS_MAXPROPLEN];
if (zpool_get_prop(zhp, ZPOOL_PROP_COMPATIBILITY, compat,
ZFS_MAXPROPLEN, NULL, B_FALSE) != 0)
compat[0] = '\0';
assert(SPA_VERSION_IS_SUPPORTED(oldversion));
assert(oldversion < version);
@ -8727,6 +8811,13 @@ upgrade_version(zpool_handle_t *zhp, uint64_t version)
return (1);
}
if (strcmp(compat, ZPOOL_COMPAT_LEGACY) == 0) {
(void) fprintf(stderr, gettext("Upgrade not performed because "
"'compatibility' property set to '"
ZPOOL_COMPAT_LEGACY "'.\n"));
return (1);
}
ret = zpool_upgrade(zhp, version);
if (ret != 0)
return (ret);
@ -8803,7 +8894,7 @@ upgrade_cb(zpool_handle_t *zhp, void *arg)
upgrade_cbdata_t *cbp = arg;
nvlist_t *config;
uint64_t version;
boolean_t printnl = B_FALSE;
boolean_t modified_pool = B_FALSE;
int ret;
config = zpool_get_config(zhp, NULL);
@ -8817,7 +8908,7 @@ upgrade_cb(zpool_handle_t *zhp, void *arg)
ret = upgrade_version(zhp, cbp->cb_version);
if (ret != 0)
return (ret);
printnl = B_TRUE;
modified_pool = B_TRUE;
/*
* If they did "zpool upgrade -a", then we could
@ -8837,12 +8928,13 @@ upgrade_cb(zpool_handle_t *zhp, void *arg)
if (count > 0) {
cbp->cb_first = B_FALSE;
printnl = B_TRUE;
modified_pool = B_TRUE;
}
}
if (printnl) {
(void) printf(gettext("\n"));
if (modified_pool) {
(void) printf("\n");
(void) after_zpool_upgrade(zhp);
}
return (0);
@ -8868,7 +8960,10 @@ upgrade_list_older_cb(zpool_handle_t *zhp, void *arg)
"be upgraded to use feature flags. After "
"being upgraded, these pools\nwill no "
"longer be accessible by software that does not "
"support feature\nflags.\n\n"));
"support feature\nflags.\n\n"
"Note that setting a pool's 'compatibility' "
"feature to '" ZPOOL_COMPAT_LEGACY "' will\n"
"inhibit upgrades.\n\n"));
(void) printf(gettext("VER POOL\n"));
(void) printf(gettext("--- ------------\n"));
cbp->cb_first = B_FALSE;
@ -8913,8 +9008,12 @@ upgrade_list_disabled_cb(zpool_handle_t *zhp, void *arg)
"pool may become incompatible with "
"software\nthat does not support "
"the feature. See "
"zpool-features(5) for "
"details.\n\n"));
"zpool-features(7) for "
"details.\n\n"
"Note that the pool "
"'compatibility' feature can be "
"used to inhibit\nfeature "
"upgrades.\n\n"));
(void) printf(gettext("POOL "
"FEATURE\n"));
(void) printf(gettext("------"
@ -8948,7 +9047,7 @@ upgrade_list_disabled_cb(zpool_handle_t *zhp, void *arg)
static int
upgrade_one(zpool_handle_t *zhp, void *data)
{
boolean_t printnl = B_FALSE;
boolean_t modified_pool = B_FALSE;
upgrade_cbdata_t *cbp = data;
uint64_t cur_version;
int ret;
@ -8976,7 +9075,7 @@ upgrade_one(zpool_handle_t *zhp, void *data)
}
if (cur_version != cbp->cb_version) {
printnl = B_TRUE;
modified_pool = B_TRUE;
ret = upgrade_version(zhp, cbp->cb_version);
if (ret != 0)
return (ret);
@ -8989,7 +9088,7 @@ upgrade_one(zpool_handle_t *zhp, void *data)
return (ret);
if (count != 0) {
printnl = B_TRUE;
modified_pool = B_TRUE;
} else if (cur_version == SPA_VERSION) {
(void) printf(gettext("Pool '%s' already has all "
"supported and requested features enabled.\n"),
@ -8997,8 +9096,9 @@ upgrade_one(zpool_handle_t *zhp, void *data)
}
}
if (printnl) {
(void) printf(gettext("\n"));
if (modified_pool) {
(void) printf("\n");
(void) after_zpool_upgrade(zhp);
}
return (0);
@ -9970,6 +10070,63 @@ set_callback(zpool_handle_t *zhp, void *data)
int error;
set_cbdata_t *cb = (set_cbdata_t *)data;
/* Check if we have out-of-bounds features */
if (strcmp(cb->cb_propname, ZPOOL_CONFIG_COMPATIBILITY) == 0) {
boolean_t features[SPA_FEATURES];
if (zpool_do_load_compat(cb->cb_value, features) !=
ZPOOL_COMPATIBILITY_OK)
return (-1);
nvlist_t *enabled = zpool_get_features(zhp);
spa_feature_t i;
for (i = 0; i < SPA_FEATURES; i++) {
const char *fguid = spa_feature_table[i].fi_guid;
if (nvlist_exists(enabled, fguid) && !features[i])
break;
}
if (i < SPA_FEATURES)
(void) fprintf(stderr, gettext("Warning: one or "
"more features already enabled on pool '%s'\n"
"are not present in this compatibility set.\n"),
zpool_get_name(zhp));
}
/* if we're setting a feature, check it's in compatibility set */
if (zpool_prop_feature(cb->cb_propname) &&
strcmp(cb->cb_value, ZFS_FEATURE_ENABLED) == 0) {
char *fname = strchr(cb->cb_propname, '@') + 1;
spa_feature_t f;
if (zfeature_lookup_name(fname, &f) == 0) {
char compat[ZFS_MAXPROPLEN];
if (zpool_get_prop(zhp, ZPOOL_PROP_COMPATIBILITY,
compat, ZFS_MAXPROPLEN, NULL, B_FALSE) != 0)
compat[0] = '\0';
boolean_t features[SPA_FEATURES];
if (zpool_do_load_compat(compat, features) !=
ZPOOL_COMPATIBILITY_OK) {
(void) fprintf(stderr, gettext("Error: "
"cannot enable feature '%s' on pool '%s'\n"
"because the pool's 'compatibility' "
"property cannot be parsed.\n"),
fname, zpool_get_name(zhp));
return (-1);
}
if (!features[f]) {
(void) fprintf(stderr, gettext("Error: "
"cannot enable feature '%s' on pool '%s'\n"
"as it is not specified in this pool's "
"current compatibility set.\n"
"Consider setting 'compatibility' to a "
"less restrictive set, or to 'off'.\n"),
fname, zpool_get_name(zhp));
return (-1);
}
}
}
error = zpool_set_prop(zhp, cb->cb_propname, cb->cb_value);
if (!error)
@ -10492,28 +10649,25 @@ zpool_do_version(int argc, char **argv)
static zpool_compat_status_t
zpool_do_load_compat(const char *compat, boolean_t *list)
{
char badword[ZFS_MAXPROPLEN];
char badfile[MAXPATHLEN];
char report[1024];
zpool_compat_status_t ret;
switch (ret = zpool_load_compat(compat, list, badword, badfile)) {
ret = zpool_load_compat(compat, list, report, 1024);
switch (ret) {
case ZPOOL_COMPATIBILITY_OK:
break;
case ZPOOL_COMPATIBILITY_READERR:
(void) fprintf(stderr, gettext("error reading compatibility "
"file '%s'\n"), badfile);
break;
case ZPOOL_COMPATIBILITY_BADFILE:
(void) fprintf(stderr, gettext("compatibility file '%s' "
"too large or not newline-terminated\n"), badfile);
break;
case ZPOOL_COMPATIBILITY_BADWORD:
(void) fprintf(stderr, gettext("unknown feature '%s' in "
"compatibility file '%s'\n"), badword, badfile);
break;
case ZPOOL_COMPATIBILITY_NOFILES:
(void) fprintf(stderr, gettext("no compatibility files "
"specified\n"));
case ZPOOL_COMPATIBILITY_BADFILE:
case ZPOOL_COMPATIBILITY_BADTOKEN:
(void) fprintf(stderr, "Error: %s\n", report);
break;
case ZPOOL_COMPATIBILITY_WARNTOKEN:
(void) fprintf(stderr, "Warning: %s\n", report);
ret = ZPOOL_COMPATIBILITY_OK;
break;
}
return (ret);

View File

@ -129,6 +129,7 @@ int check_device(const char *path, boolean_t force,
boolean_t check_sector_size_database(char *path, int *sector_size);
void vdev_error(const char *fmt, ...);
int check_file(const char *file, boolean_t force, boolean_t isspare);
void after_zpool_upgrade(zpool_handle_t *zhp);
#ifdef __cplusplus
}

View File

@ -445,7 +445,7 @@ typedef struct replication_level {
/*
* N.B. For the purposes of comparing replication levels dRAID can be
* considered functionally equivilant to raidz.
* considered functionally equivalent to raidz.
*/
static boolean_t
is_raidz_mirror(replication_level_t *a, replication_level_t *b,

View File

@ -1360,7 +1360,7 @@
"type": "row"
},
{
"content": "I/O requests that are satisfied by accessing pool devices are managed by the ZIO scheduler.\nThe total latency is measured from the start of the I/O to completion by the disk.\nLatency through each queue is shown prior to its submission to the disk queue.\n\nThis view is useful for observing the effects of tuning the ZIO scheduler min and max values\n(see zfs-module-parameters(5) and [ZFS on Linux Module Parameters](https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZFS%20on%20Linux%20Module%20Parameters.html)):\n+ *zfs_vdev_max_active* controls the ZIO scheduler's disk queue depth (do not confuse with the block device's nr_requests)\n+ *zfs_vdev_sync_read_min_active* and *zfs_vdev_sync_read_max_active* control the synchronous queue for reads: most reads are sync\n+ *zfs_vdev_sync_write_min_active* and *zfs_vdev_sync_write_max_active* control the synchronous queue for writes: \nusually metadata or user data depending on the \"sync\" property setting or I/Os that are requested to be flushed\n+ *zfs_vdev_async_read_min_active* and *zfs_vdev_async_read_max_active* control the asynchronous queue for reads: usually prefetches\n+ *zfs_vdev_async_write_min_active* and *zfs_vdev_async_write_max_active* control the asynchronous queue for writes: \nusually the bulk of all writes at transaction group (txg) commit\n+ *zfs_vdev_scrub_min_active* and *zfs_vdev_scrub_max_active* controls the scan reads: usually scrub or resilver\n\n",
"content": "I/O requests that are satisfied by accessing pool devices are managed by the ZIO scheduler.\nThe total latency is measured from the start of the I/O to completion by the disk.\nLatency through each queue is shown prior to its submission to the disk queue.\n\nThis view is useful for observing the effects of tuning the ZIO scheduler min and max values\n(see zfs(4) and [ZFS on Linux Module Parameters](https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZFS%20on%20Linux%20Module%20Parameters.html)):\n+ *zfs_vdev_max_active* controls the ZIO scheduler's disk queue depth (do not confuse with the block device's nr_requests)\n+ *zfs_vdev_sync_read_min_active* and *zfs_vdev_sync_read_max_active* control the synchronous queue for reads: most reads are sync\n+ *zfs_vdev_sync_write_min_active* and *zfs_vdev_sync_write_max_active* control the synchronous queue for writes: \nusually metadata or user data depending on the \"sync\" property setting or I/Os that are requested to be flushed\n+ *zfs_vdev_async_read_min_active* and *zfs_vdev_async_read_max_active* control the asynchronous queue for reads: usually prefetches\n+ *zfs_vdev_async_write_min_active* and *zfs_vdev_async_write_max_active* control the asynchronous queue for writes: \nusually the bulk of all writes at transaction group (txg) commit\n+ *zfs_vdev_scrub_min_active* and *zfs_vdev_scrub_max_active* controls the scan reads: usually scrub or resilver\n\n",
"datasource": "${DS_MACBOOK-INFLUX}",
"fieldConfig": {
"defaults": {
@ -1664,4 +1664,4 @@
"list": []
},
"version": 2
}
}

View File

@ -683,9 +683,8 @@ print_recursive_stats(stat_printer_f func, nvlist_t *nvroot,
if (descend && nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_CHILDREN,
&child, &children) == 0) {
(void) strncpy(vdev_name, get_vdev_name(nvroot, parent_name),
(void) strlcpy(vdev_name, get_vdev_name(nvroot, parent_name),
sizeof (vdev_name));
vdev_name[sizeof (vdev_name) - 1] = '\0';
for (c = 0; c < children; c++) {
print_recursive_stats(func, child[c], pool_name,

View File

@ -15,3 +15,6 @@ zstream_LDADD = \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
include $(top_srcdir)/config/CppCheck.am
install-exec-hook:
cd $(DESTDIR)$(sbindir) && $(LN_S) -f zstream zstreamdump

View File

@ -49,6 +49,11 @@ zstream_usage(void)
int
main(int argc, char *argv[])
{
char *basename = strrchr(argv[0], '/');
basename = basename ? (basename + 1) : argv[0];
if (argc >= 1 && strcmp(basename, "zstreamdump") == 0)
return (zstream_do_dump(argc, argv));
if (argc < 2)
zstream_usage();

View File

@ -1 +0,0 @@
dist_sbin_SCRIPTS = zstreamdump

View File

@ -1,3 +0,0 @@
#!/bin/sh
zstream dump "$@"

View File

@ -124,6 +124,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>
#include <signal.h>
#include <umem.h>
#include <ctype.h>
@ -192,30 +193,58 @@ typedef struct ztest_shared_opts {
char zo_gvars[ZO_GVARS_MAX_COUNT][ZO_GVARS_MAX_ARGLEN];
} ztest_shared_opts_t;
/* Default values for command line options. */
#define DEFAULT_POOL "ztest"
#define DEFAULT_VDEV_DIR "/tmp"
#define DEFAULT_VDEV_COUNT 5
#define DEFAULT_VDEV_SIZE (SPA_MINDEVSIZE * 4) /* 256m default size */
#define DEFAULT_VDEV_SIZE_STR "256M"
#define DEFAULT_ASHIFT SPA_MINBLOCKSHIFT
#define DEFAULT_MIRRORS 2
#define DEFAULT_RAID_CHILDREN 4
#define DEFAULT_RAID_PARITY 1
#define DEFAULT_DRAID_DATA 4
#define DEFAULT_DRAID_SPARES 1
#define DEFAULT_DATASETS_COUNT 7
#define DEFAULT_THREADS 23
#define DEFAULT_RUN_TIME 300 /* 300 seconds */
#define DEFAULT_RUN_TIME_STR "300 sec"
#define DEFAULT_PASS_TIME 60 /* 60 seconds */
#define DEFAULT_PASS_TIME_STR "60 sec"
#define DEFAULT_KILL_RATE 70 /* 70% kill rate */
#define DEFAULT_KILLRATE_STR "70%"
#define DEFAULT_INITS 1
#define DEFAULT_MAX_LOOPS 50 /* 5 minutes */
#define DEFAULT_FORCE_GANGING (64 << 10)
#define DEFAULT_FORCE_GANGING_STR "64K"
/* Simplifying assumption: -1 is not a valid default. */
#define NO_DEFAULT -1
static const ztest_shared_opts_t ztest_opts_defaults = {
.zo_pool = "ztest",
.zo_dir = "/tmp",
.zo_pool = DEFAULT_POOL,
.zo_dir = DEFAULT_VDEV_DIR,
.zo_alt_ztest = { '\0' },
.zo_alt_libpath = { '\0' },
.zo_vdevs = 5,
.zo_ashift = SPA_MINBLOCKSHIFT,
.zo_mirrors = 2,
.zo_raid_children = 4,
.zo_raid_parity = 1,
.zo_vdevs = DEFAULT_VDEV_COUNT,
.zo_ashift = DEFAULT_ASHIFT,
.zo_mirrors = DEFAULT_MIRRORS,
.zo_raid_children = DEFAULT_RAID_CHILDREN,
.zo_raid_parity = DEFAULT_RAID_PARITY,
.zo_raid_type = VDEV_TYPE_RAIDZ,
.zo_vdev_size = SPA_MINDEVSIZE * 4, /* 256m default size */
.zo_draid_data = 4, /* data drives */
.zo_draid_spares = 1, /* distributed spares */
.zo_datasets = 7,
.zo_threads = 23,
.zo_passtime = 60, /* 60 seconds */
.zo_killrate = 70, /* 70% kill rate */
.zo_vdev_size = DEFAULT_VDEV_SIZE,
.zo_draid_data = DEFAULT_DRAID_DATA, /* data drives */
.zo_draid_spares = DEFAULT_DRAID_SPARES, /* distributed spares */
.zo_datasets = DEFAULT_DATASETS_COUNT,
.zo_threads = DEFAULT_THREADS,
.zo_passtime = DEFAULT_PASS_TIME,
.zo_killrate = DEFAULT_KILL_RATE,
.zo_verbose = 0,
.zo_mmp_test = 0,
.zo_init = 1,
.zo_time = 300, /* 5 minutes */
.zo_maxloops = 50, /* max loops during spa_freeze() */
.zo_metaslab_force_ganging = 64 << 10,
.zo_init = DEFAULT_INITS,
.zo_time = DEFAULT_RUN_TIME,
.zo_maxloops = DEFAULT_MAX_LOOPS, /* max loops during spa_freeze() */
.zo_metaslab_force_ganging = DEFAULT_FORCE_GANGING,
.zo_special_vdevs = ZTEST_VDEV_CLASS_RND,
.zo_gvars_count = 0,
};
@ -684,68 +713,154 @@ nicenumtoull(const char *buf)
return (val);
}
typedef struct ztest_option {
const char short_opt;
const char *long_opt;
const char *long_opt_param;
const char *comment;
unsigned int default_int;
char *default_str;
} ztest_option_t;
/*
* The following option_table is used for generating the usage info as well as
* the long and short option information for calling getopt_long().
*/
static ztest_option_t option_table[] = {
{ 'v', "vdevs", "INTEGER", "Number of vdevs", DEFAULT_VDEV_COUNT,
NULL},
{ 's', "vdev-size", "INTEGER", "Size of each vdev",
NO_DEFAULT, DEFAULT_VDEV_SIZE_STR},
{ 'a', "alignment-shift", "INTEGER",
"Alignment shift; use 0 for random", DEFAULT_ASHIFT, NULL},
{ 'm', "mirror-copies", "INTEGER", "Number of mirror copies",
DEFAULT_MIRRORS, NULL},
{ 'r', "raid-disks", "INTEGER", "Number of raidz/draid disks",
DEFAULT_RAID_CHILDREN, NULL},
{ 'R', "raid-parity", "INTEGER", "Raid parity",
DEFAULT_RAID_PARITY, NULL},
{ 'K', "raid-kind", "raidz|draid|random", "Raid kind",
NO_DEFAULT, "random"},
{ 'D', "draid-data", "INTEGER", "Number of draid data drives",
DEFAULT_DRAID_DATA, NULL},
{ 'S', "draid-spares", "INTEGER", "Number of draid spares",
DEFAULT_DRAID_SPARES, NULL},
{ 'd', "datasets", "INTEGER", "Number of datasets",
DEFAULT_DATASETS_COUNT, NULL},
{ 't', "threads", "INTEGER", "Number of ztest threads",
DEFAULT_THREADS, NULL},
{ 'g', "gang-block-threshold", "INTEGER",
"Metaslab gang block threshold",
NO_DEFAULT, DEFAULT_FORCE_GANGING_STR},
{ 'i', "init-count", "INTEGER", "Number of times to initialize pool",
DEFAULT_INITS, NULL},
{ 'k', "kill-percentage", "INTEGER", "Kill percentage",
NO_DEFAULT, DEFAULT_KILLRATE_STR},
{ 'p', "pool-name", "STRING", "Pool name",
NO_DEFAULT, DEFAULT_POOL},
{ 'f', "vdev-file-directory", "PATH", "File directory for vdev files",
NO_DEFAULT, DEFAULT_VDEV_DIR},
{ 'M', "multi-host", NULL,
"Multi-host; simulate pool imported on remote host",
NO_DEFAULT, NULL},
{ 'E', "use-existing-pool", NULL,
"Use existing pool instead of creating new one", NO_DEFAULT, NULL},
{ 'T', "run-time", "INTEGER", "Total run time",
NO_DEFAULT, DEFAULT_RUN_TIME_STR},
{ 'P', "pass-time", "INTEGER", "Time per pass",
NO_DEFAULT, DEFAULT_PASS_TIME_STR},
{ 'F', "freeze-loops", "INTEGER", "Max loops in spa_freeze()",
DEFAULT_MAX_LOOPS, NULL},
{ 'B', "alt-ztest", "PATH", "Alternate ztest path",
NO_DEFAULT, NULL},
{ 'C', "vdev-class-state", "on|off|random", "vdev class state",
NO_DEFAULT, "random"},
{ 'o', "option", "\"OPTION=INTEGER\"",
"Set global variable to an unsigned 32-bit integer value",
NO_DEFAULT, NULL},
{ 'G', "dump-debug-msg", NULL,
"Dump zfs_dbgmsg buffer before exiting due to an error",
NO_DEFAULT, NULL},
{ 'V', "verbose", NULL,
"Verbose (use multiple times for ever more verbosity)",
NO_DEFAULT, NULL},
{ 'h', "help", NULL, "Show this help",
NO_DEFAULT, NULL},
{0, 0, 0, 0, 0, 0}
};
static struct option *long_opts = NULL;
static char *short_opts = NULL;
static void
init_options(void)
{
ASSERT3P(long_opts, ==, NULL);
ASSERT3P(short_opts, ==, NULL);
int count = sizeof (option_table) / sizeof (option_table[0]);
long_opts = umem_alloc(sizeof (struct option) * count, UMEM_NOFAIL);
short_opts = umem_alloc(sizeof (char) * 2 * count, UMEM_NOFAIL);
int short_opt_index = 0;
for (int i = 0; i < count; i++) {
long_opts[i].val = option_table[i].short_opt;
long_opts[i].name = option_table[i].long_opt;
long_opts[i].has_arg = option_table[i].long_opt_param != NULL
? required_argument : no_argument;
long_opts[i].flag = NULL;
short_opts[short_opt_index++] = option_table[i].short_opt;
if (option_table[i].long_opt_param != NULL) {
short_opts[short_opt_index++] = ':';
}
}
}
static void
fini_options(void)
{
int count = sizeof (option_table) / sizeof (option_table[0]);
umem_free(long_opts, sizeof (struct option) * count);
umem_free(short_opts, sizeof (char) * 2 * count);
long_opts = NULL;
short_opts = NULL;
}
static void
usage(boolean_t requested)
{
const ztest_shared_opts_t *zo = &ztest_opts_defaults;
char nice_vdev_size[NN_NUMBUF_SZ];
char nice_force_ganging[NN_NUMBUF_SZ];
char option[80];
FILE *fp = requested ? stdout : stderr;
nicenum(zo->zo_vdev_size, nice_vdev_size, sizeof (nice_vdev_size));
nicenum(zo->zo_metaslab_force_ganging, nice_force_ganging,
sizeof (nice_force_ganging));
(void) fprintf(fp, "Usage: %s [OPTIONS...]\n", DEFAULT_POOL);
for (int i = 0; option_table[i].short_opt != 0; i++) {
if (option_table[i].long_opt_param != NULL) {
(void) sprintf(option, " -%c --%s=%s",
option_table[i].short_opt,
option_table[i].long_opt,
option_table[i].long_opt_param);
} else {
(void) sprintf(option, " -%c --%s",
option_table[i].short_opt,
option_table[i].long_opt);
}
(void) fprintf(fp, " %-40s%s", option,
option_table[i].comment);
(void) fprintf(fp, "Usage: %s\n"
"\t[-v vdevs (default: %llu)]\n"
"\t[-s size_of_each_vdev (default: %s)]\n"
"\t[-a alignment_shift (default: %d)] use 0 for random\n"
"\t[-m mirror_copies (default: %d)]\n"
"\t[-r raidz_disks / draid_disks (default: %d)]\n"
"\t[-R raid_parity (default: %d)]\n"
"\t[-K raid_kind (default: random)] raidz|draid|random\n"
"\t[-D draid_data (default: %d)] in config\n"
"\t[-S draid_spares (default: %d)]\n"
"\t[-d datasets (default: %d)]\n"
"\t[-t threads (default: %d)]\n"
"\t[-g gang_block_threshold (default: %s)]\n"
"\t[-i init_count (default: %d)] initialize pool i times\n"
"\t[-k kill_percentage (default: %llu%%)]\n"
"\t[-p pool_name (default: %s)]\n"
"\t[-f dir (default: %s)] file directory for vdev files\n"
"\t[-M] Multi-host simulate pool imported on remote host\n"
"\t[-V] verbose (use multiple times for ever more blather)\n"
"\t[-E] use existing pool instead of creating new one\n"
"\t[-T time (default: %llu sec)] total run time\n"
"\t[-F freezeloops (default: %llu)] max loops in spa_freeze()\n"
"\t[-P passtime (default: %llu sec)] time per pass\n"
"\t[-B alt_ztest (default: <none>)] alternate ztest path\n"
"\t[-C vdev class state (default: random)] special=on|off|random\n"
"\t[-o variable=value] ... set global variable to an unsigned\n"
"\t 32-bit integer value\n"
"\t[-G dump zfs_dbgmsg buffer before exiting due to an error\n"
"\t[-h] (print help)\n"
"",
zo->zo_pool,
(u_longlong_t)zo->zo_vdevs, /* -v */
nice_vdev_size, /* -s */
zo->zo_ashift, /* -a */
zo->zo_mirrors, /* -m */
zo->zo_raid_children, /* -r */
zo->zo_raid_parity, /* -R */
zo->zo_draid_data, /* -D */
zo->zo_draid_spares, /* -S */
zo->zo_datasets, /* -d */
zo->zo_threads, /* -t */
nice_force_ganging, /* -g */
zo->zo_init, /* -i */
(u_longlong_t)zo->zo_killrate, /* -k */
zo->zo_pool, /* -p */
zo->zo_dir, /* -f */
(u_longlong_t)zo->zo_time, /* -T */
(u_longlong_t)zo->zo_maxloops, /* -F */
(u_longlong_t)zo->zo_passtime);
if (option_table[i].long_opt_param != NULL) {
if (option_table[i].default_str != NULL) {
(void) fprintf(fp, " (default: %s)",
option_table[i].default_str);
} else if (option_table[i].default_int != NO_DEFAULT) {
(void) fprintf(fp, " (default: %u)",
option_table[i].default_int);
}
}
(void) fprintf(fp, "\n");
}
exit(requested ? 0 : 1);
}
@ -817,8 +932,10 @@ process_options(int argc, char **argv)
bcopy(&ztest_opts_defaults, zo, sizeof (*zo));
while ((opt = getopt(argc, argv,
"v:s:a:m:r:R:K:D:S:d:t:g:i:k:p:f:MVET:P:hF:B:C:o:G")) != EOF) {
init_options();
while ((opt = getopt_long(argc, argv, short_opts, long_opts,
NULL)) != EOF) {
value = 0;
switch (opt) {
case 'v':
@ -953,6 +1070,8 @@ process_options(int argc, char **argv)
}
}
fini_options();
/* When raid choice is 'random' add a draid pool 50% of the time */
if (strcmp(raid_kind, "random") == 0) {
(void) strlcpy(raid_kind, (ztest_random(2) == 0) ?
@ -5979,7 +6098,7 @@ ztest_fault_inject(ztest_ds_t *zd, uint64_t id)
vd0->vdev_resilver_txg != 0)) {
/*
* Make vd0 explicitly claim to be unreadable,
* or unwriteable, or reach behind its back
* or unwritable, or reach behind its back
* and close the underlying fd. We can do this if
* maxfaults == 0 because we'll fail and reexecute,
* and we can do it if maxfaults >= 2 because we'll
@ -6550,7 +6669,7 @@ ztest_initialize(ztest_ds_t *zd, uint64_t id)
char *path = strdup(rand_vd->vdev_path);
boolean_t active = rand_vd->vdev_initialize_thread != NULL;
zfs_dbgmsg("vd %px, guid %llu", rand_vd, guid);
zfs_dbgmsg("vd %px, guid %llu", rand_vd, (u_longlong_t)guid);
spa_config_exit(spa, SCL_VDEV, FTAG);
uint64_t cmd = ztest_random(POOL_INITIALIZE_FUNCS);
@ -6622,7 +6741,7 @@ ztest_trim(ztest_ds_t *zd, uint64_t id)
char *path = strdup(rand_vd->vdev_path);
boolean_t active = rand_vd->vdev_trim_thread != NULL;
zfs_dbgmsg("vd %p, guid %llu", rand_vd, guid);
zfs_dbgmsg("vd %p, guid %llu", rand_vd, (u_longlong_t)guid);
spa_config_exit(spa, SCL_VDEV, FTAG);
uint64_t cmd = ztest_random(POOL_TRIM_FUNCS);

View File

@ -38,40 +38,39 @@
static int
ioctl_get_msg(char *var, int fd)
{
int error = 0;
int ret;
char msg[ZFS_MAX_DATASET_NAME_LEN];
error = ioctl(fd, BLKZNAME, msg);
if (error < 0) {
return (error);
ret = ioctl(fd, BLKZNAME, msg);
if (ret < 0) {
return (ret);
}
snprintf(var, ZFS_MAX_DATASET_NAME_LEN, "%s", msg);
return (error);
return (ret);
}
int
main(int argc, char **argv)
{
int fd, error = 0;
int fd = -1, ret = 0, status = EXIT_FAILURE;
char zvol_name[ZFS_MAX_DATASET_NAME_LEN];
char *zvol_name_part = NULL;
char *dev_name;
struct stat64 statbuf;
int dev_minor, dev_part;
int i;
int rc;
if (argc < 2) {
printf("Usage: %s /dev/zvol_device_node\n", argv[0]);
return (EINVAL);
fprintf(stderr, "Usage: %s /dev/zvol_device_node\n", argv[0]);
goto fail;
}
dev_name = argv[1];
error = stat64(dev_name, &statbuf);
if (error != 0) {
printf("Unable to access device file: %s\n", dev_name);
return (errno);
ret = stat64(dev_name, &statbuf);
if (ret != 0) {
fprintf(stderr, "Unable to access device file: %s\n", dev_name);
goto fail;
}
dev_minor = minor(statbuf.st_rdev);
@ -79,23 +78,23 @@ main(int argc, char **argv)
fd = open(dev_name, O_RDONLY);
if (fd < 0) {
printf("Unable to open device file: %s\n", dev_name);
return (errno);
fprintf(stderr, "Unable to open device file: %s\n", dev_name);
goto fail;
}
error = ioctl_get_msg(zvol_name, fd);
if (error < 0) {
printf("ioctl_get_msg failed:%s\n", strerror(errno));
return (errno);
ret = ioctl_get_msg(zvol_name, fd);
if (ret < 0) {
fprintf(stderr, "ioctl_get_msg failed: %s\n", strerror(errno));
goto fail;
}
if (dev_part > 0)
rc = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
ret = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
dev_part);
else
rc = asprintf(&zvol_name_part, "%s", zvol_name);
ret = asprintf(&zvol_name_part, "%s", zvol_name);
if (rc == -1 || zvol_name_part == NULL)
goto error;
if (ret == -1 || zvol_name_part == NULL)
goto fail;
for (i = 0; i < strlen(zvol_name_part); i++) {
if (isblank(zvol_name_part[i]))
@ -103,8 +102,13 @@ main(int argc, char **argv)
}
printf("%s\n", zvol_name_part);
free(zvol_name_part);
error:
close(fd);
return (error);
status = EXIT_SUCCESS;
fail:
if (zvol_name_part)
free(zvol_name_part);
if (fd >= 0)
close(fd);
return (status);
}

View File

@ -1 +1,3 @@
include $(top_srcdir)/config/Shellcheck.am
dist_bin_SCRIPTS = zvol_wait

View File

@ -9,45 +9,42 @@ count_zvols() {
}
filter_out_zvols_with_links() {
while read -r zvol; do
if [ ! -L "/dev/zvol/$zvol" ]; then
echo "$zvols" | tr ' ' '+' | while read -r zvol; do
if ! [ -L "/dev/zvol/$zvol" ]; then
echo "$zvol"
fi
done
done | tr '+' ' '
}
filter_out_deleted_zvols() {
while read -r zvol; do
if zfs list "$zvol" >/dev/null 2>&1; then
echo "$zvol"
fi
done
OIFS="$IFS"
IFS="
"
# shellcheck disable=SC2086
zfs list -H -o name $zvols 2>/dev/null
IFS="$OIFS"
}
list_zvols() {
read -r default_volmode < /sys/module/zfs/parameters/zvol_volmode
zfs list -t volume -H -o \
name,volmode,receive_resume_token,redact_snaps |
while read -r zvol_line; do
name=$(echo "$zvol_line" | awk '{print $1}')
volmode=$(echo "$zvol_line" | awk '{print $2}')
token=$(echo "$zvol_line" | awk '{print $3}')
redacted=$(echo "$zvol_line" | awk '{print $4}')
#
name,volmode,receive_resume_token,redact_snaps |
while IFS=" " read -r name volmode token redacted; do # IFS=\t here!
# /dev links are not created for zvols with volmode = "none"
# or for redacted zvols.
#
[ "$volmode" = "none" ] && continue
[ "$volmode" = "default" ] && [ "$default_volmode" = "3" ] &&
continue
[ "$redacted" = "-" ] || continue
#
# We also also ignore partially received zvols if it is
# We also ignore partially received zvols if it is
# not an incremental receive, as those won't even have a block
# device minor node created yet.
#
if [ "$token" != "-" ]; then
#
# Incremental receives create an invisible clone that
# is not automatically displayed by zfs list.
#
if ! zfs list "$name/%recv" >/dev/null 2>&1; then
continue
fi
@ -75,7 +72,7 @@ while [ "$outer_loop" -lt 20 ]; do
while [ "$inner_loop" -lt 30 ]; do
inner_loop=$((inner_loop + 1))
zvols="$(echo "$zvols" | filter_out_zvols_with_links)"
zvols="$(filter_out_zvols_with_links)"
zvols_count=$(count_zvols)
if [ "$zvols_count" -eq 0 ]; then
@ -95,7 +92,7 @@ while [ "$outer_loop" -lt 20 ]; do
echo "No progress since last loop."
echo "Checking if any zvols were deleted."
zvols=$(echo "$zvols" | filter_out_deleted_zvols)
zvols=$(filter_out_deleted_zvols)
zvols_count=$(count_zvols)
if [ "$old_zvols_count" -ne "$zvols_count" ]; then

View File

@ -1,5 +1,5 @@
#
# Default rules for running cppcheck against the the user space components.
# Default rules for running cppcheck against the user space components.
#
PHONY += cppcheck

View File

@ -39,7 +39,6 @@ AM_CPPFLAGS = -D_GNU_SOURCE
AM_CPPFLAGS += -D_REENTRANT
AM_CPPFLAGS += -D_FILE_OFFSET_BITS=64
AM_CPPFLAGS += -D_LARGEFILE64_SOURCE
AM_CPPFLAGS += -DHAVE_LARGE_STACKS=1
AM_CPPFLAGS += -DLIBEXECDIR=\"$(libexecdir)\"
AM_CPPFLAGS += -DRUNSTATEDIR=\"$(runstatedir)\"
AM_CPPFLAGS += -DSBINDIR=\"$(sbindir)\"

22
config/Shellcheck.am Normal file
View File

@ -0,0 +1,22 @@
.PHONY: shellcheck
shellcheck: $(SCRIPTS) $(SHELLCHECKSCRIPTS)
if HAVE_SHELLCHECK
[ -z "$(SCRIPTS)$(SHELLCHECKSCRIPTS)" ] && exit; shellcheck $$([ -n "$(SHELLCHECK_SHELL)" ] && echo "--shell=$(SHELLCHECK_SHELL)") --exclude=SC1090,SC1091$(SHELLCHECK_IGNORE) --format=gcc $(SCRIPTS) $(SHELLCHECKSCRIPTS)
else
@[ -z "$(SCRIPTS)$(SHELLCHECKSCRIPTS)" ] && exit; echo "skipping shellcheck of" $(SCRIPTS) $(SHELLCHECKSCRIPTS) "because shellcheck is not installed"
endif
@set -e; for dir in $(SHELLCHECKDIRS); do $(MAKE) -C $$dir shellcheck; done
# command -v *is* specified by POSIX and every shell in existence supports it
.PHONY: checkbashisms
checkbashisms: $(SCRIPTS) $(SHELLCHECKSCRIPTS)
if HAVE_CHECKBASHISMS
[ -z "$(SCRIPTS)$(SHELLCHECKSCRIPTS)" ] && exit; ! if [ -z "$(SHELLCHECK_SHELL)" ]; then \
checkbashisms -npx $(SCRIPTS) $(SHELLCHECKSCRIPTS); else \
for f in $(SCRIPTS) $(SHELLCHECKSCRIPTS); do echo $$f >&3; { echo '#!/bin/$(SHELLCHECK_SHELL)'; cat $$f; } | checkbashisms -npx; done; \
fi 3>&2 2>&1 | grep -vFe "'command' with option other than -p" -e 'command -v' $(CHECKBASHISMS_IGNORE) >&2
else
@[ -z "$(SCRIPTS)$(SHELLCHECKSCRIPTS)" ] && exit; echo "skipping checkbashisms of" $(SCRIPTS) $(SHELLCHECKSCRIPTS) "because checkbashisms is not installed"
endif
@set -e; for dir in $(SHELLCHECKDIRS); do $(MAKE) -C $$dir checkbashisms; done

View File

@ -0,0 +1,10 @@
dnl #
dnl # Check if shellcheck and/or checkbashisms are available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_SHELLCHECK], [
AC_CHECK_PROG([SHELLCHECK], [shellcheck], [yes])
AC_CHECK_PROG([CHECKBASHISMS], [checkbashisms], [yes])
AM_CONDITIONAL([HAVE_SHELLCHECK], [test "x$SHELLCHECK" = "xyes"])
AM_CONDITIONAL([HAVE_CHECKBASHISMS], [test "x$CHECKBASHISMS" = "xyes"])
])

View File

@ -148,8 +148,7 @@ variable to configure. See ``configure --help'' for reference.
# Check if you have distutils, else fail
#
AC_MSG_CHECKING([for the distutils Python package])
ac_distutils_result=`$PYTHON -c "import distutils" 2>&1`
if test $? -eq 0; then
if ac_distutils_result=`$PYTHON -c "import distutils" 2>&1`; then
AC_MSG_RESULT([yes])
else
AC_MSG_RESULT([no])

View File

@ -14,7 +14,19 @@ deb-local:
"*** package for your distribution which provides ${ALIEN},\n" \
"*** re-run configure, and try again.\n"; \
exit 1; \
fi)
fi; \
if test "${ALIEN_MAJOR}" = "8" && \
test "${ALIEN_MINOR}" = "95"; then \
if test "${ALIEN_POINT}" = "1" || \
test "${ALIEN_POINT}" = "2" || \
test "${ALIEN_POINT}" = "3"; then \
/bin/echo -e "\n" \
"*** Installed version of ${ALIEN} is known to be broken;\n" \
"*** attempting to generate debs will fail! See\n" \
"*** https://github.com/openzfs/zfs/issues/11650 for details.\n"; \
exit 1; \
fi; \
fi)
deb-kmod: deb-local rpm-kmod
name=${PACKAGE}; \
@ -43,9 +55,9 @@ deb-utils: deb-local rpm-utils-initramfs
pkg1=$${name}-$${version}.$${arch}.rpm; \
pkg2=libnvpair3-$${version}.$${arch}.rpm; \
pkg3=libuutil3-$${version}.$${arch}.rpm; \
pkg4=libzfs4-$${version}.$${arch}.rpm; \
pkg5=libzpool4-$${version}.$${arch}.rpm; \
pkg6=libzfs4-devel-$${version}.$${arch}.rpm; \
pkg4=libzfs5-$${version}.$${arch}.rpm; \
pkg5=libzpool5-$${version}.$${arch}.rpm; \
pkg6=libzfs5-devel-$${version}.$${arch}.rpm; \
pkg7=$${name}-test-$${version}.$${arch}.rpm; \
pkg8=$${name}-dracut-$${version}.noarch.rpm; \
pkg9=$${name}-initramfs-$${version}.$${arch}.rpm; \
@ -56,7 +68,7 @@ deb-utils: deb-local rpm-utils-initramfs
path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \
echo "#$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "`which dh_shlibdeps` -- \
-xlibuutil3linux -xlibnvpair3linux -xlibzfs4linux -xlibzpool4linux" \
-xlibuutil3linux -xlibnvpair3linux -xlibzfs5linux -xlibzpool5linux" \
>> $${path_prepend}/dh_shlibdeps; \
## These -x arguments are passed to dpkg-shlibdeps, which exclude the
## Debianized packages from the auto-generated dependencies of the new debs,

View File

@ -189,7 +189,22 @@ dnl #
dnl # 3.14 API change,
dnl # Check if inode_operations contains the function set_acl
dnl #
dnl # 5.12 API change,
dnl # set_acl() added a user_namespace* parameter first
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns], [
#include <linux/fs.h>
int set_acl_fn(struct user_namespace *userns,
struct inode *inode, struct posix_acl *acl,
int type) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.set_acl = set_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_set_acl], [
#include <linux/fs.h>
@ -205,11 +220,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_MSG_CHECKING([whether iops->set_acl() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS, 1, [iops->set_acl() takes 4 args])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
],[
AC_MSG_RESULT(no)
])
])
])

View File

@ -8,7 +8,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BDI], [
], [
char *name = "bdi";
atomic_long_t zfs_bdi_seq;
int error __attribute__((unused)) =
int error __attribute__((unused));
atomic_long_set(&zfs_bdi_seq, 0);
error =
super_setup_bdi_name(&sb, "%.28s-%ld", name,
atomic_long_inc_return(&zfs_bdi_seq));
])

View File

@ -23,8 +23,8 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_PLUG], [
])
dnl #
dnl # 2.6.32 - 4.11, statically allocated bdi in request_queue
dnl # 4.12 - x.y, dynamically allocated bdi in request_queue
dnl # 2.6.32 - 4.11: statically allocated bdi in request_queue
dnl # 4.12: dynamically allocated bdi in request_queue
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_BDI], [
ZFS_LINUX_TEST_SRC([blk_queue_bdi], [
@ -48,7 +48,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_BDI], [
])
dnl #
dnl # 2.6.32 - 4.x API,
dnl # 2.6.32 API,
dnl # blk_queue_discard()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD], [
@ -71,7 +71,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_DISCARD], [
])
dnl #
dnl # 4.8 - 4.x API,
dnl # 4.8 API,
dnl # blk_queue_secure_erase()
dnl #
dnl # 2.6.36 - 4.7 API,

View File

@ -120,7 +120,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
AC_MSG_CHECKING([whether bdev_disk_changed() exists])
AC_MSG_CHECKING([whether bdev_check_media_change() exists])
ZFS_LINUX_TEST_RESULT([bdev_check_media_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_CHECK_MEDIA_CHANGE, 1,

View File

@ -52,12 +52,44 @@ AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
])
])
dnl #
dnl # 5.13 API change
dnl # block_device_operations->revalidate_disk() was removed
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
ZFS_LINUX_TEST_SRC([block_device_operations_revalidate_disk], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
return(0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
AC_MSG_CHECKING([whether bops->revalidate_disk() exists])
ZFS_LINUX_TEST_RESULT([block_device_operations_revalidate_disk], [
AC_DEFINE([HAVE_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [1],
[Define if revalidate_disk() in block_device_operations])
AC_MSG_RESULT(yes)
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
])

View File

@ -19,7 +19,6 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
])
])
ZFS_AC_KERNEL_SRC_CONFIG_THREAD_SIZE
ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_SRC_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
@ -29,42 +28,12 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
ZFS_LINUX_TEST_COMPILE_ALL([config])
AC_MSG_RESULT([done])
ZFS_AC_KERNEL_CONFIG_THREAD_SIZE
ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_DEFLATE
])
dnl #
dnl # Check configured THREAD_SIZE
dnl #
dnl # The stack size will vary by architecture, but as of Linux 3.15 on x86_64
dnl # the default thread stack size was increased to 16K from 8K. Therefore,
dnl # on newer kernels and some architectures stack usage optimizations can be
dnl # conditionally applied to improve performance without negatively impacting
dnl # stability.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_THREAD_SIZE], [
ZFS_LINUX_TEST_SRC([config_thread_size], [
#include <linux/module.h>
],[
#if (THREAD_SIZE < 16384)
#error "THREAD_SIZE is less than 16K"
#endif
])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_THREAD_SIZE], [
AC_MSG_CHECKING([whether kernel was built with 16K or larger stacks])
ZFS_LINUX_TEST_RESULT([config_thread_size], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_LARGE_STACKS, 1, [kernel has large stacks])
],[
AC_MSG_RESULT([no])
])
])
dnl #
dnl # Check CONFIG_DEBUG_LOCK_ALLOC
dnl #

View File

@ -16,7 +16,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_FILLATTR_USERNS], [
])
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_FILLATTR_USERNS], [
AC_MSG_CHECKING([whether generic_fillattr requres struct user_namespace*])
AC_MSG_CHECKING([whether generic_fillattr requires struct user_namespace*])
ZFS_LINUX_TEST_RESULT([generic_fillattr_userns], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_GENERIC_FILLATTR_USERNS, 1,

View File

@ -4,6 +4,10 @@ dnl # The is_owner_or_cap() macro was renamed to inode_owner_or_capable(),
dnl # This is used for permission checks in the xattr and file attribute call
dnl # paths.
dnl #
dnl # 5.12 API change,
dnl # inode_owner_or_capable() now takes struct user_namespace *
dnl # to support idmapped mounts
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OWNER_OR_CAPABLE], [
ZFS_LINUX_TEST_SRC([inode_owner_or_capable], [
#include <linux/fs.h>

View File

@ -42,6 +42,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct block_device_operations o;
o.submit_bio = NULL;
])
ZFS_LINUX_TEST_SRC([blk_alloc_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@ -56,6 +63,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
AC_DEFINE(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS, 1,
[submit_bio is member of struct block_device_operations])
dnl #
dnl # Linux 5.14 API Change:
dnl # blk_alloc_queue() + alloc_disk() combo replaced by
dnl # a single call to blk_alloc_disk().
dnl #
AC_MSG_CHECKING([whether blk_alloc_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
], [
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)

View File

@ -25,6 +25,31 @@ AC_DEFUN([ZFS_AC_KERNEL_PERCPU_COUNTER_INIT], [
])
])
dnl #
dnl # 4.13 API change,
dnl # __percpu_counter_add() was renamed to percpu_counter_add_batch().
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PERCPU_COUNTER_ADD_BATCH], [
ZFS_LINUX_TEST_SRC([percpu_counter_add_batch], [
#include <linux/percpu_counter.h>
],[
struct percpu_counter counter;
percpu_counter_add_batch(&counter, 1, 1);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PERCPU_COUNTER_ADD_BATCH], [
AC_MSG_CHECKING([whether percpu_counter_add_batch() is defined])
ZFS_LINUX_TEST_RESULT([percpu_counter_add_batch], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_PERCPU_COUNTER_ADD_BATCH, 1,
[percpu_counter_add_batch() is defined])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.10 API change,
dnl # The "count" was moved into ref->data, from ref
@ -51,10 +76,12 @@ AC_DEFUN([ZFS_AC_KERNEL_PERCPU_REF_COUNT_IN_DATA], [
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_PERCPU], [
ZFS_AC_KERNEL_SRC_PERCPU_COUNTER_INIT
ZFS_AC_KERNEL_SRC_PERCPU_COUNTER_ADD_BATCH
ZFS_AC_KERNEL_SRC_PERCPU_REF_COUNT_IN_DATA
])
AC_DEFUN([ZFS_AC_KERNEL_PERCPU], [
ZFS_AC_KERNEL_PERCPU_COUNTER_INIT
ZFS_AC_KERNEL_PERCPU_COUNTER_ADD_BATCH
ZFS_AC_KERNEL_PERCPU_REF_COUNT_IN_DATA
])

View File

@ -44,6 +44,7 @@ AC_DEFUN([ZFS_AC_KERNEL_RENAME], [
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether iop->rename() wants flags])
ZFS_LINUX_TEST_RESULT([inode_operations_rename_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_RENAME_WANTS_FLAGS, 1,

21
config/kernel-siginfo.m4 Normal file
View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.20 API change
dnl # Added kernel_siginfo_t
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SIGINFO], [
ZFS_LINUX_TEST_SRC([siginfo], [
#include <linux/signal_types.h>
],[
kernel_siginfo_t info __attribute__ ((unused));
])
])
AC_DEFUN([ZFS_AC_KERNEL_SIGINFO], [
AC_MSG_CHECKING([whether kernel_siginfo_t tyepedef exists])
ZFS_LINUX_TEST_RESULT([siginfo], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SIGINFO, 1, [kernel_siginfo_t exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.4 API change
dnl # Added kernel_signal_stop
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SIGNAL_STOP], [
ZFS_LINUX_TEST_SRC([signal_stop], [
#include <linux/sched/signal.h>
],[
kernel_signal_stop();
])
])
AC_DEFUN([ZFS_AC_KERNEL_SIGNAL_STOP], [
AC_MSG_CHECKING([whether signal_stop() exists])
ZFS_LINUX_TEST_RESULT([signal_stop], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SIGNAL_STOP, 1, [signal_stop() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.17 API change
dnl # Added set_special_state() function
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE], [
ZFS_LINUX_TEST_SRC([set_special_state], [
#include <linux/sched.h>
],[
set_special_state(TASK_STOPPED);
])
])
AC_DEFUN([ZFS_AC_KERNEL_SET_SPECIAL_STATE], [
AC_MSG_CHECKING([whether set_special_state() exists])
ZFS_LINUX_TEST_RESULT([set_special_state], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_SPECIAL_STATE, 1, [set_special_state() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -3,23 +3,43 @@ dnl # 3.11 API change
dnl # Add support for i_op->tmpfile
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
dnl #
dnl # 5.11 API change
dnl # add support for userns parameter to tmpfile
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_userns], [
#include <linux/fs.h>
int tmpfile(struct inode *inode, struct dentry *dentry,
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct dentry *dentry,
umode_t mode) { return 0; }
static struct inode_operations
iops __attribute__ ((unused)) = {
.tmpfile = tmpfile,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
#include <linux/fs.h>
int tmpfile(struct inode *inode, struct dentry *dentry,
umode_t mode) { return 0; }
static struct inode_operations
iops __attribute__ ((unused)) = {
.tmpfile = tmpfile,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_TMPFILE], [
AC_MSG_CHECKING([whether i_op->tmpfile() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
],[
AC_MSG_RESULT(no)
])
])
])

View File

@ -0,0 +1,34 @@
dnl #
dnl # Linux 5.14 adds a change to require set_page_dirty to be manually
dnl # wired up in struct address_space_operations. Determine if this needs
dnl # to be done. This patch set also introduced __set_page_dirty_nobuffers
dnl # declaration in linux/pagemap.h, so these tests look for the presence
dnl # of that function to tell the compiler to assign set_page_dirty in
dnl # module/os/linux/zfs/zpl_file.c
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
ZFS_LINUX_TEST_SRC([vfs_has_set_page_dirty_nobuffers], [
#include <linux/pagemap.h>
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.set_page_dirty = __set_page_dirty_nobuffers,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
dnl #
dnl # Linux 5.14 change requires set_page_dirty() to be assigned
dnl # in address_space_operations()
dnl #
AC_MSG_CHECKING([__set_page_dirty_nobuffers exists])
ZFS_LINUX_TEST_RESULT([vfs_has_set_page_dirty_nobuffers], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_SET_PAGE_DIRTY_NOBUFFERS, 1,
[__set_page_dirty_nobuffers exists])
],[
AC_MSG_RESULT([no])
])
])

View File

@ -129,6 +129,10 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_MKNOD
ZFS_AC_KERNEL_SRC_SYMLINK
ZFS_AC_KERNEL_SRC_BIO_MAX_SEGS
ZFS_AC_KERNEL_SRC_SIGNAL_STOP
ZFS_AC_KERNEL_SRC_SIGINFO
ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@ -231,6 +235,10 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_MKNOD
ZFS_AC_KERNEL_SYMLINK
ZFS_AC_KERNEL_BIO_MAX_SEGS
ZFS_AC_KERNEL_SIGNAL_STOP
ZFS_AC_KERNEL_SIGINFO
ZFS_AC_KERNEL_SET_SPECIAL_STATE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
])
dnl #

28
config/user-libatomic.m4 Normal file
View File

@ -0,0 +1,28 @@
dnl #
dnl # If -latomic exists and atomic.c doesn't link without it,
dnl # it's needed for __atomic intrinsics.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBATOMIC], [
AC_MSG_CHECKING([whether -latomic is required])
saved_libs="$LIBS"
LIBS="$LIBS -latomic"
LIBATOMIC_LIBS=""
AC_LINK_IFELSE([AC_LANG_PROGRAM([], [])], [
LIBS="$saved_libs"
saved_cflags="$CFLAGS"
CFLAGS="$CFLAGS -isystem lib/libspl/include"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include "lib/libspl/atomic.c"], [])], [], [LIBATOMIC_LIBS="-latomic"])
CFLAGS="$saved_cflags"
])
if test -n "$LIBATOMIC_LIBS"; then
AC_MSG_RESULT([yes])
else
AC_MSG_RESULT([no])
fi
LIBS="$saved_libs"
AC_SUBST([LIBATOMIC_LIBS])
])

View File

@ -21,6 +21,7 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBUDEV
ZFS_AC_CONFIG_USER_LIBCRYPTO
ZFS_AC_CONFIG_USER_LIBAIO
ZFS_AC_CONFIG_USER_LIBATOMIC
ZFS_AC_CONFIG_USER_CLOCK_GETTIME
ZFS_AC_CONFIG_USER_PAM
ZFS_AC_CONFIG_USER_RUNSTATEDIR

Some files were not shown because too many files have changed in this diff Show More