Compare commits

...

212 Commits

Author SHA1 Message Date
Tony Hutter ad81baab77 Tag zfs-2.0.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-12-22 09:47:58 -08:00
Brian Behlendorf a4831fa023 Fix zvol_open() lock inversion
When restructuring the zvol_open() logic for the Linux 5.13 kernel
a lock inversion was accidentally introduced.  In the updated code
the spa_namespace_lock is now taken before the zv_suspend_lock
allowing the following scenario to occur:

    down_read <=== waiting for zv_suspend_lock
    zvol_open <=== holds spa_namespace_lock
    __blkdev_get
    blkdev_get_by_dev
    blkdev_open
    ...

     mutex_lock <== waiting for spa_namespace_lock
     spa_open_common
     spa_open
     dsl_pool_hold
     dmu_objset_hold_flags
     dmu_objset_hold
     dsl_prop_get
     dsl_prop_get_integer
     zvol_create_minor
     dmu_recv_end
     zfs_ioc_recv_impl <=== holds zv_suspend_lock via zvol_suspend()
     zfs_ioc_recv
     ...

This commit resolves the issue by moving the acquisition of the
spa_namespace_lock back to after the zv_suspend_lock which restores
the original ordering.

Additionally, as part of this change the error exit paths were
simplified where possible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12863
2021-12-22 09:31:58 -08:00
Ryan Moeller b327131a1f FreeBSD: Add vop_standard_writecount_nomsync
https://cgit.freebsd.org/src/commit?id=3ffcfa599e29686cf2b3c1a6087408c37acaed78

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828
2021-12-13 12:26:20 -08:00
Ryan Moeller 2ed7c54654 FreeBSD: Catch up with more VFS changes
Unused thread argument was removed from NDINIT*

https://cgit.freebsd.org/src/commit?id=7e1d3eefd410ca0fbae5a217422821244c3eeee4

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828
2021-12-13 12:25:53 -08:00
Pawel Jakub Dawidek 70fdb198ad Remove (now unused) td argument from zfs_lookup()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #12748
2021-12-13 12:25:36 -08:00
Ryan Moeller 7100db9793 FreeBSD: Implement xattr=sa
FreeBSD historically has not cared about the xattr property; it was
always treated as xattr=on.  With xattr=on, xattrs are stored as files
in a hidden xattr directory.  With xattr=sa, xattrs are stored as
system attributes and get cached in nvlists during xattr operations.
This makes SA xattrs simpler and more efficient to manipulate.  FreeBSD
needs to implement the SA xattr operations for feature parity with
Linux and to ensure that SA xattrs are accessible when migrated or
replicated from Linux.

Following the example set by Linux, refactor our existing extattr vnops
to split off the parts handling dir style xattrs, and add the
corresponding SA handling parts.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11997
2021-12-13 12:22:11 -08:00
Mark Johnston 57f43735ca Fix several bugs in the FreeBSD rename VOP implementation
- To avoid a use-after-free, zfsvfs->z_log needs to be loaded after the
  teardown lock is acquired with ZFS_ENTER().
- Avoid leaking vnode locks in zfs_rename_relock() and zfs_rename_()
  when the ZFS_ENTER() macros forces an early return.

Refactor the rename implementation so that ZFS_ENTER() can be used
safely.  As a bonus, this lets us use the ZFS_VERIFY_ZP() macro instead
of open-coding its implementation.

Reported-by: Peter Holm <pho@FreeBSD.org>
Tested-by: Peter Holm <pho@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12717
2021-12-13 11:56:03 -08:00
Mark Johnston 47f098d2db Exit the teardown section later in rename on FreeBSD
We have to hold the teardown lock while dereferencing zfsvfs->z_os and,
I believe, when committing to the ZIL.

Note that jumping to the "out" label, "error" is always non-zero.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704
2021-12-13 10:43:22 -08:00
Mark Johnston d94d1a589c Fix potential use-after-frees in FreeBSD getpages and setattr VOPs
The objset object is reallocated during certain dataset operations, such
as rollbacks, so the objset pointer must be loaded after acquiring the
teardown lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704
2021-12-13 10:43:15 -08:00
Brian Behlendorf 4bbffa2489 ZTS: import_rewind_device_replaced reliably fails
The import_rewind_device_replaced.ksh test was never entirely reliable
because it depends on MOS data not being overwritten.  The MOS data is
not protected by the snapshot so occasional failures were always
expected.  However, this test is now failing reliably on all platforms
indicating something has changed in the code since the test was marked
"maybe".  Convert the test to a "known" failure until the root cause
is identified and resolved.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12821
2021-12-08 15:01:44 -08:00
Brian Behlendorf d1bf6c7251 Linux 5.15 compat: META (#12824)
The final 5.15 kernel is available and has been tested.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2021-12-07 17:03:37 -08:00
Paul Dagnelie 1fbfc022c9 ZFS send/recv with ashift 9->12 leads to data corruption
Improve the ability of zfs send to determine if a block is compressed
or not by using information contained in the blkptr.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Matthew Ahrens <matthew.ahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12770
2021-12-07 17:03:17 -08:00
Coleman Kane 2a21853b75 Linux 5.16: Resolve ZSTD_isError symbol collision in Linux kernel
Newer zstd code introduced in the main kernel tree now creates a symbol
collision with ZSTD_isError in our ZSTD code. This change relabels our
implementation with a ZFS-specific symbol name, and undoes some
macro-based micro-optimizations that conflict with the attempt to rename
our internal-use version.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:15:12 -08:00
Coleman Kane 6a3ba7538a Linux 5.16: The blk-cgroup.h header is where struct blkcg_gq is defined
The definition of struct blkcg_gq was moved into blk-cgroup.h, which is
a header that's been in Linux since 2015. This is used by
vdev_blkg_tryget() in module/os/linux/zfs/vdev_disk.c. Since the kernel
for CentOS 7 and similar-generation releases doesn't have this header,
its inclusion is guarded by a configure test.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:15:12 -08:00
Coleman Kane 94e49866e9 Linux 5.16: bio_set_dev is no longer a helper macro
This change adds a confiugre check to determine if bio_set_dev is a
helper macro or not. If not, then the attempt to override its internal
call to bio_associate_blkg(), with a macro definition to our own
version, is no longer possible, as the compiler won't use it when
compiling the new inline function replacement implemented in the header.
This change also creates a new vdev_bio_set_dev() function that performs
the same work, and also performs the work implemented in
vdev_bio_associate_blkg(), as it is the only thing calling that function
in our code. Our custom vdev_bio_associate_blkg() is now only compiled
if the bio_set_dev() is a macro in the Linux headers.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:15:12 -08:00
Coleman Kane eb9d35e641 Linux 5.16: type member of iov_iter renamed iter_type
The iov_iter->type member was renamed iov_iter->iter_type. However,
while looking into this, realized that in 2018 a iov_iter_type(*iov)
accessor function was introduced. So if that is present, use it,
otherwise fall back to trying the existing behavior of directly
accessing type from iov_iter.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:15:11 -08:00
Coleman Kane 649b3ce987 Linux 5.16: block_device_operations->submit_bio now returns void
The return type for the submit_bio member of struct
block_device_operations was changed to no longer return a value.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:15:11 -08:00
Coleman Kane 541fe3d598 Linux 5.16 compat: asm/fpu/xcr.h is new location for xgetbv/xsetbv
Linux 5.16 moved these functions into this new header in commit
1b4fb8545f2b00f2844c4b7619d64d98440a477c. This change adds code to look
for the presence of this header, and include it so that the code using
xgetbv & xsetbv will compile again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800
2021-12-07 13:15:11 -08:00
Alexander Motin c84f9be5c7 FreeBSD: avoid memory allocation in arc_prune_async
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Closes #12049
2021-12-07 11:18:28 -08:00
наб 5ea770b0a2 tests/file_check: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2021-12-06 13:52:41 -08:00
John Wren Kennedy 3404e887b1 Strip colons from all test result filenames
The upload artifact functionality in github can't handle colons in
filenames. The current code handles this for files under the most
recent set of results. With the ability to rerun failed tests, now
there can be multiple sets of results, and they all need to be
processed in the same way.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12815
2021-12-06 13:41:04 -08:00
Brian Behlendorf 2915930f89 Linux 5.13 compat: retry zvol_open() when contended
Due to a possible lock inversion the zvol open call path on Linux
needs to be able to retry in the case where the spa_namespace_lock
cannot be acquired.

For Linux 5.12 an older kernel this was accomplished by returning
-ERESTARTSYS from zvol_open() to request that blkdev_get() drop
the bdev->bd_mutex lock, reaquire it, then call the open callback
again.  However, as of the 5.13 kernel this behavior was removed.

Therefore, for 5.12 and older kernels we preserved the existing
retry logic, but for 5.13 and newer kernels we retry internally in
zvol_open().  This should always succeed except in the case where
a pool's vdev are layed on zvols, in which case it may fail.  To
handle this case vdev_disk_open() has been updated to retry when
opening a device when -ERESTARTSYS is returned.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12759
2021-12-06 13:40:59 -08:00
John Wren Kennedy 3c97d7a84b Temporarily remove tests from sanity runfile
With the addition of functionality to rerun failing tests, some
tests that fail only sometimes still fail often enough to degrade
the reliability of the sanity runs. Remove them from the runfile
until they reliably pass.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12814
2021-12-06 13:40:54 -08:00
Paul Dagnelie 5a80c82609 Add zfs-test facility to automatically rerun failing tests
This was a project proposed as part of the Quality theme for the
hackthon for the 2021 OpenZFS Developer Summit. The idea is to improve
the usability of the automated tests that get run when a PR is created
by having failing tests automatically rerun in order to make flaky
tests less impactful.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12740
2021-12-06 13:40:48 -08:00
Coleman Kane 679be593dd Linux 5.16: wait_on_page_bit() no longer available to modules
Instead, linux/pagemap.h offers a number of folio-specific functions to
be called instead. In this case, module/os/linux/zfs/zfs_vnops_os.c
wants to call wait_on_page_bit(pp, PG_writeback). This gets replaced
with folio_wait_bit(folio_page(pp), PG_writeback). This change modifies
the code to conditionally compile that if configure identifies th
presence of the folio_wait_bit() function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800
2021-12-06 13:40:43 -08:00
Jorgen Lundman f43b315e17 Iterate encrypted clones at zvol_create_minor
Userland figures out which encryption-root keys are required to load,
and issues ZFS_IOC_LOAD_KEY.
The tail section of spa_keystore_load_wkey() will call
zvol_create_minors() on the encryption-root object.

Any clones of the encrypted zvol will not be plumbed. This commits
adds additional logic to detect if zvol has clones, and is encrypted,
then adds these to the list of zvols to call zvol_create_minors() on.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12471
2021-12-06 13:40:37 -08:00
Tony Hutter 922fe416b2 Update ABIs for zfs-2.0.7
In the course of cherry-picking commits, the ABI files from master
and the code in zfs-2.0.7 can get out of sync, and the ABIs need
to be regenerated.  This patch just runs 'make storeabi' on the
previous commit and checks in the new ABI files.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-11-12 16:38:16 -08:00
наб e2dcc523a4 libefi: remove efi_auto_sense()
It's present (but undocumented) in the illumos gate and used exclusively
by rmformat(1) (which I recommend as a nice blast from the past),
and also the math assumes 512B sectors and is therefore wrong

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12191
2021-11-12 16:31:55 -08:00
наб dec1ea4dbc libefi: efi_get_devname: don't allocate procfs path
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12048
2021-11-12 16:31:55 -08:00
Brian Behlendorf 1a4030b1b5 cppcheck: resolve double free
The double free reported for the realloc() failure branch is a
false positive.  It should be resolved in cppcheck 2.4 but for
the benefit of older versions we supress the warning.

    https://trac.cppcheck.net/ticket/9292

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11508
2021-11-12 16:31:55 -08:00
Brian Behlendorf 33fa3985a9 Restore dirty dnode detection logic
In addition to flushing memory mapped regions when checking holes,
commit de198f2d95 modified the dirty dnode detection logic to check
the dn->dn_dirty_records instead of the dn->dn_dirty_link.  Relying
on the dirty record has not be reliable, switch back to the previous
method.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900 
Closes #12745
2021-11-12 16:31:55 -08:00
Brian Behlendorf a524f8d6af Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
When using lseek(2) to report data/holes memory mapped regions of
the file were ignored.  This could result in incorrect results.
To handle this zfs_holey_common() was updated to asynchronously
writeback any dirty mmap(2) regions prior to reporting holes.

Additionally, while not strictly required, the dn_struct_rwlock is
now held over the dirty check to prevent the dnode structure from
changing.  This ensures that a clean dnode can't be dirtied before
the data/hole is located.  The range lock is now also taken to
ensure the call cannot race with zfs_write().

Furthermore, the code was refactored to provide a dnode_is_dirty()
helper function which checks the dnode for any dirty records to
determine its dirtiness.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12724
2021-11-12 16:31:55 -08:00
Tony Hutter 4ba1a6227a zed: Control NVMe fault LEDs
The ZED code currently can only turn on the fault LED for
a faulted disk in a JBOD enclosure.  This extends support
for faulted NVMe disks as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12648
Closes #12695
2021-11-12 16:31:55 -08:00
Brian Behlendorf 6de5c440fa Linux 5.16 compat: submit_bio()
The submit_bio() prototype has changed again.  The version is 5.16
still only expects a single argument but the return type has changed
to void.  Since we never used the returned value before update the
configure check to detect both single arg versions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-12 16:31:55 -08:00
Brian Behlendorf b8b3b93ebb Linux 5.16 compat: linux/elevator.h
Commit https://github.com/torvalds/linux/commit/2e9bc346 moved
the elevator.h header under the block/ directory as part of some
refactoring.  This turns out not to be a problem since there's
no longer anything we need from the header.  This has been the
case for some time, this change removes the elevator.h include
and replaces it with a major.h include.

Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-12 16:31:55 -08:00
наб 586358102e zed.d/pool_import-led.sh: fix for current zpool scripts
Also minor clean-up with folding state_to_val() into a case,
unrolling the lesser-available seq into numbers,
ignoring vdev states we don't care about,
and documentation comments

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11934
Closes #11935
2021-11-12 16:31:55 -08:00
Rich Ercolani 110d0ba5ca Revert behavior of 59eab109 on not-Linux
It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698
2021-11-12 16:31:55 -08:00
Rich Ercolani 2ef1ce66f5 Handle partial reads in zfs_read
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.

This apparently works on newer kernels because the caller which breaks
on older Linux kernels by happily passing along a 1M read request and a
64k iovec just requests 64k at a time.

With this, we now won't return EFAULT if we got a partial read.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12370 
Closes #12509
Closes #12516
2021-11-12 16:31:55 -08:00
Brian Atkinson d10c35b640 Cleaning up uio headers
Making uio_impl.h the common header interface between Linux and FreeBSD
so both OS's can share a common header file. This also helps reduce code
duplication for zfs_uio_t for each OS.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #11622
2021-11-12 16:31:55 -08:00
Brian Atkinson 0b1e6fcc3e Extending FreeBSD UIO Struct
In FreeBSD the struct uio was just a typedef to uio_t. In order to
extend this struct, outside of the definition for the struct uio, the
struct uio has been embedded inside of a uio_t struct.

Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear
this is a ZFS interface.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #11438
2021-11-12 16:31:55 -08:00
Ryan Moeller cd4f9572d0 FreeBSD: Move uio_prefaultpages def to uio.h
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176
2021-11-12 16:31:55 -08:00
Matthew Macy 6f59f6402d Remove UIO_ZEROCOPY functions structures
The original xuio zero copy functionality has always been unused 
on Linux and FreeBSD.  Remove this disabled code to avoid any
confusion and improve readability.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11124
2021-11-12 16:31:55 -08:00
Ryan Moeller 3fcf17e69d FreeBSD: Catch up with recent VFS changes
cn_thread is always curthread.

https://cgit.freebsd.org/src/commit?id=b4a58fbf640409a1e507d9f7b411c83a3f83a2f3
https://cgit.freebsd.org/src/commit?id=2b68eb8e1dbbdaf6a0df1c83b26f5403ca52d4c3

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12668
2021-11-12 16:31:55 -08:00
Tony Hutter 92fcbe04ba vdev_id: Fix PHY sorting
One of our developers noticed a bug in vdev_id where we were incorrectly
sorting PHYs using alphabetical sorting (which usually works) instead
of natural sorting (-v).  For example:

	[port-0:0]# ls -d phy*
	phy-0:10  phy-0:11  phy-0:8  phy-0:9

	[port-0:0]# ls -vd phy*
	phy-0:8  phy-0:9  phy-0:10  phy-0:11

This fixes the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12699
2021-11-12 16:31:55 -08:00
Tony Hutter dfbc33a0e5 vdev_id: Fix enclosure_symlinks feature
The vdev_id.conf "enclosure_symlinks" option persistently creates
and maps /dev/by-enclosure symlinks to dynamic /dev/sg* devices.

This patch fixes two issues:

1. The enclosure_symlinks feature was accidentally broken in:

   vdev_id: Support daisy-chained JBODs in multipath mode

2. Even when working, the feature numbered the enclosure
   sequentially rather than by HBA port number.  That meant that
   if a port was down or didn't appear in sysfs, then the
   enclosure_sumlinks numbers would be numbered wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12660
2021-11-12 16:31:55 -08:00
Tony Hutter 91fe213823 Rescan enclosure sysfs path on import
When you create a pool, zfs writes vd->vdev_enc_sysfs_path with the
enclosure sysfs path to the fault LEDs, like:

    vdev_enc_sysfs_path = /sys/class/enclosure/0:0:1:0/SLOT8

However, this enclosure path doesn't get updated on successive imports
even if enclosure path to the disk changes.  This patch fixes the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11950
Closes #12095
2021-11-12 16:31:55 -08:00
Anton Gubarkov 728cf93b0c vdev_id: Return an error if config file is not found (#12508)
Signed-off-by: Anton Gubarkov <anton.gubarkov@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-11-12 16:31:55 -08:00
наб 43706c7dee vdev_id.conf.5: modernise
Also yeet pci_slot since it doesn't seem to exist?

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-11-12 16:31:55 -08:00
наб 5cc5996f4e vdev_id.8: modernise, note scsi topology
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12125
2021-11-12 16:31:55 -08:00
наб cc6ea631ce zfs_get_enclosure_sysfs_path(): don't free undefined pointer
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-11-12 16:31:55 -08:00
наб b847e538ef zfs_get_enclosure_sysfs_path(): don't leak dev path
Also always free tmp2 at the end

Before:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==8947== Memcheck, a memory error detector
==8947== Using Valgrind-3.14.0 and LibVEX
==8947== Command: ./blergh
==8947==
(null)
==8947==
==8947== HEAP SUMMARY:
==8947==     in use at exit: 23 bytes in 1 blocks
==8947==   total heap usage: 3 allocs, 2 frees, 1,147 bytes allocated
==8947==
==8947== 23 bytes in 1 blocks are definitely lost in loss record 1 of 1
==8947==    at 0x483577F: malloc (vg_replace_malloc.c:299)
==8947==    by 0x48D74B7: vasprintf (vasprintf.c:73)
==8947==    by 0x48B7833: asprintf (asprintf.c:35)
==8947==    by 0x401258: zfs_get_enclosure_sysfs_path
                         (zutil_device_path_os.c:191)
==8947==    by 0x401482: main (blergh.c:107)
==8947==
==8947== LEAK SUMMARY:
==8947==    definitely lost: 23 bytes in 1 blocks
==8947==    indirectly lost: 0 bytes in 0 blocks
==8947==      possibly lost: 0 bytes in 0 blocks
==8947==    still reachable: 0 bytes in 0 blocks
==8947==         suppressed: 0 bytes in 0 blocks
==8947==
==8947== For counts of detected and suppressed errors, rerun with: -v
==8947== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

nabijaczleweli@tarta:~/uwu$ sed -n 191p zutil_device_path_os.c
        tmpsize = asprintf(&tmp1, "/sys/block/%s/device", dev_name);

After:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==9512== Memcheck, a memory error detector
==9512== Using Valgrind-3.14.0 and LibVEX
==9512== Command: ./blergh
==9512==
(null)
==9512==
==9512== HEAP SUMMARY:
==9512==     in use at exit: 0 bytes in 0 blocks
==9512==   total heap usage: 3 allocs, 3 frees, 1,147 bytes allocated
==9512==
==9512== All heap blocks were freed -- no leaks are possible
==9512==
==9512== For counts of detected and suppressed errors, rerun with: -v
==9512== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11993
2021-11-12 16:31:55 -08:00
Arshad Hussain 238504fece vdev_id: variable not getting expanded under map_slot()
Under function map_slot() variable passed as args
were not getting properly substituted or expanded.
This patch fixes the substitution issue.

Reviewed-by: Niklas Edmundsson <nikke@acc.umu.se>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #11951
Closes #11959
2021-11-12 16:31:55 -08:00
Tony Hutter b0b796a94d vdev_id: Create symlinks even if no /dev/mapper/
vdev_id uses the /dev/mapper/ symlinks to resolve a UUID to a dm name
(like dm-1).  However on some multipath setups, there is no /dev/mapper/
entry for the UUID at the time vdev_id is called by udev.  However,
this isn't necessarily needed, as we may be able to resolve the dm
name from the $DEVNAME that udev passes us (like DEVNAME="/dev/dm-1").

This patch tries to resolve the dm name from $DEVNAME first, before
falling back to looking in /dev/mapper/.  This fixed an issue where the
by-vdev names weren't reliably showing up on one of our nodes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11698
2021-11-12 16:31:55 -08:00
Tony Hutter 9d14963de6 vdev_id: Fix partition regular expression
Given a DM device name, the old vdev_id script would extract any text
after a 'p' as the partition number.  It then appends "-part" + the
partition number to the name, giving a by-vdev name like "L0-part5".

This works fine if the DM name is like 'dm-2p5', but doesn't work if
the DM name is a multipath name like "mpatha".  In those cases it
incorrectly matches the 'p' in "mpatha", giving by-vdev names like
"L0-partatha".

This patch fixes the issue by making the partition regex match stricter.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11637
2021-11-12 16:31:55 -08:00
Tony Hutter 88a3751e67 Better zfs_get_enclosure_sysfs_path() enclosure support
A multpathed disk will have several 'underlying' paths to the disk.  For
example, multipath disk 'dm-0' may be made up of paths:
/dev/{sda,sdb,sdc,sdd}.  On many enclosures those underlying sysfs
paths will have a symlink back to their enclosure device entry
(like 'enclosure_device0/slot1').  This is used by the
statechange-led.sh script to set/clear the fault LED for a disk, and
by 'zpool status -c'.

However, on some enclosures, those underlying paths may not all have
symlinks back to the enclosure device.  Maybe only two out of four
of them might.

This patch updates zfs_get_enclosure_sysfs_path() to favor returning
paths that have symlinks back to their enclosure devices, rather
than just returning the first path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11617
2021-11-12 16:31:55 -08:00
Arshad Hussain bdd43a2396 vdev_id: Support daisy-chained JBODs in multipath mode
Within function sas_handler() userspace commands like
'/usr/sbin/multipath' have been replaced with sourcing
device details from within sysfs which reduced a
significant amount of overhead and processing time.
Multiple JBOD enclosures and their order are sourced
from the bsg driver (/sys/class/enclosure) to isolate
chassis top-level expanders, which are then dynamically
indexed based on host channel of the multipath subordinate
disk member device being processed. Additionally added a
"mixed" mode for slot identification for environments where
a ZFS server system may contain SAS disk slots where there
is no expander (direct connect to HBA) while an attached
external JBOD with an expander have different slot identifier
methods.

How Has This Been Tested?
~~~~~~~~~~~~~~~~~~~~~~~~~

Testing was performed on a AMD EPYC based dual-server
high-availability multipath environment with multiple
HBAs per ZFS server and four SAS JBODs. The two primary
JBODs were multipath/cross-connected between the two
ZFS-HA servers. The secondary JBODs were daisy-chained
off of the primary JBODs using aligned SAS expander
channels (JBOD-0 expanderA--->JBOD-1 expanderA,
          JBOD-0 expanderB--->JBOD-1 expanderB, etc).
Pools were created, exported and re-imported, imported
globally with 'zpool import -a -d /dev/disk/by-vdev'.
Low level udev debug outputs were traced to isolate
and resolve errors.

Result:
~~~~~~~

Initial testing of a previous version of this change
showed how reliance on userspace utilities like
'/usr/sbin/multipath' and '/usr/bin/lsscsi' were
exacerbated by increasing numbers of disks and JBODs.
With four 60-disk SAS JBODs and 240 disks the time to
process a udevadm trigger was 3 minutes 30 seconds
during which nearly all CPU cores were above 80%
utilization. By switching reliance on userspace
utilities to sysfs in this version, the udevadm
trigger processing time was reduced to 12.2 seconds
and negligible CPU load.

This patch also fixes few shellcheck complains.

Reviewed-by: Gabriel A. Devenyi <gdevenyi@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Jeff Johnson <jeff.johnson@aeoncomputing.com>
Signed-off-by: Jeff Johnson <jeff.johnson@aeoncomputing.com>
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Closes #11526
2021-11-12 16:31:55 -08:00
Rich Ercolani 47ada9e5ed Added error for writing to /dev/ on Linux
Starting in Linux 5.10, trying to write to /dev/{null,zero} errors out.
Prefer to inform people when this happens rather than hoping they guess
what's wrong.

Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes:  #11991
2021-11-12 16:31:55 -08:00
Brian Behlendorf c4a5e56abc ZTS: Add known exceptions
The receive-o-x_props_override test case reliably fails on the
FreeBSD main builders (but not on Linux), until the root cause is
understood add this test to the FreeBSD exception list.

On Linux the alloc_class_012_pos test case may occasionally fail.
This is a known false positive which has also been added to the
Linux exception list until the test can be made entirely reliable.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12272
2021-11-12 16:31:55 -08:00
Brian Behlendorf ec1b033413 ZTS: Standardize use of destroy_dataset in cleanup
When cleaning up a test case standardize on using the convention:

    datasetexists $ds && destroy_dataset $ds <flags>

By using 'destroy_dataset' instead of 'log_must zfs destroy' we ensure
that the destroy is retried in the event that a ZFS volume is busy.
This helps ensures ensure tests are fully cleaned up and prevents false
positive test failures on Linux.

Note that all of the tests which used 'zfs destroy' in cleanup have
been updated even if they don't use volumes.  This was done to
clearly establish the expected convention.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12663
2021-11-12 16:31:55 -08:00
Damian Szuberski 8464de1315 Update `checkstyle` workflow env to ubuntu-20.04
- `checkstyle` workflow uses ubuntu-20.04 environment
- improved `mancheck.sh` readability

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12713
2021-11-12 16:31:55 -08:00
Rich Ercolani cc40a67cf8 Workaround cloud-init hotplug issue
cloud-init added a hook which triggers on every device add/rm
event, which results in holding open devices for a while after
they're created/destroyed.

So let's shove an exclusion rule for that into the GH workflows
until it gets fixed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12644
Closes #12669
2021-11-12 16:31:54 -08:00
George Melikov c2a69a21ef CI: don't install abigail-tools
We use docker image instead.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-11-12 16:31:54 -08:00
George Melikov 866ac70904 CI: use fresh libabigail via docker image
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-11-12 16:31:48 -08:00
Jonathon f3c85e3ebd Update libera webchat client URL
Libera have made a webchat client available. This change builds on #12127.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
Closes #12251
2021-11-12 15:40:55 -08:00
Paul Dagnelie 14770e1030 Don't direct to freenode in issue template
While Libera doesn't yet have a webchat client, we should at least
direct them to the right network. Once a webchat client is available,
we can direct them to it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12127
2021-11-12 15:40:35 -08:00
Attila Fülöp fb9eee4cc2 gcc 11 cleanup
Compiling with gcc 11.1.0 produces three new warnings.
Change the code slightly to avoid them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #12130
Closes #12188
Closes #12237
2021-11-12 15:24:36 -08:00
Brian Behlendorf 820c95750b Use fallthrough macro
As of the Linux 5.9 kernel a fallthrough macro has been added which
should be used to anotate all intentional fallthrough paths.  Once
all of the kernel code paths have been updated to use fallthrough
the -Wimplicit-fallthrough option will because the default.  To
avoid warnings in the OpenZFS code base when this happens apply
the fallthrough macro.

Additional reading: https://lwn.net/Articles/794944/

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12441
2021-11-12 15:24:36 -08:00
Rich Ercolani 66c9e15686 Correct a flaw in the Python 3 version checking
It turns out the ax_python_devel.m4 version check assumes that
("3.X+1.0" >= "3.X.0") is True in Python, which is not when X+1
is 10 or above and X is not. (Also presumably X+1=100 and ...)

So let's remake the check to behave consistently, using the
"packaging" or (if absent) the "distlib" modules.

(Also, update the Github workflows to use the new packages.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12073
2021-11-12 15:24:36 -08:00
Rich Ercolani c64c17328b Let zfs diff be more permissive
In the current world, `zfs diff` will die on certain kinds of errors
that come up on ordinary, not-mangled filesystems - like EINVAL,
which can come from a file with multiple hardlinks having the one
whose name is referenced deleted.

Since it should always be safe to continue, let's relax about all
error codes - still print something for most, but don't immediately
abort when we encounter them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12072
2021-11-12 15:24:36 -08:00
Rich Ercolani 439b4b134d Added test for being able to read various variants of zstd
As detailed in #12022 and #12008, it turns out the current zstd
implementation is quite nonportable, and results in various
configurations of ondisk header that only each platform can read.

So I've added a test which contains a dataset with a file written by
Linux/x86_64 and one written by FBSD/ppc64.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12030
2021-11-12 15:24:36 -08:00
наб 5957574694 zed: only go up to current limit in close_from() fallback
Consider the following strace log:
  prlimit64(0, RLIMIT_NOFILE,
            NULL, {rlim_cur=1024, rlim_max=1024*1024}) = 0
  dup2(0, 30)                         = 30
  dup2(0, 300)                        = 300
  dup2(0, 3000)                       = -1 EBADF (Bad file descriptor)
  dup2(0, 30000)                      = -1 EBADF (Bad file descriptor)
  dup2(0, 300000)                     = -1 EBADF (Bad file descriptor)
  prlimit64(0, RLIMIT_NOFILE,
            {rlim_cur=1024*1024, rlim_max=1024*1024}, NULL) = 0
  dup2(0, 30)                         = 30
  dup2(0, 300)                        = 300
  dup2(0, 3000)                       = 3000
  dup2(0, 30000)                      = 30000
  dup2(0, 300000)                     = 300000

Even a privileged process needs to bump its rlimit before being able
to use fds higher than rlim_cur.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-11-12 15:24:36 -08:00
наб 3e04897edc zed: implement close_from() in terms of /proc/self/fd, if available
/dev/fd on Darwin

Consider the following strace output:
  prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=1024*1024}) = 0

Yes, that is well over a million file descriptors!

This reduces the ZED start-up time from "at least a second" to
"instantaneous", and, under strace, from "don't even try" to "usable"
by simple virtue of doing five syscalls instead of over a million;
in most cases the main loop does nothing

Recent Linuxes (5.8+) have close_range(2) for this, but that's an
overoptimisation (and libcs don't have wrappers for it yet)

This is also run by the ZEDLET pre-exec. Compare:
  Finished "all-syslog.sh" eid=13 pid=6717 time=1.027100s exit=0
  Finished "history_event-zfs-list-cacher.sh" eid=13 pid=6718 time=1.046923s exit=0
to
  Finished "all-syslog.sh" eid=12 pid=4834 time=0.001836s exit=0
  Finished "history_event-zfs-list-cacher.sh" eid=12 pid=4835 time=0.001346s exit=0
lol

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11834
2021-11-12 15:24:36 -08:00
Rich Ercolani fb823061b0 Fix cross-endian interoperability of zstd
It turns out that layouts of union bitfields are a pain, and the
current code results in an inconsistent layout between BE and LE
systems, leading to zstd-active datasets on one erroring out on
the other.

Switch everyone over to the LE layout, and add compatibility code
to read both.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12008
Closes #12022
2021-11-12 15:24:36 -08:00
George Melikov 4e8a639d5f CI: generate ABI files if changed
So commit author can just download them as
artifacts and commit.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12379
2021-11-12 15:22:08 -08:00
Brian Behlendorf 8009f02748 Update bug report template
- Remove the "SPL Version" line, the repositories have been merged
  since the 0.8 release and we no longer need to ask about this.

- Simply ask for the kernel version / patch level and add a hint
  about how to get this information on Linux and FreeBSD.

- Remove "Status: Triage Needed" from the template, in practice
  we really haven't been using this label so let's step setting it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #12340
2021-11-12 15:22:00 -08:00
Jonathon 4a5316eef4 Update libera webchat client URL
Libera have made a webchat client available. This change builds on #12127.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
Closes #12251
2021-11-12 15:21:48 -08:00
Paul Dagnelie e00f3da136 Don't direct to freenode in issue template
While Libera doesn't yet have a webchat client, we should at least
direct them to the right network. Once a webchat client is available,
we can direct them to it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12127
2021-11-12 15:21:31 -08:00
Tony Hutter ef686e96ec Tag zfs-2.0.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-09-22 15:19:08 -07:00
Brian Behlendorf 7a41ef240a Linux 5.15 compat: get_acl()
Kernel commits

332f606b32b6 ovl: enable RCU'd ->get_acl()
0cad6246621b vfs: add rcu argument to ->get_acl() callback

Added compatibility code to detect the new ->get_acl() interface
and correctly handle the case where the new rcu argument is set.

Reviewed-by: Coleman Kane <ckane@colemankane.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12548
2021-09-22 15:19:08 -07:00
Alexander 72d16a9b49 Linux 5.15 compat: standalone <linux/stdarg.h>
Kernel commits

39f75da7bcc8 ("isystem: trim/fixup stdarg.h and other headers")
c0891ac15f04 ("isystem: ship and use stdarg.h")
564f963eabd1 ("isystem: delete global -isystem compile option")

(for now can be found in linux-next.git tree, will land into the
 Linus' tree during the ongoing 5.15 cycle with one of akpm merges)

removed the -isystem flag and disallowed the inclusion of any
compiler header files. They also introduced a minimal
<linux/stdarg.h> as a replacement for <stdarg.h>.
include/os/linux/spl/sys/cmn_err.h in the ZFS source tree includes
<stdarg.h> unconditionally. Introduce a test for <linux/stdarg.h>
and include it instead of the compiler's one to prevent module
build breakage.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #12531
2021-09-22 15:19:08 -07:00
Brian Behlendorf 54c358c3f2 Linux 5.15 compat: block device readahead
The 5.15 kernel moved the backing_dev_info structure out of
the request queue structure which causes a build failure.

Rather than look in the new location for the BDI we instead
detect this upstream refactoring by the existance of either
the blk_queue_update_readahead() or disk_update_readahead()
functions.  In either case, there's no longer any reason to
manually set the ra_pages value since it will be overridden
with a reasonable default (2x the block size) when
blk_queue_io_opt() is called.

Therefore, we update the compatibility wrapper to do nothing
for 5.9 and newer kernels.  While it's tempting to do the
same for older kernels we want to keep the compatibility
code to preserve the existing behavior.  Removing it would
effectively increase the default readahead to 128k.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12532
2021-09-22 15:19:08 -07:00
Brian Behlendorf 560e9fc817 Linux 5.14 compat: META
Increase the Linux-Maximum version in the META file to 5.14.
All of the required compatibility patches have been merged
and the 5.14 kernel has been officially released.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12565
2021-09-22 15:19:08 -07:00
Brian Behlendorf d642efe83c Linux 5.13 compat: META
Increase the Linux-Maximum version in the META file to 5.13.
All of the required compatibility patches have been merged
and the 5.13 kernel has been officially released.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-09-22 15:19:08 -07:00
Alexander Motin 51e313f610 FreeBSD: Ignore make_dev_s() errors
Since errors returned by zvol_create_minor_impl() are ignored by the
common code, it is more convenient to ignore make_dev_s() errors there.
It allows, for example, to get device created for the zvol after later
rename instead of having it further stuck in half-created state.
zvol_rename_minor() already ignores those errors.

While there, switch from MAXPHYS to maxphys in FreeBSD 13+.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12375
2021-09-22 15:19:08 -07:00
Alexander Motin 327f12c291 FreeBSD: Switch from MAXPHYS to maxphys on FreeBSD 13+
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12378
2021-09-22 15:19:08 -07:00
Alexander Motin 5d4f7e7566 FreeBSD: Retry OCF ENOMEM errors.
ZFS does not expect transient errors from crypto.  For read they are
counted as checksum errors, while for write end up in panic.  To not
panic on random low memory conditions retry ENOMEM errors in the OCF
wrapper function.

While there remove unneeded timeout and priority from msleep().

External-issue: https://reviews.freebsd.org/D30339
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12077
2021-09-22 15:19:08 -07:00
Serapheim Dimitropoulos abd0b59e48 Livelist logic should handle dedup blkptrs
Update the logic to handle the dedup-case of consecutive
FREEs in the livelist code. The logic still ensures that
all the FREE entries are matched up with a respective
ALLOC by keeping a refcount for each FREE blkptr that we
encounter and ensuring that this refcount gets to zero
by the time we are done processing the livelist.

zdb -y no longer panics when encountering double frees

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11480
Closes #12177
2021-09-22 15:19:08 -07:00
Coleman Kane 6b9d0eda75 Linux 5.14 compat: explicity assign set_page_dirty
Kernel 5.14 introduced a change where set_page_dirty of
struct address_space_operations is no longer implicitly set to
__set_page_dirty_buffers(), which ended up resulting in a NULL
pointer deref in the kernel when it is attempted to be called.
This change sets .set_page_dirty in the structure to
__set_page_dirty_nobuffers(), which was introduced with the
related patch set. The breaking change was introduce in commit
0af573780b0b13fceb7fabd49dc1b073cee9a507 to torvalds/linux.git.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12427
2021-09-22 15:19:08 -07:00
Paul Dagnelie 744cdfd93b Add SIGSTOP and SIGTSTP handling to issig
This change adds SIGSTOP and SIGTSTP handling to the issig function;
this mirrors its behavior on Solaris. This way, long running kernel
tasks can be stopped with the appropriate signals. Note that doing
so with ctrl-z on the command line doesn't return control of the tty
to the shell, because tty handling is done separately from stopping
the process. That can be future work, if people feel that it is a
necessary addition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Issue #810
Issue #10843
Closes #11801
2021-09-22 15:19:08 -07:00
Brian Behlendorf 36d50b60d8 Linux 5.14 compat: blk_alloc_disk()
In Linux 5.14, blk_alloc_queue is no longer exported, and its usage
has been superseded by blk_alloc_disk, which returns a gendisk struct
from which we can still retrieve the struct request_queue* that is
needed in the one place where it is used. This also replaces the call
to alloc_disk(minors), and minors is now set via struct member
assignment.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12362
Closes #12409
2021-09-22 15:19:08 -07:00
Mark Johnston 0c4f86be74 Initialize dn_next_type[] in the dnode constructor
It seems nothing ensures that this array is zeroed when a dnode is
freshly allocated, so in principle it retains the values from the
previous allocation.  In practice it seems to be the case that the
fields should end up zeroed, but we can zero the field anyway for
consistency.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-22 15:19:08 -07:00
Mark Johnston 88be308b2f Zero pad bytes following TX_WRITE log data
When logging a TX_WRITE record in the case where file data has to be
copied from the DMU, we pad the log record size to a multiple of 8
bytes.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-22 15:19:08 -07:00
Mark Johnston d6dc79eabc Zero pad bytes when allocating a ZIL record
When allocating a record, we round up the allocation size to a multiple
of 8.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-22 15:19:08 -07:00
Mark Johnston 900a444107 Initialize all fields in zfs_log_xvattr()
When logging TX_SETATTR, we could otherwise fail to initialize part of
the corresponding ZIL record depending on which fields are present in
the xvattr.  Initialize the creation time and the AV scan timestamp to
zero so that uninitialized bytes are not written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-22 15:19:08 -07:00
George Wilson 1885e5ebab file reference counts can get corrupted
Callers of zfs_file_get and zfs_file_put can corrupt the reference
counts for the file structure resulting in a panic or a soft lockup.
When zfs send/recv runs, it will add a reference count to the
open file, and begin to send or recv the stream. If the file descriptor
is closed, then when dmu_recv_stream() or dmu_send() return we will
call zfs_file_put to remove the reference we placed on the file
structure. Unfortunately, because zfs_file_put() uses the file
descriptor to lookup the file structure, it may end up finding that
the file descriptor table no longer contains the file struct, thus
leaking the file structure. Or it might end up finding a file
descriptor for a different file and blindly updating its reference
counts. Other failure modes probably exists.

This change reworks the zfs_file_[get|put] interface to not rely
on the file descriptor but instead pass the zfs_file_t pointer around.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-issue: DLPX-76119
Closes #12299
2021-09-22 15:19:08 -07:00
Antonio Russo 1ad8fcc054 Revert Consolidate arc_buf allocation checks
This reverts commit 13fac09868.

Per the discussion in #11531, the reverted commit---which intended only
to be a cleanup commit---introduced a subtle, unintended change in
behavior.

Care was taken to partially revert and then reapply 10b3c7f5e4
which would otherwise have caused a conflict.  These changes were
squashed in to this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Suggested-by: @chrisrd
Suggested-by: robn@despairlabs.com
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11531
Closes #12227
2021-09-22 15:19:08 -07:00
Rich Ercolani 05c96b438a Fix unfortunate NULL in spa_update_dspace
After 1325434b, we can in certain circumstances end up calling
spa_update_dspace with vd->vdev_mg NULL, which ends poorly during
vdev removal.

So let's not do that further space adjustment when we can't.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12380 
Closes #12428
2021-09-22 15:19:08 -07:00
Rich Ercolani ccf6d0a59b Tinker with slop space accounting with dedup
* Tinker with slop space accounting with dedup

Do not include the deduplicated space usage in the slop space
reservation, it leads to surprising outcomes.

* Update spa_dedup_dspace sometimes

Sometimes, we get into spa_get_slop_space() with
spa_dedup_dspace=~0ULL, AKA "unset", while spa_dspace is correctly set.

So call the code to update it before we use it if we hit that case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12271
2021-09-22 15:19:08 -07:00
Prakash Surya 61ae6c99f7 Add upper bound for slop space calculation
This change modifies the behavior of how we determine how much slop
space to use in the pool, such that now it has an upper limit. The
default upper limit is 128G, but is configurable via a tunable.

(Backporting note: Snipped out the embedded_log portion of the changes.)

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #11023
2021-09-22 15:19:08 -07:00
Tony Hutter e9353bc2ef Tag zfs-2.0.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-06-23 13:28:36 -07:00
George Amanakis 87d93731e7 Avoid deadlock when removing L2ARC devices under I/O
In case we have I/O and try to remove an L2ARC device a deadlock might
occur. arc_read()->zio_read()->zfs_blkptr_verify() waits for SCL_VDEV
to be dropped while holding the hash_lock. However, spa_l2cache_load()
holds SCL_ALL and waits for the hash_lock in l2arc_evict().

Fix this by moving zfs_blkptr_verify() to the top top arc_read() before
the hash_lock is taken. Verify the block pointer and return a checksum
error if damaged rather than halting the system, by using
BLK_VERIFY_LOG instead of BLK_VERIFY_HALT.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12054
2021-06-23 13:22:15 -07:00
Paul Zuchowski bd197378e7 Do not hash unlinked inodes
In zfs_znode_alloc we always hash inodes.  If the
znode is unlinked, we do not need to hash it.  This
fixes the problem where zfs_suspend_fs is doing zrele
(iput) in an async fashion, and zfs_resume_fs unlinked
drain processing will try to hash an inode that could
still be hashed, resulting in a panic.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9741
Closes #11223
Closes #11648
Closes #12210
2021-06-23 13:22:15 -07:00
jharmening cd2bb9ca44 FreeBSD: incorporate changes to the VFS_QUOTACTL(9) KPI
VFS_QUOTACTL(9) has been updated to allow each filesystem to indicate
whether it has changed the busy state of the mount.  The filesystem
may still assume that its .vfs_quotactl entrypoint is always called
with the mount busied, but only needs to unbusy the mount (and clear
*mp_busy) if it does something that actually requires the mount to be
unbusied.  It no longer needs to blindly copy-paste the UFS protocol
for calling vfs_unbusy(9) for the Q_QUOTAOFF and Q_QUOTAON commands.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jason Harmening <jason.harmening@gmail.com>
Closes #12052
2021-06-23 13:22:15 -07:00
Mateusz Guzik 52d9bc7174 FreeBSD: use vnlru_free_vfsops if available
Fixes issues when zfs is used along with other filesystems.

External-issue: https://cgit.freebsd.org/src/commit/?id=e9272225e6bed840b00eef1c817b188c172338ee
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11881
2021-06-23 13:22:15 -07:00
Brian Behlendorf 6905f6b2c1 cppcheck: integrete cppcheck
In order for cppcheck to perform a proper analysis it needs to be
aware of how the sources are compiled (source files, include
paths/files, extra defines, etc).  All the needed information is
available from the Makefiles and can be leveraged with a generic
cppcheck Makefile target.  So let's add one.

Additional minor changes:

* Removing the cppcheck-suppressions.txt file.  With cppcheck 2.3
  and these changes it appears to no longer be needed.  Some inline
  suppressions were also removed since they appear not to be
  needed.  We can add them back if it turns out they're needed
  for older versions of cppcheck.

* Added the ax_count_cpus m4 macro to detect at configure time how
  many processors are available in order to run multiple cppcheck
  jobs.  This value is also now used as a replacement for nproc
  when executing the kernel interface checks.

* "PHONY =" line moved in to the Rules.am file which is included
  at the top of all Makefile.am's.  This is just convenient becase
  it allows us to use the += syntax to add phony targets.

* One upside of this integration worth mentioning is it now allows
  `make cppcheck` to be run in any directory to check that subtree.

* For the moment, cppcheck is not run against the FreeBSD specific
  kernel sources.  The cppcheck-FreeBSD target will need to be
  implemented and testing on FreeBSD to support this.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11508
2021-06-23 13:22:15 -07:00
Rich Ercolani cfc82609a2 Simple change to fix building in recent environments
Renamed _fini too for symmetry.

Suggested-by: @ensch
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12059
Closes: #11987
Closes: #12056
2021-06-23 13:22:15 -07:00
George Melikov e26776f14c ZTS: pool_state test check for pool existence in cleanup
If there is no scsi_debug module, then this test
must be skipped, in this case cleanup routine should
be prepared for absent pool.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11534
2021-06-23 13:22:15 -07:00
Chunwei Chen e68e938f54 Fix zfs_get_data access to files with wrong generation
If TX_WRITE is create on a file, and the file is later deleted and a new
directory is created on the same object id, it is possible that when
zil_commit happens, zfs_get_data will be called on the new directory.
This may result in panic as it tries to do range lock.

This patch fixes this issue by record the generation number during
zfs_log_write, so zfs_get_data can check if the object is valid.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #10593
Closes #11682
2021-06-23 13:22:15 -07:00
Christian Schwarz ab2110e3e2 zfs_vnops: make zfs_get_data OS-independent
Move zfs_get_data() in to platform-independent code. The only
platform-specific aspect of it is the way we release an inode
(Linux) / vnode_t (FreeBSD). I am not aware of a platform that
could be supported by ZFS that couldn't implement zfs_rele_async
itself. It's sibling zvol_get_data already is platform-independent.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #10979
2021-06-23 13:22:15 -07:00
Matthew Macy a909190683 Consolidate zfs_holey and zfs_access
The zfs_holey() and zfs_access() functions can be made common
to both FreeBSD and Linux.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11125
2021-06-23 13:22:15 -07:00
наб 073f3c7826 zed: reap child after killing on time-out
When a child process is killed waitpid() must be called on the
pid the reap the zombie process.

Update BUGS section to reflect reality by replacing "zedlets
aren't time limited with "zedlets can be interrupted".

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11769 
Closes #11798
2021-06-23 13:22:15 -07:00
Luis Henriques 5072f37419 Fix error code on __zpl_ioctl_setflags()
Other (all?) Linux filesystems seem to return -EPERM instead of -EACCESS
when trying to set FS_APPEND_FL or FS_IMMUTABLE_FL without the
CAP_LINUX_IMMUTABLE capability.  This was detected by generic/545 test
in the fstest suite.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Luis Henriques <henrix@camandro.org>
Closes #11791
2021-06-23 13:22:15 -07:00
Ryan Moeller 332cd2e313 Fix typo in zgenhostid.8
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11770
2021-06-23 13:22:15 -07:00
Adam D. Moss ac6f332154 Linux: always check or verify return of igrab()
zhold() wraps igrab() on Linux, and igrab() may fail when the inode 
is in the process of being deleted.  This means zhold() must only be
called when a reference exists and therefore it cannot be deleted. 
This is the case for all existing consumers so add a VERIFY and a
comment explaining this requirement.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #11704
2021-06-23 13:22:15 -07:00
Brian Behlendorf f14b6f2302 Linux: Set spl_kmem_cache_slab_limit when page size !4K
For small objects the kernel's slab implementation is very fast and
space efficient. However, as the allocation size increases to
require multiple pages performance suffers. The SPL kmem cache
allocator was designed to better handle these large allocation
sizes. Therefore, on Linux the kmem_cache_* compatibility wrappers
prefer to use the kernel's slab allocator for small objects and
the custom SPL kmem cache allocator for larger objects.

This logic was effectively disabled for all architectures using
a non-4K page size which caused all kmem caches to only use the
SPL implementation. Functionally this is fine, but the SPL code
which calculates the target number of objects per-slab does not
take in to account that __vmalloc() always returns page-aligned
memory. This can result in a massive amount of wasted space when
allocating tiny objects on a platform using large pages (64k).

To resolve this issue we set the spl_kmem_cache_slab_limit cutoff
to 16K for all architectures. 

This particular change does not attempt to update the logic used
to calculate the optimal number of pages per slab. This remains
an issue which should be addressed in a future change.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12152
Closes #11429
Closes #11574
Closes #12150
2021-06-23 13:22:15 -07:00
Chunwei Chen bfb2928490 Fix zfs_get_data access to files with wrong generation
If TX_WRITE is create on a file, and the file is later deleted and a new
directory is created on the same object id, it is possible that when
zil_commit happens, zfs_get_data will be called on the new directory.
This may result in panic as it tries to do range lock.

This patch fixes this issue by record the generation number during
zfs_log_write, so zfs_get_data can check if the object is valid.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #10593
Closes #11682
2021-06-23 13:22:15 -07:00
Paul Zuchowski ecb1b1a31d Fix dmu_recv_stream test for resumable
Use dsl_dataset_has_resume_receive_state()
not dsl_dataset_is_zapified() to check if
stream is resumable.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12034
2021-06-23 13:22:15 -07:00
Rich Ercolani 412b69dfab Remove iov_iter_advance() for iter_write
The additional iter advance is incorrect, as copy_from_iter() has
already done the right thing.  This will result in the following
warning being printed to the console as of the 5.12 kernel.

    Attempted to advance past end of bvec iter

This change should have been included with #11378 when a
similar change was made on the read side.

Suggested-by: @siebenmann
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Issue #11378
Closes #12041
Closes #12155
(cherry picked from commit 3f81aba766)
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
2021-06-23 13:22:15 -07:00
Jonathon Fernyhough d014da0032 linux 5.13 compat: bdevops->revalidate_disk() removed (#12122)
Linux kernel commit 0f00b82e5413571ed225ddbccad6882d7ea60bc7 removes the
revalidate_disk() handler from struct block_device_operations. This
caused a regression, and this commit eliminates the call to it and the
assignment in the block_device_operations static handler assignment
code, when configure identifies that the kernel doesn't support that
API handler.

Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11967
Closes #11977
(cherry picked from commit 48c7b0e444)
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>

Co-authored-by: Coleman Kane <ckane@colemankane.org>
2021-06-23 13:22:15 -07:00
Rich Ercolani 8f93ab53d4 Bend zpl_set_acl to permit the new userns* parameter
Just like #12087, the set_acl signature changed with all the bolted-on
*userns parameters, which disabled set_acl usage, and caused #12076.

Turn zpl_set_acl into zpl_set_acl and zpl_set_acl_impl, and add a
new configure test for the new version.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12076
Closes #12093
2021-06-23 13:22:15 -07:00
Rich Ercolani 88891b804b Update tmpfile() existence detection
Linux changed the tmpfile() signature again in torvalds/linux@6521f89,
which in turn broke our HAVE_TMPFILE detection in configure.

Update that macro to include the new case, and change the signature of
zpl_tmpfile as appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12060
Closes: #12087
2021-06-23 13:22:15 -07:00
Armin Wehrfritz 87b5e7fb1d RPM: Explicitly set the required min/max kernel version for the DKMS package
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Armin Wehrfritz <dkxls23@gmail.com>
Closes openzfs#12124
2021-06-23 13:22:15 -07:00
Coleman Kane 5722dce473 Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES
The BIO_MAX_PAGES macro is being retired in favor of a bio_max_segs()
function that implements the typical MIN(x,y) logic used throughout the
kernel for bounding the allocation, and also the new implementation is
intended to be signed-safe (which the former was not).

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11765
(cherry picked from commit ffd6978ef5)
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
2021-06-23 13:22:15 -07:00
Coleman Kane 91d5ac85c0 Linux 5.12 compat: idmapped mounts
In Linux 5.12, the filesystem API was modified to support ipmapped
mounts by adding a "struct user_namespace *" parameter to a number
functions and VFS handlers. This change adds the needed autoconf
macros to detect the new interfaces and updates the code appropriately.
This change does not add support for idmapped mounts, instead it
preserves the existing behavior by passing the initial user namespace
where needed.  A subsequent commit will be required to add support
for idmapped mounted.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11712
(cherry picked from commit e2a8296131)
Signed-off-by: Jonathon Fernyhough <jonathon@m2x.dev>
2021-06-23 13:22:15 -07:00
Ryan Moeller fd8ccce07b FreeBSD: Initialize/destroy zp->z_lock
zp->z_lock is used in shared code for protecting projid and scantime.
We don't exercise these paths much if at all on FreeBSD, so have been
lucky enough not to have issues with the uninitialized locks so far.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #12003
2021-06-23 13:22:15 -07:00
Ryan Moeller c2a3bcc91d ZTS: Fix xattr_002_neg passing too soon
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11970
2021-06-23 13:22:15 -07:00
Toomas Soome d1277e2a13 zdb: ASSERT issues when DEBUG is not defined
If zdb is not built with DEBUG mode, the ASSERT macros will be
eliminated.

This will leave vim defined, but not used (gcc warning) and
checkpoint spacemap validation loop will do nothing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #11932
2021-06-23 13:22:14 -07:00
Brian Behlendorf a59bf29606 ZTS: Add known exceptions
Both the zpool_initialize_import_export and checkpoint_discard_busy
test cases a known to occasionally fail.  Add them to the list of
known possible failures and reference the appropriate issue on the
tracker.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11949
2021-06-23 13:22:14 -07:00
Prawn 28fdc5f811 receive: don't fail inheriting (-x) properties on wrong dataset type
Receiving datasets while blanket inheriting properties like zfs 
receive -x mountpoint can generally be desirable, e.g. to avoid 
unexpected mounts on backup hosts.

Currently this will fail to receive zvols due to the mountpoint 
property being applicable to filesystems only.  This limitation 
currently requires operators to special-case their minds and tools 
for zvols.

This change gets rid of this limitation for inherit (-x) by
Spiting up the dataset type handling: Warnings for inheriting (-x), 
errors for overriding (-o).

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #11416
Closes #11840
Closes #11864
2021-06-23 13:22:14 -07:00
Mateusz Guzik a78a93e67e FreeBSD: damage control racing .. lookups in face of mkdir/rmdir
External-issue: https://reviews.freebsd.org/D29769
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11926
2021-06-23 13:22:14 -07:00
Romain Dolbeau 9a68ceba10 Fix AVX512BW Fletcher code on AVX512-but-not-BW machines
Introduce a specific valid function for avx512f+avx512bw (instead 
of checking only for avx512f).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: Romain Dolbeau <romain@dolbeau.org>
Closes #11937
Closes #11938
2021-06-23 13:22:14 -07:00
Daniel Stevenson 3c7ad493a6 Fixed incorrect man page reference in zfsprops(8)
The special_small_blocks section directed readers to zpool(8) for
documentation on special allocation classes, while they are actually
documented in zpoolconcepts(8).

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Stevenson <daniel@dstev.net>
Closes #11918
2021-06-23 13:22:14 -07:00
наб faad85637b freebsd/libshare: nfs: make nfs_is_shared() thread-safe
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11886
2021-06-23 13:22:14 -07:00
наб 501da8d433 libshare: nfs: don't leak nfs_lock_fd when lock fails
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11886
2021-06-23 13:22:14 -07:00
наб 30620ad8e1 libzfs: refresh property cache after inheriting userprop
This matches what happens when inheriting a system property

Consider the following program:
	int main() {
		void *zhp = libzfs_init();
		void *dataset = zfs_open(zhp, "zest/__test", 1);

		printf("before:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");

		zfs_prop_inherit(dataset, "xyz.nabijaczleweli:test", 0);
		printf("after:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");

		zfs_refresh_properties(dataset);
		printf("refreshed:");
		dump_nvlist(zfs_get_user_props(dataset), 2);
		printf("\n");
	}

And the output before:
	# zfs set xyz.nabijaczleweli:test=hehe zest/__test
	# ./a.out
	before:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	after:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	refreshed:

As compared to the output after:
	# zfs set xyz.nabijaczleweli:test=hehe zest/__test
	# ./a.out
	before:  xyz.nabijaczleweli:test:
	      value: 'hehe'
	      source: 'zest/__test'

	after:
	refreshed:

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11064
Closes #11911
2021-06-23 13:22:14 -07:00
наб 1d65d1a597 libzfs: don't mark prompt+raw as retriable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11911
Closes #11031
2021-06-23 13:22:14 -07:00
Mateusz Guzik edf3d7aa43 Combine zio caches if possible
This deduplicates 2 sets of caches which use the same allocation size.

Memory savings fluctuate a lot, one sample result is FreeBSD running
"make buildworld" saving ~180MB RAM in reduced page count associated
with zio caches.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11877
2021-06-23 13:22:14 -07:00
Paul Zuchowski 9755cdfd89 Fix crash in zio_done error reporting
Fix NULL pointer dereference when reporting
checksum error for gang block in zio_done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #11872
Closes #11896
2021-06-23 13:22:14 -07:00
Brian Behlendorf f36e8118fd Fix 'make checkbashisms` warnings
The awk command used by the checkbashisms target incorrectly
adds the escape character before the ! and # characters.  This
results in the following warnings because these characters do not
need to be escaped.

    awk: cmd. line:1: warning: regexp escape sequence
        `\!' is not a known regexp operator
    awk: cmd. line:1: warning: regexp escape sequence
        `\#' is not a known regexp operator

Remove the unneeded escape character before ! and #.

Valid escape sequences are:

    https://www.gnu.org/software/gawk/manual/html_node/Escape-Sequences.html

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11902
2021-06-23 13:22:14 -07:00
Yuri Pankov e69f73c5cf Fix vdev health padding in zpool list -v
Do not (incorrectly, right instead left) pad health string itself,
it will be taken care of when printing property value below.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Yuri Pankov <yuripv@FreeBSD.org>
Closes #11899
2021-06-23 13:22:14 -07:00
наб 3430eced80 libzfs: zfs_mount_at(): load key for encryption root if MS_CRYPT
zfs_crypto_load_key() only works on encryption roots,
and zfs mount -la would fail if it encounters a datasets that
is sorted before their encroots.

To trigger:
  truncate -s 40G /tmp/test
  dd if=/dev/urandom of=/tmp/k bs=128 count=1 status=none
  zpool create -O encryption=on -O keylocation=file:///tmp/k \
               -O keyformat=passphrase test /tmp/test
  zfs create -o mountpoint=/a test/a
  zfs create -o mountpoint=/b test/b
  zfs umount test
  zfs unload-key test
  zfs mount -la

The final mount errored out with:
  Key load error: Keys must be loaded for
    encryption root of 'test/a' (test).
  Key load error: Keys must be loaded for
    encryption root of 'test/b' (test).

And only /test was mounted

This technically breaks the libzfs API, but the previous behavior was
decidedly a bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11870 
Closes #11875
2021-06-23 13:22:14 -07:00
Brian Behlendorf 0839934d84 ZTS: fix removal_condense_export test case
It's been observed in the CI that the required 25% of obsolete bytes
in the mapping can be to high a threshold for this test resulting in
condensing never being triggered and a test failure.  To prevent these
failures make the existing zfs_condense_indirect_obsolete_pct tuning
available so the obsolete percentage can be reduced from 25% to 5%
during this test.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11869
2021-06-23 13:22:14 -07:00
наб f831d1377f libzfs{,_core}: set O_CLOEXEC on persistent (ZFS_DEV and MNTTAB) fds
These were fd 3, 4, and 5 by the time zfs change-key hit
execute_key_fob()

glibc appends "e" to setmntent() mode, but musl's just returns fopen()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-06-23 13:22:14 -07:00
наб a559c3d0f4 libzfs: zfs_crypto_create() requires a new key by definition: set newkey
This changes the password prompt for new encryption roots from
  Enter passphrase:
  Re-enter passphrase:
to
  Enter new passphrase:
  Re-enter new passphrase:
which makes more sense and is more consistent with "new passphrase"
now always meaning "come up with something" and plain "passphrase"
"remember that thing"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-06-23 13:22:14 -07:00
наб 95e29a7275 zfprops(8): fix spacing in jailed= arguments
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-06-23 13:22:14 -07:00
наб 85890d54d9 zfs-[un]jail(8): fix "zfs-jail [un]jail" leftovers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11866
2021-06-23 13:22:14 -07:00
Ryan Moeller 12e25a6ec3 ZTS: Improve cleanup in removal_with_export
Kill the removal operation on every platform, not just Linux.
The test has been fixed and is now stable on FreeBSD.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11856
2021-06-23 13:22:14 -07:00
Ryan Moeller 57490d5e13 ZTS: Tests using zhack may fail on FreeBSD
As described in #11854, zhack is occasionally segfaulting on FreeBSD. 
Debugging this is proving to be tricky. To avoid false positives in
the CI add entries for the tests that use zhack in zts-report to 
accept that they may occasionally fail on FreeBSD.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Issue #11854
Closes #11855
2021-06-23 13:22:14 -07:00
Ryan Moeller 214196e9f5 Ratelimit deadman zevents as with delay zevents
Just as delay zevents can flood the zevent pipe when a vdev becomes
unresponsive, so do the deadman zevents.

Ratelimit deadman zevents according to the same tunable as for delay
zevents.

Enable deadman tests on FreeBSD and add a test for deadman event
ratelimiting. 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11786
2021-06-23 13:22:14 -07:00
matt-fidd 932d471bec zfs get -p only outputs 3 columns if "clones" property is empty
get_clones_string currently returns an empty string for filesystem
snapshots which have no clones. This breaks parsable `zfs get` output as
only three columns are output, instead of 4.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Fiddaman <github@m.fiddaman.uk>
Co-authored-by: matt <matt@fiddaman.net>
Closes #11837
2021-06-23 13:22:14 -07:00
наб 77cba40b85 zpool-features.5: remove "booting not possible with this feature"s
The exact limitations on what features are supported when booting
vary considerably depending on the environment.  In order to minimize
confusion avoid categorical statements which assume GRUB2 is being 
used.  The supported GRUB2 features are covered earlier in this man 
page for easy reference.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11842
2021-06-23 13:22:14 -07:00
George Melikov 18d069a20f man: fix wrong .Xr macros usages
In addition, html doc will have working hyperlinks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11845
2021-06-23 13:22:14 -07:00
наб dfc25ea91c libzutil: zfs_isnumber(): return false if input empty
zpool list, which is the only user, would mistakenly try to parse the
empty string as the interval in this case:

  $ zpool list "a"
  cannot open 'a': no such pool
  $ zpool list ""
  interval cannot be zero
  usage: <usage string follows>
which is now symmetric with zpool get:
  $ zpool list ""
  cannot open '': name must begin with a letter

Avoid breaking the  "interval cannot be zero" string.
There simply isn't a need for this, and it's user-facing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11841 
Closes #11843
2021-06-23 13:22:14 -07:00
Brian Behlendorf 7428adc6b7 ZTS: pool_checkpoint improvements
The pool_checkpoint tests may incorrectly fail because several of
them invoke zdb for an imported pool.  In this scenario it's not
unexpected for zdb to fail if the pool is modified.  To resolve
this these zdb checks are now done after the pool has been exported.

Additionally, the default cleanup functions assumed the pool would
be imported when they were run.  If this was not the case they're
exit early and fail to cleanup all of the test state causing
subsequent tests to fail.  Add a check to only destroy the pool
when it is imported.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11832
2021-06-23 13:22:14 -07:00
Ryan Moeller 30782f2ff5 ZTS: inheritance/inherit_001_pos is flaky
Add inheritance/inherit_001_pos to the maybe fails on FreeBSD list.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11830
2021-06-23 13:22:14 -07:00
Ryan Moeller 43efd5625b Avoid taking global lock to destroy zfsdev state
We have exclusive access to our zfsdev state object in this section
until it is invalidated by setting zs_minor to -1, so we can destroy
the state without taking a lock if we do the invalidation last, after
a member to ensure correct ordering.

While here, strengthen the assertions that zs_minor is valid when we
enter.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11751
2021-06-23 13:22:14 -07:00
Ryan Moeller 87a520c4e7 FreeBSD: Fix stable/12 after AT_BENEATH removal
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11827
2021-06-23 13:22:14 -07:00
Ryan Moeller d02bcae0c4 Allow pool names that look like Solaris disk names
Nothing bad happens if a prefix of your pool name matches a disk name.
This is a bit of a silly restriction at this point.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #11781 
Closes #11813
2021-06-23 13:22:14 -07:00
Ryan Moeller d9bcadc6ba Don't scale zfs_zevent_len_max by CPU count
The lower bound for this scaling to too low and the upper bound is too
high.  Use a fixed default length of 512 instead, which is a reasonable
value on any system.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11822
2021-06-23 13:22:14 -07:00
Ryan Moeller 67819ea28d Atomically check and set dropped zevent count
ratelimit_dropped isn't protected by a lock and is expected to
be updated atomically.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11822
2021-06-23 13:22:14 -07:00
Brian Behlendorf 79430bf34b CI: Increase free space in workflow
Recently we've been running out of free space in the ubuntu 20.04
environment resulting in test failures.  This appears to be caused
by a change in the default available free space and not because of
any change in OpenZFS. Try and avoid this failure by applying a
suggested workaround which removes some unnecessary files.

https://github.com/actions/virtual-environments/issues/2840

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11826
2021-06-23 13:22:14 -07:00
Andrew cabc605aa7 Fix regression in POSIX mode behavior
Commit 235a85657 introduced a regression in evaluation of POSIX modes
that require group DENY entries in the internal ZFS ACL. An example
of such a POSX mode is 007. When write_implies_delete_child is set,
then ACE_WRITE_DATA is added to `wanted_dirperms` in prior to calling
zfs_zaccess_common(). This occurs is zfs_zaccess_delete().

Unfortunately, when zfs_zaccess_aces_check hits this particular DENY
ACE, zfs_groupmember() is checked to determine whether access should be
denied, and since zfs_groupmember() always returns B_TRUE on Linux and
so this check is failed, resulting ultimately in EPERM being returned.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #11760
2021-06-23 13:22:14 -07:00
Palash Gandhi 0740d41e9b ZTS: New test for kernel panic induced by redacted send
This change adds a new test that covers a bug fix in the binary search
in the redacted send resume logic that causes a kernel panic.
The bug was fixed in https://github.com/openzfs/zfs/pull/11297.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Palash Gandhi <palash.gandhi@delphix.com>
Closes #11764
2021-06-23 13:22:14 -07:00
Martin Matuška 16e8e24f9c Allow setting bootfs property on pools with indirect vdevs
The FreeBSD boot loader relies on the bootfs property and is capable
of booting from removed (indirect) vdevs.

Reviewed-by Eric van Gyzen
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #11763
2021-06-23 13:22:14 -07:00
Mateusz Guzik bc5afff351 FreeBSD: make seqc asserts conditional on replay
Avoids tripping on asserts when doing pool recovery.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11739
2021-06-23 13:22:14 -07:00
Ryan Moeller 825d8e1b0f FreeBSD: Fix memory leaks in kstats
Don't handle (incorrectly) kmem_zalloc() failure.  With KM_SLEEP,
will never return NULL.

Free the data allocated for non-virtual kstats when deleting the object.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11767
2021-06-23 13:22:14 -07:00
gldisater f7797f3f5e Hold and release permissions exist
The man page was missing these two permissions.
Add the missing permissions to the man page. 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jeremy Faulkner <gldisater@gldis.ca>
Closes #11727
2021-06-23 13:22:14 -07:00
Ryan Moeller 2439e95f36 ZTS: Add tests for DOS mode attributes
Create a new section of tests to run with acltype=off.

For now the only test we have is for the DOS mode READONLY attribute on
FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11734
2021-06-23 13:22:14 -07:00
Ryan Moeller de722e55ca ZTS: Fix incorrect use of libtest in user_run by xattr_003_neg
You can't use user_run to eval ksh functions defined in libtest unless
you include libtest in the user shell.

Fix xattr_003_neg by:
* include libtest in the user shell
* *then* run get_xattr
* assert this fails
* use variables for filenames so they don't change in the user's shell
* don't log the contents of /etc/passwd
* cleanup all byproducts

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11185
2021-06-23 13:22:14 -07:00
Ryan Moeller acb4e7b0e1 ZTS: Use ksh and current environment for user_run
The current user_run often does not work as expected.  Commands are run
in a different shell, with a different environment, and all output is
discarded.

Simplify user_run to retain the current environment, eliminate eval,
and feed the command string into ksh.  Enhance the logging for
user_run so we can see out and err.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11185
2021-06-23 13:22:14 -07:00
Mariusz Zaborski 5a397fda09 FreeBSD: bring back possibility to rewind the checkpoint from bootloader
Add parsing of the rewind options.

When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewind. When the FreeBSD repo has
synced with the OpenZFS, this part of the code was removed.

[1] FreeBSD repo: 277f38abffc6a8160b5044128b5b2c620fbb970c
[2] OpenZFS repo: f2c027bd6a

External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254152
Originally reviewed by: tsoome, allanjude
Originally reviewed by: kevans (ok from high-level overview)
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
Closes #11730
2021-06-23 13:22:14 -07:00
Ryan Moeller 525a7037c7 FreeBSD: Clean up zfsdev_close to match Linux
Resolve some oddities in zfsdev_close() which could result in a
panic and were not present in the equivalent function for Linux.

- Remove unused definition ZFS_MIN_MINOR
- FreeBSD: Simplify zfsdev state destruction
- Assert zs_minor is valid in zfsdev_close
- Make locking around zfsdev state match Linux

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11720
2021-06-23 13:22:14 -07:00
Mateusz Guzik 789f3d0e56 FreeBSD: switch teardown lock to rms
This deserializes otherwise non-contending operations.

The previous scheme of using 17 locks hashed by curthread runs into
conflicts very quickly. Check the pull request for sample results.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
Mateusz Guzik 5d946dfa1c Macroify teardown lock handling
This will allow platforms to implement it as they see fit, in particular
in a different manner than rrm locks.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
Mateusz Guzik cd80d6d355 FreeBSD: rename teardown inactive macros to mimick rrm convention
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
Mateusz Guzik 140584bfff FreeBSD: remove 2 assertions that teardown lock is not held
They are not very useful and hard to implement in the rms routine
the code is about to start using.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
Mateusz Guzik 8773e29c23 FreeBSD: rework asserts in zfs_dd_lookup
1. even up ifdefs
2. drop the arguably useless teardown lock asserts -- nothing else
   checks for it

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
Mateusz Guzik bb21c0aa3b Add branch prediction to ZFS_ENTER and ZFS_VERIFY_ZP macros
They are expected to fail only in corner cases.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11153
2021-06-23 13:22:14 -07:00
George Wilson 455ea5bb1f zpool import cachefile improvements
Importing a pool using the cachefile is ideal to reduce the time
required to import a pool. However, if the devices associated with
a pool in the cachefile have changed, then the import would fail.
This can easily be corrected by doing a normal import which would
then read the pool configuration from the labels.

The goal of this change is make importing using a cachefile more
resilient and auto-correcting. This is accomplished by having
the cachefile import logic automatically fallback to reading the
labels of the devices similar to a normal import. The main difference
between the fallback logic and a normal import is that the cachefile
import logic will only look at the device directories that were
originally used when the cachefile was populated. Additionally,
the fallback logic will always import by guid to ensure that only
the pools in the cachefile would be imported.

External-issue: DLPX-71980
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #11716
2021-06-23 13:22:14 -07:00
Martin Matuška 7126b0e5ed Fix whitespace introduced in ecc277cff
The manual page change in ecc277c has introduced whitespace on
line ends.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #11722
2021-06-23 13:22:14 -07:00
Ryan Moeller 7ca0ef36e1 FreeBSD: Fix scope of deadman tunables
A few deadman tunables ended up in the wrong sysctl node.

Move them to vfs.zfs.deadman.*

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11715
2021-06-23 13:22:14 -07:00
Adam D. Moss 0d6186cff3 Microoptimizations for VERIFY() and friends
Add branch hints and constify the intermediate evaluations of 
left/right params in VERIFY3*().

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #11708
2021-06-23 13:22:14 -07:00
Allan Jude 6f349d33a5 Add missing files to Makefile
Some .h files that were added were missed in this Makefile. Since 
they are .h files, their being missing only resulted in them 
disappeared from the dist archive.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #11705
2021-06-23 13:22:14 -07:00
George Melikov 226d362b12 CI checkstyle: pin ubuntu version
Our checkstyle doesn't work well on Ubuntu 20.04,
temporary pin it to 18.04.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #11713
2021-06-23 13:22:14 -07:00
Antonio Russo 497fc5fb06 ZTS events_002: Improve speed and reliability
events_002 exercises the ZED, ensuring that it neither misses events,
nor reporting events twice.

On slow test hardware, some of the timeouts are insufficient to allow
the ZED to properly settle.  Conversely, on fast hardware these same
timeouts are too long, unnecessarily slowing the test run.

Instead of using a fixed timeout, wait for the expected final event
before returning.  Additionally, wait with a timeout for unexpected
events to avoid missing them if they show up late.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #11703
2021-06-23 13:22:14 -07:00
Christian Schwarz abb485a34a zvol: call zil_replaying() during replay
zil_replaying(zil, tx) has the side-effect of informing the ZIL that an
entry has been replayed in the (still open) tx.  The ZIL uses that
information to record the replay progress in the ZIL header when that
tx's txg syncs.

ZPL log entries are not idempotent and logically dependent and thus
calling zil_replaying() is necessary for correctness.

For ZVOLs the question of correctness is more nuanced: ZVOL logs only
TX_WRITE and TX_TRUNCATE, both of which are idempotent. Logical
dependencies between two records exist only if the write or discard
request had sync semantics or if the ranges affected by the records
overlap.

Thus, at a first glance, it would be correct to restart replay from
the beginning if we crash before replay completes. But this does not
address the following scenario:
Assume one log record per LWB.
The chain on disk is

    HDR -> 1:W(1, "A") -> 2:W(1, "B") -> 3:W(2, "X") -> 4:W(3, "Z")

where N:W(O, C) represents log entry number N which is a TX_WRITE of C
to offset A.
We replay 1, 2 and 3 in one txg, sync that txg, then crash.
Bit flips corrupt 2, 3, and 4.
We come up again and restart replay from the beginning because
we did not call zil_replaying() during replay.
We replay 1 again, then interpret 2's invalid checksum as the end
of the ZIL chain and call replay done.
The replayed zvol content is "AX".

If we had called zil_replaying() the HDR would have pointed to 3
and our resumed replay would not have replayed anything because
3 was corrupted, resulting in zvol content "BX".

If 3 logically depends on 2 then the replay corrupted the ZVOL_OBJ's
contents.

This patch adds the zil_replaying() calls to the replay functions.
Since the callbacks in the replay function need the zilog_t* pointer
so that they can call zil_replaying() we open the ZIL while
replaying in zvol_create_minor(). We also verify that replay has
been done when on-demand-opening the ZIL on the first modifying
bio.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11667
2021-06-23 13:22:14 -07:00
Ryan Moeller 0ccffb2634 ZTS: Improve cleanup in zpool tests
* Restore original kern.corefile value after the test.
* Don't leave behind a frozen pool.
* Clean up leftover vdev files.
* Make zpool_002_pos and zpool_003_pos consistent in their handling of
core files while here.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11694
2021-06-23 13:22:14 -07:00
nssrikanth 26cb87d22d Cancel TRIM / initialize on FAULTED non-writeable vdevs
When a device which is actively trimming or initializing becomes
FAULTED, and therefore no longer writable, cancel the active
TRIM or initialization.  When the device is merely taken offline
with `zpool offline` then stop the operation but do not cancel it.
When the device is brought back online the operation will be
resumed if possible.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Vipin Kumar Verma <vipin.verma@hpe.com>
Signed-off-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com>
Closes #11588
2021-06-23 13:22:14 -07:00
Brian Behlendorf 43dbfa3921 ZTS: zpool_trim_start_and_cancel_pos.ksh
Several of the TRIM tests were based of the initialize tests and
then adapted for TRIM.  The zpool_trim_start_and_cancel_pos.ksh
test was intended to be one such test but it was overlooked and
actually never adapted.  Update it accordingly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11649
2021-06-23 13:22:14 -07:00
Brian Behlendorf 395583e38e Fix overly broad locking in spa_vdev_config_exit()
Calling vdev_free() only requires the we acquire the spa config
SCL_STATE_ALL locks, not the SCL_ALL locks.  In particular, we need
need to avoid taking the SCL_CONFIG lock (included in SCL_ALL) as a
writer since this can lead to a deadlock.  The txg_sync_thread() may
block in spa_txg_history_init_io() when taking the SCL_CONFIG lock
as a reading when it detects there's a pending writer.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11585
2021-06-23 13:22:14 -07:00
Ryan Moeller 818bc70c32 Wrap bare EINVAL returns with SET_ERROR
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11636
2021-06-23 13:22:14 -07:00
Tony Hutter 6150fbe67f Tag zfs-2.0.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-03-08 13:04:37 -08:00
manfromafar 458bb8c8a1 Clarify compressed zfs send/recv behavior
Docs for send and receive do not explain behavior when sending a 
compressed stream then receiving on a host that overrides compression 
with -o compress=value.

The data from the send stream is written as it was from the send is 
the compressed form but the compression algorithm set on the receiver 
is the overridden version which causes some confusion as to what 
algorithm was actually used.

Updated man docs to clarify behavior

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed By: Allan Jude <allanjude@freebsd.org>
Signed-off-by: manfromafar <manfromafar@outlook.com>
Closes #11690
2021-03-08 09:07:32 -08:00
Ryan Moeller cda6fdd500 Intentionally allow ZFS_READONLY in zfs_write
ZFS_READONLY represents the "DOS R/O" attribute.
When that flag is set, we should behave as if write access
were not granted by anything in the ACL.  In particular:
We _must_ allow writes after opening the file r/w, then
setting the DOS R/O attribute, and writing some more.
(Similar to how you can write after fchmod(fd, 0444).)

Restore these semantics which were lost on FreeBSD when refactoring
zfs_write.  To my knowledge Linux does not actually expose this flag,
but we'll need it to eventually so I've added the supporting checks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11693
2021-03-08 09:07:29 -08:00
Brian Behlendorf df8271301e Suppress cppcheck invalidSyntax warninigs
For some reason cppcheck 1.90 is generating an invalidSyntax warning
when the BF64_SET macro is used in the zstream source.  The same
warning is not reported by cppcheck 2.3, nor is their any evident
problem with the expanded macro.  This appears to be an issue with
this version of cppcheck.  This commit annotates the source to suppress
the warning.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11700
2021-03-08 09:07:25 -08:00
Brian Behlendorf e219935f10 Initialize ZIL buffers
When populating a ZIL destination buffer ensure it is always
zeroed before its contents are constructed.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11687
2021-03-08 09:07:21 -08:00
Thomas Lamprecht bb9104f1e8 zpool: use tab to intend continuation from removal status
Bring the output of the removal status in line with the other
"fields" that zpool status outputs, and thus allows an parser to
easier detect this as continuation of the 'remove:' output.

Before:
remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
    776K memory used for removed device mappings

Now:
remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
	776K memory used for removed device mappings

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Closes #11674
2021-03-08 09:07:18 -08:00
James Wah 94b240bae0 Don't bomb out when using keylocation=file://
Avoid following the error path when the operation in fact succeeded.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Wah <james@laird-wah.net>
Closes #11651
2021-03-05 12:59:07 -08:00
Jake Howard e93203e004 Add "zstd-fast" to help options for "compression" property
This value does work as expected, and is documented in the manpage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jake Howard <git@theorangeone.net>
Closes #11670
2021-03-05 12:58:47 -08:00
Andriy Gapon ccb453acd0 Fix assert in FreeBSD-specific dmu_read_pages
The function has three similar pieces of code: for read-behind pages,
requested pages and read-ahead pages.  All three pieces had an
assert to ensure that the page is not mapped.  Later the assert was
relaxed to require that the page is not mapped for writing.  But that
was done in two places out of three.  This change fixes the third piece,
read-ahead.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #11654
2021-03-05 12:58:08 -08:00
Coleman Kane 6c1989923e Linux 5.12 compat: replace bio_*_io_acct with disk_*_io_acct
The bio_*_acct functions became GPL exports, which causes the
kernel modules to refuse to compile. This replaces code with
alternate function calls to the disk_*_io_acct interfaces, which
are not GPL exports. This change was added in kernel commit
99dfc43ecbf67f12a06512918aaba61d55863efc.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11639
2021-03-05 12:57:54 -08:00
Coleman Kane a0eb5a77a0 Linux 5.12 compat: bio->bi_disk member moved
The struct bio member bi_disk was moved underneath a new member named
bi_bdev. So all attempts to reference bio->bi_disk need to now become
bio->bi_bdev->bd_disk.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11639
2021-03-05 12:57:46 -08:00
Brian Behlendorf fe77c48320 Linux: increase max nvlist_src size
On Linux increase the maximum allowed size of the src nvlist which
can be passed to the /dev/zfs ioctl.  Originally, this was set
to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd.
Since that time it's been converted to a vmalloc so that's no
longer a hard limit, and it's desirable for `zfs send/recv` to
allow larger nvlists so more snapshots can be sent at once.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6572
Closes #11638
2021-03-05 12:54:57 -08:00
Cedric Maunoury db9b29895e send_iterate_snap : doall send without fromsnap
The behavior of a NULL fromsnap was inadvertently changed for a doall
send when the send/recv logic in libzfs was updated.  Restore the
previous behavior by correcting send_iterate_snap() to include all
the snapshots in the nvlist for this case. 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Cedric Maunoury <cedric.maunoury@gmail.com>
Closes #11608
2021-03-05 12:54:39 -08:00
fbynite 4d4dd76f0f vdev_ops: don't try to call vdev_op_hold or vdev_op_rele when NULL
This prevents a panic after a SLOG add/removal on the root pool followed
by a zpool scrub.

When a SLOG is removed, a hole takes its place - the vdev_ops for a hole
is vdev_hole_ops, which defines the handler functions of vdev_op_hold
and vdev_op_rele as NULL.

This bug has been reported in illumos and FreeBSD, a different trigger
in the FreeBSD report though.

Credit for this patch goes to Patrick Mooney <pmooney@pfmooney.com>

Obtained from: illumos-gate commit: c65bd18728f34725
External-issue: https://www.illumos.org/issues/12981
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252396
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Wing <rob.fx907@gmail.com>
Closes #11623
2021-03-05 12:53:50 -08:00
Christian Schwarz 286c7f75bb libzpool: set_global_var: fix endianness handling (fixes zdb -o )
Without this patch I get the error

  Setting global variables is only supported on little-endian systems

when using `zdb -o` on my amd64 machine.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes #11602
2021-03-05 12:51:48 -08:00
Ryan Moeller 403703d57a Restore FreeBSD resource usage accounting
Add zfs_racct_* interfaces for platform-dependent read/write accounting.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11613
2021-03-05 12:50:32 -08:00
Mark Johnston f17c843eff FreeBSD: disable the use of hardware crypto offload drivers for now
First, the crypto request completion handler contains a bug in that it
fails to reset fs_done correctly after the request is completed.  This
is only a problem for asynchronous drivers.  Second, some hardware
drivers have input constraints which ZFS does not satisfy.  For
instance, ccp(4) apparently requires the AAD length for AES-GCM to be a
multiple of the cipher block size, and with qat(4) the AES-GCM AAD
length may not be longer than 240 bytes.  FreeBSD's generic crypto
framework doesn't have a mechanism to automatically fall back to a
software implementation if a hardware driver cannot process a request,
and ZFS does not tolerate such errors.

The plan is to implement such a fallback mechanism, but with FreeBSD
13.0 approaching we should simply disable the use hardware drivers for
now.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #11612
2021-03-05 12:49:41 -08:00
Andriy Gapon f6440fa094 Fix report_mount_progress never calling set_progress_header
That happens because of an off-by-one mistake.
share_mount_one_cb() calls report_mount_progress(current=sm_done) after
having incremented sm_done by one.  Then report_mount_progress()
increments the parameter again.  It appears that that logic became
obsolete after commit a10d50f999, parallel zfs mount.

On FreeBSD I observe that zfs mount -a -v prints, for example,
    (null): (201/248)
That happens because set_progress_header() is never called.

With this change the output becomes correct:
    Mounting ZFS filesystems: (209/248)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #11607
2021-03-05 12:49:22 -08:00
José Luis Salvador Rufo 62f9691e10 Support uClibc for the tests compilations
There are two issues that don't allow ZFS to be compiled using uClibc.
`backtrace()`, and `program_invocation_short_name` as a `const`.
This patch adds uClibc to the conditionals in the same way there are
already for Glibc for `backtrace()`; and removes the external param
`program_invocation_short_name` because its only used here for the
whole project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com>
Closes #11600
2021-03-05 12:48:50 -08:00
Brian Behlendorf 73e26fdc09 Linux 5.11 compat: META
Increase the Linux-Maximum version in the META file to 5.11.
All of the required compatibility patches have been merged.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11586
2021-03-05 12:48:08 -08:00
Tony Hutter 65a89d9f49 Tag zfs-2.0.3
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-02-10 13:03:21 -08:00
592 changed files with 23920 additions and 14785 deletions

View File

@ -25,14 +25,16 @@ Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Kernel Version |
Architecture |
ZFS Version |
SPL Version |
OpenZFS Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
Command to find OpenZFS version:
zfs version
Commands to find kernel version:
uname -r # Linux
freebsd-version -r # FreeBSD
-->
### Describe the problem you're observing

View File

@ -7,5 +7,5 @@ contact_links:
url: https://lists.freebsd.org/mailman/listinfo/freebsd-fs
about: Get community support for OpenZFS on FreeBSD
- name: OpenZFS on IRC
url: https://webchat.freenode.net/#openzfs
url: https://web.libera.chat/#openzfs
about: Use IRC to get community support for OpenZFS

View File

@ -6,7 +6,7 @@ on:
jobs:
checkstyle:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
with:
@ -18,12 +18,13 @@ jobs:
sudo apt-get install --yes -qq zlib1g-dev uuid-dev libattr1-dev libblkid-dev libselinux-dev libudev-dev libssl-dev python-dev python-setuptools python-cffi python3 python3-dev python3-setuptools python3-cffi
# packages for tests
sudo apt-get install --yes -qq parted lsscsi ksh attr acl nfs-kernel-server fio
sudo apt-get install --yes -qq mandoc cppcheck pax-utils abigail-tools # devscripts - enable then bashisms fixed
sudo apt-get install --yes -qq mandoc cppcheck pax-utils devscripts
sudo -E pip --quiet install flake8
- name: Prepare
run: |
sh ./autogen.sh
./configure
make -j$(nproc)
- name: Checkstyle
run: |
make checkstyle
@ -31,6 +32,19 @@ jobs:
run: |
make lint
- name: CheckABI
id: CheckABI
run: |
make -j$(nproc)
make checkabi
sudo docker run -v $(pwd):/source ghcr.io/openzfs/libabigail make checkabi
- name: StoreABI
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
sudo docker run -v $(pwd):/source ghcr.io/openzfs/libabigail make storeabi
- name: Prepare artifacts
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
path: abi_files.tar

View File

@ -26,7 +26,7 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev pamtester python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python3 python3-dev python3-setuptools python3-cffi python3-packaging
- name: Autogen.sh
run: |
sh autogen.sh
@ -44,9 +44,26 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround for cloud-init bug
# see https://github.com/openzfs/zfs/issues/12644
FILE=/lib/udev/rules.d/10-cloud-init-hook-hotplug.rules
if [ -r "${FILE}" ]; then
HASH=$(md5sum "${FILE}" | awk '{ print $1 }')
if [ "${HASH}" = "121ff0ef1936cd2ef65aec0458a35772" ]; then
# Just shove a zd* exclusion right above the hotplug hook...
sudo sed -i -e s/'LABEL="cloudinit_hook"'/'KERNEL=="zd*", GOTO="cloudinit_end"\n&'/ "${FILE}"
sudo udevadm control --reload-rules
fi
fi
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf "/usr/local/share/boost"
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G
/usr/share/zfs/zfs-tests.sh -vR -s 3G
- name: Prepare artifacts
if: failure()
run: |
@ -55,7 +72,7 @@ jobs:
sudo cp /var/log/syslog $RESULTS_PATH/
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find $RESULTS_PATH -name '*:*'); do mv "$f" "${f//:/__}"; done
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
if: failure()
with:

View File

@ -22,7 +22,7 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev pamtester python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python3 python3-dev python3-setuptools python3-cffi python3-packaging
- name: Autogen.sh
run: |
sh autogen.sh
@ -40,9 +40,26 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround for cloud-init bug
# see https://github.com/openzfs/zfs/issues/12644
FILE=/lib/udev/rules.d/10-cloud-init-hook-hotplug.rules
if [ -r "${FILE}" ]; then
HASH=$(md5sum "${FILE}" | awk '{ print $1 }')
if [ "${HASH}" = "121ff0ef1936cd2ef65aec0458a35772" ]; then
# Just shove a zd* exclusion right above the hotplug hook...
sudo sed -i -e s/'LABEL="cloudinit_hook"'/'KERNEL=="zd*", GOTO="cloudinit_end"\n&'/ "${FILE}"
sudo udevadm control --reload-rules
fi
fi
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
sudo rm -rf /opt/ghc
sudo rm -rf "/usr/local/share/boost"
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G -r sanity
/usr/share/zfs/zfs-tests.sh -vR -s 3G -r sanity
- name: Prepare artifacts
if: failure()
run: |
@ -51,7 +68,7 @@ jobs:
sudo cp /var/log/syslog $RESULTS_PATH/
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find $RESULTS_PATH -name '*:*'); do mv "$f" "${f//:/__}"; done
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
if: failure()
with:

View File

@ -22,8 +22,8 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev \
python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python-dev python-setuptools python-cffi python-packaging \
python3 python3-dev python3-setuptools python3-cffi python3-packaging
- name: Autogen.sh
run: |
sh autogen.sh

4
META
View File

@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.0.2
Version: 2.0.7
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.10
Linux-Maximum: 5.15
Linux-Minimum: 3.10

View File

@ -25,7 +25,6 @@ endif
AUTOMAKE_OPTIONS = foreign
EXTRA_DIST = autogen.sh copy-builtin
EXTRA_DIST += cppcheck-suppressions.txt
EXTRA_DIST += config/config.awk config/rpm.am config/deb.am config/tgz.am
EXTRA_DIST += META AUTHORS COPYRIGHT LICENSE NEWS NOTICE README.md
EXTRA_DIST += CODE_OF_CONDUCT.md
@ -162,7 +161,7 @@ checkbashisms:
-o -name '90zfs' -prune \
-o -type f ! -name 'config*' \
! -name 'libtool' \
-exec sh -c 'awk "NR==1 && /\#\!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
-exec sh -c 'awk "NR==1 && /#!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
else \
echo "skipping checkbashisms because checkbashisms is not installed"; \
fi
@ -204,13 +203,13 @@ vcscheck:
PHONY += lint
lint: cppcheck paxcheck
CPPCHECKDIRS = cmd lib module
PHONY += cppcheck
cppcheck:
@if type cppcheck > /dev/null 2>&1; then \
cppcheck --quiet --force --error-exitcode=2 --inline-suppr \
--suppressions-list=${top_srcdir}/cppcheck-suppressions.txt \
-UHAVE_SSE2 -UHAVE_AVX512F -UHAVE_UIO_ZEROCOPY \
${top_srcdir}; \
cppcheck: $(CPPCHECKDIRS)
@if test -n "$(CPPCHECK)"; then \
set -e ; for dir in $(CPPCHECKDIRS) ; do \
$(MAKE) -C $$dir cppcheck ; \
done \
else \
echo "skipping cppcheck because cppcheck is not installed"; \
fi

View File

@ -1,10 +1,20 @@
SUBDIRS = zfs zpool zdb zhack zinject zstream zstreamdump ztest
SUBDIRS += fsck_zfs vdev_id raidz_test zfs_ids_to_path
CPPCHECKDIRS = zfs zpool zdb zhack zinject zstream ztest
CPPCHECKDIRS += raidz_test zfs_ids_to_path
if USING_PYTHON
SUBDIRS += arcstat arc_summary dbufstat
endif
if BUILD_LINUX
SUBDIRS += mount_zfs zed zgenhostid zvol_id zvol_wait
CPPCHECKDIRS += mount_zfs zed zgenhostid zvol_id
endif
PHONY = cppcheck
cppcheck: $(CPPCHECKDIRS)
set -e ; for dir in $(CPPCHECKDIRS) ; do \
$(MAKE) -C $$dir cppcheck ; \
done

View File

@ -58,7 +58,6 @@ SECTION_PATHS = {'arc': 'arcstats',
'dmu': 'dmu_tx',
'l2arc': 'arcstats', # L2ARC stuff lives in arcstats
'vdev': 'vdev_cache_stats',
'xuio': 'xuio_stats',
'zfetch': 'zfetchstats',
'zil': 'zil'}

View File

@ -18,3 +18,5 @@ mount_zfs_LDADD = \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
mount_zfs_LDADD += $(LTLIBINTL)
include $(top_srcdir)/config/CppCheck.am

View File

@ -367,7 +367,7 @@ main(int argc, char **argv)
"mount the filesystem again.\n"), dataset);
return (MOUNT_SYSERR);
}
/* fallthru */
fallthrough;
#endif
default:
(void) fprintf(stderr, gettext("filesystem "

View File

@ -18,3 +18,5 @@ raidz_test_LDADD = \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la
raidz_test_LDADD += -lm
include $(top_srcdir)/config/CppCheck.am

View File

@ -79,6 +79,34 @@
# channel 86:00.0 1 A
# channel 86:00.0 0 B
# #
# # Example vdev_id.conf - multipath / multijbod-daisychaining
# #
#
# multipath yes
# multijbod yes
#
# # PCI_ID HBA PORT CHANNEL NAME
# channel 85:00.0 1 A
# channel 85:00.0 0 B
# channel 86:00.0 1 A
# channel 86:00.0 0 B
# #
# # Example vdev_id.conf - multipath / mixed
# #
#
# multipath yes
# slot mix
#
# # PCI_ID HBA PORT CHANNEL NAME
# channel 85:00.0 3 A
# channel 85:00.0 2 B
# channel 86:00.0 3 A
# channel 86:00.0 2 B
# channel af:00.0 0 C
# channel af:00.0 1 C
# #
# # Example vdev_id.conf - alias
# #
@ -92,9 +120,10 @@ PATH=/bin:/sbin:/usr/bin:/usr/sbin
CONFIG=/etc/zfs/vdev_id.conf
PHYS_PER_PORT=
DEV=
MULTIPATH=
TOPOLOGY=
BAY=
ENCL_ID=""
UNIQ_ENCL_ID=""
usage() {
cat << EOF
@ -107,22 +136,25 @@ Usage: vdev_id [-h]
-e Create enclose device symlinks only (/dev/by-enclosure)
-g Storage network topology [default="$TOPOLOGY"]
-m Run in multipath mode
-j Run in multijbod mode
-p number of phy's per switch port [default=$PHYS_PER_PORT]
-h show this summary
EOF
exit 0
exit 1
# exit with error to avoid processing usage message by a udev rule
}
map_slot() {
LINUX_SLOT=$1
CHANNEL=$2
MAPPED_SLOT=`awk "\\$1 == \"slot\" && \\$2 == ${LINUX_SLOT} && \
\\$4 ~ /^${CHANNEL}$|^$/ { print \\$3; exit }" $CONFIG`
MAPPED_SLOT=$(awk -v linux_slot="$LINUX_SLOT" -v channel="$CHANNEL" \
'$1 == "slot" && $2 == linux_slot && \
($4 ~ "^"channel"$" || $4 ~ /^$/) { print $3; exit}' $CONFIG)
if [ -z "$MAPPED_SLOT" ] ; then
MAPPED_SLOT=$LINUX_SLOT
fi
printf "%d" ${MAPPED_SLOT}
printf "%d" "${MAPPED_SLOT}"
}
map_channel() {
@ -132,40 +164,120 @@ map_channel() {
case $TOPOLOGY in
"sas_switch")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \\$2 == ${PORT} \
{ print \\$3; exit }" $CONFIG`
MAPPED_CHAN=$(awk -v port="$PORT" \
'$1 == "channel" && $2 == port \
{ print $3; exit }' $CONFIG)
;;
"sas_direct"|"scsi")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \
\\$2 == \"${PCI_ID}\" && \\$3 == ${PORT} \
{ print \\$4; exit }" $CONFIG`
MAPPED_CHAN=$(awk -v pciID="$PCI_ID" -v port="$PORT" \
'$1 == "channel" && $2 == pciID && $3 == port \
{print $4}' $CONFIG)
;;
esac
printf "%s" ${MAPPED_CHAN}
printf "%s" "${MAPPED_CHAN}"
}
get_encl_id() {
set -- $(echo $1)
count=$#
i=1
while [ $i -le $count ] ; do
d=$(eval echo '$'{$i})
id=$(cat "/sys/class/enclosure/${d}/id")
ENCL_ID="${ENCL_ID} $id"
i=$((i + 1))
done
}
get_uniq_encl_id() {
for uuid in ${ENCL_ID}; do
found=0
for count in ${UNIQ_ENCL_ID}; do
if [ $count = $uuid ]; then
found=1
break
fi
done
if [ $found -eq 0 ]; then
UNIQ_ENCL_ID="${UNIQ_ENCL_ID} $uuid"
fi
done
}
# map_jbod explainer: The bsg driver knows the difference between a SAS
# expander and fanout expander. Use hostX instance along with top-level
# (whole enclosure) expander instances in /sys/class/enclosure and
# matching a field in an array of expanders, using the index of the
# matched array field as the enclosure instance, thereby making jbod IDs
# dynamic. Avoids reliance on high overhead userspace commands like
# multipath and lsscsi and instead uses existing sysfs data. $HOSTCHAN
# variable derived from devpath gymnastics in sas_handler() function.
map_jbod() {
DEVEXP=$(ls -l "/sys/block/$DEV/device/" | grep enclos | awk -F/ '{print $(NF-1) }')
DEV=$1
# Use "set --" to create index values (Arrays)
set -- $(ls -l /sys/class/enclosure | grep -v "^total" | awk '{print $9}')
# Get count of total elements
JBOD_COUNT=$#
JBOD_ITEM=$*
# Build JBODs (enclosure) id from sys/class/enclosure/<dev>/id
get_encl_id "$JBOD_ITEM"
# Different expander instances for each paths.
# Filter out and keep only unique id.
get_uniq_encl_id
# Identify final 'mapped jbod'
j=0
for count in ${UNIQ_ENCL_ID}; do
i=1
j=$((j + 1))
while [ $i -le $JBOD_COUNT ] ; do
d=$(eval echo '$'{$i})
id=$(cat "/sys/class/enclosure/${d}/id")
if [ "$d" = "$DEVEXP" ] && [ $id = $count ] ; then
MAPPED_JBOD=$j
break
fi
i=$((i + 1))
done
done
printf "%d" "${MAPPED_JBOD}"
}
sas_handler() {
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
PHYS_PER_PORT=$(awk '$1 == "phys_per_port" \
{print $2; exit}' $CONFIG)
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
if ! echo "$PHYS_PER_PORT" | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
MULTIPATH_MODE=$(awk '$1 == "multipath" \
{print $2; exit}' $CONFIG)
fi
if [ -z "$MULTIJBOD_MODE" ] ; then
MULTIJBOD_MODE=$(awk '$1 == "multijbod" \
{print $2; exit}' $CONFIG)
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
DM_NAME=$(ls -l --full-time /dev/mapper |
grep "$DEV"$ | awk '{print $9}')
fi
# For raw disks udev exports DEVTYPE=partition when
@ -175,28 +287,50 @@ sas_handler() {
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
# Match p[number], remove the 'p' and prepend "-part"
PART=$(echo "$DM_NAME" |
awk 'match($0,/p[0-9]+$/) {print "-part"substr($0,RSTART+1,RLENGTH-1)}')
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
DM_NAME=$(echo "$DM_NAME" | sed 's/p[0-9][0-9]*$//')
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
# Utilize DM device name to gather subordinate block devices
# using sysfs to avoid userspace utilities
# If our DEVNAME is something like /dev/dm-177, then we may be
# able to get our DMDEV from it.
DMDEV=$(echo $DEVNAME | sed 's;/dev/;;g')
if [ ! -e /sys/block/$DMDEV/slaves/* ] ; then
# It's not there, try looking in /dev/mapper
DMDEV=$(ls -l --full-time /dev/mapper | grep $DM_NAME |
awk '{gsub("../", " "); print $NF}')
fi
# Use sysfs pointers in /sys/block/dm-X/slaves because using
# userspace tools creates lots of overhead and should be avoided
# whenever possible. Use awk to isolate lowest instance of
# sd device member in dm device group regardless of string
# length.
DEV=$(ls "/sys/block/$DMDEV/slaves" | awk '
{ len=sprintf ("%20s",length($0)); gsub(/ /,0,str); a[NR]=len "_" $0; }
END {
asort(a)
print substr(a[1],22)
}')
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
if echo "$DEV" | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
sys_path=$(udevadm info -q path -p "/sys/block/$DEV" 2>/dev/null)
fi
# Use positional parameters as an ad-hoc array
@ -206,84 +340,104 @@ sas_handler() {
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
while [ $i -le "$num_dirs" ] ; do
d=$(eval echo '$'{$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
echo "$d" | grep -q -E '^host[0-9]+$' && break
i=$((i + 1))
done
if [ $i = $num_dirs ] ; then
# Lets grab the SAS host channel number and save it for JBOD sorting later
HOSTCHAN=$(echo "$d" | awk -F/ '{ gsub("host","",$NF); print $NF}')
if [ $i = "$num_dirs" ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
PCI_ID=$(eval echo '$'{$((i -1))} | awk -F: '{print $2":"$3}')
# In sas_switch mode, the directory four levels beneath
# /sys/.../hostX contains symlinks to phy devices that reveal
# the switch port number. In sas_direct mode, the phy links one
# directory down reveal the HBA port.
port_dir=$scsi_host_dir
case $TOPOLOGY in
"sas_switch") j=$(($i + 4)) ;;
"sas_direct") j=$(($i + 1)) ;;
"sas_switch") j=$((i + 4)) ;;
"sas_direct") j=$((i + 1)) ;;
esac
i=$(($i + 1))
i=$((i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
port_dir="$port_dir/$(eval echo '$'{$i})"
i=$((i + 1))
done
PHY=`ls -d $port_dir/phy* 2>/dev/null | head -1 | awk -F: '{print $NF}'`
PHY=$(ls -vd "$port_dir"/phy* 2>/dev/null | head -1 | awk -F: '{print $NF}')
if [ -z "$PHY" ] ; then
PHY=0
fi
PORT=$(( $PHY / $PHYS_PER_PORT ))
PORT=$((PHY / PHYS_PER_PORT))
# Look in /sys/.../sas_device/end_device-X for the bay_identifier
# attribute.
end_device_dir=$port_dir
while [ $i -lt $num_dirs ] ; do
d=$(eval echo \${$i})
while [ $i -lt "$num_dirs" ] ; do
d=$(eval echo '$'{$i})
end_device_dir="$end_device_dir/$d"
if echo $d | grep -q '^end_device' ; then
if echo "$d" | grep -q '^end_device' ; then
end_device_dir="$end_device_dir/sas_device/$d"
break
fi
i=$(($i + 1))
i=$((i + 1))
done
# Add 'mix' slot type for environments where dm-multipath devices
# include end-devices connected via SAS expanders or direct connection
# to SAS HBA. A mixed connectivity environment such as pool devices
# contained in a SAS JBOD and spare drives or log devices directly
# connected in a server backplane without expanders in the I/O path.
SLOT=
case $BAY in
"bay")
SLOT=`cat $end_device_dir/bay_identifier 2>/dev/null`
SLOT=$(cat "$end_device_dir/bay_identifier" 2>/dev/null)
;;
"mix")
if [ $(cat "$end_device_dir/bay_identifier" 2>/dev/null) ] ; then
SLOT=$(cat "$end_device_dir/bay_identifier" 2>/dev/null)
else
SLOT=$(cat "$end_device_dir/phy_identifier" 2>/dev/null)
fi
;;
"phy")
SLOT=`cat $end_device_dir/phy_identifier 2>/dev/null`
SLOT=$(cat "$end_device_dir/phy_identifier" 2>/dev/null)
;;
"port")
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
d=$(eval echo '$'{$i})
SLOT=$(echo "$d" | sed -e 's/^.*://')
;;
"id")
i=$(($i + 1))
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
i=$((i + 1))
d=$(eval echo '$'{$i})
SLOT=$(echo "$d" | sed -e 's/^.*://')
;;
"lun")
i=$(($i + 2))
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
i=$((i + 2))
d=$(eval echo '$'{$i})
SLOT=$(echo "$d" | sed -e 's/^.*://')
;;
"ses")
# look for this SAS path in all SCSI Enclosure Services
# (SES) enclosures
sas_address=`cat $end_device_dir/sas_address 2>/dev/null`
enclosures=`lsscsi -g | \
sed -n -e '/enclosu/s/^.* \([^ ][^ ]*\) *$/\1/p'`
sas_address=$(cat "$end_device_dir/sas_address" 2>/dev/null)
enclosures=$(lsscsi -g | \
sed -n -e '/enclosu/s/^.* \([^ ][^ ]*\) *$/\1/p')
for enclosure in $enclosures; do
set -- $(sg_ses -p aes $enclosure | \
set -- $(sg_ses -p aes "$enclosure" | \
awk "/device slot number:/{slot=\$12} \
/SAS address: $sas_address/\
{print slot}")
@ -298,42 +452,55 @@ sas_handler() {
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
if [ "$MULTIJBOD_MODE" = "yes" ] ; then
CHAN=$(map_channel "$PCI_ID" "$PORT")
SLOT=$(map_slot "$SLOT" "$CHAN")
JBOD=$(map_jbod "$DEV")
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
echo "${CHAN}"-"${JBOD}"-"${SLOT}${PART}"
else
CHAN=$(map_channel "$PCI_ID" "$PORT")
SLOT=$(map_slot "$SLOT" "$CHAN")
if [ -z "$CHAN" ] ; then
return
fi
echo "${CHAN}${SLOT}${PART}"
fi
}
scsi_handler() {
if [ -z "$FIRST_BAY_NUMBER" ] ; then
FIRST_BAY_NUMBER=`awk "\\$1 == \"first_bay_number\" \
{print \\$2; exit}" $CONFIG`
FIRST_BAY_NUMBER=$(awk '$1 == "first_bay_number" \
{print $2; exit}' $CONFIG)
fi
FIRST_BAY_NUMBER=${FIRST_BAY_NUMBER:-0}
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
PHYS_PER_PORT=$(awk '$1 == "phys_per_port" \
{print $2; exit}' $CONFIG)
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
if ! echo "$PHYS_PER_PORT" | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
MULTIPATH_MODE=$(awk '$1 == "multipath" \
{print $2; exit}' $CONFIG)
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
DM_NAME=$(ls -l --full-time /dev/mapper |
grep "$DEV"$ | awk '{print $9}')
fi
# For raw disks udev exports DEVTYPE=partition when
@ -343,28 +510,30 @@ scsi_handler() {
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
# Match p[number], remove the 'p' and prepend "-part"
PART=$(echo "$DM_NAME" |
awk 'match($0,/p[0-9]+$/) {print "-part"substr($0,RSTART+1,RLENGTH-1)}')
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
DM_NAME=$(echo "$DM_NAME" | sed 's/p[0-9][0-9]*$//')
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
DEV=$(multipath -ll "$DM_NAME" |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}')
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
if echo "$DEV" | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
sys_path=$(udevadm info -q path -p "/sys/block/$DEV" 2>/dev/null)
fi
# expect sys_path like this, for example:
@ -377,44 +546,47 @@ scsi_handler() {
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
while [ $i -le "$num_dirs" ] ; do
d=$(eval echo '$'{$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
echo "$d" | grep -q -E '^host[0-9]+$' && break
i=$((i + 1))
done
if [ $i = $num_dirs ] ; then
if [ $i = "$num_dirs" ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
PCI_ID=$(eval echo '$'{$((i -1))} | awk -F: '{print $2":"$3}')
# In scsi mode, the directory two levels beneath
# /sys/.../hostX reveals the port and slot.
port_dir=$scsi_host_dir
j=$(($i + 2))
j=$((i + 2))
i=$(($i + 1))
i=$((i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
port_dir="$port_dir/$(eval echo '$'{$i})"
i=$((i + 1))
done
set -- $(echo $port_dir | sed -e 's/^.*:\([^:]*\):\([^:]*\)$/\1 \2/')
set -- $(echo "$port_dir" | sed -e 's/^.*:\([^:]*\):\([^:]*\)$/\1 \2/')
PORT=$1
SLOT=$(($2 + $FIRST_BAY_NUMBER))
SLOT=$(($2 + FIRST_BAY_NUMBER))
if [ -z "$SLOT" ] ; then
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
CHAN=$(map_channel "$PCI_ID" "$PORT")
SLOT=$(map_slot "$SLOT" "$CHAN")
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
echo "${CHAN}${SLOT}${PART}"
}
# Figure out the name for the enclosure symlink
@ -425,7 +597,7 @@ enclosure_handler () {
# Get the enclosure ID ("0:0:0:0")
ENC=$(basename $(readlink -m "/sys/$DEVPATH/../.."))
if [ ! -d /sys/class/enclosure/$ENC ] ; then
if [ ! -d "/sys/class/enclosure/$ENC" ] ; then
# Not an enclosure, bail out
return
fi
@ -433,14 +605,14 @@ enclosure_handler () {
# Get the long sysfs device path to our enclosure. Looks like:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0/ ... /enclosure/0:0:0:0
ENC_DEVICE=$(readlink /sys/class/enclosure/$ENC)
ENC_DEVICE=$(readlink "/sys/class/enclosure/$ENC")
# Grab the full path to the hosts port dir:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0
PORT_DIR=$(echo $ENC_DEVICE | grep -Eo '.+host[0-9]+/port-[0-9]+:[0-9]+')
PORT_DIR=$(echo "$ENC_DEVICE" | grep -Eo '.+host[0-9]+/port-[0-9]+:[0-9]+')
# Get the port number
PORT_ID=$(echo $PORT_DIR | grep -Eo "[0-9]+$")
PORT_ID=$(echo "$PORT_DIR" | grep -Eo "[0-9]+$")
# The PCI directory is two directories up from the port directory
# /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
@ -451,7 +623,7 @@ enclosure_handler () {
# Name our device according to vdev_id.conf (like "L0" or "U1").
NAME=$(awk "/channel/{if (\$1 == \"channel\" && \$2 == \"$PCI_ID\" && \
\$3 == \"$PORT_ID\") {print \$4int(count[\$4])}; count[\$4]++}" $CONFIG)
\$3 == \"$PORT_ID\") {print \$4\$3}}" $CONFIG)
echo "${NAME}"
}
@ -487,9 +659,11 @@ alias_handler () {
# ambiguity seems unavoidable, so devices using this facility
# must not use such names.
DM_PART=
if echo $DM_NAME | grep -q -E 'p[0-9][0-9]*$' ; then
if echo "$DM_NAME" | grep -q -E 'p[0-9][0-9]*$' ; then
if [ "$DEVTYPE" != "partition" ] ; then
DM_PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
# Match p[number], remove the 'p' and prepend "-part"
DM_PART=$(echo "$DM_NAME" |
awk 'match($0,/p[0-9]+$/) {print "-part"substr($0,RSTART+1,RLENGTH-1)}')
fi
fi
@ -497,21 +671,25 @@ alias_handler () {
for link in $DEVLINKS ; do
# Remove partition information to match key of top-level device.
if [ -n "$DM_PART" ] ; then
link=`echo $link | sed 's/p[0-9][0-9]*$//'`
link=$(echo "$link" | sed 's/p[0-9][0-9]*$//')
fi
# Check both the fully qualified and the base name of link.
for l in $link `basename $link` ; do
alias=`awk "\\$1 == \"alias\" && \\$3 == \"${l}\" \
{ print \\$2; exit }" $CONFIG`
for l in $link $(basename "$link") ; do
if [ ! -z "$l" ]; then
alias=$(awk -v var="$l" '($1 == "alias") && \
($3 == var) \
{ print $2; exit }' $CONFIG)
if [ -n "$alias" ] ; then
echo ${alias}${DM_PART}
echo "${alias}${DM_PART}"
return
fi
fi
done
done
}
while getopts 'c:d:eg:mp:h' OPTION; do
# main
while getopts 'c:d:eg:jmp:h' OPTION; do
case ${OPTION} in
c)
CONFIG=${OPTARG}
@ -524,7 +702,9 @@ while getopts 'c:d:eg:mp:h' OPTION; do
# create the enclosure device symlinks only. We also need
# "enclosure_symlinks yes" set in vdev_id.config to actually create the
# symlink.
ENCLOSURE_MODE=$(awk '{if ($1 == "enclosure_symlinks") print $2}' $CONFIG)
ENCLOSURE_MODE=$(awk '{if ($1 == "enclosure_symlinks") \
print $2}' "$CONFIG")
if [ "$ENCLOSURE_MODE" != "yes" ] ; then
exit 0
fi
@ -535,6 +715,9 @@ while getopts 'c:d:eg:mp:h' OPTION; do
p)
PHYS_PER_PORT=${OPTARG}
;;
j)
MULTIJBOD_MODE=yes
;;
m)
MULTIPATH_MODE=yes
;;
@ -544,9 +727,9 @@ while getopts 'c:d:eg:mp:h' OPTION; do
esac
done
if [ ! -r $CONFIG ] ; then
if [ ! -r "$CONFIG" ] ; then
echo "Error: Config file \"$CONFIG\" not found"
exit 0
exit 1
fi
if [ -z "$DEV" ] && [ -z "$ENCLOSURE_MODE" ] ; then
@ -555,11 +738,11 @@ if [ -z "$DEV" ] && [ -z "$ENCLOSURE_MODE" ] ; then
fi
if [ -z "$TOPOLOGY" ] ; then
TOPOLOGY=`awk "\\$1 == \"topology\" {print \\$2; exit}" $CONFIG`
TOPOLOGY=$(awk '($1 == "topology") {print $2; exit}' "$CONFIG")
fi
if [ -z "$BAY" ] ; then
BAY=`awk "\\$1 == \"slot\" {print \\$2; exit}" $CONFIG`
BAY=$(awk '($1 == "slot") {print $2; exit}' "$CONFIG")
fi
TOPOLOGY=${TOPOLOGY:-sas_direct}
@ -572,7 +755,7 @@ if [ "$ENCLOSURE_MODE" = "yes" ] && [ "$TOPOLOGY" = "sas_direct" ] ; then
fi
# Just create the symlinks to the enclosure devices and then exit.
ENCLOSURE_PREFIX=$(awk '/enclosure_symlinks_prefix/{print $2}' $CONFIG)
ENCLOSURE_PREFIX=$(awk '/enclosure_symlinks_prefix/{print $2}' "$CONFIG")
if [ -z "$ENCLOSURE_PREFIX" ] ; then
ENCLOSURE_PREFIX="enc"
fi
@ -582,16 +765,16 @@ if [ "$ENCLOSURE_MODE" = "yes" ] && [ "$TOPOLOGY" = "sas_direct" ] ; then
fi
# First check if an alias was defined for this device.
ID_VDEV=`alias_handler`
ID_VDEV=$(alias_handler)
if [ -z "$ID_VDEV" ] ; then
BAY=${BAY:-bay}
case $TOPOLOGY in
sas_direct|sas_switch)
ID_VDEV=`sas_handler`
ID_VDEV=$(sas_handler)
;;
scsi)
ID_VDEV=`scsi_handler`
ID_VDEV=$(scsi_handler)
;;
*)
echo "Error: unknown topology $TOPOLOGY"

View File

@ -14,3 +14,5 @@ zdb_LDADD = \
$(abs_top_builddir)/lib/libzpool/libzpool.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
include $(top_srcdir)/config/CppCheck.am

View File

@ -161,12 +161,6 @@ static int dump_bpobj_cb(void *arg, const blkptr_t *bp, boolean_t free,
dmu_tx_t *tx);
typedef struct sublivelist_verify {
/* all ALLOC'd blkptr_t in one sub-livelist */
zfs_btree_t sv_all_allocs;
/* all FREE'd blkptr_t in one sub-livelist */
zfs_btree_t sv_all_frees;
/* FREE's that haven't yet matched to an ALLOC, in one sub-livelist */
zfs_btree_t sv_pair;
@ -225,29 +219,68 @@ typedef struct sublivelist_verify_block {
static void zdb_print_blkptr(const blkptr_t *bp, int flags);
typedef struct sublivelist_verify_block_refcnt {
/* block pointer entry in livelist being verified */
blkptr_t svbr_blk;
/*
* Refcount gets incremented to 1 when we encounter the first
* FREE entry for the svfbr block pointer and a node for it
* is created in our ZDB verification/tracking metadata.
*
* As we encounter more FREE entries we increment this counter
* and similarly decrement it whenever we find the respective
* ALLOC entries for this block.
*
* When the refcount gets to 0 it means that all the FREE and
* ALLOC entries of this block have paired up and we no longer
* need to track it in our verification logic (e.g. the node
* containing this struct in our verification data structure
* should be freed).
*
* [refer to sublivelist_verify_blkptr() for the actual code]
*/
uint32_t svbr_refcnt;
} sublivelist_verify_block_refcnt_t;
static int
sublivelist_block_refcnt_compare(const void *larg, const void *rarg)
{
const sublivelist_verify_block_refcnt_t *l = larg;
const sublivelist_verify_block_refcnt_t *r = rarg;
return (livelist_compare(&l->svbr_blk, &r->svbr_blk));
}
static int
sublivelist_verify_blkptr(void *arg, const blkptr_t *bp, boolean_t free,
dmu_tx_t *tx)
{
ASSERT3P(tx, ==, NULL);
struct sublivelist_verify *sv = arg;
char blkbuf[BP_SPRINTF_LEN];
sublivelist_verify_block_refcnt_t current = {
.svbr_blk = *bp,
/*
* Start with 1 in case this is the first free entry.
* This field is not used for our B-Tree comparisons
* anyway.
*/
.svbr_refcnt = 1,
};
zfs_btree_index_t where;
sublivelist_verify_block_refcnt_t *pair =
zfs_btree_find(&sv->sv_pair, &current, &where);
if (free) {
zfs_btree_add(&sv->sv_pair, bp);
/* Check if the FREE is a duplicate */
if (zfs_btree_find(&sv->sv_all_frees, bp, &where) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp,
free);
(void) printf("\tERROR: Duplicate FREE: %s\n", blkbuf);
if (pair == NULL) {
/* first free entry for this block pointer */
zfs_btree_add(&sv->sv_pair, &current);
} else {
zfs_btree_add_idx(&sv->sv_all_frees, bp, &where);
pair->svbr_refcnt++;
}
} else {
/* Check if the ALLOC has been freed */
if (zfs_btree_find(&sv->sv_pair, bp, &where) != NULL) {
zfs_btree_remove_idx(&sv->sv_pair, &where);
} else {
if (pair == NULL) {
/* block that is currently marked as allocated */
for (int i = 0; i < SPA_DVAS_PER_BP; i++) {
if (DVA_IS_EMPTY(&bp->blk_dva[i]))
break;
@ -262,16 +295,16 @@ sublivelist_verify_blkptr(void *arg, const blkptr_t *bp, boolean_t free,
&svb, &where);
}
}
}
/* Check if the ALLOC is a duplicate */
if (zfs_btree_find(&sv->sv_all_allocs, bp, &where) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp,
free);
(void) printf("\tERROR: Duplicate ALLOC: %s\n", blkbuf);
} else {
zfs_btree_add_idx(&sv->sv_all_allocs, bp, &where);
/* alloc matches a free entry */
pair->svbr_refcnt--;
if (pair->svbr_refcnt == 0) {
/* all allocs and frees have been matched */
zfs_btree_remove_idx(&sv->sv_pair, &where);
}
}
}
return (0);
}
@ -279,32 +312,22 @@ static int
sublivelist_verify_func(void *args, dsl_deadlist_entry_t *dle)
{
int err;
char blkbuf[BP_SPRINTF_LEN];
struct sublivelist_verify *sv = args;
zfs_btree_create(&sv->sv_all_allocs, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_all_frees, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_pair, livelist_compare,
sizeof (blkptr_t));
zfs_btree_create(&sv->sv_pair, sublivelist_block_refcnt_compare,
sizeof (sublivelist_verify_block_refcnt_t));
err = bpobj_iterate_nofree(&dle->dle_bpobj, sublivelist_verify_blkptr,
sv, NULL);
zfs_btree_clear(&sv->sv_all_allocs);
zfs_btree_destroy(&sv->sv_all_allocs);
zfs_btree_clear(&sv->sv_all_frees);
zfs_btree_destroy(&sv->sv_all_frees);
blkptr_t *e;
sublivelist_verify_block_refcnt_t *e;
zfs_btree_index_t *cookie = NULL;
while ((e = zfs_btree_destroy_nodes(&sv->sv_pair, &cookie)) != NULL) {
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), e, B_TRUE);
(void) printf("\tERROR: Unmatched FREE: %s\n", blkbuf);
char blkbuf[BP_SPRINTF_LEN];
snprintf_blkptr_compact(blkbuf, sizeof (blkbuf),
&e->svbr_blk, B_TRUE);
(void) printf("\tERROR: %d unmatched FREE(s): %s\n",
e->svbr_refcnt, blkbuf);
}
zfs_btree_destroy(&sv->sv_pair);
@ -613,10 +636,14 @@ mv_populate_livelist_allocs(metaslab_verify_t *mv, sublivelist_verify_t *sv)
/*
* [Livelist Check]
* Iterate through all the sublivelists and:
* - report leftover frees
* - report double ALLOCs/FREEs
* - report leftover frees (**)
* - record leftover ALLOCs together with their TXG [see Cross Check]
*
* (**) Note: Double ALLOCs are valid in datasets that have dedup
* enabled. Similarly double FREEs are allowed as well but
* only if they pair up with a corresponding ALLOC entry once
* we our done with our sublivelist iteration.
*
* [Spacemap Check]
* for each metaslab:
* - iterate over spacemap and then the metaslab's entries in the
@ -2184,7 +2211,8 @@ snprintf_zstd_header(spa_t *spa, char *blkbuf, size_t buflen,
(void) snprintf(blkbuf + strlen(blkbuf),
buflen - strlen(blkbuf),
" ZSTD:size=%u:version=%u:level=%u:EMBEDDED",
zstd_hdr.c_len, zstd_hdr.version, zstd_hdr.level);
zstd_hdr.c_len, zfs_get_hdrversion(&zstd_hdr),
zfs_get_hdrlevel(&zstd_hdr));
return;
}
@ -2208,7 +2236,8 @@ snprintf_zstd_header(spa_t *spa, char *blkbuf, size_t buflen,
(void) snprintf(blkbuf + strlen(blkbuf),
buflen - strlen(blkbuf),
" ZSTD:size=%u:version=%u:level=%u:NORMAL",
zstd_hdr.c_len, zstd_hdr.version, zstd_hdr.level);
zstd_hdr.c_len, zfs_get_hdrversion(&zstd_hdr),
zfs_get_hdrlevel(&zstd_hdr));
abd_return_buf_copy(pabd, buf, BP_GET_LSIZE(bp));
}
@ -4060,7 +4089,7 @@ cksum_record_compare(const void *x1, const void *x2)
const cksum_record_t *l = (cksum_record_t *)x1;
const cksum_record_t *r = (cksum_record_t *)x2;
int arraysize = ARRAY_SIZE(l->cksum.zc_word);
int difference;
int difference = 0;
for (int i = 0; i < arraysize; i++) {
difference = TREE_CMP(l->cksum.zc_word[i], r->cksum.zc_word[i]);
@ -4535,7 +4564,7 @@ dump_path_impl(objset_t *os, uint64_t obj, char *name)
case DMU_OT_DIRECTORY_CONTENTS:
if (s != NULL && *(s + 1) != '\0')
return (dump_path_impl(os, child_obj, s + 1));
/*FALLTHROUGH*/
fallthrough;
case DMU_OT_PLAIN_FILE_CONTENTS:
dump_object(os, child_obj, dump_opt['v'], &header, NULL, 0);
return (0);
@ -5840,7 +5869,8 @@ zdb_leak_init_prepare_indirect_vdevs(spa_t *spa, zdb_cb_t *zcb)
*/
VERIFY0(vdev_metaslab_init(vd, 0));
vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
vdev_indirect_mapping_t *vim __maybe_unused =
vd->vdev_indirect_mapping;
uint64_t vim_idx = 0;
for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
@ -6933,7 +6963,7 @@ verify_checkpoint_vdev_spacemaps(spa_t *checkpoint, spa_t *current)
for (uint64_t c = ckpoint_rvd->vdev_children;
c < current_rvd->vdev_children; c++) {
vdev_t *current_vd = current_rvd->vdev_child[c];
ASSERT3P(current_vd->vdev_checkpoint_sm, ==, NULL);
VERIFY3P(current_vd->vdev_checkpoint_sm, ==, NULL);
}
}

View File

@ -47,3 +47,5 @@ zed_LDADD += -lrt $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDFLAGS = -pthread
EXTRA_DIST = agents/README.md
include $(top_srcdir)/config/CppCheck.am

View File

@ -1,21 +1,21 @@
#!/bin/sh
#
# Turn off/on the VDEV's enclosure fault LEDs when the pool's state changes.
# Turn off/on vdevs' enclosure fault LEDs when their pool's state changes.
#
# Turn the VDEV's fault LED on if it becomes FAULTED, DEGRADED or UNAVAIL.
# Turn the LED off when it's back ONLINE again.
# Turn a vdev's fault LED on if it becomes FAULTED, DEGRADED or UNAVAIL.
# Turn its LED off when it's back ONLINE again.
#
# This script run in two basic modes:
#
# 1. If $ZEVENT_VDEV_ENC_SYSFS_PATH and $ZEVENT_VDEV_STATE_STR are set, then
# only set the LED for that particular VDEV. This is the case for statechange
# only set the LED for that particular vdev. This is the case for statechange
# events and some vdev_* events.
#
# 2. If those vars are not set, then check the state of all VDEVs in the pool
# 2. If those vars are not set, then check the state of all vdevs in the pool
# and set the LEDs accordingly. This is the case for pool_import events.
#
# Note that this script requires that your enclosure be supported by the
# Linux SCSI enclosure services (ses) driver. The script will do nothing
# Linux SCSI Enclosure services (SES) driver. The script will do nothing
# if you have no enclosure, or if your enclosure isn't supported.
#
# Exit codes:
@ -29,7 +29,8 @@
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
if [ ! -d /sys/class/enclosure ] ; then
if [ ! -d /sys/class/enclosure ] && [ ! -d /sys/bus/pci/slots ] ; then
# No JBOD enclosure or NVMe slots
exit 1
fi
@ -59,6 +60,10 @@ check_and_set_led()
file="$1"
val="$2"
if [ -z "$val" ]; then
return 0
fi
if [ ! -e "$file" ] ; then
return 3
fi
@ -66,11 +71,11 @@ check_and_set_led()
# If another process is accessing the LED when we attempt to update it,
# the update will be lost so retry until the LED actually changes or we
# timeout.
for _ in $(seq 1 5); do
for _ in 1 2 3 4 5; do
# We want to check the current state first, since writing to the
# 'fault' entry always causes a SES command, even if the
# current state is already what you want.
current=$(cat "${file}")
read -r current < "${file}"
# On some enclosures if you write 1 to fault, and read it back,
# it will return 2. Treat all non-zero values as 1 for
@ -88,24 +93,81 @@ check_and_set_led()
done
}
state_to_val()
# Fault LEDs for JBODs and NVMe drives are handled a little differently.
#
# On JBODs the fault LED is called 'fault' and on a path like this:
#
# /sys/class/enclosure/0:0:1:0/SLOT 10/fault
#
# On NVMe it's called 'attention' and on a path like this:
#
# /sys/bus/pci/slot/0/attention
#
# This function returns the full path to the fault LED file for a given
# enclosure/slot directory.
#
path_to_led()
{
state="$1"
if [ "$state" = "FAULTED" ] || [ "$state" = "DEGRADED" ] || \
[ "$state" = "UNAVAIL" ] ; then
echo 1
elif [ "$state" = "ONLINE" ] ; then
echo 0
dir=$1
if [ -f "$dir/fault" ] ; then
echo "$dir/fault"
elif [ -f "$dir/attention" ] ; then
echo "$dir/attention"
fi
}
# process_pool ([pool])
state_to_val()
{
state="$1"
case "$state" in
FAULTED|DEGRADED|UNAVAIL)
echo 1
;;
ONLINE)
echo 0
;;
esac
}
#
# Iterate through a pool (or pools) and set the VDEV's enclosure slot LEDs to
# the VDEV's state.
# Given a nvme name like 'nvme0n1', pass back its slot directory
# like "/sys/bus/pci/slots/0"
#
nvme_dev_to_slot()
{
dev="$1"
# Get the address "0000:01:00.0"
address=$(cat "/sys/class/block/$dev/device/address")
# For each /sys/bus/pci/slots subdir that is an actual number
# (rather than weird directories like "1-3/").
# shellcheck disable=SC2010
for i in $(ls /sys/bus/pci/slots/ | grep -E "^[0-9]+$") ; do
this_address=$(cat "/sys/bus/pci/slots/$i/address")
# The format of address is a little different between
# /sys/class/block/$dev/device/address and
# /sys/bus/pci/slots/
#
# address= "0000:01:00.0"
# this_address = "0000:01:00"
#
if echo "$address" | grep -Eq ^"$this_address" ; then
echo "/sys/bus/pci/slots/$i"
break
fi
done
}
# process_pool (pool)
#
# Iterate through a pool and set the vdevs' enclosure slot LEDs to
# those vdevs' state.
#
# Arguments
# pool: Optional pool name. If not specified, iterate though all pools.
# pool: Pool name.
#
# Return
# 0 on success, 3 on missing sysfs path
@ -113,19 +175,27 @@ state_to_val()
process_pool()
{
pool="$1"
# The output will be the vdevs only (from "grep '/dev/'"):
#
# U45 ONLINE 0 0 0 /dev/sdk 0
# U46 ONLINE 0 0 0 /dev/sdm 0
# U47 ONLINE 0 0 0 /dev/sdn 0
# U50 ONLINE 0 0 0 /dev/sdbn 0
#
ZPOOL_SCRIPTS_AS_ROOT=1 $ZPOOL status -c upath,fault_led "$pool" | grep '/dev/' | (
rc=0
# Lookup all the current LED values and paths in parallel
#shellcheck disable=SC2016
cmd='echo led_token=$(cat "$VDEV_ENC_SYSFS_PATH/fault"),"$VDEV_ENC_SYSFS_PATH",'
out=$($ZPOOL status -vc "$cmd" "$pool" | grep 'led_token=')
#shellcheck disable=SC2034
echo "$out" | while read -r vdev state read write chksum therest; do
while read -r vdev state _ _ _ therest; do
# Read out current LED value and path
tmp=$(echo "$therest" | sed 's/^.*led_token=//g')
vdev_enc_sysfs_path=$(echo "$tmp" | awk -F ',' '{print $2}')
current_val=$(echo "$tmp" | awk -F ',' '{print $1}')
# Get dev name (like 'sda')
dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
vdev_enc_sysfs_path=$(realpath "/sys/class/block/$dev/device/enclosure_device"*)
if [ ! -d "$vdev_enc_sysfs_path" ] ; then
# This is not a JBOD disk, but it could be a PCI NVMe drive
vdev_enc_sysfs_path=$(nvme_dev_to_slot "$dev")
fi
current_val=$(echo "$therest" | awk '{print $NF}')
if [ "$current_val" != "0" ] ; then
current_val=1
@ -136,40 +206,33 @@ process_pool()
continue
fi
if [ ! -e "$vdev_enc_sysfs_path/fault" ] ; then
#shellcheck disable=SC2030
rc=1
zed_log_msg "vdev $vdev '$file/fault' doesn't exist"
continue;
led_path=$(path_to_led "$vdev_enc_sysfs_path")
if [ ! -e "$led_path" ] ; then
rc=3
zed_log_msg "vdev $vdev '$led_path' doesn't exist"
continue
fi
val=$(state_to_val "$state")
if [ "$current_val" = "$val" ] ; then
# LED is already set correctly
continue;
continue
fi
if ! check_and_set_led "$vdev_enc_sysfs_path/fault" "$val"; then
rc=1
if ! check_and_set_led "$led_path" "$val"; then
rc=3
fi
done
#shellcheck disable=SC2031
if [ "$rc" = "0" ] ; then
return 0
else
# We didn't see a sysfs entry that we wanted to set
return 3
fi
exit "$rc"; )
}
if [ -n "$ZEVENT_VDEV_ENC_SYSFS_PATH" ] && [ -n "$ZEVENT_VDEV_STATE_STR" ] ; then
# Got a statechange for an individual VDEV
# Got a statechange for an individual vdev
val=$(state_to_val "$ZEVENT_VDEV_STATE_STR")
vdev=$(basename "$ZEVENT_VDEV_PATH")
check_and_set_led "$ZEVENT_VDEV_ENC_SYSFS_PATH/fault" "$val"
ledpath=$(path_to_led "$ZEVENT_VDEV_ENC_SYSFS_PATH")
check_and_set_led "$ledpath" "$val"
else
# Process the entire pool
poolname=$(zed_guid_to_pool "$ZEVENT_POOL_GUID")

View File

@ -89,8 +89,8 @@
##
# Turn on/off enclosure LEDs when drives get DEGRADED/FAULTED. This works for
# device mapper and multipath devices as well. Your enclosure must be
# supported by the Linux SES driver for this to work.
# device mapper and multipath devices as well. This works with JBOD enclosures
# and NVMe PCI drives (assuming they're supported by Linux in sysfs).
#
ZED_USE_ENCLOSURE_LEDS=1

View File

@ -173,6 +173,7 @@ _zed_exec_fork_child(uint64_t eid, const char *dir, const char *prog,
zed_log_msg(LOG_WARNING, "Killing hung \"%s\" pid=%d",
prog, pid);
(void) kill(pid, SIGKILL);
(void) waitpid(pid, &status, 0);
}
}

View File

@ -12,11 +12,11 @@
* You may not use this file except in compliance with the license.
*/
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
@ -160,6 +160,13 @@ zed_file_is_locked(int fd)
return (lock.l_pid);
}
#if __APPLE__
#define PROC_SELF_FD "/dev/fd"
#else /* Linux-compatible layout */
#define PROC_SELF_FD "/proc/self/fd"
#endif
/*
* Close all open file descriptors greater than or equal to [lowfd].
* Any errors encountered while closing file descriptors are ignored.
@ -167,20 +174,21 @@ zed_file_is_locked(int fd)
void
zed_file_close_from(int lowfd)
{
const int maxfd_def = 256;
int errno_bak;
struct rlimit rl;
int maxfd;
int errno_bak = errno;
int maxfd = 0;
int fd;
DIR *fddir;
struct dirent *fdent;
errno_bak = errno;
if (getrlimit(RLIMIT_NOFILE, &rl) < 0) {
maxfd = maxfd_def;
} else if (rl.rlim_max == RLIM_INFINITY) {
maxfd = maxfd_def;
if ((fddir = opendir(PROC_SELF_FD)) != NULL) {
while ((fdent = readdir(fddir)) != NULL) {
fd = atoi(fdent->d_name);
if (fd > maxfd && fd != dirfd(fddir))
maxfd = fd;
}
(void) closedir(fddir);
} else {
maxfd = rl.rlim_max;
maxfd = sysconf(_SC_OPEN_MAX);
}
for (fd = lowfd; fd < maxfd; fd++)
(void) close(fd);

View File

@ -21,3 +21,5 @@ zfs_LDADD += $(LTLIBINTL)
if BUILD_FREEBSD
zfs_LDADD += -lgeom -ljail
endif
include $(top_srcdir)/config/CppCheck.am

View File

@ -730,6 +730,32 @@ finish_progress(char *done)
pt_header = NULL;
}
/* This function checks if the passed fd refers to /dev/null or /dev/zero */
#ifdef __linux__
static boolean_t
is_dev_nullzero(int fd)
{
struct stat st;
fstat(fd, &st);
return (major(st.st_rdev) == 1 && (minor(st.st_rdev) == 3 /* null */ ||
minor(st.st_rdev) == 5 /* zero */));
}
#endif
static void
note_dev_error(int err, int fd)
{
#ifdef __linux__
if (err == EINVAL && is_dev_nullzero(fd)) {
(void) fprintf(stderr,
gettext("Error: Writing directly to /dev/{null,zero} files"
" on certain kernels is not currently implemented.\n"
"(As a workaround, "
"try \"zfs send [...] | cat > /dev/null\")\n"));
}
#endif
}
static int
zfs_mount_and_share(libzfs_handle_t *hdl, const char *dataset, zfs_type_t type)
{
@ -4448,11 +4474,16 @@ zfs_do_send(int argc, char **argv)
err = zfs_send_saved(zhp, &flags, STDOUT_FILENO,
resume_token);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
zfs_close(zhp);
return (err != 0);
} else if (resume_token != NULL) {
return (zfs_send_resume(g_zfs, &flags, STDOUT_FILENO,
resume_token));
err = zfs_send_resume(g_zfs, &flags, STDOUT_FILENO,
resume_token);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
return (err);
}
/*
@ -4496,6 +4527,8 @@ zfs_do_send(int argc, char **argv)
err = zfs_send_one(zhp, fromname, STDOUT_FILENO, &flags,
redactbook);
zfs_close(zhp);
if (err != 0)
note_dev_error(errno, STDOUT_FILENO);
return (err != 0);
}
@ -4572,6 +4605,7 @@ zfs_do_send(int argc, char **argv)
nvlist_free(dbgnv);
}
zfs_close(zhp);
note_dev_error(errno, STDOUT_FILENO);
return (err != 0);
}
@ -6790,9 +6824,6 @@ report_mount_progress(int current, int total)
time_t now = time(NULL);
char info[32];
/* report 1..n instead of 0..n-1 */
++current;
/* display header if we're here for the first time */
if (current == 1) {
set_progress_header(gettext("Mounting ZFS filesystems"));
@ -7313,6 +7344,7 @@ unshare_unmount(int op, int argc, char **argv)
if (zfs_prop_get_int(zhp, ZFS_PROP_CANMOUNT) ==
ZFS_CANMOUNT_NOAUTO)
continue;
break;
default:
break;
}

View File

@ -7,3 +7,5 @@ zfs_ids_to_path_SOURCES = \
zfs_ids_to_path_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la
include $(top_srcdir)/config/CppCheck.am

View File

@ -3,3 +3,5 @@ include $(top_srcdir)/config/Rules.am
sbin_PROGRAMS = zgenhostid
zgenhostid_SOURCES = zgenhostid.c
include $(top_srcdir)/config/CppCheck.am

View File

@ -12,3 +12,5 @@ zhack_LDADD = \
$(abs_top_builddir)/lib/libzpool/libzpool.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
include $(top_srcdir)/config/CppCheck.am

View File

@ -11,3 +11,5 @@ zinject_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
include $(top_srcdir)/config/CppCheck.am

View File

@ -25,7 +25,8 @@ zpool_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la \
$(abs_top_builddir)/lib/libuutil/libuutil.la
$(abs_top_builddir)/lib/libuutil/libuutil.la \
$(abs_top_builddir)/lib/libzutil/libzutil.la
zpool_LDADD += $(LTLIBINTL)
@ -34,6 +35,8 @@ zpool_LDADD += -lgeom
endif
zpool_LDADD += -lm $(LIBBLKID_LIBS) $(LIBUUID_LIBS)
include $(top_srcdir)/config/CppCheck.am
zpoolconfdir = $(sysconfdir)/zfs/zpool.d
zpoolexecdir = $(zfsexecdir)/zpool.d

View File

@ -41,7 +41,13 @@ for i in $scripts ; do
val=$(ls "$VDEV_ENC_SYSFS_PATH/../device/scsi_generic" 2>/dev/null)
;;
fault_led)
# JBODs fault LED is called 'fault', NVMe fault LED is called
# 'attention'.
if [ -f "$VDEV_ENC_SYSFS_PATH/fault" ] ; then
val=$(cat "$VDEV_ENC_SYSFS_PATH/fault" 2>/dev/null)
elif [ -f "$VDEV_ENC_SYSFS_PATH/attention" ] ; then
val=$(cat "$VDEV_ENC_SYSFS_PATH/attention" 2>/dev/null)
fi
;;
locate_led)
val=$(cat "$VDEV_ENC_SYSFS_PATH/locate" 2>/dev/null)

View File

@ -264,51 +264,6 @@ for_each_pool(int argc, char **argv, boolean_t unavail,
return (ret);
}
static int
for_each_vdev_cb(zpool_handle_t *zhp, nvlist_t *nv, pool_vdev_iter_f func,
void *data)
{
nvlist_t **child;
uint_t c, children;
int ret = 0;
int i;
char *type;
const char *list[] = {
ZPOOL_CONFIG_SPARES,
ZPOOL_CONFIG_L2CACHE,
ZPOOL_CONFIG_CHILDREN
};
for (i = 0; i < ARRAY_SIZE(list); i++) {
if (nvlist_lookup_nvlist_array(nv, list[i], &child,
&children) == 0) {
for (c = 0; c < children; c++) {
uint64_t ishole = 0;
(void) nvlist_lookup_uint64(child[c],
ZPOOL_CONFIG_IS_HOLE, &ishole);
if (ishole)
continue;
ret |= for_each_vdev_cb(zhp, child[c], func,
data);
}
}
}
if (nvlist_lookup_string(nv, ZPOOL_CONFIG_TYPE, &type) != 0)
return (ret);
/* Don't run our function on root vdevs */
if (strcmp(type, VDEV_TYPE_ROOT) != 0) {
ret |= func(zhp, nv, data);
}
return (ret);
}
/*
* This is the equivalent of for_each_pool() for vdevs. It iterates thorough
* all vdevs in the pool, ignoring root vdevs and holes, calling func() on
@ -327,7 +282,7 @@ for_each_vdev(zpool_handle_t *zhp, pool_vdev_iter_f func, void *data)
verify(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvroot) == 0);
}
return (for_each_vdev_cb(zhp, nvroot, func, data));
return (for_each_vdev_cb((void *) zhp, nvroot, func, data));
}
/*
@ -598,7 +553,7 @@ vdev_run_cmd_thread(void *cb_cmd_data)
/* For each vdev in the pool run a command */
static int
for_each_vdev_run_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_vcdl)
for_each_vdev_run_cb(void *zhp_data, nvlist_t *nv, void *cb_vcdl)
{
vdev_cmd_data_list_t *vcdl = cb_vcdl;
vdev_cmd_data_t *data;
@ -606,6 +561,7 @@ for_each_vdev_run_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_vcdl)
char *vname = NULL;
char *vdev_enc_sysfs_path = NULL;
int i, match = 0;
zpool_handle_t *zhp = zhp_data;
if (nvlist_lookup_string(nv, ZPOOL_CONFIG_PATH, &path) != 0)
return (1);

View File

@ -2586,8 +2586,8 @@ print_class_vdevs(zpool_handle_t *zhp, status_cbdata_t *cb, nvlist_t *nv,
/*
* Display the status for the given pool.
*/
static void
show_import(nvlist_t *config)
static int
show_import(nvlist_t *config, boolean_t report_error)
{
uint64_t pool_state;
vdev_stat_t *vs;
@ -2619,6 +2619,13 @@ show_import(nvlist_t *config)
reason = zpool_import_status(config, &msgid, &errata);
/*
* If we're importing using a cachefile, then we won't report any
* errors unless we are in the scan phase of the import.
*/
if (reason != ZPOOL_STATUS_OK && !report_error)
return (reason);
(void) printf(gettext(" pool: %s\n"), name);
(void) printf(gettext(" id: %llu\n"), (u_longlong_t)guid);
(void) printf(gettext(" state: %s"), health);
@ -2933,6 +2940,7 @@ show_import(nvlist_t *config)
"be part of this pool, though their\n\texact "
"configuration cannot be determined.\n"));
}
return (0);
}
static boolean_t
@ -3071,6 +3079,121 @@ do_import(nvlist_t *config, const char *newname, const char *mntopts,
return (ret);
}
static int
import_pools(nvlist_t *pools, nvlist_t *props, char *mntopts, int flags,
char *orig_name, char *new_name,
boolean_t do_destroyed, boolean_t pool_specified, boolean_t do_all,
importargs_t *import)
{
nvlist_t *config = NULL;
nvlist_t *found_config = NULL;
uint64_t pool_state;
/*
* At this point we have a list of import candidate configs. Even if
* we were searching by pool name or guid, we still need to
* post-process the list to deal with pool state and possible
* duplicate names.
*/
int err = 0;
nvpair_t *elem = NULL;
boolean_t first = B_TRUE;
while ((elem = nvlist_next_nvpair(pools, elem)) != NULL) {
verify(nvpair_value_nvlist(elem, &config) == 0);
verify(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_STATE,
&pool_state) == 0);
if (!do_destroyed && pool_state == POOL_STATE_DESTROYED)
continue;
if (do_destroyed && pool_state != POOL_STATE_DESTROYED)
continue;
verify(nvlist_add_nvlist(config, ZPOOL_LOAD_POLICY,
import->policy) == 0);
if (!pool_specified) {
if (first)
first = B_FALSE;
else if (!do_all)
(void) printf("\n");
if (do_all) {
err |= do_import(config, NULL, mntopts,
props, flags);
} else {
/*
* If we're importing from cachefile, then
* we don't want to report errors until we
* are in the scan phase of the import. If
* we get an error, then we return that error
* to invoke the scan phase.
*/
if (import->cachefile && !import->scan)
err = show_import(config, B_FALSE);
else
(void) show_import(config, B_TRUE);
}
} else if (import->poolname != NULL) {
char *name;
/*
* We are searching for a pool based on name.
*/
verify(nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &name) == 0);
if (strcmp(name, import->poolname) == 0) {
if (found_config != NULL) {
(void) fprintf(stderr, gettext(
"cannot import '%s': more than "
"one matching pool\n"),
import->poolname);
(void) fprintf(stderr, gettext(
"import by numeric ID instead\n"));
err = B_TRUE;
}
found_config = config;
}
} else {
uint64_t guid;
/*
* Search for a pool by guid.
*/
verify(nvlist_lookup_uint64(config,
ZPOOL_CONFIG_POOL_GUID, &guid) == 0);
if (guid == import->guid)
found_config = config;
}
}
/*
* If we were searching for a specific pool, verify that we found a
* pool, and then do the import.
*/
if (pool_specified && err == 0) {
if (found_config == NULL) {
(void) fprintf(stderr, gettext("cannot import '%s': "
"no such pool available\n"), orig_name);
err = B_TRUE;
} else {
err |= do_import(found_config, new_name,
mntopts, props, flags);
}
}
/*
* If we were just looking for pools, report an error if none were
* found.
*/
if (!pool_specified && first)
(void) fprintf(stderr,
gettext("no pools available to import\n"));
return (err);
}
typedef struct target_exists_args {
const char *poolname;
uint64_t poolguid;
@ -3198,12 +3321,15 @@ zpool_do_checkpoint(int argc, char **argv)
/*
* zpool import [-d dir] [-D]
* import [-o mntopts] [-o prop=value] ... [-R root] [-D] [-l]
* [-d dir | -c cachefile] [-f] -a
* [-d dir | -c cachefile | -s] [-f] -a
* import [-o mntopts] [-o prop=value] ... [-R root] [-D] [-l]
* [-d dir | -c cachefile] [-f] [-n] [-F] <pool | id> [newpool]
* [-d dir | -c cachefile | -s] [-f] [-n] [-F] <pool | id>
* [newpool]
*
* -c Read pool information from a cachefile instead of searching
* devices.
* devices. If importing from a cachefile config fails, then
* fallback to searching for devices only in the directories that
* exist in the cachefile.
*
* -d Scan in a specific directory, other than /dev/. More than
* one directory can be specified using multiple '-d' options.
@ -3259,15 +3385,11 @@ zpool_do_import(int argc, char **argv)
boolean_t do_all = B_FALSE;
boolean_t do_destroyed = B_FALSE;
char *mntopts = NULL;
nvpair_t *elem;
nvlist_t *config;
uint64_t searchguid = 0;
char *searchname = NULL;
char *propval;
nvlist_t *found_config;
nvlist_t *policy = NULL;
nvlist_t *props = NULL;
boolean_t first;
int flags = ZFS_IMPORT_NORMAL;
uint32_t rewind_policy = ZPOOL_NO_REWIND;
boolean_t dryrun = B_FALSE;
@ -3275,7 +3397,8 @@ zpool_do_import(int argc, char **argv)
boolean_t xtreme_rewind = B_FALSE;
boolean_t do_scan = B_FALSE;
boolean_t pool_exists = B_FALSE;
uint64_t pool_state, txg = -1ULL;
boolean_t pool_specified = B_FALSE;
uint64_t txg = -1ULL;
char *cachefile = NULL;
importargs_t idata = { 0 };
char *endptr;
@ -3397,6 +3520,11 @@ zpool_do_import(int argc, char **argv)
usage(B_FALSE);
}
if (cachefile && do_scan) {
(void) fprintf(stderr, gettext("-c is incompatible with -s\n"));
usage(B_FALSE);
}
if ((flags & ZFS_IMPORT_LOAD_KEYS) && (flags & ZFS_IMPORT_ONLY)) {
(void) fprintf(stderr, gettext("-l is incompatible with -N\n"));
usage(B_FALSE);
@ -3477,7 +3605,7 @@ zpool_do_import(int argc, char **argv)
searchname = argv[0];
searchguid = 0;
}
found_config = NULL;
pool_specified = B_TRUE;
/*
* User specified a name or guid. Ensure it's unique.
@ -3556,97 +3684,32 @@ zpool_do_import(int argc, char **argv)
return (1);
}
/*
* At this point we have a list of import candidate configs. Even if
* we were searching by pool name or guid, we still need to
* post-process the list to deal with pool state and possible
* duplicate names.
*/
err = 0;
elem = NULL;
first = B_TRUE;
while ((elem = nvlist_next_nvpair(pools, elem)) != NULL) {
verify(nvpair_value_nvlist(elem, &config) == 0);
verify(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_STATE,
&pool_state) == 0);
if (!do_destroyed && pool_state == POOL_STATE_DESTROYED)
continue;
if (do_destroyed && pool_state != POOL_STATE_DESTROYED)
continue;
verify(nvlist_add_nvlist(config, ZPOOL_LOAD_POLICY,
policy) == 0);
if (argc == 0) {
if (first)
first = B_FALSE;
else if (!do_all)
(void) printf("\n");
if (do_all) {
err |= do_import(config, NULL, mntopts,
props, flags);
} else {
show_import(config);
}
} else if (searchname != NULL) {
char *name;
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
/*
* We are searching for a pool based on name.
* If we're using the cachefile and we failed to import, then
* fallback to scanning the directory for pools that match
* those in the cachefile.
*/
verify(nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &name) == 0);
if (strcmp(name, searchname) == 0) {
if (found_config != NULL) {
(void) fprintf(stderr, gettext(
"cannot import '%s': more than "
"one matching pool\n"), searchname);
(void) fprintf(stderr, gettext(
"import by numeric ID instead\n"));
err = B_TRUE;
}
found_config = config;
}
} else {
uint64_t guid;
if (err != 0 && cachefile != NULL) {
(void) printf(gettext("cachefile import failed, retrying\n"));
/*
* Search for a pool by guid.
* We use the scan flag to gather the directories that exist
* in the cachefile. If we need to fallback to searching for
* the pool config, we will only search devices in these
* directories.
*/
verify(nvlist_lookup_uint64(config,
ZPOOL_CONFIG_POOL_GUID, &guid) == 0);
idata.scan = B_TRUE;
nvlist_free(pools);
pools = zpool_search_import(g_zfs, &idata, &libzfs_config_ops);
if (guid == searchguid)
found_config = config;
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
}
}
/*
* If we were searching for a specific pool, verify that we found a
* pool, and then do the import.
*/
if (argc != 0 && err == 0) {
if (found_config == NULL) {
(void) fprintf(stderr, gettext("cannot import '%s': "
"no such pool available\n"), argv[0]);
err = B_TRUE;
} else {
err |= do_import(found_config, argc == 1 ? NULL :
argv[1], mntopts, props, flags);
}
}
/*
* If we were just looking for pools, report an error if none were
* found.
*/
if (argc == 0 && first)
(void) fprintf(stderr,
gettext("no pools available to import\n"));
error:
nvlist_free(props);
@ -4852,8 +4915,8 @@ get_interval_count(int *argcp, char **argv, float *iv,
if (*end == '\0' && errno == 0) {
if (interval == 0) {
(void) fprintf(stderr, gettext("interval "
"cannot be zero\n"));
(void) fprintf(stderr, gettext(
"interval cannot be zero\n"));
usage(B_FALSE);
}
/*
@ -4883,8 +4946,8 @@ get_interval_count(int *argcp, char **argv, float *iv,
if (*end == '\0' && errno == 0) {
if (interval == 0) {
(void) fprintf(stderr, gettext("interval "
"cannot be zero\n"));
(void) fprintf(stderr, gettext(
"interval cannot be zero\n"));
usage(B_FALSE);
}
@ -4986,11 +5049,12 @@ get_stat_flags(zpool_list_t *list)
* Return 1 if cb_data->cb_vdev_names[0] is this vdev's name, 0 otherwise.
*/
static int
is_vdev_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_data)
is_vdev_cb(void *zhp_data, nvlist_t *nv, void *cb_data)
{
iostat_cbdata_t *cb = cb_data;
char *name = NULL;
int ret = 0;
zpool_handle_t *zhp = zhp_data;
name = zpool_vdev_name(g_zfs, zhp, nv, cb->cb_name_flags);
@ -5887,7 +5951,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
break;
case ZPOOL_PROP_HEALTH:
width = 8;
snprintf(propval, sizeof (propval), "%-*s", (int)width, str);
(void) strlcpy(propval, str, sizeof (propval));
break;
default:
zfs_nicenum_format(value, propval, sizeof (propval), format);
@ -7735,8 +7799,8 @@ print_removal_status(zpool_handle_t *zhp, pool_removal_stat_t *prs)
* do not print estimated time if hours_left is more than
* 30 days
*/
(void) printf(gettext(" %s copied out of %s at %s/s, "
"%.2f%% done"),
(void) printf(gettext(
"\t%s copied out of %s at %s/s, %.2f%% done"),
examined_buf, total_buf, rate_buf, 100 * fraction_done);
if (hours_left < (30 * 24)) {
(void) printf(gettext(", %lluh%um to go\n"),
@ -7751,8 +7815,8 @@ print_removal_status(zpool_handle_t *zhp, pool_removal_stat_t *prs)
if (prs->prs_mapping_memory > 0) {
char mem_buf[7];
zfs_nicenum(prs->prs_mapping_memory, mem_buf, sizeof (mem_buf));
(void) printf(gettext(" %s memory used for "
"removed device mappings\n"),
(void) printf(gettext(
"\t%s memory used for removed device mappings\n"),
mem_buf);
}
}

View File

@ -27,6 +27,7 @@
#include <libnvpair.h>
#include <libzfs.h>
#include <libzutil.h>
#ifdef __cplusplus
extern "C" {
@ -67,7 +68,6 @@ int for_each_pool(int, char **, boolean_t unavail, zprop_list_t **,
boolean_t, zpool_iter_f, void *);
/* Vdev list functions */
typedef int (*pool_vdev_iter_f)(zpool_handle_t *, nvlist_t *, void *);
int for_each_vdev(zpool_handle_t *zhp, pool_vdev_iter_f func, void *data);
typedef struct zpool_list zpool_list_t;

View File

@ -13,3 +13,5 @@ zstream_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
include $(top_srcdir)/config/CppCheck.am

View File

@ -248,6 +248,7 @@ zfs_redup_stream(int infd, int outfd, boolean_t verbose)
fflags = DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
fflags &= ~(DMU_BACKUP_FEATURE_DEDUP |
DMU_BACKUP_FEATURE_DEDUPPROPS);
/* cppcheck-suppress syntaxError */
DMU_SET_FEATUREFLAGS(drrb->drr_versioninfo, fflags);
int sz = drr->drr_payloadlen;

View File

@ -21,3 +21,5 @@ ztest_LDADD = \
ztest_LDADD += -lm
ztest_LDFLAGS = -pthread
include $(top_srcdir)/config/CppCheck.am

View File

@ -132,7 +132,7 @@
#include <libnvpair.h>
#include <libzutil.h>
#include <sys/crypto/icp.h>
#ifdef __GLIBC__
#if (__GLIBC__ && !__UCLIBC__)
#include <execinfo.h> /* for backtrace() */
#endif
@ -556,7 +556,7 @@ dump_debug_buffer(void)
static void sig_handler(int signo)
{
struct sigaction action;
#ifdef __GLIBC__ /* backtrace() is a GNU extension */
#if (__GLIBC__ && !__UCLIBC__) /* backtrace() is a GNU extension */
int nptrs;
void *buffer[BACKTRACE_SZ];
@ -2163,8 +2163,8 @@ ztest_get_done(zgd_t *zgd, int error)
}
static int
ztest_get_data(void *arg, lr_write_t *lr, char *buf, struct lwb *lwb,
zio_t *zio)
ztest_get_data(void *arg, uint64_t arg2, lr_write_t *lr, char *buf,
struct lwb *lwb, zio_t *zio)
{
ztest_ds_t *zd = arg;
objset_t *os = zd->zd_os;

View File

@ -8,3 +8,5 @@ udev_PROGRAMS = zvol_id
zvol_id_SOURCES = \
zvol_id_main.c
include $(top_srcdir)/config/CppCheck.am

11
config/CppCheck.am Normal file
View File

@ -0,0 +1,11 @@
#
# Default rules for running cppcheck against the the user space components.
#
PHONY += cppcheck
CPPCHECKFLAGS = --std=c99 --quiet --max-configs=1 --error-exitcode=2
CPPCHECKFLAGS += --inline-suppr -U_KERNEL
cppcheck:
$(CPPCHECK) -j$(CPU_COUNT) $(CPPCHECKFLAGS) $(DEFAULT_INCLUDES) $(SOURCES)

View File

@ -3,6 +3,7 @@
# should include these rules and override or extend them as needed.
#
PHONY =
DEFAULT_INCLUDES = \
-include $(top_builddir)/zfs_config.h \
-I$(top_builddir)/include \
@ -25,6 +26,7 @@ AM_LIBTOOLFLAGS = --silent
AM_CFLAGS = -std=gnu99 -Wall -Wstrict-prototypes -Wmissing-prototypes
AM_CFLAGS += -fno-strict-aliasing
AM_CFLAGS += $(NO_OMIT_FRAME_POINTER)
AM_CFLAGS += $(IMPLICIT_FALLTHROUGH)
AM_CFLAGS += $(DEBUG_CFLAGS)
AM_CFLAGS += $(ASAN_CFLAGS)
AM_CFLAGS += $(CODE_COVERAGE_CFLAGS) $(NO_FORMAT_ZERO_LENGTH)

View File

@ -161,6 +161,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
])
dnl #
dnl # Check if gcc supports -Wimplicit-fallthrough option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_MSG_CHECKING([whether $CC supports -Wimplicit-fallthrough])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wimplicit-fallthrough"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
IMPLICIT_FALLTHROUGH=-Wimplicit-fallthrough
AC_DEFINE([HAVE_IMPLICIT_FALLTHROUGH], 1,
[Define if compiler supports -Wimplicit-fallthrough])
AC_MSG_RESULT([yes])
], [
IMPLICIT_FALLTHROUGH=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #

View File

@ -0,0 +1,6 @@
dnl #
dnl # Check if cppcheck is available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CPPCHECK], [
AC_CHECK_PROG([CPPCHECK], [cppcheck], [cppcheck])
])

View File

@ -46,6 +46,21 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
])
AC_SUBST(DEFINE_PYZFS)
dnl #
dnl # Python "packaging" (or, failing that, "distlib") module is required to build and install pyzfs
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck -o "x$enable_pyzfs" = xyes], [
ZFS_AC_PYTHON_MODULE([packaging], [], [
ZFS_AC_PYTHON_MODULE([distlib], [], [
AS_IF([test "x$enable_pyzfs" = xyes], [
AC_MSG_ERROR("Python $PYTHON_VERSION packaging and distlib modules are not installed")
], [test "x$enable_pyzfs" != xno], [
enable_pyzfs=no
])
])
])
])
dnl #
dnl # Require python-devel libraries
dnl #

101
config/ax_count_cpus.m4 Normal file
View File

@ -0,0 +1,101 @@
# ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_count_cpus.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_COUNT_CPUS([ACTION-IF-DETECTED],[ACTION-IF-NOT-DETECTED])
#
# DESCRIPTION
#
# Attempt to count the number of logical processor cores (including
# virtual and HT cores) currently available to use on the machine and
# place detected value in CPU_COUNT variable.
#
# On successful detection, ACTION-IF-DETECTED is executed if present. If
# the detection fails, then ACTION-IF-NOT-DETECTED is triggered. The
# default ACTION-IF-NOT-DETECTED is to set CPU_COUNT to 1.
#
# LICENSE
#
# Copyright (c) 2014,2016 Karlson2k (Evgeny Grin) <k2k@narod.ru>
# Copyright (c) 2012 Brian Aker <brian@tangent.org>
# Copyright (c) 2008 Michael Paul Bailey <jinxidoru@byu.net>
# Copyright (c) 2008 Christophe Tournayre <turn3r@users.sourceforge.net>
#
# Copying and distribution of this file, with or without modification, are
# permitted in any medium without royalty provided the copyright notice
# and this notice are preserved. This file is offered as-is, without any
# warranty.
#serial 22
AC_DEFUN([AX_COUNT_CPUS],[dnl
AC_REQUIRE([AC_CANONICAL_HOST])dnl
AC_REQUIRE([AC_PROG_EGREP])dnl
AC_MSG_CHECKING([the number of available CPUs])
CPU_COUNT="0"
# Try generic methods
# 'getconf' is POSIX utility, but '_NPROCESSORS_ONLN' and
# 'NPROCESSORS_ONLN' are platform-specific
command -v getconf >/dev/null 2>&1 && \
CPU_COUNT=`getconf _NPROCESSORS_ONLN 2>/dev/null || getconf NPROCESSORS_ONLN 2>/dev/null` || CPU_COUNT="0"
AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null || ! command -v nproc >/dev/null 2>&1]],[[: # empty]],[dnl
# 'nproc' is part of GNU Coreutils and is widely available
CPU_COUNT=`OMP_NUM_THREADS='' nproc 2>/dev/null` || CPU_COUNT=`nproc 2>/dev/null` || CPU_COUNT="0"
])dnl
AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null]],[[: # empty]],[dnl
# Try platform-specific preferred methods
AS_CASE([[$host_os]],dnl
[[*linux*]],[[CPU_COUNT=`lscpu -p 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+,' -c` || CPU_COUNT="0"]],dnl
[[*darwin*]],[[CPU_COUNT=`sysctl -n hw.logicalcpu 2>/dev/null` || CPU_COUNT="0"]],dnl
[[freebsd*]],[[command -v sysctl >/dev/null 2>&1 && CPU_COUNT=`sysctl -n kern.smp.cpus 2>/dev/null` || CPU_COUNT="0"]],dnl
[[netbsd*]], [[command -v sysctl >/dev/null 2>&1 && CPU_COUNT=`sysctl -n hw.ncpuonline 2>/dev/null` || CPU_COUNT="0"]],dnl
[[solaris*]],[[command -v psrinfo >/dev/null 2>&1 && CPU_COUNT=`psrinfo 2>/dev/null | $EGREP -e '^@<:@0-9@:>@.*on-line' -c 2>/dev/null` || CPU_COUNT="0"]],dnl
[[mingw*]],[[CPU_COUNT=`ls -qpU1 /proc/registry/HKEY_LOCAL_MACHINE/HARDWARE/DESCRIPTION/System/CentralProcessor/ 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+/' -c` || CPU_COUNT="0"]],dnl
[[msys*]],[[CPU_COUNT=`ls -qpU1 /proc/registry/HKEY_LOCAL_MACHINE/HARDWARE/DESCRIPTION/System/CentralProcessor/ 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+/' -c` || CPU_COUNT="0"]],dnl
[[cygwin*]],[[CPU_COUNT=`ls -qpU1 /proc/registry/HKEY_LOCAL_MACHINE/HARDWARE/DESCRIPTION/System/CentralProcessor/ 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+/' -c` || CPU_COUNT="0"]]dnl
)dnl
])dnl
AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null || ! command -v sysctl >/dev/null 2>&1]],[[: # empty]],[dnl
# Try less preferred generic method
# 'hw.ncpu' exist on many platforms, but not on GNU/Linux
CPU_COUNT=`sysctl -n hw.ncpu 2>/dev/null` || CPU_COUNT="0"
])dnl
AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null]],[[: # empty]],[dnl
# Try platform-specific fallback methods
# They can be less accurate and slower then preferred methods
AS_CASE([[$host_os]],dnl
[[*linux*]],[[CPU_COUNT=`$EGREP -e '^processor' -c /proc/cpuinfo 2>/dev/null` || CPU_COUNT="0"]],dnl
[[*darwin*]],[[CPU_COUNT=`system_profiler SPHardwareDataType 2>/dev/null | $EGREP -i -e 'number of cores:'|cut -d : -f 2 -s|tr -d ' '` || CPU_COUNT="0"]],dnl
[[freebsd*]],[[CPU_COUNT=`dmesg 2>/dev/null| $EGREP -e '^cpu@<:@0-9@:>@+: '|sort -u|$EGREP -e '^' -c` || CPU_COUNT="0"]],dnl
[[netbsd*]], [[CPU_COUNT=`command -v cpuctl >/dev/null 2>&1 && cpuctl list 2>/dev/null| $EGREP -e '^@<:@0-9@:>@+ .* online ' -c` || \
CPU_COUNT=`dmesg 2>/dev/null| $EGREP -e '^cpu@<:@0-9@:>@+ at'|sort -u|$EGREP -e '^' -c` || CPU_COUNT="0"]],dnl
[[solaris*]],[[command -v kstat >/dev/null 2>&1 && CPU_COUNT=`kstat -m cpu_info -s state -p 2>/dev/null | $EGREP -c -e 'on-line'` || \
CPU_COUNT=`kstat -m cpu_info 2>/dev/null | $EGREP -c -e 'module: cpu_info'` || CPU_COUNT="0"]],dnl
[[mingw*]],[AS_IF([[CPU_COUNT=`reg query 'HKLM\\Hardware\\Description\\System\\CentralProcessor' 2>/dev/null | $EGREP -e '\\\\@<:@0-9@:>@+$' -c`]],dnl
[[: # empty]],[[test "$NUMBER_OF_PROCESSORS" -gt "0" 2>/dev/null && CPU_COUNT="$NUMBER_OF_PROCESSORS"]])],dnl
[[msys*]],[[test "$NUMBER_OF_PROCESSORS" -gt "0" 2>/dev/null && CPU_COUNT="$NUMBER_OF_PROCESSORS"]],dnl
[[cygwin*]],[[test "$NUMBER_OF_PROCESSORS" -gt "0" 2>/dev/null && CPU_COUNT="$NUMBER_OF_PROCESSORS"]]dnl
)dnl
])dnl
AS_IF([[test "x$CPU_COUNT" != "x0" && test "$CPU_COUNT" -gt 0 2>/dev/null]],[dnl
AC_MSG_RESULT([[$CPU_COUNT]])
m4_ifvaln([$1],[$1],)dnl
],[dnl
m4_ifval([$2],[dnl
AS_UNSET([[CPU_COUNT]])
AC_MSG_RESULT([[unable to detect]])
$2
], [dnl
CPU_COUNT="1"
AC_MSG_RESULT([[unable to detect (assuming 1)]])
])dnl
])dnl
])dnl

View File

@ -97,9 +97,18 @@ AC_DEFUN([AX_PYTHON_DEVEL],[
# Check for a version of Python >= 2.1.0
#
AC_MSG_CHECKING([for a version of Python >= '2.1.0'])
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver >= '2.1.0')"`
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = ">= '2.1.0'".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
if test "$ac_supports_python_ver" != "True"; then
if test -z "$PYTHON_NOVERSIONCHECK"; then
AC_MSG_RESULT([no])
@ -126,9 +135,21 @@ to something else than an empty string.
#
if test -n "$1"; then
AC_MSG_CHECKING([for a version of Python $1])
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver $1)"`
# Why the strip ()? Because if we don't, version.parse
# will, for example, report 3.10.0 >= '3.11.0'
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = "$1".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
if test "$ac_supports_python_ver" = "True"; then
AC_MSG_RESULT([yes])
else

View File

@ -162,6 +162,9 @@ dnl #
dnl # 3.1 API change,
dnl # Check if inode_operations contains the function get_acl
dnl #
dnl # 5.15 API change,
dnl # Added the bool rcu argument to get_acl for rcu path walk.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_get_acl], [
#include <linux/fs.h>
@ -174,22 +177,55 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
.get_acl = get_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_get_acl_rcu], [
#include <linux/fs.h>
struct posix_acl *get_acl_fn(struct inode *inode, int type,
bool rcu) { return NULL; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.get_acl = get_acl_fn,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
AC_MSG_CHECKING([whether iops->get_acl() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_get_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL, 1, [iops->get_acl() exists])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_get_acl_rcu], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL_RCU, 1, [iops->get_acl() takes rcu])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl()])
])
])
])
dnl #
dnl # 3.14 API change,
dnl # Check if inode_operations contains the function set_acl
dnl #
dnl # 5.12 API change,
dnl # set_acl() added a user_namespace* parameter first
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns], [
#include <linux/fs.h>
int set_acl_fn(struct user_namespace *userns,
struct inode *inode, struct posix_acl *acl,
int type) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.set_acl = set_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_set_acl], [
#include <linux/fs.h>
@ -205,12 +241,18 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_MSG_CHECKING([whether iops->set_acl() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS, 1, [iops->set_acl() takes 4 args])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
],[
AC_MSG_RESULT(no)
])
])
])
dnl #

View File

@ -191,6 +191,24 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SET_DEV], [
], [], [ZFS_META_LICENSE])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # bio_set_dev is no longer a helper macro and is now an inline function,
dnl # meaning that the function it calls internally can no longer be overridden
dnl # by our code
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO], [
ZFS_LINUX_TEST_SRC([bio_set_dev_macro], [
#include <linux/bio.h>
#include <linux/fs.h>
],[
#ifndef bio_set_dev
#error Not a macro
#endif
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_MSG_CHECKING([whether bio_set_dev() is available])
ZFS_LINUX_TEST_RESULT([bio_set_dev], [
@ -205,6 +223,15 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_DEFINE(HAVE_BIO_SET_DEV_GPL_ONLY, 1,
[bio_set_dev() GPL-only])
])
AC_MSG_CHECKING([whether bio_set_dev() is a macro])
ZFS_LINUX_TEST_RESULT([bio_set_dev_macro], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_SET_DEV_MACRO, 1,
[bio_set_dev() is a macro])
],[
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)
])
@ -294,9 +321,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SUBMIT_BIO], [
ZFS_LINUX_TEST_SRC([submit_bio], [
#include <linux/bio.h>
],[
blk_qc_t blk_qc;
struct bio *bio = NULL;
blk_qc = submit_bio(bio);
(void) submit_bio(bio);
])
])
@ -369,6 +395,85 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKG_TRYGET], [
])
])
dnl #
dnl # Linux 5.12 API,
dnl #
dnl # The Linux 5.12 kernel updated struct bio to create a new bi_bdev member
dnl # and bio->bi_disk was moved to bio->bi_bdev->bd_disk
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_BDEV_DISK], [
ZFS_LINUX_TEST_SRC([bio_bdev_disk], [
#include <linux/blk_types.h>
#include <linux/blkdev.h>
],[
struct bio *b = NULL;
struct gendisk *d = b->bi_bdev->bd_disk;
blk_register_queue(d);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_BDEV_DISK], [
AC_MSG_CHECKING([whether bio->bi_bdev->bd_disk exists])
ZFS_LINUX_TEST_RESULT([bio_bdev_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_BDEV_DISK, 1, [bio->bi_bdev->bd_disk exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # The Linux 5.16 API for submit_bio changed the return type to be
dnl # void instead of int
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID], [
ZFS_LINUX_TEST_SRC([bio_bdev_submit_bio_void], [
#include <linux/blkdev.h>
],[
struct block_device_operations *bdev = NULL;
__attribute__((unused)) void(*f)(struct bio *) = bdev->submit_bio;
])
])
AC_DEFUN([ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID], [
AC_MSG_CHECKING(
[whether block_device_operations->submit_bio() returns void])
ZFS_LINUX_TEST_RESULT([bio_bdev_submit_bio_void], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_SUBMIT_BIO_RETURNS_VOID, 1,
[block_device_operations->submit_bio() returns void])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # The Linux 5.16 API moved struct blkcg_gq into linux/blk-cgroup.h, which
dnl # has been around since 2015. This test looks for the presence of that
dnl # header, so that it can be conditionally included where it exists, but
dnl # still be backward compatible with kernels that pre-date its introduction.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER], [
ZFS_LINUX_TEST_SRC([blk_cgroup_header], [
#include <linux/blk-cgroup.h>
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
AC_MSG_CHECKING([for existence of linux/blk-cgroup.h])
ZFS_LINUX_TEST_RESULT([blk_cgroup_header],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_LINUX_BLK_CGROUP_HEADER, 1,
[linux/blk-cgroup.h exists])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_REQ
ZFS_AC_KERNEL_SRC_BIO_OPS
@ -379,6 +484,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_BIO_SUBMIT_BIO
ZFS_AC_KERNEL_SRC_BIO_CURRENT_BIO_LIST
ZFS_AC_KERNEL_SRC_BLKG_TRYGET
ZFS_AC_KERNEL_SRC_BIO_BDEV_DISK
ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER
])
AC_DEFUN([ZFS_AC_KERNEL_BIO], [
@ -400,4 +509,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO], [
ZFS_AC_KERNEL_BIO_SUBMIT_BIO
ZFS_AC_KERNEL_BIO_CURRENT_BIO_LIST
ZFS_AC_KERNEL_BLKG_TRYGET
ZFS_AC_KERNEL_BIO_BDEV_DISK
ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_BLK_CGROUP_HEADER
])

View File

@ -0,0 +1,23 @@
dnl #
dnl # 5.12 API change removes BIO_MAX_PAGES in favor of bio_max_segs()
dnl # which will handle the logic of setting the upper-bound to a
dnl # BIO_MAX_PAGES, internally.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_MAX_SEGS], [
ZFS_LINUX_TEST_SRC([bio_max_segs], [
#include <linux/bio.h>
],[
bio_max_segs(1);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_MAX_SEGS], [
AC_MSG_CHECKING([whether bio_max_segs() exists])
ZFS_LINUX_TEST_RESULT([bio_max_segs], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BIO_MAX_SEGS], 1, [bio_max_segs() is implemented])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -48,7 +48,45 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_BDI], [
])
dnl #
dnl # 2.6.32 - 4.x API,
dnl # 5.9: added blk_queue_update_readahead(),
dnl # 5.15: renamed to disk_update_readahead()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_UPDATE_READAHEAD], [
ZFS_LINUX_TEST_SRC([blk_queue_update_readahead], [
#include <linux/blkdev.h>
],[
struct request_queue q;
blk_queue_update_readahead(&q);
])
ZFS_LINUX_TEST_SRC([disk_update_readahead], [
#include <linux/blkdev.h>
],[
struct gendisk disk;
disk_update_readahead(&disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
AC_MSG_CHECKING([whether blk_queue_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([blk_queue_update_readahead], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_UPDATE_READAHEAD, 1,
[blk_queue_update_readahead() exists])
],[
AC_MSG_CHECKING([whether disk_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([disk_update_readahead], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DISK_UPDATE_READAHEAD, 1,
[disk_update_readahead() exists])
],[
AC_MSG_RESULT(no)
])
])
])
dnl #
dnl # 2.6.32 API,
dnl # blk_queue_discard()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD], [
@ -280,6 +318,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE], [
ZFS_AC_KERNEL_SRC_BLK_QUEUE_PLUG
ZFS_AC_KERNEL_SRC_BLK_QUEUE_BDI
ZFS_AC_KERNEL_SRC_BLK_QUEUE_UPDATE_READAHEAD
ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLAG_SET
@ -292,6 +331,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE], [
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE], [
ZFS_AC_KERNEL_BLK_QUEUE_PLUG
ZFS_AC_KERNEL_BLK_QUEUE_BDI
ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD
ZFS_AC_KERNEL_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET

View File

@ -294,6 +294,27 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.13 API change
dnl # blkdev_get_by_path() no longer handles ERESTARTSYS
dnl #
dnl # Unfortunately we're forced to rely solely on the kernel version
dnl # number in order to determine the expected behavior. This was an
dnl # internal change to blkdev_get_by_dev(), see commit a8ed1a0607.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS], [
AC_MSG_CHECKING([whether blkdev_get_by_path() handles ERESTARTSYS])
AS_VERSION_COMPARE([$LINUX_VERSION], [5.13.0], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_GET_ERESTARTSYS, 1,
[blkdev_get_by_path() handles ERESTARTSYS])
],[
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH
ZFS_AC_KERNEL_SRC_BLKDEV_PUT
@ -318,4 +339,5 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
])

View File

@ -52,12 +52,44 @@ AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
])
])
dnl #
dnl # 5.13 API change
dnl # block_device_operations->revalidate_disk() was removed
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
ZFS_LINUX_TEST_SRC([block_device_operations_revalidate_disk], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
return(0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
AC_MSG_CHECKING([whether bops->revalidate_disk() exists])
ZFS_LINUX_TEST_RESULT([block_device_operations_revalidate_disk], [
AC_DEFINE([HAVE_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [1],
[Define if revalidate_disk() in block_device_operations])
AC_MSG_RESULT(yes)
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
])

View File

@ -2,6 +2,9 @@ dnl #
dnl # Handle differences in kernel FPU code.
dnl #
dnl # Kernel
dnl # 5.16: XCR code put into asm/fpu/xcr.h
dnl # HAVE_KERNEL_FPU_XCR_HEADER
dnl #
dnl # 5.0: Wrappers have been introduced to save/restore the FPU state.
dnl # This change was made to the 4.19.38 and 4.14.120 LTS kernels.
dnl # HAVE_KERNEL_FPU_INTERNAL
@ -25,6 +28,18 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
AC_MSG_CHECKING([whether fpu/xcr header is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/xcr.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_XCR_HEADER, 1,
[kernel has asm/fpu/xcr.h])
AC_MSG_RESULT(asm/fpu/xcr.h)
],[
AC_MSG_RESULT(no asm/fpu/xcr.h)
])
],[
AC_MSG_RESULT(i387.h & xcr.h)
])

View File

@ -0,0 +1,28 @@
dnl #
dnl # 5.12 API
dnl #
dnl # generic_fillattr in linux/fs.h now requires a struct user_namespace*
dnl # as the first arg, to support idmapped mounts.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_FILLATTR_USERNS], [
ZFS_LINUX_TEST_SRC([generic_fillattr_userns], [
#include <linux/fs.h>
],[
struct user_namespace *userns = NULL;
struct inode *in = NULL;
struct kstat *k = NULL;
generic_fillattr(userns, in, k);
])
])
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_FILLATTR_USERNS], [
AC_MSG_CHECKING([whether generic_fillattr requres struct user_namespace*])
ZFS_LINUX_TEST_RESULT([generic_fillattr_userns], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_GENERIC_FILLATTR_USERNS, 1,
[generic_fillattr requires struct user_namespace*])
],[
AC_MSG_RESULT([no])
])
])

View File

@ -2,6 +2,17 @@ dnl #
dnl # Check for generic io accounting interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
ZFS_LINUX_TEST_SRC([disk_io_acct], [
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
struct bio *bio = NULL;
unsigned long start_time;
start_time = disk_start_io_acct(disk, bio_sectors(bio), bio_op(bio));
disk_end_io_acct(disk, bio_op(bio), start_time);
])
ZFS_LINUX_TEST_SRC([bio_io_acct], [
#include <linux/blkdev.h>
], [
@ -38,6 +49,19 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
])
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_IO_ACCT], [
dnl #
dnl # 5.12 API,
dnl #
dnl # bio_start_io_acct() and bio_end_io_acct() became GPL-exported
dnl # so use disk_start_io_acct() and disk_end_io_acct() instead
dnl #
AC_MSG_CHECKING([whether generic disk_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([disk_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DISK_IO_ACCT, 1, [disk_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 5.7 API,
dnl #
@ -84,4 +108,5 @@ AC_DEFUN([ZFS_AC_KERNEL_GENERIC_IO_ACCT], [
])
])
])
])
])

View File

@ -1,7 +1,25 @@
dnl #
dnl # 3.6 API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CREATE_FLAGS], [
AC_DEFUN([ZFS_AC_KERNEL_SRC_CREATE], [
dnl #
dnl # 5.12 API change that added the struct user_namespace* arg
dnl # to the front of this function type's arg list.
dnl #
ZFS_LINUX_TEST_SRC([create_userns], [
#include <linux/fs.h>
#include <linux/sched.h>
int inode_create(struct user_namespace *userns,
struct inode *inode ,struct dentry *dentry,
umode_t umode, bool flag) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.create = inode_create,
};
],[])
dnl #
dnl # 3.6 API change
dnl #
ZFS_LINUX_TEST_SRC([create_flags], [
#include <linux/fs.h>
#include <linux/sched.h>
@ -16,11 +34,20 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_CREATE_FLAGS], [
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_CREATE_FLAGS], [
AC_DEFUN([ZFS_AC_KERNEL_CREATE], [
AC_MSG_CHECKING([whether iops->create() takes struct user_namespace*])
ZFS_LINUX_TEST_RESULT([create_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOPS_CREATE_USERNS, 1,
[iops->create() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether iops->create() passes flags])
ZFS_LINUX_TEST_RESULT([create_flags], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([iops->create()])
])
])
])

View File

@ -1,8 +1,29 @@
dnl #
dnl # Linux 4.11 API
dnl # See torvalds/linux@a528d35
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_GETATTR], [
dnl #
dnl # Linux 5.12 API
dnl # The getattr I/O operations handler type was extended to require
dnl # a struct user_namespace* as its first arg, to support idmapped
dnl # mounts.
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_getattr_userns], [
#include <linux/fs.h>
int test_getattr(
struct user_namespace *userns,
const struct path *p, struct kstat *k,
u32 request_mask, unsigned int query_flags)
{ return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.getattr = test_getattr,
};
],[])
dnl #
dnl # Linux 4.11 API
dnl # See torvalds/linux@a528d35
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_getattr_path], [
#include <linux/fs.h>
@ -33,6 +54,20 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_GETATTR], [
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_GETATTR], [
dnl #
dnl # Kernel 5.12 test
dnl #
AC_MSG_CHECKING([whether iops->getattr() takes user_namespace])
ZFS_LINUX_TEST_RESULT([inode_operations_getattr_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_USERNS_IOPS_GETATTR, 1,
[iops->getattr() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
dnl #
dnl # Kernel 4.11 test
dnl #
AC_MSG_CHECKING([whether iops->getattr() takes a path])
ZFS_LINUX_TEST_RESULT([inode_operations_getattr_path], [
AC_MSG_RESULT(yes)
@ -41,6 +76,9 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_GETATTR], [
],[
AC_MSG_RESULT(no)
dnl #
dnl # Kernel < 4.11 test
dnl #
AC_MSG_CHECKING([whether iops->getattr() takes a vfsmount])
ZFS_LINUX_TEST_RESULT([inode_operations_getattr_vfsmount], [
AC_MSG_RESULT(yes)
@ -50,4 +88,5 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_GETATTR], [
AC_MSG_RESULT(no)
])
])
])
])

View File

@ -11,13 +11,32 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OWNER_OR_CAPABLE], [
struct inode *ip = NULL;
(void) inode_owner_or_capable(ip);
])
ZFS_LINUX_TEST_SRC([inode_owner_or_capable_idmapped], [
#include <linux/fs.h>
],[
struct inode *ip = NULL;
(void) inode_owner_or_capable(&init_user_ns, ip);
])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_OWNER_OR_CAPABLE], [
AC_MSG_CHECKING([whether inode_owner_or_capable() exists])
ZFS_LINUX_TEST_RESULT([inode_owner_or_capable], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_OWNER_OR_CAPABLE, 1,
[inode_owner_or_capable() exists])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether inode_owner_or_capable() takes user_ns])
ZFS_LINUX_TEST_RESULT([inode_owner_or_capable_idmapped], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_OWNER_OR_CAPABLE_IDMAPPED, 1,
[inode_owner_or_capable() takes user_ns])
],[
ZFS_LINUX_TEST_ERROR([capability])
])
])
])

View File

@ -42,6 +42,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct block_device_operations o;
o.submit_bio = NULL;
])
ZFS_LINUX_TEST_SRC([blk_alloc_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@ -56,6 +63,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
AC_DEFINE(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS, 1,
[submit_bio is member of struct block_device_operations])
dnl #
dnl # Linux 5.14 API Change:
dnl # blk_alloc_queue() + alloc_disk() combo replaced by
dnl # a single call to blk_alloc_disk().
dnl #
AC_MSG_CHECKING([whether blk_alloc_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
], [
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)

View File

@ -1,32 +0,0 @@
dnl #
dnl # 3.3 API change
dnl # The VFS .create, .mkdir and .mknod callbacks were updated to take a
dnl # umode_t type rather than an int. The expectation is that any backport
dnl # would also change all three prototypes. However, if it turns out that
dnl # some distribution doesn't backport the whole thing this could be
dnl # broken apart into three separate checks.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MKDIR_UMODE_T], [
ZFS_LINUX_TEST_SRC([inode_operations_mkdir], [
#include <linux/fs.h>
int mkdir(struct inode *inode, struct dentry *dentry,
umode_t umode) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.mkdir = mkdir,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MKDIR_UMODE_T], [
AC_MSG_CHECKING([whether iops->create()/mkdir()/mknod() take umode_t])
ZFS_LINUX_TEST_RESULT([inode_operations_mkdir], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MKDIR_UMODE_T, 1,
[iops->create()/mkdir()/mknod() take umode_t])
],[
ZFS_LINUX_TEST_ERROR([mkdir()])
])
])

65
config/kernel-mkdir.m4 Normal file
View File

@ -0,0 +1,65 @@
dnl #
dnl # Supported mkdir() interfaces checked newest to oldest.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MKDIR], [
dnl #
dnl # 5.12 API change
dnl # The struct user_namespace arg was added as the first argument to
dnl # mkdir()
dnl #
ZFS_LINUX_TEST_SRC([mkdir_user_namespace], [
#include <linux/fs.h>
int mkdir(struct user_namespace *userns,
struct inode *inode, struct dentry *dentry,
umode_t umode) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.mkdir = mkdir,
};
],[])
dnl #
dnl # 3.3 API change
dnl # The VFS .create, .mkdir and .mknod callbacks were updated to take a
dnl # umode_t type rather than an int. The expectation is that any backport
dnl # would also change all three prototypes. However, if it turns out that
dnl # some distribution doesn't backport the whole thing this could be
dnl # broken apart into three separate checks.
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_mkdir], [
#include <linux/fs.h>
int mkdir(struct inode *inode, struct dentry *dentry,
umode_t umode) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.mkdir = mkdir,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MKDIR], [
dnl #
dnl # 5.12 API change
dnl # The struct user_namespace arg was added as the first argument to
dnl # mkdir() of the iops structure.
dnl #
AC_MSG_CHECKING([whether iops->mkdir() takes struct user_namespace*])
ZFS_LINUX_TEST_RESULT([mkdir_user_namespace], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOPS_MKDIR_USERNS, 1,
[iops->mkdir() takes struct user_namespace*])
],[
AC_MSG_CHECKING([whether iops->mkdir() takes umode_t])
ZFS_LINUX_TEST_RESULT([inode_operations_mkdir], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MKDIR_UMODE_T, 1,
[iops->mkdir() takes umode_t])
],[
ZFS_LINUX_TEST_ERROR([mkdir()])
])
])
])

30
config/kernel-mknod.m4 Normal file
View File

@ -0,0 +1,30 @@
AC_DEFUN([ZFS_AC_KERNEL_SRC_MKNOD], [
dnl #
dnl # 5.12 API change that added the struct user_namespace* arg
dnl # to the front of this function type's arg list.
dnl #
ZFS_LINUX_TEST_SRC([mknod_userns], [
#include <linux/fs.h>
#include <linux/sched.h>
int tmp_mknod(struct user_namespace *userns,
struct inode *inode ,struct dentry *dentry,
umode_t u, dev_t d) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.mknod = tmp_mknod,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MKNOD], [
AC_MSG_CHECKING([whether iops->mknod() takes struct user_namespace*])
ZFS_LINUX_TEST_RESULT([mknod_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOPS_MKNOD_USERNS, 1,
[iops->mknod() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,26 @@
dnl #
dnl # Linux 5.16 no longer allows directly calling wait_on_page_bit, and
dnl # instead requires you to call folio-specific functions. In this case,
dnl # wait_on_page_bit(pg, PG_writeback) becomes
dnl # folio_wait_bit(pg, PG_writeback)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT], [
ZFS_LINUX_TEST_SRC([pagemap_has_folio_wait_bit], [
#include <linux/pagemap.h>
],[
static struct folio *f = NULL;
folio_wait_bit(f, PG_writeback);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT], [
AC_MSG_CHECKING([folio_wait_bit() exists])
ZFS_LINUX_TEST_RESULT([pagemap_has_folio_wait_bit], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_PAGEMAP_FOLIO_WAIT_BIT, 1,
[folio_wait_bit() exists])
],[
AC_MSG_RESULT([no])
])
])

View File

@ -1,10 +1,10 @@
dnl #
dnl # 4.9 API change,
dnl # iops->rename2() merged into iops->rename(), and iops->rename() now wants
dnl # flags.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_RENAME_WANTS_FLAGS], [
ZFS_LINUX_TEST_SRC([inode_operations_rename], [
AC_DEFUN([ZFS_AC_KERNEL_SRC_RENAME], [
dnl #
dnl # 4.9 API change,
dnl # iops->rename2() merged into iops->rename(), and iops->rename() now wants
dnl # flags.
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_rename_flags], [
#include <linux/fs.h>
int rename_fn(struct inode *sip, struct dentry *sdp,
struct inode *tip, struct dentry *tdp,
@ -15,15 +15,41 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_RENAME_WANTS_FLAGS], [
.rename = rename_fn,
};
],[])
dnl #
dnl # 5.12 API change,
dnl #
dnl # Linux 5.12 introduced passing struct user_namespace* as the first argument
dnl # of the rename() and other inode_operations members.
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_rename_userns], [
#include <linux/fs.h>
int rename_fn(struct user_namespace *user_ns, struct inode *sip,
struct dentry *sdp, struct inode *tip, struct dentry *tdp,
unsigned int flags) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.rename = rename_fn,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_RENAME_WANTS_FLAGS], [
AC_MSG_CHECKING([whether iops->rename() wants flags])
ZFS_LINUX_TEST_RESULT([inode_operations_rename], [
AC_DEFUN([ZFS_AC_KERNEL_RENAME], [
AC_MSG_CHECKING([whether iops->rename() takes struct user_namespace*])
ZFS_LINUX_TEST_RESULT([inode_operations_rename_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOPS_RENAME_USERNS, 1,
[iops->rename() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_rename_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_RENAME_WANTS_FLAGS, 1,
[iops->rename() wants flags])
],[
AC_MSG_RESULT(no)
])
])
])

View File

@ -1,9 +1,9 @@
dnl #
dnl # 4.9 API change
dnl # The inode_change_ok() function has been renamed setattr_prepare()
dnl # and updated to take a dentry rather than an inode.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SETATTR_PREPARE], [
dnl #
dnl # 4.9 API change
dnl # The inode_change_ok() function has been renamed setattr_prepare()
dnl # and updated to take a dentry rather than an inode.
dnl #
ZFS_LINUX_TEST_SRC([setattr_prepare], [
#include <linux/fs.h>
], [
@ -12,16 +12,41 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SETATTR_PREPARE], [
int error __attribute__ ((unused)) =
setattr_prepare(dentry, attr);
])
dnl #
dnl # 5.12 API change
dnl # The setattr_prepare() function has been changed to accept a new argument
dnl # for struct user_namespace*
dnl #
ZFS_LINUX_TEST_SRC([setattr_prepare_userns], [
#include <linux/fs.h>
], [
struct dentry *dentry = NULL;
struct iattr *attr = NULL;
struct user_namespace *userns = NULL;
int error __attribute__ ((unused)) =
setattr_prepare(userns, dentry, attr);
])
])
AC_DEFUN([ZFS_AC_KERNEL_SETATTR_PREPARE], [
AC_MSG_CHECKING([whether setattr_prepare() is available])
AC_MSG_CHECKING([whether setattr_prepare() is available and accepts struct user_namespace*])
ZFS_LINUX_TEST_RESULT_SYMBOL([setattr_prepare_userns],
[setattr_prepare], [fs/attr.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SETATTR_PREPARE_USERNS, 1,
[setattr_prepare() accepts user_namespace])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether setattr_prepare() is available, doesn't accept user_namespace])
ZFS_LINUX_TEST_RESULT_SYMBOL([setattr_prepare],
[setattr_prepare], [fs/attr.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SETATTR_PREPARE, 1,
[setattr_prepare() is available])
AC_DEFINE(HAVE_SETATTR_PREPARE_NO_USERNS, 1,
[setattr_prepare() is available, doesn't accept user_namespace])
], [
AC_MSG_RESULT(no)
])
])
])

21
config/kernel-siginfo.m4 Normal file
View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.20 API change
dnl # Added kernel_siginfo_t
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SIGINFO], [
ZFS_LINUX_TEST_SRC([siginfo], [
#include <linux/signal_types.h>
],[
kernel_siginfo_t info __attribute__ ((unused));
])
])
AC_DEFUN([ZFS_AC_KERNEL_SIGINFO], [
AC_MSG_CHECKING([whether kernel_siginfo_t tyepedef exists])
ZFS_LINUX_TEST_RESULT([siginfo], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SIGINFO, 1, [kernel_siginfo_t exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.4 API change
dnl # Added kernel_signal_stop
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SIGNAL_STOP], [
ZFS_LINUX_TEST_SRC([signal_stop], [
#include <linux/sched/signal.h>
],[
kernel_signal_stop();
])
])
AC_DEFUN([ZFS_AC_KERNEL_SIGNAL_STOP], [
AC_MSG_CHECKING([whether signal_stop() exists])
ZFS_LINUX_TEST_RESULT([signal_stop], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SIGNAL_STOP, 1, [signal_stop() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,21 @@
dnl #
dnl # 4.17 API change
dnl # Added set_special_state() function
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE], [
ZFS_LINUX_TEST_SRC([set_special_state], [
#include <linux/sched.h>
],[
set_special_state(TASK_STOPPED);
])
])
AC_DEFUN([ZFS_AC_KERNEL_SET_SPECIAL_STATE], [
AC_MSG_CHECKING([whether set_special_state() exists])
ZFS_LINUX_TEST_RESULT([set_special_state], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_SPECIAL_STATE, 1, [set_special_state() exists])
],[
AC_MSG_RESULT(no)
])
])

32
config/kernel-stdarg.m4 Normal file
View File

@ -0,0 +1,32 @@
dnl #
dnl # Linux 5.15 gets rid of -isystem and external <stdarg.h> inclusion
dnl # and ships its own <linux/stdarg.h>. Check if this header file does
dnl # exist and provide all necessary definitions for variable argument
dnl # functions. Adjust the inclusion of <stdarg.h> according to the
dnl # results.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG], [
ZFS_LINUX_TEST_SRC([has_standalone_linux_stdarg], [
#include <linux/stdarg.h>
#if !defined(va_start) || !defined(va_end) || \
!defined(va_arg) || !defined(va_copy)
#error "<linux/stdarg.h> is invalid"
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG], [
dnl #
dnl # Linux 5.15 ships its own stdarg.h and doesn't allow to
dnl # include compiler headers.
dnl #
AC_MSG_CHECKING([whether standalone <linux/stdarg.h> exists])
ZFS_LINUX_TEST_RESULT([has_standalone_linux_stdarg], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_STANDALONE_LINUX_STDARG, 1,
[standalone <linux/stdarg.h> exists])
],[
AC_MSG_RESULT([no])
])
])

30
config/kernel-symlink.m4 Normal file
View File

@ -0,0 +1,30 @@
AC_DEFUN([ZFS_AC_KERNEL_SRC_SYMLINK], [
dnl #
dnl # 5.12 API change that added the struct user_namespace* arg
dnl # to the front of this function type's arg list.
dnl #
ZFS_LINUX_TEST_SRC([symlink_userns], [
#include <linux/fs.h>
#include <linux/sched.h>
int tmp_symlink(struct user_namespace *userns,
struct inode *inode ,struct dentry *dentry,
const char *path) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.symlink = tmp_symlink,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_SYMLINK], [
AC_MSG_CHECKING([whether iops->symlink() takes struct user_namespace*])
ZFS_LINUX_TEST_RESULT([symlink_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOPS_SYMLINK_USERNS, 1,
[iops->symlink() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -3,6 +3,20 @@ dnl # 3.11 API change
dnl # Add support for i_op->tmpfile
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
dnl #
dnl # 5.11 API change
dnl # add support for userns parameter to tmpfile
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_userns], [
#include <linux/fs.h>
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct dentry *dentry,
umode_t mode) { return 0; }
static struct inode_operations
iops __attribute__ ((unused)) = {
.tmpfile = tmpfile,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
#include <linux/fs.h>
int tmpfile(struct inode *inode, struct dentry *dentry,
@ -16,10 +30,16 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
AC_DEFUN([ZFS_AC_KERNEL_TMPFILE], [
AC_MSG_CHECKING([whether i_op->tmpfile() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
],[
AC_MSG_RESULT(no)
])
])
])

View File

@ -99,6 +99,14 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_IOV_ITER], [
bytes = copy_from_iter((void *)&buf, size, &iter);
])
ZFS_LINUX_TEST_SRC([iov_iter_type], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
__attribute__((unused)) enum iter_type i = iov_iter_type(&iter);
])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
@ -193,6 +201,20 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
enable_vfs_iov_iter="no"
])
dnl #
dnl # This checks for iov_iter_type() in linux/uio.h. It is not
dnl # required, however, and the module will compiled without it
dnl # using direct access of the member attribute
dnl #
AC_MSG_CHECKING([whether iov_iter_type() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_type], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_TYPE, 1,
[iov_iter_type() is available])
],[
AC_MSG_RESULT(no)
])
dnl #
dnl # As of the 4.9 kernel support is provided for iovecs, kvecs,
dnl # bvecs and pipes in the iov_iter structure. As long as the

View File

@ -0,0 +1,34 @@
dnl #
dnl # Linux 5.14 adds a change to require set_page_dirty to be manually
dnl # wired up in struct address_space_operations. Determine if this needs
dnl # to be done. This patch set also introduced __set_page_dirty_nobuffers
dnl # declaration in linux/pagemap.h, so these tests look for the presence
dnl # of that function to tell the compiler to assign set_page_dirty in
dnl # module/os/linux/zfs/zpl_file.c
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
ZFS_LINUX_TEST_SRC([vfs_has_set_page_dirty_nobuffers], [
#include <linux/pagemap.h>
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.set_page_dirty = __set_page_dirty_nobuffers,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
dnl #
dnl # Linux 5.14 change requires set_page_dirty() to be assigned
dnl # in address_space_operations()
dnl #
AC_MSG_CHECKING([__set_page_dirty_nobuffers exists])
ZFS_LINUX_TEST_RESULT([vfs_has_set_page_dirty_nobuffers], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_SET_PAGE_DIRTY_NOBUFFERS, 1,
[__set_page_dirty_nobuffers exists])
],[
AC_MSG_RESULT([no])
])
])

View File

@ -152,6 +152,21 @@ dnl #
dnl # Supported xattr handler set() interfaces checked newest to oldest.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_SET], [
ZFS_LINUX_TEST_SRC([xattr_handler_set_userns], [
#include <linux/xattr.h>
int set(const struct xattr_handler *handler,
struct user_namespace *mnt_userns,
struct dentry *dentry, struct inode *inode,
const char *name, const void *buffer,
size_t size, int flags)
{ return 0; }
static const struct xattr_handler
xops __attribute__ ((unused)) = {
.set = set,
};
],[])
ZFS_LINUX_TEST_SRC([xattr_handler_set_dentry_inode], [
#include <linux/xattr.h>
@ -193,11 +208,23 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_SET], [
])
AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_SET], [
dnl #
dnl # 5.12 API change,
dnl # The xattr_handler->set() callback was changed to 8 arguments, and
dnl # struct user_namespace* was inserted as arg #2
dnl #
AC_MSG_CHECKING([whether xattr_handler->set() wants dentry, inode, and user_namespace])
ZFS_LINUX_TEST_RESULT([xattr_handler_set_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_XATTR_SET_USERNS, 1,
[xattr_handler->set() takes user_namespace])
],[
dnl #
dnl # 4.7 API change,
dnl # The xattr_handler->set() callback was changed to take both
dnl # dentry and inode.
dnl #
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether xattr_handler->set() wants dentry and inode])
ZFS_LINUX_TEST_RESULT([xattr_handler_set_dentry_inode], [
AC_MSG_RESULT(yes)
@ -236,6 +263,7 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_SET], [
])
])
])
])
])
dnl #

View File

@ -79,9 +79,9 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_EVICT_INODE
ZFS_AC_KERNEL_SRC_DIRTY_INODE
ZFS_AC_KERNEL_SRC_SHRINKER
ZFS_AC_KERNEL_SRC_MKDIR_UMODE_T
ZFS_AC_KERNEL_SRC_MKDIR
ZFS_AC_KERNEL_SRC_LOOKUP_FLAGS
ZFS_AC_KERNEL_SRC_CREATE_FLAGS
ZFS_AC_KERNEL_SRC_CREATE
ZFS_AC_KERNEL_SRC_GET_LINK
ZFS_AC_KERNEL_SRC_PUT_LINK
ZFS_AC_KERNEL_SRC_TMPFILE
@ -115,7 +115,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_KUIDGID_T
ZFS_AC_KERNEL_SRC_KUID_HELPERS
ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_SRC_RENAME_WANTS_FLAGS
ZFS_AC_KERNEL_SRC_RENAME
ZFS_AC_KERNEL_SRC_CURRENT_TIME
ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES
ZFS_AC_KERNEL_SRC_IN_COMPAT_SYSCALL
@ -124,6 +124,16 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_TOTALHIGH_PAGES
ZFS_AC_KERNEL_SRC_KSTRTOUL
ZFS_AC_KERNEL_SRC_PERCPU
ZFS_AC_KERNEL_SRC_GENERIC_FILLATTR_USERNS
ZFS_AC_KERNEL_SRC_MKNOD
ZFS_AC_KERNEL_SRC_SYMLINK
ZFS_AC_KERNEL_SRC_BIO_MAX_SEGS
ZFS_AC_KERNEL_SRC_SIGNAL_STOP
ZFS_AC_KERNEL_SRC_SIGINFO
ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@ -176,9 +186,9 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_EVICT_INODE
ZFS_AC_KERNEL_DIRTY_INODE
ZFS_AC_KERNEL_SHRINKER
ZFS_AC_KERNEL_MKDIR_UMODE_T
ZFS_AC_KERNEL_MKDIR
ZFS_AC_KERNEL_LOOKUP_FLAGS
ZFS_AC_KERNEL_CREATE_FLAGS
ZFS_AC_KERNEL_CREATE
ZFS_AC_KERNEL_GET_LINK
ZFS_AC_KERNEL_PUT_LINK
ZFS_AC_KERNEL_TMPFILE
@ -212,7 +222,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_KUIDGID_T
ZFS_AC_KERNEL_KUID_HELPERS
ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_RENAME_WANTS_FLAGS
ZFS_AC_KERNEL_RENAME
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_USERNS_CAPABILITIES
ZFS_AC_KERNEL_IN_COMPAT_SYSCALL
@ -221,6 +231,16 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_TOTALHIGH_PAGES
ZFS_AC_KERNEL_KSTRTOUL
ZFS_AC_KERNEL_PERCPU
ZFS_AC_KERNEL_GENERIC_FILLATTR_USERNS
ZFS_AC_KERNEL_MKNOD
ZFS_AC_KERNEL_SYMLINK
ZFS_AC_KERNEL_BIO_MAX_SEGS
ZFS_AC_KERNEL_SIGNAL_STOP
ZFS_AC_KERNEL_SIGINFO
ZFS_AC_KERNEL_SET_SPECIAL_STATE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT
])
dnl #

View File

@ -153,8 +153,12 @@ AC_DEFUN([ZFS_AC_DEBUG_KMEM_TRACKING], [
])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH
@ -167,6 +171,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_PYTHON
ZFS_AC_CONFIG_ALWAYS_PYZFS
ZFS_AC_CONFIG_ALWAYS_SED
ZFS_AC_CONFIG_ALWAYS_CPPCHECK
])
AC_DEFUN([ZFS_AC_CONFIG], [
@ -191,12 +196,10 @@ AC_DEFUN([ZFS_AC_CONFIG], [
ZFS_AC_CONFIG_ALWAYS
AM_COND_IF([BUILD_LINUX], [
AC_ARG_VAR([TEST_JOBS],
[simultaneous jobs during configure (defaults to $(nproc))])
AC_ARG_VAR([TEST_JOBS], [simultaneous jobs during configure])
if test "x$ac_cv_env_TEST_JOBS_set" != "xset"; then
TEST_JOBS=$(nproc)
TEST_JOBS=$CPU_COUNT
fi
AC_SUBST(TEST_JOBS)
])

View File

@ -221,6 +221,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/cmd/mktree/Makefile
tests/zfs-tests/cmd/mmap_exec/Makefile
tests/zfs-tests/cmd/mmap_libaio/Makefile
tests/zfs-tests/cmd/mmap_seek/Makefile
tests/zfs-tests/cmd/mmapwrite/Makefile
tests/zfs-tests/cmd/nvlist_to_lua/Makefile
tests/zfs-tests/cmd/randfree_file/Makefile
@ -228,6 +229,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/cmd/readmmap/Makefile
tests/zfs-tests/cmd/rename_dir/Makefile
tests/zfs-tests/cmd/rm_lnkcnt_zero_file/Makefile
tests/zfs-tests/cmd/send_doall/Makefile
tests/zfs-tests/cmd/stride_dd/Makefile
tests/zfs-tests/cmd/threadsappend/Makefile
tests/zfs-tests/cmd/user_ns_exec/Makefile
@ -236,6 +238,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/Makefile
tests/zfs-tests/tests/functional/Makefile
tests/zfs-tests/tests/functional/acl/Makefile
tests/zfs-tests/tests/functional/acl/off/Makefile
tests/zfs-tests/tests/functional/acl/posix/Makefile
tests/zfs-tests/tests/functional/acl/posix-sa/Makefile
tests/zfs-tests/tests/functional/alloc_class/Makefile

View File

@ -82,7 +82,11 @@ alloc_pw_size(size_t len)
return (NULL);
}
pw->len = len;
pw->value = malloc(len);
/*
* The use of malloc() triggers a spurious gcc 11 -Wmaybe-uninitialized
* warning in the mlock() function call below, so use calloc().
*/
pw->value = calloc(len, 1);
if (!pw->value) {
free(pw);
return (NULL);
@ -99,7 +103,11 @@ alloc_pw_string(const char *source)
return (NULL);
}
pw->len = strlen(source) + 1;
pw->value = malloc(pw->len);
/*
* The use of malloc() triggers a spurious gcc 11 -Wmaybe-uninitialized
* warning in the mlock() function call below, so use calloc().
*/
pw->value = calloc(pw->len, 1);
if (!pw->value) {
free(pw);
return (NULL);

View File

@ -1,8 +0,0 @@
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_avx512f.c:243
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_sse2.c:266
uninitvar:module/os/freebsd/zfs/vdev_geom.c
uninitvar:module/os/freebsd/zfs/zfs_vfsops.c
uninitvar:module/os/freebsd/spl/spl_zone.c
uninitvar:lib/libzutil/os/freebsd/zutil_import_os.c
*:module/zstd/lib/zstd.c
*:module/zstd/lib/zstd.h

View File

@ -159,6 +159,16 @@ void color_start(char *color);
void color_end(void);
int printf_color(char *color, char *format, ...);
/*
* These functions are used by the ZFS libraries and cmd/zpool code, but are
* not exported in the ABI.
*/
typedef int (*pool_vdev_iter_f)(void *, nvlist_t *, void *);
int for_each_vdev_cb(void *zhp, nvlist_t *nv, pool_vdev_iter_f func,
void *data);
int for_each_vdev_in_nvlist(nvlist_t *nvroot, pool_vdev_iter_f func,
void *data);
void update_vdevs_config_dev_sysfs_path(nvlist_t *config);
#ifdef __cplusplus
}
#endif

View File

@ -67,6 +67,7 @@
#define __always_inline inline
#define noinline __noinline
#define ____cacheline_aligned __aligned(CACHE_LINE_SIZE)
#define fallthrough __attribute__((__fallthrough__))
#if !defined(_KERNEL) && !defined(_STANDALONE)
#define likely(x) __builtin_expect(!!(x), 1)

View File

@ -4,6 +4,7 @@ KERNEL_H = \
atomic.h \
byteorder.h \
callb.h \
ccompat.h \
ccompile.h \
cmn_err.h \
condvar.h \
@ -18,9 +19,11 @@ KERNEL_H = \
fcntl.h \
file.h \
freebsd_rwlock.h \
idmap.h \
inttypes.h \
isa_defs.h \
kmem_cache.h \
kidmap.h \
kmem.h \
kstat.h \
list_impl.h \

View File

@ -74,9 +74,9 @@ void spl_dumpstack(void);
"%s", "VERIFY(" #cond ") failed\n"))
#define VERIFY3B(LEFT, OP, RIGHT) do { \
boolean_t _verify3_left = (boolean_t)(LEFT); \
boolean_t _verify3_right = (boolean_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const boolean_t _verify3_left = (boolean_t)(LEFT); \
const boolean_t _verify3_right = (boolean_t)(RIGHT);\
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%d " #OP " %d)\n", \
@ -85,9 +85,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3S(LEFT, OP, RIGHT) do { \
int64_t _verify3_left = (int64_t)(LEFT); \
int64_t _verify3_right = (int64_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const int64_t _verify3_left = (int64_t)(LEFT); \
const int64_t _verify3_right = (int64_t)(RIGHT); \
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%lld " #OP " %lld)\n", \
@ -96,9 +96,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3U(LEFT, OP, RIGHT) do { \
uint64_t _verify3_left = (uint64_t)(LEFT); \
uint64_t _verify3_right = (uint64_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const uint64_t _verify3_left = (uint64_t)(LEFT); \
const uint64_t _verify3_right = (uint64_t)(RIGHT); \
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%llu " #OP " %llu)\n", \
@ -107,9 +107,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3P(LEFT, OP, RIGHT) do { \
uintptr_t _verify3_left = (uintptr_t)(LEFT); \
uintptr_t _verify3_right = (uintptr_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const uintptr_t _verify3_left = (uintptr_t)(LEFT); \
const uintptr_t _verify3_right = (uintptr_t)(RIGHT);\
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%px " #OP " %px)\n", \
@ -118,9 +118,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY0(RIGHT) do { \
int64_t _verify3_left = (int64_t)(0); \
int64_t _verify3_right = (int64_t)(RIGHT); \
if (!(_verify3_left == _verify3_right)) \
const int64_t _verify3_left = (int64_t)(0); \
const int64_t _verify3_right = (int64_t)(RIGHT); \
if (unlikely(!(_verify3_left == _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(0 == " #RIGHT ") " \
"failed (0 == %lld)\n", \
@ -154,11 +154,11 @@ void spl_dumpstack(void);
#define ASSERT0 VERIFY0
#define ASSERT VERIFY
#define IMPLY(A, B) \
((void)(((!(A)) || (B)) || \
((void)(likely((!(A)) || (B)) || \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"(" #A ") implies (" #B ")")))
#define EQUIV(A, B) \
((void)((!!(A) == !!(B)) || \
((void)(likely(!!(A) == !!(B)) || \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"(" #A ") is equivalent to (" #B ")")))
/* END CSTYLED */

View File

@ -30,6 +30,7 @@
#define _OPENSOLARIS_SYS_MISC_H_
#include <sys/limits.h>
#include <sys/filio.h>
#define MAXUID UID_MAX
@ -40,8 +41,8 @@
#define _FIOGDIO (INT_MIN+1)
#define _FIOSDIO (INT_MIN+2)
#define _FIO_SEEK_DATA FIOSEEKDATA
#define _FIO_SEEK_HOLE FIOSEEKHOLE
#define F_SEEK_DATA FIOSEEKDATA
#define F_SEEK_HOLE FIOSEEKHOLE
struct opensolaris_utsname {
char *sysname;

View File

@ -35,81 +35,47 @@
#include <sys/_uio.h>
#include <sys/debug.h>
#define uio_loffset uio_offset
typedef struct uio uio_t;
typedef struct iovec iovec_t;
typedef enum uio_seg uio_seg_t;
typedef enum uio_seg zfs_uio_seg_t;
typedef enum uio_rw zfs_uio_rw_t;
typedef enum xuio_type {
UIOTYPE_ASYNCIO,
UIOTYPE_ZEROCOPY
} xuio_type_t;
typedef struct zfs_uio {
struct uio *uio;
} zfs_uio_t;
typedef struct xuio {
uio_t xu_uio;
/* Extended uio fields */
enum xuio_type xu_type; /* What kind of uio structure? */
union {
struct {
int xu_zc_rw;
void *xu_zc_priv;
} xu_zc;
} xu_ext;
} xuio_t;
#define XUIO_XUZC_PRIV(xuio) xuio->xu_ext.xu_zc.xu_zc_priv
#define XUIO_XUZC_RW(xuio) xuio->xu_ext.xu_zc.xu_zc_rw
static __inline int
zfs_uiomove(void *cp, size_t n, enum uio_rw dir, uio_t *uio)
{
ASSERT(uio->uio_rw == dir);
return (uiomove(cp, (int)n, uio));
}
#define uiomove(cp, n, dir, uio) zfs_uiomove((cp), (n), (dir), (uio))
int uiocopy(void *p, size_t n, enum uio_rw rw, struct uio *uio, size_t *cbytes);
void uioskip(uio_t *uiop, size_t n);
#define uio_segflg(uio) (uio)->uio_segflg
#define uio_offset(uio) (uio)->uio_loffset
#define uio_resid(uio) (uio)->uio_resid
#define uio_iovcnt(uio) (uio)->uio_iovcnt
#define uio_iovlen(uio, idx) (uio)->uio_iov[(idx)].iov_len
#define uio_iovbase(uio, idx) (uio)->uio_iov[(idx)].iov_base
#define uio_fault_disable(uio, set)
#define GET_UIO_STRUCT(u) (u)->uio
#define zfs_uio_segflg(u) GET_UIO_STRUCT(u)->uio_segflg
#define zfs_uio_offset(u) GET_UIO_STRUCT(u)->uio_offset
#define zfs_uio_resid(u) GET_UIO_STRUCT(u)->uio_resid
#define zfs_uio_iovcnt(u) GET_UIO_STRUCT(u)->uio_iovcnt
#define zfs_uio_iovlen(u, idx) GET_UIO_STRUCT(u)->uio_iov[(idx)].iov_len
#define zfs_uio_iovbase(u, idx) GET_UIO_STRUCT(u)->uio_iov[(idx)].iov_base
#define zfs_uio_td(u) GET_UIO_STRUCT(u)->uio_td
#define zfs_uio_rw(u) GET_UIO_STRUCT(u)->uio_rw
#define zfs_uio_fault_disable(u, set)
#define zfs_uio_prefaultpages(size, u) (0)
static inline void
uio_iov_at_index(uio_t *uio, uint_t idx, void **base, uint64_t *len)
zfs_uio_setoffset(zfs_uio_t *uio, offset_t off)
{
*base = uio_iovbase(uio, idx);
*len = uio_iovlen(uio, idx);
zfs_uio_offset(uio) = off;
}
static inline void
uio_advance(uio_t *uio, size_t size)
zfs_uio_advance(zfs_uio_t *uio, size_t size)
{
uio->uio_resid -= size;
uio->uio_loffset += size;
zfs_uio_resid(uio) -= size;
zfs_uio_offset(uio) += size;
}
static inline offset_t
uio_index_at_offset(uio_t *uio, offset_t off, uint_t *vec_idx)
static __inline void
zfs_uio_init(zfs_uio_t *uio, struct uio *uio_s)
{
*vec_idx = 0;
while (*vec_idx < uio_iovcnt(uio) && off >= uio_iovlen(uio, *vec_idx)) {
off -= uio_iovlen(uio, *vec_idx);
(*vec_idx)++;
}
return (off);
GET_UIO_STRUCT(uio) = uio_s;
}
int zfs_uio_fault_move(void *p, size_t n, zfs_uio_rw_t dir, zfs_uio_t *uio);
#endif /* !_STANDALONE */
#endif /* !_OPENSOLARIS_SYS_UIO_H_ */

View File

@ -59,6 +59,8 @@ enum symfollow { NO_FOLLOW = NOFOLLOW };
#include <sys/file.h>
#include <sys/filedesc.h>
#include <sys/syscallsubr.h>
#include <sys/vm.h>
#include <vm/vm_object.h>
typedef struct vop_vector vnodeops_t;
#define VOP_FID VOP_VPTOFH
@ -88,6 +90,22 @@ vn_is_readonly(vnode_t *vp)
#define vn_has_cached_data(vp) \
((vp)->v_object != NULL && \
(vp)->v_object->resident_page_count > 0)
static __inline void
vn_flush_cached_data(vnode_t *vp, boolean_t sync)
{
#if __FreeBSD_version > 1300054
if (vm_object_mightbedirty(vp->v_object)) {
#else
if (vp->v_object->flags & OBJ_MIGHTBEDIRTY) {
#endif
int flags = sync ? OBJPC_SYNC : 0;
zfs_vmobject_wlock(vp->v_object);
vm_object_page_clean(vp->v_object, 0, 0, flags);
zfs_vmobject_wunlock(vp->v_object);
}
}
#define vn_exists(vp) do { } while (0)
#define vn_invalid(vp) do { } while (0)
#define vn_renamepath(tdvp, svp, tnm, lentnm) do { } while (0)

View File

@ -92,7 +92,7 @@ int freebsd_crypt_newsession(freebsd_crypt_session_t *sessp,
void freebsd_crypt_freesession(freebsd_crypt_session_t *sessp);
int freebsd_crypt_uio(boolean_t, freebsd_crypt_session_t *,
struct zio_crypt_info *, uio_t *, crypto_key_t *, uint8_t *,
struct zio_crypt_info *, zfs_uio_t *, crypto_key_t *, uint8_t *,
size_t, size_t);
#endif /* _ZFS_FREEBSD_CRYPTO_H */

View File

@ -42,7 +42,6 @@
#include <linux/types.h>
#define cond_resched() kern_yield(PRI_USER)
#define uio_prefaultpages(size, uio) (0)
#define taskq_create_sysdc(a, b, d, e, p, dc, f) \
(taskq_create(a, b, maxclsyspri, d, e, f))

View File

@ -27,6 +27,10 @@
#ifndef _SYS_FS_ZFS_VFSOPS_H
#define _SYS_FS_ZFS_VFSOPS_H
#if __FreeBSD_version >= 1300125
#define TEARDOWN_RMS
#endif
#if __FreeBSD_version >= 1300109
#define TEARDOWN_INACTIVE_RMS
#endif
@ -46,10 +50,16 @@
extern "C" {
#endif
#ifdef TEARDOWN_INACTIVE_RMS
#ifdef TEARDOWN_RMS
typedef struct rmslock zfs_teardown_lock_t;
#else
#define zfs_teardown_lock_t krwlock_t
#define zfs_teardown_lock_t rrmlock_t
#endif
#ifdef TEARDOWN_INACTIVE_RMS
typedef struct rmslock zfs_teardown_inactive_lock_t;
#else
#define zfs_teardown_inactive_lock_t krwlock_t
#endif
typedef struct zfsvfs zfsvfs_t;
@ -80,8 +90,8 @@ struct zfsvfs {
int z_norm; /* normalization flags */
boolean_t z_atime; /* enable atimes mount option */
boolean_t z_unmounted; /* unmounted */
rrmlock_t z_teardown_lock;
zfs_teardown_lock_t z_teardown_inactive_lock;
zfs_teardown_lock_t z_teardown_lock;
zfs_teardown_inactive_lock_t z_teardown_inactive_lock;
list_t z_all_znodes; /* all vnodes in the fs */
uint64_t z_nr_znodes; /* number of znodes in the fs */
kmutex_t z_znodes_lock; /* lock for z_all_znodes */
@ -112,53 +122,121 @@ struct zfsvfs {
struct task z_unlinked_drain_task;
};
#ifdef TEARDOWN_RMS
#define ZFS_TEARDOWN_INIT(zfsvfs) \
rms_init(&(zfsvfs)->z_teardown_lock, "zfs teardown")
#define ZFS_TEARDOWN_DESTROY(zfsvfs) \
rms_destroy(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_TRY_ENTER_READ(zfsvfs) \
rms_try_rlock(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_ENTER_READ(zfsvfs, tag) \
rms_rlock(&(zfsvfs)->z_teardown_lock);
#define ZFS_TEARDOWN_EXIT_READ(zfsvfs, tag) \
rms_runlock(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_ENTER_WRITE(zfsvfs, tag) \
rms_wlock(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_EXIT_WRITE(zfsvfs) \
rms_wunlock(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_EXIT(zfsvfs, tag) \
rms_unlock(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_READ_HELD(zfsvfs) \
rms_rowned(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_WRITE_HELD(zfsvfs) \
rms_wowned(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_HELD(zfsvfs) \
rms_owned_any(&(zfsvfs)->z_teardown_lock)
#else
#define ZFS_TEARDOWN_INIT(zfsvfs) \
rrm_init(&(zfsvfs)->z_teardown_lock, B_FALSE)
#define ZFS_TEARDOWN_DESTROY(zfsvfs) \
rrm_destroy(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_TRY_ENTER_READ(zfsvfs) \
rw_tryenter(&(zfsvfs)->z_teardown_lock, RW_READER)
#define ZFS_TEARDOWN_ENTER_READ(zfsvfs, tag) \
rrm_enter_read(&(zfsvfs)->z_teardown_lock, tag);
#define ZFS_TEARDOWN_EXIT_READ(zfsvfs, tag) \
rrm_exit(&(zfsvfs)->z_teardown_lock, tag)
#define ZFS_TEARDOWN_ENTER_WRITE(zfsvfs, tag) \
rrm_enter(&(zfsvfs)->z_teardown_lock, RW_WRITER, tag)
#define ZFS_TEARDOWN_EXIT_WRITE(zfsvfs) \
rrm_exit(&(zfsvfs)->z_teardown_lock, tag)
#define ZFS_TEARDOWN_EXIT(zfsvfs, tag) \
rrm_exit(&(zfsvfs)->z_teardown_lock, tag)
#define ZFS_TEARDOWN_READ_HELD(zfsvfs) \
RRM_READ_HELD(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_WRITE_HELD(zfsvfs) \
RRM_WRITE_HELD(&(zfsvfs)->z_teardown_lock)
#define ZFS_TEARDOWN_HELD(zfsvfs) \
RRM_LOCK_HELD(&(zfsvfs)->z_teardown_lock)
#endif
#ifdef TEARDOWN_INACTIVE_RMS
#define ZFS_INIT_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_INIT(zfsvfs) \
rms_init(&(zfsvfs)->z_teardown_inactive_lock, "zfs teardown inactive")
#define ZFS_DESTROY_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_DESTROY(zfsvfs) \
rms_destroy(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_TRYRLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_TRY_ENTER_READ(zfsvfs) \
rms_try_rlock(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_RLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_ENTER_READ(zfsvfs) \
rms_rlock(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_RUNLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_EXIT_READ(zfsvfs) \
rms_runlock(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_WLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_ENTER_WRITE(zfsvfs) \
rms_wlock(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_WUNLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_EXIT_WRITE(zfsvfs) \
rms_wunlock(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_TEARDOWN_INACTIVE_WLOCKED(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_WRITE_HELD(zfsvfs) \
rms_wowned(&(zfsvfs)->z_teardown_inactive_lock)
#else
#define ZFS_INIT_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_INIT(zfsvfs) \
rw_init(&(zfsvfs)->z_teardown_inactive_lock, NULL, RW_DEFAULT, NULL)
#define ZFS_DESTROY_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_DESTROY(zfsvfs) \
rw_destroy(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_TRYRLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_TRY_ENTER_READ(zfsvfs) \
rw_tryenter(&(zfsvfs)->z_teardown_inactive_lock, RW_READER)
#define ZFS_RLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_ENTER_READ(zfsvfs) \
rw_enter(&(zfsvfs)->z_teardown_inactive_lock, RW_READER)
#define ZFS_RUNLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_EXIT_READ(zfsvfs) \
rw_exit(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_WLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_ENTER_WRITE(zfsvfs) \
rw_enter(&(zfsvfs)->z_teardown_inactive_lock, RW_WRITER)
#define ZFS_WUNLOCK_TEARDOWN_INACTIVE(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_EXIT_WRITE(zfsvfs) \
rw_exit(&(zfsvfs)->z_teardown_inactive_lock)
#define ZFS_TEARDOWN_INACTIVE_WLOCKED(zfsvfs) \
#define ZFS_TEARDOWN_INACTIVE_WRITE_HELD(zfsvfs) \
RW_WRITE_HELD(&(zfsvfs)->z_teardown_inactive_lock)
#endif

View File

@ -40,6 +40,7 @@
#include <sys/zil.h>
#include <sys/zfs_project.h>
#include <vm/vm_object.h>
#include <sys/uio.h>
#ifdef __cplusplus
extern "C" {
@ -117,24 +118,26 @@ extern minor_t zfsdev_minor_alloc(void);
#define Z_ISDIR(type) ((type) == VDIR)
#define zn_has_cached_data(zp) vn_has_cached_data(ZTOV(zp))
#define zn_rlimit_fsize(zp, uio, td) vn_rlimit_fsize(ZTOV(zp), (uio), (td))
#define zn_flush_cached_data(zp, sync) vn_flush_cached_data(ZTOV(zp), sync)
#define zn_rlimit_fsize(zp, uio) \
vn_rlimit_fsize(ZTOV(zp), GET_UIO_STRUCT(uio), zfs_uio_td(uio))
/* Called on entry to each ZFS vnode and vfs operation */
#define ZFS_ENTER(zfsvfs) \
{ \
rrm_enter_read(&(zfsvfs)->z_teardown_lock, FTAG); \
if ((zfsvfs)->z_unmounted) { \
ZFS_EXIT(zfsvfs); \
ZFS_TEARDOWN_ENTER_READ((zfsvfs), FTAG); \
if (__predict_false((zfsvfs)->z_unmounted)) { \
ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG); \
return (EIO); \
} \
}
/* Must be called before exiting the vop */
#define ZFS_EXIT(zfsvfs) rrm_exit(&(zfsvfs)->z_teardown_lock, FTAG)
#define ZFS_EXIT(zfsvfs) ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG)
/* Verifies the znode is valid */
#define ZFS_VERIFY_ZP(zp) \
if ((zp)->z_sa_hdl == NULL) { \
if (__predict_false((zp)->z_sa_hdl == NULL)) { \
ZFS_EXIT((zp)->z_zfsvfs); \
return (EIO); \
} \
@ -173,7 +176,6 @@ extern void zfs_tstamp_update_setup_ext(struct znode *,
uint_t, uint64_t [2], uint64_t [2], boolean_t have_tx);
extern void zfs_znode_free(struct znode *);
extern zil_get_data_t zfs_get_data;
extern zil_replay_func_t *zfs_replay_vector[TX_MAX_TYPE];
extern int zfsfstype;

View File

@ -30,9 +30,9 @@
#define _ZFS_BLKDEV_H
#include <linux/blkdev.h>
#include <linux/elevator.h>
#include <linux/backing-dev.h>
#include <linux/hdreg.h>
#include <linux/major.h>
#include <linux/msdos_fs.h> /* for SECTOR_* */
#ifndef HAVE_BLK_QUEUE_FLAG_SET
@ -92,11 +92,14 @@ blk_queue_set_write_cache(struct request_queue *q, bool wc, bool fua)
static inline void
blk_queue_set_read_ahead(struct request_queue *q, unsigned long ra_pages)
{
#if !defined(HAVE_BLK_QUEUE_UPDATE_READAHEAD) && \
!defined(HAVE_DISK_UPDATE_READAHEAD)
#ifdef HAVE_BLK_QUEUE_BDI_DYNAMIC
q->backing_dev_info->ra_pages = ra_pages;
#else
q->backing_dev_info.ra_pages = ra_pages;
#endif
#endif
}
#ifdef HAVE_BIO_BVEC_ITER
@ -277,18 +280,22 @@ bio_set_bi_error(struct bio *bio, int error)
static inline int
zfs_check_media_change(struct block_device *bdev)
{
#ifdef HAVE_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
struct gendisk *gd = bdev->bd_disk;
const struct block_device_operations *bdo = gd->fops;
#endif
if (!bdev_check_media_change(bdev))
return (0);
#ifdef HAVE_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
/*
* Force revalidation, to mimic the old behavior of
* check_disk_change()
*/
if (bdo->revalidate_disk)
bdo->revalidate_disk(gd);
#endif
return (0);
}
@ -520,7 +527,9 @@ blk_generic_start_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio)
{
#if defined(HAVE_BIO_IO_ACCT)
#if defined(HAVE_DISK_IO_ACCT)
return (disk_start_io_acct(disk, bio_sectors(bio), bio_op(bio)));
#elif defined(HAVE_BIO_IO_ACCT)
return (bio_start_io_acct(bio));
#elif defined(HAVE_GENERIC_IO_ACCT_3ARG)
unsigned long start_time = jiffies;
@ -541,7 +550,9 @@ blk_generic_end_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio, unsigned long start_time)
{
#if defined(HAVE_BIO_IO_ACCT)
#if defined(HAVE_DISK_IO_ACCT)
disk_end_io_acct(disk, bio_op(bio), start_time);
#elif defined(HAVE_BIO_IO_ACCT)
bio_end_io_acct(bio, start_time);
#elif defined(HAVE_GENERIC_IO_ACCT_3ARG)
generic_end_io_acct(rw, &disk->part0, start_time);

View File

@ -28,6 +28,14 @@
#include <linux/compiler.h>
#if !defined(fallthrough)
#if defined(HAVE_IMPLICIT_FALLTHROUGH)
#define fallthrough __attribute__((__fallthrough__))
#else
#define fallthrough ((void)0)
#endif
#endif
#if !defined(READ_ONCE)
#define READ_ONCE(x) ACCESS_ONCE(x)
#endif

View File

@ -88,6 +88,9 @@
#if defined(HAVE_KERNEL_FPU_API_HEADER)
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
#if defined(HAVE_KERNEL_FPU_XCR_HEADER)
#include <asm/fpu/xcr.h>
#endif
#else
#include <asm/i387.h>
#include <asm/xcr.h>

View File

@ -343,7 +343,8 @@ static inline void zfs_gid_write(struct inode *ip, gid_t gid)
/*
* 4.9 API change
*/
#ifndef HAVE_SETATTR_PREPARE
#if !(defined(HAVE_SETATTR_PREPARE_NO_USERNS) || \
defined(HAVE_SETATTR_PREPARE_USERNS))
static inline int
setattr_prepare(struct dentry *dentry, struct iattr *ia)
{
@ -389,6 +390,15 @@ func(const struct path *path, struct kstat *stat, u32 request_mask, \
{ \
return (func##_impl(path, stat, request_mask, query_flags)); \
}
#elif defined(HAVE_USERNS_IOPS_GETATTR)
#define ZPL_GETATTR_WRAPPER(func) \
static int \
func(struct user_namespace *user_ns, const struct path *path, \
struct kstat *stat, u32 request_mask, unsigned int query_flags) \
{ \
return (func##_impl(user_ns, path, stat, request_mask, \
query_flags)); \
}
#else
#error
#endif
@ -436,4 +446,16 @@ zpl_is_32bit_api(void)
#endif
}
/*
* 5.12 API change
* To support id-mapped mounts, generic_fillattr() was modified to
* accept a new struct user_namespace* as its first arg.
*/
#ifdef HAVE_GENERIC_FILLATTR_USERNS
#define zpl_generic_fillattr(user_ns, ip, sp) \
generic_fillattr(user_ns, ip, sp)
#else
#define zpl_generic_fillattr(user_ns, ip, sp) generic_fillattr(ip, sp)
#endif
#endif /* _ZFS_VFS_H */

View File

@ -119,12 +119,27 @@ fn(struct dentry *dentry, const char *name, void *buffer, size_t size, \
#error "Unsupported kernel"
#endif
/*
* 5.12 API change,
* The xattr_handler->set() callback was changed to take the
* struct user_namespace* as the first arg, to support idmapped
* mounts.
*/
#if defined(HAVE_XATTR_SET_USERNS)
#define ZPL_XATTR_SET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct user_namespace *user_ns, \
struct dentry *dentry, struct inode *inode, const char *name, \
const void *buffer, size_t size, int flags) \
{ \
return (__ ## fn(inode, name, buffer, size, flags)); \
}
/*
* 4.7 API change,
* The xattr_handler->set() callback was changed to take a both dentry and
* inode, because the dentry might not be attached to an inode yet.
*/
#if defined(HAVE_XATTR_SET_DENTRY_INODE)
#elif defined(HAVE_XATTR_SET_DENTRY_INODE)
#define ZPL_XATTR_SET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct dentry *dentry, \

View File

@ -24,7 +24,11 @@
#ifndef _SPL_CMN_ERR_H
#define _SPL_CMN_ERR_H
#if defined(_KERNEL) && defined(HAVE_STANDALONE_LINUX_STDARG)
#include <linux/stdarg.h>
#else
#include <stdarg.h>
#endif
#define CE_CONT 0 /* continuation */
#define CE_NOTE 1 /* notice */

View File

@ -68,9 +68,9 @@ void spl_dumpstack(void);
"%s", "VERIFY(" #cond ") failed\n"))
#define VERIFY3B(LEFT, OP, RIGHT) do { \
boolean_t _verify3_left = (boolean_t)(LEFT); \
boolean_t _verify3_right = (boolean_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const boolean_t _verify3_left = (boolean_t)(LEFT); \
const boolean_t _verify3_right = (boolean_t)(RIGHT);\
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%d " #OP " %d)\n", \
@ -79,9 +79,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3S(LEFT, OP, RIGHT) do { \
int64_t _verify3_left = (int64_t)(LEFT); \
int64_t _verify3_right = (int64_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const int64_t _verify3_left = (int64_t)(LEFT); \
const int64_t _verify3_right = (int64_t)(RIGHT); \
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%lld " #OP " %lld)\n", \
@ -90,9 +90,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3U(LEFT, OP, RIGHT) do { \
uint64_t _verify3_left = (uint64_t)(LEFT); \
uint64_t _verify3_right = (uint64_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const uint64_t _verify3_left = (uint64_t)(LEFT); \
const uint64_t _verify3_right = (uint64_t)(RIGHT); \
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%llu " #OP " %llu)\n", \
@ -101,9 +101,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY3P(LEFT, OP, RIGHT) do { \
uintptr_t _verify3_left = (uintptr_t)(LEFT); \
uintptr_t _verify3_right = (uintptr_t)(RIGHT); \
if (!(_verify3_left OP _verify3_right)) \
const uintptr_t _verify3_left = (uintptr_t)(LEFT); \
const uintptr_t _verify3_right = (uintptr_t)(RIGHT);\
if (unlikely(!(_verify3_left OP _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%px " #OP " %px)\n", \
@ -112,9 +112,9 @@ void spl_dumpstack(void);
} while (0)
#define VERIFY0(RIGHT) do { \
int64_t _verify3_left = (int64_t)(0); \
int64_t _verify3_right = (int64_t)(RIGHT); \
if (!(_verify3_left == _verify3_right)) \
const int64_t _verify3_left = (int64_t)(0); \
const int64_t _verify3_right = (int64_t)(RIGHT); \
if (unlikely(!(_verify3_left == _verify3_right))) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(0 == " #RIGHT ") " \
"failed (0 == %lld)\n", \
@ -154,11 +154,11 @@ void spl_dumpstack(void);
#define ASSERT0 VERIFY0
#define ASSERT VERIFY
#define IMPLY(A, B) \
((void)(((!(A)) || (B)) || \
((void)(likely((!(A)) || (B)) || \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"(" #A ") implies (" #B ")")))
#define EQUIV(A, B) \
((void)((!!(A) == !!(B)) || \
((void)(likely(!!(A) == !!(B)) || \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"(" #A ") is equivalent to (" #B ")")))
/* END CSTYLED */

View File

@ -33,22 +33,6 @@
#define FORREAL 0 /* Usual side-effects */
#define JUSTLOOKING 1 /* Don't stop the process */
/*
* The "why" argument indicates the allowable side-effects of the call:
*
* FORREAL: Extract the next pending signal from p_sig into p_cursig;
* stop the process if a stop has been requested or if a traced signal
* is pending.
*
* JUSTLOOKING: Don't stop the process, just indicate whether or not
* a signal might be pending (FORREAL is needed to tell for sure).
*/
static __inline__ int
issig(int why)
{
ASSERT(why == FORREAL || why == JUSTLOOKING);
return (signal_pending(current));
}
extern int issig(int why);
#endif /* SPL_SIGNAL_H */

View File

@ -70,4 +70,17 @@ extern struct task_struct *spl_kthread_create(int (*func)(void *),
extern proc_t p0;
#ifdef HAVE_SIGINFO
typedef kernel_siginfo_t spl_kernel_siginfo_t;
#else
typedef siginfo_t spl_kernel_siginfo_t;
#endif
#ifdef HAVE_SET_SPECIAL_STATE
#define spl_set_special_state(x) set_special_state((x))
#else
#define spl_set_special_state(x) __set_current_state((x))
#endif
#endif /* _SPL_THREAD_H */

View File

@ -36,21 +36,21 @@
typedef struct iovec iovec_t;
typedef enum uio_rw {
typedef enum zfs_uio_rw {
UIO_READ = 0,
UIO_WRITE = 1,
} uio_rw_t;
} zfs_uio_rw_t;
typedef enum uio_seg {
typedef enum zfs_uio_seg {
UIO_USERSPACE = 0,
UIO_SYSSPACE = 1,
UIO_BVEC = 2,
#if defined(HAVE_VFS_IOV_ITER)
UIO_ITER = 3,
#endif
} uio_seg_t;
} zfs_uio_seg_t;
typedef struct uio {
typedef struct zfs_uio {
union {
const struct iovec *uio_iov;
const struct bio_vec *uio_bvec;
@ -60,91 +60,39 @@ typedef struct uio {
};
int uio_iovcnt;
offset_t uio_loffset;
uio_seg_t uio_segflg;
zfs_uio_seg_t uio_segflg;
boolean_t uio_fault_disable;
uint16_t uio_fmode;
uint16_t uio_extflg;
ssize_t uio_resid;
size_t uio_skip;
} uio_t;
} zfs_uio_t;
typedef struct aio_req {
uio_t *aio_uio;
void *aio_private;
} aio_req_t;
#define zfs_uio_segflg(u) (u)->uio_segflg
#define zfs_uio_offset(u) (u)->uio_loffset
#define zfs_uio_resid(u) (u)->uio_resid
#define zfs_uio_iovcnt(u) (u)->uio_iovcnt
#define zfs_uio_iovlen(u, idx) (u)->uio_iov[(idx)].iov_len
#define zfs_uio_iovbase(u, idx) (u)->uio_iov[(idx)].iov_base
#define zfs_uio_fault_disable(u, set) (u)->uio_fault_disable = set
#define zfs_uio_rlimit_fsize(z, u) (0)
#define zfs_uio_fault_move(p, n, rw, u) zfs_uiomove((p), (n), (rw), (u))
typedef enum xuio_type {
UIOTYPE_ASYNCIO,
UIOTYPE_ZEROCOPY,
} xuio_type_t;
#define UIOA_IOV_MAX 16
typedef struct uioa_page_s {
int uioa_pfncnt;
void **uioa_ppp;
caddr_t uioa_base;
size_t uioa_len;
} uioa_page_t;
typedef struct xuio {
uio_t xu_uio;
enum xuio_type xu_type;
union {
struct {
uint32_t xu_a_state;
ssize_t xu_a_mbytes;
uioa_page_t *xu_a_lcur;
void **xu_a_lppp;
void *xu_a_hwst[4];
uioa_page_t xu_a_locked[UIOA_IOV_MAX];
} xu_aio;
struct {
int xu_zc_rw;
void *xu_zc_priv;
} xu_zc;
} xu_ext;
} xuio_t;
#define XUIO_XUZC_PRIV(xuio) xuio->xu_ext.xu_zc.xu_zc_priv
#define XUIO_XUZC_RW(xuio) xuio->xu_ext.xu_zc.xu_zc_rw
#define uio_segflg(uio) (uio)->uio_segflg
#define uio_offset(uio) (uio)->uio_loffset
#define uio_resid(uio) (uio)->uio_resid
#define uio_iovcnt(uio) (uio)->uio_iovcnt
#define uio_iovlen(uio, idx) (uio)->uio_iov[(idx)].iov_len
#define uio_iovbase(uio, idx) (uio)->uio_iov[(idx)].iov_base
#define uio_fault_disable(uio, set) (uio)->uio_fault_disable = set
extern int zfs_uio_prefaultpages(ssize_t, zfs_uio_t *);
static inline void
uio_iov_at_index(uio_t *uio, uint_t idx, void **base, uint64_t *len)
zfs_uio_setoffset(zfs_uio_t *uio, offset_t off)
{
*base = uio_iovbase(uio, idx);
*len = uio_iovlen(uio, idx);
uio->uio_loffset = off;
}
static inline void
uio_advance(uio_t *uio, size_t size)
zfs_uio_advance(zfs_uio_t *uio, size_t size)
{
uio->uio_resid -= size;
uio->uio_loffset += size;
}
static inline offset_t
uio_index_at_offset(uio_t *uio, offset_t off, uint_t *vec_idx)
{
*vec_idx = 0;
while (*vec_idx < uio_iovcnt(uio) && off >= uio_iovlen(uio, *vec_idx)) {
off -= uio_iovlen(uio, *vec_idx);
(*vec_idx)++;
}
return (off);
}
static inline void
iov_iter_init_compat(struct iov_iter *iter, unsigned int dir,
const struct iovec *iov, unsigned long nr_segs, size_t count)
@ -159,8 +107,9 @@ iov_iter_init_compat(struct iov_iter *iter, unsigned int dir,
}
static inline void
uio_iovec_init(uio_t *uio, const struct iovec *iov, unsigned long nr_segs,
offset_t offset, uio_seg_t seg, ssize_t resid, size_t skip)
zfs_uio_iovec_init(zfs_uio_t *uio, const struct iovec *iov,
unsigned long nr_segs, offset_t offset, zfs_uio_seg_t seg, ssize_t resid,
size_t skip)
{
ASSERT(seg == UIO_USERSPACE || seg == UIO_SYSSPACE);
@ -176,7 +125,7 @@ uio_iovec_init(uio_t *uio, const struct iovec *iov, unsigned long nr_segs,
}
static inline void
uio_bvec_init(uio_t *uio, struct bio *bio)
zfs_uio_bvec_init(zfs_uio_t *uio, struct bio *bio)
{
uio->uio_bvec = &bio->bi_io_vec[BIO_BI_IDX(bio)];
uio->uio_iovcnt = bio->bi_vcnt - BIO_BI_IDX(bio);
@ -191,7 +140,7 @@ uio_bvec_init(uio_t *uio, struct bio *bio)
#if defined(HAVE_VFS_IOV_ITER)
static inline void
uio_iov_iter_init(uio_t *uio, struct iov_iter *iter, offset_t offset,
zfs_uio_iov_iter_init(zfs_uio_t *uio, struct iov_iter *iter, offset_t offset,
ssize_t resid, size_t skip)
{
uio->uio_iter = iter;

View File

@ -52,6 +52,12 @@
#define F_FREESP 11 /* Free file space */
#if defined(SEEK_HOLE) && defined(SEEK_DATA)
#define F_SEEK_DATA SEEK_DATA
#define F_SEEK_HOLE SEEK_HOLE
#endif
/*
* The vnode AT_ flags are mapped to the Linux ATTR_* flags.
* This allows them to be used safely with an iattr structure.

View File

@ -23,8 +23,8 @@
#ifndef ZFS_CONTEXT_OS_H
#define ZFS_CONTEXT_OS_H
#include <sys/uio_impl.h>
#include <linux/dcache_compat.h>
#include <linux/utsname_compat.h>
#include <linux/compiler_compat.h>
#endif

Some files were not shown because too many files have changed in this diff Show More