FreeBSD has a per-page "busy" lock which is held when handling a page
fault on a mapped file. This lock is also acquired when copying data
from the DMU to the page cache in zfs_write(). File range locks are
also acquired in both of these paths, in the opposite order with respect
to the busy lock.
In the getpages VOP, the range lock is only used to determine the extent
of optional read-ahead and read-behind operations. To resolve the lock
order reversal, modify the getpages VOP to avoid blocking on the range
lock.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes#10519
zfs_rangelock_tryenter() bails immediately instead of waiting for the
lock to become available. This will be used to resolve a deadlock in
the FreeBSD page-in code. No functional change intended.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes#10519
When clearing a bit, we should check whether that bit is 0.
Note atomic_clear_long_excl is not used.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Liu Qing <winglq@gmail.com>
Closes#10526
`zfs_freebsd_need_inactive` appears to been based on an unfinished
version of https://reviews.freebsd.org/D22130 which had a bug where
files written via mmap wouldn't actually persist.
Update the function to match the final version committed to FreeBSD.
Authored-by: Mateusz Guzik <mjg@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10527Closes#10528
The device_rebuild feature enables sequential reconstruction when
resilvering. Mirror vdevs can be rebuilt in LBA order which may
more quickly restore redundancy depending on the pools average block
size, overall fragmentation and the performance characteristics
of the devices. However, block checksums cannot be verified
as part of the rebuild thus a scrub is automatically started after
the sequential resilver completes.
The new '-s' option has been added to the `zpool attach` and
`zpool replace` command to request sequential reconstruction
instead of healing reconstruction when resilvering.
zpool attach -s <pool> <existing vdev> <new vdev>
zpool replace -s <pool> <old vdev> <new vdev>
The `zpool status` output has been updated to report the progress
of sequential resilvering in the same way as healing resilvering.
The one notable difference is that multiple sequential resilvers
may be in progress as long as they're operating on different
top-level vdevs.
The `zpool wait -t resilver` command was extended to wait on
sequential resilvers. From this perspective they are no different
than healing resilvers.
Sequential resilvers cannot be supported for RAIDZ, but are
compatible with the dRAID feature being developed.
As part of this change the resilver_restart_* tests were moved
in to the functional/replacement directory. Additionally, the
replacement tests were renamed and extended to verify both
resilvering and rebuilding.
Original-patch-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: John Poduska <jpoduska@datto.com>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#10349
Fix header conflicts when building zfs with openzfs as a vendor import.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#10497
a+=b is not supported by all shells. It is equivalent to a=${a}b, so
just rewrite it as that.
This also fixes commit 9ea6c3d3, which intended to only make the
definitions of _dracutdir, _udevdir, and _udevruledir conditional, but
actually ensured that _initconfdir no longer got defined if _dracutdir
was defined, and defined _udevdir to the value that should have been
used for _udevruledir.
This also fixes the fact that the checks introduced by commit 9ea6c3d3
could never work: ZFS_AC_PACKAGE was called before the configuration
options were processed.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Closes#10518
The standard test command does not support the == operator. Certain
shells, including bash, do support it, but in those shells it does
exactly the same thing as the standard = operator. Use that instead.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Closes#10509
OS-specific code (e.g. under `module/os/linux`) does not need to share
its code structure with any other operating systems. In particular, the
ARC and kmem code need not be similar to the code in illumos, because we
won't be syncing this OS-specific code between operating systems. For
example, if/when illumos support is added to the common repo, we would
add a file `module/os/illumos/zfs/arc_os.c` for the illumos versions of
this code.
Therefore, we can simplify the code in the OS-specific ARC and kmem
routines.
These changes do not impact system behavior, they are purely code
cleanup. The changes are:
Arenas are not used on Linux or FreeBSD (they are always `NULL`), so
`heap_arena`, `zio_arena`, and `zio_alloc_arena` can be removed, along
with code that uses them.
In `arc_available_memory()`:
* `desfree` is unused, remove it
* rename `freemem` to avoid conflict with pre-existing `#define`
* remove checks related to arenas
* use units of bytes, rather than converting from bytes to pages and
then back to bytes
`SPL_KMEM_CACHE_REAP` is unused, remove it.
`skc_reap` is unused, remove it.
The `count` argument to `spl_kmem_cache_reap_now()` is unused, remove
it.
`vmem_size()` and associated type and macros are unused, remove them.
In `arc_memory_throttle()`, use a less confusing variable name to store
the result of `arc_free_memory()`.
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10499
Otherwise it does nothing on an out-of-tree build.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10506
Move this file out of .github and add it to distribution.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10506
This doesn't appear to be used by the buildbot any more, let's remove
it.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10506
The kernel headers are installed for DKMS on linux, so don't install
them unless we're building on linux.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10506
This is helpful for determining the size of the nvlist of snapshots
and properties
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes#10505
The SPL provides a wrapper for the kernel's shrinker callbacks, which
enables the ZFS code to interface with multiple versions of the shrinker
API's from different kernel versions. Specifically, Linux kernels 3.0 -
3.11 has a single "combined" callback, and Linux kernels 3.12 and later
have two "split" callbacks. The SPL provides a wrapper function so that
the ZFS code only needs to implement one version of the callbacks.
Currently the SPL's wrappers are designed such that the ZFS code
implements the older, "combined" callback. There are a few downsides to
this approach:
* The general design within ZFS is for the latest Linux kernel to be
considered the "first class" API.
* The newer, "split" callback API is easier to understand, because each
callback has one purpose.
* The current wrappers do not completely abstract out the differing
API's, so ZFS code needs `#ifdef` code to handle the differing return
values required for different kernel versions.
This commit addresses these drawbacks by having the ZFS code provide the
latest, "split" callbacks, and the SPL provides a wrapping function for
the older, "combined" API.
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10502
A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc6fe
Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.
This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.
The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes#10397
The meaning of the `free` field is currently `zfs_arc_sys_free`, which
is the target amount of memory to leave free for the system, and is
constant after booting.
This commit changes the meaning of `free` to arc_free_memory(), the
amount of memory that the ARC considers to be free.
It also adds a new arcstat field `avail`, which tracks
`arc_available_memory()`.
Since `avail` can be negative, it also updates the arcstat script to
pretty-print negative values.
example output:
$ arcstat -f time,miss,arcsz,c,grow,need,free,avail 1
time miss arcsz c grow need free avail
15:03:02 39K 114G 114G 0 0 2.4G 407M
15:03:03 42K 114G 114G 0 0 2.1G 120M
15:03:04 40K 114G 114G 0 0 1.8G -177M
15:03:05 24K 113G 112G 0 0 1.7G -269M
15:03:06 29K 111G 110G 0 0 1.6G -385M
15:03:07 27K 110G 108G 0 0 1.4G -535M
15:03:08 13K 108G 108G 0 0 2.2G 239M
15:03:09 33K 107G 107G 0 0 1.3G -639M
15:03:10 16K 105G 102G 0 0 2.6G 704M
15:03:11 7.2K 102G 102G 0 0 5.1G 3.1G
15:03:12 42K 103G 102G 0 0 4.8G 2.8G
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10494
The block histogram tracks the changes to psize, lsize and asize
both in the count of the number of blocks (by blocksize) and the
total length of all of the blocks for that blocksize. It also
keeps a running total of the cumulative size of all of the blocks
up to each size to help determine the size of caching SSDs to be
added to zfs hardware deployments.
The block history counts and lengths are summarized in bins
which are powers of two. Even rows with counts of zero are printed.
This change is accessed by specifying one of two options:
zdb -bbb pool
zdb -Pbbb pool
The first version prints the table in fixed size columns.
The second prints in "parseable" output that can be placed into
a CSV file.
Fixed Column, nicenum output sample:
block psize lsize asize
size Count Length Cum. Count Length Cum. Count Length Cum.
512: 3.50K 1.75M 1.75M 3.43K 1.71M 1.71M 3.41K 1.71M 1.71M
1K: 3.65K 3.67M 5.43M 3.43K 3.44M 5.15M 3.50K 3.51M 5.22M
2K: 3.45K 6.92M 12.3M 3.41K 6.83M 12.0M 3.59K 7.26M 12.5M
4K: 3.44K 13.8M 26.1M 3.43K 13.7M 25.7M 3.49K 14.1M 26.6M
8K: 3.42K 27.3M 53.5M 3.41K 27.3M 53.0M 3.44K 27.6M 54.2M
16K: 3.43K 54.9M 108M 3.50K 56.1M 109M 3.42K 54.7M 109M
32K: 3.44K 110M 219M 3.41K 109M 218M 3.43K 110M 219M
64K: 3.41K 218M 437M 3.41K 218M 437M 3.44K 221M 439M
128K: 3.41K 437M 874M 3.70K 474M 911M 3.41K 437M 876M
256K: 3.41K 874M 1.71G 3.41K 874M 1.74G 3.41K 874M 1.71G
512K: 3.41K 1.71G 3.41G 3.41K 1.71G 3.45G 3.41K 1.71G 3.42G
1M: 3.41K 3.41G 6.82G 3.41K 3.41G 6.86G 3.41K 3.41G 6.83G
2M: 0 0 6.82G 0 0 6.86G 0 0 6.83G
4M: 0 0 6.82G 0 0 6.86G 0 0 6.83G
8M: 0 0 6.82G 0 0 6.86G 0 0 6.83G
16M: 0 0 6.82G 0 0 6.86G 0 0 6.83G
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Robert E. Novak <novak5@llnl.gov>
Closes: #9158Closes#10315
Reduce the usage of EXTRA_DIST. If files are conditionally included in
_SOURCES, _HEADERS etc, automake is smart enough to dist all files that
could possibly be included, but this does not apply to EXTRA_DIST,
resulting in make dist depending on the configuration.
Add some files that were missing altogether in various Makefile's.
The changes to disted files in this commit (excluding deleted files):
+./cmd/zed/agents/README.md
+./etc/init.d/README.md
+./lib/libspl/os/freebsd/getexecname.c
+./lib/libspl/os/freebsd/gethostid.c
+./lib/libspl/os/freebsd/getmntany.c
+./lib/libspl/os/freebsd/mnttab.c
-./lib/libzfs/libzfs_core.pc
-./lib/libzfs/libzfs.pc
+./lib/libzfs/os/freebsd/libzfs_compat.c
+./lib/libzfs/os/freebsd/libzfs_fsshare.c
+./lib/libzfs/os/freebsd/libzfs_ioctl_compat.c
+./lib/libzfs/os/freebsd/libzfs_zmount.c
+./lib/libzutil/os/freebsd/zutil_compat.c
+./lib/libzutil/os/freebsd/zutil_device_path_os.c
+./lib/libzutil/os/freebsd/zutil_import_os.c
+./module/lua/README.zfs
+./module/os/linux/spl/README.md
+./tests/README.md
+./tests/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_rm_nested.ksh
+./tests/zfs-tests/tests/functional/cli_root/zfs_send/zfs_send_encrypted_unloaded.ksh
+./tests/zfs-tests/tests/functional/inheritance/README.config
+./tests/zfs-tests/tests/functional/inheritance/README.state
+./tests/zfs-tests/tests/functional/rsend/rsend_016_neg.ksh
+./tests/zfs-tests/tests/perf/fio/sequential_readwrite.fio
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10501
ZFS registers a memory hook, `__arc_shrinker_func`, which is supposed to
allow the ARC to shrink when the kernel experiences memory pressure.
The ARC shrinker changes `arc_c` via a call to
`arc_reduce_target_size()`. Before commit 3ec34e5527, the ARC
shrinker would also evict data from the ARC to bring `arc_size` down to
the new `arc_c`. However, that commit (seemingly inadvertently) made it
so that the ARC shrinker no longer evicts any data or waits for eviction
to complete.
Repeated calls to the ARC shrinker can reduce `arc_c` drastically, often
all the way to `arc_c_min`. Since it doesn't wait for the actual
eviction of data from the ARC, this creates a situation where `arc_size`
is more than `arc_c` for the several seconds/minutes it takes for
`arc_adjust_zthr` to evict data from the ARC. During this time,
arc_get_data_impl() will block, so ZFS can't process read/write requests
(e.g. from iSCSI, NFS, or read/write syscalls).
To ensure that `arc_c` doesn't shrink faster than the adjust thread can
keep up, this commit makes the ARC shrinker wait for the eviction to
complete, resulting in similar behavior to what we had before commit
3ec34e5527.
Note: commit 3ec34e5527 is `OpenZFS 9284 - arc_reclaim_thread
has 2 jobs` and was integrated in December 2018, and is part of ZoL
0.8.x but not 0.7.x.
Additionally, when the ARC size is reduced drastically, the
`arc_adjust_zthr` can be on-CPU for many seconds without blocking. Any
threads that are bound to the same CPU that arc_adjust_zthr is running
on will not able to run for a long time.
To ensure that CPU-bound threads can make progress, this commit changes
`arc_evict_state_impl()` make a voluntary preemption call,
`cond_resched()`.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-70703
Closes#10496
Implements a pam module for automatically loading zfs encryption keys
for home datasets. The pam module:
- loads a zfs key and mounts the dataset when a session opens.
- unmounts the dataset and unloads the key when the session closes.
- when the user is logged on and changes the password, the module
changes the encryption key.
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: @jengelh <jengelh@inai.de>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Felix Dörre <felix@dogcraft.de>
Closes#9886Closes#9903
There's no need to specify the srcdir explicitly in _HEADERS and
EXTRA_DIST.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
The test added in commit
4313a5b4c5 ("Detect if sed supports --in-place")
doesn't work at least on my system (autoconfig-2.69).
The issue is that SED has already been found and cached before this
function is evaluated, with the result that the test is completely
skipped.
...
checking for a sed that does not truncate output... /usr/bin/sed
...
checking for sed --in-place... (cached) /usr/bin/sed
The first test is executed by libtool.m4. This looks to have been around
in libtool for at least 15 years or so, not sure why this was not
encountered at the time of the original commit.
Fix this by caching the value of the ac_inplace flag rather than the
path to SED. Also use $SED and add AC_REQUIRE to ensure that we use the
sed that was located by the standard configure test.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
These license files were added in commit
31b160f0a6 ("ICP: Improve AES-GCM performance")
but not added to the distributed files.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
These targets look to have been copied from an automake-generated
Makefile.in, and can't work since none of the auto-generated automake
variables are defined here.
Moreover, ctags has been overridden in the top-level Makefile, so the
target is pointless anyway, and gtags is not a recursive target.
Fix cscopelist by moving it to the top-level Makefile as well, in line
with ctags and etags.
Also, add -a to ctags command as well, otherwise it won't work if more
than one xargs invocation takes place.
Add assembler files to ctags/etags, prune all dotted-dirs, and restrict
the find to files only.
Cleanup: add .PHONY to module/Makefile.in, and fix one recipe with a
missing continuation character.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
Currently, asm-generic/atomic.c is compiled into a .S file, with a
comment saying this is to simplify the upper-level Makefile.
However, this doesn't work properly with a VPATH build, which would
require better logic to deal with generated sources correctly.
It also doesn't seem more complex to just specify the .c/.S source file,
depending on the cpu, instead of only the source directory in
lib/libspl/Makefile.am, which eliminates the need to do the intermediate
compilation.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
Currently an out-of-tree build does not work with read-only source
directory because zfs_gitrev.h can't be created. Move this file to the
build directory, which is more appropriate for a generated file, and
drop the dist-hook for zfs_gitrev.h. There is no need to distribute this
file since it will be regenerated as part of the compilation in any
case.
scripts/make_gitrev.sh tries to avoid updating zfs_gitrev.h if there has
been no change, however this doesn't cover the case when the source
directory is not in git: in that case zfs_gitrev.h gets overwritten even
though it's always "unknown". Simplify the logic to always write out a
new version of zfs_gitrev.h, compare against the old and overwrite only
if different. This is now simple enough to just include in the
Makefile, so drop the script.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
If srcdir != builddir, pass down MAKEOBJDIR to the FreeBSD make to
support out-of-tree builds.
Also allow passing all the gmake options that FreeBSD make understands
to support useful flags like -k, -n, -q etc, and detect the number of
CPUs if -j was specified without an argument.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
Allow users to configure notifications when TRIM operations are
completed on pools. Unlike resilver_finish and scrub_finish,
the trim_finish event is generated for each vdev in the pool
which was trimmed, so the script will generate a notification
for each one.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kevin P. Fleming <kevin@km6g.us>
Closes#10491
This tunable required a handler to be implemented for
ZFS_MODULE_PARAM_CALL.
Add the handler so the tunable can be declared in common code.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10490
Running zfs -V when the modules are not loaded would currently
result in the following output:
zfs_version_kernel() failed: No such file or directory
Note the lack of userland version output. Reorder the code to
ensure the userland version is printed even when the kmods
are not loaded.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes#10483
This commit adds two features to zed, that macOS desires. The first
is that when you unload the kernel module, zed would enter into a
cpubusy loop calling zfs_events_next() repeatedly. We now look for
ENODEV, returned by kernel, so zed can exit gracefully.
Second feature is -I (idle) (alas -P persist was taken) is for the
deamon to;
1; if started without ZFS kernel module, stick around waiting for it.
2; if kernel module is unloaded, go back to 1.
This is due to daemons in macOS is started by launchctl, and is
expected to stick around.
Currently, the busy loop only exists when errno is ENODEV. This is
to ensure that functionality that upstream expects is not changed.
It did not care about errors before, and it still does not. (with the
exception of ENODEV).
However, it is probably better that all errors
(ERESTART notwithstanding) exits the loop, and the issues complaining
about zed taking all CPU will go away.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes#10476
Rephrase error-injection comment in ztest to be more clear.
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes#10482
Rephrase comments to be more clear.
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10481
The following test cases may still occasionally fail and are being
added to the "maybe" list for Linux until they can be updated to be
entirely reliable.
cli_root/zfs_rename/zfs_rename_002_pos.ksh
cli_root/zpool_reopen/zpool_reopen_003_pos.ksh
refreserv/refreserv_raidz
These 6 tests consistently fail only on Fedora 31+, the failures
are related to the kernel rescanning the partition table on loopback
devices which is no longer reliable unless partprobe is used. In
order to enable the Fedora bot by default they are also being added
to the list until the tests can be updated. Any significant regression
in functionality covered by these tests will still be detected by the
FreeBSD builders.
alloc_class/alloc_class_009_pos
alloc_class/alloc_class_010_pos
cli_root/zpool_expand/zpool_expand_001_pos
cli_root/zpool_expand/zpool_expand_005_pos
rsend/rsend_007_pos
rsend/rsend_010_pos
rsend/rsend_011_pos
snapshot/rollback_003_pos
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#10489
Resolve the FreeBSD head build failure.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10480
And that should work even (especially) if there is no matching user or
group name. The change is originally by Xin Lin <delphij@FreeBSD.org>.
Original-patch-by: Xin Li <delphij@FreeBSD.org>
Reviewed-by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed-by: Andy Stormont <astormont@racktopsystems.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes#9792Closes#10280
KPI changed in FreeBSD, update accordingly.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10475
Switch on warning flags to detect mismatch between declaration and
definition.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
spl-generic.c defines some of the libgcc integer library functions on
32-bit. Don't bother checking -Wmissing-prototypes since nothing should
directly call these functions from C code.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
Turn the generic versions into inline functions and avoid
SKEIN_PORT_CODE trickery.
Also drop the PLATFORM_MUST_ALIGN check for using the fast bcopy
variants. bcopy doesn't assume alignment, and the userspace version is
currently different because the _ALIGNMENT_REQUIRED macro is only
defined by the kernelspace headers.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
Include the header with prototypes in the file that provides definitions
as well, to catch any mismatch between prototype and definition.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
Mark functions used only in the same translation unit as static. This
only includes functions that do not have a prototype in a header file
either.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
Commit
ec21397127 ("async zvol minor node creation interferes with receive")
replaced zvol_create_minors with zvol_create_minor and
zvol_create_minors_recursive, changing the prototype at the same time.
However the stub functions in libzpool/kernel.c were defined with the
old prototype. As the definitions are empty, this doesn't cause any
runtime issues, but an LTO build shows warnings because of the
mismatched prototypes.
Commit
a0bd735adb ("Add support for asynchronous zvol minor operations")
removed the real zvol_remove_minor, but for some reason added a stub
implementation in libzpool/kernel.c with no references. Delete this dead
code.
Commit
196bee4cfd ("Remove deduplicated send/receive code")
removed zfs_onexit_del_cb and zfs_onexit_cb_data. Drop the stubs as
well.
Add zvol.h include to provide prototypes, and sort the include
directives.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
Implement semi-compatible functionality for mode=0 (preallocation)
and mode=FALLOC_FL_KEEP_SIZE (preallocation beyond EOF) for ZPL.
Since ZFS does COW and snapshots, preallocating blocks for a file
cannot guarantee that writes to the file will not run out of space.
Even if the first overwrite was guaranteed, it would not handle any
later overwrite of blocks due to COW, so strict compliance is futile.
Instead, make a best-effort check that at least enough free space is
currently available in the pool (with a bit of margin), then create
a sparse file of the requested size and continue on with life.
This does not handle all cases (e.g. several fallocate() calls before
writing into the files when the filesystem is nearly full), which
would require a more complex mechanism to be implemented, probably
based on a modified version of dmu_prealloc(), but is usable as-is.
A new module option zfs_fallocate_reserve_percent is used to control
the reserve margin for any single fallocate call. By default, this
is 110% of the requested preallocation size, so an additional 10% of
available space is reserved for overhead to allow the application a
good chance of finishing the write when the fallocate() succeeds.
If the heuristics of this basic fallocate implementation are not
desirable, the old non-functional behavior of returning EOPNOTSUPP
for calls can be restored by setting zfs_fallocate_reserve_percent=0.
The parameter of zfs_statvfs() is changed to take an inode instead
of a dentry, since no dentry is available in zfs_fallocate_common().
A few tests from @behlendorf cover basic fallocate functionality.
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Issue #326Closes#10408