This allows ZFS datasets to be delegated to a user/mount namespace
Within that namespace, only the delegated datasets are visible
Works very similarly to Zones/Jailes on other ZFS OSes
As a user:
```
$ unshare -Um
$ zfs list
no datasets available
$ echo $$
1234
```
As root:
```
# zfs list
NAME ZONED MOUNTPOINT
containers off /containers
containers/host off /containers/host
containers/host/child off /containers/host/child
containers/host/child/gchild off /containers/host/child/gchild
containers/unpriv on /unpriv
containers/unpriv/child on /unpriv/child
containers/unpriv/child/gchild on /unpriv/child/gchild
# zfs zone /proc/1234/ns/user containers/unpriv
```
Back to the user namespace:
```
$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
containers 129M 47.8G 24K /containers
containers/unpriv 128M 47.8G 24K /unpriv
containers/unpriv/child 128M 47.8G 128M /unpriv/child
```
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Will Andrews <will.andrews@klarasystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Co-authored-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Sponsored-by: Buddy <https://buddy.works>
Closes#12263
This commit adds BLAKE3 checksums to OpenZFS, it has similar
performance to Edon-R, but without the caveats around the latter.
Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3
Short description of Wikipedia:
BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
World Crypto. BLAKE3 is a single algorithm with many desirable
features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
and BLAKE2, which are algorithm families with multiple variants.
BLAKE3 has a binary tree structure, so it supports a practically
unlimited degree of parallelism (both SIMD and multithreading) given
enough input. The official Rust and C implementations are
dual-licensed as public domain (CC0) and the Apache License.
Along with adding the BLAKE3 hash into the OpenZFS infrastructure a
new benchmarking file called chksum_bench was introduced. When read
it reports the speed of the available checksum functions.
On Linux: cat /proc/spl/kstat/zfs/chksum_bench
On FreeBSD: sysctl kstat.zfs.misc.chksum_bench
This is an example output of an i3-1005G1 test system with Debian 11:
implementation 1k 4k 16k 64k 256k 1m 4m
edonr-generic 1196 1602 1761 1749 1762 1759 1751
skein-generic 546 591 608 615 619 612 616
sha256-generic 240 300 316 314 304 285 276
sha512-generic 353 441 467 476 472 467 426
blake3-generic 308 313 313 313 312 313 312
blake3-sse2 402 1289 1423 1446 1432 1458 1413
blake3-sse41 427 1470 1625 1704 1679 1607 1629
blake3-avx2 428 1920 3095 3343 3356 3318 3204
blake3-avx512 473 2687 4905 5836 5844 5643 5374
Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)
implementation 1k 4k 16k 64k 256k 1m 4m
edonr-generic 1840 2458 2665 2719 2711 2723 2693
skein-generic 870 966 996 992 1003 1005 1009
sha256-generic 415 442 453 455 457 457 457
sha512-generic 608 690 711 718 719 720 721
blake3-generic 301 313 311 309 309 310 310
blake3-sse2 343 1865 2124 2188 2180 2181 2186
blake3-sse41 364 2091 2396 2509 2463 2482 2488
blake3-avx2 365 2590 4399 4971 4915 4802 4764
Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)
implementation 1k 4k 16k 64k 256k 1m 4m
edonr-generic 1213 1703 1889 1918 1957 1902 1907
skein-generic 434 492 520 522 511 525 525
sha256-generic 167 183 187 188 188 187 188
sha512-generic 186 216 222 221 225 224 224
blake3-generic 153 152 154 153 151 153 153
blake3-sse2 391 1170 1366 1406 1428 1426 1414
blake3-sse41 352 1049 1212 1174 1262 1258 1259
Output on Debian 5.10.0-11-arm64 system: (Pi400)
implementation 1k 4k 16k 64k 256k 1m 4m
edonr-generic 487 603 629 639 643 641 641
skein-generic 271 299 303 308 309 309 307
sha256-generic 117 127 128 130 130 129 130
sha512-generic 145 165 170 172 173 174 175
blake3-generic 81 29 71 89 89 89 89
blake3-sse2 112 323 368 379 380 371 374
blake3-sse41 101 315 357 368 369 364 360
Structurally, the new code is mainly split into these parts:
- 1x cross platform generic c variant: blake3_generic.c
- 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
- 2x assembly for ARMv8 (NEON converted from SSE2)
- 2x assembly for PPC64-LE (POWER8 converted from SSE2)
- one file for switching between the implementations
Note the PPC64 assembly requires the VSX instruction set and the
kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.
Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Closes#10058Closes#12918
Makes getmntent and getmntany thread-safe for external consumers of
libzfs zpool_disable_datasets, zfs_iter_mounted, libzfs_mnttab_update,
libzfs_mnttab_find.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes#13484
Even on Illumos it's only available in the 32-bit programming
environment, and, quoth enable_extended_FILE_stdio(3C):
> Historically, 32-bit Solaris applications have been limited to using
> only the file descriptors 0 through 255 with the standard I/O
> functions (see stdio(3C)) in the C library. The extended FILE
> facility allows well-behaved 32-bit applications to use any
> valid file descriptor with the standard I/O functions.
where "well-behaved" means that it
> does not directly access any fields in the FILE structure pointed
> to by the FILE pointer associated with any standard I/O stream,
And the stdio/flush.c implementation reads:
/*
* if this is not an internal extended FILE then check
* if _file is being changed from underneath us.
* It should not be because if
* it is then then we lose our ability to guard against
* silent data corruption.
*/
if (!iop->__xf_nocheck && bad_fd > -1 && iop->_magic != bad_fd) {
(void) fprintf(stderr,
"Application violated extended FILE safety mechanism.\n"
"Please read the man page for extendedFILE.\nAborting\n");
abort();
}
This appears to be an insane workaround for broken implementation with
exposed FILE internals and _file being an u8, both only on non-LP64;
it's shimmed out on all LP64 targets in Illumos,
and we shim it out as well: just get rid of it
This appears to've been originally fixed in illumos-gate
a5f69788de7ac07553de47f7fec8c05a9a94c105 ("PSARC 2006/162 Extended FILE
space for 32-bit Solaris processes", "1085341 32-bit stdio routines
should support file descriptors >255"), which also bears extendedFILE
and enable_extended_FILE_stdio(3C):
- unsigned char _file; /* UNIX System file descriptor */
+ unsigned char _magic; /* Old home of the file descriptor */
+ /* Only fileno(3C) can retrieve the
value now */
and
+/*
+ * Macros to aid the extended fd FILE work.
+ * This helps isolate the changes to only the 32-bit code
+ * since 64-bit Solaris is not affected by this.
+ */
+#ifdef _LP64
+#define GET_FD(iop) ((iop)->_file)
+#define SET_FILE(iop, fd) ((iop)->_file = (fd))
+#else
+#define GET_FD(iop) \
+ (((iop)->__extendedfd) ? _file_get(iop) : (iop)->_magic)
+#define SET_FILE(iop, fd) (iop)->_magic = (fd); (iop)->__extendedfd = 0
+#endif
Also remove the 1k setrlimit(NOFILE) calls: that's the default on Linux,
with 64k on Illumos and 171k on FreeBSD
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13411
- Prefer O_* flags over F* flags that mostly mirror O_* flags anyway,
but O_* flags seem to be preferred.
- Simplify the code as all the F*SYNC flags were defined as FFSYNC flag.
- Don't define FRSYNC flag, so we don't generate unnecessary ZIL commits.
- Remove EXCL define, FreeBSD ignores the excl argument for zfs_create()
anyway.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes#13400
For #13083, curiously, it did not print the actual error, just
that the compile failed with "Error 1".
In theory, this flag should cause it to report errors twice sometimes.
In practice, I'm pretty okay with reporting some twice if it avoids
reporting some never.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#13086
bcopy() has a confusing argument order and is actually a move, not a
copy; they're all deprecated since POSIX.1-2001 and removed in -2008,
and we shim them out to mem*() on Linux anyway
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12996
aarch64 is a different architecture than arm. Some
compilers might choke when both __arm__ and __aarch64__
are defined.
This change separates the checks for arm and for
aarch64 in the isa_defs.h header files.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Windel Bouwman <windel@windel.nl>
Closes#10335Closes#13151
A function that returns with no value is a different thing from a
function that doesn't return at all. Those are two orthogonal
concepts, commonly confused.
pthread_create(3) expects a pointer to a start routine that has a
very precise prototype:
void *(*start_routine)(void *);
However, other thread functions, such as kernel ones, expect:
void (*start_routine)(void *);
Providing a different one is incorrect, and has only been working
because the ABIs happen to produce a compatible function.
We should use '_Noreturn void', since it's the natural type, and
then provide a '_Noreturn void *' wrapper for pthread functions.
For consistency, replace most cases of __NORETURN or
__attribute__((noreturn)) by _Noreturn. _Noreturn is understood
by -std=gnu89, so it should be safe to use everywhere.
Ref: https://github.com/openzfs/zfs/pull/13110#discussion_r808450136
Ref: https://software.codidact.com/posts/285972
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
Closes#13120
Unfortunately macOS has obj-C keyword "fallthrough" in the OS headers.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes#13097
`configure` now accepts `--enable-asan` and `--enable-ubsan` switches
which results in passing `-fsanitize=address`
and `-fsanitize=undefined`, respectively, to the compiler. Those
flags are enabled in GitHub workflows for ZTS and zloop. Errors
reported by both instrumentations are corrected, except for:
- Memory leak reporting is (temporarily) suppressed. The cost of
fixing them is relatively high compared to the gains.
- Checksum computing functions in `module/zcommon/zfs_fletcher*`
have UBSan errors suppressed. It is completely impractical
to enforce 64-byte payload alignment there due to performance
impact.
- There's no ASan heap poisoning in `module/zstd/lib/zstd.c`. A custom
memory allocator is used there rendering that measure
unfeasible.
- Memory leaks detection has to be suppressed for `cmd/zvol_id`.
`zvol_id` is run by udev with the help of `ptrace(2)`. Tracing is
incompatible with memory leaks detection.
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes#12928
69 CSTYLED BEGINs remain, appx. 30 of which can be removed if cstyle(1)
had a useful policy regarding
CALL(ARG1,
ARG2,
ARG3);
above 2 lines. As it stands, it spits out *both*
sysctl_os.c: 385: continuation line should be indented by 4 spaces
sysctl_os.c: 385: indent by spaces instead of tabs
which is very cool
Another >10 could be fixed by removing "ulong" &al. handling.
I don't foresee anyone actually using it intentionally
(does it even exist in modern headers? why did it in the first place?).
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12993
It hosts only asm_linkage.h, which is entirely unused,
and has slightly diverged from the one that's actually used
(module/icp/include/sys/ia32/asm_linkage.h)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12993
This led to these two warning types:
debug.h:139:67: warning: the address of ‘ARC_anon’
will always evaluate as ‘true’ [-Waddress]
139 | #define ASSERT3P(x, y, z)
((void) sizeof (!!(x)), (void) sizeof (!!(z)))
| ^
arc.c:1591:2: note: in expansion of macro ‘ASSERT3P’
1591 | ASSERT3P(hdr->b_l1hdr.b_state, ==, arc_anon);
| ^~~~~~~~
and
arc.h:66:44: warning: ‘<<’ in boolean context,
did you mean ‘<’? [-Wint-in-bool-context]
66 | #define HDR_GET_LSIZE(hdr)
((hdr)->b_lsize << SPA_MINBLOCKSHIFT)
debug.h:138:46: note: in definition of macro ‘ASSERT3U’
138 | #define ASSERT3U(x, y, z)
((void) sizeof (!!(x)), (void) sizeof (!!(z)))
| ^
arc.c:1760:12: note: in expansion of macro ‘HDR_GET_LSIZE’
1760 | ASSERT3U(HDR_GET_LSIZE(hdr), !=, 0);
| ^~~~~~~~~~~~~
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13009
sizeof(bitfield.member) is invalid, and this shows up in some FreeBSD
build configurations: work around this by !!ing ‒
this makes the sizeof target the ! result type (_Bool), instead
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Fixes: 42aaf0e ("libspl: ASSERT*: mark arguments as used")
Closes#12984Closes#12986
Evaluated every variable that lives in .data (and globals in .rodata)
in the kernel modules, and constified/eliminated/localised them
appropriately. This means that all read-only data is now actually
read-only data, and, if possible, at file scope. A lot of previously-
global-symbols became inlinable (and inlined!) constants. Probably
not in a big Wowee Performance Moment, but hey.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12899
The FreeBSD implementations of various libspl functions for getting
mounted device information were found to leak several strings which
were being allocated in statfs2mnttab but never freed.
The Solaris getmntany(3C) and related interfaces are expected to return
strings residing in static buffers that need to be copied rather than
freed by the caller.
Use static thread-local storage to stash the mnttab structure strings
from FreeBSD's statfs info rather than strings allocated on the heap by
strdup(3).
While here, remove some stray commented out lines.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#12961
Do not redefine the fallthrough macro when building with libcpp.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes#12880
Though it's unlikely anyone will alter its signature, avoids any
possible type mismatch.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Érico Nogueira <erico.erc@gmail.com>
Closes#12567
As of the Linux 5.9 kernel a fallthrough macro has been added which
should be used to anotate all intentional fallthrough paths. Once
all of the kernel code paths have been updated to use fallthrough
the -Wimplicit-fallthrough option will because the default. To
avoid warnings in the OpenZFS code base when this happens apply
the fallthrough macro.
Additional reading: https://lwn.net/Articles/794944/
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12441
These were mostly used to annotate do {} while(0)s
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12201
Move HAVE_LARGE_STACKS definitions to header and set when appropriate.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Kevin Bowling <kbowling@FreeBSD.org>
Closes#12350
This mostly reverts "3537 want pool io kstats" commit of 8 years ago.
From one side this code using pool-wide locks became pretty bad for
performance, creating significant lock contention in I/O pipeline.
From another, there are more efficient ways now to obtain detailed
statistics, while this statistics is illumos-specific and much less
usable on Linux and FreeBSD, reported only via procfs/sysctls.
This commit does not remove KSTAT_TYPE_IO implementation, that may
be removed later together with already unused KSTAT_TYPE_INTR and
KSTAT_TYPE_TIMER.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes#12212
No symbols affected in libavl
No symbols affected by libtpool, but pre-ANSI declarations got purged
No symbols affected by libzfs_core
No symbols affected by libzfs_bootenv
libefi got cleaned, gained efi_debug documentation in efi_partition.h,
and removes one undocumented and unused symbol from libzfs_core:
D default_vtoc_map
libnvpair saw removal of these symbols:
D nv_alloc_nosleep_def
D nv_alloc_sleep
D nv_alloc_sleep_def
D nv_fixed_ops_def
D nvlist_hashtable_init_size
D nvpair_max_recursion
libshare saw removal of these symbols from libzfs:
T libshare_nfs_init
T libshare_smb_init
T register_fstype
B smb_shares
libzutil saw removal of these internal symbols from libzfs_core:
T label_paths
T slice_cache_compare
T zpool_find_import_blkid
T zpool_open_func
T zutil_alloc
T zutil_strdup
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12191
`getfsstat(2)` is used to retrieve the list of mounted file systems,
which libzfs uses when fetching properties like mountpoint, atime,
setuid, etc. The `mode` parameter may be `MNT_NOWAIT`, which uses
information in the VFS's cache, or `MNT_WAIT`, which effectively does a
`statfs` on every single mounted file system in order to fetch the most
up-to-date information. As far as I can tell, the only fields that
libzfs cares about are the filesystem's name, mountpoint, fstypename,
and mount flags. Those things are always updated on mount and unmount,
so they will always be accurate in the VFS's mount cache except in two
circumstances:
1) When a file system is busy unmounting
2) When a ZFS file system changes the value of a mount-overridable
property like atime or setuid, but doesn't remount the file system.
Right now that only happens when the property is changed by an
unprivileged user who has delegated authority to change the property
but not to mount the dataset. But perhaps libzfs could choose to do
it for other reasons in the future.
Switching to `MNT_NOWAIT` will greatly improve speed with no downside,
as long as we explicitly update the mount cache whenever we change a
mount-overridable property.
For comparison, Illumos gets this information using the native
`getmntany` and `getmntent` functions, which also use cached
information. The illumos function that would refresh the cache,
`resetmnttab`, is never called by libzfs.
And on GNU/Linux, `getmntany` and `getmntent` don't even communicate
with the kernel directly. They simply parse the file they are given,
which is usually /etc/mtab or /proc/mounts. Perhaps the implementation
of /proc/mounts is synchronous, ala MNT_WAIT; I don't know.
Sponsored-by: Axcient
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes: #12091
- Avoid atomic_add() when updating as_lower_bound/as_upper_bound.
Previous code was excessively strong on 64bit systems while not
strong enough on 32bit ones. Instead introduce and use real
atomic_load() and atomic_store() operations, just an assignments
on 64bit machines, but using proper atomics on 32bit ones to avoid
torn reads/writes.
- Reduce number of buckets on large systems. Extra buckets not as
much improve add speed, as hurt reads. Unlike wmsum for aggsum
reads are still important.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes#12145
Exporting names this short can easily cause nasty collisions with user code.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12050
wmsum counters are a reduced version of aggsum counters, optimized for
write-mostly scenarios. They do not provide optimized read functions,
but instead allow much cheaper add function. The primary usage is
infrequently read statistic counters, not requiring exact precision.
The Linux implementation is directly mapped into percpu_counter KPI.
The FreeBSD implementation is directly mapped into counter(9) KPI.
In user-space due to lack of better implementation mapped to aggsum.
Unfortunately neither Linux percpu_counter nor FreeBSD counter(9)
provide sufficient functionality to completelly replace aggsum, so
it still remains to be used for several hot counters.
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes#12114
This replaces the generic libspl atomic.c atomics implementation
with one based on builtin gcc atomics. This functionality was added
as an experimental feature in gcc 4.4. Today even CentOS 7 ships
with gcc 4.8 as the default compiler we can make this the default.
Furthermore, the builtin atomics are as good or better than our
hand-rolled implementation so it's reasonable to drop that custom code.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11904
Fixes get_system_hostid() if it was set via the aforementioned sysctl
and simplifies the code a bit. The kernel and user-space must agree,
after all.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11879
Merge the actual implementations of getexecname() and slightly clean
up the FreeBSD one.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11879
All users did a freopen() on it. Even some non-users did!
This is point-less ‒ just open the mtab when needed
If I understand Solaris' getextmntent(3C) correctly, the non-user
freopen()s are very likely an odd, twisted vestigial tail of that ‒
but it's got a completely different calling convention and caching
semantics than any platform we support
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11868
As found by
git grep -E '(open|setmntent|pipe2?)\(' |
grep -vE '((zfs|zpool)_|fd|dl|lzc_re|pidfile_|g_)open\('
FreeBSD's pidfile_open() says nothing about the flags of the files it
opens, but we can't do anything about it anyway; the implementation does
open all files with O_CLOEXEC
Consider this output with zpool.d/media appended with
"pid=$$; (ls -l /proc/$pid/fd > /dev/tty)":
$ /sbin/zpool iostat -vc media
lrwx------ 0 -> /dev/pts/0
l-wx------ 1 -> 'pipe:[3278500]'
l-wx------ 2 -> /dev/null
lrwx------ 3 -> /dev/zfs
lr-x------ 4 -> /proc/31895/mounts
lrwx------ 5 -> /dev/zfs
lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media
vs
$ ./zpool iostat -vc vendor,upath,iostat,media
lrwx------ 0 -> /dev/pts/0
l-wx------ 1 -> 'pipe:[3279887]'
l-wx------ 2 -> /dev/null
lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11866
Arm-based Macs are like FreeBSD and provide a full 64-bit stat from the
start, so have no stat64 variants. Thus, define stat64 and fstat64 as
aliases for the normal versions.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Closes#11771
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes#11775
In order for cppcheck to perform a proper analysis it needs to be
aware of how the sources are compiled (source files, include
paths/files, extra defines, etc). All the needed information is
available from the Makefiles and can be leveraged with a generic
cppcheck Makefile target. So let's add one.
Additional minor changes:
* Removing the cppcheck-suppressions.txt file. With cppcheck 2.3
and these changes it appears to no longer be needed. Some inline
suppressions were also removed since they appear not to be
needed. We can add them back if it turns out they're needed
for older versions of cppcheck.
* Added the ax_count_cpus m4 macro to detect at configure time how
many processors are available in order to run multiple cppcheck
jobs. This value is also now used as a replacement for nproc
when executing the kernel interface checks.
* "PHONY =" line moved in to the Rules.am file which is included
at the top of all Makefile.am's. This is just convenient becase
it allows us to use the += syntax to add phony targets.
* One upside of this integration worth mentioning is it now allows
`make cppcheck` to be run in any directory to check that subtree.
* For the moment, cppcheck is not run against the FreeBSD specific
kernel sources. The cppcheck-FreeBSD target will need to be
implemented and testing on FreeBSD to support this.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#11508
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#11458
In FreeBSD the struct uio was just a typedef to uio_t. In order to
extend this struct, outside of the definition for the struct uio, the
struct uio has been embedded inside of a uio_t struct.
Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear
this is a ZFS interface.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes#11438
As of the 5.10 kernel the generic splice compatibility code has been
removed. All filesystems are now responsible for registering a
->splice_read and ->splice_write callback to support this operation.
The good news is the VFS provided generic_file_splice_read() and
iter_file_splice_write() callbacks can be used provided the ->iter_read
and ->iter_write callback support pipes. However, this is currently
not the case and only iovecs and bvecs (not pipes) are ever attached
to the uio structure.
This commit changes that by allowing full iov_iter structures to be
attached to uios. Ever since the 4.9 kernel the iov_iter structure
has supported iovecs, kvecs, bvevs, and pipes so it's desirable to
pass the entire thing when possible. In conjunction with this the
uio helper functions (i.e uiomove(), uiocopy(), etc) have been
updated to understand the new UIO_ITER type.
Note that using the kernel provided uio_iter interfaces allowed the
existing Linux specific uio handling code to be simplified. When
there's no longer a need to support kernel's older than 4.9, then
it will be possible to remove the iovec and bvec members from the
uio structure and always use a uio_iter. Until then we need to
maintain all of the existing types for older kernels.
Some additional refactoring and cleanup was included in this change:
- Added checks to configure to detect available iov_iter interfaces.
Some are available all the way back to the 3.10 kernel and are used
when available. In particular, uio_prefaultpages() now always uses
iov_iter_fault_in_readable() which is available for all supported
kernels.
- The unused UIO_USERISPACE type has been removed. It is no longer
needed now that the uio_seg enum is platform specific.
- Moved zfs_uio.c from the zcommon.ko module to the Linux specific
platform code for the zfs.ko module. This gets it out of libzfs
where it was never needed and keeps this Linux specific code out
of the common sources.
- Removed unnecessary O_APPEND handling from zfs_iter_write(), this
is redundant and O_APPEND is already handled in zfs_write();
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#11351
The original xuio zero copy functionality has always been unused
on Linux and FreeBSD. Remove this disabled code to avoid any
confusion and improve readability.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#11124
The zfs_fsync, zfs_read, and zfs_write function are almost identical
between Linux and FreeBSD. With a little refactoring they can be
moved to the common code which is what is done by this commit.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#11078
The acltype property is currently hidden on FreeBSD and does not
reflect the NFSv4 style ZFS ACLs used on the platform. This makes it
difficult to observe that a pool imported from FreeBSD on Linux has a
different type of ACL that is being ignored, and vice versa.
Add an nfsv4 acltype and expose the property on FreeBSD.
Make the default acltype nfsv4 on FreeBSD.
Setting acltype to an unhanded style is treated the same as setting
it to off. The ACLs will not be removed, but they will be ignored.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10520
In FreeBSD, there are three compile environments that are supported:
user land, the kernel and the bootloader / standalone. Adjust the
headers to compile in the standalone environment. Limit kernel-only
items from view when _STANDALONE is defined.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Closes#10998
It is a common mistake to have failed to autoload the module due to
permission issues when running a ZFS command as a user. "Operation
not permitted" is an unhelpfully vague error message.
Use a thread-local message buffer to format a nicer error message.
We can infer that loading the kernel module failed if the module is
not loaded. This can be extended with heuristics for other errors
in the future.
While looking at this stuff, remove an unused thread-local message
buffer found in libspl and remove some inaccurate verbiage from the
comment on libzfs_load_module.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#11033
Adding an #ifdef __FreeBSD__ to a FreeBSD-specific header may seem odd,
but these headers are used on non-FreeBSD systems during the bootstrap
tools phase.
Originally submitted downstream as https://reviews.freebsd.org/D26193
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes#10863
Those macros are also defined by the compiler-provided float.h which
will be included later on (at least in the FreeBSD buildworld case) and
triggers these -Werror warnings. Including <float.h> first and only
defining the macros when DBL_DIG/FLT_DIG is missing fixes this problem.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes#10864
FreeBSD has the concept of jails, a precursor to Solaris's zones, which
can be mapped to the required zones interface with relative ease. The
previous ZFS implementation in FreeBSD did so, and we should continue
to provide an appropriate implementation in OpenZFS as well.
Move lib/libspl/zone.c into platform code and adopt the correct
implementation for FreeBSD.
While here, prune unused code.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes#10851
The matching ioctl is DIOCGMEDIASIZE.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes#10818
Increase the size of DDT_NAMELEN and MNT_LINE_MAX to appease GCC
snprintf truncation warnings.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes#10712Closes#10766
Remove dead code to make the implementation easier to understand.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes#10650
Remove dead code to make the implementation easier to understand.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes#10650
A collection of header changes to enable FreeBSD to build
with vendored OpenZFS.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#10635
FreeBSD defines _BIG_ENDIAN BIG_ENDIAN _LITTLE_ENDIAN
LITTLE_ENDIAN on every architecture. Trying to do
cross builds whilst hiding this from ZFS has proven
extremely cumbersome.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#10621
* libspl: umem: These are obviously and intentionally unused; annotate
them as such to appease -Wunused-parameter builds that include this
header.
* sys/dmu.h: In this case, clear_on_evict_dbufp is only used for
ZFS_DEBUG builds, so annotate it as __maybe_unused to appease
-Wunused-parameter.
Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kyle Evans <kevans@FreeBSD.org>
Closes#10606
On linux the list debug code has been setting off a failure when
checking that the node->next->prev value is pointing back at the node.
At times this check evaluates to 0xdead. When removing a child from a
gang ABD we must acquire the child's abd_mtx to make sure that the
same ABD is not being added to another gang ABD while it is being
removed from a gang ABD. This fixes a race condition when checking
if an ABDs link is already active and part of another gang ABD before
adding it to a gang.
Added additional debug code for the gang ABD in abd_verify() to make
sure each child ABD has active links. Also check to make sure another
gang ABD is not added to a gang ABD.
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes#10511
== Motivation and Context
The current implementation of 'sharenfs' and 'sharesmb' relies on
the use of the sharetab file. The use of this file is os-specific
and not required by linux or freebsd. Currently the code must
maintain updates to this file which adds complexity and presents
a significant performance impact when sharing many datasets. In
addition, concurrently running 'zfs sharenfs' command results in
missing entries in the sharetab file leading to unexpected failures.
== Description
This change removes the sharetab logic from the linux and freebsd
implementation of 'sharenfs' and 'sharesmb'. It still preserves an
os-specific library which contains the logic required for sharing
NFS or SMB. The following entry points exist in the vastly simplified
libshare library:
- sa_enable_share -- shares a dataset but may not commit the change
- sa_disable_share -- unshares a dataset but may not commit the change
- sa_is_shared -- determine if a dataset is shared
- sa_commit_share -- notify NFS/SMB subsystem to commit the shares
- sa_validate_shareopts -- determine if sharing options are valid
The sa_commit_share entry point is provided as a performance enhancement
and is not required. The sa_enable_share/sa_disable_share may commit
the share as part of the implementation. Libshare provides a framework
for both NFS and SMB but some operating systems may not fully support
these protocols or all features of the protocol.
NFS Operation:
For linux, libshare updates /etc/exports.d/zfs.exports to add
and remove shares and then commits the changes by invoking
'exportfs -r'. This file, is automatically read by the kernel NFS
implementation which makes for better integration with the NFS systemd
service. For FreeBSD, libshare updates /etc/zfs/exports to add and
remove shares and then commits the changes by sending a SIGHUP to
mountd.
SMB Operation:
For linux, libshare adds and removes files in /var/lib/samba/usershares
by calling the 'net' command directly. There is no need to commit the
changes. FreeBSD does not support SMB.
== Performance Results
To test sharing performance we created a pool with an increasing number
of datasets and invoked various zfs actions that would enable and
disable sharing. The performance testing was limited to NFS sharing.
The following tests were performed on an 8 vCPU system with 128GB and
a pool comprised of 4 50GB SSDs:
Scale testing:
- Share all filesystems in parallel -- zfs sharenfs=on <dataset> &
- Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> &
Functional testing:
- share each filesystem serially -- zfs share -a
- unshare each filesystem serially -- zfs unshare -a
- reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool>
For 'zfs sharenfs=on' scale testing we saw an average reduction in time
of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time
of 83.36%.
Functional testing also shows a huge improvement:
- zfs share -- 97.97% reduction in time
- zfs unshare -- 96.47% reduction in time
- zfs inhert -r sharenfs -- 99.01% reduction in time
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Bryant G. Ly <bryangly@gmail.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-Issue: DLPX-68690
Closes#1603Closes#7692Closes#7943Closes#10300
libzutil is currently statically linked into libzfs, libzfs_core and
libzpool. Avoid the unnecessary duplication by removing it from libzfs
and libzpool, and adding libzfs_core to libzpool.
Remove a few unnecessary dependencies:
- libuutil from libzfs_core
- libtirpc from libspl
- keep only libcrypto in libzfs, as we don't use any functions from
libssl
- librt is only used for clock_gettime, however on modern systems that's
in libc rather than librt. Add a configure check to see if we actually
need librt
- libdl from raidz_test
Add a few missing dependencies:
- zlib to libefi and libzfs
- libuuid to zpool, and libuuid and libudev to zed
- libnvpair uses assertions, so add assert.c to provide aok and
libspl_assertf
Sort the LDADD for programs so that libraries that satisfy dependencies
come at the end rather than the beginning of the linker command line.
Revamp the configure tests for libaries to use FIND_SYSTEM_LIBRARY
instead. This can take advantage of pkg-config, and it also avoids
polluting LIBS.
List all the required dependencies in the pkgconfig files, and move the
one for libzfs_core into the latter's directory. Install pkgconfig files
in $(libdir)/pkgconfig on linux and $(prefix)/libdata/pkgconfig on
FreeBSD, instead of /usr/share/pkgconfig, as the more correct location
for library .pc files.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10538
Variadic functions cannot be inlined. libspl_assertf ends up being
duplicated in every file that uses it.
Fix this by moving the function into a new assert.c. Also move the
definition of aok into the new file instead of zone.c.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10538
FreeBSD has a zfsbootcfg command that wants zpool_nextboot in libzfs.
Add the function to FreeBSD's libzfs_compat.c, and while here move
the prototype for zfs_jail out of param.h in FreeBSD's SPL and into
libzfs.h under an ifdef for FreeBSD, where the prototype for
zpool_nextboot joins it.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10524
When clearing a bit, we should check whether that bit is 0.
Note atomic_clear_long_excl is not used.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Liu Qing <winglq@gmail.com>
Closes#10526
Reduce the usage of EXTRA_DIST. If files are conditionally included in
_SOURCES, _HEADERS etc, automake is smart enough to dist all files that
could possibly be included, but this does not apply to EXTRA_DIST,
resulting in make dist depending on the configuration.
Add some files that were missing altogether in various Makefile's.
The changes to disted files in this commit (excluding deleted files):
+./cmd/zed/agents/README.md
+./etc/init.d/README.md
+./lib/libspl/os/freebsd/getexecname.c
+./lib/libspl/os/freebsd/gethostid.c
+./lib/libspl/os/freebsd/getmntany.c
+./lib/libspl/os/freebsd/mnttab.c
-./lib/libzfs/libzfs_core.pc
-./lib/libzfs/libzfs.pc
+./lib/libzfs/os/freebsd/libzfs_compat.c
+./lib/libzfs/os/freebsd/libzfs_fsshare.c
+./lib/libzfs/os/freebsd/libzfs_ioctl_compat.c
+./lib/libzfs/os/freebsd/libzfs_zmount.c
+./lib/libzutil/os/freebsd/zutil_compat.c
+./lib/libzutil/os/freebsd/zutil_device_path_os.c
+./lib/libzutil/os/freebsd/zutil_import_os.c
+./module/lua/README.zfs
+./module/os/linux/spl/README.md
+./tests/README.md
+./tests/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_rm_nested.ksh
+./tests/zfs-tests/tests/functional/cli_root/zfs_send/zfs_send_encrypted_unloaded.ksh
+./tests/zfs-tests/tests/functional/inheritance/README.config
+./tests/zfs-tests/tests/functional/inheritance/README.state
+./tests/zfs-tests/tests/functional/rsend/rsend_016_neg.ksh
+./tests/zfs-tests/tests/perf/fio/sequential_readwrite.fio
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10501
There's no need to specify the srcdir explicitly in _HEADERS and
EXTRA_DIST.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
Currently, asm-generic/atomic.c is compiled into a .S file, with a
comment saying this is to simplify the upper-level Makefile.
However, this doesn't work properly with a VPATH build, which would
require better logic to deal with generated sources correctly.
It also doesn't seem more complex to just specify the .c/.S source file,
depending on the cpu, instead of only the source directory in
lib/libspl/Makefile.am, which eliminates the need to do the intermediate
compilation.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10493
Include the header with prototypes in the file that provides definitions
as well, to catch any mismatch between prototype and definition.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
The macOS uio struct is opaque and the API must be used, this
makes the smallest changes to the code for all platforms.
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes#10412
The horrible effects of human slavery continue to impact society. The
casual use of the term "slave" in computer software is an unnecessary
reference to a painful human experience.
This commit removes all possible references to the term "slave".
Implementation notes:
The zpool.d/slaves script is renamed to dm-deps, which uses the same
terminology as `dmsetup deps`.
References to the `/sys/class/block/$dev/slaves` directory remain. This
directory name is determined by the Linux kernel. Although
`dmsetup deps` provides the same information, it unfortunately requires
elevated privileges, whereas the `/sys/...` directory is world-readable.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10435
C++ is a little picky about not using keywords for names, or string
constness.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10409
Given the following test program:
#include <time.h>
#include <stdio.h>
#include <stdint.h>
int main() {
printf("time_t: %d\n", sizeof(time_t));
printf("long: %d\n", sizeof(long));
printf("long long: %d\n", sizeof(long long));
}
These are output on various x86 architectures:
x32$ time_t: 8
x32$ long: 4
x32$ long long: 8
amd64$ time_t: 8
amd64$ long: 8
amd64$ long long: 8
i386$ time_t: 4
i386$ long: 4
i386$ long long: 8
Therefore code using "%l[du]" to format time_ts produced warnings on x32
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@gmail.com>
Closes#10357Closes#844
This aids in debugging, so that we can use the same infrastructure to
walk zfs's list_t in the kernel module and in the userland libraries
(e.g. when debugging ztest).
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#10236
Musl libc defined `stat64` as a macro, which causes the build to fail
upon compiling os/linux/getmntany.c due to conflicts between the forward
declaration and the implementation.
This commit fixes that by including <sys/stat.h> in "sys/mnttab.h"
directly.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Hiếu Lê <leorize+oss@disroot.org>
Closes#10195
Add the FreeBSD platform code to the OpenZFS repository. As of this
commit the source can be compiled and tested on FreeBSD 11 and 12.
Subsequent commits are now required to compile on FreeBSD and Linux.
Additionally, they must pass the ZFS Test Suite on FreeBSD which is
being run by the CI. As of this commit 1230 tests pass on FreeBSD
and there are no unexpected failures.
Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#898Closes#8987
As part of the Linux kernel's y2038 changes the time_t type has been
fully retired. Callers are now required to use the time64_t type.
Rather than move to the new type, I've removed the few remaining
places where a time_t is used in the kernel code. They've been
replaced with a uint64_t which is already how ZFS internally
handled these values.
Going forward we should work towards updating the remaining user
space time_t consumers to the 64-bit interfaces.
Reviewed-by: Matthew Macy <mmacy@freebsd.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#10052Closes#10064
Currently SIMD accelerated AES-GCM performance is limited by two
factors:
a. The need to disable preemption and interrupts and save the FPU
state before using it and to do the reverse when done. Due to the
way the code is organized (see (b) below) we have to pay this price
twice for each 16 byte GCM block processed.
b. Most processing is done in C, operating on single GCM blocks.
The use of SIMD instructions is limited to the AES encryption of the
counter block (AES-NI) and the Galois multiplication (PCLMULQDQ).
This leads to the FPU not being fully utilized for crypto
operations.
To solve (a) we do crypto processing in larger chunks while owning
the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced
to make this chunk size tweakable. It defaults to 32 KiB. This step
alone roughly doubles performance. (b) is tackled by porting and
using the highly optimized openssl AES-GCM assembler routines, which
do all the processing (CTR, AES, GMULT) in a single routine. Both
steps together result in up to 32x reduction of the time spend in
the en/decryption routines, leading up to approximately 12x
throughput increase for large (128 KiB) blocks.
Lastly, this commit changes the default encryption algorithm from
AES-CCM to AES-GCM when setting the `encryption=on` property.
Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Jason King <jason.king@joyent.com>
Reviewed-By: Tom Caputi <tcaputi@datto.com>
Reviewed-By: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes#9749
Implements the RAID-Z function using AltiVec SIMD.
This is basically the NEON code translated to AltiVec.
Note that the 'fletcher' algorithm requires 64-bits
operations, and the initial implementations of AltiVec
(PPC74xx a.k.a. G4, PPC970 a.k.a. G5) only has up to
32-bits operations, so no 'fletcher'.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@european-processor-initiative.eu>
Closes#9539
Over the years several slightly different approaches were used
in the Makefiles to determine the target architecture. This
change updates both the build system and Makefile to handle
this in a consistent fashion.
TARGET_CPU is set to i386, x86_64, powerpc, aarch6 or sparc64
and made available in the Makefiles to be used as appropriate.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#9848
Rather than defining a new instance of 'aok' in every compilation
unit which includes this header, there is a single instance
defined in zone.c, and the header now only declares an extern.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Nick Black <dankamongmen@gmail.com>
Closes#9752