Commit Graph

145 Commits

Author SHA1 Message Date
Richard Yao 0e4c830bc1
Cleanup: Use OpenSolaris functions to call scheduler
In our codebase, `cond_resched() and `schedule()` are Linux kernel
functions that have replaced the OpenSolaris `kpreempt()` functions in
the codebase to such an extent that `kpreempt()` in zfs_context.h was
broken. Nobody noticed because we did not actually use it. The header
had defined `kpreempt()` as `yield()`, which works on OpenSolaris and
Illumos where `sched_yield()` is a wrapper for `yield()`, but that does
not work on any other platform.

The FreeBSD platform specific code implemented shims for these, but the
shim for `schedule()` forced us to wait, which is different than merely
rescheduling to another thread as the original Linux code does, while
the shim for `cond_resched()` had the same definition as its kernel
kpreempt() shim.

After studying this, I have concluded that we should reintroduce the
kpreempt() function in platform independent code with the following
definitions:

	- In the Linux kernel:
		kpreempt(unused)	-> cond_resched()

	- In the FreeBSD kernel:
		kpreempt(unused)	-> kern_yield(PRI_USER)

	- In userspace:
		kpreempt(unused)	-> sched_yield()

In userspace, nothing changes from this cleanup. In the kernels, the
function `fm_fini()` will now call `kern_yield(PRI_USER)` on FreeBSD and
`cond_resched()` on Linux.  This is instead of `pause("schedule", 1)` on
FreeBSD and `schedule()` on Linux. This makes our behavior consistent
across platforms.

Note that Linux's SPL continues to use `cond_resched()` and
`schedule()`.  However, those functions have been removed from both the
FreeBSD code and userspace code.

This should have the benefit of making it slightly easier to port the
code to new platforms by making how things should be mapped less
confusing.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13845
2022-09-12 09:55:37 -07:00
Ryan Moeller 7bb707ffaf FreeBSD: Organize sysctls
FreeBSD had a few platform-specific ARC tunables in the wrong place:

- Move FreeBSD-specifc ARC tunables into the same vfs.zfs.arc node as
  the rest of the ARC tunables.
- Move the handlers from arc_os.c to sysctl_os.c and add compat sysctls
  for the legacy names.

While here, some additional clean up:

- Most handlers are specific to a particular variable and don't need a
  pointer passed through the args.
- Group blocks of related variables, handlers, and sysctl declarations
  into logical sections.
- Match variable types for temporaries in handlers with the type of the
  global variable.
- Remove leftover comments.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-02 13:26:24 -07:00
Ryan Moeller 4723eba8c0 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-02 13:26:04 -07:00
Richard Yao 0b30dc484f
FreeBSD: Cleanup dead code from VFS
The vfs_*_feature() macros turn anything that uses them into dead code,
so we can delete all of it.

As a side effect, zfs_set_fuid_feature() is now identical in
module/os/freebsd/zfs/zfs_vnops_os.c and
module/os/linux/zfs/zfs_vnops_os.c. A few other functions are identical
too. Future cleanup could move these into a common file.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13832
2022-09-02 13:20:10 -07:00
Tino Reichardt 1d3ba0bf01
Replace dead opensolaris.org license link
The commit replaces all findings of the link:
http://www.opensolaris.org/os/licensing with this one:
https://opensource.org/licenses/CDDL-1.0

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13619
2022-07-11 14:16:13 -07:00
наб a926aab902 Enable -Wwrite-strings
Also, fix leak from ztest_global_vars_to_zdb_args()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13348
2022-06-29 14:08:54 -07:00
Kristof Provost 325096545a
FreeBSD: only define B_FALSE/B_TRUE if NEED_SOLARIS_BOOLEAN is not set
If NEED_SOLARIS_BOOLEAN is defined we define an enum boolean_t, which
defines B_TRUE/B_FALSE as well. If we have both the define and the enum
things don't build (because that translates to
'enum { 0, 1 }     boolean_t').

While here also remove an incorrect '#else'. With it in place we only
parse a section if the include guard is triggered. So we'd only use that
code if this file is included twice. This is clearly unintended, and
also means we don't get the 'boolean_t' definition. Fix this.

Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Kristof Provost <kprovost@netgate.com>
Sponsored-By: Rubicon Communications, LLC ("Netgate")
Closes #13596
2022-06-28 14:11:38 -07:00
Tino Reichardt 985c33b132
Introduce BLAKE3 checksums as an OpenZFS feature
This commit adds BLAKE3 checksums to OpenZFS, it has similar
performance to Edon-R, but without the caveats around the latter.

Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3

Short description of Wikipedia:

  BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
  created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
  Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
  World Crypto. BLAKE3 is a single algorithm with many desirable
  features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
  and BLAKE2, which are algorithm families with multiple variants.
  BLAKE3 has a binary tree structure, so it supports a practically
  unlimited degree of parallelism (both SIMD and multithreading) given
  enough input. The official Rust and C implementations are
  dual-licensed as public domain (CC0) and the Apache License.

Along with adding the BLAKE3 hash into the OpenZFS infrastructure a
new benchmarking file called chksum_bench was introduced.  When read
it reports the speed of the available checksum functions.

On Linux: cat /proc/spl/kstat/zfs/chksum_bench
On FreeBSD: sysctl kstat.zfs.misc.chksum_bench

This is an example output of an i3-1005G1 test system with Debian 11:

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1196    1602    1761    1749    1762    1759    1751
skein-generic      546     591     608     615     619     612     616
sha256-generic     240     300     316     314     304     285     276
sha512-generic     353     441     467     476     472     467     426
blake3-generic     308     313     313     313     312     313     312
blake3-sse2        402    1289    1423    1446    1432    1458    1413
blake3-sse41       427    1470    1625    1704    1679    1607    1629
blake3-avx2        428    1920    3095    3343    3356    3318    3204
blake3-avx512      473    2687    4905    5836    5844    5643    5374

Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1840    2458    2665    2719    2711    2723    2693
skein-generic      870     966     996     992    1003    1005    1009
sha256-generic     415     442     453     455     457     457     457
sha512-generic     608     690     711     718     719     720     721
blake3-generic     301     313     311     309     309     310     310
blake3-sse2        343    1865    2124    2188    2180    2181    2186
blake3-sse41       364    2091    2396    2509    2463    2482    2488
blake3-avx2        365    2590    4399    4971    4915    4802    4764

Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1213    1703    1889    1918    1957    1902    1907
skein-generic      434     492     520     522     511     525     525
sha256-generic     167     183     187     188     188     187     188
sha512-generic     186     216     222     221     225     224     224
blake3-generic     153     152     154     153     151     153     153
blake3-sse2        391    1170    1366    1406    1428    1426    1414
blake3-sse41       352    1049    1212    1174    1262    1258    1259

Output on Debian 5.10.0-11-arm64 system: (Pi400)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic      487     603     629     639     643     641     641
skein-generic      271     299     303     308     309     309     307
sha256-generic     117     127     128     130     130     129     130
sha512-generic     145     165     170     172     173     174     175
blake3-generic      81      29      71      89      89      89      89
blake3-sse2        112     323     368     379     380     371     374
blake3-sse41       101     315     357     368     369     364     360

Structurally, the new code is mainly split into these parts:
- 1x cross platform generic c variant: blake3_generic.c
- 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
- 2x assembly for ARMv8 (NEON converted from SSE2)
- 2x assembly for PPC64-LE (POWER8 converted from SSE2)
- one file for switching between the implementations

Note the PPC64 assembly requires the VSX instruction set and the
kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Closes #10058
Closes #12918
2022-06-08 15:55:57 -07:00
наб c25b281378 Remove hw_serial, ddi_strtoul()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13434
2022-05-13 10:15:31 -07:00
наб 09a7ad38a5 autoconf: single-step includes
Still descend, but only once: we get a lot of mileage out of nodist_

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13316
2022-05-10 10:18:51 -07:00
Pawel Jakub Dawidek a64d757aa4
FreeBSD: Clean up the use of ioflags
- Prefer O_* flags over F* flags that mostly mirror O_* flags anyway,
  but O_* flags seem to be preferred.
- Simplify the code as all the F*SYNC flags were defined as FFSYNC flag.
- Don't define FRSYNC flag, so we don't generate unnecessary ZIL commits.
- Remove EXCL define, FreeBSD ignores the excl argument for zfs_create()
  anyway.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #13400
2022-05-02 16:26:28 -07:00
наб ad9e767657 linux: module: weld all but spl.ko into zfs.ko
Originally it was thought it would be useful to split up the kmods
by functionality.  This would allow external consumers to only load
what was needed.  However, in practice we've never had a case where
this functionality would be needed, and conversely managing multiple
kmods can be awkward.  Therefore, this change merges all but the
spl.ko kmod in to a single zfs.ko kmod.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13274
2022-04-20 13:28:24 -07:00
Ryan Moeller d42979c6ef
Fix ACL checks for NFS kernel server
This PR changes ZFS ACL checks to evaluate
fsuid / fsgid rather than euid / egid to avoid
accidentally granting elevated permissions to
NFS clients.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Andrew Walker <awalker@ixsystems.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13221
2022-03-18 06:47:57 -06:00
наб d465fc5844 Forbid b{copy,zero,cmp}(). Don't include <strings.h> for <string.h>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12996
2022-03-15 15:13:48 -07:00
наб 861166b027 Remove bcopy(), bzero(), bcmp()
bcopy() has a confusing argument order and is actually a move, not a
copy; they're all deprecated since POSIX.1-2001 and removed in -2008,
and we shim them out to mem*() on Linux anyway

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12996
2022-03-15 15:13:42 -07:00
Mark Johnston fdcb79b52e
spl: Don't check FreeBSD rwlocks for double initialization (#13019)
This checking breaks KMSAN since it effectively loads from uninitialized
memory to see if the lock is already initialized.  This happens in
dnode_cons() for example.  This checking is not very useful, partly due
to UMA's memory trashing, and is already disabled for mutexes.  Make
mutexes and rwlocks consistent: remove double-initialization checking
for rwlocks, and pass SX_NEW to disable the same checking in
lock_init().

No functional change intended, this affects only debug builds.

As a side note, kmem cache constructors/destructors are implemented
suboptimally on FreeBSD.  FreeBSD's slab allocator, UMA, supports two
pairs of constructors/destructors: ctor/dtor and init/fini.  The former
are called upon every allocation and free of an item, while the latter
are called when an item is imported or released from a zone,
respectively.  That is, when a slab is allocated to a particular cache,
it is subdivided into items, and init is called on each.  fini is called
when the slab is being prepared to be freed back to the system.  The
intent is for them to initialize static fields such as locks, which
do not need to be initialized upon each allocation of an item.

In illumos, kmem_cache constructors/destructors correspond to UMA's
init/fini callbacks.  However, in the SPL they are implemented as UMA
ctor/dtors, meaning that they get called far more often than necessary.
This may be difficult to fix, since new code may assume the kmem cache
ctor/dtors are in fact called upon each allocation/free, and there
doesn't seem to be a clear way to implement the intended semantics on
Linux.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13019
2022-01-31 10:58:45 -08:00
наб c70bb2f610 Replace *CTASSERT() with _Static_assert()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12993
2022-01-26 11:38:52 -08:00
наб 7ada752a93 Clean up CSTYLEDs
69 CSTYLED BEGINs remain, appx. 30 of which can be removed if cstyle(1)
had a useful policy regarding
  CALL(ARG1,
  	ARG2,
  	ARG3);
above 2 lines. As it stands, it spits out *both*
  sysctl_os.c: 385: continuation line should be indented by 4 spaces
  sysctl_os.c: 385: indent by spaces instead of tabs
which is very cool

Another >10 could be fixed by removing "ulong" &al. handling.
I don't foresee anyone actually using it intentionally
(does it even exist in modern headers? why did it in the first place?).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12993
2022-01-26 11:38:52 -08:00
наб a9e2788ffe
libspl: cast to uintptr_t instead of !!ing
This led to these two warning types:
  debug.h:139:67: warning: the address of ‘ARC_anon’
  will always evaluate as ‘true’ [-Waddress]
    139 | #define ASSERT3P(x, y, z)
              ((void) sizeof (!!(x)), (void) sizeof (!!(z)))
        |                                               ^
  arc.c:1591:2: note: in expansion of macro ‘ASSERT3P’
   1591 |  ASSERT3P(hdr->b_l1hdr.b_state, ==, arc_anon);
        |  ^~~~~~~~
and
  arc.h:66:44: warning: ‘<<’ in boolean context,
  did you mean ‘<’? [-Wint-in-bool-context]
     66 | #define HDR_GET_LSIZE(hdr)
              ((hdr)->b_lsize << SPA_MINBLOCKSHIFT)
  debug.h:138:46: note: in definition of macro ‘ASSERT3U’
    138 | #define ASSERT3U(x, y, z)
              ((void) sizeof (!!(x)), (void) sizeof (!!(z)))
        |                        ^
  arc.c:1760:12: note: in expansion of macro ‘HDR_GET_LSIZE’
   1760 |   ASSERT3U(HDR_GET_LSIZE(hdr), !=, 0);
        |            ^~~~~~~~~~~~~

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13009
2022-01-24 17:05:42 -08:00
наб bc40713a8f
libspl: ASSERT*: !! for sizeof
sizeof(bitfield.member) is invalid, and this shows up in some FreeBSD
build configurations: work around this by !!ing ‒
this makes the sizeof target the ! result type (_Bool), instead

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Fixes: 42aaf0e ("libspl: ASSERT*: mark arguments as used")
Closes #12984
Closes #12986
2022-01-21 10:20:11 -08:00
наб 42aaf0e7c4 libspl: ASSERT*: mark arguments as used
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12844
2021-12-23 09:35:47 -08:00
Ryan Moeller 92a9e8c618
FreeBSD: Provide correct file generation number
va_seq was actually a thin veil over va_gen, so z_gen is a more
appropriate value than z_seq to populate the field with.

Drop the unnecessary compat obfuscation and provide the correct
file generation number.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@freebsd.org>
Closes #12851
2021-12-16 13:22:15 -08:00
Martin Matuška b8dcfb2c9f
FreeBSD: fix world build after de198f2d9
The inline function vn_flush_cached_data() in vnode.h
must not be compiled when building BASE.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #12743
2021-11-15 09:07:39 -07:00
Brian Behlendorf de198f2d95
Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
When using lseek(2) to report data/holes memory mapped regions of
the file were ignored.  This could result in incorrect results.
To handle this zfs_holey_common() was updated to asynchronously
writeback any dirty mmap(2) regions prior to reporting holes.

Additionally, while not strictly required, the dn_struct_rwlock is
now held over the dirty check to prevent the dnode structure from
changing.  This ensures that a clean dnode can't be dirtied before
the data/hole is located.  The range lock is now also taken to
ensure the call cannot race with zfs_write().

Furthermore, the code was refactored to provide a dnode_is_dirty()
helper function which checks the dnode for any dirty records to
determine its dirtiness.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12724
2021-11-07 14:27:44 -07:00
Allan Jude e945e8d7f4
Restore FreeBSD sysctl processing for arc.min and arc.max
Before OpenZFS 2.0, trying to set the FreeBSD sysctl vfs.zfs.arc_max
to a disallowed value would return an error.
Since the switch, it instead only generates WARN_IF_TUNING_IGNORED

Keep the ability to set the sysctl's specifically to 0, even though
that is less than the minimum, because some tests depend on this.

Also lost, was the ability to set vfs.zfs.arc_max to a value less
than the default vfs.zfs.arc_min at boot time. Restore this as well.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #12161
2021-08-16 09:35:19 -06:00
Brian Behlendorf 4bd99c11d7 Remove overlooked __sun_attr__ based macros
The __NORETURN, __CONST, and __PURE macros in the FreeBSD platform
code were based on the __sun_attr__ macro which was removed in
commit 5dbf6c5a6.  This caused a build failure because the
__NORETURN macro was still used in one place in kernel code.
The __CONST and __PURE macros were entirely unused.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12435
2021-07-27 09:49:11 -07:00
наб 5dbf6c5a66 Replace /*PRINTFLIKEn*/ with attribute(printf)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12201
2021-07-26 12:07:15 -07:00
Martin Matuška 14d2841b53
FreeBSD: fix compilation of FreeBSD world after 29274c9f6
prng32_bounded() is available to kernel only on FreeBSD 13+.

Call inline random_get_pseudo_bytes() with correct pointer type.
To be consistent, apply to Linux as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #12282
2021-06-25 10:28:51 -07:00
Alexander Motin 29274c9f6d
Optimize small random numbers generation
In all places except two spa_get_random() is used for small values,
and the consumers do not require well seeded high quality values.
Switch those two exceptions directly to random_get_pseudo_bytes()
and optimize spa_get_random(), renaming it to random_in_range(),
since it is not related to SPA or ZFS in general.

On FreeBSD directly map random_in_range() to new prng32_bounded() KPI
added in FreeBSD 13.  On Linux and in user-space just reduce the type
used to uint32_t to avoid more expensive 64bit division.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12183
2021-06-22 17:35:23 -06:00
Alexander Motin 371f88d96f
Remove pool io kstats (#12212)
This mostly reverts "3537 want pool io kstats" commit of 8 years ago.

From one side this code using pool-wide locks became pretty bad for
performance, creating significant lock contention in I/O pipeline.
From another, there are more efficient ways now to obtain detailed
statistics, while this statistics is illumos-specific and much less
usable on Linux and FreeBSD, reported only via procfs/sysctls.

This commit does not remove KSTAT_TYPE_IO implementation, that may
be removed later together with already unused KSTAT_TYPE_INTR and
KSTAT_TYPE_TIMER.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12212
2021-06-10 08:27:33 -07:00
Alexander Motin 86706441a8
Introduce write-mostly sums
wmsum counters are a reduced version of aggsum counters, optimized for
write-mostly scenarios.  They do not provide optimized read functions,
but instead allow much cheaper add function.  The primary usage is
infrequently read statistic counters, not requiring exact precision.

The Linux implementation is directly mapped into percpu_counter KPI.
The FreeBSD implementation is directly mapped into counter(9) KPI.
In user-space due to lack of better implementation mapped to aggsum.

Unfortunately neither Linux percpu_counter nor FreeBSD counter(9)
provide sufficient functionality to completelly replace aggsum, so
it still remains to be used for several hot counters.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12114
2021-05-27 14:27:29 -06:00
Ryan Moeller 4704be2879
Remove unimplemented virus scanning hooks
Reviewed-by: Adam Moss <c@yotes.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11972
2021-05-10 22:02:25 -07:00
наб 38c6d6cedd
module/zfs: remove zfs_zevent_console and zfs_zevent_cols
zfs_zevent_console committed multiple printk()s per line without
properly continuing them ‒ a single event could easily be fragmented
across over thirty lines, making it useless for direct application

zfs_zevent_cols exists purely to wrap the output from zfs_zevent_console

The niche this was supposed to fill can be better served by something
akin to the all-syslog ZEDLET

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #7082 
Closes #11996
2021-05-10 11:00:15 -07:00
Ryan Moeller 801c76149b
FreeBSD: Prune some unneeded definitions
IS_XATTRDIR is never used.
v_count is only used in two places, one immediately followed by the
use of the real name, v_usecount.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #11973
2021-04-30 07:34:53 -07:00
Andrea Gelmini bf169e9f15 Fix various typos
Correct an assortment of typos throughout the code base.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #11774
2021-04-02 18:52:15 -07:00
Adam D. Moss c94d648b1c
Microoptimizations for VERIFY() and friends
Add branch hints and constify the intermediate evaluations of 
left/right params in VERIFY3*().

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adam Moss <c@yotes.com>
Closes #11708
2021-03-11 17:16:09 -08:00
Allan Jude 92e8fb6395
Add missing files to Makefile
Some .h files that were added were missed in this Makefile. Since 
they are .h files, their being missing only resulted in them 
disappeared from the dist archive.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #11705
2021-03-11 17:13:34 -08:00
Brian Atkinson c0801bf35a
Cleaning up uio headers
Making uio_impl.h the common header interface between Linux and FreeBSD
so both OS's can share a common header file. This also helps reduce code
duplication for zfs_uio_t for each OS.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #11622
2021-02-20 20:16:50 -08:00
Matt Macy 0e9bcd5d4f FreeBSD: fix HEAD build, conditionally remove FDSYNC defines
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11458
2021-01-23 15:39:55 -08:00
Brian Atkinson d0cd9a5cc6
Extending FreeBSD UIO Struct
In FreeBSD the struct uio was just a typedef to uio_t. In order to
extend this struct, outside of the definition for the struct uio, the
struct uio has been embedded inside of a uio_t struct.

Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear
this is a ZFS interface.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #11438
2021-01-20 21:27:30 -08:00
Matthew Macy f11b09dec3
FreeBSD: minor_t needs to be signed so that -1 is recognized as such
zfsdev_close sets zs_minor to -1 to avoid duplicate calls to
destroy. This doesn't mix well with the current u_int used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11437
2021-01-07 10:41:27 -08:00
Ryan Moeller 439dc034e9 FreeBSD: Implement sysctl for fletcher4 impl
There is a tunable to select the fletcher 4 checksum implementation on
Linux but it was not present in FreeBSD.

Implement the sysctl handler for FreeBSD and use ZFS_MODULE_PARAM_CALL
to provide the tunable on both platforms.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11270
2020-12-11 10:29:01 -08:00
Ryan Moeller e0716250bf
FreeBSD: Do zcommon_init sooner to avoid FPU panic
There has been a panic affecting some system configurations where the
thread FPU context is disturbed during the fletcher 4 benchmarks,
leading to a panic at boot.

module_init() registers zcommon_init to run in the last subsystem
(SI_SUB_LAST).  Running it as soon as interrupts have been configured
(SI_SUB_INT_CONFIG_HOOKS) makes sure we have finished the benchmarks
before we start doing other things.

While it's not clear *how* the FPU context was being disturbed, this
does seem to avoid it.

Add a module_init_early() macro to run zcommon_init() at this earlier
point on FreeBSD.  On Linux this is defined as module_init().

Authored by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11302
2020-12-09 21:29:00 -08:00
Adrian Chadd 3fcd737478 Fix compiling on FreeBSD + gcc - don't assume illmnos bits
This looks like it was once from the illumnos compat code.
FreeBSD doesn't have cmn_err as a compiler format attribute, so
it definitely errors out.

It doesn't show up on LLVM because it doesn't trigger at all.

Add in the format flags but keep them behind #if 0 for now;
there are too many format issues that trigger when one does
format checking in the shared code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: adrian chadd <adrian@freebsd.org>
Closes #11068
Closes #11069
2020-11-10 15:54:12 -08:00
Ryan Moeller 2074dfd0e9 FreeBSD: Move uio_prefaultpages def to uio.h
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11176
2020-11-10 10:58:59 -08:00
Mateusz Guzik 09eb36ce3d
Introduce CPU_SEQID_UNSTABLE
Current CPU_SEQID users don't care about possibly changing CPU ID, but
enclose it within kpreempt disable/enable in order to fend off warnings
from Linux's CONFIG_DEBUG_PREEMPT.

There is no need to do it. The expected way to get CPU ID while allowing
for migration is to use raw_smp_processor_id.

In order to make this future-proof this patch keeps CPU_SEQID as is and
introduces CPU_SEQID_UNSTABLE instead, to make it clear that consumers
explicitly want this behavior.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #11142
2020-11-02 11:51:12 -08:00
Matthew Macy 8583540c6e
Consolidate zfs_holey and zfs_access
The zfs_holey() and zfs_access() functions can be made common
to both FreeBSD and Linux.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11125
2020-10-31 09:40:08 -07:00
Matthew Macy 5fa356ea44
Remove UIO_ZEROCOPY functions structures
The original xuio zero copy functionality has always been unused 
on Linux and FreeBSD.  Remove this disabled code to avoid any
confusion and improve readability.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11124
2020-10-30 10:00:33 -07:00
Matthew Macy e53d678d4a
Share zfs_fsync, zfs_read, zfs_write, et al between Linux and FreeBSD
The zfs_fsync, zfs_read, and zfs_write function are almost identical
between Linux and FreeBSD.  With a little refactoring they can be
moved to the common code which is what is done by this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #11078
2020-10-21 14:08:06 -07:00
Warner Losh b302185a92
FreeBSD: make adjustments for the standalone environment
In FreeBSD, there are three compile environments that are supported:
user land, the kernel and the bootloader / standalone. Adjust the
headers to compile in the standalone environment. Limit kernel-only
items from view when _STANDALONE is defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Warner Losh <imp@FreeBSD.org>
Closes #10998
2020-10-13 21:05:49 -07:00
Brian Behlendorf d0249a4bd0
Replace ZFS on Linux references with OpenZFS
This change updates the documentation to refer to the project
as OpenZFS instead ZFS on Linux.  Web links have been updated
to refer to https://github.com/openzfs/zfs.  The extraneous
zfsonlinux.org web links in the ZED and SPL sources have been
dropped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11007
2020-10-08 20:10:13 -07:00
Ryan Moeller 79f0935fab
FreeBSD: Sort out kernel FPU headers for 12.1-REL
We were missing an include for kernel FPU functions, breaking the build
on FreeBSD 12.1-RELEASE.  This was apparently being pulled in from
elsewhere on stable/12 and head.

Sorted the other includes in these files while here.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11005
2020-10-02 17:48:45 -07:00
Matthew Macy 7b8363d7f0
FreeBSD: Add support for procfs_list
The procfs_list interface is required by several kstats. Implement
this functionality for FreeBSD to provide access to these kstats.
                           
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10890
2020-09-23 16:43:51 -07:00
Matthew Macy 3dad29fb4b
FreeBSD: Don't save user FPU context in kernel threads
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10899
2020-09-23 11:09:48 -07:00
Alexander Richardson f3064162ba
Fixes for running FreeBSD buildworld on Linux/macOS hosts
Adding an #ifdef __FreeBSD__ to a FreeBSD-specific header may seem odd,
but these headers are used on non-FreeBSD systems during the bootstrap
tools phase.
Originally submitted downstream as https://reviews.freebsd.org/D26193

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Closes #10863
2020-09-03 20:06:03 -07:00
Matthew Macy ac6e5fb202
Replace cv_{timed}wait_sig with cv_{timed}wait_idle where appropriate
There are a number of places where cv_?_sig is used simply for
accounting purposes but the surrounding code has no ability to
cope with actually receiving a signal. On FreeBSD it is possible
to send signals to individual kernel threads so this could
enable undesirable behavior.

This patch adds routines on Linux that will do the same idle
accounting as _sig without making the task interruptible. On
FreeBSD cv_*_idle  are all aliases for cv_*

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10843
2020-09-03 20:04:09 -07:00
Ryan Moeller eff621071f FreeBSD: Simplify INGLOBALZONE
FreeBSD's previous ZFS implemented INGLOBALZONE(thread) as
(!jailed((thread)->td_ucred)) and passed curthread to INGLOBALZONE.

We pass curproc instead of curthread, so we can achieve the same effect
with (!jailed((proc)->p_ucred)).  The implementation is trivial enough
to fit on a single line in a define.  We don't really need a whole
separate function for something that's already macros all the way down.

Eliminate in_globalzone.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10851
2020-08-31 19:43:08 -07:00
Ryan Moeller 2f65c7a608 FreeBSD: Define crgetzoneid appropriately
The previous ZFS implementation on FreeBSD had ifdefs to use jailed()
instead of crgetzoneid() in dsl_dir.c, however we can simply provide an
appropriate definition of crgetzoneid for the same effect.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #10851
2020-08-31 19:42:26 -07:00
Toomas Soome ec0c480c14
Remove pragma ident lines
The #pragma ident is a historical relic and not needed any more, this
pragma is actually unknown for common compilers and is only causing
trouble.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #10810
2020-08-26 10:35:50 -07:00
Ryan Moeller 6fe3498ca3
Import vdev ashift optimization from FreeBSD
Many modern devices use physical allocation units that are much
larger than the minimum logical allocation size accessible by
external commands. Two prevalent examples of this are 512e disk
drives (512b logical sector, 4K physical sector) and flash devices
(512b logical sector, 4K or larger allocation block size, and 128k
or larger erase block size). Operations that modify less than the
physical sector size result in a costly read-modify-write or garbage
collection sequence on these devices.

Simply exporting the true physical sector of the device to ZFS would
yield optimal performance, but has two serious drawbacks:

 1. Existing pools created with devices that have different logical
    and physical block sizes, but were configured to use the logical
    block size (e.g. because the OS version used for pool construction
    reported the logical block size instead of the physical block
    size) will suddenly find that the vdev allocation size has
    increased. This can be easily tolerated for active members of
    the array, but ZFS would prevent replacement of a vdev with
    another identical device because it now appears that the smaller
    allocation size required by the pool is not supported by the new
    device.

 2. The device's physical block size may be too large to be supported
    by ZFS. The optimal allocation size for the vdev may be quite
    large. For example, a RAID controller may export a vdev that
    requires read-modify-write cycles unless accessed using 64k
    aligned/sized requests. ZFS currently has an 8k minimum block
    size limit.

Reporting both the logical and physical allocation sizes for vdevs
solves these problems. A device may be used so long as the logical
block size is compatible with the configuration. By comparing the
logical and physical block sizes, new configurations can be optimized
and administrators can be notified of any existing pools that are
sub-optimal.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Matthew Macy <mmacy@freebsd.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10619
2020-08-21 12:53:17 -07:00
Matthew Macy 716b53d0a1
FreeBSD: Fix UNIX permissions checking
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10727
2020-08-18 09:57:07 -07:00
Ryan Moeller 009cc8e884
Make zc_nvlist_src_size limit tunable
We limit the size of nvlists passed to the kernel so a user cannot make
the kernel do an unreasonably large allocation.  On FreeBSD this limit
was 128 kiB, which turns out to be a bit too small when doing some
operations involving a large number of datasets or snapshots, for
example replication.

Make this limit tunable, with a platform-specific auto default.
Linux keeps its limit at KMALLOC_MAX_SIZE. FreeBSD uses 1/4 of the
system limit on user wired memory, which allows it to scale depending
on system configuration.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Issue #6572 
Closes #10706
2020-08-18 09:33:55 -07:00
Ryan Moeller 3c3d7c8a57
FreeBSD: Create taskq threads in appropriate proc
Stepping stone toward re-enabling spa_thread on FreeBSD.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10715
2020-08-17 11:01:19 -07:00
Matthew Ahrens 74e994ec63 Remove KM_NODEBUG
Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650
2020-08-05 10:28:13 -07:00
Matthew Ahrens f68af67a0c Remove KMC_NOTOUCH
Remove dead code to make the implementation easier to understand.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Ahrens <matt@delphix.com>
Closes #10650
2020-08-05 10:27:46 -07:00
Matthew Macy 1b376d176e
FreeBSD: Add support for lockless lookup
Authored-by: mjg <mjg@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10657
2020-08-05 10:19:51 -07:00
Matthew Macy 47ed79ff60
Changes to make openzfs build within FreeBSD buildworld
A collection of header changes to enable FreeBSD to build
with vendored OpenZFS.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10635
2020-07-31 21:30:31 -07:00
Matthew Macy 27d96d2254
Rename refcount.h to zfs_refcount.h
Renamed to avoid conflicting with refcount.h when a different
implementation is already provided by the platform.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10620
2020-07-29 16:35:33 -07:00
Serapheim Dimitropoulos 843e9ca2e1
Introduce names for ZTHRs
When debugging issues or generally analyzing the runtime of
a system it would be nice to be able to tell the different
ZTHRs running by name rather than having to analyze their
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #10630
2020-07-29 09:43:33 -07:00
Matthew Macy 5678d3f593
Prefix zfs internal endian checks with _ZFS
FreeBSD defines _BIG_ENDIAN BIG_ENDIAN _LITTLE_ENDIAN
LITTLE_ENDIAN on every architecture. Trying to do
cross builds whilst hiding this from ZFS has proven
extremely cumbersome.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10621
2020-07-28 13:02:49 -07:00
Matthew Macy e64cc4954c
Refactor ccompile.h to not include system headers
This is a step toward being able to vendor the OpenZFS code in FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10625
2020-07-25 20:09:50 -07:00
Matthew Macy 6d8da84106
Make use of ZFS_DEBUG consistent within kmod sources
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10623
2020-07-25 20:07:44 -07:00
Matthew Macy f5b189f937
FreeBSD: Fixes required to build ZFS on PowerPC
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10622
2020-07-25 11:00:23 -07:00
Matthew Ahrens e59a377a8f
filesystem_limit/snapshot_limit is incorrectly enforced against root
The filesystem_limit and snapshot_limit properties limit the number of
filesystems or snapshots that can be created below this dataset.
According to the manpage, "The limit is not enforced if the user is
allowed to change the limit."  Two types of users are allowed to change
the limit:

1. Those that have been delegated the `filesystem_limit` or
`snapshot_limit` permission, e.g. with
`zfs allow USER filesystem_limit DATASET`.  This works properly.

2. A user with elevated system privileges (e.g. root).  This does not
work - the root user will incorrectly get an error when trying to create
a snapshot/filesystem, if it exceeds the `_limit` property.

The problem is that `priv_policy_ns()` does not work if the `cred_t` is
not that of the current process.  This happens when
`dsl_enforce_ds_ss_limits()` is called in syncing context (as part of a
sync task's check func) to determine the permissions of the
corresponding user process.

This commit fixes the issue by passing the `task_struct` (typedef'ed as
a `proc_t`) to syncing context, and then using `has_capability()` to
determine if that process is privileged.  Note that we still need to
pass the `cred_t` to syncing context so that we can check if the user
was delegated this permission with `zfs allow`.

This problem only impacts Linux.  Wrappers are added to FreeBSD but it
continues to use `priv_check_cred()`, which works on arbitrary `cred_t`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #8226
Closes #10545
2020-07-11 17:18:02 -07:00
Matthew Macy 3933305eac
FreeBSD: Use a hash table for taskqid lookups
Previously a tqent could be recycled prematurely, update the
code to use a hash table for lookups to resolve this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10529
2020-07-11 17:13:45 -07:00
Matthew Macy 7ddb753d17
freebsd: changes necessary to coexist with dtrace in tree
Fix header conflicts when building zfs with openzfs as a vendor import.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10497
2020-07-01 09:10:08 -07:00
Matthew Ahrens 3c42c9ed84
Clean up OS-specific ARC and kmem code
OS-specific code (e.g. under `module/os/linux`) does not need to share
its code structure with any other operating systems.  In particular, the
ARC and kmem code need not be similar to the code in illumos, because we
won't be syncing this OS-specific code between operating systems.  For
example, if/when illumos support is added to the common repo, we would
add a file `module/os/illumos/zfs/arc_os.c` for the illumos versions of
this code.

Therefore, we can simplify the code in the OS-specific ARC and kmem
routines.

These changes do not impact system behavior, they are purely code
cleanup.  The changes are:

Arenas are not used on Linux or FreeBSD (they are always `NULL`), so
`heap_arena`, `zio_arena`, and `zio_alloc_arena` can be removed, along
with code that uses them.

In `arc_available_memory()`:
 * `desfree` is unused, remove it
 * rename `freemem` to avoid conflict with pre-existing `#define`
 * remove checks related to arenas
 * use units of bytes, rather than converting from bytes to pages and
   then back to bytes

`SPL_KMEM_CACHE_REAP` is unused, remove it.

`skc_reap` is unused, remove it.

The `count` argument to `spl_kmem_cache_reap_now()` is unused, remove
it.

`vmem_size()` and associated type and macros are unused, remove them.

In `arc_memory_throttle()`, use a less confusing variable name to store
the result of `arc_free_memory()`.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10499
2020-06-29 09:01:07 -07:00
Arvind Sankar 4d8e68c42f Avoid installing kernel headers on FreeBSD
The kernel headers are installed for DKMS on linux, so don't install
them unless we're building on linux.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10506
2020-06-27 17:40:14 -07:00
Arvind Sankar 6b99fc0620 Fixes for make dist
Reduce the usage of EXTRA_DIST. If files are conditionally included in
_SOURCES, _HEADERS etc, automake is smart enough to dist all files that
could possibly be included, but this does not apply to EXTRA_DIST,
resulting in make dist depending on the configuration.

Add some files that were missing altogether in various Makefile's.

The changes to disted files in this commit (excluding deleted files):

+./cmd/zed/agents/README.md
+./etc/init.d/README.md
+./lib/libspl/os/freebsd/getexecname.c
+./lib/libspl/os/freebsd/gethostid.c
+./lib/libspl/os/freebsd/getmntany.c
+./lib/libspl/os/freebsd/mnttab.c
-./lib/libzfs/libzfs_core.pc
-./lib/libzfs/libzfs.pc
+./lib/libzfs/os/freebsd/libzfs_compat.c
+./lib/libzfs/os/freebsd/libzfs_fsshare.c
+./lib/libzfs/os/freebsd/libzfs_ioctl_compat.c
+./lib/libzfs/os/freebsd/libzfs_zmount.c
+./lib/libzutil/os/freebsd/zutil_compat.c
+./lib/libzutil/os/freebsd/zutil_device_path_os.c
+./lib/libzutil/os/freebsd/zutil_import_os.c
+./module/lua/README.zfs
+./module/os/linux/spl/README.md
+./tests/README.md
+./tests/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_rm_nested.ksh
+./tests/zfs-tests/tests/functional/cli_root/zfs_send/zfs_send_encrypted_unloaded.ksh
+./tests/zfs-tests/tests/functional/inheritance/README.config
+./tests/zfs-tests/tests/functional/inheritance/README.state
+./tests/zfs-tests/tests/functional/rsend/rsend_016_neg.ksh
+./tests/zfs-tests/tests/perf/fio/sequential_readwrite.fio

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10501
2020-06-26 14:20:02 -07:00
Arvind Sankar 7513807320 Drop unnecessary srcdir paths
There's no need to specify the srcdir explicitly in _HEADERS and
EXTRA_DIST.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10493
2020-06-24 18:20:18 -07:00
Ryan Moeller 9192f27c1d
Add zfs_multihost_interval tunable handler for FreeBSD
This tunable required a handler to be implemented for
ZFS_MODULE_PARAM_CALL.

Add the handler so the tunable can be declared in common code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10490
2020-06-23 13:32:42 -07:00
Arvind Sankar c3fe42aabd Remove dead code
Delete unused functions.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10470
2020-06-18 12:21:18 -07:00
Arvind Sankar 65c7cc49bf Mark functions as static
Mark functions used only in the same translation unit as static. This
only includes functions that do not have a prototype in a header file
either.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10470
2020-06-18 12:20:38 -07:00
Matthew Macy 8056a75672
Disambiguate condvar API contract
On Illumos callers of cv_timedwait and cv_timedwait_hires
can't distinguish between whether or not the cv was signaled
or the call timed out. Illumos handles this (for some definition
of handles) by calling cv_signal in the return path if we were
signaled but the return value indicates instead that we timed
out. This would make sense if it were possible to query the the
cv for its net signal disposition. However, this isn't possible
and, in spite of the fact that there are places in the code that
clearly take a different and incompatible path if a timeout value
is indicated, this distinction appears to be rather subtle to most
developers. This problem is further compounded by the fact that on
Linux, calling cv_signal in the return path wouldn't even do the
right thing unless there are other waiters.

Since it is possible for the caller to independently determine how
much time is remaining but it is not possible to query if the cv
was in fact signaled, prioritizing signalling over timeout seems
like a cleaner solution. In addition, judging from usage patterns
within the code itself, it is also less error prone.

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10471
2020-06-18 10:17:50 -07:00
Ryan Moeller c13facb9c4
Fix FreeBSD condvar semantics
We should return -1 instead of negative deltas, and 0 if signaled.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10460
2020-06-16 09:59:31 -07:00
Jorgen Lundman 883a40fff4
Add convenience wrappers for common uio usage
The macOS uio struct is opaque and the API must be used, this
makes the smallest changes to the code for all platforms.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #10412
2020-06-14 10:09:55 -07:00
Ryan Moeller 499dccd69b
FreeBSD: Don't require zeroing new locks before init
This has not shown to be of use enough to justify the inconvenience.

Reviewed-by: Matt Macy <mmacy@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10449
2020-06-13 10:58:10 -07:00
Matthew Ahrens f66434268c
Remove unnecessary references to slavery
The horrible effects of human slavery continue to impact society.  The
casual use of the term "slave" in computer software is an unnecessary
reference to a painful human experience.

This commit removes all possible references to the term "slave".

Implementation notes:

The zpool.d/slaves script is renamed to dm-deps, which uses the same
terminology as `dmsetup deps`.

References to the `/sys/class/block/$dev/slaves` directory remain.  This
directory name is determined by the Linux kernel.  Although
`dmsetup deps` provides the same information, it unfortunately requires
elevated privileges, whereas the `/sys/...` directory is world-readable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10435
2020-06-10 17:07:59 -07:00
Andrea Gelmini dd4bc569b9
Fix typos
Correct various typos in the comments and tests.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #10423
2020-06-09 21:24:09 -07:00
Ryan Moeller 60265072e0
Improve compatibility with C++ consumers
C++ is a little picky about not using keywords for names, or string
constness.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10409
2020-06-06 12:54:04 -07:00
Jorgen Lundman eeb8fae9c7
Upstream: add missing thread_exit()
Undo FreeBSD wrapper for thread_create() added to call thread_exit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #10314
2020-05-14 15:58:09 -07:00
Ryan Moeller 639dfeb831
Update FreeBSD SPL atomics
Sync up with the following changes from FreeBSD:

ZFS: add emulation of atomic_swap_64 and atomic_load_64

Some 32-bit platforms do not provide 64-bit atomic operations that ZFS
requires, either in userland or at all.  We emulate those operations
for those platforms using a mutex.  That is not entirely correct and
it's very efficient.  Besides, the loads are plain loads, so torn
values are possible.

Nevertheless, the emulation seems to work for some definition of work.

This change adds atomic_swap_64, which is already used in ZFS code,
and atomic_load_64 that can be used to prevent torn reads.

Authored by: avg <avg@FreeBSD.org>
FreeBSD-commit: freebsd/freebsd@3458e5d1e6

cleanup of illumos compatibility atomics

atomic_cas_32 is implemented using atomic_fcmpset_32 on all platforms.
Ditto for atomic_cas_64 and atomic_fcmpset_64 on platforms that have
it.  The only exception is sparc64 that provides MD atomic_cas_32 and
atomic_cas_64.
This is slightly inefficient as fcmpset reports whether the operation
updated the target and that information is not needed for cas.
Nevertheless, there is less code to maintain and to add for new
platforms.  Also, the operations are done inline now as opposed to
function calls before.

atomic_add_64_nv is implemented using atomic_fetchadd_64 on platforms
that provide it.

casptr, cas32, atomic_or_8, atomic_or_8_nv are completely removed as
they have no users.

atomic_mtx that is used to emulate 64-bit atomics on platforms that
lack them is defined only on those platforms.

As a result, platform specific opensolaris_atomic.S files have lost
most of their code.  The only exception is i386 where the
compat+contrib code provides 64-bit atomics for userland use.  That
code assumes availability of cmpxchg8b instruction.  FreeBSD does not
have that assumption for i386 userland and does not provide 64-bit
atomics.  Hopefully, this can and will be fixed.

Authored by: avg <avg@FreeBSD.org>
FreeBSD-commit: freebsd/freebsd@e9642c209b

emulate illumos membar_producer with atomic_thread_fence_rel

membar_producer is supposed to be a store-store barrier.
Also, in the code that FreeBSD has ported from illumos membar_producer
is used only with regular stores to regular memory (with respect to
caching).

We do not have an MI primitive for the store-store barrier, so
atomic_thread_fence_rel is the closest we have as it provides
(load | store) -> store barrier.

Previously, membar_producer was an empty function call on all 32-bit
arm-s, 32-bit powerpc, riscv and all mips variants.  I think that it
was inadequate.
On other platforms, such as amd64, arm64, i386, powerpc64, sparc64,
membar_producer was implemented using stronger primitives than required
for a store-store barrier with respect to regular memory access.
For example, it used sfence on amd64 and lock-ed nop in i386 (despite
TSO).
On powerpc64 we now use recommended lwsync instead of eieio.
On sparc64 FreeBSD uses TSO mode.
On arm64/aarch64 we now use dmb sy instead of dmb ish.  Not sure if
this is an improvement, actually.

After this change we can drop opensolaris_atomic.S for aarch64, amd64,
powerpc64 and sparc64 as all required atomic operations have either
direct or light-weight mapping to FreeBSD native atomic operations.

Discussed with: kib
Authored by: avg <avg@FreeBSD.org>
FreeBSD-commit: freebsd/freebsd@50cdda62fc

fix up r353340, don't assume that fcmpset has strong semantics

fcmpset can have two kinds of semantics, weak and strong.
For practical purposes, strong semantics means that if fcmpset fails
then the reported current value is always different from the expected
value.  Weak semantics means that the reported current value may be the
same as the expected value even though fcmpset failed.  That's a so
called "sporadic" failure.

I originally implemented atomic_cas expecting strong semantics, but
many platforms actually have weak one.

Reported by:    pkubaj (not confirmed if same issue)
Discussed with: kib, mjg
Authored by: avg <avg@FreeBSD.org>
FreeBSD-commit: freebsd/freebsd@238787c74e

[PowerPC] [MIPS] Implement 32-bit kernel emulation of atomic64 operations

This is a lock-based emulation of 64-bit atomics for kernel use, split off
from an earlier patch by jhibbits.

This is needed to unblock future improvements that reduce the need for
locking on 64-bit platforms by using atomic updates.

The implementation allows for future integration with userland atomic64,
but as that implies going through sysarch for every use, the current
status quo of userland doing its own locking may be for the best.

Submitted by:   jhibbits (original patch), kevans (mips bits)
Reviewed by:    jhibbits, jeff, kevans
Authored by: bdragon <bdragon@FreeBSD.org>
Differential Revision:  https://reviews.freebsd.org/D22976
FreeBSD-commit: freebsd/freebsd@db39dab3a8

Remove sparc64 kernel support

Remove all sparc64 specific files
Remove all sparc64 ifdefs
Removee indireeect sparc64 ifdefs

Authored by: imp <imp@FreeBSD.org>
FreeBSD-commit: freebsd/freebsd@48b94864c5

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Ported-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10250
2020-05-04 15:07:04 -07:00
Matthew Ahrens 1f043c8be1
Fix zfs send progress reporting
The progress of a send is supposed to be reported by `zfs send -v`, but
it is not.  This works by creating a new user thread (with
pthread_create()) which does ZFS_IOC_SEND_PROGRESS ioctls to check how
much progress has been made.  This IOCTL finds the specified send (since
there may be multiple concurrent sends in the system).  The IOCTL also
checks that the specified send was started by the current process.

On Linux, different threads of the same process are represented as
different `struct task_struct`s (and, confusingly, have different
PID's).  To check if if two threads are in the same process, we need to
check if they have the same `struct task_struct:group_leader`.

We used to to this correctly, but it was inadvertently changed by
30af21b025 (Redacted Send) to simply check if the current
`struct task_struct` is the one that started the send.

This commit changes the code back to checking if the send was started by
a `struct task_struct` with the same `group_leader` as the calling
thread.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Chris Wedgwood <cw@f00f.org>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10215 
Closes #10216
2020-04-20 10:12:48 -07:00
Matthew Macy c614fd6e12
Use new FreeBSD API to largely eliminate object locking
Propagate changes in HEAD that mostly eliminate object locking.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10205
2020-04-17 09:30:26 -07:00
Matthew Macy 9f0a21e641
Add FreeBSD support to OpenZFS
Add the FreeBSD platform code to the OpenZFS repository.  As of this
commit the source can be compiled and tested on FreeBSD 11 and 12.
Subsequent commits are now required to compile on FreeBSD and Linux.
Additionally, they must pass the ZFS Test Suite on FreeBSD which is
being run by the CI.  As of this commit 1230 tests pass on FreeBSD
and there are no unexpected failures.

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #898 
Closes #8987
2020-04-14 11:36:28 -07:00