Commit Graph

246 Commits

Author SHA1 Message Date
Rob Norris 2e6b3c4d94 Linux 6.8 compat: handle mnt_idmap user_namespace change
struct mnt_idmap no longer has a struct user_namespace within it. Work
around this by creating a temporary with the copy of the map we need
taken from the idmap.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Youzhong Yang <yyang@mathworks.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15805
2024-01-29 11:36:07 -08:00
Rob Norris 84980ee0e6 Linux 6.8 compat: implement strlcpy fallback
Linux has removed strlcpy in favour of strscpy. This implements a
fallback implementation of strlcpy for this case.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #15805
2024-01-29 11:36:07 -08:00
Shengqi Chen 86239a5b9c
compact: workaround for GPL-only symbols on riscv from Linux 6.2
Since Linux 6.2, the implementation of flush_dcache_page on riscv
references GPL-only symbol `PageHuge`, breaking the build of zfs.

This patch uses existing mechanism to override flush_dcache_page,
removing the call to `PageHuge`. According to comments in kernel,
it is only used to do some check against HugeTLB pages, which only
exist in userspace. ZFS uses flush_dcache_page only on kernel pages,
thus this patch will not introduce any behaviour change.

See also: torvalds/linux@d33deda, openzfs/zfs@589f59b

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #14974 
Closes #15627
2023-12-06 12:37:50 -08:00
Coleman Kane 7ac56b86cd Linux 6.6 compat: fsync_bdev() has been removed in favor of sync_blockdev()
In Linux commit 560e20e4bf6484a0c12f9f3c7a1aa55056948e1e, the
fsync_bdev() function was removed in favor of sync_blockdev() to do
(roughly) the same thing, given the same input. This change
conditionally attempts to call sync_blockdev() if fsync_bdev() isn't
discovered during configure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #15263
2023-09-21 18:38:40 -07:00
Coleman Kane 8ce2eba9e6
Linux 6.5 compat: Use copy_splice_read instead of filemap_splice_read
Using the filemap_splice_read function for the splice_read handler was
leading to occasional data corruption under certain circumstances. Favor
using copy_splice_read instead, which does not demonstrate the same
erroneous behavior under the tested failure cases.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #15164
2023-08-08 15:42:32 -07:00
Coleman Kane 36261c8238
Linux 6.5 compat: replace generic_file_splice_read with filemap_splice_read
The generic_file_splice_read function was removed in Linux 6.5 in favor
of filemap_splice_read. Add an autoconf test for filemap_splice_read and
use it if it is found as the handler for .splice_read in the
file_operations struct. Additionally, ITER_PIPE was removed in 6.5. This
change removes the ITER_* macros that OpenZFS doesn't use from being
tested in config/kernel-vfs-iov_iter.m4. The removal of ITER_PIPE was
causing the test to fail, which also affected the code responsible for
setting the .splice_read handler, above. That behavior caused run-time
panics on Linux 6.5.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #15155
2023-08-07 15:47:46 -07:00
Coleman Kane e47e9bbe86
Linux 6.5 compat: register_sysctl_table removed
Additionally, the .child element of ctl_table has been removed in 6.5.
This change adds a new test for the pre-6.5 register_sysctl_table()
function, and uses the old code in that case. If it isn't found, then
the parentage entries in the tables are removed, and the register_sysctl
call is provided the paths of "kernel/spl", "kernel/spl/kmem", and
"kernel/spl/kstat" directly, to populate each subdirectory over three
calls, as is the new API.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #15138
2023-08-02 14:05:46 -07:00
Brian Atkinson a5fdba1185
Revert "Linux 6.5 compat: register_sysctl_table removed"
This reverts commit b35374fd64 as there
are error messages when loading the SPL module. Errors seemed to be tied
to duplicate a duplicate entry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #15134
2023-08-01 14:48:19 -07:00
Coleman Kane b35374fd64
Linux 6.5 compat: register_sysctl_table removed
Additionally, the .child element of ctl_table has been removed in 6.5.
This change adds a new test for the pre-6.5 register_sysctl_table()
function, and uses the old code in that case. If it isn't found, then
the parentage entries in the tables are removed, and the register_sysctl
call is provided the paths of "kernel/spl", "kernel/spl/kmem", and
"kernel/spl/kstat" directly, to populate each subdirectory over three
calls, as is the new API.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #15098
2023-08-01 08:27:58 -07:00
Rob Norris 6b0a4be5fe linux: implement filesystem-side copy/clone functions for EL7
Redhat have backported copy_file_range and clone_file_range to the EL7
kernel using an "extended file operations" wrapper structure. This
connects all that up to let cloning work there too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Closes #15050
2023-07-24 16:37:04 -07:00
Rob Norris 5a35c68b67 linux: implement filesystem-side copy/clone functions
This implements the Linux VFS ops required to service the file
copy/clone APIs:

  .copy_file_range    (4.5+)
  .clone_file_range   (4.5-4.19)
  .dedupe_file_range  (4.5-4.19)
  .remap_file_range   (4.20+)

Note that dedupe_file_range() and remap_file_range(REMAP_FILE_DEDUP) are
hooked up here, but are not implemented yet.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kay Pedersen <mail@mkwg.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Closes #15050
2023-07-24 16:36:38 -07:00
youzhongyang f8447cf22e
Linux 6.4 compat: reclaimed_slab renamed to reclaimed
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #14891
2023-05-24 12:23:42 -07:00
youzhongyang d4dc53dad2
Linux 6.3 compat: idmapped mount API changes
Linux kernel 6.3 changed a bunch of APIs to use the dedicated idmap 
type for mounts (struct mnt_idmap), we need to detect these changes 
and make zfs work with the new APIs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #14682
2023-04-10 14:15:36 -07:00
youzhongyang 8eb2f26057
Linux 6.3 compat: writepage_t first arg struct folio*
The type def of writepage_t in kernel 6.3 is changed to take
struct folio* as the first argument. We need to detect this
change and pass correct function to write_cache_pages().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #14699
2023-04-05 10:01:38 -07:00
Low-power 589f59b52a
Workaround for Linux PowerPC GPL-only cpu_has_feature()
Linux since 4.7 makes interface 'cpu_has_feature' to use jump labels on
powerpc if CONFIG_JUMP_LABEL_FEATURE_CHECKS is enabled, in this case
however the inline function references GPL-only symbol
'cpu_feature_keys'.

ZFS currently uses 'cpu_has_feature' either directly or indirectly from
several places; while it is unknown how this issue didn't break ZFS on
64-bit little-endian powerpc, it is known to break ZFS with many Linux
versions on both 32-bit and 64-bit big-endian powerpc.

Until this issue is fixed in Linux, we have to workaround it by
overriding affected inline functions without depending on
'cpu_feature_keys'.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: WHR <msl0000023508@gmail.com>
Closes #14590
2023-03-10 09:35:00 -08:00
Brian Behlendorf 3fc92adc40
Linux: use filemap_range_has_page()
As of the 4.13 kernel filemap_range_has_page() can be used to
check if there is a page mapped in a given file range.  When
available this interface should be used which eliminates the
need for the zp->z_is_mapped boolean.

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14493
2023-02-14 11:04:34 -08:00
Antonio Russo 556ed09537 Introduce ZFS_LINUX_REQUIRE_API autoconf macro
Currently, if API tests fail, we either ignore the failures, or
unconditionally halt the kernel build.  This leads to situations where
incompatibilities with existing APIs may develop, but not trip the
configure compatibility checks.

This introduces a new mechanism to require APIs for kernels above a
particular version.  While not perfect, this at least guarantees
mainline kernels do not break existing APIs without at least providing
some warning.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14343
2023-01-06 14:33:53 -08:00
youzhongyang f224eddf92
Support idmapped mount in user namespace
Linux 5.17 commit torvalds/linux@5dfbfe71e enables "the idmapping 
infrastructure to support idmapped mounts of filesystems mounted 
with an idmapping". Update the OpenZFS accordingly to improve the 
idmapped mount support. 

This pull request contains the following changes:

- xattr setter functions are fixed to take mnt_ns argument. Without
  this, cp -p would fail for an idmapped mount in a user namespace.
- idmap_util is enhanced/fixed for its use in a user ns context.
- One test case added to test idmapped mount in a user ns.

Reviewed-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #14097
2022-11-08 10:28:56 -08:00
Pavel Snajdr 86db35c447 Remove zpl_revalidate: fix snapshot rollback
Open files, which aren't present in the snapshot, which is being
roll-backed to, need to disappear from the visible VFS image of
the dataset.

Kernel provides d_drop function to drop invalid entry from
the dcache, but inode can be referenced by dentry multiple dentries.

The introduced zpl_d_drop_aliases function walks and invalidates
all aliases of an inode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #9600
Closes #14070
2022-10-28 09:47:19 -07:00
youzhongyang 2a068a1394
Support idmapped mount
Adds support for idmapped mounts.  Supported as of Linux 5.12 this 
functionality allows user and group IDs to be remapped without changing 
their state on disk.  This can be useful for portable home directories
and a variety of container related use cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #12923
Closes #13671
2022-10-19 11:17:09 -07:00
Alexander ab49df487b
Linux compat: fix DECLARE_EVENT_CLASS() test when ZFS is built-in
ZFS_LINUX_TRY_COMPILE_HEADER macro doesn't take CONFIG_ZFS=y into
account. As a result, on several latest Linux versions, configure
script marks DECLARE_EVENT_CLASS() available for non-GPL when ZFS
is being built as a module, but marks it unavailable when ZFS is
built-in.
Follow the logic of the neighbor macros and adjust
ZFS_LINUX_TRY_COMPILE_HEADER accordingly, so that it doesn't try
to look for a .ko when ZFS is built-in.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14006
2022-10-17 11:08:36 -07:00
Richard Yao ff7a0a108f Linux: Remove ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST autotools check
On older kernels, the definition for `module_param_call()` typecasts
function pointers to `(void *)`, which triggers -Werror, causing the
check to return false when it should return true.

Fixing this breaks the build process on some older kernels because they
define a `__check_old_set_param()` function in their headers that checks
for a non-constified `->set()`. We workaround that through the c
preprocessor by defining `__check_old_set_param(set)` to `(set)`, which
prevents the build failures.

However, it is now apparent that all kernels that we support have
adopted the GRSecurity change, so there is no need to have an explicit
autotools check for it anymore. We therefore remove the autotools check,
while adding the workaround to our headers for the build time
non-constified `->set()` check done by older kernel headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13984
Closes #14004
2022-10-13 10:03:09 -07:00
Finix1979 320f0c6022
Add Linux posix_fadvise support
The purpose of this PR is to accepts fadvise ioctl from userland
to do read-ahead by demand.

It could dramatically improve sequential read performance especially
when primarycache is set to metadata or zfs_prefetch_disable is 1.

If the file is mmaped, generic_fadvise is also called for page cache
read-ahead besides dmu_prefetch.

Only POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL are supported in
this PR currently.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Finix Yan <yancw@info2soft.com>
Closes #13694
2022-09-08 10:29:41 -07:00
Will Andrews 4ed5e25074 Add Linux namespace delegation support
This allows ZFS datasets to be delegated to a user/mount namespace
Within that namespace, only the delegated datasets are visible
Works very similarly to Zones/Jailes on other ZFS OSes

As a user:
```
 $ unshare -Um
 $ zfs list
no datasets available
 $ echo $$
1234
```

As root:
```
 # zfs list
NAME                            ZONED  MOUNTPOINT
containers                      off    /containers
containers/host                 off    /containers/host
containers/host/child           off    /containers/host/child
containers/host/child/gchild    off    /containers/host/child/gchild
containers/unpriv               on     /unpriv
containers/unpriv/child         on     /unpriv/child
containers/unpriv/child/gchild  on     /unpriv/child/gchild

 # zfs zone /proc/1234/ns/user containers/unpriv
```

Back to the user namespace:
```
 $ zfs list
NAME                             USED  AVAIL     REFER  MOUNTPOINT
containers                       129M  47.8G       24K  /containers
containers/unpriv                128M  47.8G       24K  /unpriv
containers/unpriv/child          128M  47.8G      128M  /unpriv/child
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Will Andrews <will.andrews@klarasystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Co-authored-by: Mateusz Piotrowski <mateusz.piotrowski@klarasystems.com>
Sponsored-by: Buddy <https://buddy.works>
Closes #12263
2022-06-10 09:51:46 -07:00
Brian Behlendorf b9d98453f9 autoconf: AC_MSG_CHECKING consistency
Make the wording more consistent for the kernel AC_MSG_CHECKING
output (e.g. "checking whether ...".).  Additionally, group some
of the VFS interface checks with the others.  No functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529
2022-06-01 09:59:37 -07:00
Brian Behlendorf c2c2e7bb8b Linux 5.19 compat: aops->read_folio()
As of the Linux 5.19 kernel the readpage() address space operation
has been replaced by read_folio().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-05-31 12:04:31 -07:00
Satadru Pramanik 7dde17e860
Linux 5.18 compat: replace __set_page_dirty_nobuffers
Replace __set_page_dirty_nobuffers with filemap_dirty_folio.

Upstream-commit: 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a
("Merge tag 'folio-5.18b' of
git://git.infradead.org/users/willy/pagecache ")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Authored-by: Satadru Pramanik <satadru@gmail.com>
Signed-off-by: Satadru Pramanik <satadru@gmail.com>
Closes #13325
Closes #13380
2022-04-27 12:54:17 -07:00
Damian Szuberski 849c14e048
PPC get_user workaround
Linux 5.12 PPC 5.12 get_user() and __copy_from_user_inatomic()
inline helpers very indirectly include a reference to the GPL'd
array mmu_feature_keys[] and fails to build. Workaround this by
using copy_from_user() and throwing EFAULT for any calls to
__copy_from_user_inatomic(). This is a workaround until a fix
for Linux commit 7613f5a66becfd0e43a0f34de8518695888f5458
"powerpc/64s/kuap: Use mmu_has_feature()" is fully addressed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #11958
Closes #12590
Closes #13367
2022-04-26 10:52:40 -07:00
Damian Szuberski a0dfd98a25
autoconf: Pretend `CONFIG_MODULES` is always on
- Unconditionally inject `CONFIG_MODULES` make variable
  and `#define CONFIG_MODULES` to Kbuild in `ZFS_LINUX_COMPILE`
  autoconf function to emulate loadable kernel modules support.
  This allows OpenZFS to perform Linux checks despite
  `CONFIG_MODULES=n` in the actual Linux config.

- Add `ZFS_AC_KERNEL_CONFIG_MODULES` check which encompasses
  the logic from `ZFS_AC_KERNEL_TEST_MODULE` with additional
  diagnostic messages to the user

- Removed `ZFS_AC_KERNEL_TEST_MODULE` as it merely duplicates
  every check in `ZFS_AC_KERNEL_CONFIG_DEFINED`

- Moved `ZFS_AC_MODULE_SYMVERS` after `ZFS_AC_KERNEL_CONFIG_DEFINED`
  so the user has a chance to see the proper diagnostic from the
  steps before.

A workaround for Linux's

```
commit 3e3005df73b535cb849cf4ec8075d6aa3c460f68
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Mar 31 22:38:03 2021 +0900

kbuild: unify modules(_install) for in-tree and external modules

If you attempt to build or install modules ('make modules(_install)'
with CONFIG_MODULES disabled, you will get a clear error message, but
nothing for external module builds.

Factor out the modules and modules_install rules into the common part,
so you will get the same error message when you try to build external
modules with CONFIG_MODULES=n.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #10832
Closes #13361
2022-04-26 10:47:09 -07:00
наб 0cdda2edb3 Linux 5.18 compat: kobj_type.default_attrs replaced with default_groups
Upstream-commit: cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject:
 kobj_type: remove default_attrs")
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13357
2022-04-22 14:27:10 -07:00
Brian Behlendorf 026f126b83
Linux 5.17 compat: GENHD_FL_EXT_DEVT / GENHD_FL_NO_PART_SCAN
As of the 5.17 kernel the GENHD_FL_EXT_DEVT flag has been removed
and the GENHD_FL_NO_PART_SCAN flag renamed GENHD_FL_NO_PART. Update
zvol_alloc() to set GENHD_FL_NO_PART for the newer kernels which
is sufficient.  The behavior for prior kernels remains unchanged.

1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT")
46e7eac6 ("block: rename GENHD_FL_NO_PART_SCAN to GENHD_FL_NO_PART")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13294
Closes #13297
2022-04-19 10:38:04 -07:00
Riccardo Schirone 7d524c068d Linux 5.18 compat: use address_space_operations->readahead
->readpages was removed and replaced by ->readahead. Define
zpl_readahead for kernels that don't have ->readpages.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Riccardo Schirone <rschirone91@gmail.com>
Closes #13278
2022-04-04 09:35:24 -07:00
наб d30577c9dd fgrep -> grep -F
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259
2022-04-01 17:58:11 -07:00
Andrew eebfd28e9d
Linux optimize access checks when ACL is trivial
Bypass check of ZFS aces if the ACL is trivial. When an ACL is
trivial its permissions are represented by the mode without any
loss of information. In this case, it is safe to convert the
access request into equivalent mode and then pass desired mask
and inode to generic_permission(). This has the added benefit
of also checking whether entries in a POSIX ACL on the file grant
the desired access.

This commit also skips the ACL check on looking up the xattr dir
since such restrictions don't exist in Linux kernel and it makes
xattr lookup behavior inconsistent between SA and file-based
xattrs. We also don't want to perform a POSIX ACL check while
looking up the POSIX ACL if for some reason it is located in
the xattr dir rather than an SA.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #13237
2022-04-01 09:53:54 -07:00
Brian Atkinson becc717f61
Adding ZERO_PAGE detection
On some architectures ZERO_PAGE is unavailable because it references
a GPL exported symbol of empty_zero_page. Originally e08b993 removed
the call to PAGE_ZERO(0) for assignment to the abd_zero_page. However,
a simple check can be done to avoid a kernel allocation and free for
the abd_zero_page if ZERO_PAGE is available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #13199
2022-03-14 12:37:39 -07:00
Damian Szuberski 72f3c8b130
Fix Linux kernel directories detection
Most modern Linux distributions have separate locations for bare
source and prebuilt ("build") files. Additionally, there are `source`
and `build` symlinks in `/lib/modules/$(KERNEL_VERSION)` pointing to
them. The order of directory search is now:
- `configure` command line values if both `--with-linux` and
  `--with-linux-obj` were defined
- If only `--with-linux` was defined, `--with-linux-obj` is assumed
  to have the same value as `--with-linux`
- If neither `--with-linux` nor `--with-linux-obj` were defined
  autodetection is used:
  - `/lib/modules/$(uname -r)/{source,build}` respectively, if exist
  - The first directory in `/lib/modules` with the highest version
    number according to `sort -V` which contains `source` and `build`
    symlinks/directories
  - The first directory matching `/usr/src/kernels/*` and
    `/usr/src/linux-*` with the highest version number according to
    `sort -V`. Here the source and prebuilt directories are assumed
    to be the same.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #9935
Closes #13096
2022-02-16 15:17:16 -08:00
Peter Levine b66140c6ad Add support for $KERNEL_{CC,LD,LLVM} variables
Currently, $(CC), $(LD), and $(LLVM) variables aren't passed to kbuild
while building modules.  This causes modules to build with the default
GNU GCC toolchain and prevents experimenting with other toolchains such
as CLANG/LLVM.  It can also lead to build failure if the CFLAGS/LDFLAGS
passed are incompatible with gcc/ld.

Pass $KERNEL_CC, $KERNEL_LD, and $KERNEL_LLVM as $(CC), $(LD), and
$(LLVM), respectively, to kbuild for each that is defined in the
environment.  This should take care of the majority of alternative
toolchain use cases.

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Peter Levine <plevine457@gmail.com>
Closes #13046
2022-02-09 13:24:22 -08:00
наб a9856574cf Linux 5.17 compat: detect complete_and_exit() rename
Linux 5.17 sees a rename from complete_and_exit()
to kthread complete_and_exit()

Upstream commit cead18552660702a4a46f58e65188fe5f36e9dfe
("exit: Rename complete_and_exit to kthread_complete_and_exit")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12989
2022-01-25 12:52:12 -08:00
Rich Ercolani 12fa250d84 Linux 5.16 compat: Added add_disk check for return
add_disk went from void to must-check int return.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12975
2022-01-24 12:59:01 -08:00
Coleman Kane c0fb44c506 Linux 5.16: wait_on_page_bit() no longer available to modules
Instead, linux/pagemap.h offers a number of folio-specific functions to
be called instead. In this case, module/os/linux/zfs/zfs_vnops_os.c
wants to call wait_on_page_bit(pp, PG_writeback). This gets replaced
with folio_wait_bit(folio_page(pp), PG_writeback). This change modifies
the code to conditionally compile that if configure identifies th
presence of the folio_wait_bit() function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800
2021-11-29 10:48:52 -08:00
Alexander a3588c68f7
Linux 5.15 compat: standalone <linux/stdarg.h>
Kernel commits

39f75da7bcc8 ("isystem: trim/fixup stdarg.h and other headers")
c0891ac15f04 ("isystem: ship and use stdarg.h")
564f963eabd1 ("isystem: delete global -isystem compile option")

(for now can be found in linux-next.git tree, will land into the
 Linus' tree during the ongoing 5.15 cycle with one of akpm merges)

removed the -isystem flag and disallowed the inclusion of any
compiler header files. They also introduced a minimal
<linux/stdarg.h> as a replacement for <stdarg.h>.
include/os/linux/spl/sys/cmn_err.h in the ZFS source tree includes
<stdarg.h> unconditionally. Introduce a test for <linux/stdarg.h>
and include it instead of the compiler's one to prevent module
build breakage.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #12531
2021-09-08 12:59:43 -07:00
Coleman Kane 1c24bf966c
Linux 5.14 compat: explicity assign set_page_dirty
Kernel 5.14 introduced a change where set_page_dirty of
struct address_space_operations is no longer implicitly set to
__set_page_dirty_buffers(), which ended up resulting in a NULL
pointer deref in the kernel when it is attempted to be called.
This change sets .set_page_dirty in the structure to
__set_page_dirty_nobuffers(), which was introduced with the
related patch set. The breaking change was introduce in commit
0af573780b0b13fceb7fabd49dc1b073cee9a507 to torvalds/linux.git.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12427
2021-07-26 10:55:55 -07:00
Paul Dagnelie 414f7249dc
Add SIGSTOP and SIGTSTP handling to issig
This change adds SIGSTOP and SIGTSTP handling to the issig function; 
this mirrors its behavior on Solaris. This way, long running kernel 
tasks can be stopped with the appropriate signals. Note that doing 
so with ctrl-z on the command line doesn't return control of the tty 
to the shell, because tty handling is done separately from stopping 
the process. That can be future work, if people feel that it is a 
necessary addition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Issue #810 
Issue #10843 
Closes #11801
2021-04-15 13:34:35 -07:00
Coleman Kane ffd6978ef5
Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES
The BIO_MAX_PAGES macro is being retired in favor of a bio_max_segs()
function that implements the typical MIN(x,y) logic used throughout the
kernel for bounding the allocation, and also the new implementation is
intended to be signed-safe (which the former was not).

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11765
2021-03-19 22:33:42 -07:00
Coleman Kane e2a8296131
Linux 5.12 compat: idmapped mounts
In Linux 5.12, the filesystem API was modified to support ipmapped
mounts by adding a "struct user_namespace *" parameter to a number
functions and VFS handlers. This change adds the needed autoconf
macros to detect the new interfaces and updates the code appropriately.
This change does not add support for idmapped mounts, instead it
preserves the existing behavior by passing the initial user namespace
where needed.  A subsequent commit will be required to add support
for idmapped mounted.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11712
2021-03-19 21:00:59 -07:00
Brian Behlendorf 06346cc5b5
Autoconf 2.70 compatibility
Several m4 macros have been retired in autoconf 2.70.  Update the
the build system to use the new macros provided to replace them.

* Replaced AC_HELP_STRING with AS_HELP_STRING.

* Replaced AC_TRY_COMPILE with AC_COMPILE_IFELSE/AC_LANG_PROGRAM.

* Replaced AC_CANONICAL_SYSTEM with AC_CANONICAL_TARGET

* Replaced AC_PROG_LIBTOOL with LT_INIT

* $CPP is not defined in ZFS_AC_KERNEL and really shouldn't be
  directly used like this.  Replace it with an $AWK command
  to extract the kernel source version.

Reviewed-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11413 
Closes #11419
2021-01-02 16:55:55 -08:00
Brian Behlendorf c449d4b06d Linux 5.11 compat: blk_{un}register_region()
As of 5.11 the blk_register_region() and blk_unregister_region()
functions have been retired. This isn't a problem since add_disk()
has implicitly allocated minor numbers for a very long time.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390
2020-12-27 16:20:46 -08:00
Brian Behlendorf 19697e4545 Linux 5.11 compat: revalidate_disk_size()
Both revalidate_disk_size() and revalidate_disk() have been removed.
Functionally this isn't a problem because we only relied on these
functions to call zvol_revalidate_disk() for us and to perform any
additional handling which might be needed for that kernel version.
When neither are available we know there's no additional handling
needed and we can directly call zvol_revalidate_disk().

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390
2020-12-27 16:20:40 -08:00
Brian Behlendorf c347fac586 Linux 5.11 compat: conftest
Update the ZFS_LINUX_TEST_PROGRAM macro to always set the module
license.  As of the 5.11 kernel not setting a license has been
converted from a warning to an error.

Reviewed-by: Rafael Kitover <rkitover@gmail.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11387
Closes #11390
2020-12-27 16:15:19 -08:00
Brian Behlendorf 1c2358c12a
Linux 5.10 compat: use iov_iter in uio structure
As of the 5.10 kernel the generic splice compatibility code has been
removed.  All filesystems are now responsible for registering a
->splice_read and ->splice_write callback to support this operation.

The good news is the VFS provided generic_file_splice_read() and
iter_file_splice_write() callbacks can be used provided the ->iter_read
and ->iter_write callback support pipes.  However, this is currently
not the case and only iovecs and bvecs (not pipes) are ever attached
to the uio structure.

This commit changes that by allowing full iov_iter structures to be
attached to uios.  Ever since the 4.9 kernel the iov_iter structure
has supported iovecs, kvecs, bvevs, and pipes so it's desirable to
pass the entire thing when possible.  In conjunction with this the
uio helper functions (i.e uiomove(), uiocopy(), etc) have been
updated to understand the new UIO_ITER type.

Note that using the kernel provided uio_iter interfaces allowed the
existing Linux specific uio handling code to be simplified.  When
there's no longer a need to support kernel's older than 4.9, then
it will be possible to remove the iovec and bvec members from the
uio structure and always use a uio_iter.  Until then we need to
maintain all of the existing types for older kernels.

Some additional refactoring and cleanup was included in this change:

- Added checks to configure to detect available iov_iter interfaces.
  Some are available all the way back to the 3.10 kernel and are used
  when available.  In particular, uio_prefaultpages() now always uses
  iov_iter_fault_in_readable() which is available for all supported
  kernels.

- The unused UIO_USERISPACE type has been removed.  It is no longer
  needed now that the uio_seg enum is platform specific.

- Moved zfs_uio.c from the zcommon.ko module to the Linux specific
  platform code for the zfs.ko module.  This gets it out of libzfs
  where it was never needed and keeps this Linux specific code out
  of the common sources.

- Removed unnecessary O_APPEND handling from zfs_iter_write(), this
  is redundant and O_APPEND is already handled in zfs_write();

Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11351
2020-12-18 08:48:26 -08:00