Compare commits

...

78 Commits

Author SHA1 Message Date
Tony Hutter 2bc66898b7 Tag zfs-0.8.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-12-14 12:05:24 -08:00
Matthew Macy 033c788bd5 Restore ARC MFU/MRU pressure
The arc_adapt() function tunes the MRU/MFU balance according to 4 types of
cache hits (passed as the state argument): ghost MRU, MRU, MFU,
ghost MFU. If this function is called with the wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received the commit "6950 ARC should cache
compressed data", which changed the sequence arc_read() follows when it
accesses a ghost buffer.

Before that commit, a hit to either ghost list was passed to arc_adapt()
before the call to arc_access(), which revives the element in the cache
and changes its state from a ghost hit to a real hit.

After that commit, the order of the calls was reversed and arc_adapt() is
now called only with "real" hits, even if the hit was in one of the two
ghost lists, which renders the ghost lists useless and breaks the ARC
algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10548
Closes #10618
2020-12-14 12:05:24 -08:00
Coleman Kane 26a3f3c7d9 Linux 5.10 compat: check_disk_change() removed
Kernel 5.10 removed check_disk_change() in favor of callers using
the faster bdev_check_media_change() instead, and explicitly forcing
bdev revalidation when they desire that behavior. To preserve the prior
behavior, this change wraps the logic in a zfs_check_media_change()
macro: it simply calls check_disk_change() when that function exists,
and otherwise calls an inline function that mimics the old behavior on
top of the new API.
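
As an illustration, a minimal sketch of what such a wrapper can look like; the HAVE_* configure symbols and the revalidation detail are assumptions, not necessarily the exact code from this commit:

```c
#include <linux/blkdev.h>

#if defined(HAVE_CHECK_DISK_CHANGE)
#define	zfs_check_media_change(bdev)	check_disk_change(bdev)
#else
static inline int
zfs_check_media_change(struct block_device *bdev)
{
	/* Linux 5.10+: returns true when the media changed. */
	if (!bdev_check_media_change(bdev))
		return (0);

#if defined(HAVE_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK)	/* assumed */
	/* Mimic the old check_disk_change() by forcing revalidation. */
	if (bdev->bd_disk->fops->revalidate_disk != NULL)
		bdev->bd_disk->fops->revalidate_disk(bdev->bd_disk);
#endif
	return (0);
}
#endif
```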

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #11085
2020-12-14 12:05:24 -08:00
Brian Behlendorf 8c7d1599eb Linux 5.10 compat: frame.h renamed objtool.h
In Linux 5.10 the linux/frame.h header was renamed linux/objtool.h.
Add a configure check to detect and use the correctly named header.
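
For illustration, the kind of include shim such a check enables (the configure symbol name here is assumed):

```c
/* Pull in the objtool annotations from whichever header this kernel has. */
#if defined(HAVE_KERNEL_OBJTOOL_HEADER)	/* assumed configure result */
#include <linux/objtool.h>		/* Linux 5.10+ */
#else
#include <linux/frame.h>		/* older kernels */
#endif
```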

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #11085
2020-12-14 12:05:24 -08:00
Mathieu Velten 38cf54e18a blkg_tryget config test: initialize struct
Missing struct initialization in a config test results in the
interface being incorrectly detected.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: Mathieu Velten <matmaul@gmail.com>
Closes #10713
Closes #11049
2020-12-14 12:05:24 -08:00
Michael D Labriola d7d0752a18 copy-builtin: make sure kernel Makefiles don't look in zfs source tree
The icp Makefile was referencing absolute paths into the zfs source tree for
include files.  The result was that even though the headers are added to
the Linux kernel tree, the build fails if you move or remove the zfs checkout.

Signed-off-by: Michael D Labriola <michael.d.labriola@gmail.com>
2020-12-14 12:05:24 -08:00
Michael D Labriola 3fdfb85850 copy-builtin: don't create sed backup files
This creates a bit of a mess in the kernel tree, and is largely unnecessary
unless you're debugging copy-builtin.

Signed-off-by: Michael D Labriola <michael.d.labriola@gmail.com>
2020-12-14 12:05:24 -08:00
Michael D Labriola acfc4944d0 copy-builtin: remove .gitignore from KERNEL_DIR/include/zfs
While it makes sense not to track the generated zfs_gitrev.h in the zfs
source tree, that becomes a problem once we copy the module sources into a
kernel source tree.  If you're keeping your zfs sources in a kernel tree
on a branch, you want to commit all the required files.

Signed-off-by: Michael D Labriola <michael.d.labriola@gmail.com>
2020-12-14 12:05:24 -08:00
Kjeld Schouten-Lebbing 0b4f698e2a Increase Supported Linux Kernel to 5.9 (#11057)
This increases the maximum supported Linux kernel version from 5.8 to 5.9,
as most compatibility fixes should already be included.

Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
2020-12-14 12:05:24 -08:00
Nick Danyluk e1be543a48 Bump META Linux-Maximum to kernel 5.8 (#11019)
Bump max kernel version to 5.8 to match the supported releases
indicated in the release notes of 0.8.5

Signed-off-by: Nick Danyluk <ndanyluk7@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2020-12-14 12:05:24 -08:00
Tony Hutter d2632f0cc1 Tag zfs-0.8.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-10-06 15:56:16 -07:00
Brian Behlendorf e23ba78bd8 Fix buggy procfs_list_seq_next warning
The kernel seq_read() helper function expects ->next() to update
the passed position even when there are no more entries.  Failure to
do so results in the following warning being logged.

    seq_file: buggy .next function procfs_list_seq_next [spl]
    did not update position index

Functionally there is no issue with the way procfs_list_seq_next()
is implemented and the warning is harmless.  However, we want to
silence this somewhat scary but incorrect warning.  This commit
updates the Linux procfs code to advance the position even for
the last entry.
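
A sketch of the corrected ->next() shape; the helper name is hypothetical, the important part is that the position is advanced unconditionally:

```c
static void *
procfs_list_seq_next(struct seq_file *f, void *p, loff_t *pos)
{
	/*
	 * Advance the position even when there is no next entry, otherwise
	 * seq_read() logs the "buggy .next function" warning above.
	 */
	(*pos)++;

	return (procfs_list_next_entry(f->private, p));	/* hypothetical helper */
}
```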

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10984 
Closes #10996
2020-09-30 16:31:37 -07:00
Brian Behlendorf bf75263ace Fix CONFIG_DEBUG_LOCK_ALLOC configure check
This check was accidentally broken when the kABI checks were updated
to run in parallel, commit 608f874.  The check must be for the
config_debug_lock_alloc_license name to determine if the symbol
is license compatible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10991
2020-09-29 09:42:31 -07:00
Brian Behlendorf 5e245ab21b Fix objtool configure check
The m4 objtool configure check can incorrectly fail because of a
missing header in the test.  This appears to be the result of a
recent kernel change and was observed on the Fedora 5.8.11-200
kernel.

  In file included from /home/fedora/zfs/build/objtool/objtool.c:75:
  ./arch/x86/include/asm/frame.h:100:57: error: 'struct pt_regs'
      declared inside parameter list will not be visible outside
      of this definition or declaration [-Werror]

The consequence of this is that the "stack_frame_non_standard"
check is never run and HAVE_STACK_FRAME_NON_STANDARD is set
incorrectly which results in a build failure.  This change adds
the appropriate header to the "objtool" check so it now behaves
as intended.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10990
2020-09-29 09:42:31 -07:00
Brian Behlendorf 3df7f11322 Fix PREEMPTION=y and BLK_CGROUP=y config on arm64
With PREEMPTION=y and BLK_CGROUP=y, preempt_schedule_notrace() is used
on arm64; it is a GPL-only function, so the build of the DKMS kernel
module fails.

Fix that by redefining preempt_schedule_notrace() to preempt_schedule(),
which should be safe as long as tracing is not used.
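
A hedged sketch of the redefinition; the exact guard condition and placement are assumptions:

```c
/*
 * preempt_schedule_notrace() is GPL-only on these configurations; fall
 * back to preempt_schedule(), which is safe as long as tracing is unused.
 */
#if defined(CONFIG_ARM64) && defined(CONFIG_PREEMPTION) && defined(CONFIG_BLK_CGROUP)
#undef	preempt_schedule_notrace
#define	preempt_schedule_notrace	preempt_schedule
#endif
```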

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Closes #8545 
Closes #9948 
Closes #10416 
Closes #10973
2020-09-28 09:53:32 -07:00
Brian Behlendorf ae64398819 Update zts-report.py with additional tests
The following test cases may still occasionally fail and are being
added to the "maybe" list for Linux until they can be updated to be
entirely reliable.

  cli_root/zfs_rename/zfs_rename_002_pos.ksh
  cli_root/zpool_reopen/zpool_reopen_003_pos.ksh
  refreserv/refreserv_raidz

The following tests consistently fail only on Fedora 31+; the failures
are related to the kernel rescanning the partition table on loopback
devices, which is no longer reliable unless partprobe is used.  In
order to enable the Fedora bot by default they are also being added
to the list until the tests can be updated.  Any significant regression
in functionality covered by these tests will still be detected by the
FreeBSD builders.

  alloc_class/alloc_class_009_pos
  alloc_class/alloc_class_010_pos
  cli_root/zpool_expand/zpool_expand_001_pos
  cli_root/zpool_expand/zpool_expand_005_pos
  rsend/rsend_007_pos
  rsend/rsend_010_pos
  rsend/rsend_011_pos
  snapshot/rollback_003_pos

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2020-09-24 16:05:35 -07:00
Richard Laager 3275147c71 Fix another dependency loop
zfs-load-key-DATASET.service was gaining an
After=systemd-journald.socket due to its stdout/stderr going to the
journal (which is the default).  systemd-journald.socket has an After
(via RequiresMountsFor=/run/systemd/journal) on -.mount.  If the root
filesystem is encrypted, -.mount gets an After on
zfs-load-key-DATASET.service.

By setting stdout and stderr to null on the key load services, we avoid
this loop.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10356
Closes #10388
2020-09-22 12:22:26 -07:00
Richard Laager 230970d368 Fix a dependency loop
When generating units with zfs-mount-generator, if the pool is already
imported, zfs-import.target is not needed.  This avoids a dependency
loop on root-on-ZFS systems:
  systemd-random-seed.service After (via RequiresMountsFor)
  var-lib.mount After
  zfs-import.target After
  zfs-import-{cache,scan}.service After
  cryptsetup.service After
  systemd-random-seed.service

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10388
2020-09-22 12:18:33 -07:00
Jonathon 79291065fc Verify zfs module loaded before starting services
This is a minor change to the systemd service templates that verifies
the zfs kernel module is loaded by the kernel prior to attempting to
import any zpool.

The services check for the presence of /sys/module/zfs, which indicates
the zfs module is loaded. This uses the systemd built-in check
ConditionPathIsDirectory.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10663
2020-09-22 12:18:26 -07:00
Jonathon 10c7a12f3b Verify zfs module loaded before starting services
This is a minor change to the systemd service templates that verifies the zfs
kernel module is loaded by the kernel prior to attempting to import any zpool.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10627
2020-09-22 12:18:19 -07:00
sterlingjensen 2acca5ec70 Mark lua setjmp/longjmp for powerpc weak
Linux already defines setjmp/longjmp for powerpc, which leads to
duplicate symbols in a statically linked build.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterlng Jensen <sterlingjensen@users.noreply.github.com>
Closes #10795
2020-09-22 11:43:07 -07:00
Jean-Baptiste Lallement e8baad51e0 Make unloading the key more robust
The unit was failing instead of stopping if someone manually unloaded
the key before stopping the unit (zfs unload-key fails on an
unavailable key).
Follow a logic similar to loading the key: check the key
status before unloading it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:43:01 -07:00
Jean-Baptiste Lallement df03f21b54 BindsTo dataset keyload unit to mount associate unit
We need a stronger dependency between the mount unit and its keyload unit
when we know that the dataset is encrypted.
If the keyload unit fails, Wants= will still try to mount the dataset,
which will then fail.
It's better to show that the failure is due to a failing dependency, the
keyload unit, by tightening the dependency. We can do this because we
generate both units in the generator, so it is not an optional
dependency.
BindsTo also ensures that if the keyload unit fails at any point, the
associated mountpoint is then unmounted.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:42:56 -07:00
Jean-Baptiste Lallement e8e68905c9 Ensure mount unit pilots when its ZFS key is loaded
Drop the explicit Before=zfs-mount.service dependency from the generated
key-load .service unit.
Indeed, the associated mount unit is After=<dataset-key-load>.service.
It is thus the mount unit that controls at what point it wants to be
mounted (Before=zfs-mount.service in the stock generator), and it can be
an automount point, or be triggered by another service.
This additional dependency from the key-load service is therefore not
needed.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:42:45 -07:00
John Poduska c161360dce Resilver restarts unnecessarily when it encounters errors
When a resilver finishes, vdev_dtl_reassess is called to hopefully
excise DTL_MISSING (amongst other things). If there are errors during
the resilver, they are tracked in DTL_SCRUB, as spelled out in the
block comment in vdev.c. DTL_SCRUB is in-core only, so it can only
be used if the pool was online for the whole resilver. This state is
tracked with the spa_scrub_started flag, which only gets set when
the scan is initialized. Unfortunately, this flag gets cleared right
before vdev_dtl_reassess gets called, so if there are any errors
during the scan, DTL_MISSING will never get excised and the resilver
will just continually restart. This fix simply moves clearing that
flag to after the call to vdev_dtl_reassess.

In addition, if a pool is imported and already has scn_errors > 0,
this change will restart the resilver immediately instead of doing
the rest of the scan and then restarting it from the beginning. On
the other hand, if scn_errors == 0 at import, then no errors have
been encountered so far, so the spa_scrub_started flag can be safely
set.

A test has been added to verify that resilver does not restart when
relevant DTL's are available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #10291
2020-09-22 11:33:35 -07:00
Brian Behlendorf a72ae9bf3d ZED: Do not offline a missing device if no spare is available
Due to commit d48091d a removed device is now explicitly offlined by
the ZED if no spare is available, rather than letting ZFS detect
it as UNAVAIL. This broke auto-replacing of whole-disk devices, as
described in issue #10577.  In short, when a new device is reinserted
in the same slot, the ZED will try to ONLINE it without letting ZFS
recreate the necessary partition table.

This change simply avoids setting the device OFFLINE when removed if
no spare is available (or if spare_on_remove is false).  This change
has been left minimal to allow it to be backported to 0.8.x release.
The auto_offline_001_pos ZTS test has been updated accordingly.

Some follow up work is planned to update the ZED so it transitions
the vdev to a REMOVED state.  This is a state which has always
existed but there is no current interface the ZED can use to
accomplish this.  Therefore it's being left to a follow up PR.

Reviewed-by: Gionatan Danti <g.danti@assyoma.it>
Co-authored-by: Gionatan Danti <g.danti@assyoma.it>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10577
Closes #10730
2020-09-16 14:48:22 -07:00
Brian Behlendorf 3e268f4934 ZTS: Improve zts-auto_offline_001_pos
The zts-auto_offline_001_pos test could exceed the 10 minute test
limit and be KILLED by the test infrastructure.  To prevent this,
speed up the test case by:

* Removing redundant pool configurations.  Each of the following
  vdev types is tested once: mirror, raidz, cache, and special.

* The block_device_wait function need only wait on the block
  device which has been removed as part of the test.

Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9827
2020-09-16 14:48:05 -07:00
youzhongyang df01daab13 Fix inability to destroy snapshot used over NFS
The cache of struct svc_export and struct svc_expkey by nfsd and
rpc.mountd for the snapshot holds references to the mount point.
We need to flush them out before unmounting, otherwise umount
would fail with EBUSY.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #6000 
Closes #10783
2020-09-16 14:30:51 -07:00
Brian Behlendorf 62278325a5 Export dmu_offset_next() symbol
Export the dmu_offset_next() symbol for use by Lustre.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10796
2020-09-16 14:17:51 -07:00
Matthew Ahrens 43186f94f4 Fix lua stack overflow on recursive call to gsub()
The `zfs program` subcommand invokes a LUA interpreter to run ZFS
"channel programs".  This interpreter runs in a constrained environment,
with defined memory limits.  The LUA stack (used for LUA functions that
call each other) is allocated in the kernel's heap, and is limited by
the `-m MEMORY-LIMIT` flag and the `zfs_lua_max_memlimit` module
parameter.  The C stack is used by certain LUA features that are
implemented in C.  The C stack is limited by `LUAI_MAXCCALLS=20`, which
limits call depth.

Some LUA C calls use more stack space than others, and `gsub()` uses an
unusually large amount.  With a programming trick, it can be invoked
recursively using the C stack (rather than the LUA stack).  This
overflows the 16KB Linux kernel stack after about 11 iterations, less
than the limit of 20.

One solution would be to decrease `LUAI_MAXCCALLS`.  This could be made
to work, but it has a few drawbacks:

1. The existing test suite does not pass with `LUAI_MAXCCALLS=10`.

2. There may be other LUA functions that use a lot of stack space, and
the stack space may change depending on compiler version and options.

This commit addresses the problem by adding a new limit on the amount of
free space (in bytes) remaining on the C stack while running the LUA
interpreter: `LUAI_MINCSTACK=4096`.  If there is less than this amount
of stack space remaining, a LUA runtime error is generated.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10611
Closes #10613
2020-09-16 14:17:32 -07:00
DeHackEd 76a1232ee7 Use boot_ncpus in place of max_ncpus in taskq_create
Due to hotplug support or BIOS bugs, max_ncpus can sometimes be
an absurdly high value. I have a system with 32 cores/threads
that reports max_ncpus == 440. This many threads can potentially
cripple the system during arc_prune floods, for example.

boot_ncpus is the number of working CPUs at the time it is called, so use
that instead.
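
For illustration, on Linux the two values correspond roughly to the following (the macro mappings shown here are an assumption):

```c
#include <linux/cpumask.h>

/* May count hot-pluggable CPUs that are not actually present. */
#define	max_ncpus	num_possible_cpus()

/* The CPUs actually online when the macro is evaluated. */
#define	boot_ncpus	num_online_cpus()

/* e.g. taskq_create("z_vdev_file", boot_ncpus, ...) instead of max_ncpus */
```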

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #10282
2020-09-16 13:35:42 -07:00
Olaf Faaland 0801e4e5c9 Initialize mmp_last_write when the mmp thread starts (#10912)
A great deal of time may go by between when mmp_init() is called and
the MMP thread starts, particularly if there are bad devices, because
checking configs etc. involves I/O.  If this time is too long,

    (gethrtime() - mmp_last_write) > mmp_fail_ns

at the time the MMP thread starts.  If MMP is configured to suspend
the pool, the pool will be suspended immediately.

This can be seen in issue #10838

The value of mmp_last_write doesn't matter before the mmp thread
starts.  To give the MMP thread time to issue and land MMP writes,
initialize mmp_last_write when the MMP thread starts.
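
A sketch of the initialization point being described (structure layout and surrounding code are simplified assumptions):

```c
static void
mmp_thread(void *arg)
{
	spa_t *spa = (spa_t *)arg;
	mmp_thread_t *mmp = &spa->spa_mmp;

	/*
	 * Set the timestamp here, not in mmp_init(), so that
	 * (gethrtime() - mmp_last_write) starts near zero once writes
	 * begin and the pool is not suspended spuriously.
	 */
	mmp->mmp_last_write = gethrtime();

	/* ... main loop issuing MMP writes ... */
}
```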

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Closes #10873
2020-09-16 00:16:47 +00:00
Chris McDonough fa612dd1fd Appease GCC sprintf warnings found on Fedora 32/GCC 10.0.1
Increase the size of DDT_NAMELEN and MNT_LINE_MAX to appease GCC
snprintf truncation warnings.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes #10712
Closes #10766
2020-09-15 23:43:12 +00:00
Arvind Sankar f78b6dbd5b Switch off -Wmissing-prototypes for libgcc math functions
spl-generic.c defines some of the libgcc integer library functions on
32-bit. Don't bother checking -Wmissing-prototypes since nothing should
directly call these functions from C code.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10470
2020-09-15 21:21:05 +00:00
George Amanakis 88533ec59a Fix gcc10.1 truncation error
gcc10.1 complains with:

../../include/sys/dmu.h:373:24: error: ‘%s’ directive output may be
truncated writing up to 95 bytes into a region of size 75
[-Werror=format-truncation=]
  373 | #define DMU_POOL_DDT   "DDT-%s-%s-%s"
      |                        ^~~~~~~~~~~~~~
../../module/zfs/ddt.c:256:37: note: in expansion of macro
‘DMU_POOL_DDT’
  256 |  (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      |                                     ^~~~~~~~~~~~
../../include/sys/dmu.h:373:32: note: format string is defined here
  373 | #define DMU_POOL_DDT   "DDT-%s-%s-%s"
      |                                ^~
../../module/zfs/ddt.c:256:9: note: ‘snprintf’ output 7 or more bytes
(assuming 102) into a destination of size 80
  256 |  (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  257 |      zio_checksum_table[ddt->ddt_checksum].ci_name,
      |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  258 |      ddt_ops[type]->ddt_op_name, ddt_class_name[class]);
      |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Increasing DDT_NAMELEN fixes it.

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10433
2020-09-15 21:20:43 +00:00
Brian Behlendorf d2acd3696f Linux 5.9 compat: NR_SLAB_RECLAIMABLE (#10865)
Commit dcdc12e added compatibility code to treat NR_SLAB_RECLAIMABLE_B
as if it were the same as NR_SLAB_RECLAIMABLE.  However, the new value
is in bytes while the old value was in pages which means they are not
interchangeable.

The only place the reclaimable slab size is used is as a component of
the calculation done by arc_free_memory().  This function returns the
amount of memory the ARC considers to be free or reclaimable at little
cost.  Rather than switch to a new interface to get this value, it has
been removed from the calculation.  It is normally a minor component
compared to the number of inactive or free pages, and removing it
aligns the behavior with the FreeBSD version of arc_free_memory().
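
A simplified sketch of the resulting estimate, for illustration only (the real function has more cases):

```c
static uint64_t
arc_free_memory(void)
{
	int64_t pages;

	/*
	 * Free pages plus inactive file pages; the reclaimable-slab term
	 * has been dropped from the estimate as described above.
	 */
	pages = nr_free_pages() + global_node_page_state(NR_INACTIVE_FILE);

	return ((uint64_t)pages * PAGE_SIZE);
}
```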


Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Backported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Closes #10834
2020-09-15 21:16:58 +00:00
Pavel Snajdr f6e54ea4f5 Linux 5.7 compat: Include linux/sched.h in spl/sys/mutex.h
struct task_struct is needed for lockdep_off() in mutex.h

This has popped up after e616cb8daadf (in linux-5.7-rc7).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10741
(cherry picked from commit 772c69d230)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane c33b623535 Linux 5.9 compat: make_request_fn replaced with submit_bio interface
The make_request_fn and its associated API were removed in a recent
Linux 5.9 merge, with their functionality replaced by a new submit_bio
member in struct block_device_operations.
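
A hedged sketch of how a driver registers its bio handler under either API (the configure symbol and handler names are assumptions):

```c
#if defined(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS)	/* assumed */
/* Linux 5.9+: the handler is a member of block_device_operations. */
static blk_qc_t zvol_submit_bio(struct bio *bio);

static const struct block_device_operations zvol_ops = {
	.owner		= THIS_MODULE,
	.submit_bio	= zvol_submit_bio,
};
#else
/*
 * Older kernels attach the handler to the request queue instead, e.g.
 * in the device setup path:
 *
 *	blk_queue_make_request(zv->zv_queue, zvol_request);
 */
#endif
```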

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit d817c17100)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane 42e9450831 Linux 5.9 compat: Update NR_SLAB_RECLAIMABLE to NR_SLAB_RECLAIMABLE_B
This change appears to primarily be a name change for the enum. Had
to update the test logic so that it works so long as either one of
these is present (favoring the newer one). Additionally, as this is
newer, it only shows up in node_page_item, so this commit doesn't
test zone_page_item for the same enum.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit dcdc12e8ba)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane 70719549f0 Linux 5.9 compat: add linux/blkdev.h include
Many of the block device operations (often functions with bdev in
the name) were moved into linux/blkdev.h from linux/fs.h. This
header is already included where needed in the code, but it was
missing from the autoconf tests, causing false negatives. This
commit has those tests include linux/fs.h (the old location) and now
also linux/blkdev.h (the new location).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit 1823c8fe6a)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Neal Gompa (ニール・ゴンパ) 2d2ce04b99 lib/libzfs, rpm: Install pkgconfig files in the correct directory (#10628)
libzfs is an architecture-specific library, thus its pkgconfig files
should also be installed into the architecture-specific path for
pkgconfig files so that systems that support multilib or multiarch
installation will be able to work properly.

Signed-off-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2020-09-15 21:16:58 +00:00
Eli Schwartz 638edf1d42 Linux 5.8 compat: __vmalloc()
The `pgprot` argument has been removed from `__vmalloc` in Linux 5.8;
it is now always `PAGE_KERNEL` [1].

Detect this during configure and define a wrapper for older kernels.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/mm/vmalloc.c?h=next-20200605&id=88dca4ca5a93d2c09e5bbc6a62fbfc3af83c4fca
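
A sketch of such a wrapper (the configure symbol and wrapper names are assumptions):

```c
#include <linux/vmalloc.h>

#if defined(HAVE_VMALLOC_PAGE_KERNEL)	/* assumed configure result */
/* Pre-5.8 kernels still take the pgprot argument. */
#define	spl_vmalloc(size, flags)	__vmalloc(size, flags, PAGE_KERNEL)
#else
/* Linux 5.8+: __vmalloc() always uses PAGE_KERNEL internally. */
#define	spl_vmalloc(size, flags)	__vmalloc(size, flags)
#endif
```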

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #10422

(cherry picked from commit 080102a1b6)
- apply to 0.8.4 before certain files were moved around
- config/kernel-kmem.m4 exists in git but not release tarballs because
  it is unused; introduce it in a new file to prevent conflicts
- linux/mm.h is included in git master via sys/kmem.h; do not remove it
  here or the build will error due to undefined is_vmalloc_addr()

Original-patch-by: Michael Niewöhner <c0d3z3r0@users.noreply.github.com>
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Jorgen Lundman 04837c8dcb Replace sprintf()->snprintf() and strcpy()->strlcpy()
The strcpy() and sprintf() functions are deprecated on some platforms.
Care is needed to ensure the correct size is used.  If some platform
lacks snprintf(), we can add a #define mapping it to sprintf(), and
likewise for strlcpy().

The biggest change is adding a size parameter to zfs_id_to_fuidstr().

The various *_impl_get() functions are only used on linux and have
not yet been updated.
2020-09-15 21:16:58 +00:00
George Amanakis 7b98b55282 Fix gcc 10.1 stringop-truncation error
As we do not expect the destination of these strncpy() calls to be
NUL-terminated, substitute them with memcpy().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10346
2020-09-15 21:16:58 +00:00
ColMelvin 88451845af RPM: Remove old versions of DKMS on upgrade
Due to a mismatch between the text and a regex looking for that text,
the `%preuninstall` script would never run the `dkms remove` command
necessary to avoid corrupting the DKMS data configuration.  Increase
regex specificity to avoid this issue.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Lindee <chris.lindee+github@gmail.com>
Closes: #9891
Closes #10327
2020-09-15 21:16:58 +00:00
Brian Behlendorf 4f5dcc9dc1 Add longjmp support for Thumb-2
When a Thumb-2 kernel is being used, then longjmp must be implemented
using the Thumb-2 instruction set in module/lua/setjmp/setjmp_arm.S.

Original-patch-by: @jsrlabs
Reviewed-by: @awehrfritz
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7408 
Closes #9957 
Closes #9967
2020-09-15 17:07:40 +00:00
Tony Hutter 6b18d7df37 Tag zfs-0.8.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

TEST_ZIMPORT_SKIP="yes"
2020-05-12 10:53:32 -07:00
George Amanakis 4fe5f9016f Add missing zfs_refcount_destroy() in key_mapping_rele()
Otherwise when running with reference_tracking_enable=TRUE mounting
and unmounting an encrypted dataset panics with:

Call Trace:
 dump_stack+0x66/0x90
 slab_err+0xcd/0xf2
 ? __kmalloc+0x174/0x260
 ? __kmem_cache_shutdown+0x158/0x240
 __kmem_cache_shutdown.cold+0x1d/0x115
 shutdown_cache+0x11/0x140
 kmem_cache_destroy+0x210/0x230
 spl_kmem_cache_destroy+0x122/0x3e0 [spl]
 zfs_refcount_fini+0x11/0x20 [zfs]
 spa_fini+0x4b/0x120 [zfs]
 zfs_kmod_fini+0x6b/0xa0 [zfs]
 _fini+0xa/0x68c [zfs]
 __x64_sys_delete_module+0x19c/0x2b0
 do_syscall_64+0x5b/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Tom Caputi <tcaputi@datto.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10246
2020-05-12 10:53:32 -07:00
Brian Behlendorf e6142ac0f2 Linux 5.7 compat: blk_alloc_queue()
Commit https://github.com/torvalds/linux/commit/3d745ea5 simplified
the blk_alloc_queue() interface by updating it to take the request
function as an argument.  Add a wrapper function which accepts the new
arguments and internally uses the available interfaces.

Other minor changes include increasing the Linux-Maximum to 5.6 now
that 5.6 has been released.  It was not bumped to 5.7 because this
release has not yet been finalized and is still subject to change.

Added local 'struct zvol_state_os *zso' variable to zvol_alloc.
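
A hedged sketch of such a wrapper (the wrapper name and configure symbol are assumptions):

```c
static inline struct request_queue *
zvol_blk_alloc_queue(make_request_fn *fn, int node)	/* name assumed */
{
#if defined(HAVE_BLK_ALLOC_QUEUE_REQUEST_FN)	/* assumed configure result */
	/* Linux 5.7+: the request function is passed directly. */
	return (blk_alloc_queue(fn, node));
#else
	/* Older kernels: allocate the queue, then attach the function. */
	struct request_queue *q = blk_alloc_queue(GFP_KERNEL);

	if (q != NULL)
		blk_queue_make_request(q, fn);

	return (q);
#endif
}
```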

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10181
Closes #10187
2020-05-12 10:53:32 -07:00
Matthew Macy ea15efd4c9 Prefix struct rangelock
A struct rangelock already exists on FreeBSD.  Add a zfs_ prefix as
per our convention to prevent any conflict with existing symbols.
This change is a follow up to 2cc479d0.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9534
2020-05-12 10:53:32 -07:00
Arvind Sankar 5c474614ff Fix icp include directories for in-tree build
When zfs is built in-tree using --enable-linux-builtin, the compile
commands are executed from the kernel build directory. If the build
directory is different from the kernel source directory, passing
-Ifs/zfs/icp will not find the headers as they are not present in the
build directory.

Fix this by adding @abs_top_srcdir@ to pull the headers from the zfs
source tree instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10021
2020-05-12 10:53:32 -07:00
Attila Fülöp 3d09d3809b ICP: gcm-avx: Support architectures lacking the MOVBE instruction
There are a couple of x86_64 architectures which support all the
features needed to make the accelerated GCM implementation work except
the MOVBE instruction. Those are mainly Intel Sandy Bridge and Ivy
Bridge, and AMD Bulldozer, Piledriver, and Steamroller.

By using MOVBE only if available and replacing it with a MOV
followed by a BSWAP if not, those architectures now benefit from
the new GCM routines and performance is considerably better
compared to the original implementation.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam D. Moss <c@yotes.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Followup #9749 
Closes #10029
2020-05-12 10:53:32 -07:00
Attila Fülöp 76354f945e ICP: Improve AES-GCM performance
Currently SIMD accelerated AES-GCM performance is limited by two
factors:

a. The need to disable preemption and interrupts and save the FPU
state before using it and to do the reverse when done. Due to the
way the code is organized (see (b) below) we have to pay this price
twice for each 16 byte GCM block processed.

b. Most processing is done in C, operating on single GCM blocks.
The use of SIMD instructions is limited to the AES encryption of the
counter block (AES-NI) and the Galois multiplication (PCLMULQDQ).
This leads to the FPU not being fully utilized for crypto
operations.

To solve (a) we do crypto processing in larger chunks while owning
the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced
to make this chunk size tweakable. It defaults to 32 KiB. This step
alone roughly doubles performance. (b) is tackled by porting and
using the highly optimized openssl AES-GCM assembler routines, which
do all the processing (CTR, AES, GMULT) in a single routine. Both
steps together result in up to a 32x reduction of the time spent in
the en/decryption routines, leading to an approximately 12x
throughput increase for large (128 KiB) blocks.

Lastly, this commit changes the default encryption algorithm from
AES-CCM to AES-GCM when setting the `encryption=on` property.
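
A schematic of the chunked processing used to address (a); everything except kfpu_begin()/kfpu_end() and the module parameter name is an assumption:

```c
/* Pay the FPU save/restore cost once per chunk, not per 16-byte block. */
while (remaining > 0) {
	size_t todo = MIN(remaining, icp_gcm_avx_chunk_size);

	kfpu_begin();
	/* CTR, AES-NI and GHASH over the whole chunk using SIMD. */
	gcm_avx_process_chunk(ctx, in, out, todo);	/* hypothetical */
	kfpu_end();

	in += todo;
	out += todo;
	remaining -= todo;
}
```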

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Jason King <jason.king@joyent.com>
Reviewed-By: Tom Caputi <tcaputi@datto.com>
Reviewed-By: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #9749
2020-05-12 10:53:32 -07:00
Fabio Scaccabarozzi 590ababea2 Bugfix/fix uio partial copies
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update the uio struct before returning
EFAULT, and leave a comment explaining why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes #8673 
Closes #10148
2020-05-12 10:53:32 -07:00
Mark Roper 009ff83548 Prevent deadlock in arc_read in Linux memory reclaim callback
Using zfs with Lustre, an arc_read can trigger kernel memory allocation
that in turn leads to a memory reclaim callback and a deadlock within a
single zfs process. This change uses spl_fstrans_mark and
spl_fstrans_unmark to prevent the reclaim attempt and the deadlock
(https://zfsonlinux.topicbox.com/groups/zfs-devel/T4db2c705ec1804ba).
The stack trace observed is:

    __schedule at ffffffff81610f2e
    schedule at ffffffff81611558
    schedule_preempt_disabled at ffffffff8161184a
    __mutex_lock at ffffffff816131e8
    arc_buf_destroy at ffffffffa0bf37d7 [zfs]
    dbuf_destroy at ffffffffa0bfa6fe [zfs]
    dbuf_evict_one at ffffffffa0bfaa96 [zfs]
    dbuf_rele_and_unlock at ffffffffa0bfa561 [zfs]
    dbuf_rele_and_unlock at ffffffffa0bfa32b [zfs]
    osd_object_delete at ffffffffa0b64ecc [osd_zfs]
    lu_object_free at ffffffffa06d6a74 [obdclass]
    lu_site_purge_objects at ffffffffa06d7fc1 [obdclass]
    lu_cache_shrink_scan at ffffffffa06d81b8 [obdclass]
    shrink_slab at ffffffff811ca9d8
    shrink_node at ffffffff811cfd94
    do_try_to_free_pages at ffffffff811cfe63
    try_to_free_pages at ffffffff811d01c4
    __alloc_pages_slowpath at ffffffff811be7f2
    __alloc_pages_nodemask at ffffffff811bf3ed
    new_slab at ffffffff81226304
    ___slab_alloc at ffffffff812272ab
    __slab_alloc at ffffffff8122740c
    kmem_cache_alloc at ffffffff81227578
    spl_kmem_cache_alloc at ffffffffa048a1fd [spl]
    arc_buf_alloc_impl at ffffffffa0befba2 [zfs]
    arc_read at ffffffffa0bf0924 [zfs]
    dbuf_read at ffffffffa0bf9083 [zfs]
    dmu_buf_hold_by_dnode at ffffffffa0c04869 [zfs]
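
A minimal sketch of the guard pattern, assuming the standard SPL interfaces; the call site shown is illustrative:

```c
fstrans_cookie_t cookie;

/*
 * While the mark is held, memory allocations on this thread will not
 * recurse into filesystem reclaim, avoiding the deadlock shown above.
 */
cookie = spl_fstrans_mark();

error = dbuf_read(db, NULL, DB_RF_CANFAIL);	/* illustrative call */

spl_fstrans_unmark(cookie);
```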

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Roper <markroper@gmail.com>
Closes #9987
2020-05-12 10:53:32 -07:00
Alexander Motin 4e55349857 Fix infinite scan on a pool with only special allocations
Attempting to run a scrub or resilver on a new pool containing only
special allocations (a special vdev added at creation) caused an
infinite loop because dsl_scan_should_clear() limits memory usage to 5%
of the pool size, which it calculated accounting only for the normal
allocation class.

Adding the special and, just in case, the dedup classes fixes the issue.
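
A sketch of the adjusted accounting; the accessor names follow the usual SPA helpers, and the exact placement inside dsl_scan_should_clear() is an assumption:

```c
/* Size the pool from every allocation class, not just the normal class. */
uint64_t alloc = metaslab_class_get_alloc(spa_normal_class(spa));
alloc += metaslab_class_get_alloc(spa_special_class(spa));
alloc += metaslab_class_get_alloc(spa_dedup_class(spa));

/* ... the 5% in-memory scan limit is then derived from this total ... */
```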

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #10106 
Closes #8694
2020-05-12 10:53:32 -07:00
Brian Behlendorf 0cbed7f026 Static symbols exported by ICP
The crypto_cipher_init_prov and crypto_cipher_init are declared static
and should not be exported by the ICP.  This resolves the following
warnings observed when building with the 5.4 kernel.

WARNING: "crypto_cipher_init" [.../icp] is a static EXPORT_SYMBOL
WARNING: "crypto_cipher_init_prov" [.../icp] is a static EXPORT_SYMBOL

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9791
2020-05-12 10:53:32 -07:00
Brian Behlendorf 6e8c4dc460 Linux 5.6 compat: struct proc_ops
The proc_ops structure was introduced to replace the use of the
file_operations structure when registering proc handlers.  This
change creates a new kstat_proc_op_t typedef for compatibility
which can be used to pass around the correct structure.

This change additionally adds the 'const' keyword to all of the
existing proc operations structures.
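
A sketch of the compatibility typedef (the configure symbol name is an assumption):

```c
#if defined(HAVE_PROC_OPS_STRUCT)	/* assumed configure result */
typedef struct proc_ops		kstat_proc_op_t;	/* Linux 5.6+ */
#else
typedef struct file_operations	kstat_proc_op_t;	/* older kernels */
#endif
```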

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9961
2020-05-12 10:53:32 -07:00
Brian Behlendorf 06b473a8ae Linux 5.6 compat: timestamp_truncate()
The timestamp_truncate() function was added; it replaces the existing
timespec64_trunc() function.  This change renames our wrapper function
to be consistent with the upstream name and updates the compatibility
code for older kernels accordingly.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9956
Closes #9961
2020-05-12 10:53:32 -07:00
Brian Behlendorf ba2cf53545 Linux 5.6 compat: ktime_get_raw_ts64()
The getrawmonotonic() and getrawmonotonic64() interfaces have been
fully retired.  Update gethrtime() to use the replacement interface
ktime_get_raw_ts64() which was introduced in the 4.18 kernel.
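
A minimal sketch of gethrtime() on the replacement interface:

```c
#include <linux/timekeeping.h>

static inline hrtime_t
gethrtime(void)
{
	struct timespec64 ts;

	ktime_get_raw_ts64(&ts);
	return (((hrtime_t)ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec);
}
```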

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10052
Closes #10064
2020-05-12 10:53:32 -07:00
Brian Behlendorf cf2a3464e9 Linux 5.6 compat: time_t
As part of the Linux kernel's y2038 changes the time_t type has been
fully retired.  Callers are now required to use the time64_t type.

Rather than move to the new type, I've removed the few remaining
places where a time_t is used in the kernel code.  They've been
replaced with a uint64_t which is already how ZFS internally
handled these values.

Going forward we should work towards updating the remaining user
space time_t consumers to the 64-bit interfaces.

Reviewed-by: Matthew Macy <mmacy@freebsd.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10052
Closes #10064
2020-05-12 10:53:32 -07:00
Romain Dolbeau 43135b3746 Fix static data to link with -fno-common
-fno-common is the new default in GCC 10, replacing -fcommon in
GCC <= 9, so static data must only be allocated once.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@european-processor-initiative.eu>
Closes #9943
2020-05-12 10:53:32 -07:00
alex 6f42372635 zfs_get: change time format string from %k to %H
This change fixes the bug reported in issue #10090, where snapshots
created between midnight and 1 AM were missing a padded zero in the
creation property output.

The leading zero was missing because the time format string used `%k`,
which formats the hour as a decimal number from 0 to 23 with single
digits preceded by a blank[0]. It is fixed by changing it to `%H`,
which formats the hour as 00-23.

The difference in output is as below

```
-Thu Mar 26  0:39 2020
+Thu Mar 26 00:39 2020
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Alex John <alex@stty.io>
Closes #10090 
Closes #10153
2020-05-12 10:53:32 -07:00
Matthew Ahrens fc6700425a Deprecate deduplicated send streams
Dedup send can only deduplicate over the set of blocks in the send
command being invoked, and it does not take advantage of the dedup table
to do so. This is a very common misconception among not only users, but
developers, and makes the feature seem more useful than it is. As a
result, many users are using the feature but not getting any benefit
from it.

Dedup send requires a nontrivial expenditure of memory and CPU to
operate, especially if the dataset(s) being sent is (are) not already
using a dedup-strength checksum.

Dedup send adds developer burden. It expands the test matrix when
developing new features, causing bugs in released code, and delaying
development efforts by forcing more testing to be done.

As a result, we are deprecating the use of `zfs send -D` and receiving
of such streams.  This change adds a warning to the man page, and also
prints the warning whenever dedup send or receive are used.

In a future release, we plan to:
1. remove the kernel code for generating deduplicated streams
2. make `zfs send -D` generate regular, non-deduplicated streams
3. remove the kernel code for receiving deduplicated streams
4. make `zfs receive` of deduplicated streams process them in userland
   to "re-duplicate" them, so that they can still be received.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #7887
Closes #10117
2020-05-12 10:53:32 -07:00
Richard Laager e1b0704568 Fix zfs-functions packaging bug
This fixes a bug where the generated zfs-functions was being included
along with original zfs-functions.in in the make dist tarball.  This
caused an unfortunate series of events during build/packaging that
resulted in the RPM-installed /etc/zfs/zfs-functions listing the
paths as:

ZFS="/usr/local/sbin/zfs"
ZED="/usr/local/sbin/zed"
ZPOOL="/usr/local/sbin/zpool"

When they should have been:

ZFS="/sbin/zfs"
ZED="/sbin/zed"
ZPOOL="/sbin/zpool"

This affects init.d (non-systemd) distros like CentOS 6.

/etc/default/zfs and /etc/zfs/zfs-functions are also used by the
initramfs, so they need to be built even when init.d support is not.
They have been moved to the (new) etc/default and (existing) etc/zfs
source directories, respectively.

Fixes: #9443

Co-authored-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Richard Laager d7c076c793 initramfs: Eliminate substitutions
These are now handled in zfs-functions, so this is all duplicative and
unnecessary.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Richard Laager e4185a03de Delete built init scripts in make clean
Previously, they were being deleted in make distclean.  This brings it
in line with the example:
https://www.gnu.org/software/automake/manual/html_node/Scripts.html

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Ryan Moeller b06256a997 Restore :: in Makefile.am
The double-colon looked like a typo, but it's actually an obscure
feature. Rules with :: may appear multiple times and are run
independently of one another in the order they appear. The use of ::
for distclean-local was conventional, not accidental.

Add comments to indicate the intentional use of double-colon rules.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9210
2020-05-12 10:53:32 -07:00
Richard Laager f427973159 Make init scripts depend on Makefile
This brings it in line with the example:
https://www.gnu.org/software/automake/manual/html_node/Scripts.html

This way, if the substitution code is changed, they should update.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
InsanePrawn 4bc401b30f Systemd mount generator: don't fail keyload from file if already loaded
Previously the generated keyload units for encryption roots with
keylocation=file://* didn't contain the code to detect if the key
was already loaded and would be marked failed in such situations.

Move the code to check whether the key is already loaded
from keylocation=prompt handling to general key loading code.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #10103
2020-05-12 10:53:32 -07:00
InsanePrawn b19477898c Systemd mount generator: Generate noauto units; add control properties
This commit refactors the systemd mount generators and makes the
following major changes:

- The generator now generates units for datasets marked canmount=noauto,
  too. These units are NOT WantedBy local-fs.target.
  If there are multiple noauto datasets for a path, no noauto unit will
  be created. Datasets with canmount=on are prioritized.

- Introduces handling of new user properties which are now included in
  the zfs-list.cache files:
    - org.openzfs.systemd:requires:
      List of units to require for this mount unit
    - org.openzfs.systemd:requires-mounts-for:
      List of mounts to require by this mount unit
    - org.openzfs.systemd:before:
      List of units to order after this mount unit
    - org.openzfs.systemd:after:
      List of units to order before this mount unit
    - org.openzfs.systemd:wanted-by:
      List of units to add a Wants dependency on this mount unit to
    - org.openzfs.systemd:required-by:
      List of units to add a Requires dependency on this mount unit to
    - org.openzfs.systemd:nofail:
      Toggles between a wants and a requires dependency.
    - org.openzfs.systemd:ignore:
      Do not generate a mount unit for this dataset.

  Consult the updated man page for detailed documentation.

- Restructures and extends the zfs-mount-generator(8) man page with the
  above properties, information on unit ordering and a license header.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9649
2020-05-12 10:53:32 -07:00
InsanePrawn d1436d58a7 Systemd mount generator: Silence shellcheck warnings
Silences a warning about an intentionally unquoted variable.
Fixes a warning caused by strings split across lines by slightly
refactoring keyloadcmd.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9649
2020-05-12 10:53:32 -07:00
Brian Behlendorf 13bfad0c96 Fix CONFIG_MODULES=no Linux kernel config
When configuring as builtin (--enable-linux-builtin) for kernels
without loadable module support (CONFIG_MODULES=n), only the object
file is created, never a loadable kmod.

Update ZFS_LINUX_TRY_COMPILE to handle this in a manner similar to
the ZFS_LINUX_TEST_COMPILE_ALL macro.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9887
Closes #10063
2020-05-12 10:53:32 -07:00
Brian Behlendorf 49f065d5a4 Linux 5.5 compat: blkg_tryget()
Commit https://github.com/torvalds/linux/commit/9e8d42a0f accidentally
converted the static inline function blkg_tryget() to GPL-only for
kernels built with CONFIG_PREEMPT_RCU=y and CONFIG_BLK_CGROUP=y.

Resolve the build issue by providing our own equivalent functionality
when needed which uses rcu_read_lock_sched() internally as before.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9745
Closes #10072
2020-05-12 10:53:32 -07:00
Richard Laager ebc8e360d5 zfs-mount-generator: Fix escaping for /
The correct name for the mount unit for / is "-.mount", not ".mount".

Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Antonio Russo <antonio.e.russo@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #9970
2020-05-12 10:53:32 -07:00
Matthew Ahrens d4e04cc145 Missed wakeup when growing kmem cache
When growing the size of a (VMEM or KVMEM) kmem cache, spl_cache_grow()
always does taskq_dispatch(spl_cache_grow_work), and then waits for the
KMC_BIT_GROWING to be cleared by the taskq thread.

The taskq thread (spl_cache_grow_work()) does:
1. allocate new slab and add to list
2. wake_up_all(skc_waitq)
3. clear_bit(KMC_BIT_GROWING)

Therefore, the waiting thread can wake up before GROWING has been
cleared.  It will see that the growing has not yet completed, and go
back to sleep until it hits the 100ms timeout.

This can have an extreme performance impact on workloads that alloc/free
more than fits in the (statically-sized) magazines.  These workloads
allocate and free slabs with high frequency.

The problem can be observed with `funclatency spl_cache_grow`, which on
some workloads shows that 99.5% of the time it takes <64us to allocate
slabs, but we spend ~70% of our time in outliers, waiting for the 100ms
timeout.

The fix is to do `clear_bit(KMC_BIT_GROWING)` before
`wake_up_all(skc_waitq)`.
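
A sketch of the corrected ordering (the memory barrier shown is an assumption about the surrounding code):

```c
/*
 * Publish completion before waking waiters, so a woken thread observes
 * that KMC_BIT_GROWING is already clear and does not re-sleep.
 */
clear_bit(KMC_BIT_GROWING, &skc->skc_flags);
smp_mb__after_atomic();
wake_up_all(&skc->skc_waitq);
```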

A future investigation should evaluate if we still actually need to
taskq_dispatch() at all, and if so on which kernel versions.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9989
2020-05-12 10:53:32 -07:00
Richard Laager f3bf67d04d Order zfs-import-*.service after multipathd
If someone is using both multipathd and ZFS, they are probably using
them together.  Ordering the zpool imports after multipathd is ready
fixes import issues for multipath configurations.

Tested-by: Mike Pastore <mike@oobak.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #9863
2020-05-12 10:53:32 -07:00
lorenz 3916ac5a56 Avoid here-documents in systemd mount generator
On some systems - openSUSE, for example - there is not yet a writeable
temporary file system available, so bash bails out with an error,

  'cannot create temp file for here-document: Read-only file system',

on the here documents in zfs-mount-generator. The simple fix is to
change these into a multi-line echo statement.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Richard Laager <rlaager@wiktel.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Lorenz Hüdepohl <dev@stellardeath.org>
Closes #9802
2020-05-12 10:53:32 -07:00
157 changed files with 4887 additions and 561 deletions


@ -83,6 +83,7 @@ CONTRIBUTORS:
Christopher Voltz <cjunk@voltz.ws>
Chunwei Chen <david.chen@nutanix.com>
Clemens Fruhwirth <clemens@endorphin.org>
Coleman Kane <ckane@colemankane.org>
Colin Ian King <colin.king@canonical.com>
Craig Loomis <cloomis@astro.princeton.edu>
Craig Sanders <github@taz.net.au>


@ -20,6 +20,10 @@ notable exceptions and their respective licenses include:
* AES Implementation: module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
* PBKDF2 Implementation: lib/libzfs/THIRDPARTYLICENSE.openssl
* SPL Implementation: module/spl/THIRDPARTYLICENSE.gplv2
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
This product includes software developed by the OpenSSL Project for use
in the OpenSSL Toolkit (http://www.openssl.org/)

META

@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.8.3
Version: 0.8.6
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS on Linux
Linux-Maximum: 5.4
Linux-Maximum: 5.9
Linux-Minimum: 2.6.32


@ -347,9 +347,8 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
zpool_vdev_offline(zhp, devname, B_TRUE);
} else if (!fmd_prop_get_int32(hdl, "spare_on_remove") ||
replace_with_spare(hdl, zhp, vdev) == B_FALSE) {
/* Could not handle with spare: offline the device */
fmd_hdl_debug(hdl, "zpool_vdev_offline '%s'", devname);
zpool_vdev_offline(zhp, devname, B_TRUE);
/* Could not handle with spare */
fmd_hdl_debug(hdl, "no spare for '%s'", devname);
}
free(devname);


@ -46,8 +46,13 @@ case "${ZEVENT_HISTORY_INTERNAL_NAME}" in
set|inherit)
# Only act if one of the tracked properties is altered.
case "${ZEVENT_HISTORY_INTERNAL_STR%%=*}" in
canmount|mountpoint|atime|relatime|devices|exec| \
readonly|setuid|nbmand|encroot|keylocation) ;;
canmount|mountpoint|atime|relatime|devices|exec|readonly| \
setuid|nbmand|encroot|keylocation|org.openzfs.systemd:requires| \
org.openzfs.systemd:requires-mounts-for| \
org.openzfs.systemd:before|org.openzfs.systemd:after| \
org.openzfs.systemd:wanted-by|org.openzfs.systemd:required-by| \
org.openzfs.systemd:nofail|org.openzfs.systemd:ignore \
) ;;
*) exit 0 ;;
esac
;;
@ -61,8 +66,12 @@ esac
zed_lock zfs-list
trap abort_alter EXIT
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec,readonly"
PROPS="${PROPS},setuid,nbmand,encroot,keylocation"
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec\
,readonly,setuid,nbmand,encroot,keylocation\
,org.openzfs.systemd:requires,org.openzfs.systemd:requires-mounts-for\
,org.openzfs.systemd:before,org.openzfs.systemd:after\
,org.openzfs.systemd:wanted-by,org.openzfs.systemd:required-by\
,org.openzfs.systemd:nofail,org.openzfs.systemd:ignore"
"${ZFS}" list -H -t filesystem -o $PROPS -r "${ZEVENT_POOL}" > "${FSLIST_TMP}"


@ -4144,6 +4144,16 @@ zfs_do_send(int argc, char **argv)
}
}
if (flags.dedup) {
(void) fprintf(stderr,
gettext("WARNING: deduplicated send is "
"deprecated, and will be removed in a\n"
"future release. (In the future, the flag will be "
"accepted, but a\n"
"regular, non-deduplicated stream will be "
"generated.)\n\n"));
}
argc -= optind;
argv += optind;
@ -5954,7 +5964,7 @@ typedef struct holds_cbdata {
size_t cb_max_taglen;
} holds_cbdata_t;
#define STRFTIME_FMT_STR "%a %b %e %k:%M %Y"
#define STRFTIME_FMT_STR "%a %b %e %H:%M %Y"
#define DATETIME_BUF_LEN (32)
/*
*


@ -33,7 +33,7 @@ extern "C" {
void * safe_malloc(size_t size);
void nomem(void);
libzfs_handle_t *g_zfs;
extern libzfs_handle_t *g_zfs;
#ifdef __cplusplus
}


@ -73,6 +73,8 @@
#include "statcommon.h"
libzfs_handle_t *g_zfs;
static int zpool_do_create(int, char **);
static int zpool_do_destroy(int, char **);
@ -8618,9 +8620,9 @@ zpool_do_events_short(nvlist_t *nvl, ev_opts_t *opts)
verify(nvlist_lookup_int64_array(nvl, FM_EREPORT_TIME, &tv, &n) == 0);
memset(str, ' ', 32);
(void) ctime_r((const time_t *)&tv[0], ctime_str);
(void) strncpy(str, ctime_str+4, 6); /* 'Jun 30' */
(void) strncpy(str+7, ctime_str+20, 4); /* '1993' */
(void) strncpy(str+12, ctime_str+11, 8); /* '21:49:08' */
(void) memcpy(str, ctime_str+4, 6); /* 'Jun 30' */
(void) memcpy(str+7, ctime_str+20, 4); /* '1993' */
(void) memcpy(str+12, ctime_str+11, 8); /* '21:49:08' */
(void) sprintf(str+20, ".%09lld", (longlong_t)tv[1]); /* '.123456789' */
if (opts->scripted)
(void) printf(gettext("%s\t"), str);

View File

@ -80,7 +80,7 @@ void pool_list_free(zpool_list_t *);
int pool_list_count(zpool_list_t *);
void pool_list_remove(zpool_list_t *, zpool_handle_t *);
libzfs_handle_t *g_zfs;
extern libzfs_handle_t *g_zfs;
typedef struct vdev_cmd_data

View File

@ -2225,7 +2225,7 @@ ztest_get_data(void *arg, lr_write_t *lr, char *buf, struct lwb *lwb,
zgd->zgd_private = zd;
if (buf != NULL) { /* immediate write */
zgd->zgd_lr = (struct locked_range *)ztest_range_lock(zd,
zgd->zgd_lr = (struct zfs_locked_range *)ztest_range_lock(zd,
object, offset, size, RL_READER);
error = dmu_read(os, object, offset, size, buf,
@ -2240,7 +2240,7 @@ ztest_get_data(void *arg, lr_write_t *lr, char *buf, struct lwb *lwb,
offset = 0;
}
zgd->zgd_lr = (struct locked_range *)ztest_range_lock(zd,
zgd->zgd_lr = (struct zfs_locked_range *)ztest_range_lock(zd,
object, offset, size, RL_READER);
error = dmu_buf_hold(os, object, offset, zgd, &db,

View File

@ -0,0 +1,37 @@
dnl #
dnl # Linux 5.5 API,
dnl #
dnl # The Linux 5.5 kernel updated percpu_ref_tryget() which is inlined by
dnl # blkg_tryget() to use rcu_read_lock() instead of rcu_read_lock_sched().
dnl # As a side effect the function was converted to GPL-only.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKG_TRYGET], [
ZFS_LINUX_TEST_SRC([blkg_tryget], [
#include <linux/blk-cgroup.h>
#include <linux/bio.h>
#include <linux/fs.h>
],[
struct blkcg_gq blkg __attribute__ ((unused)) = {};
bool rc __attribute__ ((unused));
rc = blkg_tryget(&blkg);
], [], [$ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKG_TRYGET], [
AC_MSG_CHECKING([whether blkg_tryget() is available])
ZFS_LINUX_TEST_RESULT([blkg_tryget], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKG_TRYGET, 1, [blkg_tryget() is available])
AC_MSG_CHECKING([whether blkg_tryget() is GPL-only])
ZFS_LINUX_TEST_RESULT([blkg_tryget_license], [
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKG_TRYGET_GPL_ONLY, 1,
[blkg_tryget() GPL-only])
])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,62 @@
dnl #
dnl # check_disk_change() was removed in 5.10
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE], [
ZFS_LINUX_TEST_SRC([check_disk_change], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
bool error;
error = check_disk_change(bdev);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE], [
AC_MSG_CHECKING([whether check_disk_change() exists])
ZFS_LINUX_TEST_RESULT([check_disk_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CHECK_DISK_CHANGE, 1,
[check_disk_change() exists])
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.10 API, check_disk_change() is removed, in favor of
dnl # bdev_check_media_change(), which doesn't force revalidation
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
ZFS_LINUX_TEST_SRC([bdev_check_media_change], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
int error;
error = bdev_check_media_change(bdev);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
AC_MSG_CHECKING([whether bdev_disk_changed() exists])
ZFS_LINUX_TEST_RESULT([bdev_check_media_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_CHECK_MEDIA_CHANGE, 1,
[bdev_check_media_change() exists])
], [
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_CHANGE], [
ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_CHANGE], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
])

View File

@ -6,6 +6,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH], [
ZFS_LINUX_TEST_SRC([blkdev_get_by_path], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
blkdev_get_by_path(NULL, 0, NULL);
])

View File

@ -5,6 +5,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_REREAD_PART], [
ZFS_LINUX_TEST_SRC([blkdev_reread_part], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
int error;

View File

@ -91,7 +91,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC], [
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC], [
AC_MSG_CHECKING([whether mutex_lock() is GPL-only])
ZFS_LINUX_TEST_RESULT([config_debug_lock_alloc], [
ZFS_LINUX_TEST_RESULT([config_debug_lock_alloc_license], [
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(yes)

View File

@ -94,7 +94,6 @@ AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE_SANITY], [
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_FILE_PAGES])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_INACTIVE_ANON])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_INACTIVE_FILE])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_SLAB_RECLAIMABLE])
AC_MSG_RESULT(yes)
])
@ -117,8 +116,6 @@ AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_PAGE_STATE], [
[node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_FILE],
[node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_SLAB_RECLAIMABLE],
[node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_FILE_PAGES],
[zone_stat_item], [$LINUX/include/linux/mmzone.h])
@ -126,8 +123,6 @@ AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_PAGE_STATE], [
[zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_FILE],
[zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_SLAB_RECLAIMABLE],
[zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE_SANITY
])

View File

@ -1,8 +1,22 @@
dnl #
dnl # 4.18 API change
dnl # i_atime, i_mtime, and i_ctime changed from timespec to timespec64.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_TIMES], [
dnl #
dnl # 5.6 API change
dnl # timespec64_trunc() replaced by timestamp_truncate() interface.
dnl #
ZFS_LINUX_TEST_SRC([timestamp_truncate], [
#include <linux/fs.h>
],[
struct timespec64 ts;
struct inode ip;
ts = timestamp_truncate(ts, &ip);
])
dnl #
dnl # 4.18 API change
dnl # i_atime, i_mtime, and i_ctime changed from timespec to timespec64.
dnl #
ZFS_LINUX_TEST_SRC([inode_times], [
#include <linux/fs.h>
],[
@ -15,6 +29,15 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_TIMES], [
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_TIMES], [
AC_MSG_CHECKING([whether timestamp_truncate() exists])
ZFS_LINUX_TEST_RESULT([timestamp_truncate], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_TIMESTAMP_TRUNCATE, 1,
[timestamp_truncate() exists])
],[
AC_MSG_RESULT(no)
])
AC_MSG_CHECKING([whether inode->i_*time's are timespec64])
ZFS_LINUX_TEST_RESULT([inode_times], [
AC_MSG_RESULT(no)

View File

@ -5,6 +5,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INVALIDATE_BDEV], [
ZFS_LINUX_TEST_SRC([invalidate_bdev], [
#include <linux/buffer_head.h>
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
invalidate_bdev(bdev);

View File

@ -0,0 +1,24 @@
dnl #
dnl # 5.8 API,
dnl # __vmalloc PAGE_KERNEL removal
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VMALLOC_PAGE_KERNEL], [
ZFS_LINUX_TEST_SRC([__vmalloc], [
#include <linux/mm.h>
#include <linux/vmalloc.h>
],[
void *p __attribute__ ((unused));
p = __vmalloc(0, GFP_KERNEL, PAGE_KERNEL);
])
])
AC_DEFUN([ZFS_AC_KERNEL_VMALLOC_PAGE_KERNEL], [
AC_MSG_CHECKING([whether __vmalloc(ptr, flags, pageflags) is available])
ZFS_LINUX_TEST_RESULT([__vmalloc], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_VMALLOC_PAGE_KERNEL, 1, [__vmalloc page flags exists])
],[
AC_MSG_RESULT(no)
])
])

config/kernel-ktime.m4 (new file, 55 lines)
View File

@ -0,0 +1,55 @@
dnl #
dnl # 4.18: ktime_get_coarse_real_ts64() replaces current_kernel_time64().
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTIME_GET_COARSE_REAL_TS64], [
ZFS_LINUX_TEST_SRC([ktime_get_coarse_real_ts64], [
#include <linux/mm.h>
], [
struct timespec64 ts;
ktime_get_coarse_real_ts64(&ts);
])
])
AC_DEFUN([ZFS_AC_KERNEL_KTIME_GET_COARSE_REAL_TS64], [
AC_MSG_CHECKING([whether ktime_get_coarse_real_ts64() exists])
ZFS_LINUX_TEST_RESULT([ktime_get_coarse_real_ts64], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KTIME_GET_COARSE_REAL_TS64, 1,
[ktime_get_coarse_real_ts64() exists])
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 4.18: ktime_get_raw_ts64() replaces getrawmonotonic64().
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTIME_GET_RAW_TS64], [
ZFS_LINUX_TEST_SRC([ktime_get_raw_ts64], [
#include <linux/mm.h>
], [
struct timespec64 ts;
ktime_get_raw_ts64(&ts);
])
])
AC_DEFUN([ZFS_AC_KERNEL_KTIME_GET_RAW_TS64], [
AC_MSG_CHECKING([whether ktime_get_raw_ts64() exists])
ZFS_LINUX_TEST_RESULT([ktime_get_raw_ts64], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KTIME_GET_RAW_TS64, 1,
[ktime_get_raw_ts64() exists])
], [
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTIME], [
ZFS_AC_KERNEL_SRC_KTIME_GET_COARSE_REAL_TS64
ZFS_AC_KERNEL_SRC_KTIME_GET_RAW_TS64
])
AC_DEFUN([ZFS_AC_KERNEL_KTIME], [
ZFS_AC_KERNEL_KTIME_GET_COARSE_REAL_TS64
ZFS_AC_KERNEL_KTIME_GET_RAW_TS64
])

View File

@ -1,23 +0,0 @@
dnl #
dnl # 4.18: ktime_get_coarse_real_ts64() added. Use it in place of
dnl # current_kernel_time64().
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTIME_GET_COARSE_REAL_TS64], [
ZFS_LINUX_TEST_SRC([ktime_get_coarse_real_ts64], [
#include <linux/mm.h>
], [
struct timespec64 ts;
ktime_get_coarse_real_ts64(&ts);
])
])
AC_DEFUN([ZFS_AC_KERNEL_KTIME_GET_COARSE_REAL_TS64], [
AC_MSG_CHECKING([whether ktime_get_coarse_real_ts64() exists])
ZFS_LINUX_TEST_RESULT([ktime_get_coarse_real_ts64], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KTIME_GET_COARSE_REAL_TS64, 1,
[ktime_get_coarse_real_ts64() exists])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -5,6 +5,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_LOOKUP_BDEV], [
ZFS_LINUX_TEST_SRC([lookup_bdev_1arg], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
lookup_bdev(NULL);
])

View File

@ -25,52 +25,106 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
],[
blk_queue_make_request(NULL, &make_request);
])
ZFS_LINUX_TEST_SRC([blk_alloc_queue_request_fn], [
#include <linux/blkdev.h>
blk_qc_t make_request(struct request_queue *q,
struct bio *bio) { return (BLK_QC_T_NONE); }
],[
struct request_queue *q __attribute__ ((unused));
q = blk_alloc_queue(make_request, NUMA_NO_NODE);
])
ZFS_LINUX_TEST_SRC([block_device_operations_submit_bio], [
#include <linux/blkdev.h>
],[
struct block_device_operations o;
o.submit_bio = NULL;
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
dnl # Checked as part of the blk_alloc_queue_request_fn test
dnl #
dnl # Legacy API
dnl # make_request_fn returns int.
dnl # Linux 5.9 API Change
dnl # make_request_fn was moved into block_device_operations->submit_bio
dnl #
AC_MSG_CHECKING([whether make_request_fn() returns int])
ZFS_LINUX_TEST_RESULT([make_request_fn_int], [
AC_MSG_CHECKING([whether submit_bio is member of struct block_device_operations])
ZFS_LINUX_TEST_RESULT([block_device_operations_submit_bio], [
AC_MSG_RESULT(yes)
AC_DEFINE(MAKE_REQUEST_FN_RET, int,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_INT, 1,
[Noting that make_request_fn() returns int])
],[
AC_MSG_RESULT(no)
AC_DEFINE(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS, 1,
[submit_bio is member of struct block_device_operations])
],[
dnl # Checked as part of the blk_alloc_queue_request_fn test
dnl #
dnl # Linux 3.2 API Change
dnl # make_request_fn returns void.
dnl # Linux 5.7 API Change
dnl # blk_alloc_queue() expects request function.
dnl #
AC_MSG_CHECKING([whether make_request_fn() returns void])
ZFS_LINUX_TEST_RESULT([make_request_fn_void], [
AC_MSG_CHECKING([whether blk_alloc_queue() expects request function])
ZFS_LINUX_TEST_RESULT([blk_alloc_queue_request_fn], [
AC_MSG_RESULT(yes)
AC_DEFINE(MAKE_REQUEST_FN_RET, void,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_VOID, 1,
[Noting that make_request_fn() returns void])
dnl # Checked as part of the blk_alloc_queue_request_fn test
AC_MSG_CHECKING([whether make_request_fn() returns blk_qc_t])
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_ALLOC_QUEUE_REQUEST_FN, 1,
[blk_alloc_queue() expects request function])
AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
[Noting that make_request_fn() returns blk_qc_t])
],[
AC_MSG_RESULT(no)
dnl #
dnl # Linux 4.4 API Change
dnl # make_request_fn returns blk_qc_t.
dnl # Linux 3.2 API Change
dnl # make_request_fn returns void.
dnl #
AC_MSG_CHECKING(
[whether make_request_fn() returns blk_qc_t])
ZFS_LINUX_TEST_RESULT([make_request_fn_blk_qc_t], [
AC_MSG_CHECKING([whether make_request_fn() returns void])
ZFS_LINUX_TEST_RESULT([make_request_fn_void], [
AC_MSG_RESULT(yes)
AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
[Noting that make_request_fn() ]
[returns blk_qc_t])
AC_DEFINE(MAKE_REQUEST_FN_RET, void,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_VOID, 1,
[Noting that make_request_fn() returns void])
],[
ZFS_LINUX_TEST_ERROR([make_request_fn])
AC_MSG_RESULT(no)
dnl #
dnl # Linux 4.4 API Change
dnl # make_request_fn returns blk_qc_t.
dnl #
AC_MSG_CHECKING(
[whether make_request_fn() returns blk_qc_t])
ZFS_LINUX_TEST_RESULT([make_request_fn_blk_qc_t], [
AC_MSG_RESULT(yes)
AC_DEFINE(MAKE_REQUEST_FN_RET, blk_qc_t,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_QC, 1,
[Noting that make_request_fn() ]
[returns blk_qc_t])
],[
AC_MSG_RESULT(no)
dnl #
dnl # Legacy API
dnl # make_request_fn returns int.
dnl #
AC_MSG_CHECKING(
[whether make_request_fn() returns int])
ZFS_LINUX_TEST_RESULT([make_request_fn_int], [
AC_MSG_RESULT(yes)
AC_DEFINE(MAKE_REQUEST_FN_RET, int,
[make_request_fn() return type])
AC_DEFINE(HAVE_MAKE_REQUEST_FN_RET_INT,
1, [Noting that make_request_fn() ]
[returns int])
],[
ZFS_LINUX_TEST_ERROR([make_request_fn])
])
])
])
])
])

View File

@ -1,3 +1,24 @@
dnl #
dnl # Detect objtool functionality.
dnl #
dnl #
dnl # Kernel 5.10: linux/frame.h was renamed linux/objtool.h
dnl #
AC_DEFUN([ZFS_AC_KERNEL_OBJTOOL_HEADER], [
AC_MSG_CHECKING([whether objtool header is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/objtool.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_OBJTOOL_HEADER, 1,
[kernel has linux/objtool.h])
AC_MSG_RESULT(linux/objtool.h)
],[
AC_MSG_RESULT(linux/frame.h)
])
])
dnl #
dnl # Check for objtool support.
dnl #
@ -6,19 +27,24 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_OBJTOOL], [
dnl # 4.6 API for compile-time stack validation
ZFS_LINUX_TEST_SRC([objtool], [
#undef __ASSEMBLY__
#include <asm/ptrace.h>
#include <asm/frame.h>
],[
#if !defined(FRAME_BEGIN)
CTASSERT(1);
#error "FRAME_BEGIN is not defined"
#endif
])
dnl # 4.6 API added STACK_FRAME_NON_STANDARD macro
ZFS_LINUX_TEST_SRC([stack_frame_non_standard], [
#ifdef HAVE_KERNEL_OBJTOOL_HEADER
#include <linux/objtool.h>
#else
#include <linux/frame.h>
#endif
],[
#if !defined(STACK_FRAME_NON_STANDARD)
CTASSERT(1);
#error "STACK_FRAME_NON_STANDARD is not defined."
#endif
])
])

View File

@ -0,0 +1,41 @@
dnl #
dnl # 5.6 API Change
dnl # The proc_ops structure was introduced to replace the use of
dnl # the file_operations structure when registering proc handlers.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PROC_OPERATIONS], [
ZFS_LINUX_TEST_SRC([proc_ops_struct], [
#include <linux/proc_fs.h>
int test_open(struct inode *ip, struct file *fp) { return 0; }
ssize_t test_read(struct file *fp, char __user *ptr,
size_t size, loff_t *offp) { return 0; }
ssize_t test_write(struct file *fp, const char __user *ptr,
size_t size, loff_t *offp) { return 0; }
loff_t test_lseek(struct file *fp, loff_t off, int flag)
{ return 0; }
int test_release(struct inode *ip, struct file *fp)
{ return 0; }
const struct proc_ops test_ops __attribute__ ((unused)) = {
.proc_open = test_open,
.proc_read = test_read,
.proc_write = test_write,
.proc_lseek = test_lseek,
.proc_release = test_release,
};
], [
struct proc_dir_entry *entry __attribute__ ((unused)) =
proc_create_data("test", 0444, NULL, &test_ops, NULL);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PROC_OPERATIONS], [
AC_MSG_CHECKING([whether proc_ops structure exists])
ZFS_LINUX_TEST_RESULT([proc_ops_struct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_PROC_OPS_STRUCT, 1, [proc_ops structure exists])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -12,6 +12,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
dnl # Sequential ZFS_LINUX_TRY_COMPILE tests
ZFS_AC_KERNEL_FPU_HEADER
ZFS_AC_KERNEL_OBJTOOL_HEADER
ZFS_AC_KERNEL_WAIT_QUEUE_ENTRY_T
ZFS_AC_KERNEL_MISC_MINOR
ZFS_AC_KERNEL_DECLARE_EVENT_CLASS
@ -45,6 +46,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SCHED
ZFS_AC_KERNEL_SRC_USLEEP_RANGE
ZFS_AC_KERNEL_SRC_KMEM_CACHE
ZFS_AC_KERNEL_SRC_VMALLOC_PAGE_KERNEL
ZFS_AC_KERNEL_SRC_WAIT
ZFS_AC_KERNEL_SRC_INODE_TIMES
ZFS_AC_KERNEL_SRC_INODE_LOCK
@ -53,6 +55,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_TIMER_SETUP
ZFS_AC_KERNEL_SRC_CURRENT_BIO_TAIL
ZFS_AC_KERNEL_SRC_SUPER_USER_NS
ZFS_AC_KERNEL_SRC_PROC_OPERATIONS
ZFS_AC_KERNEL_SRC_SUBMIT_BIO
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS
ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH
@ -78,6 +81,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS
ZFS_AC_KERNEL_SRC_BLK_QUEUE_PLUG
ZFS_AC_KERNEL_SRC_BLKG_TRYGET
ZFS_AC_KERNEL_SRC_GET_DISK_AND_MODULE
ZFS_AC_KERNEL_SRC_GET_DISK_RO
ZFS_AC_KERNEL_SRC_GENERIC_READLINK_GLOBAL
@ -136,10 +140,11 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_CURRENT_TIME
ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES
ZFS_AC_KERNEL_SRC_IN_COMPAT_SYSCALL
ZFS_AC_KERNEL_SRC_KTIME_GET_COARSE_REAL_TS64
ZFS_AC_KERNEL_SRC_KTIME
ZFS_AC_KERNEL_SRC_TOTALRAM_PAGES_FUNC
ZFS_AC_KERNEL_SRC_TOTALHIGH_PAGES
ZFS_AC_KERNEL_SRC_KSTRTOUL
ZFS_AC_KERNEL_SRC_BLKDEV_CHANGE
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@ -161,6 +166,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SCHED
ZFS_AC_KERNEL_USLEEP_RANGE
ZFS_AC_KERNEL_KMEM_CACHE
ZFS_AC_KERNEL_VMALLOC_PAGE_KERNEL
ZFS_AC_KERNEL_WAIT
ZFS_AC_KERNEL_INODE_TIMES
ZFS_AC_KERNEL_INODE_LOCK
@ -169,6 +175,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_TIMER_SETUP
ZFS_AC_KERNEL_CURRENT_BIO_TAIL
ZFS_AC_KERNEL_SUPER_USER_NS
ZFS_AC_KERNEL_PROC_OPERATIONS
ZFS_AC_KERNEL_SUBMIT_BIO
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS
ZFS_AC_KERNEL_BLKDEV_GET_BY_PATH
@ -186,6 +193,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_BIO_BI_STATUS
ZFS_AC_KERNEL_BIO_RW_BARRIER
ZFS_AC_KERNEL_BIO_RW_DISCARD
ZFS_AC_KERNEL_BLKG_TRYGET
ZFS_AC_KERNEL_BLK_QUEUE_BDI
ZFS_AC_KERNEL_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE
@ -252,10 +260,11 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_USERNS_CAPABILITIES
ZFS_AC_KERNEL_IN_COMPAT_SYSCALL
ZFS_AC_KERNEL_KTIME_GET_COARSE_REAL_TS64
ZFS_AC_KERNEL_KTIME
ZFS_AC_KERNEL_TOTALRAM_PAGES_FUNC
ZFS_AC_KERNEL_TOTALHIGH_PAGES
ZFS_AC_KERNEL_KSTRTOUL
ZFS_AC_KERNEL_BLKDEV_CHANGE
])
dnl #
@ -809,11 +818,20 @@ dnl # $2 - source
dnl # $3 - run on success (valid .ko generated)
dnl # $4 - run on failure (unable to compile)
dnl #
dnl # When configuring as builtin (--enable-linux-builtin) for kernels
dnl # without loadable module support (CONFIG_MODULES=n) only the object
dnl # file is created. See ZFS_LINUX_TEST_COMPILE_ALL for details.
dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
[test -f build/conftest/conftest.ko],
[$3], [$4])
AS_IF([test "x$enable_linux_builtin" = "xyes"], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
[test -f build/conftest/conftest.o], [$3], [$4])
], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]])],
[test -f build/conftest/conftest.ko], [$3], [$4])
])
])
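A minimal sketch of the builtin workflow this accommodates (the kernel source path is illustrative): when the target kernel has CONFIG_MODULES=n, no .ko can ever be linked, so the conftest is judged on conftest.o instead.

# Configure for builtin use against a module-less kernel tree, then
# graft the ZFS sources into it with the copy-builtin script.
./configure --enable-linux-builtin --with-linux=/usr/src/linux
./copy-builtin /usr/src/linux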
dnl #

View File

@ -23,6 +23,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_TOOLCHAIN_SIMD], [
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512VL
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AES
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_PCLMULQDQ
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE
;;
esac
])
@ -401,3 +402,23 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_PCLMULQDQ], [
AC_MSG_RESULT([no])
])
])
dnl #
dnl # ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE
dnl #
AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE], [
AC_MSG_CHECKING([whether host toolchain supports MOVBE])
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
{
__asm__ __volatile__("movbe 0(%eax), %eax");
}
]])], [
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_MOVBE], 1, [Define if host toolchain supports MOVBE])
], [
AC_MSG_RESULT([no])
])
])

View File

@ -67,6 +67,7 @@ AC_CONFIG_FILES([
udev/Makefile
udev/rules.d/Makefile
etc/Makefile
etc/default/Makefile
etc/init.d/Makefile
etc/zfs/Makefile
etc/systemd/Makefile

View File

@ -15,8 +15,10 @@ $(pkgdracut_SCRIPTS):%:%.in
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
# Double-colon rules are allowed; there are multiple independent definitions.
clean-local::
-$(RM) $(pkgdracut_SCRIPTS)
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(pkgdracut_SCRIPTS)

View File

@ -33,5 +33,6 @@ $(pkgdracut_SCRIPTS) $(pkgdracut_DATA) :%:%.in
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(pkgdracut_SCRIPTS) $(pkgdracut_DATA)

View File

@ -6,15 +6,10 @@ initrd_SCRIPTS = \
SUBDIRS = hooks scripts
EXTRA_DIST = \
$(top_srcdir)/etc/init.d/zfs \
$(top_srcdir)/etc/init.d/zfs-functions \
$(top_srcdir)/contrib/initramfs/conf.d/zfs \
$(top_srcdir)/contrib/initramfs/conf-hooks.d/zfs \
$(top_srcdir)/contrib/initramfs/README.initramfs.markdown
$(top_srcdir)/etc/init.d/zfs $(top_srcdir)/etc/init.d/zfs-functions:
$(MAKE) -C $(top_srcdir)/etc/init.d zfs zfs-functions
install-initrdSCRIPTS: $(EXTRA_DIST)
for d in conf.d conf-hooks.d scripts/local-top; do \
$(MKDIR_P) $(DESTDIR)$(initrddir)/$$d; \
@ -26,9 +21,3 @@ install-initrdSCRIPTS: $(EXTRA_DIST)
cp $(top_builddir)/contrib/initramfs/$$d/zfs \
$(DESTDIR)$(initrddir)/$$d/; \
done
$(MKDIR_P) $(DESTDIR)$(DEFAULT_INITCONF_DIR); \
cp $(top_builddir)/etc/init.d/zfs \
$(DESTDIR)$(DEFAULT_INITCONF_DIR)/; \
$(MKDIR_P) $(DESTDIR)$(sysconfdir)/zfs; \
cp $(top_builddir)/etc/init.d/zfs-functions \
$(DESTDIR)$(sysconfdir)/zfs/

View File

@ -14,8 +14,10 @@ $(hooks_SCRIPTS):%:%.in
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
# Double-colon rules are allowed; there are multiple independent definitions.
clean-local::
-$(RM) $(hooks_SCRIPTS)
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(hooks_SCRIPTS)

View File

@ -1,20 +1,6 @@
scriptsdir = /usr/share/initramfs-tools/scripts
scripts_DATA = \
dist_scripts_DATA = \
zfs
SUBDIRS = local-top
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/scripts/zfs.in
$(scripts_DATA):%:%.in
-$(SED) -e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
clean-local::
-$(RM) $(scripts_SCRIPTS)
distclean-local::
-$(RM) $(scripts_SCRIPTS)

View File

@ -6,17 +6,9 @@
# Enable this by passing boot=zfs on the kernel command line.
#
# Source the common init script
# Source the common functions
. /etc/zfs/zfs-functions
# Paths to what we need - in the initrd, these paths are hardcoded,
# so override the defines in zfs-functions.
ZFS="@sbindir@/zfs"
ZPOOL="@sbindir@/zpool"
ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
export ZFS ZPOOL ZPOOL_CACHE
# Start interactive shell.
# Use Debian's panic() if defined, because it can be used to prevent shell
# access by setting panic on the kernel command line (e.g. panic=0 or panic=15).

View File

@ -36,11 +36,13 @@ rm -rf "$KERNEL_DIR/include/zfs" "$KERNEL_DIR/fs/zfs"
cp --recursive include "$KERNEL_DIR/include/zfs"
cp --recursive module "$KERNEL_DIR/fs/zfs"
cp zfs_config.h "$KERNEL_DIR/include/zfs/"
rm "$KERNEL_DIR/include/zfs/.gitignore"
for MODULE in "${MODULES[@]}"
do
sed -i.bak '/obj =/d' "$KERNEL_DIR/fs/zfs/$MODULE/Makefile"
sed -i.bak '/src =/d' "$KERNEL_DIR/fs/zfs/$MODULE/Makefile"
sed -i '/obj =/d' "$KERNEL_DIR/fs/zfs/$MODULE/Makefile"
sed -i '/src =/d' "$KERNEL_DIR/fs/zfs/$MODULE/Makefile"
sed -i "s|-I$PWD/module/|-I\$(srctree)/fs/zfs/|" "$KERNEL_DIR/fs/zfs/$MODULE/Makefile"
done
cat > "$KERNEL_DIR/fs/zfs/Kconfig" <<"EOF"

View File

@ -1,2 +1,2 @@
SUBDIRS = zfs sudoers.d $(ZFS_INIT_SYSTEMD) $(ZFS_INIT_SYSV) $(ZFS_MODULE_LOAD)
DIST_SUBDIRS = init.d zfs systemd modules-load.d sudoers.d
SUBDIRS = default zfs sudoers.d $(ZFS_INIT_SYSTEMD) $(ZFS_INIT_SYSV) $(ZFS_MODULE_LOAD)
DIST_SUBDIRS = default init.d zfs systemd modules-load.d sudoers.d

etc/default/.gitignore (new file, 1 line)
View File

@ -0,0 +1 @@
zfs

etc/default/Makefile.am (new file, 12 lines)
View File

@ -0,0 +1,12 @@
initconfdir = $(DEFAULT_INITCONF_DIR)
initconf_SCRIPTS = zfs
EXTRA_DIST = \
$(top_srcdir)/etc/default/zfs.in
$(initconf_SCRIPTS):%:%.in Makefile
$(SED) \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
CLEANFILES = $(initconf_SCRIPTS)

View File

@ -1,4 +1,3 @@
zfs-functions
zfs-import
zfs-mount
zfs-share

View File

@ -1,21 +1,15 @@
initdir = $(DEFAULT_INIT_DIR)
init_SCRIPTS = zfs-import zfs-mount zfs-share zfs-zed
initcommondir = $(sysconfdir)/zfs
initcommon_SCRIPTS = zfs-functions
initconfdir = $(DEFAULT_INITCONF_DIR)
initconf_SCRIPTS = zfs
EXTRA_DIST = \
$(top_srcdir)/etc/init.d/zfs-functions.in \
$(top_srcdir)/etc/init.d/zfs-share.in \
$(top_srcdir)/etc/init.d/zfs-import.in \
$(top_srcdir)/etc/init.d/zfs-mount.in \
$(top_srcdir)/etc/init.d/zfs-zed.in \
$(top_srcdir)/etc/init.d/zfs.in
$(top_srcdir)/etc/init.d/zfs-zed.in
$(init_SCRIPTS) $(initconf_SCRIPTS) $(initcommon_SCRIPTS):%:%.in
$(init_SCRIPTS):%:%.in Makefile
-(if [ -e /etc/debian_version ]; then \
NFS_SRV=nfs-kernel-server; \
else \
@ -26,7 +20,8 @@ $(init_SCRIPTS) $(initconf_SCRIPTS) $(initcommon_SCRIPTS):%:%.in
else \
SHELL=/bin/sh; \
fi; \
$(SED) -e 's,@bindir\@,$(bindir),g' \
$(SED) \
-e 's,@bindir\@,$(bindir),g' \
-e 's,@sbindir\@,$(sbindir),g' \
-e 's,@udevdir\@,$(udevdir),g' \
-e 's,@udevruledir\@,$(udevruledir),g' \
@ -37,8 +32,6 @@ $(init_SCRIPTS) $(initconf_SCRIPTS) $(initcommon_SCRIPTS):%:%.in
-e "s,@SHELL\@,$$SHELL,g" \
-e "s,@NFS_SRV\@,$$NFS_SRV,g" \
$< >'$@'; \
[ '$@' = 'zfs-functions' -o '$@' = 'zfs' ] || \
chmod +x '$@')
distclean-local::
-$(RM) $(init_SCRIPTS) $(initcommon_SCRIPTS) $(initconf_SCRIPTS)
CLEANFILES = $(init_SCRIPTS)

View File

@ -35,7 +35,7 @@ SUPPORT
If you're making your own distribution and you want the scripts to
work on that, the biggest problem you'll (probably) have is the part
at the beginning of the "zfs-functions.in" file which sets up the
at the beginning of the "zfs-functions" file which sets up the
logging output.
INSTALLING INIT SCRIPT LINKS

View File

@ -9,5 +9,6 @@ $(modulesload_DATA):%:%.in
-e '' \
$< >'$@'
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(modulesload_DATA)

View File

@ -11,5 +11,6 @@ $(systemdgenerator_SCRIPTS): %: %.in
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(systemdgenerator_SCRIPTS)

View File

@ -2,6 +2,7 @@
# zfs-mount-generator - generates systemd mount units for zfs
# Copyright (c) 2017 Antonio Russo <antonio.e.russo@gmail.com>
# Copyright (c) 2020 InsanePrawn <insane.prawny@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
@ -33,6 +34,32 @@ do_fail() {
exit 1
}
# test if $1 is in space-separated list $2
is_known() {
query="$1"
IFS=' '
for element in $2 ; do
if [ "$query" = "$element" ] ; then
return 0
fi
done
return 1
}
# create dependency on unit file $1
# of type $2, i.e. "wants" or "requires"
# in the target units from space-separated list $3
create_dependencies() {
unitfile="$1"
suffix="$2"
IFS=' '
for target in $3 ; do
target_dir="${dest_norm}/${target}.${suffix}/"
mkdir -p "${target_dir}"
ln -s "../${unitfile}" "${target_dir}"
done
}
# see systemd.generator
if [ $# -eq 0 ] ; then
dest_norm="/tmp"
@ -42,11 +69,7 @@ else
do_fail "zero or three arguments required"
fi
# For ZFSs marked "auto", a dependency is created for local-fs.target. To
# avoid regressions, this dependency is reduced to "wants" rather than
# "requires". **THIS MAY CHANGE**
req_dir="${dest_norm}/local-fs.target.wants/"
mkdir -p "${req_dir}"
pools=$(zpool list -H -o name || true)
# All needed information about each ZFS is available from
# zfs list -H -t filesystem -o <properties>
@ -58,10 +81,11 @@ process_line() {
# zfs list -H -o name,...
# fields are tab separated
IFS="$(printf '\t')"
# protect against special characters in, e.g., mountpoints
set -f
# shellcheck disable=SC2086
set -- $1
dataset="${1}"
pool="${dataset%%/*}"
p_mountpoint="${2}"
p_canmount="${3}"
p_atime="${4}"
@ -73,39 +97,116 @@ process_line() {
p_nbmand="${10}"
p_encroot="${11}"
p_keyloc="${12}"
p_systemd_requires="${13}"
p_systemd_requiresmountsfor="${14}"
p_systemd_before="${15}"
p_systemd_after="${16}"
p_systemd_wantedby="${17}"
p_systemd_requiredby="${18}"
p_systemd_nofail="${19}"
p_systemd_ignore="${20}"
# Minimal pre-requisites to mount a ZFS dataset
# By ordering before zfs-mount.service, we avoid race conditions.
after="zfs-import.target"
before="zfs-mount.service"
wants="zfs-import.target"
requires=""
requiredmounts=""
bindsto=""
wantedby=""
requiredby=""
noauto="off"
# If the pool is already imported, zfs-import.target is not needed. This
# avoids a dependency loop on root-on-ZFS systems:
# systemd-random-seed.service After (via RequiresMountsFor) var-lib.mount
# After zfs-import.target After zfs-import-{cache,scan}.service After
# cryptsetup.service After systemd-random-seed.service.
#
# Pools are newline-separated and may contain spaces in their names.
# There is no better portable way to set IFS to just a newline. Using
# $(printf '\n') doesn't work because $(...) strips trailing newlines.
IFS="
"
for p in $pools ; do
if [ "$p" = "$pool" ] ; then
after=""
wants=""
break
fi
done
if [ -n "${p_systemd_after}" ] && \
[ "${p_systemd_after}" != "-" ] ; then
after="${p_systemd_after} ${after}"
fi
if [ -n "${p_systemd_before}" ] && \
[ "${p_systemd_before}" != "-" ] ; then
before="${p_systemd_before} ${before}"
fi
if [ -n "${p_systemd_requires}" ] && \
[ "${p_systemd_requires}" != "-" ] ; then
requires="Requires=${p_systemd_requires}"
fi
if [ -n "${p_systemd_requiresmountsfor}" ] && \
[ "${p_systemd_requiresmountsfor}" != "-" ] ; then
requiredmounts="RequiresMountsFor=${p_systemd_requiresmountsfor}"
fi
# Handle encryption
if [ -n "${p_encroot}" ] &&
[ "${p_encroot}" != "-" ] ; then
keyloadunit="zfs-load-key-$(systemd-escape "${p_encroot}").service"
if [ "${p_encroot}" = "${dataset}" ] ; then
pathdep=""
keymountdep=""
if [ "${p_keyloc%%://*}" = "file" ] ; then
pathdep="RequiresMountsFor='${p_keyloc#file://}'"
keyloadcmd="@sbindir@/zfs load-key '${dataset}'"
if [ -n "${requiredmounts}" ] ; then
keymountdep="${requiredmounts} '${p_keyloc#file://}'"
else
keymountdep="RequiresMountsFor='${p_keyloc#file://}'"
fi
keyloadscript="@sbindir@/zfs load-key \"${dataset}\""
elif [ "${p_keyloc}" = "prompt" ] ; then
keyloadcmd="/bin/sh -c 'set -eu;"\
"keystatus=\"\$\$(@sbindir@/zfs get -H -o value keystatus \"${dataset}\")\";"\
"[ \"\$\$keystatus\" = \"unavailable\" ] || exit 0;"\
"count=0;"\
"while [ \$\$count -lt 3 ];do"\
" systemd-ask-password --id=\"zfs:${dataset}\""\
" \"Enter passphrase for ${dataset}:\"|"\
" @sbindir@/zfs load-key \"${dataset}\" && exit 0;"\
" count=\$\$((count + 1));"\
"done;"\
"exit 1'"
keyloadscript="\
count=0;\
while [ \$\$count -lt 3 ];do\
systemd-ask-password --id=\"zfs:${dataset}\"\
\"Enter passphrase for ${dataset}:\"|\
@sbindir@/zfs load-key \"${dataset}\" && exit 0;\
count=\$\$((count + 1));\
done;\
exit 1"
else
printf 'zfs-mount-generator: (%s) invalid keylocation\n' \
"${dataset}" >/dev/kmsg
fi
keyloadcmd="\
/bin/sh -c '\
set -eu;\
keystatus=\"\$\$(@sbindir@/zfs get -H -o value keystatus \"${dataset}\")\";\
[ \"\$\$keystatus\" = \"unavailable\" ] || exit 0;\
${keyloadscript}'"
keyunloadcmd="\
/bin/sh -c '\
set -eu;\
keystatus=\"\$\$(@sbindir@/zfs get -H -o value keystatus \"${dataset}\")\";\
[ \"\$\$keystatus\" = \"available\" ] || exit 0;\
@sbindir@/zfs unload-key \"${dataset}\"'"
# Generate the key-load .service unit
cat > "${dest_norm}/${keyloadunit}" << EOF
# Automatically generated by zfs-mount-generator
#
# Note: It is tempting to use a `<<EOF` style here-document for this, but
# the shell may need a writable /tmp or $TMPDIR to expand it, and that is
# not always available early during boot.
#
echo \
"# Automatically generated by zfs-mount-generator
[Unit]
Description=Load ZFS key for ${dataset}
@ -113,33 +214,50 @@ SourcePath=${cachefile}
Documentation=man:zfs-mount-generator(8)
DefaultDependencies=no
Wants=${wants}
After=${wants}
${pathdep}
After=${after}
${requires}
${keymountdep}
[Service]
Type=oneshot
RemainAfterExit=yes
# This avoids a dependency loop involving systemd-journald.socket if this
# dataset is a parent of the root filesystem.
StandardOutput=null
StandardError=null
ExecStart=${keyloadcmd}
ExecStop=@sbindir@/zfs unload-key '${dataset}'
EOF
ExecStop=${keyunloadcmd}" > "${dest_norm}/${keyloadunit}"
fi
# Update the dependencies for the mount file to require the
# Update the dependencies for the mount file to want the
# key-loading unit.
wants="${wants} ${keyloadunit}"
wants="${wants}"
bindsto="BindsTo=${keyloadunit}"
after="${after} ${keyloadunit}"
fi
# Prepare the .mount unit
# skip generation of the mount unit if org.openzfs.systemd:ignore is "on"
if [ -n "${p_systemd_ignore}" ] ; then
if [ "${p_systemd_ignore}" = "on" ] ; then
return
elif [ "${p_systemd_ignore}" = "-" ] \
|| [ "${p_systemd_ignore}" = "off" ] ; then
: # This is OK
else
do_fail "invalid org.openzfs.systemd:ignore for ${dataset}"
fi
fi
# Check for canmount=off .
if [ "${p_canmount}" = "off" ] ; then
return
elif [ "${p_canmount}" = "noauto" ] ; then
# Don't let a noauto marked mountpoint block an "auto" marked mountpoint
return
noauto="on"
elif [ "${p_canmount}" = "on" ] ; then
: # This is OK
else
do_fail "invalid canmount"
do_fail "invalid canmount for ${dataset}"
fi
# Check for legacy and blank mountpoints.
@ -148,11 +266,11 @@ EOF
elif [ "${p_mountpoint}" = "none" ] ; then
return
elif [ "${p_mountpoint%"${p_mountpoint#?}"}" != "/" ] ; then
do_fail "invalid mountpoint $*"
do_fail "invalid mountpoint for ${dataset}"
fi
# Escape the mountpoint per systemd policy.
mountfile="$(systemd-escape "${p_mountpoint#?}").mount"
mountfile="$(systemd-escape --path --suffix=mount "${p_mountpoint}")"
# Parse options
# see lib/libzfs/libzfs_mount.c:zfs_add_options
@ -226,39 +344,130 @@ EOF
"${dataset}" >/dev/kmsg
fi
# If the mountpoint has already been created, give it precedence.
if [ -n "${p_systemd_wantedby}" ] && \
[ "${p_systemd_wantedby}" != "-" ] ; then
noauto="on"
if [ "${p_systemd_wantedby}" = "none" ] ; then
wantedby=""
else
wantedby="${p_systemd_wantedby}"
before="${before} ${wantedby}"
fi
fi
if [ -n "${p_systemd_requiredby}" ] && \
[ "${p_systemd_requiredby}" != "-" ] ; then
noauto="on"
if [ "${p_systemd_requiredby}" = "none" ] ; then
requiredby=""
else
requiredby="${p_systemd_requiredby}"
before="${before} ${requiredby}"
fi
fi
# For datasets with canmount=on, a dependency is created for
# local-fs.target by default. To avoid regressions, this dependency
# is reduced to "wants" rather than "requires" when nofail is not "off".
# **THIS MAY CHANGE**
# noauto=on disables this behavior completely.
if [ "${noauto}" != "on" ] ; then
if [ "${p_systemd_nofail}" = "off" ] ; then
requiredby="local-fs.target"
before="${before} local-fs.target"
else
wantedby="local-fs.target"
if [ "${p_systemd_nofail}" != "on" ] ; then
before="${before} local-fs.target"
fi
fi
fi
# Handle existing files:
# 1. We never overwrite existing files, although we may delete
# files if we're sure they were created by us. (see 5.)
# 2. We handle files differently based on canmount. Units with canmount=on
# always have precedence over noauto. This is enforced by the sort pipe
# in the loop around this function.
# It is important to use $p_canmount and not $noauto here, since we
# sort by canmount while other properties also modify $noauto, e.g.
# org.openzfs.systemd:wanted-by.
# 3. If no unit file exists for a noauto dataset, we create one.
# Additionally, we use $noauto_files to track the unit file names
# (which are the systemd-escaped mountpoints) of all (exclusively)
# noauto datasets that had a file created.
# 4. If the file to be created is found in the tracking variable,
# we do NOT create it.
# 5. If a file exists for a noauto dataset, we check whether the file
# name is in the variable. If it is, we have multiple noauto datasets
# for the same mountpoint. In such cases, we remove the file for safety.
# To avoid further noauto datasets creating a file for this path again,
# we leave the file name in the tracking variable.
if [ -e "${dest_norm}/${mountfile}" ] ; then
printf 'zfs-mount-generator: %s already exists\n' "${mountfile}" \
>/dev/kmsg
if is_known "$mountfile" "$noauto_files" ; then
# if it's in $noauto_files, we must be noauto too. See 2.
printf 'zfs-mount-generator: removing duplicate noauto %s\n' \
"${mountfile}" >/dev/kmsg
# See 5.
rm "${dest_norm}/${mountfile}"
else
# don't log for canmount=noauto
if [ "${p_canmount}" = "on" ] ; then
printf 'zfs-mount-generator: %s already exists. Skipping.\n' \
"${mountfile}" >/dev/kmsg
fi
fi
# file exists; Skip current dataset.
return
else
if is_known "${mountfile}" "${noauto_files}" ; then
# See 4.
return
elif [ "${p_canmount}" = "noauto" ] ; then
noauto_files="${mountfile} ${noauto_files}"
fi
fi
# Create the .mount unit file.
# By ordering before zfs-mount.service, we avoid race conditions.
cat > "${dest_norm}/${mountfile}" << EOF
# Automatically generated by zfs-mount-generator
#
# (Do not use `<<EOF`-style here-documents for this, see warning above)
#
echo \
"# Automatically generated by zfs-mount-generator
[Unit]
SourcePath=${cachefile}
Documentation=man:zfs-mount-generator(8)
Before=local-fs.target zfs-mount.service
After=${wants}
Before=${before}
After=${after}
Wants=${wants}
${bindsto}
${requires}
${requiredmounts}
[Mount]
Where=${p_mountpoint}
What=${dataset}
Type=zfs
Options=defaults${opts},zfsutil
EOF
Options=defaults${opts},zfsutil" > "${dest_norm}/${mountfile}"
# Finally, create the appropriate dependencies
create_dependencies "${mountfile}" "wants" "$wantedby"
create_dependencies "${mountfile}" "requires" "$requiredby"
# Finally, create the appropriate dependency
ln -s "../${mountfile}" "${req_dir}"
}
# Feed each line into process_line
for cachefile in "${FSLIST}/"* ; do
while read -r fs ; do
process_line "${fs}"
done < "${cachefile}"
# Disable glob expansion to protect against special characters when parsing.
set -f
# Sort cachefile's lines by canmount, "on" before "noauto"
# and feed each line into process_line
sort -t "$(printf '\t')" -k 3 -r "${cachefile}" | \
( # subshell is necessary for `sort|while read` and $noauto_files
noauto_files=""
while read -r fs ; do
process_line "${fs}"
done
)
done
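Thanks to the argument handling near the top of the script, the generator can be exercised by hand: with no arguments it writes the generated units to /tmp rather than systemd's early generator directory (the installed path below is typical and may differ by distribution).

# Dry run of the generator, then inspect what it would have created.
/usr/lib/systemd/system-generators/zfs-mount-generator
ls /tmp/*.mount /tmp/zfs-load-key-*.service 2>/dev/null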

View File

@ -35,5 +35,6 @@ install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(systemdunitdir)"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-import.service"
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) $(systemdunit_DATA) $(systemdpreset_DATA)

View File

@ -5,9 +5,11 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=systemd-remount-fs.service
Before=zfs-import.target
ConditionPathExists=@sysconfdir@/zfs/zpool.cache
ConditionPathIsDirectory=/sys/module/zfs
[Service]
Type=oneshot

View File

@ -5,8 +5,10 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
Before=zfs-import.target
ConditionPathExists=!@sysconfdir@/zfs/zpool.cache
ConditionPathIsDirectory=/sys/module/zfs
[Service]
Type=oneshot

View File

@ -6,7 +6,7 @@ After=systemd-udev-settle.service
After=zfs-import.target
After=systemd-remount-fs.service
Before=local-fs.target
Before=systemd-random-seed.service
ConditionPathIsDirectory=/sys/module/zfs
[Service]
Type=oneshot

etc/zfs/.gitignore (new file, 1 line)
View File

@ -0,0 +1 @@
zfs-functions

View File

@ -6,5 +6,29 @@ pkgsysconf_DATA = \
vdev_id.conf.sas_switch.example \
vdev_id.conf.multipath.example \
vdev_id.conf.scsi.example
pkgsysconf_SCRIPTS = \
zfs-functions
EXTRA_DIST = $(pkgsysconf_DATA)
EXTRA_DIST = $(pkgsysconf_DATA) \
zfs-functions.in
$(pkgsysconf_SCRIPTS):%:%.in Makefile
-(if [ -e /etc/debian_version ]; then \
NFS_SRV=nfs-kernel-server; \
else \
NFS_SRV=nfs; \
fi; \
if [ -e /sbin/openrc-run ]; then \
SHELL=/sbin/openrc-run; \
else \
SHELL=/bin/sh; \
fi; \
$(SED) \
-e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
-e 's,@initconfdir\@,$(initconfdir),g' \
$< >'$@'; \
[ '$@' = 'zfs-functions' ] || \
chmod +x '$@')
CLEANFILES = $(pkgsysconf_SCRIPTS)

View File

@ -71,6 +71,7 @@ struct libzfs_handle {
int libzfs_pool_iter;
char libzfs_chassis_id[256];
boolean_t libzfs_prop_debug;
boolean_t libzfs_dedup_warning_printed;
};
#define ZFSSHARE_MISS 0x01 /* Didn't find entry in cache */

View File

@ -371,12 +371,48 @@ bio_set_bi_error(struct bio *bio, int error)
*
* For older kernels trigger a re-reading of the partition table by calling
* check_disk_change() which calls flush_disk() to invalidate the device.
*
* For newer kernels (as of 5.10), bdev_check_media_change() is used in place of
* check_disk_change(), with the modification that invalidation is no longer
* forced.
*/
#ifdef HAVE_CHECK_DISK_CHANGE
#define zfs_check_media_change(bdev) check_disk_change(bdev)
#ifdef HAVE_BLKDEV_REREAD_PART
#define vdev_bdev_reread_part(bdev) blkdev_reread_part(bdev)
#else
#define vdev_bdev_reread_part(bdev) check_disk_change(bdev)
#endif /* HAVE_BLKDEV_REREAD_PART */
#else
#ifdef HAVE_BDEV_CHECK_MEDIA_CHANGE
static inline int
zfs_check_media_change(struct block_device *bdev)
{
struct gendisk *gd = bdev->bd_disk;
const struct block_device_operations *bdo = gd->fops;
if (!bdev_check_media_change(bdev))
return (0);
/*
* Force revalidation, to mimic the old behavior of
* check_disk_change()
*/
if (bdo->revalidate_disk)
bdo->revalidate_disk(gd);
return (0);
}
#define vdev_bdev_reread_part(bdev) zfs_check_media_change(bdev)
#else
/*
* This is encountered if check_disk_change() and bdev_check_media_change()
* are not available in the kernel - likely due to an API change that needs
* to be chased down.
*/
#error "Unsupported kernel: no usable disk change check"
#endif /* HAVE_BDEV_CHECK_MEDIA_CHANGE */
#endif /* HAVE_CHECK_DISK_CHANGE */
/*
* 2.6.22 API change
@ -669,4 +705,20 @@ blk_generic_end_io_acct(struct request_queue *q, int rw,
#endif
}
#ifndef HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS
static inline struct request_queue *
blk_generic_alloc_queue(make_request_fn make_request, int node_id)
{
#if defined(HAVE_BLK_ALLOC_QUEUE_REQUEST_FN)
return (blk_alloc_queue(make_request, node_id));
#else
struct request_queue *q = blk_alloc_queue(GFP_KERNEL);
if (q != NULL)
blk_queue_make_request(q, make_request);
return (q);
#endif
}
#endif /* !HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS */
#endif /* _ZFS_BLKDEV_H */

View File

@ -35,11 +35,6 @@
#else
#define nr_inactive_file_pages() global_zone_page_state(NR_INACTIVE_FILE)
#endif
#if defined(ZFS_ENUM_NODE_STAT_ITEM_NR_SLAB_RECLAIMABLE)
#define nr_slab_reclaimable_pages() global_node_page_state(NR_SLAB_RECLAIMABLE)
#else
#define nr_slab_reclaimable_pages() global_zone_page_state(NR_SLAB_RECLAIMABLE)
#endif
#elif defined(ZFS_GLOBAL_NODE_PAGE_STATE)
@ -59,11 +54,6 @@
#else
#define nr_inactive_file_pages() global_page_state(NR_INACTIVE_FILE)
#endif
#if defined(ZFS_ENUM_NODE_STAT_ITEM_NR_SLAB_RECLAIMABLE)
#define nr_slab_reclaimable_pages() global_node_page_state(NR_SLAB_RECLAIMABLE)
#else
#define nr_slab_reclaimable_pages() global_page_state(NR_SLAB_RECLAIMABLE)
#endif
#else
@ -71,7 +61,6 @@
#define nr_file_pages() global_page_state(NR_FILE_PAGES)
#define nr_inactive_anon_pages() global_page_state(NR_INACTIVE_ANON)
#define nr_inactive_file_pages() global_page_state(NR_INACTIVE_FILE)
#define nr_slab_reclaimable_pages() global_page_state(NR_SLAB_RECLAIMABLE)
#endif /* ZFS_GLOBAL_ZONE_PAGE_STATE */

View File

@ -382,7 +382,8 @@ typedef enum cpuid_inst_sets {
AVX512ER,
AVX512VL,
AES,
PCLMULQDQ
PCLMULQDQ,
MOVBE
} cpuid_inst_sets_t;
/*
@ -406,6 +407,7 @@ typedef struct cpuid_feature_desc {
#define _AVX512VL_BIT (1U << 31) /* if used also check other levels */
#define _AES_BIT (1U << 25)
#define _PCLMULQDQ_BIT (1U << 1)
#define _MOVBE_BIT (1U << 22)
/*
* Descriptions of supported instruction sets
@ -433,6 +435,7 @@ static const cpuid_feature_desc_t cpuid_features[] = {
[AVX512VL] = {7U, 0U, _AVX512ER_BIT, EBX },
[AES] = {1U, 0U, _AES_BIT, ECX },
[PCLMULQDQ] = {1U, 0U, _PCLMULQDQ_BIT, ECX },
[MOVBE] = {1U, 0U, _MOVBE_BIT, ECX },
};
/*
@ -505,6 +508,7 @@ CPUID_FEATURE_CHECK(avx512er, AVX512ER);
CPUID_FEATURE_CHECK(avx512vl, AVX512VL);
CPUID_FEATURE_CHECK(aes, AES);
CPUID_FEATURE_CHECK(pclmulqdq, PCLMULQDQ);
CPUID_FEATURE_CHECK(movbe, MOVBE);
#endif /* !defined(_KERNEL) */
@ -719,6 +723,23 @@ zfs_pclmulqdq_available(void)
#endif
}
/*
* Check if MOVBE instruction is available
*/
static inline boolean_t
zfs_movbe_available(void)
{
#if defined(_KERNEL)
#if defined(X86_FEATURE_MOVBE)
return (!!boot_cpu_has(X86_FEATURE_MOVBE));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_movbe());
#endif
}
/*
* AVX-512 family of instruction sets:
*

View File

@ -169,6 +169,15 @@ extern void *spl_kmem_alloc(size_t sz, int fl, const char *func, int line);
extern void *spl_kmem_zalloc(size_t sz, int fl, const char *func, int line);
extern void spl_kmem_free(const void *ptr, size_t sz);
/*
* 5.8 API change, pgprot_t argument removed.
*/
#ifdef HAVE_VMALLOC_PAGE_KERNEL
#define spl_vmalloc(size, flags) __vmalloc(size, flags, PAGE_KERNEL)
#else
#define spl_vmalloc(size, flags) __vmalloc(size, flags)
#endif
/*
* The following functions are only available for internal use.
*/

View File

@ -152,6 +152,12 @@ typedef struct kstat_named_s {
#define KSTAT_NAMED_STR_PTR(knptr) ((knptr)->value.string.addr.ptr)
#define KSTAT_NAMED_STR_BUFLEN(knptr) ((knptr)->value.string.len)
#ifdef HAVE_PROC_OPS_STRUCT
typedef struct proc_ops kstat_proc_op_t;
#else
typedef struct file_operations kstat_proc_op_t;
#endif
typedef struct kstat_intr {
uint_t intrs[KSTAT_NUM_INTRS];
} kstat_intr_t;
@ -197,7 +203,7 @@ extern void kstat_proc_entry_init(kstat_proc_entry_t *kpep,
const char *module, const char *name);
extern void kstat_proc_entry_delete(kstat_proc_entry_t *kpep);
extern void kstat_proc_entry_install(kstat_proc_entry_t *kpep, mode_t mode,
const struct file_operations *file_ops, void *data);
const kstat_proc_op_t *file_ops, void *data);
extern void __kstat_install(kstat_t *ksp);
extern void __kstat_delete(kstat_t *ksp);

View File

@ -26,6 +26,7 @@
#define _SPL_MUTEX_H
#include <sys/types.h>
#include <linux/sched.h>
#include <linux/mutex.h>
#include <linux/lockdep.h>
#include <linux/compiler_compat.h>

View File

@ -85,7 +85,7 @@ gethrestime(inode_timespec_t *ts)
#endif
}
static inline time_t
static inline uint64_t
gethrestime_sec(void)
{
#if defined(HAVE_INODE_TIMESPEC64_TIMES)
@ -105,8 +105,13 @@ gethrestime_sec(void)
static inline hrtime_t
gethrtime(void)
{
#if defined(HAVE_KTIME_GET_RAW_TS64)
struct timespec64 ts;
ktime_get_raw_ts64(&ts);
#else
struct timespec ts;
getrawmonotonic(&ts);
#endif
return (((hrtime_t)ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec);
}

View File

@ -47,10 +47,6 @@
#define membar_producer() smp_wmb()
#define physmem zfs_totalram_pages
#define freemem (nr_free_pages() + \
global_page_state(NR_INACTIVE_FILE) + \
global_page_state(NR_INACTIVE_ANON) + \
global_page_state(NR_SLAB_RECLAIMABLE))
#define xcopyin(from, to, size) copy_from_user(to, from, size)
#define xcopyout(from, to, size) copy_to_user(to, from, size)

View File

@ -175,7 +175,7 @@ typedef struct ddt_ops {
int (*ddt_op_count)(objset_t *os, uint64_t object, uint64_t *count);
} ddt_ops_t;
#define DDT_NAMELEN 80
#define DDT_NAMELEN 107
extern void ddt_object_name(ddt_t *ddt, enum ddt_type type,
enum ddt_class class, char *name);

View File

@ -1042,7 +1042,7 @@ typedef struct zgd {
struct lwb *zgd_lwb;
struct blkptr *zgd_bp;
dmu_buf_t *zgd_db;
struct locked_range *zgd_lr;
struct zfs_locked_range *zgd_lr;
void *zgd_private;
} zgd_t;

View File

@ -23,8 +23,13 @@
extern "C" {
#endif
#if defined(__KERNEL__) && defined(HAVE_STACK_FRAME_NON_STANDARD)
#if defined(__KERNEL__) && defined(HAVE_KERNEL_OBJTOOL) && \
defined(HAVE_STACK_FRAME_NON_STANDARD)
#if defined(HAVE_KERNEL_OBJTOOL_HEADER)
#include <linux/objtool.h>
#else
#include <linux/frame.h>
#endif
#else
#define STACK_FRAME_NON_STANDARD(func)
#endif

View File

@ -274,7 +274,7 @@ struct vdev {
range_tree_t *vdev_initialize_tree; /* valid while initializing */
uint64_t vdev_initialize_bytes_est;
uint64_t vdev_initialize_bytes_done;
time_t vdev_initialize_action_time; /* start and end time */
uint64_t vdev_initialize_action_time; /* start and end time */
/* TRIM related */
boolean_t vdev_trim_exit_wanted;
@ -295,7 +295,7 @@ struct vdev {
uint64_t vdev_trim_rate; /* requested rate (bytes/sec) */
uint64_t vdev_trim_partial; /* requested partial TRIM */
uint64_t vdev_trim_secure; /* requested secure TRIM */
time_t vdev_trim_action_time; /* start and end time */
uint64_t vdev_trim_action_time; /* start and end time */
/* for limiting outstanding I/Os (initialize and TRIM) */
kmutex_t vdev_initialize_io_lock;

View File

@ -39,40 +39,40 @@ typedef enum {
RL_READER,
RL_WRITER,
RL_APPEND
} rangelock_type_t;
} zfs_rangelock_type_t;
struct locked_range;
struct zfs_locked_range;
typedef void (rangelock_cb_t)(struct locked_range *, void *);
typedef void (zfs_rangelock_cb_t)(struct zfs_locked_range *, void *);
typedef struct rangelock {
typedef struct zfs_rangelock {
avl_tree_t rl_tree; /* contains locked_range_t */
kmutex_t rl_lock;
rangelock_cb_t *rl_cb;
zfs_rangelock_cb_t *rl_cb;
void *rl_arg;
} rangelock_t;
} zfs_rangelock_t;
typedef struct locked_range {
rangelock_t *lr_rangelock; /* rangelock that this lock applies to */
typedef struct zfs_locked_range {
zfs_rangelock_t *lr_rangelock; /* rangelock that this lock applies to */
avl_node_t lr_node; /* avl node link */
uint64_t lr_offset; /* file range offset */
uint64_t lr_length; /* file range length */
uint_t lr_count; /* range reference count in tree */
rangelock_type_t lr_type; /* range type */
zfs_rangelock_type_t lr_type; /* range type */
kcondvar_t lr_write_cv; /* cv for waiting writers */
kcondvar_t lr_read_cv; /* cv for waiting readers */
uint8_t lr_proxy; /* acting for original range */
uint8_t lr_write_wanted; /* writer wants to lock this range */
uint8_t lr_read_wanted; /* reader wants to lock this range */
} locked_range_t;
} zfs_locked_range_t;
void zfs_rangelock_init(rangelock_t *, rangelock_cb_t *, void *);
void zfs_rangelock_fini(rangelock_t *);
void zfs_rangelock_init(zfs_rangelock_t *, zfs_rangelock_cb_t *, void *);
void zfs_rangelock_fini(zfs_rangelock_t *);
locked_range_t *zfs_rangelock_enter(rangelock_t *,
uint64_t, uint64_t, rangelock_type_t);
void zfs_rangelock_exit(locked_range_t *);
void zfs_rangelock_reduce(locked_range_t *, uint64_t, uint64_t);
zfs_locked_range_t *zfs_rangelock_enter(zfs_rangelock_t *,
uint64_t, uint64_t, zfs_rangelock_type_t);
void zfs_rangelock_exit(zfs_locked_range_t *);
void zfs_rangelock_reduce(zfs_locked_range_t *, uint64_t, uint64_t);
#ifdef __cplusplus
}

View File

@ -191,7 +191,7 @@ typedef struct znode {
krwlock_t z_parent_lock; /* parent lock for directories */
krwlock_t z_name_lock; /* "master" lock for dirent locks */
zfs_dirlock_t *z_dirlocks; /* directory entry lock list */
rangelock_t z_rangelock; /* file range locks */
zfs_rangelock_t z_rangelock; /* file range locks */
boolean_t z_unlinked; /* file has been unlinked */
boolean_t z_atime_dirty; /* atime needs to be synced */
boolean_t z_zn_prefetch; /* Prefetch znodes? */

View File

@ -118,7 +118,7 @@ enum zio_encrypt {
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_CCM
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
/* macros defining encryption lengths */
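At the command line this means a dataset created with encryption=on now ends up with aes-256-gcm rather than aes-256-ccm (the pool and dataset names below are hypothetical):

# "encryption=on" now resolves to the new default suite, aes-256-gcm.
zfs create -o encryption=on -o keyformat=passphrase tank/secure
zfs get encryption tank/secure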

View File

@ -188,13 +188,14 @@ zpl_dir_emit_dots(struct file *file, zpl_dir_context_t *ctx)
}
#endif /* HAVE_VFS_ITERATE */
/*
* Linux 4.18, inode times converted from timespec to timespec64.
*/
#if defined(HAVE_INODE_TIMESPEC64_TIMES)
#define zpl_inode_timespec_trunc(ts, gran) timespec64_trunc(ts, gran)
#if defined(HAVE_INODE_TIMESTAMP_TRUNCATE)
#define zpl_inode_timestamp_truncate(ts, ip) timestamp_truncate(ts, ip)
#elif defined(HAVE_INODE_TIMESPEC64_TIMES)
#define zpl_inode_timestamp_truncate(ts, ip) \
timespec64_trunc(ts, (ip)->i_sb->s_time_gran)
#else
#define zpl_inode_timespec_trunc(ts, gran) timespec_trunc(ts, gran)
#define zpl_inode_timestamp_truncate(ts, ip) \
timespec_trunc(ts, (ip)->i_sb->s_time_gran)
#endif
#endif /* _SYS_ZPL_H */

View File

@ -20,6 +20,8 @@ ASM_SOURCES_AS = \
asm-x86_64/aes/aes_amd64.S \
asm-x86_64/aes/aes_aesni.S \
asm-x86_64/modes/gcm_pclmulqdq.S \
asm-x86_64/modes/aesni-gcm-x86_64.S \
asm-x86_64/modes/ghash-x86_64.S \
asm-x86_64/sha1/sha1-x86_64.S \
asm-x86_64/sha2/sha256_impl.S \
asm-x86_64/sha2/sha512_impl.S

View File

@ -65,6 +65,8 @@ static boolean_t smb_available(void);
static sa_fstype_t *smb_fstype;
smb_share_t *smb_shares;
/*
* Retrieve the list of SMB shares.
*/

View File

@ -44,6 +44,6 @@ typedef struct smb_share_s {
struct smb_share_s *next;
} smb_share_t;
smb_share_t *smb_shares;
extern smb_share_t *smb_shares;
void libshare_smb_init(void);

View File

@ -39,7 +39,7 @@
#endif /* MNTTAB */
#define MNTTAB "/proc/self/mounts"
#define MNT_LINE_MAX 4096
#define MNT_LINE_MAX 4108
#define MNT_TOOLONG 1 /* entry exceeds MNT_LINE_MAX */
#define MNT_TOOMANY 2 /* too many fields in line */

View File

@ -88,7 +88,7 @@ gethrestime(inode_timespec_t *ts)
ts->tv_nsec = tv.tv_usec * NSEC_PER_USEC;
}
static inline time_t
static inline uint64_t
gethrestime_sec(void)
{
struct timeval tv;

View File

@ -8,7 +8,7 @@ VPATH = \
# Suppress unused but set variable warnings often due to ASSERTs
AM_CFLAGS += $(NO_UNUSED_BUT_SET_VARIABLE)
libzfs_pcdir = $(datarootdir)/pkgconfig
libzfs_pcdir = $(libdir)/pkgconfig
libzfs_pc_DATA = libzfs.pc libzfs_core.pc
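With the .pc files now landing in $(libdir)/pkgconfig, pkg-config finds them on its default search path; a quick sanity check after installation (assuming a standard prefix):

# Print compile/link flags and the installed version.
pkg-config --cflags --libs libzfs
pkg-config --modversion libzfs_core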
DEFAULT_INCLUDES += \

View File

@ -3984,6 +3984,26 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
(void) printf("found clone origin %s\n", origin);
}
if (!hdl->libzfs_dedup_warning_printed &&
(DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo) &
DMU_BACKUP_FEATURE_DEDUP)) {
(void) fprintf(stderr,
gettext("WARNING: This is a deduplicated send stream. "
"The ability to send and\n"
"receive deduplicated send streams is deprecated. "
"In the future, the\n"
"ability to receive a deduplicated send stream with "
"\"zfs receive\" will be\n"
"removed. However, in the future, a utility will be "
"provided to convert a\n"
"deduplicated send stream to a regular "
"(non-deduplicated) stream. This\n"
"future utility will require that the send stream be "
"located in a\n"
"seek-able file, rather than provided by a pipe.\n\n"));
hdl->libzfs_dedup_warning_printed = B_TRUE;
}
boolean_t resuming = DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo) &
DMU_BACKUP_FEATURE_RESUMING;
boolean_t raw = DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo) &

View File

@ -20,6 +20,7 @@ EXTRA_DIST = \
$(nodist_man_MANS): %: %.in
-$(SED) -e 's,@zfsexecdir\@,$(zfsexecdir),g' \
-e 's,@systemdgeneratordir\@,$(systemdgeneratordir),g' \
-e 's,@runstatedir\@,$(runstatedir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'

View File

@ -1,8 +1,33 @@
.TH "ZFS\-MOUNT\-GENERATOR" "8" "ZFS" "zfs-mount-generator" "\""
.\"
.\" Copyright 2018 Antonio Russo <antonio.e.russo@gmail.com>
.\" Copyright 2019 Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
.\" Copyright 2020 InsanePrawn <insane.prawny@gmail.com>
.\"
.\" Permission is hereby granted, free of charge, to any person obtaining
.\" a copy of this software and associated documentation files (the
.\" "Software"), to deal in the Software without restriction, including
.\" without limitation the rights to use, copy, modify, merge, publish,
.\" distribute, sublicense, and/or sell copies of the Software, and to
.\" permit persons to whom the Software is furnished to do so, subject to
.\" the following conditions:
.\"
.\" The above copyright notice and this permission notice shall be
.\" included in all copies or substantial portions of the Software.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
.\" EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
.\" NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
.\" LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
.\" OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
.\" WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
.TH "ZFS\-MOUNT\-GENERATOR" "8" "2020-01-19" "ZFS" "zfs-mount-generator" "\""
.SH "NAME"
zfs\-mount\-generator \- generates systemd mount units for ZFS
.SH SYNOPSIS
.B /lib/systemd/system-generators/zfs\-mount\-generator
.B @systemdgeneratordir@/zfs\-mount\-generator
.sp
.SH DESCRIPTION
zfs\-mount\-generator implements the \fBGenerators Specification\fP
@ -11,22 +36,50 @@ of
and is called during early boot to generate
.BR systemd.mount (5)
units for automatically mounted datasets. Mount ordering and dependencies
are created for all tracked pools (see below). If a dataset has
.BR canmount=on
and
.BR mountpoint
set, the
.BR auto
mount option will be set, and a dependency for
.BR local-fs.target
on the mount will be created.
are created for all tracked pools (see below).
Because zfs pools may not be available very early in the boot process,
information on ZFS mountpoints must be stored separately. The output
of the command
.SS ENCRYPTION KEYS
If the dataset is an encryption root, a service that loads the associated key (either from file or through a
.BR systemd\-ask\-password (1)
prompt) will be created. This service
.BR RequiresMountsFor
the path of the key (if file-based) and also copies the mount unit's
.BR After ,
.BR Before
and
.BR Requires .
All mount units of encrypted datasets add the key\-load service for their encryption root to their
.BR Wants
and
.BR After .
The service will not be
.BR Want ed
or
.BR Require d
by
.BR local-fs.target
directly, and so will only be started manually or as a dependency of a started mount unit.
.SS UNIT ORDERING AND DEPENDENCIES
mount unit's
.BR Before
\->
key\-load service (if any)
\->
mount unit
\->
mount unit's
.BR After
It is worth noting that when a mount unit is activated, it activates all available mount units for parent paths to its mountpoint, i.e. activating the mount unit for /tmp/foo/1/2/3 automatically activates all available mount units for /tmp, /tmp/foo, /tmp/foo/1, and /tmp/foo/1/2. This is true for any combination of mount units from any sources, not just ZFS.
.SS CACHE FILE
Because ZFS pools may not be available very early in the boot process,
information on ZFS mountpoints must be stored separately. The output of the command
.PP
.RS 4
zfs list -H -o name,mountpoint,canmount,atime,relatime,devices,exec,readonly,setuid,nbmand,encroot,keylocation
zfs list -H -o name,mountpoint,canmount,atime,relatime,devices,exec,readonly,setuid,nbmand,encroot,keylocation,org.openzfs.systemd:requires,org.openzfs.systemd:requires-mounts-for,org.openzfs.systemd:before,org.openzfs.systemd:after,org.openzfs.systemd:wanted-by,org.openzfs.systemd:required-by,org.openzfs.systemd:nofail,org.openzfs.systemd:ignore
.RE
.PP
for datasets that should be mounted by systemd, should be kept
@ -45,6 +98,98 @@ history_event-zfs-list-cacher.sh .
.RE
.PP
.sp
.SS PROPERTIES
The behavior of the generator script can be influenced by the following dataset properties:
.sp
.TP 4
.BR canmount = on | off | noauto
If a dataset has
.BR mountpoint
set and
.BR canmount
is not
.BR off ,
a mount unit will be generated.
Additionally, if
.BR canmount
is
.BR on ,
.BR local-fs.target
will gain a dependency on the mount unit.
This behavior is equal to the
.BR auto
and
.BR noauto
legacy mount options, see
.BR systemd.mount (5).
Encryption roots always generate a key-load service, even for
.BR canmount=off .
.TP 4
.BR org.openzfs.systemd:requires\-mounts\-for = \fIpath\fR...
Space\-separated list of mountpoints to require to be mounted for this mount unit
.TP 4
.BR org.openzfs.systemd:before = \fIunit\fR...
The mount unit and associated key\-load service will be ordered before this space\-separated list of units.
.TP 4
.BR org.openzfs.systemd:after = \fIunit\fR...
The mount unit and associated key\-load service will be ordered after this space\-separated list of units.
.TP 4
.BR org.openzfs.systemd:wanted\-by = \fIunit\fR...
Space-separated list of units that will gain a
.BR Wants
dependency on this mount unit.
Setting this property implies
.BR noauto .
.TP 4
.BR org.openzfs.systemd:required\-by = \fIunit\fR...
Space-separated list of units that will gain a
.BR Requires
dependency on this mount unit.
Setting this property implies
.BR noauto .
.TP 4
.BR org.openzfs.systemd:nofail = unset | on | off
Toggles between a
.BR Wants
and
.BR Requires
type of dependency between the mount unit and
.BR local-fs.target ,
if
.BR noauto
isn't set or implied.
.BR on :
Mount will be
.BR WantedBy
local-fs.target
.BR off :
Mount will be
.BR Before
and
.BR RequiredBy
local-fs.target
.BR unset :
Mount will be
.BR Before
and
.BR WantedBy
local-fs.target
.TP 4
.BR org.openzfs.systemd:ignore = on | off
If set to
.BR on ,
do not generate a mount unit for this dataset.
.RE
See also
.BR systemd.mount (5)
.PP
.SH EXAMPLE
To begin, enable tracking for the pool:
.PP
@ -63,7 +208,9 @@ systemctl enable zfs-zed.service
systemctl restart zfs-zed.service
.RE
.PP
Force the running of the ZEDLET by setting canmount=on for at least one dataset in the pool:
Force the running of the ZEDLET by setting a monitored property, e.g.
.BR canmount ,
for at least one dataset in the pool:
.PP
.RS 4
zfs set canmount=on
@ -71,6 +218,24 @@ zfs set canmount=on
.RE
.PP
This forces an update to the stale cache file.
To test the generator output, run
.PP
.RS 4
@systemdgeneratordir@/zfs-mount-generator /tmp/zfs-mount-generator . .
.RE
.PP
This will generate units and dependencies in
.I /tmp/zfs-mount-generator
for you to inspect. The second and third arguments are ignored.
If you're satisfied with the generated units, instruct systemd to re-run all generators:
.PP
.RS 4
systemctl daemon-reload
.RE
.PP
.sp
.SH SEE ALSO
.BR zfs (5)

View File

@ -1440,7 +1440,7 @@ Selecting
.Sy encryption Ns = Ns Sy on
when creating a dataset indicates that the default encryption suite will be
selected, which is currently
.Sy aes-256-ccm .
.Sy aes-256-gcm .
In order to provide consistent data protection, encryption must be specified at
dataset creation time and it cannot be changed afterwards.
.Pp
@ -3461,6 +3461,9 @@ By default, a full stream is generated.
.Bl -tag -width "-D"
.It Fl D, -dedup
Generate a deduplicated stream.
\fBDeduplicated send is deprecated and will be removed in a future release.\fR
(In the future, the flag will be accepted but a regular, non-deduplicated
stream will be generated.)
Blocks which would have been sent multiple times in the send stream will only be
sent once.
The receiving system must also support this feature to receive a deduplicated
@ -3835,6 +3838,18 @@ destroyed by using the
.Nm zfs Cm destroy Fl d
command.
.Pp
Deduplicated send streams can be generated by using the
.Nm zfs Cm send Fl D
command.
\fBThe ability to send and receive deduplicated send streams is deprecated.\fR
In the future, the ability to receive a deduplicated send stream with
.Nm zfs Cm receive
will be removed.
However, in the future, a utility will be provided to convert a
deduplicated send stream to a regular (non-deduplicated) stream.
This future utility will require that the send stream be located in a
seek-able file, rather than provided by a pipe.
.Pp
If
.Fl o Em property Ns = Ns Ar value
or

man/man8/zfsprops.8 Normal file
View File

View File

@ -13,6 +13,16 @@ ASM_SOURCES += asm-x86_64/modes/gcm_pclmulqdq.o
ASM_SOURCES += asm-x86_64/sha1/sha1-x86_64.o
ASM_SOURCES += asm-x86_64/sha2/sha256_impl.o
ASM_SOURCES += asm-x86_64/sha2/sha512_impl.o
ASM_SOURCES += asm-x86_64/aes/aeskey.o
ASM_SOURCES += asm-x86_64/aes/aes_amd64.o
ASM_SOURCES += asm-x86_64/aes/aes_aesni.o
ASM_SOURCES += asm-x86_64/modes/gcm_pclmulqdq.o
ASM_SOURCES += asm-x86_64/modes/aesni-gcm-x86_64.o
ASM_SOURCES += asm-x86_64/modes/ghash-x86_64.o
ASM_SOURCES += asm-x86_64/sha1/sha1-x86_64.o
ASM_SOURCES += asm-x86_64/sha2/sha256_impl.o
ASM_SOURCES += asm-x86_64/sha2/sha512_impl.o
endif
ifeq ($(TARGET_ASM_DIR), asm-i386)
@ -25,9 +35,9 @@ endif
obj-$(CONFIG_ZFS) := $(MODULE).o
asflags-y := -I$(src)/include
asflags-y := -I@abs_top_srcdir@/module/icp/include
asflags-y += $(ZFS_MODULE_CFLAGS) $(ZFS_MODULE_CPPFLAGS)
ccflags-y := -I$(src)/include
ccflags-y := -I@abs_top_srcdir@/module/icp/include
ccflags-y += $(ZFS_MODULE_CFLAGS) $(ZFS_MODULE_CPPFLAGS)
$(MODULE)-objs += illumos-crypto.o
@ -72,6 +82,13 @@ $(MODULE)-$(CONFIG_X86) += algs/modes/gcm_pclmulqdq.o
$(MODULE)-$(CONFIG_X86) += algs/aes/aes_impl_aesni.o
$(MODULE)-$(CONFIG_X86) += algs/aes/aes_impl_x86-64.o
# Suppress objtool "can't find jump dest instruction at" warnings. They
# are caused by the constants which are defined in the text section of the
# assembly file using .byte instructions (e.g. bswap_mask). The objtool
# utility tries to interpret them as opcodes and obviously fails doing so.
OBJECT_FILES_NON_STANDARD_aesni-gcm-x86_64.o := y
OBJECT_FILES_NON_STANDARD_ghash-x86_64.o := y
ICP_DIRS = \
api \
core \

View File

@ -330,7 +330,7 @@ aes_impl_init(void)
sizeof (aes_fastest_impl));
#endif
strcpy(aes_fastest_impl.name, "fastest");
strlcpy(aes_fastest_impl.name, "fastest", AES_IMPL_NAME_MAX);
/* Finish initialization */
atomic_swap_32(&icp_aes_impl, user_sel_impl);
@ -405,7 +405,7 @@ aes_impl_set(const char *val)
return (err);
}
#if defined(_KERNEL)
#if defined(_KERNEL) && defined(__linux__)
#include <linux/mod_compat.h>
static int

View File

@ -30,12 +30,49 @@
#include <sys/byteorder.h>
#include <modes/gcm_impl.h>
#include <linux/simd.h>
#ifdef CAN_USE_GCM_ASM
#include <aes/aes_impl.h>
#endif
#define GHASH(c, d, t, o) \
xor_block((uint8_t *)(d), (uint8_t *)(c)->gcm_ghash); \
(o)->mul((uint64_t *)(void *)(c)->gcm_ghash, (c)->gcm_H, \
(uint64_t *)(void *)(t));
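For context while reading the AVX changes below: the GHASH macro above implements one step of the standard GCM authentication function, folding a 16-byte block X_i into the running digest by an XOR followed by a carry-less multiplication with the hash subkey H over GF(2^128):

$$Y_0 = 0, \qquad Y_i = (Y_{i-1} \oplus X_i) \cdot H \quad \text{in } \mathrm{GF}(2^{128}).$$

The AVX code path added in this diff computes the same recurrence via the PCLMULQDQ-based gcm_ghash_avx() routine instead of the generic (o)->mul implementation.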
/* Select GCM implementation */
#define IMPL_FASTEST (UINT32_MAX)
#define IMPL_CYCLE (UINT32_MAX-1)
#ifdef CAN_USE_GCM_ASM
#define IMPL_AVX (UINT32_MAX-2)
#endif
#define GCM_IMPL_READ(i) (*(volatile uint32_t *) &(i))
static uint32_t icp_gcm_impl = IMPL_FASTEST;
static uint32_t user_sel_impl = IMPL_FASTEST;
#ifdef CAN_USE_GCM_ASM
/* Does the architecture we run on support the MOVBE instruction? */
boolean_t gcm_avx_can_use_movbe = B_FALSE;
/*
* Whether to use the optimized openssl gcm and ghash implementations.
* Set to true if module parameter icp_gcm_impl == "avx".
*/
static boolean_t gcm_use_avx = B_FALSE;
#define GCM_IMPL_USE_AVX (*(volatile boolean_t *)&gcm_use_avx)
static inline boolean_t gcm_avx_will_work(void);
static inline void gcm_set_avx(boolean_t);
static inline boolean_t gcm_toggle_avx(void);
extern boolean_t atomic_toggle_boolean_nv(volatile boolean_t *);
static int gcm_mode_encrypt_contiguous_blocks_avx(gcm_ctx_t *, char *, size_t,
crypto_data_t *, size_t);
static int gcm_encrypt_final_avx(gcm_ctx_t *, crypto_data_t *, size_t);
static int gcm_decrypt_final_avx(gcm_ctx_t *, crypto_data_t *, size_t);
static int gcm_init_avx(gcm_ctx_t *, unsigned char *, size_t, unsigned char *,
size_t, size_t);
#endif /* ifdef CAN_USE_GCM_ASM */
/*
* Encrypt multiple blocks of data in GCM mode. Decrypt for GCM mode
* is done in another function.
@ -47,6 +84,12 @@ gcm_mode_encrypt_contiguous_blocks(gcm_ctx_t *ctx, char *data, size_t length,
void (*copy_block)(uint8_t *, uint8_t *),
void (*xor_block)(uint8_t *, uint8_t *))
{
#ifdef CAN_USE_GCM_ASM
if (ctx->gcm_use_avx == B_TRUE)
return (gcm_mode_encrypt_contiguous_blocks_avx(
ctx, data, length, out, block_size));
#endif
const gcm_impl_ops_t *gops;
size_t remainder = length;
size_t need = 0;
@ -109,6 +152,14 @@ gcm_mode_encrypt_contiguous_blocks(gcm_ctx_t *ctx, char *data, size_t length,
ctx->gcm_processed_data_len += block_size;
/*
* The following copies a complete GCM block back to where it
* came from if there was a remainder in the last call and out
* is NULL. That doesn't seem to make sense. So we assert this
* can't happen and leave the code in for reference.
* See https://github.com/zfsonlinux/zfs/issues/9661
*/
ASSERT(out != NULL);
if (out == NULL) {
if (ctx->gcm_remainder_len > 0) {
bcopy(blockp, ctx->gcm_copy_to,
@ -169,6 +220,11 @@ gcm_encrypt_final(gcm_ctx_t *ctx, crypto_data_t *out, size_t block_size,
void (*copy_block)(uint8_t *, uint8_t *),
void (*xor_block)(uint8_t *, uint8_t *))
{
#ifdef CAN_USE_GCM_ASM
if (ctx->gcm_use_avx == B_TRUE)
return (gcm_encrypt_final_avx(ctx, out, block_size));
#endif
const gcm_impl_ops_t *gops;
uint64_t counter_mask = ntohll(0x00000000ffffffffULL);
uint8_t *ghash, *macp = NULL;
@ -321,6 +377,11 @@ gcm_decrypt_final(gcm_ctx_t *ctx, crypto_data_t *out, size_t block_size,
int (*encrypt_block)(const void *, const uint8_t *, uint8_t *),
void (*xor_block)(uint8_t *, uint8_t *))
{
#ifdef CAN_USE_GCM_ASM
if (ctx->gcm_use_avx == B_TRUE)
return (gcm_decrypt_final_avx(ctx, out, block_size));
#endif
const gcm_impl_ops_t *gops;
size_t pt_len;
size_t remainder;
@ -526,6 +587,9 @@ gcm_init(gcm_ctx_t *ctx, unsigned char *iv, size_t iv_len,
return (CRYPTO_SUCCESS);
}
/*
* Init the GCM context struct. Handle the cycle and avx implementations here.
*/
int
gcm_init_ctx(gcm_ctx_t *gcm_ctx, char *param, size_t block_size,
int (*encrypt_block)(const void *, const uint8_t *, uint8_t *),
@ -556,11 +620,46 @@ gcm_init_ctx(gcm_ctx_t *gcm_ctx, char *param, size_t block_size,
return (CRYPTO_MECHANISM_PARAM_INVALID);
}
if (gcm_init(gcm_ctx, gcm_param->pIv, gcm_param->ulIvLen,
gcm_param->pAAD, gcm_param->ulAADLen, block_size,
encrypt_block, copy_block, xor_block) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
#ifdef CAN_USE_GCM_ASM
if (GCM_IMPL_READ(icp_gcm_impl) != IMPL_CYCLE) {
gcm_ctx->gcm_use_avx = GCM_IMPL_USE_AVX;
} else {
/*
* Handle the "cycle" implementation by creating avx and
* non-avx contexts alternately.
*/
gcm_ctx->gcm_use_avx = gcm_toggle_avx();
/*
* We don't handle byte swapped key schedules in the avx
* code path.
*/
aes_key_t *ks = (aes_key_t *)gcm_ctx->gcm_keysched;
if (ks->ops->needs_byteswap == B_TRUE) {
gcm_ctx->gcm_use_avx = B_FALSE;
}
/* Use the MOVBE and the BSWAP variants alternately. */
if (gcm_ctx->gcm_use_avx == B_TRUE &&
zfs_movbe_available() == B_TRUE) {
(void) atomic_toggle_boolean_nv(
(volatile boolean_t *)&gcm_avx_can_use_movbe);
}
}
/* Avx and non avx context initialization differs from here on. */
if (gcm_ctx->gcm_use_avx == B_FALSE) {
#endif /* ifdef CAN_USE_GCM_ASM */
if (gcm_init(gcm_ctx, gcm_param->pIv, gcm_param->ulIvLen,
gcm_param->pAAD, gcm_param->ulAADLen, block_size,
encrypt_block, copy_block, xor_block) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
}
#ifdef CAN_USE_GCM_ASM
} else {
if (gcm_init_avx(gcm_ctx, gcm_param->pIv, gcm_param->ulIvLen,
gcm_param->pAAD, gcm_param->ulAADLen, block_size) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
}
}
#endif /* ifdef CAN_USE_GCM_ASM */
return (rv);
}
@ -590,11 +689,37 @@ gmac_init_ctx(gcm_ctx_t *gcm_ctx, char *param, size_t block_size,
return (CRYPTO_MECHANISM_PARAM_INVALID);
}
if (gcm_init(gcm_ctx, gmac_param->pIv, AES_GMAC_IV_LEN,
gmac_param->pAAD, gmac_param->ulAADLen, block_size,
encrypt_block, copy_block, xor_block) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
#ifdef CAN_USE_GCM_ASM
/*
* Handle the "cycle" implementation by creating avx and non avx
* contexts alternately.
*/
if (GCM_IMPL_READ(icp_gcm_impl) != IMPL_CYCLE) {
gcm_ctx->gcm_use_avx = GCM_IMPL_USE_AVX;
} else {
gcm_ctx->gcm_use_avx = gcm_toggle_avx();
}
/* We don't handle byte swapped key schedules in the avx code path. */
aes_key_t *ks = (aes_key_t *)gcm_ctx->gcm_keysched;
if (ks->ops->needs_byteswap == B_TRUE) {
gcm_ctx->gcm_use_avx = B_FALSE;
}
/* Avx and non avx context initialization differs from here on. */
if (gcm_ctx->gcm_use_avx == B_FALSE) {
#endif /* ifdef CAN_USE_GCM_ASM */
if (gcm_init(gcm_ctx, gmac_param->pIv, AES_GMAC_IV_LEN,
gmac_param->pAAD, gmac_param->ulAADLen, block_size,
encrypt_block, copy_block, xor_block) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
}
#ifdef CAN_USE_GCM_ASM
} else {
if (gcm_init_avx(gcm_ctx, gmac_param->pIv, AES_GMAC_IV_LEN,
gmac_param->pAAD, gmac_param->ulAADLen, block_size) != 0) {
rv = CRYPTO_MECHANISM_PARAM_INVALID;
}
}
#endif /* ifdef CAN_USE_GCM_ASM */
return (rv);
}
@ -645,15 +770,6 @@ const gcm_impl_ops_t *gcm_all_impl[] = {
/* Indicate that benchmark has been completed */
static boolean_t gcm_impl_initialized = B_FALSE;
/* Select GCM implementation */
#define IMPL_FASTEST (UINT32_MAX)
#define IMPL_CYCLE (UINT32_MAX-1)
#define GCM_IMPL_READ(i) (*(volatile uint32_t *) &(i))
static uint32_t icp_gcm_impl = IMPL_FASTEST;
static uint32_t user_sel_impl = IMPL_FASTEST;
/* Hold all supported implementations */
static size_t gcm_supp_impl_cnt = 0;
static gcm_impl_ops_t *gcm_supp_impl[ARRAY_SIZE(gcm_all_impl)];
@ -685,6 +801,16 @@ gcm_impl_get_ops()
size_t idx = (++cycle_impl_idx) % gcm_supp_impl_cnt;
ops = gcm_supp_impl[idx];
break;
#ifdef CAN_USE_GCM_ASM
case IMPL_AVX:
/*
* Make sure that we return a valid implementation while
* switching to the avx implementation since there still
* may be unfinished non-avx contexts around.
*/
ops = &gcm_generic_impl;
break;
#endif
default:
ASSERT3U(impl, <, gcm_supp_impl_cnt);
ASSERT3U(gcm_supp_impl_cnt, >, 0);
@ -731,8 +857,24 @@ gcm_impl_init(void)
sizeof (gcm_fastest_impl));
}
strcpy(gcm_fastest_impl.name, "fastest");
strlcpy(gcm_fastest_impl.name, "fastest", GCM_IMPL_NAME_MAX);
#ifdef CAN_USE_GCM_ASM
/*
* Use the avx implementation if it's available and the implementation
* hasn't changed from its default value of fastest on module load.
*/
if (gcm_avx_will_work()) {
#ifdef HAVE_MOVBE
if (zfs_movbe_available() == B_TRUE) {
atomic_swap_32(&gcm_avx_can_use_movbe, B_TRUE);
}
#endif
if (GCM_IMPL_READ(user_sel_impl) == IMPL_FASTEST) {
gcm_set_avx(B_TRUE);
}
}
#endif
/* Finish initialization */
atomic_swap_32(&icp_gcm_impl, user_sel_impl);
gcm_impl_initialized = B_TRUE;
@ -744,6 +886,9 @@ static const struct {
} gcm_impl_opts[] = {
{ "cycle", IMPL_CYCLE },
{ "fastest", IMPL_FASTEST },
#ifdef CAN_USE_GCM_ASM
{ "avx", IMPL_AVX },
#endif
};
/*
@ -777,6 +922,12 @@ gcm_impl_set(const char *val)
/* Check mandatory options */
for (i = 0; i < ARRAY_SIZE(gcm_impl_opts); i++) {
#ifdef CAN_USE_GCM_ASM
/* Ignore avx implementation if it won't work. */
if (gcm_impl_opts[i].sel == IMPL_AVX && !gcm_avx_will_work()) {
continue;
}
#endif
if (strcmp(req_name, gcm_impl_opts[i].name) == 0) {
impl = gcm_impl_opts[i].sel;
err = 0;
@ -795,6 +946,18 @@ gcm_impl_set(const char *val)
}
}
}
#ifdef CAN_USE_GCM_ASM
/*
* Use the avx implementation if available and the requested one is
* avx or fastest.
*/
if (gcm_avx_will_work() == B_TRUE &&
(impl == IMPL_AVX || impl == IMPL_FASTEST)) {
gcm_set_avx(B_TRUE);
} else {
gcm_set_avx(B_FALSE);
}
#endif
if (err == 0) {
if (gcm_impl_initialized)
@ -806,7 +969,7 @@ gcm_impl_set(const char *val)
return (err);
}
#if defined(_KERNEL)
#if defined(_KERNEL) && defined(__linux__)
#include <linux/mod_compat.h>
static int
@ -826,6 +989,12 @@ icp_gcm_impl_get(char *buffer, zfs_kernel_param_t *kp)
/* list mandatory options */
for (i = 0; i < ARRAY_SIZE(gcm_impl_opts); i++) {
#ifdef CAN_USE_GCM_ASM
/* Ignore avx implementation if it won't work. */
if (gcm_impl_opts[i].sel == IMPL_AVX && !gcm_avx_will_work()) {
continue;
}
#endif
fmt = (impl == gcm_impl_opts[i].sel) ? "[%s] " : "%s ";
cnt += sprintf(buffer + cnt, fmt, gcm_impl_opts[i].name);
}
@ -842,4 +1011,562 @@ icp_gcm_impl_get(char *buffer, zfs_kernel_param_t *kp)
module_param_call(icp_gcm_impl, icp_gcm_impl_set, icp_gcm_impl_get,
NULL, 0644);
MODULE_PARM_DESC(icp_gcm_impl, "Select gcm implementation.");
#endif
#endif /* defined(__KERNEL) */
#ifdef CAN_USE_GCM_ASM
#define GCM_BLOCK_LEN 16
/*
* The openssl asm routines are 6x aggregated and need that many bytes
* at minimum.
*/
#define GCM_AVX_MIN_DECRYPT_BYTES (GCM_BLOCK_LEN * 6)
#define GCM_AVX_MIN_ENCRYPT_BYTES (GCM_BLOCK_LEN * 6 * 3)
/*
* Ensure the chunk size is reasonable since we are allocating a
* GCM_AVX_MAX_CHUNK_SIZEd buffer and disabling preemption and interrupts.
*/
#define GCM_AVX_MAX_CHUNK_SIZE \
(((128*1024)/GCM_AVX_MIN_DECRYPT_BYTES) * GCM_AVX_MIN_DECRYPT_BYTES)
/* Get the chunk size module parameter. */
#define GCM_CHUNK_SIZE_READ *(volatile uint32_t *) &gcm_avx_chunk_size
/* Clear the FPU registers since they hold sensitive internal state. */
#define clear_fpu_regs() clear_fpu_regs_avx()
#define GHASH_AVX(ctx, in, len) \
gcm_ghash_avx((ctx)->gcm_ghash, (const uint64_t (*)[2])(ctx)->gcm_Htable, \
in, len)
#define gcm_incr_counter_block(ctx) gcm_incr_counter_block_by(ctx, 1)
/*
* Module parameter: number of bytes to process at once while owning the FPU.
* Rounded down to the next GCM_AVX_MIN_DECRYPT_BYTES byte boundary and is
* ensured to be greater than or equal to GCM_AVX_MIN_DECRYPT_BYTES.
*/
static uint32_t gcm_avx_chunk_size =
((32 * 1024) / GCM_AVX_MIN_DECRYPT_BYTES) * GCM_AVX_MIN_DECRYPT_BYTES;
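As a quick sanity check on the sizes above (illustrative only, not part of the diff): GCM_AVX_MIN_DECRYPT_BYTES works out to 96 bytes, GCM_AVX_MIN_ENCRYPT_BYTES to 288 bytes, GCM_AVX_MAX_CHUNK_SIZE rounds 128 KiB down to a multiple of 96 (131040), and the default gcm_avx_chunk_size rounds 32 KiB down to 32736. A minimal C11 sketch re-deriving that arithmetic; GCM_AVX_DEFAULT_CHUNK_SIZE is a name introduced here purely for illustration:

/* Illustrative re-statement of the chunk-size arithmetic above. */
#define GCM_BLOCK_LEN			16
#define GCM_AVX_MIN_DECRYPT_BYTES	(GCM_BLOCK_LEN * 6)
#define GCM_AVX_MIN_ENCRYPT_BYTES	(GCM_BLOCK_LEN * 6 * 3)
#define GCM_AVX_MAX_CHUNK_SIZE \
	(((128*1024)/GCM_AVX_MIN_DECRYPT_BYTES) * GCM_AVX_MIN_DECRYPT_BYTES)
/* Hypothetical name for the default initializer of gcm_avx_chunk_size. */
#define GCM_AVX_DEFAULT_CHUNK_SIZE \
	(((32*1024)/GCM_AVX_MIN_DECRYPT_BYTES) * GCM_AVX_MIN_DECRYPT_BYTES)

_Static_assert(GCM_AVX_MIN_DECRYPT_BYTES == 96, "6 AES blocks");
_Static_assert(GCM_AVX_MIN_ENCRYPT_BYTES == 288, "18 AES blocks");
_Static_assert(GCM_AVX_MAX_CHUNK_SIZE == 131040,
	"128 KiB rounded down to a multiple of 96");
_Static_assert(GCM_AVX_DEFAULT_CHUNK_SIZE == 32736,
	"32 KiB rounded down to a multiple of 96");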
extern void clear_fpu_regs_avx(void);
extern void gcm_xor_avx(const uint8_t *src, uint8_t *dst);
extern void aes_encrypt_intel(const uint32_t rk[], int nr,
const uint32_t pt[4], uint32_t ct[4]);
extern void gcm_init_htab_avx(uint64_t Htable[16][2], const uint64_t H[2]);
extern void gcm_ghash_avx(uint64_t ghash[2], const uint64_t Htable[16][2],
const uint8_t *in, size_t len);
extern size_t aesni_gcm_encrypt(const uint8_t *, uint8_t *, size_t,
const void *, uint64_t *, uint64_t *);
extern size_t aesni_gcm_decrypt(const uint8_t *, uint8_t *, size_t,
const void *, uint64_t *, uint64_t *);
static inline boolean_t
gcm_avx_will_work(void)
{
/* Avx should imply aes-ni and pclmulqdq, but make sure anyhow. */
return (kfpu_allowed() &&
zfs_avx_available() && zfs_aes_available() &&
zfs_pclmulqdq_available());
}
static inline void
gcm_set_avx(boolean_t val)
{
if (gcm_avx_will_work() == B_TRUE) {
atomic_swap_32(&gcm_use_avx, val);
}
}
static inline boolean_t
gcm_toggle_avx(void)
{
if (gcm_avx_will_work() == B_TRUE) {
return (atomic_toggle_boolean_nv(&GCM_IMPL_USE_AVX));
} else {
return (B_FALSE);
}
}
/*
* Clear sensitive data in the context.
*
* ctx->gcm_remainder may contain a plaintext remainder. ctx->gcm_H and
* ctx->gcm_Htable contain the hash sub key which protects authentication.
*
* Although extremely unlikely, ctx->gcm_J0 and ctx->gcm_tmp could be used for
* a known plaintext attack; they consist of the IV and the first and last
* counter respectively. Whether they should be cleared is debatable.
*/
static inline void
gcm_clear_ctx(gcm_ctx_t *ctx)
{
bzero(ctx->gcm_remainder, sizeof (ctx->gcm_remainder));
bzero(ctx->gcm_H, sizeof (ctx->gcm_H));
bzero(ctx->gcm_Htable, sizeof (ctx->gcm_Htable));
bzero(ctx->gcm_J0, sizeof (ctx->gcm_J0));
bzero(ctx->gcm_tmp, sizeof (ctx->gcm_tmp));
}
/* Increment the GCM counter block by n. */
static inline void
gcm_incr_counter_block_by(gcm_ctx_t *ctx, int n)
{
uint64_t counter_mask = ntohll(0x00000000ffffffffULL);
uint64_t counter = ntohll(ctx->gcm_cb[1] & counter_mask);
counter = htonll(counter + n);
counter &= counter_mask;
ctx->gcm_cb[1] = (ctx->gcm_cb[1] & ~counter_mask) | counter;
}
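To make the byte layout concrete, here is a small stand-alone C sketch (not part of the diff; the helper name ctr_block_incr is hypothetical) of what gcm_incr_counter_block_by() achieves on the 16-byte counter block: only the rightmost 32 bits, stored big-endian, are incremented, wrapping modulo 2^32.

#include <stdint.h>
#include <stdio.h>

/*
 * Increment the rightmost 32 bits of a big-endian GCM counter block by n,
 * wrapping modulo 2^32; the upper 96 bits (the IV part) are left untouched.
 */
static void
ctr_block_incr(uint8_t cb[16], uint32_t n)
{
	uint32_t ctr = ((uint32_t)cb[12] << 24) | ((uint32_t)cb[13] << 16) |
	    ((uint32_t)cb[14] << 8) | (uint32_t)cb[15];

	ctr += n;	/* unsigned overflow wraps as required */
	cb[12] = (uint8_t)(ctr >> 24);
	cb[13] = (uint8_t)(ctr >> 16);
	cb[14] = (uint8_t)(ctr >> 8);
	cb[15] = (uint8_t)ctr;
}

int
main(void)
{
	uint8_t cb[16] = { 0 };

	cb[15] = 1;		/* J0 for a 12-byte IV ends in ...0001 */
	ctr_block_incr(cb, 1);	/* first data block uses counter value 2 */
	printf("%02x%02x%02x%02x\n", cb[12], cb[13], cb[14], cb[15]);
	return (0);
}

The kernel code does the same operation on the packed uint64_t representation (gcm_cb[1]) using the ntohll()/htonll() round trip and counter_mask shown above.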
/*
* Encrypt multiple blocks of data in GCM mode.
* This is done in gcm_avx_chunk_size chunks, utilizing AVX assembler routines
* if possible. While processing a chunk the FPU is "locked".
*/
static int
gcm_mode_encrypt_contiguous_blocks_avx(gcm_ctx_t *ctx, char *data,
size_t length, crypto_data_t *out, size_t block_size)
{
size_t bleft = length;
size_t need = 0;
size_t done = 0;
uint8_t *datap = (uint8_t *)data;
size_t chunk_size = (size_t)GCM_CHUNK_SIZE_READ;
const aes_key_t *key = ((aes_key_t *)ctx->gcm_keysched);
uint64_t *ghash = ctx->gcm_ghash;
uint64_t *cb = ctx->gcm_cb;
uint8_t *ct_buf = NULL;
uint8_t *tmp = (uint8_t *)ctx->gcm_tmp;
int rv = CRYPTO_SUCCESS;
ASSERT(block_size == GCM_BLOCK_LEN);
/*
* If the last call left an incomplete block, try to fill
* it first.
*/
if (ctx->gcm_remainder_len > 0) {
need = block_size - ctx->gcm_remainder_len;
if (length < need) {
/* Accumulate bytes here and return. */
bcopy(datap, (uint8_t *)ctx->gcm_remainder +
ctx->gcm_remainder_len, length);
ctx->gcm_remainder_len += length;
if (ctx->gcm_copy_to == NULL) {
ctx->gcm_copy_to = datap;
}
return (CRYPTO_SUCCESS);
} else {
/* Complete incomplete block. */
bcopy(datap, (uint8_t *)ctx->gcm_remainder +
ctx->gcm_remainder_len, need);
ctx->gcm_copy_to = NULL;
}
}
/* Allocate a buffer to encrypt to if there is enough input. */
if (bleft >= GCM_AVX_MIN_ENCRYPT_BYTES) {
ct_buf = vmem_alloc(chunk_size, ctx->gcm_kmflag);
if (ct_buf == NULL) {
return (CRYPTO_HOST_MEMORY);
}
}
/* If we completed an incomplete block, encrypt and write it out. */
if (ctx->gcm_remainder_len > 0) {
kfpu_begin();
aes_encrypt_intel(key->encr_ks.ks32, key->nr,
(const uint32_t *)cb, (uint32_t *)tmp);
gcm_xor_avx((const uint8_t *) ctx->gcm_remainder, tmp);
GHASH_AVX(ctx, tmp, block_size);
clear_fpu_regs();
kfpu_end();
/*
* We don't follow gcm_mode_encrypt_contiguous_blocks() here
* but assert that out is not null.
* See gcm_mode_encrypt_contiguous_blocks() above and
* https://github.com/zfsonlinux/zfs/issues/9661
*/
ASSERT(out != NULL);
rv = crypto_put_output_data(tmp, out, block_size);
out->cd_offset += block_size;
gcm_incr_counter_block(ctx);
ctx->gcm_processed_data_len += block_size;
bleft -= need;
datap += need;
ctx->gcm_remainder_len = 0;
}
/* Do the bulk encryption in chunk_size blocks. */
for (; bleft >= chunk_size; bleft -= chunk_size) {
kfpu_begin();
done = aesni_gcm_encrypt(
datap, ct_buf, chunk_size, key, cb, ghash);
clear_fpu_regs();
kfpu_end();
if (done != chunk_size) {
rv = CRYPTO_FAILED;
goto out_nofpu;
}
if (out != NULL) {
rv = crypto_put_output_data(ct_buf, out, chunk_size);
if (rv != CRYPTO_SUCCESS) {
goto out_nofpu;
}
out->cd_offset += chunk_size;
}
datap += chunk_size;
ctx->gcm_processed_data_len += chunk_size;
}
/* Check if we are already done. */
if (bleft == 0) {
goto out_nofpu;
}
/* Bulk encrypt the remaining data. */
kfpu_begin();
if (bleft >= GCM_AVX_MIN_ENCRYPT_BYTES) {
done = aesni_gcm_encrypt(datap, ct_buf, bleft, key, cb, ghash);
if (done == 0) {
rv = CRYPTO_FAILED;
goto out;
}
if (out != NULL) {
rv = crypto_put_output_data(ct_buf, out, done);
if (rv != CRYPTO_SUCCESS) {
goto out;
}
out->cd_offset += done;
}
ctx->gcm_processed_data_len += done;
datap += done;
bleft -= done;
}
/* Less than GCM_AVX_MIN_ENCRYPT_BYTES remain, operate on blocks. */
while (bleft > 0) {
if (bleft < block_size) {
bcopy(datap, ctx->gcm_remainder, bleft);
ctx->gcm_remainder_len = bleft;
ctx->gcm_copy_to = datap;
goto out;
}
/* Encrypt, hash and write out. */
aes_encrypt_intel(key->encr_ks.ks32, key->nr,
(const uint32_t *)cb, (uint32_t *)tmp);
gcm_xor_avx(datap, tmp);
GHASH_AVX(ctx, tmp, block_size);
if (out != NULL) {
rv = crypto_put_output_data(tmp, out, block_size);
if (rv != CRYPTO_SUCCESS) {
goto out;
}
out->cd_offset += block_size;
}
gcm_incr_counter_block(ctx);
ctx->gcm_processed_data_len += block_size;
datap += block_size;
bleft -= block_size;
}
out:
clear_fpu_regs();
kfpu_end();
out_nofpu:
if (ct_buf != NULL) {
vmem_free(ct_buf, chunk_size);
}
return (rv);
}
/*
* Finalize the encryption: Zero fill, encrypt, hash and write out an eventual
* incomplete last block. Encrypt the ICB. Calculate the tag and write it out.
*/
static int
gcm_encrypt_final_avx(gcm_ctx_t *ctx, crypto_data_t *out, size_t block_size)
{
uint8_t *ghash = (uint8_t *)ctx->gcm_ghash;
uint32_t *J0 = (uint32_t *)ctx->gcm_J0;
uint8_t *remainder = (uint8_t *)ctx->gcm_remainder;
size_t rem_len = ctx->gcm_remainder_len;
const void *keysched = ((aes_key_t *)ctx->gcm_keysched)->encr_ks.ks32;
int aes_rounds = ((aes_key_t *)keysched)->nr;
int rv;
ASSERT(block_size == GCM_BLOCK_LEN);
if (out->cd_length < (rem_len + ctx->gcm_tag_len)) {
return (CRYPTO_DATA_LEN_RANGE);
}
kfpu_begin();
/* Pad last incomplete block with zeros, encrypt and hash. */
if (rem_len > 0) {
uint8_t *tmp = (uint8_t *)ctx->gcm_tmp;
const uint32_t *cb = (uint32_t *)ctx->gcm_cb;
aes_encrypt_intel(keysched, aes_rounds, cb, (uint32_t *)tmp);
bzero(remainder + rem_len, block_size - rem_len);
for (int i = 0; i < rem_len; i++) {
remainder[i] ^= tmp[i];
}
GHASH_AVX(ctx, remainder, block_size);
ctx->gcm_processed_data_len += rem_len;
/* No need to increment counter_block, it's the last block. */
}
/* Finish tag. */
ctx->gcm_len_a_len_c[1] =
htonll(CRYPTO_BYTES2BITS(ctx->gcm_processed_data_len));
GHASH_AVX(ctx, (const uint8_t *)ctx->gcm_len_a_len_c, block_size);
aes_encrypt_intel(keysched, aes_rounds, J0, J0);
gcm_xor_avx((uint8_t *)J0, ghash);
clear_fpu_regs();
kfpu_end();
/* Output remainder. */
if (rem_len > 0) {
rv = crypto_put_output_data(remainder, out, rem_len);
if (rv != CRYPTO_SUCCESS)
return (rv);
}
out->cd_offset += rem_len;
ctx->gcm_remainder_len = 0;
rv = crypto_put_output_data(ghash, out, ctx->gcm_tag_len);
if (rv != CRYPTO_SUCCESS)
return (rv);
out->cd_offset += ctx->gcm_tag_len;
/* Clear sensitive data in the context before returning. */
gcm_clear_ctx(ctx);
return (CRYPTO_SUCCESS);
}
/*
* Finalize decryption: We just have accumulated crypto text, so now we
* decrypt it here inplace.
*/
static int
gcm_decrypt_final_avx(gcm_ctx_t *ctx, crypto_data_t *out, size_t block_size)
{
ASSERT3U(ctx->gcm_processed_data_len, ==, ctx->gcm_pt_buf_len);
ASSERT3U(block_size, ==, 16);
size_t chunk_size = (size_t)GCM_CHUNK_SIZE_READ;
size_t pt_len = ctx->gcm_processed_data_len - ctx->gcm_tag_len;
uint8_t *datap = ctx->gcm_pt_buf;
const aes_key_t *key = ((aes_key_t *)ctx->gcm_keysched);
uint32_t *cb = (uint32_t *)ctx->gcm_cb;
uint64_t *ghash = ctx->gcm_ghash;
uint32_t *tmp = (uint32_t *)ctx->gcm_tmp;
int rv = CRYPTO_SUCCESS;
size_t bleft, done;
/*
* Decrypt in chunks of gcm_avx_chunk_size, which is asserted to be
* greater than or equal to GCM_AVX_MIN_ENCRYPT_BYTES, and a multiple of
* GCM_AVX_MIN_DECRYPT_BYTES.
*/
for (bleft = pt_len; bleft >= chunk_size; bleft -= chunk_size) {
kfpu_begin();
done = aesni_gcm_decrypt(datap, datap, chunk_size,
(const void *)key, ctx->gcm_cb, ghash);
clear_fpu_regs();
kfpu_end();
if (done != chunk_size) {
return (CRYPTO_FAILED);
}
datap += done;
}
/* Decrypt remainder, which is less than chunk size, in one go. */
kfpu_begin();
if (bleft >= GCM_AVX_MIN_DECRYPT_BYTES) {
done = aesni_gcm_decrypt(datap, datap, bleft,
(const void *)key, ctx->gcm_cb, ghash);
if (done == 0) {
clear_fpu_regs();
kfpu_end();
return (CRYPTO_FAILED);
}
datap += done;
bleft -= done;
}
ASSERT(bleft < GCM_AVX_MIN_DECRYPT_BYTES);
/*
* Now less than GCM_AVX_MIN_DECRYPT_BYTES bytes remain,
* decrypt them block by block.
*/
while (bleft > 0) {
/* Incomplete last block. */
if (bleft < block_size) {
uint8_t *lastb = (uint8_t *)ctx->gcm_remainder;
bzero(lastb, block_size);
bcopy(datap, lastb, bleft);
/* The GCM processing. */
GHASH_AVX(ctx, lastb, block_size);
aes_encrypt_intel(key->encr_ks.ks32, key->nr, cb, tmp);
for (size_t i = 0; i < bleft; i++) {
datap[i] = lastb[i] ^ ((uint8_t *)tmp)[i];
}
break;
}
/* The GCM processing. */
GHASH_AVX(ctx, datap, block_size);
aes_encrypt_intel(key->encr_ks.ks32, key->nr, cb, tmp);
gcm_xor_avx((uint8_t *)tmp, datap);
gcm_incr_counter_block(ctx);
datap += block_size;
bleft -= block_size;
}
if (rv != CRYPTO_SUCCESS) {
clear_fpu_regs();
kfpu_end();
return (rv);
}
/* Decryption done, finish the tag. */
ctx->gcm_len_a_len_c[1] = htonll(CRYPTO_BYTES2BITS(pt_len));
GHASH_AVX(ctx, (uint8_t *)ctx->gcm_len_a_len_c, block_size);
aes_encrypt_intel(key->encr_ks.ks32, key->nr, (uint32_t *)ctx->gcm_J0,
(uint32_t *)ctx->gcm_J0);
gcm_xor_avx((uint8_t *)ctx->gcm_J0, (uint8_t *)ghash);
/* We are done with the FPU, restore its state. */
clear_fpu_regs();
kfpu_end();
/* Compare the input authentication tag with what we calculated. */
if (bcmp(&ctx->gcm_pt_buf[pt_len], ghash, ctx->gcm_tag_len)) {
/* They don't match. */
return (CRYPTO_INVALID_MAC);
}
rv = crypto_put_output_data(ctx->gcm_pt_buf, out, pt_len);
if (rv != CRYPTO_SUCCESS) {
return (rv);
}
out->cd_offset += pt_len;
gcm_clear_ctx(ctx);
return (CRYPTO_SUCCESS);
}
/*
* Initialize the GCM params H, Htable and the counter block. Save the
* initial counter block.
*/
static int
gcm_init_avx(gcm_ctx_t *ctx, unsigned char *iv, size_t iv_len,
unsigned char *auth_data, size_t auth_data_len, size_t block_size)
{
uint8_t *cb = (uint8_t *)ctx->gcm_cb;
uint64_t *H = ctx->gcm_H;
const void *keysched = ((aes_key_t *)ctx->gcm_keysched)->encr_ks.ks32;
int aes_rounds = ((aes_key_t *)ctx->gcm_keysched)->nr;
uint8_t *datap = auth_data;
size_t chunk_size = (size_t)GCM_CHUNK_SIZE_READ;
size_t bleft;
ASSERT(block_size == GCM_BLOCK_LEN);
/* Init H (encrypt zero block) and create the initial counter block. */
bzero(ctx->gcm_ghash, sizeof (ctx->gcm_ghash));
bzero(H, sizeof (ctx->gcm_H));
kfpu_begin();
aes_encrypt_intel(keysched, aes_rounds,
(const uint32_t *)H, (uint32_t *)H);
gcm_init_htab_avx(ctx->gcm_Htable, H);
if (iv_len == 12) {
bcopy(iv, cb, 12);
cb[12] = 0;
cb[13] = 0;
cb[14] = 0;
cb[15] = 1;
/* We need the ICB later. */
bcopy(cb, ctx->gcm_J0, sizeof (ctx->gcm_J0));
} else {
/*
* Most consumers use 12 byte IVs, so it's OK to use the
* original routines for other IV sizes, just avoid nesting
* kfpu_begin calls.
*/
clear_fpu_regs();
kfpu_end();
gcm_format_initial_blocks(iv, iv_len, ctx, block_size,
aes_copy_block, aes_xor_block);
kfpu_begin();
}
/* Openssl post increments the counter, adjust for that. */
gcm_incr_counter_block(ctx);
/* Ghash AAD in chunk_size blocks. */
for (bleft = auth_data_len; bleft >= chunk_size; bleft -= chunk_size) {
GHASH_AVX(ctx, datap, chunk_size);
datap += chunk_size;
clear_fpu_regs();
kfpu_end();
kfpu_begin();
}
/* Ghash the remainder and handle possible incomplete GCM block. */
if (bleft > 0) {
size_t incomp = bleft % block_size;
bleft -= incomp;
if (bleft > 0) {
GHASH_AVX(ctx, datap, bleft);
datap += bleft;
}
if (incomp > 0) {
/* Zero pad and hash incomplete last block. */
uint8_t *authp = (uint8_t *)ctx->gcm_tmp;
bzero(authp, block_size);
bcopy(datap, authp, incomp);
GHASH_AVX(ctx, authp, block_size);
}
}
clear_fpu_regs();
kfpu_end();
return (CRYPTO_SUCCESS);
}
#if defined(_KERNEL)
static int
icp_gcm_avx_set_chunk_size(const char *buf, zfs_kernel_param_t *kp)
{
unsigned long val;
char val_rounded[16];
int error = 0;
error = kstrtoul(buf, 0, &val);
if (error)
return (error);
val = (val / GCM_AVX_MIN_DECRYPT_BYTES) * GCM_AVX_MIN_DECRYPT_BYTES;
if (val < GCM_AVX_MIN_ENCRYPT_BYTES || val > GCM_AVX_MAX_CHUNK_SIZE)
return (-EINVAL);
snprintf(val_rounded, 16, "%u", (uint32_t)val);
error = param_set_uint(val_rounded, kp);
return (error);
}
module_param_call(icp_gcm_avx_chunk_size, icp_gcm_avx_set_chunk_size,
param_get_uint, &gcm_avx_chunk_size, 0644);
MODULE_PARM_DESC(icp_gcm_avx_chunk_size,
"How many bytes to process while owning the FPU");
#endif /* defined(__KERNEL) */
#endif /* ifdef CAN_USE_GCM_ASM */

View File

@ -916,8 +916,6 @@ crypto_decrypt_single(crypto_context_t context, crypto_data_t *ciphertext,
}
#if defined(_KERNEL)
EXPORT_SYMBOL(crypto_cipher_init_prov);
EXPORT_SYMBOL(crypto_cipher_init);
EXPORT_SYMBOL(crypto_encrypt_prov);
EXPORT_SYMBOL(crypto_encrypt);
EXPORT_SYMBOL(crypto_encrypt_init_prov);

View File

@ -0,0 +1,36 @@
Copyright (c) 2006-2017, CRYPTOGAMS by <appro@openssl.org>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain copyright notices,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of the CRYPTOGAMS nor the names of its
copyright holder and contributors may be used to endorse or
promote products derived from this software without specific
prior written permission.
ALTERNATIVELY, provided that this notice is retained in full, this
product may be distributed under the terms of the GNU General Public
License (GPL), in which case the provisions of the GPL apply INSTEAD OF
those given above.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@ -0,0 +1 @@
PORTIONS OF GCM and GHASH FUNCTIONALITY

View File

@ -0,0 +1,177 @@
Apache License
Version 2.0, January 2004
https://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS

View File

@ -0,0 +1 @@
PORTIONS OF GCM and GHASH FUNCTIONALITY

File diff suppressed because it is too large

View File

@ -0,0 +1,714 @@
# Copyright 2010-2016 The OpenSSL Project Authors. All Rights Reserved.
#
# Licensed under the Apache License 2.0 (the "License"). You may not use
# this file except in compliance with the License. You can obtain a copy
# in the file LICENSE in the source distribution or at
# https://www.openssl.org/source/license.html
#
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# March, June 2010
#
# The module implements "4-bit" GCM GHASH function and underlying
# single multiplication operation in GF(2^128). "4-bit" means that
# it uses 256 bytes per-key table [+128 bytes shared table]. GHASH
# function features so called "528B" variant utilizing additional
# 256+16 bytes of per-key storage [+512 bytes shared table].
# Performance results are for this streamed GHASH subroutine and are
# expressed in cycles per processed byte, less is better:
#
# gcc 3.4.x(*) assembler
#
# P4 28.6 14.0 +100%
# Opteron 19.3 7.7 +150%
# Core2 17.8 8.1(**) +120%
# Atom 31.6 16.8 +88%
# VIA Nano 21.8 10.1 +115%
#
# (*) comparison is not completely fair, because C results are
# for vanilla "256B" implementation, while assembler results
# are for "528B";-)
# (**) it's mystery [to me] why Core2 result is not same as for
# Opteron;
# May 2010
#
# Add PCLMULQDQ version performing at 2.02 cycles per processed byte.
# See ghash-x86.pl for background information and details about coding
# techniques.
#
# Special thanks to David Woodhouse for providing access to a
# Westmere-based system on behalf of Intel Open Source Technology Centre.
# December 2012
#
# Overhaul: aggregate Karatsuba post-processing, improve ILP in
# reduction_alg9, increase reduction aggregate factor to 4x. As for
# the latter. ghash-x86.pl discusses that it makes lesser sense to
# increase aggregate factor. Then why increase here? Critical path
# consists of 3 independent pclmulqdq instructions, Karatsuba post-
# processing and reduction. "On top" of this we lay down aggregated
# multiplication operations, triplets of independent pclmulqdq's. As
# issue rate for pclmulqdq is limited, it makes lesser sense to
# aggregate more multiplications than it takes to perform remaining
# non-multiplication operations. 2x is near-optimal coefficient for
# contemporary Intel CPUs (therefore modest improvement coefficient),
# but not for Bulldozer. Latter is because logical SIMD operations
# are twice as slow in comparison to Intel, so that critical path is
# longer. A CPU with higher pclmulqdq issue rate would also benefit
# from higher aggregate factor...
#
# Westmere 1.78(+13%)
# Sandy Bridge 1.80(+8%)
# Ivy Bridge 1.80(+7%)
# Haswell 0.55(+93%) (if system doesn't support AVX)
# Broadwell 0.45(+110%)(if system doesn't support AVX)
# Skylake 0.44(+110%)(if system doesn't support AVX)
# Bulldozer 1.49(+27%)
# Silvermont 2.88(+13%)
# Knights L 2.12(-) (if system doesn't support AVX)
# Goldmont 1.08(+24%)
# March 2013
#
# ... 8x aggregate factor AVX code path is using reduction algorithm
# suggested by Shay Gueron[1]. Even though contemporary AVX-capable
# CPUs such as Sandy and Ivy Bridge can execute it, the code performs
# sub-optimally in comparison to above mentioned version. But thanks
# to Ilya Albrekht and Max Locktyukhin of Intel Corp. we knew that
# it performs in 0.41 cycles per byte on Haswell processor, in
# 0.29 on Broadwell, and in 0.36 on Skylake.
#
# Knights Landing achieves 1.09 cpb.
#
# [1] http://rt.openssl.org/Ticket/Display.html?id=2900&user=guest&pass=guest
# Generated once from
# https://github.com/openssl/openssl/blob/5ffc3324/crypto/modes/asm/ghash-x86_64.pl
# and modified for ICP. Modifications are kept at a bare minimum to ease later
# upstream merges.
#if defined(__x86_64__) && defined(HAVE_AVX) && \
defined(HAVE_AES) && defined(HAVE_PCLMULQDQ)
.text
.globl gcm_gmult_clmul
.type gcm_gmult_clmul,@function
.align 16
gcm_gmult_clmul:
.cfi_startproc
.L_gmult_clmul:
movdqu (%rdi),%xmm0
movdqa .Lbswap_mask(%rip),%xmm5
movdqu (%rsi),%xmm2
movdqu 32(%rsi),%xmm4
.byte 102,15,56,0,197
movdqa %xmm0,%xmm1
pshufd $78,%xmm0,%xmm3
pxor %xmm0,%xmm3
.byte 102,15,58,68,194,0
.byte 102,15,58,68,202,17
.byte 102,15,58,68,220,0
pxor %xmm0,%xmm3
pxor %xmm1,%xmm3
movdqa %xmm3,%xmm4
psrldq $8,%xmm3
pslldq $8,%xmm4
pxor %xmm3,%xmm1
pxor %xmm4,%xmm0
movdqa %xmm0,%xmm4
movdqa %xmm0,%xmm3
psllq $5,%xmm0
pxor %xmm0,%xmm3
psllq $1,%xmm0
pxor %xmm3,%xmm0
psllq $57,%xmm0
movdqa %xmm0,%xmm3
pslldq $8,%xmm0
psrldq $8,%xmm3
pxor %xmm4,%xmm0
pxor %xmm3,%xmm1
movdqa %xmm0,%xmm4
psrlq $1,%xmm0
pxor %xmm4,%xmm1
pxor %xmm0,%xmm4
psrlq $5,%xmm0
pxor %xmm4,%xmm0
psrlq $1,%xmm0
pxor %xmm1,%xmm0
.byte 102,15,56,0,197
movdqu %xmm0,(%rdi)
.byte 0xf3,0xc3
.cfi_endproc
.size gcm_gmult_clmul,.-gcm_gmult_clmul
.globl gcm_init_htab_avx
.type gcm_init_htab_avx,@function
.align 32
gcm_init_htab_avx:
.cfi_startproc
vzeroupper
vmovdqu (%rsi),%xmm2
// KCF/ICP stores H in network byte order with the hi qword first
// so we need to swap all bytes, not the 2 qwords.
vmovdqu .Lbswap_mask(%rip),%xmm4
vpshufb %xmm4,%xmm2,%xmm2
vpshufd $255,%xmm2,%xmm4
vpsrlq $63,%xmm2,%xmm3
vpsllq $1,%xmm2,%xmm2
vpxor %xmm5,%xmm5,%xmm5
vpcmpgtd %xmm4,%xmm5,%xmm5
vpslldq $8,%xmm3,%xmm3
vpor %xmm3,%xmm2,%xmm2
vpand .L0x1c2_polynomial(%rip),%xmm5,%xmm5
vpxor %xmm5,%xmm2,%xmm2
vpunpckhqdq %xmm2,%xmm2,%xmm6
vmovdqa %xmm2,%xmm0
vpxor %xmm2,%xmm6,%xmm6
movq $4,%r10
jmp .Linit_start_avx
.align 32
.Linit_loop_avx:
vpalignr $8,%xmm3,%xmm4,%xmm5
vmovdqu %xmm5,-16(%rdi)
vpunpckhqdq %xmm0,%xmm0,%xmm3
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x11,%xmm2,%xmm0,%xmm1
vpclmulqdq $0x00,%xmm2,%xmm0,%xmm0
vpclmulqdq $0x00,%xmm6,%xmm3,%xmm3
vpxor %xmm0,%xmm1,%xmm4
vpxor %xmm4,%xmm3,%xmm3
vpslldq $8,%xmm3,%xmm4
vpsrldq $8,%xmm3,%xmm3
vpxor %xmm4,%xmm0,%xmm0
vpxor %xmm3,%xmm1,%xmm1
vpsllq $57,%xmm0,%xmm3
vpsllq $62,%xmm0,%xmm4
vpxor %xmm3,%xmm4,%xmm4
vpsllq $63,%xmm0,%xmm3
vpxor %xmm3,%xmm4,%xmm4
vpslldq $8,%xmm4,%xmm3
vpsrldq $8,%xmm4,%xmm4
vpxor %xmm3,%xmm0,%xmm0
vpxor %xmm4,%xmm1,%xmm1
vpsrlq $1,%xmm0,%xmm4
vpxor %xmm0,%xmm1,%xmm1
vpxor %xmm4,%xmm0,%xmm0
vpsrlq $5,%xmm4,%xmm4
vpxor %xmm4,%xmm0,%xmm0
vpsrlq $1,%xmm0,%xmm0
vpxor %xmm1,%xmm0,%xmm0
.Linit_start_avx:
vmovdqa %xmm0,%xmm5
vpunpckhqdq %xmm0,%xmm0,%xmm3
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x11,%xmm2,%xmm0,%xmm1
vpclmulqdq $0x00,%xmm2,%xmm0,%xmm0
vpclmulqdq $0x00,%xmm6,%xmm3,%xmm3
vpxor %xmm0,%xmm1,%xmm4
vpxor %xmm4,%xmm3,%xmm3
vpslldq $8,%xmm3,%xmm4
vpsrldq $8,%xmm3,%xmm3
vpxor %xmm4,%xmm0,%xmm0
vpxor %xmm3,%xmm1,%xmm1
vpsllq $57,%xmm0,%xmm3
vpsllq $62,%xmm0,%xmm4
vpxor %xmm3,%xmm4,%xmm4
vpsllq $63,%xmm0,%xmm3
vpxor %xmm3,%xmm4,%xmm4
vpslldq $8,%xmm4,%xmm3
vpsrldq $8,%xmm4,%xmm4
vpxor %xmm3,%xmm0,%xmm0
vpxor %xmm4,%xmm1,%xmm1
vpsrlq $1,%xmm0,%xmm4
vpxor %xmm0,%xmm1,%xmm1
vpxor %xmm4,%xmm0,%xmm0
vpsrlq $5,%xmm4,%xmm4
vpxor %xmm4,%xmm0,%xmm0
vpsrlq $1,%xmm0,%xmm0
vpxor %xmm1,%xmm0,%xmm0
vpshufd $78,%xmm5,%xmm3
vpshufd $78,%xmm0,%xmm4
vpxor %xmm5,%xmm3,%xmm3
vmovdqu %xmm5,0(%rdi)
vpxor %xmm0,%xmm4,%xmm4
vmovdqu %xmm0,16(%rdi)
leaq 48(%rdi),%rdi
subq $1,%r10
jnz .Linit_loop_avx
vpalignr $8,%xmm4,%xmm3,%xmm5
vmovdqu %xmm5,-16(%rdi)
vzeroupper
.byte 0xf3,0xc3
.cfi_endproc
.size gcm_init_htab_avx,.-gcm_init_htab_avx
.globl gcm_gmult_avx
.type gcm_gmult_avx,@function
.align 32
gcm_gmult_avx:
.cfi_startproc
jmp .L_gmult_clmul
.cfi_endproc
.size gcm_gmult_avx,.-gcm_gmult_avx
.globl gcm_ghash_avx
.type gcm_ghash_avx,@function
.align 32
gcm_ghash_avx:
.cfi_startproc
vzeroupper
vmovdqu (%rdi),%xmm10
leaq .L0x1c2_polynomial(%rip),%r10
leaq 64(%rsi),%rsi
vmovdqu .Lbswap_mask(%rip),%xmm13
vpshufb %xmm13,%xmm10,%xmm10
cmpq $0x80,%rcx
jb .Lshort_avx
subq $0x80,%rcx
vmovdqu 112(%rdx),%xmm14
vmovdqu 0-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm14
vmovdqu 32-64(%rsi),%xmm7
vpunpckhqdq %xmm14,%xmm14,%xmm9
vmovdqu 96(%rdx),%xmm15
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpxor %xmm14,%xmm9,%xmm9
vpshufb %xmm13,%xmm15,%xmm15
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 16-64(%rsi),%xmm6
vpunpckhqdq %xmm15,%xmm15,%xmm8
vmovdqu 80(%rdx),%xmm14
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vpxor %xmm15,%xmm8,%xmm8
vpshufb %xmm13,%xmm14,%xmm14
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vmovdqu 48-64(%rsi),%xmm6
vpxor %xmm14,%xmm9,%xmm9
vmovdqu 64(%rdx),%xmm15
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 80-64(%rsi),%xmm7
vpshufb %xmm13,%xmm15,%xmm15
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpxor %xmm1,%xmm4,%xmm4
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 64-64(%rsi),%xmm6
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vpxor %xmm15,%xmm8,%xmm8
vmovdqu 48(%rdx),%xmm14
vpxor %xmm3,%xmm0,%xmm0
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpxor %xmm4,%xmm1,%xmm1
vpshufb %xmm13,%xmm14,%xmm14
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vmovdqu 96-64(%rsi),%xmm6
vpxor %xmm5,%xmm2,%xmm2
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 128-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vmovdqu 32(%rdx),%xmm15
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpxor %xmm1,%xmm4,%xmm4
vpshufb %xmm13,%xmm15,%xmm15
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 112-64(%rsi),%xmm6
vpxor %xmm2,%xmm5,%xmm5
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vpxor %xmm15,%xmm8,%xmm8
vmovdqu 16(%rdx),%xmm14
vpxor %xmm3,%xmm0,%xmm0
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpxor %xmm4,%xmm1,%xmm1
vpshufb %xmm13,%xmm14,%xmm14
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vmovdqu 144-64(%rsi),%xmm6
vpxor %xmm5,%xmm2,%xmm2
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 176-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vmovdqu (%rdx),%xmm15
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpxor %xmm1,%xmm4,%xmm4
vpshufb %xmm13,%xmm15,%xmm15
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 160-64(%rsi),%xmm6
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x10,%xmm7,%xmm9,%xmm2
leaq 128(%rdx),%rdx
cmpq $0x80,%rcx
jb .Ltail_avx
vpxor %xmm10,%xmm15,%xmm15
subq $0x80,%rcx
jmp .Loop8x_avx
.align 32
.Loop8x_avx:
vpunpckhqdq %xmm15,%xmm15,%xmm8
vmovdqu 112(%rdx),%xmm14
vpxor %xmm0,%xmm3,%xmm3
vpxor %xmm15,%xmm8,%xmm8
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm10
vpshufb %xmm13,%xmm14,%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm11
vmovdqu 0-64(%rsi),%xmm6
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm12
vmovdqu 32-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vmovdqu 96(%rdx),%xmm15
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpxor %xmm3,%xmm10,%xmm10
vpshufb %xmm13,%xmm15,%xmm15
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vxorps %xmm4,%xmm11,%xmm11
vmovdqu 16-64(%rsi),%xmm6
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vpxor %xmm5,%xmm12,%xmm12
vxorps %xmm15,%xmm8,%xmm8
vmovdqu 80(%rdx),%xmm14
vpxor %xmm10,%xmm12,%xmm12
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpxor %xmm11,%xmm12,%xmm12
vpslldq $8,%xmm12,%xmm9
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vpsrldq $8,%xmm12,%xmm12
vpxor %xmm9,%xmm10,%xmm10
vmovdqu 48-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm14
vxorps %xmm12,%xmm11,%xmm11
vpxor %xmm1,%xmm4,%xmm4
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 80-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vpxor %xmm2,%xmm5,%xmm5
vmovdqu 64(%rdx),%xmm15
vpalignr $8,%xmm10,%xmm10,%xmm12
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpshufb %xmm13,%xmm15,%xmm15
vpxor %xmm3,%xmm0,%xmm0
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 64-64(%rsi),%xmm6
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm4,%xmm1,%xmm1
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vxorps %xmm15,%xmm8,%xmm8
vpxor %xmm5,%xmm2,%xmm2
vmovdqu 48(%rdx),%xmm14
vpclmulqdq $0x10,(%r10),%xmm10,%xmm10
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpshufb %xmm13,%xmm14,%xmm14
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vmovdqu 96-64(%rsi),%xmm6
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 128-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vpxor %xmm2,%xmm5,%xmm5
vmovdqu 32(%rdx),%xmm15
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpshufb %xmm13,%xmm15,%xmm15
vpxor %xmm3,%xmm0,%xmm0
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 112-64(%rsi),%xmm6
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm4,%xmm1,%xmm1
vpclmulqdq $0x00,%xmm7,%xmm9,%xmm2
vpxor %xmm15,%xmm8,%xmm8
vpxor %xmm5,%xmm2,%xmm2
vxorps %xmm12,%xmm10,%xmm10
vmovdqu 16(%rdx),%xmm14
vpalignr $8,%xmm10,%xmm10,%xmm12
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm3
vpshufb %xmm13,%xmm14,%xmm14
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm4
vmovdqu 144-64(%rsi),%xmm6
vpclmulqdq $0x10,(%r10),%xmm10,%xmm10
vxorps %xmm11,%xmm12,%xmm12
vpunpckhqdq %xmm14,%xmm14,%xmm9
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x10,%xmm7,%xmm8,%xmm5
vmovdqu 176-64(%rsi),%xmm7
vpxor %xmm14,%xmm9,%xmm9
vpxor %xmm2,%xmm5,%xmm5
vmovdqu (%rdx),%xmm15
vpclmulqdq $0x00,%xmm6,%xmm14,%xmm0
vpshufb %xmm13,%xmm15,%xmm15
vpclmulqdq $0x11,%xmm6,%xmm14,%xmm1
vmovdqu 160-64(%rsi),%xmm6
vpxor %xmm12,%xmm15,%xmm15
vpclmulqdq $0x10,%xmm7,%xmm9,%xmm2
vpxor %xmm10,%xmm15,%xmm15
leaq 128(%rdx),%rdx
subq $0x80,%rcx
jnc .Loop8x_avx
addq $0x80,%rcx
jmp .Ltail_no_xor_avx
.align 32
.Lshort_avx:
vmovdqu -16(%rdx,%rcx,1),%xmm14
leaq (%rdx,%rcx,1),%rdx
vmovdqu 0-64(%rsi),%xmm6
vmovdqu 32-64(%rsi),%xmm7
vpshufb %xmm13,%xmm14,%xmm15
vmovdqa %xmm0,%xmm3
vmovdqa %xmm1,%xmm4
vmovdqa %xmm2,%xmm5
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -32(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 16-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vpsrldq $8,%xmm7,%xmm7
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -48(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 48-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vmovdqu 80-64(%rsi),%xmm7
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -64(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 64-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vpsrldq $8,%xmm7,%xmm7
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -80(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 96-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vmovdqu 128-64(%rsi),%xmm7
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -96(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 112-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vpsrldq $8,%xmm7,%xmm7
subq $0x10,%rcx
jz .Ltail_avx
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vmovdqu -112(%rdx),%xmm14
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vmovdqu 144-64(%rsi),%xmm6
vpshufb %xmm13,%xmm14,%xmm15
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vmovq 184-64(%rsi),%xmm7
subq $0x10,%rcx
jmp .Ltail_avx
.align 32
.Ltail_avx:
vpxor %xmm10,%xmm15,%xmm15
.Ltail_no_xor_avx:
vpunpckhqdq %xmm15,%xmm15,%xmm8
vpxor %xmm0,%xmm3,%xmm3
vpclmulqdq $0x00,%xmm6,%xmm15,%xmm0
vpxor %xmm15,%xmm8,%xmm8
vpxor %xmm1,%xmm4,%xmm4
vpclmulqdq $0x11,%xmm6,%xmm15,%xmm1
vpxor %xmm2,%xmm5,%xmm5
vpclmulqdq $0x00,%xmm7,%xmm8,%xmm2
vmovdqu (%r10),%xmm12
vpxor %xmm0,%xmm3,%xmm10
vpxor %xmm1,%xmm4,%xmm11
vpxor %xmm2,%xmm5,%xmm5
vpxor %xmm10,%xmm5,%xmm5
vpxor %xmm11,%xmm5,%xmm5
vpslldq $8,%xmm5,%xmm9
vpsrldq $8,%xmm5,%xmm5
vpxor %xmm9,%xmm10,%xmm10
vpxor %xmm5,%xmm11,%xmm11
vpclmulqdq $0x10,%xmm12,%xmm10,%xmm9
vpalignr $8,%xmm10,%xmm10,%xmm10
vpxor %xmm9,%xmm10,%xmm10
vpclmulqdq $0x10,%xmm12,%xmm10,%xmm9
vpalignr $8,%xmm10,%xmm10,%xmm10
vpxor %xmm11,%xmm10,%xmm10
vpxor %xmm9,%xmm10,%xmm10
cmpq $0,%rcx
jne .Lshort_avx
vpshufb %xmm13,%xmm10,%xmm10
vmovdqu %xmm10,(%rdi)
vzeroupper
.byte 0xf3,0xc3
.cfi_endproc
.size gcm_ghash_avx,.-gcm_ghash_avx
.align 64
.Lbswap_mask:
.byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
.L0x1c2_polynomial:
.byte 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0xc2
.L7_mask:
.long 7,0,7,0
.L7_mask_poly:
.long 7,0,450,0
.align 64
.type .Lrem_4bit,@object
.Lrem_4bit:
.long 0,0,0,471859200,0,943718400,0,610271232
.long 0,1887436800,0,1822425088,0,1220542464,0,1423966208
.long 0,3774873600,0,4246732800,0,3644850176,0,3311403008
.long 0,2441084928,0,2376073216,0,2847932416,0,3051356160
.type .Lrem_8bit,@object
.Lrem_8bit:
.value 0x0000,0x01C2,0x0384,0x0246,0x0708,0x06CA,0x048C,0x054E
.value 0x0E10,0x0FD2,0x0D94,0x0C56,0x0918,0x08DA,0x0A9C,0x0B5E
.value 0x1C20,0x1DE2,0x1FA4,0x1E66,0x1B28,0x1AEA,0x18AC,0x196E
.value 0x1230,0x13F2,0x11B4,0x1076,0x1538,0x14FA,0x16BC,0x177E
.value 0x3840,0x3982,0x3BC4,0x3A06,0x3F48,0x3E8A,0x3CCC,0x3D0E
.value 0x3650,0x3792,0x35D4,0x3416,0x3158,0x309A,0x32DC,0x331E
.value 0x2460,0x25A2,0x27E4,0x2626,0x2368,0x22AA,0x20EC,0x212E
.value 0x2A70,0x2BB2,0x29F4,0x2836,0x2D78,0x2CBA,0x2EFC,0x2F3E
.value 0x7080,0x7142,0x7304,0x72C6,0x7788,0x764A,0x740C,0x75CE
.value 0x7E90,0x7F52,0x7D14,0x7CD6,0x7998,0x785A,0x7A1C,0x7BDE
.value 0x6CA0,0x6D62,0x6F24,0x6EE6,0x6BA8,0x6A6A,0x682C,0x69EE
.value 0x62B0,0x6372,0x6134,0x60F6,0x65B8,0x647A,0x663C,0x67FE
.value 0x48C0,0x4902,0x4B44,0x4A86,0x4FC8,0x4E0A,0x4C4C,0x4D8E
.value 0x46D0,0x4712,0x4554,0x4496,0x41D8,0x401A,0x425C,0x439E
.value 0x54E0,0x5522,0x5764,0x56A6,0x53E8,0x522A,0x506C,0x51AE
.value 0x5AF0,0x5B32,0x5974,0x58B6,0x5DF8,0x5C3A,0x5E7C,0x5FBE
.value 0xE100,0xE0C2,0xE284,0xE346,0xE608,0xE7CA,0xE58C,0xE44E
.value 0xEF10,0xEED2,0xEC94,0xED56,0xE818,0xE9DA,0xEB9C,0xEA5E
.value 0xFD20,0xFCE2,0xFEA4,0xFF66,0xFA28,0xFBEA,0xF9AC,0xF86E
.value 0xF330,0xF2F2,0xF0B4,0xF176,0xF438,0xF5FA,0xF7BC,0xF67E
.value 0xD940,0xD882,0xDAC4,0xDB06,0xDE48,0xDF8A,0xDDCC,0xDC0E
.value 0xD750,0xD692,0xD4D4,0xD516,0xD058,0xD19A,0xD3DC,0xD21E
.value 0xC560,0xC4A2,0xC6E4,0xC726,0xC268,0xC3AA,0xC1EC,0xC02E
.value 0xCB70,0xCAB2,0xC8F4,0xC936,0xCC78,0xCDBA,0xCFFC,0xCE3E
.value 0x9180,0x9042,0x9204,0x93C6,0x9688,0x974A,0x950C,0x94CE
.value 0x9F90,0x9E52,0x9C14,0x9DD6,0x9898,0x995A,0x9B1C,0x9ADE
.value 0x8DA0,0x8C62,0x8E24,0x8FE6,0x8AA8,0x8B6A,0x892C,0x88EE
.value 0x83B0,0x8272,0x8034,0x81F6,0x84B8,0x857A,0x873C,0x86FE
.value 0xA9C0,0xA802,0xAA44,0xAB86,0xAEC8,0xAF0A,0xAD4C,0xAC8E
.value 0xA7D0,0xA612,0xA454,0xA596,0xA0D8,0xA11A,0xA35C,0xA29E
.value 0xB5E0,0xB422,0xB664,0xB7A6,0xB2E8,0xB32A,0xB16C,0xB0AE
.value 0xBBF0,0xBA32,0xB874,0xB9B6,0xBCF8,0xBD3A,0xBF7C,0xBEBE
.byte 71,72,65,83,72,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
.align 64
/* Mark the stack non-executable. */
#if defined(__linux__) && defined(__ELF__)
.section .note.GNU-stack,"",%progbits
#endif
#endif /* defined(__x86_64__) && defined(HAVE_AVX) && defined(HAVE_AES) ... */


@@ -107,6 +107,11 @@ typedef union {
} aes_ks_t;
typedef struct aes_impl_ops aes_impl_ops_t;
/*
* The absolute offsets of the encr_ks (0) and nr (504) fields are hard
* coded in aesni-gcm-x86_64.S, so please don't change them (or adjust the
* assembler accordingly).
*/
typedef struct aes_key aes_key_t;
struct aes_key {
aes_ks_t encr_ks; /* encryption key schedule */
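Purely as an illustration (not part of the patch), the two hard coded offsets called out in the comment could be pinned down with a compile-time check. A minimal sketch, assuming a C11 toolchain and that the complete aes_key definition, including the nr field, is in scope (the hunk above only shows the first member):

/* Sketch only: assumes C11 and a complete aes_key_t definition. */
#include <stddef.h>

_Static_assert(offsetof(aes_key_t, encr_ks) == 0,
    "aesni-gcm-x86_64.S assumes encr_ks sits at offset 0");
_Static_assert(offsetof(aes_key_t, nr) == 504,
    "aesni-gcm-x86_64.S assumes nr sits at offset 504");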


@@ -34,6 +34,17 @@ extern "C" {
#include <sys/crypto/common.h>
#include <sys/crypto/impl.h>
/*
* Check whether the build chain supports all the instructions needed for
* the GCM assembler routines. AVX support should imply AES-NI and
* PCLMULQDQ, but verify anyhow.
*/
#if defined(__x86_64__) && defined(HAVE_AVX) && \
defined(HAVE_AES) && defined(HAVE_PCLMULQDQ)
#define CAN_USE_GCM_ASM
extern boolean_t gcm_avx_can_use_movbe;
#endif
#define ECB_MODE 0x00000002
#define CBC_MODE 0x00000004
#define CTR_MODE 0x00000008
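As a quick userland sanity check of the "verify anyhow" point above, the following sketch (not part of the module; it assumes GCC or Clang, whose __builtin_cpu_supports() accepts these feature names) reports whether the running CPU actually exposes all three features the CAN_USE_GCM_ASM guard expects:

/* Userland sketch, GCC/Clang only: check the CPU features that the
 * CAN_USE_GCM_ASM build-time guard relies on at run time. */
#include <stdio.h>

int
main(void)
{
	int avx = __builtin_cpu_supports("avx");
	int aes = __builtin_cpu_supports("aes");
	int pclmul = __builtin_cpu_supports("pclmul");

	printf("avx=%d aes=%d pclmul=%d => gcm asm usable: %s\n",
	    avx, aes, pclmul, (avx && aes && pclmul) ? "yes" : "no");
	return (0);
}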
@@ -189,13 +200,17 @@ typedef struct ccm_ctx {
*
* gcm_H: Subkey.
*
* gcm_Htable: Pre-computed and pre-shifted H, H^2, ... H^6 for the
* Karatsuba Algorithm in host byte order.
*
* gcm_J0: Pre-counter block generated from the IV.
*
* gcm_len_a_len_c: 64-bit representations of the bit lengths of
* AAD and ciphertext.
*
* gcm_kmflag: Current value of kmflag. Used only for allocating
* the plaintext buffer during decryption.
* gcm_kmflag: Current value of kmflag. Used for allocating
* the plaintext buffer during decryption and a
* gcm_avx_chunk_size'd buffer for avx enabled encryption.
*/
typedef struct gcm_ctx {
struct common_ctx gcm_common;
@@ -203,12 +218,23 @@ typedef struct gcm_ctx {
size_t gcm_processed_data_len;
size_t gcm_pt_buf_len;
uint32_t gcm_tmp[4];
/*
* The relative positions of gcm_ghash, gcm_H and pre-computed
* gcm_Htable are hard coded in aesni-gcm-x86_64.S and ghash-x86_64.S,
* so please don't change them (or adjust the assembler accordingly).
*/
uint64_t gcm_ghash[2];
uint64_t gcm_H[2];
#ifdef CAN_USE_GCM_ASM
uint64_t gcm_Htable[12][2];
#endif
uint64_t gcm_J0[2];
uint64_t gcm_len_a_len_c[2];
uint8_t *gcm_pt_buf;
int gcm_kmflag;
#ifdef CAN_USE_GCM_ASM
boolean_t gcm_use_avx;
#endif
} gcm_ctx_t;
#define gcm_keysched gcm_common.cc_keysched
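In the same spirit as the aes_key sketch earlier, the relative-position requirement described in the comment inside gcm_ctx_t could be expressed as a compile-time check. A hedged sketch, assuming C11 and that this header's gcm_ctx_t is visible; reading "relative positions" as direct adjacency is an assumption drawn from the comment, not from the assembler source:

/* Sketch only: assumes the assembler expects these fields back to back. */
#include <stddef.h>

_Static_assert(offsetof(gcm_ctx_t, gcm_H) ==
    offsetof(gcm_ctx_t, gcm_ghash) + sizeof (((gcm_ctx_t *)0)->gcm_ghash),
    "gcm_H is assumed to directly follow gcm_ghash");
#ifdef CAN_USE_GCM_ASM
_Static_assert(offsetof(gcm_ctx_t, gcm_Htable) ==
    offsetof(gcm_ctx_t, gcm_H) + sizeof (((gcm_ctx_t *)0)->gcm_H),
    "gcm_Htable is assumed to directly follow gcm_H");
#endif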


@@ -453,17 +453,19 @@ mod_hash_create_extended(
int sleep) /* whether to sleep for mem */
{
mod_hash_t *mod_hash;
size_t size;
ASSERT(hname && keycmp && hash_alg && vdtor && kdtor);
if ((mod_hash = kmem_zalloc(MH_SIZE(nchains), sleep)) == NULL)
return (NULL);
mod_hash->mh_name = kmem_alloc(strlen(hname) + 1, sleep);
size = strlen(hname) + 1;
mod_hash->mh_name = kmem_alloc(size, sleep);
if (mod_hash->mh_name == NULL) {
kmem_free(mod_hash, MH_SIZE(nchains));
return (NULL);
}
(void) strcpy(mod_hash->mh_name, hname);
(void) strlcpy(mod_hash->mh_name, hname, size);
rw_init(&mod_hash->mh_contents, NULL, RW_DEFAULT, NULL);
mod_hash->mh_sleep = sleep;


@@ -598,10 +598,12 @@ l_noret luaG_errormsg (lua_State *L) {
l_noret luaG_runerror (lua_State *L, const char *fmt, ...) {
L->runerror++;
va_list argp;
va_start(argp, fmt);
addinfo(L, luaO_pushvfstring(L, fmt, argp));
va_end(argp);
luaG_errormsg(L);
L->runerror--;
}
/* END CSTYLED */


@@ -29,6 +29,24 @@
/* Return the number of bytes available on the stack. */
#if defined (_KERNEL) && defined(__linux__)
#include <asm/current.h>
static intptr_t stack_remaining(void) {
char local;
return (intptr_t)(&local - (char *)current->stack);
}
#elif defined (_KERNEL) && defined(__FreeBSD__)
#include <sys/pcpu.h>
static intptr_t stack_remaining(void) {
char local;
return (intptr_t)(&local - (char *)curthread->td_kstack);
}
#else
static intptr_t stack_remaining(void) {
return INTPTR_MAX;
}
#endif
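Both kernel variants rely on the same idea: the thread's stack grows downward from the top of a region whose lowest address the platform exposes (current->stack on Linux, curthread->td_kstack on FreeBSD), so the address of a local variable minus that base approximates the bytes still unused below the current frame. A userland analogue, purely for illustration and assuming glibc (pthread_getattr_np() is a GNU extension):

/* Userland sketch of the same measurement; glibc-specific. */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static intptr_t stack_remaining(void) {
	pthread_attr_t attr;
	void *base;
	size_t size;
	char local;

	(void) pthread_getattr_np(pthread_self(), &attr);
	(void) pthread_attr_getstack(&attr, &base, &size);
	(void) pthread_attr_destroy(&attr);
	/* base is the lowest stack address; the stack grows down toward it. */
	return ((intptr_t)(&local - (char *)base));
}

int main(void) {
	printf("~%zd bytes of stack remaining\n", (ssize_t)stack_remaining());
	return (0);
}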
/*
** {======================================================
@@ -437,8 +455,13 @@ void luaD_call (lua_State *L, StkId func, int nResults, int allowyield) {
if (L->nCcalls == LUAI_MAXCCALLS)
luaG_runerror(L, "C stack overflow");
else if (L->nCcalls >= (LUAI_MAXCCALLS + (LUAI_MAXCCALLS>>3)))
luaD_throw(L, LUA_ERRERR); /* error while handing stack error */
luaD_throw(L, LUA_ERRERR); /* error while handling stack error */
}
intptr_t remaining = stack_remaining();
if (L->runerror == 0 && remaining < LUAI_MINCSTACK)
luaG_runerror(L, "C stack overflow");
if (L->runerror != 0 && remaining < LUAI_MINCSTACK / 2)
luaD_throw(L, LUA_ERRERR); /* error while handling stack error */
if (!allowyield) L->nny++;
if (!luaD_precall(L, func, nResults)) /* is a Lua function? */
luaV_execute(L); /* call it */


@@ -122,6 +122,12 @@ typedef LUAI_UACNUMBER l_uacNumber;
#define LUAI_MAXCCALLS 20
#endif
/*
* Minimum amount of available stack space (in bytes) required to make a C
* call. With gsub() recursion, each nested luaD_call() consumes 1256 bytes
* of stack.
*/
#define LUAI_MINCSTACK 4096
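As a rough sanity check of that figure: with LUAI_MINCSTACK at 4096 and about 1256 bytes consumed per nested luaD_call() in the gsub() case, the guard in luaD_call() fires while roughly 4096 / 1256 ≈ 3 further nested calls still fit, and the LUAI_MINCSTACK / 2 (2048-byte) threshold used while already handling an error leaves about one frame of headroom for the error path.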
/*
** maximum number of upvalues in a closure (both C and Lua). (Value
** must fit in an unsigned char.)


@@ -214,6 +214,7 @@ static void preinit_state (lua_State *L, global_State *g) {
L->nny = 1;
L->status = LUA_OK;
L->errfunc = 0;
L->runerror = 0;
}


@@ -166,6 +166,7 @@ struct lua_State {
unsigned short nCcalls; /* number of nested C calls */
lu_byte hookmask;
lu_byte allowhook;
lu_byte runerror; /* handling a runtime error */
int basehookcount;
int hookcount;
lua_Hook hook;


@@ -853,9 +853,9 @@ static void addquoted (lua_State *L, luaL_Buffer *b, int arg) {
else if (*s == '\0' || iscntrl(uchar(*s))) {
char buff[10];
if (!isdigit(uchar(*(s+1))))
sprintf(buff, "\\%d", (int)uchar(*s));
snprintf(buff, sizeof(buff), "\\%d", (int)uchar(*s));
else
sprintf(buff, "\\%03d", (int)uchar(*s));
snprintf(buff, sizeof(buff), "\\%03d", (int)uchar(*s));
luaL_addstring(b, buff);
}
else
@@ -890,11 +890,11 @@ static const char *scanformat (lua_State *L, const char *strfrmt, char *form) {
/*
** add length modifier into formats
*/
static void addlenmod (char *form, const char *lenmod) {
static void addlenmod (char *form, const char *lenmod, size_t size) {
size_t l = strlen(form);
size_t lm = strlen(lenmod);
char spec = form[l - 1];
strcpy(form + l - 1, lenmod);
strlcpy(form + l - 1, lenmod, size - (l - 1));
form[l + lm - 1] = spec;
form[l + lm] = '\0';
}
@@ -931,7 +931,7 @@ static int str_format (lua_State *L) {
lua_Number diff = n - (lua_Number)ni;
luaL_argcheck(L, -1 < diff && diff < 1, arg,
"not a number in proper range");
addlenmod(form, LUA_INTFRMLEN);
addlenmod(form, LUA_INTFRMLEN, MAX_FORMAT);
nb = str_sprintf(buff, form, ni);
break;
}
@@ -941,7 +941,7 @@ static int str_format (lua_State *L) {
lua_Number diff = n - (lua_Number)ni;
luaL_argcheck(L, -1 < diff && diff < 1, arg,
"not a non-negative number in proper range");
addlenmod(form, LUA_INTFRMLEN);
addlenmod(form, LUA_INTFRMLEN, MAX_FORMAT);
nb = str_sprintf(buff, form, ni);
break;
}
@@ -951,7 +951,7 @@
case 'a': case 'A':
#endif
case 'g': case 'G': {
addlenmod(form, LUA_FLTFRMLEN);
addlenmod(form, LUA_FLTFRMLEN, MAX_FORMAT);
nb = str_sprintf(buff, form, (LUA_FLTFRM_T)luaL_checknumber(L, arg));
break;
}


@@ -929,32 +929,4 @@ void luaV_execute (lua_State *L) {
}
}
/*
* this can live in SPL
*/
#if BITS_PER_LONG == 32
#if defined(_KERNEL) && !defined(SPL_HAS_MODDI3)
extern uint64_t __umoddi3(uint64_t dividend, uint64_t divisor);
/* 64-bit signed modulo for 32-bit machines. */
int64_t
__moddi3(int64_t n, int64_t d)
{
int64_t q;
boolean_t nn = B_FALSE;
if (n < 0) {
nn = B_TRUE;
n = -n;
}
if (d < 0)
d = -d;
q = __umoddi3(n, d);
return (nn ? -q : q);
}
EXPORT_SYMBOL(__moddi3);
#endif
#endif
/* END CSTYLED */
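For reference, the removed helper implements C's truncation-toward-zero modulo on top of the unsigned primitive: __moddi3(-7, 3), for example, negates the dividend, computes __umoddi3(7, 3) = 1, and returns -1, matching -7 % 3 in C; the divisor's sign never affects the result.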
