Compare commits

..

222 Commits

Author SHA1 Message Date
Tony Hutter a8c2b7ebc6 Tag zfs-0.7.13
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-02-22 09:47:55 -08:00
John Wren Kennedy 2af898ee24 test-runner: python3 support
Updated to be compatible with Python 2.6, 2.7, 3.5 or newer.

Reviewed-by: John Ramsden <johnramsden@riseup.net>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Wren Kennedy <john.kennedy@delphix.com>
Closes #8096
2019-02-22 09:47:34 -08:00
Gregor Kopka c32c2f17d0 Fix flake8 style warnings
Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/
through 2to3 (https://docs.python.org/2/library/2to3.html).
Checked the result and fixed:
- a 'maxint' -> 'maxsize' conversion that 2to3 missed.
- the 'cmp=' parameter for 'sorted()', replaced with a 'key=' version.
- the configparser import, wrapped in try/except since there are still
  Python 2.7 systems that lack a compatibility shim.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7925
Closes #7952
2019-02-22 09:47:34 -08:00
Tony Hutter 2254b2bbbe GCC 9.0: Fix ztest "directive argument is not a nul-terminated string"
GCC 9.0 is complaining because we're trying to print strings that
are defined like this:

.zo_pool = { 'z', 't', 'e', 's', 't', '\0' },

Fix them by making them actual strings.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8330
2019-02-22 09:47:34 -08:00
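For illustration, a minimal before/after sketch of the kind of initializer change this commit describes; the struct and field names here are stand-ins, not the actual ztest options structure.

    #include <stdio.h>

    struct opts { char zo_pool[64]; };

    /* Before: element-by-element initializer of the kind GCC 9 flags when
     * the array is later passed to a printf-style "%s" directive. */
    static struct opts opts_before = {
        .zo_pool = { 'z', 't', 'e', 's', 't', '\0' },
    };

    /* After: an ordinary string literal, which the compiler can see is
     * nul-terminated. */
    static struct opts opts_after = {
        .zo_pool = "ztest",
    };

    int main(void)
    {
        printf("%s %s\n", opts_before.zo_pool, opts_after.zo_pool);
        return 0;
    }
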
Brian Behlendorf 5c4ec382a7 Linux 5.0 compat: Fix bio_set_dev()
The Linux 5.0 kernel updated the bio_set_dev() macro so it calls the
GPL-only bio_associate_blkg() symbol, thus inadvertently making the
entire macro GPL-only.  Provide a minimal version which always assigns
the request queue's root_blkg to the bio.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8287
2019-02-22 09:47:34 -08:00
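A rough sketch (not the in-tree patch) of what such a minimal bio_set_dev() replacement can look like; the field names assume the 5.0-era struct bio/block_device layout and CONFIG_BLK_CGROUP, and the function name is illustrative.

    #include <linux/bio.h>
    #include <linux/blkdev.h>

    static inline void
    zfs_bio_set_dev(struct bio *bio, struct block_device *bdev)
    {
        bio->bi_disk = bdev->bd_disk;
        bio->bi_partno = bdev->bd_partno;
    #ifdef CONFIG_BLK_CGROUP
        /* "Always assign the request queue's root_blkg to the bio"
         * instead of calling the GPL-only bio_associate_blkg(). */
        bio->bi_blkg = bdev->bd_disk->queue->root_blkg;
    #endif
    }
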
Tony Hutter e22bfd8149 Linux 5.0 compat: Disable vector instructions on 5.0+ kernels
The 5.0 kernel no longer exports the functions we need to use vector
(SSE/SSE2/SSE3/AVX...) instructions.  Disable vector-based checksum
algorithms when building against those kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8259
2019-02-22 09:47:34 -08:00
Tony Hutter f45ad7bff6 Linux 5.0 compat: Fix SUBDIRs
The SUBDIRS make variable has been deprecated for a long time, and
was finally removed in the 5.0 kernel.  Use "M=" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8257
2019-02-22 09:47:34 -08:00
Tony Hutter 0a3a4d067a Linux 5.0 compat: Convert MS_* macros to SB_*
In the 5.0 kernel, only the mount namespace code should use the MS_*
macros. Filesystems should use the SB_* ones.

https://patchwork.kernel.org/patch/10552493/

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8264
2019-02-22 09:47:34 -08:00
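A minimal sketch of the conversion described above (the helper name is illustrative): filesystem code touching sb->s_flags switches from the MS_* constants to their SB_* counterparts.

    #include <linux/fs.h>

    static inline void
    example_set_readonly(struct super_block *sb)
    {
        sb->s_flags |= SB_RDONLY;   /* previously MS_RDONLY */
    }
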
Tony Hutter ba8024a284 Linux 5.0 compat: Use totalram_pages()
The totalram_pages variable was converted to an atomic in 5.0:

https://patchwork.kernel.org/patch/10652795/

Its value should now be read through the totalram_pages() helper
function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8263
2019-02-22 09:47:34 -08:00
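A sketch of the usual compat approach, assuming a configure check that defines HAVE_TOTALRAM_PAGES_FUNC (a hypothetical name) when the 5.0+ helper function is available:

    #include <linux/mm.h>

    #ifdef HAVE_TOTALRAM_PAGES_FUNC
    #define zfs_totalram_pages  totalram_pages()
    #else
    #define zfs_totalram_pages  totalram_pages
    #endif
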
Tony Hutter edc2675aed Linux 5.0 compat: access_ok() drops 'type' parameter
access_ok no longer needs a 'type' parameter in the 5.0 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8261
2019-02-22 09:47:34 -08:00
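A sketch of a compat wrapper, assuming a configure result named HAVE_ACCESS_OK_TYPE (hypothetical) that is defined when the pre-5.0 three-argument form is present:

    #include <linux/uaccess.h>

    #ifdef HAVE_ACCESS_OK_TYPE
    #define zfs_access_ok(type, addr, size)  access_ok(type, addr, size)
    #else
    #define zfs_access_ok(type, addr, size)  access_ok(addr, size)
    #endif
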
ilbsmart 98bb45e27a deadlock between mm_sem and tx assign in zfs_write() and page fault
The bug's time sequence:
1. Thread #1, in `zfs_write`, assigns txg "n".
2. In the same process, thread #2 takes an mmap page fault (which means
   `mm_sem` is held); `zfs_dirty_inode` fails to open a txg and waits
   for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write, but a page fault occurs inside
   `uiomove`, which needs `mm_sem`.  `mm_sem` is held by thread #2, so
   thread #1 is stuck and cannot complete, and txg "n" never completes.

So thread #1 and thread #2 are deadlocked.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Grady Wong <grady.w@xtaotech.com>
Closes #7939
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) 44f463824b dkms: Enable debuginfo option to be set with zfs sysconfig file
On some Linux distributions, the kernel module build will not
default to building with debuginfo symbols, which can make
debugging and testing difficult.

For this case, we provide a flag to override the build to force
debuginfo to be produced for the kernel module build.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Neal Gompa <ngompa@datto.com>
Co-authored-by: Simon Watson <swatson@datto.com>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Simon Watson <swatson@datto.com>
Closes #8304
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) b0d579bc55 Bump commit subject length to 72 characters
There's not really a reason to keep the subject length so short;
the limit existed to make a summary list of the git log render
nicely.  With 72 characters this still works out fine, so raise the
limit to make it easier to give slightly more descriptive change
summaries.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #8250
2019-02-22 09:47:34 -08:00
Benjamin Gentil 7e5def8ae0 zfs.8 uses wrong snapshot names in Example 15
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: Benjamin Gentil <benjamin@gentil.io>
Closes #8241
2019-02-22 09:47:34 -08:00
Tony Hutter 89019a846b Add enclosure_symlinks option to vdev_id
Add an 'enclosure_symlinks' option to vdev_id.conf.  This creates
consistently named symlinks to the enclosure devices (/dev/sg*) based
off the configuration in vdev_id.conf.  The enclosure symlinks show
up in /dev/by-enclosure/<prefix>-<channel><num>.  The links make it
easy to run sg_ses on a particular enclosure device.  The enclosure
links are created in addition to the normal /dev/disk/by-vdev links.

'enclosure_symlinks' is only valid in sas_direct configurations.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Simon Guest <simon.guest@tesujimath.org>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8194
2019-02-22 09:47:34 -08:00
Simon Guest 41f7723e9c vdev_id: new slot type ses
This extends vdev_id to support a new slot type, ses, for SCSI Enclosure
Services.  With slot type ses, the disk slot numbers are determined by
using the device slot number reported by sg_ses for the device with
matching SAS address, found by querying all available enclosures.

This is primarily of use on systems with a deficient driver omitting
support for bay_identifier in /sys/devices.  In my testing, I found that
the existing slot types of port and id were not stable across disk
replacement, so an alternative was required.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6956
2019-02-22 09:47:34 -08:00
Simon Guest 2b8c3cb0c8 vdev_id: extension for new scsi topology
On systems with SCSI rather than SAS disk topology, this change enables
the vdev_id script to match against the block device path, and therefore
create a vdev alias in /dev/disk/by-vdev.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6592
2019-02-22 09:47:34 -08:00
Olaf Faaland f325d76e96 Rename macro ZFS_MINOR due to Lustre conflict
Macro ZFS_MINOR, introduced in commit a6cc9756 to record the chosen
static minor number for /dev/zfs, conflicts with an existing macro
in Lustre.  The lustre macro (along with _MAJOR, _PATCH, _FIX) is
used to record the zfsonlinux version Lustre is being built against.

Since the Lustre macro came first, and is used in past versions of
Lustre at least as far back as 2.10, it makes sense to rename the
macro in ZFS instead of doing so in Lustre, which would require
backporting the patch.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #8195
2019-02-22 09:47:34 -08:00
Brian Behlendorf e3fb781c5f Add kernel module auto-loading
Historically a dynamic misc minor number was registered for the
/dev/zfs device in order to prevent minor number collisions.  This
was fine, but it prevented us from using kernel module auto-loading,
which requires a known, reserved value.

Resolve this issue by adding a configure test to find an available
misc minor number which can then be used in MODULE_ALIAS_MISCDEV at
build time.  By adding this alias the zfs kmod is added to the list
of known static-nodes and the systemd-tmpfiles-setup-dev service
will create a /dev/zfs character device at boot time.

This in turn allows us to update the 90-zfs.rules file to make it
aware this is a static node.  The upshot of this is that whenever a
process (zpool, zfs, zed) opens /dev/zfs the kmods will be
automatically loaded.  This even works for unprivileged users, so
there is no longer a need to manually load the modules at boot time.

As an additional bonus the zed now no longer needs to start after
the zfs-import.service since it will trigger the module load.

In the unlikely event the minor number we selected conflicts with
another out of tree unregistered minor number the code falls back
to dynamically allocating it.  In this case the modules again
must be manually loaded.

Note that due to the change in the method of registering the minor
number the zimport.sh test case may incorrectly fail when the
static node for the installed packages is created instead of the
dynamic one.  This issue will only transiently impact zimport.sh
for this single commit when we transition and are mixing and
matching methods.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
Closes #7287
2019-02-22 09:47:34 -08:00
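A sketch of the registration pattern the commit describes; ZFS_DEVICE_MINOR stands in for whatever free misc minor the configure test selects, and the file_operations here is a placeholder.

    #include <linux/module.h>
    #include <linux/miscdevice.h>
    #include <linux/fs.h>

    #define ZFS_DEVICE_MINOR    249     /* assumed free misc minor */

    static const struct file_operations zfs_fops = {
        .owner  = THIS_MODULE,
    };

    /* Would be passed to misc_register() in the module init path. */
    static struct miscdevice zfs_misc = {
        .minor  = ZFS_DEVICE_MINOR,
        .name   = "zfs",
        .fops   = &zfs_fops,
    };

    /* With a fixed minor the module can be listed as a static node and
     * auto-loaded on the first open of /dev/zfs. */
    MODULE_ALIAS_MISCDEV(ZFS_DEVICE_MINOR);
    MODULE_ALIAS("devname:zfs");
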
Ben Wolsieffer 14a5e48fb9 Use autoconf variable for C preprocessor
This fixes the build when cross-compiling, where the preprocessor might
be prefixed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Closes #8180
2019-02-22 09:47:34 -08:00
Matthew Ahrens 01937958ce OpenZFS 9577 - remove zfs_dbuf_evict_key tsd
The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary -
we can instead pass a flag down in a few places to prevent recursive
dbuf eviction. Making this change has 3 benefits:

1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing
   TSD values (by setting to NULL vs non-NULL) is expensive, and
   we do it very often.
3. According to Nexenta, the current semantics can cause a
   deadlock when concurrently calling dmu_objset_evict_dbufs()
   (which is rare today, but they are working on a "parallel
   unmount" change that triggers this more easily):

Porting Notes:
* Minor conflict with OpenZFS 9337 which has not yet been ported.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/9577
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/645
External-issue: DLPX-58547
Closes #7602
2019-02-22 09:47:34 -08:00
LOLi edb504f9db Honor --with-mounthelperdir where applicable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6962
2019-02-22 09:47:34 -08:00
LOLi 2428fbbfcf contrib/initramfs: switch to automake
Use automake to build initramfs scripts and hooks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6761
2019-02-22 09:47:33 -08:00
Tony Hutter 16d298188f Tag zfs-0.7.12
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-11-08 14:38:37 -08:00
Tony Hutter f42f8702ce Add BuildRequires gcc, make, elfutils-libelf-devel
This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files.  gcc has been a packaging requirement for
awhile now:

https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

These additional BuildRequires allow us to mock build in
Fedora 29.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Tony Hutter <hutter2@llnl.gov>
Closes #8095
Closes #8102
2018-11-08 14:38:28 -08:00
Brian Behlendorf 9e58d5ef38 Fix flake8 "invalid escape sequence 'x'" warning
From https://lintlyci.github.io/Flake8Rules/rules/W605.html

As of Python 3.6, a backslash-character pair that is not a valid
escape sequence now generates a DeprecationWarning. Although this
will eventually become a SyntaxError, that will not be for several
Python releases.

Note 'float_pobj' was simply removed from arcstat.py since it
was entirely unused.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8056
2018-11-08 14:38:28 -08:00
Brian Behlendorf 320f9de8ab ZTS: Update O_TMPFILE support check
In CentOS 7.5 the kernel provided a compatibility wrapper to support
O_TMPFILE.  This results in the test setup script correctly detecting
kernel support.  But the ZFS module was built without O_TMPFILE
support due to the non-standard CentOS kernel interface.

Handle this case by updating the setup check to fail either when
the kernel or the ZFS module fail to provide support.  The reason
will be clearly logged in the test results.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7528
2018-11-08 14:38:28 -08:00
George Melikov 262275ab26 Allow use of pool GUID as root pool
This is helpful when there are pools with the same name
but you need to use only one of them.

The main case is twin servers, where some software requires
the pools to have the same name (e.g. Proxmox).

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Igor ‘guardian’ Lidin of Moscow, Russia
Closes #8052
2018-11-08 14:38:28 -08:00
Brian Behlendorf 55f39a01e6 Fix arc_release() refcount
Update arc_release to use arc_buf_size().  This hunk was accidentally
dropped when porting compressed send/recv, 2aa34383b.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8000
2018-11-08 14:38:28 -08:00
Tim Schumacher b884768e46 Prefix all refcount functions with zfs_
Recent changes in the Linux kernel made it necessary to prefix
the refcount_add() function with zfs_ due to a name collision.

To bring the other functions in line with that and to avoid future
collisions, prefix the other refcount functions as well.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7963
2018-11-08 14:38:28 -08:00
Tim Schumacher f8f4e13776 Linux 4.19-rc3+ compat: Remove refcount_t compat
torvalds/linux@59b57717f ("blkcg: delay blkg destruction until
after writeback has finished") added a refcount_t to the blkcg
structure. Due to the refcount_t compatibility code, zfs_refcount_t
was used by mistake.

Resolve this by removing the compatibility code and replacing the
occurrences of refcount_t with zfs_refcount_t.

Reviewed-by: Franz Pletz <fpletz@fnordicwalking.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7885
Closes #7932
2018-11-08 14:38:28 -08:00
Gregor Kopka 5f07d51751 Zpool iostat: remove latency/queue scaling
Bandwidth and iops are average per second while *_wait are averages
per request for latency or, for queue depths, an instantaneous
measurement at the end of an interval (according to man zpool).

When calculating the first two it makes sense to do
x/interval_duration (x being the increase in total bytes or number of
requests over the duration of the interval, interval_duration in
seconds) to 'scale' from amount/interval_duration to amount/second.

But applying the same math to the latter (*_wait latencies/queues) is
wrong, as there is no interval_duration component in those values
(they are time/requests, yielding average_time/request, or already an
absolute number).

Because of this bug, the only correct continuous *_wait figures for
both latencies and queue depths from 'zpool iostat -l/q' are those
with duration=1, where the wrong math cancels itself out (x/1 is a
no-op).

This removes temporal scaling from latency and queue depth figures.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7945
Closes #7694
2018-11-08 14:38:28 -08:00
Brian Behlendorf b2f003c4f4 Fix statfs(2) for 32-bit user space
When handling a 32-bit statfs() system call the returned fields,
although 64-bit in the kernel, must be limited to 32-bits or an
EOVERFLOW error will be returned.

This is less of an issue for block counts since the default
reported block size is 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.

Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7927
Closes #7122
Closes #7937
2018-11-08 14:38:28 -08:00
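A standalone sketch of the scaling and capping idea described above; the helper names are illustrative, not the in-tree functions.

    #include <stdint.h>

    /* Double the reported block size until the block counts fit into
     * the 32-bit statfs fields. */
    static void
    statfs_fit_32bit(uint64_t *bsize, uint64_t *blocks, uint64_t *bfree,
        uint64_t *bavail)
    {
        while (*blocks > UINT32_MAX || *bfree > UINT32_MAX ||
            *bavail > UINT32_MAX) {
            *bsize <<= 1;
            *blocks >>= 1;
            *bfree >>= 1;
            *bavail >>= 1;
        }
    }

    /* File counts have no unit to scale, so they are simply capped. */
    static uint64_t
    statfs_cap_files(uint64_t files)
    {
        return (files > UINT32_MAX ? UINT32_MAX : files);
    }
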
Olaf Faaland 9014da2b01 Skip import activity test in more zdb code paths
Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314@163.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7797
Closes #7801
2018-11-08 14:38:28 -08:00
Matthew Ahrens 45579c9515 Reduce taskq and context-switch cost of zio pipe
When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736
2018-11-08 14:38:28 -08:00
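A sketch of the dispatch change described above, with the ZFS-internal zio type left opaque and a hypothetical stage-runner helper standing in for the pipeline:

    typedef struct zio zio_t;                       /* opaque here */
    extern zio_t *zio_pipeline_advance(zio_t *);    /* hypothetical helper */

    static void
    zio_execute_sketch(zio_t *zio)
    {
        /* Instead of a taskq_dispatch() per child zio, keep running
         * whatever zio_done() hands back in the same taskq entry. */
        while (zio != NULL)
            zio = zio_pipeline_advance(zio);
    }
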
Tom Caputi b32f1279d4 Fix race in dnode_check_slots_free()
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7147
Closes #7388
2018-11-08 14:38:28 -08:00
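A simplified sketch of the additional check described above, using stand-in types; dn_dirty_txg is the "last dirtied in" txg the patch starts tracking.

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        int         dn_type;        /* 0 stands in for DMU_OT_NONE */
        uint64_t    dn_dirty_txg;   /* last txg this dnode was dirtied in */
    } dnode_sketch_t;

    static bool
    dnode_slot_reclaimable(const dnode_sketch_t *dn, uint64_t last_synced_txg)
    {
        /* A free type alone is not enough: the dnode must also not have
         * been dirtied in a txg that has not finished syncing. */
        return (dn->dn_type == 0 && dn->dn_dirty_txg <= last_synced_txg);
    }
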
Tony Hutter 1b0cd07131 Tag zfs-0.7.11
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-13 10:13:41 -07:00
Dr. András Korn 8c6867dae4 tx_waited -> tx_dirty_delayed in trace_dmu.h
This change was missed in 0735ecb334.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: András Korn <korn-github.com@elan.rulez.org>
Closes #7096
2018-09-13 10:12:22 -07:00
Tony Hutter 99310c0aa0 Revert "zpool reopen should detect expanded devices"
This reverts commit 2a16d4cfaf.

The commit was causing an "attempt to access beyond the end
of device" error:

list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032217.html
2018-09-13 10:11:42 -07:00
Tony Hutter d126980e5f Tag zfs-0.7.10
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-05 10:37:32 -07:00
Chris Siebenmann 88ef5b238b Correctly handle errors from kern_path
As a regular kernel function, kern_path() returns errors as negative
errnos, such as -ELOOP. zfsctl_snapdir_vget() must convert these into
the positive errnos used throughout the ZFS code when it returns them
to other ZFS functions so that the ZFS code properly sees them as
errors.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Siebenmann <cks.git01@cs.toronto.edu>
Closes #7764
Closes #7864
2018-07-06 02:46:51 -07:00
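A sketch of the sign convention being fixed (the wrapper name is illustrative): kern_path() returns negative errnos, while ZFS code expects positive ones.

    #include <linux/namei.h>

    static int
    zfsctl_lookup_example(const char *name, struct path *path)
    {
        int error = kern_path(name, LOOKUP_FOLLOW, path);

        /* -ELOOP becomes ELOOP, etc.; success (0) is unchanged. */
        return (error < 0 ? -error : error);
    }
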
Georgy Yakovlev 30d8b85702 Fix build with CONFIG_GCC_PLUGIN_RANDSTRUCT
fs/zfs/zfs/metaslab.c:1055:2: error: positional initialization of field
in ‘struct’ declared with ‘designated_init’ attribute
[-Werror=designated-init]
  metaslab_rt_remove,

Signed-off-by: Georgy Yakovlev <ya@sysdump.net>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes: #7069
2018-07-06 02:46:51 -07:00
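A self-contained sketch of the change the error above calls for, using a stand-in ops table; the real fix converts the metaslab range-tree ops in metaslab.c.

    struct example_ops {
        void (*op_create)(void *);
        void (*op_destroy)(void *);
        void (*op_add)(void *);
        void (*op_remove)(void *);
    };

    static void ex_create(void *p)  { (void)p; }
    static void ex_destroy(void *p) { (void)p; }
    static void ex_add(void *p)     { (void)p; }
    static void ex_remove(void *p)  { (void)p; }

    /* Positional initialization, rejected when the struct carries the
     * designated_init attribute (CONFIG_GCC_PLUGIN_RANDSTRUCT):
     *
     *   static struct example_ops ops = { ex_create, ex_destroy, ex_add, ex_remove };
     *
     * Designated initialization, accepted: */
    static struct example_ops ops = {
        .op_create  = ex_create,
        .op_destroy = ex_destroy,
        .op_add     = ex_add,
        .op_remove  = ex_remove,
    };
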
Tom Caputi 45f0437912 Fix 'zfs recv' of non large_dnode send streams
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that a value of 0 is treated as
DNODE_MIN_SLOTS.

Tested-by:  DHE <git@dehacked.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7617
Closes #7662
2018-07-06 02:46:51 -07:00
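A minimal sketch of the normalization described above; DNODE_MIN_SLOTS is defined here as a literal for self-containment.

    #define DNODE_MIN_SLOTS 1

    /* Older senders without DMU_BACKUP_FEATURE_LARGE_DNODE always send
     * drr_dn_slots == 0; treat that as the minimum dnode size. */
    static inline int
    recv_dnode_slots(int drr_dn_slots)
    {
        return (drr_dn_slots != 0 ? drr_dn_slots : DNODE_MIN_SLOTS);
    }
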
Tom Caputi dc3eea871a Fix object reclaim when using large dnodes
Currently, when the receive_object() code wants to reclaim an
object, it always assumes that the dnode is the legacy 512 bytes,
even when the incoming bonus buffer exceeds this length. This
causes a buffer overflow if --enable-debug is not provided and
triggers an ASSERT if it is. This patch resolves this issue and
adds an ASSERT to ensure this can't happen again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7097
Closes #7433
2018-07-06 02:46:51 -07:00
Tim Chase d2c8103a68 Fix problems receiving reallocated dnodes
This is a port of 047116ac - Raw sends must be able to decrease nlevels,
to the zfs-0.7-stable branch.  It includes the various fixes to the
problem of receiving incremental streams which include reallocated dnodes
in which the number of dnode slots has changed, but excludes the parts
related to raw streams.

From 047116ac:

    Currently, when a raw zfs send file includes a
    DRR_OBJECT record that would decrease the number of
    levels of an existing object, the object is reallocated
    with dmu_object_reclaim() which creates the new dnode
    using the old object's nlevels. For non-raw sends this
    doesn't really matter, but raw sends require that
    nlevels on the receive side match that of the send
    side so that the checksum-of-MAC tree can be properly
    maintained. This patch corrects the issue by freeing
    the object completely before allocating it again in
    this case.

    This patch also corrects several issues with
    dnode_hold_impl() and related functions that prevented
    dnodes (particularly multi-slot dnodes) from being
    reallocated properly due to the fact that existing
    dnodes were not being fully cleaned up when they
    were freed.

    This patch adds a test to make sure that zfs recv
    functions properly with incremental streams containing
    dnodes of different sizes.

This also includes a one-liner fix from loli10K to fix a test failure:
https://github.com/zfsonlinux/zfs/pull/7792#discussion_r212769264

Authored-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Ported-by: Tim Chase <tim@chase2k.com>

Closes #6821
Closes #6864

NOTE: This is the first of 3 related patches ported to the
zfs-0.7-release branch of ZoL.  The other two patches should
immediately follow this one.
2018-07-06 02:46:51 -07:00
Joao Carlos Mendes Luis 3ea1f7f193 Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of truncation compiler warnings that show up
on Fedora 28 (GCC 8.0.1).

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7368
Closes #7826
Closes #7830
2018-07-06 02:46:51 -07:00
LOLi 4356dd23a9 Fix libaio-devel requirement for Debian-based distributions
BuildRequires tags for "-devel" packages in the RPM spec file do not
work when building on Debian-based distributions.

Fix this issue by making this requirement conditional to RPM-based
distributions.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7829
Closes #7831
2018-07-06 02:46:51 -07:00
Brian Behlendorf 75318ec497 Add libaio-devel BuildRequires
The zfs-test package needs a build requirement on the libaio-devel
package.  Without it ./configure will correctly determine that
mmap_libaio cannot be built and it will be skipped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7821
Closes #7824
2018-07-06 02:46:51 -07:00
Brian Behlendorf c1629734ab Add missing zfs-dracut RPM dependencies
The zfs-dracut package requires the hostid, basename, head, awk,
and grep utilities be installed.  The first three are provided by
coreutils but additional dependencies are required for awk and grep.

Reviewed-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7729
Closes #7747
2018-07-06 02:46:51 -07:00
DeHackEd 778290d5bc Don't modify argv[] in user tools
argv[] gets modified during string parsing for input arguments. This
is reflected in the live process listing. Don't do that.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7760
2018-07-06 02:46:51 -07:00
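A sketch of the pattern behind the fix: tokenize a private copy of the argument so strtok() never writes NUL bytes into argv[] itself (function name illustrative).

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void
    parse_list_arg(const char *arg)
    {
        char *copy = strdup(arg);   /* argv[] stays intact */
        char *tok;

        if (copy == NULL)
            return;

        for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ","))
            printf("token: %s\n", tok);

        free(copy);
    }
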
LOLi 98bc8e0b23 Fix arcstat.py handling of unsupported options
This change allows the arcstat.py script to handle unsupported options
gracefully and print both error and usage messages when one such option
is provided.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7799
2018-07-06 02:46:51 -07:00
LOLi caafa436eb Allow inherited properties in zfs_check_settable()
This change modifies how the 'checksum' and 'dedup' properties are
verified in zfs_check_settable(), handling the case where they are
explicitly inherited in the dataset hierarchy when receiving a
recursive send stream.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7755
Closes #7576
Closes #7757
2018-07-06 02:46:51 -07:00
LOLi fe8de1c8a6 Fix zfs incremental send remove '-o' properties
When receiving an incremental send stream with intermediary snapshots
zfs_receive_one() does not correctly identify the top-level dataset:
consequently we restore said snapshots as if they were children
datasets in the hierarchy, forcing inheritance of any property received
with 'zfs send -o' and effectively removing any locally set value.

The test case did not correctly verify this situation because it uses
adjacent snapshots, basically testing 'zfs send -i' instead of
'zfs send -I': this commit adds an additional intermediary snapshot to
the test script.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7478
2018-07-06 02:46:51 -07:00
Toomas Soome 1bd93ea1e0 OpenZFS 8906 - uts: illumos rootfs should support salted cksum
Porting notes:
* As of grub-2.02 these checksums are not supported.  However, as
  pointed out in #6501 there are alternatives such as EFISTUB which
  work and have no such restriction.  A warning was added to the
  checksum property section of the zfs.8 man page.

Authored by: Toomas Soome <tsoome@me.com>
Reviewed by: C Fraire <cfraire@me.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/8906
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7dec52f
Closes #6501
Closes #7714
2018-07-06 02:46:51 -07:00
Brian Behlendorf 6857950e46 Fix zpl_mount() deadlock
Commit 93b43af10 inadvertently introduced the following scenario which
can result in a deadlock.  This issue was most easily reproduced by
LXD containers using a ZFS storage backend but should be reproducible
under any workload which is frequently mounting and unmounting.

-- THREAD A --
spa_sync()
  spa_sync_upgrades()
    rrw_enter(&dp->dp_config_rwlock, RW_WRITER, FTAG); <- Waiting on B

-- THREAD B --
mount_fs()
  zpl_mount()
    zpl_mount_impl()
      dmu_objset_hold()
        dmu_objset_hold_flags()
          dsl_pool_hold()
            dsl_pool_config_enter()
              rrw_enter(&dp->dp_config_rwlock, RW_READER, tag);
    sget()
      sget_userns()
        grab_super()
          down_write(&s->s_umount); <- Waiting on C

-- THREAD C --
cleanup_mnt()
  deactivate_super()
    down_write(&s->s_umount);
    deactivate_locked_super()
      zpl_kill_sb()
        kill_anon_super()
          generic_shutdown_super()
            sync_filesystem()
              zpl_sync_fs()
                zfs_sync()
                  zil_commit()
                    txg_wait_synced() <- Waiting on A

Reviewed by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7598 
Closes #7659 
Closes #7691 
Closes #7693
2018-07-06 02:46:51 -07:00
Brian Behlendorf 716ce2b89e Fix kernel unaligned access on sparc64
Update the SA_COPY_DATA macro to check if the architecture supports
efficient unaligned memory accesses at compile time.  Otherwise
fall back to using the sa_copy_data() function.

The kernel provided CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
used to determine availability in kernel space.  In user space
the x86_64, x86, powerpc, and sometimes arm architectures will
define the HAVE_EFFICIENT_UNALIGNED_ACCESS macro.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7642
Closes #7684
2018-07-06 02:46:51 -07:00
Troels Nørgaard 9daae583d8 Default ashift for Amazon EC2 NVMe devices
Add a default 4 KiB ashift for Amazon EC2 NVMe devices on instances with
NVMe ephemeral devices, such as the types c5d, f1, i3 and m5d.
As per the official documentation [1] a 4096 byte blocksize should be
used to match the underlying hardware.

The string was identified via:

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/
    storage-optimized-instances.html
    Retrieved 2018-07-03

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Troels Nørgaard <tnn@tradeshift.com>
Closes #7676
2018-07-06 02:46:51 -07:00
Brian Behlendorf b5ee3df776 Linux 4.14 compat: blk_queue_stackable()
The blk_queue_stackable() function was replaced in the 4.14 kernel
by queue_is_rq_based(), commit torvalds/linux@5fdee212.  This change
resulted in the default elevator being used which can negatively
impact performance.

Rather than adding additional compatibility code to detect the
new interface, unconditionally attempt to set the elevator.  Since
we expect this to fail for block devices without an elevator the
error message has been moved into zfs_dbgmsg().

Finally, it was observed that elevator_change() was removed
from the 4.12 kernel, commit torvalds/linux@c033269.  Update the
comment to clearly specify which kernels are expected to export the
elevator_change() symbol.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7645
2018-07-06 02:46:51 -07:00
Tony Hutter 17cd9a8e0c Add pool state /proc entry, "SUSPENDED" pools
1. Add a proc entry to display the pool's state:

$ cat /proc/spl/kstat/zfs/tank/state
ONLINE

This is done without using the spa config locks, so it will
never hang.

2. Fix 'zpool status' and 'zpool list -o health' output to print
"SUSPENDED" instead of "ONLINE" for suspended pools.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7331
Closes #7563
2018-07-06 02:46:51 -07:00
Sara Hartse 2a16d4cfaf zpool reopen should detect expanded devices
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the efi parition and partition alignment) and therefore detect
expanded space.

Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes #7546
Issue #7582
2018-07-06 02:46:51 -07:00
Antonio Russo 3350a33908 Support Debian DKMS builds
scripts/dkms.mkconf calls configure with
`--with-linux=${kernel_source_dir}`, but Debian puts it kernel source at
`/lib/modules/<version>/source`. This patch adds the same logic to the
DKMS file produced by `scripts/dkms.mkconf` that Debian has shipped in
its official ZFS packaging: at DKMS build time, it checks if the system
is a Debian system, and adjusts the path accordingly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7358 
Closes #7540 
Closes #7554
2018-07-06 02:46:51 -07:00
Olaf Faaland 3eef58c9b6 module param callbacks check for initialized spa
Callbacks provided for module parameters are executed both
when a user alters a parameter via sysfs after the module is loaded,
e.g.
	echo bar > /sys/modules/zfs/parameters/foo

and when the module is loaded with an argument, e.g.
	modprobe zfs foo=bar

In the latter case, the init functions likely have not run yet,
including spa_init(), which initializes the namespace lock so it is
safe to use.

Instead of immediately taking the namespace lock and attempting to
iterate over initialized spa structures, check whether spa_mode_global
is nonzero.  This is set by spa_init() after it has initialized the
namespace lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7496 
Closes #7521
2018-07-06 02:46:51 -07:00
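A sketch of the guard described above; the parameter callback and its name are illustrative, and the extern declaration is a simplified stand-in for the ZFS global that spa_init() sets once initialization is complete.

    #include <linux/moduleparam.h>

    extern int spa_mode_global;     /* simplified stand-in declaration */

    static int
    zfs_example_param_set(const char *val, const struct kernel_param *kp)
    {
        int error = param_set_uint(val, kp);

        if (error < 0)
            return (error);

        /* Module-load path: spa_init() has not run yet, so the
         * namespace lock is not initialized; just store the value. */
        if (spa_mode_global == 0)
            return (0);

        /* Safe to take spa_namespace_lock and walk the pools here. */
        return (0);
    }
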
Brian Behlendorf 4805781c74 Trim new line from zfs_vdev_scheduler
Add a helper function to trim the trailing newline.  While we're
here, use this new hook to immediately apply the new scheduler.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3356 
Closes #6573
2018-07-06 02:46:51 -07:00
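A sketch of what such a trim helper amounts to (the name is illustrative):

    #include <string.h>

    static char *
    strtrim_newline(char *s)
    {
        size_t len = strlen(s);

        if (len > 0 && s[len - 1] == '\n')
            s[len - 1] = '\0';
        return (s);
    }
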
Chunwei Chen b06f40ea9b Fix ENOSPC in "Handle zap_add() failures in ..."
Commit cc63068 caused an ENOSPC error when copying a large number of
files between two directories. The reason is that the patch limits zap
leaf expansion to 2 retries, and returns ENOSPC when that fails.

The intent of limiting retries is to prevent pointlessly growing the
table to max size when adding a block full of entries with the same
name in different case in mixed mode. However, it turns out we cannot
use any limit on the retries. When we copy files from one directory in
readdir order, we are copying in hash order, one leaf block at a time.
This means that if the leaf block in the source directory has expanded
6 times, and you copy those entries in that block, then by the time you
need to expand the leaf in the destination directory you need to expand
it 6 times in one go. So any limit on the retries will result in an
error where there shouldn't be one.

Note that while we do use different salt for different directories, it
seems that the salt/hash function doesn't provide enough randomization
to the hash distance to prevent this from happening.

Since cc63068 has already been reverted. This patch adds it back and
removes the retry limit.

Also, as it turns out, failing on zap_add() has a serious side effect
for mzap_upgrade(). When upgrading from micro zap to fat zap, it will
call zap_add() to transfer entries one at a time. If it hits any error
halfway through, the remaining entries will be lost, causing those
files to become orphans. This patch adds a VERIFY to catch it.

Reviewed-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7401 
Closes #7421
2018-07-06 02:46:51 -07:00
Olaf Faaland 6b5cc49d81 Fix divide-by-zero in mmp_delay_update()
vdev_count_leaves() in the denominator may return 0, caught by Coverity.
Introduced by

* 533ea04 Update mmp_delay on sync or skipped, failed write

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7391
2018-07-06 02:46:51 -07:00
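A sketch of the guard: clamp the leaf-vdev count before dividing (the function name is illustrative).

    #include <stdint.h>

    static uint64_t
    mmp_interval_per_leaf(uint64_t multihost_interval_ns, uint64_t leaf_count)
    {
        /* vdev_count_leaves() can return 0; never divide by it directly. */
        if (leaf_count == 0)
            leaf_count = 1;
        return (multihost_interval_ns / leaf_count);
    }
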
Prakash Surya ef7a79488a OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, its `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics was
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issued`, leading to the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issued` can achieve the semantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-07-06 02:46:51 -07:00
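A sketch of the new calling convention, with the ZFS-internal tx type left opaque; only the flag and function names come from the commit text, and the flag values are illustrative placeholders.

    typedef struct dmu_tx dmu_tx_t;                 /* opaque here */
    extern int dmu_tx_assign(dmu_tx_t *tx, uint64_t txg_how);

    #define TXG_NOWAIT      (1ULL << 0)     /* illustrative values */
    #define TXG_WAIT        (1ULL << 1)
    #define TXG_NOTHROTTLE  (1ULL << 2)

    static int
    zil_lwb_assign_sketch(dmu_tx_t *tx)
    {
        /* Achieve TXG_WAIT semantics while skipping the dirty-delay
         * throttling penalty. */
        return (dmu_tx_assign(tx, TXG_WAIT | TXG_NOTHROTTLE));
    }

    static int
    legacy_txg_waited_sketch(dmu_tx_t *tx)
    {
        /* Equivalent of the old TXG_WAITED behaviour. */
        return (dmu_tx_assign(tx, TXG_NOWAIT | TXG_NOTHROTTLE));
    }
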
Brian Behlendorf a2f759146d Linux compat 4.18: check_disk_size_change()
Added support for the bops->check_events() interface which was
added in the 2.6.38 kernel to replace bops->media_changed().
Fully implementing this functionality allows the volume resize
code to rely on revalidate_disk(), which is the preferred
mechanism, and removes the need to use check_disk_size_change().

In order for bops->check_events() to lookup the zvol_state_t
stored in the disk->private_data the zvol_state_lock needs to
be held.  Since the check events interface may poll, the mutex
has been converted to a rwlock for better concurrency.  The
rwlock need only be taken as a writer in the zvol_free() path
when disk->private_data is set to NULL.

The configure checks for the block_device_operations structure
were consolidated in a single kernel-block-device-operations.m4
file.

The ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS configure checks
and associated dead code was removed.  This interface was added
to the 2.6.28 kernel which predates the oldest supported 2.6.32
kernel and will therefore always be available.

Updated maximum Linux version in META file.  The 4.17 kernel
was released on 2018-06-03 and ZoL is compatible with the
finalized kernel.

Reviewed-by: Boris Protopopov <boris.protopopov@actifio.com>
Reviewed-by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7611
2018-07-06 02:46:51 -07:00
Brian Behlendorf f79c0de208 Linux 4.18 compat: inode timespec -> timespec64
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members from timespec's to timespec64's to make them
2038 safe.  As part of this change the current_time() function was
also updated to return the timespec64 type.

Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode.  It should
be used when working with inode timestamps to ensure matching types.

The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t.  Rather than incorrectly
defining this type, all timespec_t types have been replaced by the
new inode_timespec_t type.

Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other.  They define as appropriate for the context several
constants as macros and include static inline implementation of
gethrestime(), gethrestime_sec(), and gethrtime().

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7643
Backported-by: Richard Yao <ryao@gentoo.org>
2018-07-06 02:46:51 -07:00
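A sketch of the type alias described above; the configure-result macro name here is an assumption, not necessarily the in-tree one.

    #include <linux/fs.h>

    #if defined(HAVE_INODE_TIMESPEC64_TIMES)    /* assumed configure result */
    typedef struct timespec64 inode_timespec_t;
    #else
    typedef struct timespec inode_timespec_t;
    #endif
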
Boris Protopopov 1667816089 zv_suspend_lock in zvol_open()/zvol_release()
Acquire zv_suspend_lock on first open and last close only.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Closes #6342
2018-07-06 02:46:51 -07:00
Tony Hutter d1ed1be3cd Tag zfs-0.7.9
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-05-08 13:33:38 -07:00
Tony Hutter e749242a99 Remove DEBUG_STACKFLAGS to bypass compiler error
'Support -fsanitize=address with --enable-asan' (fed9035) removed
DEBUG_STACKFLAGS="-fstack-check" from zfs-build.m4 in master.
However, that's too heavyweight a patch to merge into the 0.7.x branch,
so just take the one-liner we need to get around a compiler error
on Fedora 28:

$ ./configure --enable-debug --enable-debuginfo && make pkg-utils
  CC       gethrtime.lo
cc1: error: '-fstack-check=' and '-fstack-clash-protection' are mutually
exclusive.  Disabling '-fstack-check=' [-Werror]

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>

Requires-spl: #701
2018-05-07 17:19:58 -07:00
Tony Hutter 9267ef84fd Fedora 28: Add BuildRequires: libtirpc-devel
Add "BuildRequires: libtirpc-devel" to fix mock builds on Fedora 28.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7494
Closes #7495
2018-05-07 17:19:57 -07:00
Brian Behlendorf 0ee129199f RHEL 7.5 compat: FMODE_KABI_ITERATE
As of RHEL 7.5 the mainline fops.iterate() method was added to
the file_operations structure and is correctly detected by the
configure script.

Normally this is what we want, but in order to maintain KABI
compatibility the RHEL change additionally does the following:

* Requires that callers intending to use this extended interface
  set the FMODE_KABI_ITERATE flag on the file structure when
  opening the directory.
* Adds the fops.iterate() method to the end of the structure,
  without removing fops.readdir().

This change updates the configure check to ignore the RHEL 7.5+
variant of fops.iterate() when detected.  Instead fallback to
the fops.readdir() interface which will be available.

Finally, add the 'zpl_' prefix to the directory context wrappers
to avoid colliding with the kernel provided symbols when both
the fops.iterate() and fops.readdir() are provided by the kernel.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7460
Closes #7463
2018-05-07 17:19:57 -07:00
George Melikov 245be00597 Add back iostat -y or -w descriptions
The iostat -y and -w descriptions were left out in cda0317e;
get them back.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #7479
Closes #7483
2018-05-07 17:19:57 -07:00
Antonio Russo c38d702330 Add test with two kinds of file creation orders
Data loss was identified in #7401 when many small files were copied.
This adds a reproducer for this bug and other similar ones: randomly
generate N files. Then, listing M of them by `ls -U` order, produce
those same files in a directory of the same name.

This triggers the bug consistently, provided N and M are large enough.
Here, N=2^16 and M=2^13.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7411
2018-05-07 17:19:57 -07:00
Seth Forshee 3f729907c8 Allow mounting datasets more than once
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with its own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes #5796
Closes #7207
2018-05-07 17:19:57 -07:00
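A sketch of the sget() test callback described above, using simplified stand-ins for the ZFS-internal types:

    #include <linux/fs.h>

    typedef struct objset objset_t;                 /* opaque stand-in */
    typedef struct { objset_t *z_os; } zfsvfs_t;    /* simplified stand-in */

    /* A superblock matches when it was created for the same objset, so a
     * second mount of the dataset can reuse it instead of failing. */
    static int
    zpl_test_super(struct super_block *s, void *data)
    {
        zfsvfs_t *zfsvfs = s->s_fs_info;
        objset_t *os = data;

        if (zfsvfs == NULL)
            return (0);

        return (zfsvfs->z_os == os);
    }
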
beren12 cca220d7c6 Fix zfs_arc_max minimum tuning
When setting `zfs_arc_max` its minimum value is allowed
to be 64 MiB.  There was an off-by-1 error which can matter
on tiny systems.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Zubrzycki <github@mid-earth.net>
Closes #7417
2018-05-07 17:19:57 -07:00
Brian Behlendorf 4ed30958ce Linux compat 4.16: blk_queue_flag_{set,clear}
The HAVE_BLK_QUEUE_WRITE_CACHE_GPL_ONLY case was overlooked in
the original 10f88c5c commit because blk_queue_write_cache()
was available for the in-kernel builds.

Update the blk_queue_flag_{set,clear} wrappers to call the locked
versions to avoid confusion.  This is safe for all existing callers.

The blk_queue_set_write_cache() function has been updated to use
these wrappers.  This means setting/clearing both QUEUE_FLAG_WC
and QUEUE_FLAG_FUA is no longer atomic but this only done early
in zvol_alloc() prior to any requests so there is no issue.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Kash Pande <kash@tripleback.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7428
Closes #7431
2018-05-07 17:19:57 -07:00
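A sketch of the compat shim pattern, assuming a configure check that defines HAVE_BLK_QUEUE_FLAG_SET (hypothetical name) on kernels that already provide the new interface; on older kernels the locked queue_flag_* helpers are wrapped.

    #include <linux/blkdev.h>

    #ifndef HAVE_BLK_QUEUE_FLAG_SET
    static inline void
    blk_queue_flag_set(unsigned int flag, struct request_queue *q)
    {
        spin_lock_irq(q->queue_lock);
        queue_flag_set(flag, q);
        spin_unlock_irq(q->queue_lock);
    }

    static inline void
    blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
    {
        spin_lock_irq(q->queue_lock);
        queue_flag_clear(flag, q);
        spin_unlock_irq(q->queue_lock);
    }
    #endif
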
Giuseppe Di Natale 2f118072cb Linux compat 4.16: blk_queue_flag_{set,clear}
queue_flag_{set,clear}_unlocked are now private interfaces in
the Linux kernel (https://github.com/torvalds/linux/commit/8a0ac14).
Use blk_queue_flag_{set,clear} interfaces which were introduced as
of https://github.com/torvalds/linux/commit/8814ce8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7410
2018-05-07 17:19:57 -07:00
Brian Behlendorf 7440f10ec1 Fix 'zfs send/recv' hang with 16M blocks
When using 16MB blocks the send/recv queues aren't quite big
enough.  This change leaves the default 16M queue size, which is a
good value for most pools, but additionally ensures that the
queue sizes are at least twice the allowed zfs_max_recordsize.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7365
Closes #7404
2018-05-07 17:19:57 -07:00
Giuseppe Di Natale 8bb800d6b4 Clean up (k)shlib and cfg file shebangs
Most kshlib files are imported by other scripts
and do not have a shebang at the top of their files.
Make all kshlib files follow this convention.

Remove shebangs from cfg files as well.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Close #7406
2018-05-07 17:19:57 -07:00
Tony Hutter bbf61c118f Fix "file is executable, but no shebang" warnings
Fedora 28's RPM build checks warn when executable files don't have a
shebang line.  These warnings are caused when we (incorrectly)
include data & config files in the_SCRIPTS automake lines. Files in
_SCRIPTS are marked executable by automake. This patch fixes the
issue by including non-executable scripts in a _DATA line instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7359
Closes #7395
2018-05-07 17:19:57 -07:00
Tony Hutter d296b09456 Exclude python scripts from RPM shebang check
The newest Fedora packaging rules print warnings for scripts using the
/usr/bin/python shebang:

    *** WARNING: mangling shebang in /usr/bin/arc_summary.py from
    #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR,
    fix it manually!

Fedora wants all cross compatible scripts to pick python3.  Since we
don't want our users to have to pick a specific version of python, we
exclude our scripts from the RPM build check.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7360
Closes #7399
2018-05-07 17:19:57 -07:00
Olaf Faaland 5ac017fc04 Update mmp_delay on sync or skipped, failed write
When an MMP write is skipped, or fails, and time since
mts->mmp_last_write is already greater than mts->mmp_delay, increase
mts->mmp_delay.  The original code only updated mts->mmp_delay when a
write succeeded, but this results in the write(s) after delays and
failed write(s) reporting an ub_mmp_delay which is too low.

Update mmp_last_write and mmp_delay if a txg sync was successful.  At
least one uberblock was written, thus extending the time we can be sure
the pool will not be imported by another host.

Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) /
vdev_count_leaves()) so that a period of frequent successful MMP writes,
e.g. due to frequent txg syncs, does not result in an import activity
check so short it is not reliable based on mmp thread writes alone.
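
A standalone sketch of that floor calculation (the values are
illustrative and the variable names are stand-ins; this is not the
mmp thread code):

    #include <stdio.h>
    #include <stdint.h>

    #define MSEC2NSEC(m)    ((uint64_t)(m) * 1000000ULL)

    int
    main(void)
    {
        uint64_t zfs_multihost_interval = 1000;    /* ms */
        uint64_t leaf_vdevs = 8;                   /* stand-in for vdev_count_leaves() */
        uint64_t mmp_delay = 10ULL * 1000 * 1000;  /* ns, pushed down by frequent txg syncs */

        /* clamp so the import activity check window cannot shrink too far */
        uint64_t floor_ns = MSEC2NSEC(zfs_multihost_interval) / leaf_vdevs;
        if (mmp_delay < floor_ns)
            mmp_delay = floor_ns;

        printf("mmp_delay = %llu ns\n", (unsigned long long)mmp_delay);
        return (0);
    }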

Remove unnecessary local variable, start.  We do not use the start time
of the loop iteration.

Add a debug message in spa_activity_check() to allow verification of the
import_delay value and to prove the activity check occurred.

Alter the tests that import pools and attempt to detect an activity
check.  Calculate the expected duration of spa_activity_check() based on
module parameters at the time the import is performed, rather than a
fixed time set in mmp.cfg.  The fixed time may be wrong.  Also, use the
default zfs_multihost_interval value so the activity check is longer and
easier to recognize.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7330
2018-05-07 17:19:57 -07:00
Tony Hutter f5ecab3aef Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7361
Closes #7368
2018-05-07 17:19:57 -07:00
LOLi fd01167ffd Fix hung z_zvol tasks during 'zfs receive'
During a receive operation zvol_create_minors_impl() can wait
needlessly for the prefetch thread because both share the same tasks
queue.  This results in hung tasks:

<3>INFO: task z_zvol:5541 blocked for more than 120 seconds.
<3>      Tainted: P           O  3.16.0-4-amd64
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The first z_zvol:5541 (zvol_task_cb) is waiting for the long running
traverse_prefetch_thread:260

root@linux:~# cat /proc/spl/taskq
taskq                       act  nthr  spwn  maxt   pri  mina
spl_system_taskq/0            1     2     0    64   100     1
	active: [260]traverse_prefetch_thread [zfs](0xffff88003347ae40)
	wait: 5541
spl_delay_taskq/0             0     1     0     4   100     1
	delay: spa_deadman [zfs](0xffff880039924000)
z_zvol/1                      1     1     0     1   120     1
	active: [5541]zvol_task_cb [zfs](0xffff88001fde6400)
	pend: zvol_task_cb [zfs](0xffff88001fde6800)

This change adds a dedicated, per-pool, prefetch taskq to prevent the
traverse code from monopolizing the global (and limited) system_taskq by
inappropriately scheduling long running tasks on it.

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6330
Closes #6890
Closes #7343
2018-05-07 17:19:57 -07:00
Don Brady 3b118f0a34 Add support for nvme based devids
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #7356
2018-05-07 17:19:57 -07:00
Tony Hutter ebe443c8ff chmod -x on etc/init.d/zfs-*.in automake files
Clear executable bit on zfs-import.in, zfs-mount.in,
zfs-share.in, and zfs-zed.in.  These are automake files and
should not be marked executable.  This fixes a RPM build error
on Fedora 28.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7355
Closes #7327
2018-05-07 17:19:57 -07:00
Brian Behlendorf 63f3396233 Fix mmap / libaio deadlock
Calling uiomove() in mappedread() under the page lock can result
in a deadlock if the user space page needs to be faulted in.

Resolve the issue by dropping the page lock before the uiomove().
The inode range lock protects against concurrent updates via
zfs_read() and zfs_write().

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7335
Closes #7339
2018-05-07 17:19:57 -07:00
DeHackEd 2deb4526ee Remove libattr requirement
RHEL/CentOS 6 supports sys/xattr.h eliminating the need for
libattr-devel as a dependency.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7344
Closes #7351
2018-05-07 17:19:57 -07:00
Tony Hutter a1662ffcaa Fedora 28: Fix "Macro %_dracutdir has empty body"
If you run ./configure --with-config=srpm, it will not trigger
the user m4 scripts to populate the dracut and udev directories.
This causes a build error on Fedora 28.  Make the dracut and
udev lines conditional to get around this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7326
Closes #7328
2018-05-07 17:19:57 -07:00
kpande ea921bf6a6 modprobe zfs during dracut mount
Resolves importing root pool during boot in dracut.  This case was
inadvertently broken with the module autoloading change in #7287.

Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Kash Pande <kash@tripleback.net>
Closes #7322
2018-05-07 17:19:57 -07:00
timor 6e627cc468 Add support for nvme disk detection
This treats /dev/nvme.. devices the same way as /dev/sd... devices.  The
motivation behind this is that whole disk detection did not work on nvme
SSDs without it, because DKC_UNKNOWN was returned for such devices.

Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: timor <timor.dd@googlemail.com>
Closes #7304
2018-05-07 17:19:56 -07:00
Olaf Faaland 3eb3a13628 Report pool suspended due to MMP
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.

Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.

When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.

In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.

In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.
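
A toy model of threading a suspend reason through to a user-visible
message (the enum constants follow the naming described above but may
not match the actual code exactly, and the message strings are made up):

    #include <stdio.h>

    typedef enum {
        ZIO_SUSPEND_NONE,
        ZIO_SUSPEND_IOERR,
        ZIO_SUSPEND_MMP
    } zio_suspend_reason_t;

    static const char *
    suspend_message(zio_suspend_reason_t reason)
    {
        switch (reason) {
        case ZIO_SUSPEND_MMP:
            return ("suspended: MMP writes failed within the required time");
        case ZIO_SUSPEND_IOERR:
            return ("suspended: one or more devices returned I/O errors");
        default:
            return ("not suspended");
        }
    }

    int
    main(void)
    {
        printf("pool status: %s\n", suspend_message(ZIO_SUSPEND_MMP));
        return (0);
    }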

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7296
2018-05-07 17:19:56 -07:00
Tim Chase c234706270 Add zfs_scan_ignore_errors tunable
When it's set, a DTL range will be cleared even if its scan/scrub had
errors.  This allows working around a resilver/scrub upon import when
the pool has errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #7293
2018-05-07 17:19:56 -07:00
Tony Hutter 6059ba27c4 Allow to limit zed's syslog chattiness
Some usage patterns like send/recv of replication streams can
produce a large number of events. In such a case, the current
all-syslog.sh zedlet will live up to its name and flood the
logs with mostly redundant information. To mitigate this
situation, this changeset introduces two new variables
ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE
to zed.rc that give more control over which event classes end
up in the syslog.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Closes #6886
Closes #7260
2018-05-07 17:19:56 -07:00
Olaf Faaland 927f40d089 Record skipped MMP writes in multihost_history
Once per pass through the MMP thread's loop, the vdev tree is walked to
find a suitable leaf to write the next MMP block to.  If no such leaf is
found, the thread sleeps for a while and resumes at the top of the loop.

Add an entry to multihost_history when no leaf can be found, and record
the reason in the error column.  The error code for such entries is a
bitfield, displayed in hex:

0x1  At least one vdev (interior or leaf) was not writeable.
0x2  At least one writeable leaf vdev was found, but it had a pending
MMP write.

timestamp = the time in seconds since the epoch when no leaf could be
found originally.

duration = the time (in ns) during which no MMP block was written for
this reason.  This does not include the preceding inter-write period
nor the following inter-write period.

vdev_guid = the number of sequential cycles of the MMP thread loop when
this occurred.

Sample output, truncated to fit:

For records of skipped MMP writes the right-most column, vdev_path, is
reported as "-".

id   txg  timestamp   error  duration    mmp_delay  vdev_guid     ...
936  11   1520036441  0      146264      891422313  1740883117838 ...
937  11   1520036441  0      163956      888356657  7320395061548 ...
938  11   1520036442  0      130690      885314969  7320395061548 ...
939  11   1520036442  0      2001068577  882296582  1740883117838 ...
940  11   1520036443  0      161806      882296582  7320395061548 ...
941  11   1520036443  0x2    0           998020546  1             ...
942  11   1520036444  0      136585      998020546  7320395061548 ...
943  11   1520036444  0x2    0           998020257  1             ...
944  11   1520036445  5      2002662964  994160219  1740883117838 ...
945  11   1520036445  0x2    998073118   994160219  3             ...
946  11   1520036447  0      247136      994160219  7320395061548 ...
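
A short standalone decoder for the error bitfield documented above
(the macro names are hypothetical; only the 0x1 and 0x2 values come
from the text):

    #include <stdio.h>

    #define MMP_SKIP_NO_WRITEABLE_VDEV   0x1  /* a vdev was not writeable */
    #define MMP_SKIP_WRITE_PENDING       0x2  /* a writeable leaf had a pending MMP write */

    int
    main(void)
    {
        unsigned int err = 0x2;   /* value as it appears in the error column */

        printf("error=0x%x%s%s\n", err,
            (err & MMP_SKIP_NO_WRITEABLE_VDEV) ? " [vdev not writeable]" : "",
            (err & MMP_SKIP_WRITE_PENDING) ? " [pending MMP write]" : "");
        return (0);
    }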

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7212
2018-05-07 17:19:56 -07:00
Giuseppe Di Natale 6356d50e67 Introduce a destroy_dataset helper
Datasets can be busy when calling zfs destroy. Introduce
a helper function to destroy datasets and use it to destroy
datasets in zfs_allow_004_pos, zfs_promote_008_pos, and
zfs_destroy_002_pos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7224
Closes #7246
Closes #7249
Closes #7267
2018-05-07 17:19:56 -07:00
Tony Hutter bd69ae3b53 Tag zfs-0.7.8
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-04-09 14:31:57 -07:00
Tony Hutter 9a2e90c9fc Revert "Handle zap_add() failures in mixed ... "
This reverts commit cc63068e95.

Under certain circumstances this change can result in an ENOSPC
error when adding new files to a directory.  See #7401 for full
details.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Issue #7401
Closes #7416
2018-04-09 17:29:59 -04:00
Tony Hutter 240ccfc13a Tag zfs-0.7.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-03-14 16:16:43 -07:00
Brian Behlendorf c30e716c81 Fix MMP write frequency for large pools
When a single pool contains more vdevs than the CONFIG_HZ
for the kernel, the mmp thread will not delay properly.  Switch
to using cv_timedwait_sig_hires() to handle higher resolution
delays.

This issue was reported on Arch Linux where HZ defaults to only
100 and this could be fairly easily reproduced with a reasonably
large pool.  Most distribution kernels set CONFIG_HZ=250 or
CONFIG_HZ=1000 and thus are unlikely to be impacted.
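
A back-of-the-envelope illustration of the tick rounding problem
(the numbers are illustrative and assume the default 1000 ms
multihost interval):

    #include <stdio.h>

    int
    main(void)
    {
        int hz = 100;               /* CONFIG_HZ on the reported Arch Linux kernel */
        int interval_ms = 1000;     /* zfs_multihost_interval */
        int leaves = 150;           /* a pool with more leaf vdevs than HZ */

        int delay_ms = interval_ms / leaves;        /* ~6 ms between MMP writes */
        int delay_ticks = delay_ms * hz / 1000;     /* rounds down to 0 jiffies at HZ=100 */

        printf("per-leaf delay: %d ms = %d ticks\n", delay_ms, delay_ticks);
        return (0);
    }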

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7205
Closes #7289
2018-03-14 16:10:38 -07:00
Olaf Faaland 267fd7b0f1 Handle zio_resume and mmp => off
When multihost is disabled on a pool, and the pool is resumed via zpool
clear, within a single cycle of the mmp thread's loop (e.g.  while it's
in the cv_timedwait call), both mmp_last_write and mmp_delay should be
updated.

The original code mistakenly treated the two cases as if they could not
occur at the same time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7286
2018-03-14 16:10:38 -07:00
LOLi dc0176eeec Fix zfs-kmod builds when using rpm >= 4.14
With rpm-software-management/rpm@5e94633 a package version containing
invalid characters (most commonly a double '-') causes the kmod package
generation to terminate with an error.  This change takes advantage of
the newly introduced rpm macro "_wrong_version_format_terminate_build"
to allow kmod packages to be built.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  loli10K <ezomori.nozomu@gmail.com>
Closes #7284
2018-03-14 16:10:38 -07:00
Paul Zuchowski 0a0af41bd9 zdb and inuse tests don't pass with real disks
Due to zpool create auto-partitioning in Linux (e.g. sdb1),
certain utilities need to use the partition (sdb1) while
others use the whole disk name (sdb).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #6939
Closes #7261
2018-03-14 16:10:38 -07:00
Wolfgang Bumiller 3808006edf Take user namespaces into account in policy checks
Change file related checks to use user namespaces and make
sure involved uids/gids are mappable in the current
namespace.

Note that checks without file ownership information will
still not take user namespaces into account, as some of
these should be handled via 'zfs allow' (otherwise root in a
user namespace could issue commands such as `zpool export`).

This also adds an initial user namespace regression test
for the setgid bit loss, with a user_ns_exec helper usable
in further tests.

Additionally, configure checks for the required user
namespace related features are added for:
  * ns_capable
  * kuid/kgid_has_mapping()
  * user_ns in cred_t

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Closes #6800
Closes #7270
2018-03-14 16:10:38 -07:00
Olaf Faaland c17922b8a9 Detect long config lock acquisition in mmp
If something holds the config lock as a writer for too long, MMP will
fail to issue MMP writes in a timely manner.  This will result either in
the pool being suspended, or in an extreme case, in the pool not being
protected.

If the time to acquire the config lock exceeds 1/10 of the minimum
zfs_multihost_interval, report it in the zfs debug log.
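
A userspace sketch of such a timing check (the helper below and the
100 ms minimum used here are assumptions for illustration, not the
actual spa code):

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    #define MSEC2NSEC(m)    ((uint64_t)(m) * 1000000ULL)

    static uint64_t
    now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ((uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec);
    }

    int
    main(void)
    {
        uint64_t multihost_interval_min_ms = 100;   /* made-up minimum for the example */
        uint64_t start = now_ns();

        /* ... acquire the config lock here ... */

        uint64_t elapsed = now_ns() - start;
        if (elapsed > MSEC2NSEC(multihost_interval_min_ms) / 10)
            fprintf(stderr, "config lock delayed MMP by %llu ns\n",
                (unsigned long long)elapsed);
        return (0);
    }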

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7212
2018-03-14 16:10:38 -07:00
Giuseppe Di Natale 8d7f17798d Linux 4.16 compat: get_disk_and_module()
As of https://github.com/torvalds/linux/commit/fb6d47a, get_disk()
is now get_disk_and_module(). Add a configure check to determine
if we need to use get_disk_and_module().

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7264
2018-03-14 16:10:38 -07:00
Tony Hutter 6dc40e2ada Change checksum & IO delay ratelimit values
Change checksum & IO delay ratelimit thresholds from 5/sec to 20/sec.
This allows zed to actually trigger if a bunch of these events arrive in
a short period of time (zed has a threshold of 10 events in 10 sec).
Previously, if you had, say, 100 checksum errors in 1 sec, it would get
ratelimited to 5/sec which wouldn't trigger zed to fault the drive.

Also, convert the checksum and IO delay thresholds to module params for
easy testing.
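
A toy calculation with the numbers quoted above (this only shows why
5/sec was too strict; the real ratelimiter is more involved):

    #include <stdio.h>

    static int
    events_passed(int burst, int seconds, int limit_per_sec)
    {
        int allowed = limit_per_sec * seconds;
        return (burst < allowed ? burst : allowed);
    }

    int
    main(void)
    {
        int burst = 100, burst_secs = 1;   /* 100 checksum errors arriving in one second */
        int zed_threshold = 10;            /* zed needs 10 events within 10 seconds */

        int passed_old = events_passed(burst, burst_secs, 5);
        int passed_new = events_passed(burst, burst_secs, 20);

        printf("5/sec passes %d events -> zed %s\n", passed_old,
            passed_old >= zed_threshold ? "triggers" : "stays silent");
        printf("20/sec passes %d events -> zed %s\n", passed_new,
            passed_new >= zed_threshold ? "triggers" : "stays silent");
        return (0);
    }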

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7252
2018-03-14 16:10:38 -07:00
chrisrd 792f88131c Increment zil_itx_needcopy_bytes properly
In zil_lwb_commit() with TX_WRITE, we copy the log write record (lrw)
into the log write block (lwb) and send it off using zil_lwb_add_txg().
If we also have WR_NEED_COPY, we additionally copy the lrw's data into
the lwb to be sent off.  If the lrw + data doesn't fit into the lwb, we
send the lrw and as much data as will fit (dnow bytes), then go back
and do the same with the remaining data.

Each time through this loop we're sending dnow data bytes. I.e.
zil_itx_needcopy_bytes should be incremented by dnow.
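
A toy version of that loop, showing only the per-iteration accounting
(the buffer sizes are made up):

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        uint64_t data_len = 300 * 1024;      /* bytes left to copy for this record */
        uint64_t lwb_space = 128 * 1024;     /* room available in each log block */
        uint64_t zil_itx_needcopy_bytes = 0;

        while (data_len > 0) {
            uint64_t dnow = data_len < lwb_space ? data_len : lwb_space;
            /* ... copy dnow bytes into the current lwb and issue it ... */
            zil_itx_needcopy_bytes += dnow;  /* account what was actually copied */
            data_len -= dnow;
        }

        printf("zil_itx_needcopy_bytes = %llu\n",
            (unsigned long long)zil_itx_needcopy_bytes);
        return (0);
    }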

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #6988
Closes #7176
2018-03-14 16:10:38 -07:00
John Eismeier 33bb1e8256 Fix some typos
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: John Eismeier <john.eismeier@gmail.com>
Closes #7237
2018-03-14 16:10:38 -07:00
Tomohiro Kusumi bcaba38e42 Fix zpool(8) list example to match actual format
a05dfd00 (Illumos 5147) has swapped FRAG and EXPANDSZ,
so it's natural to modify these examples.

 # zpool list | head -1
 NAME     SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
                              ^^^^^^^^^^^^^^^

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #7244
2018-03-14 16:10:38 -07:00
Tony Hutter 5e3085e360 Add SMART self-test results to zpool status -c
Add in SMART self-test results to zpool status|iostat -c.  This
works for both SAS and SATA drives.

Also, add plumbing to allow the 'smart' script to take smartctl
output from a directory of output text files instead of running
it against the vdevs.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7178
2018-03-14 16:10:37 -07:00
Tony Hutter 99920d823e Add scrub after resilver zed script
* Add a zed script to kick off a scrub after a resilver.  The script is
disabled by default.

* Add an optional $PATH (-P) option to zed to allow it to use a custom
$PATH for its zedlets.  This is needed when you're running zed under
the ZTS in a local workspace.

* Update test scripts to not copy in all-debug.sh and all-syslog.sh by
default.  They can be optionally copied in as part of zed_setup().
These scripts slow down zed considerably under heavy event loads and
can cause events to be dropped or their delivery delayed. This was
causing some sporadic failures in the 'fault' tests.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4662
Closes #7086
2018-03-14 16:10:37 -07:00
chrisrd 338523dd6e Fix free memory calculation on v3.14+
Provide infrastructure to auto-configure to enum and API changes in the
global page stats used for our free memory calculations.

arc_free_memory has been broken since an API change in Linux v3.14:

2016-07-28 v4.8 599d0c95 mm, vmscan: move LRU lists to node
2016-07-28 v4.8 75ef7184 mm, vmstat: add infrastructure for per-node
  vmstats

These commits moved some of global_page_state() into
global_node_page_state(). The API change was particularly egregious as,
instead of breaking the old code, it silently did the wrong thing and we
continued using global_page_state() where we should have been using
global_node_page_state(), thus indexing into the wrong array via
NR_SLAB_RECLAIMABLE et al.

There have been further API changes along the way:

2017-07-06 v4.13 385386cf mm: vmstat: move slab statistics from zone to
  node counters
2017-09-06 v4.14 c41f012a mm: rename global_page_state to
  global_zone_page_state

...and various (incomplete, as it turns out) attempts to accommodate
these changes in ZoL:

2017-08-24 2209e409 Linux 4.8+ compatibility fix for vm stats
2017-09-16 787acae0 Linux 3.14 compat: IO acct, global_page_state, etc
2017-09-19 661907e6 Linux 4.14 compat: IO acct, global_page_state, etc

The config infrastructure provided here resolves these issues going back
to the original API change in v3.14 and is robust against further Linux
changes in this area.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7170
2018-03-14 16:10:37 -07:00
Olaf Faaland 2644784f49 Report duration and error in mmp_history entries
After an MMP write completes, update the relevant mmp_history entry
with the time between submission and completion, and the error
status of the write.

[faaland1@toss3a zfs]$ cat /proc/spl/kstat/zfs/pool/multihost
39 0 0x01 100 8800 69147946270893 72723903122926
id       txg     timestamp  error  duration   mmp_delay    vdev_guid
10607    1166    1518985089 0      138301     637785455    4882...
10608    1166    1518985089 0      136154     635407747    1151...
10609    1166    1518985089 0      803618560  633048078    9740...
10610    1166    1518985090 0      144826     633048078    4882...
10611    1166    1518985090 0      164527     666187671    1151...

Where duration = gethrtime_in_done_fn - gethrtime_at_submission, and
error = zio->io_error.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7190
2018-03-14 16:10:37 -07:00
Olaf Faaland b1f61f05b4 Do not initiate MMP writes while pool is suspended
While the pool is suspended on host A, it may be imported on host B.
If host A continued to write MMP blocks, it would be blindly
overwriting MMP blocks written by host B, and the blocks written by
host A would have outdated txg information.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7182
2018-03-14 16:10:37 -07:00
Tony Hutter e5ba614d05 Linux 4.16 compat: use correct *_dec_and_test()
Use refcount_dec_and_test() on 4.16+ kernels, atomic_dec_and_test()
on older kernels.  https://lwn.net/Articles/714974/

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7179
Closes: #7211
2018-03-14 16:10:37 -07:00
Matthew Thode 30ac8de48a Allow modprobe to fail when called within systemd
This allows systems with zfs manually built into the kernel to run
these services.  Otherwise the service will fail to start.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #7174
2018-03-14 16:10:37 -07:00
bunder2015 c705d8386b Add SMART attributes for SSD and NVMe
This adds the SMART attributes required to probe Samsung SSD and NVMe
(and possibly others) disks when using the "zpool status -c" command.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7183
Closes #7193
2018-03-14 16:10:37 -07:00
Giuseppe Di Natale d5b10b3ef3 Correct count_uberblocks in mmp.kshlib
A log_must call was causing count_uberblocks to return more
than just the uberblock count. Remove the log_must since it
was only logging a sleep.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7191
2018-03-14 16:10:37 -07:00
chrisrd 5a84c60fb9 Fix config issues: frame size and headers
1. With various (debug and/or tracing?) kernel options enabled it's
possible for 'struct inode' and 'struct super_block' to exceed the
default frame size, leaving errors like this in config.log:

build/conftest.c:116:1: error: the frame size of 1048 bytes is larger
than 1024 bytes [-Werror=frame-larger-than=]

Fix this by removing the frame size warning for config checks.

2. Without the correct headers included, it's possible for declarations
to be missed, leaving errors like this in the config.log:

build/conftest.c:131:14: error: ‘struct nameidata’ declared inside
parameter list [-Werror]

Fix this by adding appropriate headers.

Note: Both these issues can result in silent config failures because
the compile failure is taken to mean "this option is not supported by
this kernel" rather than "there's something wrong with the config
test". This can lead to something merely annoying (compile failures) to
something potentially serious (miscompiled or misused kernel primitives
or functions). E.g. the fixes included here resulted in these
additional defines in zfs_config.h with linux v4.14.19:

Also, drive-by whitespace fixes in config/* files which don't mention
"GNU" (those ones look to be imported from elsewhere so leave them
alone).

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7169
2018-03-14 16:10:37 -07:00
Olaf Faaland 26941ce90b Clarify zinject(8) explanation of -e
Error injection of EIO or ENXIO simply sets the zio's io_error value,
rather than preventing the read or write from occurring.  This is
important information as it affects how the probes must be used.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7172
2018-03-14 16:10:37 -07:00
George Wilson 07ce5d7390 OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio
PROBLEM
=======
It's possible for a parent zio to complete even though it has children
which have not completed. This can result in the following panic:
    > $C
    ffffff01809128c0 vpanic()
    ffffff01809128e0 mutex_panic+0x58(fffffffffb94c904, ffffff597dde7f80)
    ffffff0180912950 mutex_vector_enter+0x347(ffffff597dde7f80)
    ffffff01809129b0 zio_remove_child+0x50(ffffff597dde7c58, ffffff32bd901ac0,
    ffffff3373370908)
    ffffff0180912a40 zio_done+0x390(ffffff32bd901ac0)
    ffffff0180912a70 zio_execute+0x78(ffffff32bd901ac0)
    ffffff0180912b30 taskq_thread+0x2d0(ffffff33bae44140)
    ffffff0180912b40 thread_start+8()
    > ::status
    debugging crash dump vmcore.2 (64-bit) from batfs0390
    operating system: 5.11 joyent_20170911T171900Z (i86pc)
    image uuid: (not set)
    panic message: mutex_enter: bad mutex, lp=ffffff597dde7f80
    owner=ffffff3c59b39480 thread=ffffff0180912c40
    dump content: kernel pages only
The problem is that dbuf_prefetch along with l2arc can create a zio tree
which confuses the parent zio and allows it to complete while children
still exist. Here's the scenario:
    zio tree:
        pio
         |--- lio
The parent zio, pio, has entered the zio_done stage and begins to check its
children to see if there are still some that have not completed. In zio_done(),
the children are checked in the following order:
    zio_wait_for_children(zio, ZIO_CHILD_VDEV, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_GANG, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_DDT, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_LOGICAL, ZIO_WAIT_DONE)
If pio finds any child which has not completed, then it stops executing and
goes to sleep. Each call to zio_wait_for_children() will grab the io_lock
while checking the particular child.
In this scenario, the pio has completed the first call to
zio_wait_for_children() to check for any ZIO_CHILD_VDEV children. Since
the only zio in the zio tree right now is the logical zio, lio, then it
completes that call and prepares to check the next child type.
In the meantime, the lio completes and in its callback creates a child vdev
zio, cio. The zio tree looks like this:
    zio tree:
        pio
         |--- lio
         |--- cio
The lio then grabs the parent's io_lock and removes itself.
    zio tree:
        pio
         |--- cio
The pio continues to run but has already completed its check for ZIO_CHILD_VDEV
and will erroneously complete. When the child zio, cio, completes it will panic
the system trying to reference the parent zio which has been destroyed.
SOLUTION
========
The fix is to rework the zio_wait_for_children() logic to accept a bitfield
for all the child types that it's interested in checking. The
io_lock is now held the entire time we check all the child types. Since
the function now accepts a bitfield, a simple ZIO_CHILD_BIT() macro is provided
to allow for the conversion between a ZIO_CHILD type and the bitfield used by
the zio_wait_for_children logic.
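
A standalone sketch of the bitfield approach (the names mirror the
description above, but this is not the ported code):

    #include <stdio.h>

    enum zio_child {
        ZIO_CHILD_VDEV = 0,
        ZIO_CHILD_GANG,
        ZIO_CHILD_DDT,
        ZIO_CHILD_LOGICAL,
        ZIO_CHILD_TYPES
    };

    #define ZIO_CHILD_BIT(x)    (1U << (x))
    #define ZIO_CHILD_ALL_BITS  (ZIO_CHILD_BIT(ZIO_CHILD_TYPES) - 1)

    /* outstanding children per type; normally protected by the parent's io_lock */
    static int zio_children[ZIO_CHILD_TYPES];

    /* check every requested child type in a single pass (one lock hold) */
    static int
    zio_wait_for_children(unsigned int childbits)
    {
        for (int c = 0; c < ZIO_CHILD_TYPES; c++)
            if ((childbits & ZIO_CHILD_BIT(c)) && zio_children[c] != 0)
                return (1);   /* must wait */
        return (0);           /* safe to complete */
    }

    int
    main(void)
    {
        zio_children[ZIO_CHILD_VDEV] = 1;   /* cio shows up while pio is completing */
        printf("wait = %d\n", zio_wait_for_children(ZIO_CHILD_ALL_BITS));
        return (0);
    }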

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Youzhong Yang <youzhong@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8857
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/862ff6d99c
Issue #5918
Closes #7168
2018-03-14 16:10:37 -07:00
LOLi 1d805a534b 'zfs receive' fails with "dataset is busy"
Receiving an incremental stream after an interrupted "zfs receive -s"
fails with the message "dataset is busy": this is because we still have
the hidden clone ../%recv from the resumable receive.

Improve the error message suggesting the existence of a partially
complete resumable stream from "zfs receive -s" which can be either
aborted ("zfs receive -A") or resumed ("zfs send -t").

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7129
Closes #7154
2018-03-14 16:10:37 -07:00
LOLi a9ff89e05c contrib/initramfs: add missing conf.d/zfs
When upgrading from the distribution-provided zfs-initramfs package on
root-on-zfs Ubuntu and Debian the system may fail to boot: this change
adds the missing initramfs configuration file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7158
2018-03-14 16:10:37 -07:00
sanjeevbagewadi d85011ed69 mmp should use a fixed tag for spa_config locks
mmp_write_uberblock() and mmp_write_done() should use the same tag
for spa_config_locks.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #6530
Closes #7155
2018-03-14 16:10:37 -07:00
sanjeevbagewadi b3da003ebf Handle zap_add() failures in mixed case mode
With "casesensitivity=mixed", zap_add() could fail when the number of
files/directories with the same name (varying in case) exceed the
capacity of the leaf node of a Fatzap. This results in a ASSERT()
failure as zfs_link_create() does not expect zap_add() to fail. The fix
is to handle these failures and rollback the transactions.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #7011
Closes #7054
2018-03-14 16:10:37 -07:00
Chunwei Chen 478754a8f5 Fix zdb -ed on objset for exported pool
zdb -ed on an objset of an exported pool would fail with:
  failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb passes the objset name to spa_import, which uses
that name to create a spa. Later, when dmu_objset_own tries to look up
the spa using the real pool name, it can't find one.

We fix this by making sure we pass the pool name rather than the objset
name to spa_import.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #6464
2018-03-14 16:10:37 -07:00
Chunwei Chen 31ff122aa2 Fix zdb -E segfault
SPA_MAXBLOCKSIZE is too large for stack.
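
An illustration of the general fix pattern, moving a 16M buffer from
the stack to the heap (this is not the actual zdb change):

    #include <stdio.h>
    #include <stdlib.h>

    #define SPA_MAXBLOCKSIZE    (16 * 1024 * 1024)

    int
    main(void)
    {
        /* char buf[SPA_MAXBLOCKSIZE];  -- would likely blow the default stack */
        char *buf = malloc(SPA_MAXBLOCKSIZE);   /* heap allocation instead */
        if (buf == NULL)
            return (1);

        buf[SPA_MAXBLOCKSIZE - 1] = 0;
        printf("allocated %d bytes on the heap\n", SPA_MAXBLOCKSIZE);
        free(buf);
        return (0);
    }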

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 18c662b845 Fix zdb -R decompression
There are some issues in the zdb -R decompression implementation.

The first is that ZLE can easily decompress non-ZLE streams. So we add
a ZDB_NO_ZLE environment variable to make zdb skip ZLE.

The second is the random bytes appended to pabd and pbuf2. These serve
no purpose at all, as those bytes shouldn't be read during decompression
anyway. Instead, we randomize lbuf2 so that we can make sure
decompression fills exactly lsize bytes by comparing lbuf and lbuf2 with bcmp.

The last one is that the condition used to detect failure is wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #4984
2018-03-14 16:10:36 -07:00
Chunwei Chen c797f0898e Fix racy assignment of zcb.zcb_haderrors
zcb_haderrors will be modified in zdb_blkptr_done, which is
asynchronous. So we must move this assignment after zio_wait.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 5e566c5772 Fix zle_decompress out of bound access
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 23227313a2 Fix zdb -c traverse stop on damaged objset root
If a corruption happens to be on a root block of an objset, zdb -c will
not correctly report the error, and it will not traverse the datasets
that come after. This is because traverse_visitbp, which does the
callback and resets the error for TRAVERSE_HARD, is skipped when traversing
the ZIL fails in traverse_impl.

Here's example of what 'zdb -eLcc' command looks like on a pool with
damaged objset root:

== before patch:

Traversing all blocks to verify checksums ...

Error counts:

	errno  count
block traversal size 379392 != alloc 33987072 (unreachable 33607680)

	bp count:             172
	ganged count:           0
	bp logical:       1678336      avg:   9757
	bp physical:       130560      avg:    759     compression:  12.85
	bp allocated:      379392      avg:   2205     compression:   4.42
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         71
	Dittoed blocks on same vdev: 101

== after patch:

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0>  -- skipping

Error counts:

	errno  count
	   52  1
block traversal size 33963520 != alloc 33987072 (unreachable 23552)

	bp count:             447
	ganged count:           0
	bp logical:      36093440      avg:  80745
	bp physical:     33699840      avg:  75391     compression:   1.07
	bp allocated:    33963520      avg:  75981     compression:   1.06
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         76
	Dittoed blocks on same vdev: 115

==

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Brian Behlendorf 3713b73335 Linux 4.11 compat: avoid refcount_t name conflict
Related to commit 4859fe796, when directly using the kernel's
refcount functions in kernel compatibility code do not map
refcount_t to zfs_refcount_t.  This leads to a type mismatch.

Longer term we should consider renaming refcount_t to
zfs_refcount_t in the zfs code base.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148
2018-03-14 16:10:36 -07:00
Brian Behlendorf 310e63dfd1 Linux 4.16 compat: inode_set_iversion()
A new interface was added to manipulate the version field of an
inode.  Add an inode_set_iversion() wrapper for older kernels and
use the new interface when available.

The i_version field was dropped from the trace point due to the
switch to an atomic64_t i_version type.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148
2018-03-14 16:10:36 -07:00
WHR a196b3bc3d OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable
Authored by: WHR <msl0000023508@gmail.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8966
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c95549fcdc
Closes #7141
2018-03-14 16:10:36 -07:00
Richard Elling a58e1284d8 Remove deprecated zfs_arc_p_aggressive_disable
zfs_arc_p_aggressive_disable is no more. This PR removes docs
and module parameters for zfs_arc_p_aggressive_disable.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #7135
2018-03-14 16:10:36 -07:00
Brian Behlendorf f1dde3fb20 Fix default libdir for Debian/Ubuntu
The distribution provided architecture specific RPM macro files
for x86_64 and other architectures on Debian/Ubuntu specify the
wrong default libdir install location.  When building deb packages
override _lib with the correct location.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7083
Closes #7101
2018-03-14 16:10:36 -07:00
wli5 5f38142e7b Bug fix in qat_compress.c for vmalloc addr check
Remove the unused vmalloc address check; the function mem_to_page
will handle non-vmalloc addresses when mapping them to a physical
address.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #7125
2018-03-14 16:10:36 -07:00
LOLi 29b79dcfe9 Fix systemd_ RPM macros usage on Debian-based distributions
Debian-based distributions do not seem to provide RPM macros for
dealing with systemd pre- and post- (un)install actions: this results
in errors when installing or upgrading .deb packages because the
resulting control scripts contain the following unresolved macros:

 * %systemd_post
 * %systemd_preun
 * %systemd_postun

Fix this by providing default values for postinstall, preuninstall and
postuninstall scripts when these macros are not defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7074
Closes #7100
2018-03-14 16:10:36 -07:00
John L. Hammond ecc972c7f0 Emit an error message before MMP suspends pool
In mmp_thread(), emit an MMP specific error message before calling
zio_suspend() so that the administrator will understand why the pool
is being suspended.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Closes #7048
2018-03-14 16:10:36 -07:00
LOLi 6c891ade8b ZTS: Fix create-o_ashift test case
The function that fills the uberblock ring buffer on every device label
has been reworked to avoid occasional failures caused by a race
condition that prevents 'zpool sync' from writing some uberblocks
sequentially: this happens when the pool sync ioctl dispatch code calls
txg_wait_synced() while we're already waiting for a TXG to sync.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6924
Closes #6977
2018-03-14 16:10:36 -07:00
LOLi 03658d5081 Fix --with-systemd on Debian-based distributions (#6963)
These changes propagate the "--with-systemd" configure option to the
RPM spec file, allowing Debian-based distributions to package
systemd-related files.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6591
Closes #6963
2018-03-14 16:10:36 -07:00
Brian Behlendorf 5d62588032 Remove vn_rename and vn_remove dependency
The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file.  By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely.  Functionally an empty cache file is
treated the same as a missing cache file.  This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.

The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.

The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists.  This
means that after exporting all pools and rebooting, new pools
will not be scanned for on the next boot.  This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.

Documentation was updated accordingly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#648
Closes #6753
2018-03-14 16:10:36 -07:00
Brian Behlendorf 6897ea475f Fix "--enable-code-coverage" debug build
When --enable-code-coverage is provided it should not result
in NDEBUG being defined.  This is controlled by --enable-debug.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6674
2018-03-14 16:10:36 -07:00
Brian Behlendorf 3790bfa80f Update codecov.yml
Update the codecov.yml to make the following functional changes.

* Do not require the CI testing to pass before posting results.
* Set red-yellow-green coverage percent from 50%-100%
* Allow a 1% drop in coverage to still be considered a pass.
* Reduce the size of the comment posted to the issue.

Additionally, the top level README.markdown has been updated
to include the codecov.io badge and the project summary reworded.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6669
2018-03-14 16:10:36 -07:00
Prakash Surya 6b278f3223 Add support for "--enable-code-coverage" option
This change adds support for a new option that can be passed to the
configure script: "--enable-code-coverage". Further, the "--enable-gcov"
option has been removed, as this new option provides the same
functionality (plus more).

When using this new option the following make targets are available:

 * check-code-coverage
 * code-coverage-capture
 * code-coverage-clean

Note: these make targets can only be run from the root of the project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6670
2018-03-14 16:10:36 -07:00
Prakash Surya f1236ebf35 Make "-fno-inline" compile option more accessible
When functions are inlined, it can make the system much more difficult
to instrument using tools such as ftrace, BPF, crash, etc. Thus, to aid
development and increase the system's observability, when the
"--enable-debuginfo" flag is specified, the "-fno-inline" compilation
option will be used for both userspace and kernel modules.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6605
2018-03-14 16:10:36 -07:00
Brian Behlendorf 184087f822 Add configure option to enable gcov analysis
* Add configure option to enable gcov analysis.
* Includes a few minor ctime fixes.
* Add codecov.yml configuration.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6642
2018-03-14 16:10:36 -07:00
Richard Yao 834815e9f7 Implement --enable-debuginfo to force debuginfo
Inspection of a Ubuntu 14.04 x64 system revealed that the config file
used to build the kernel image differs from the config file used to
build kernel modules by the presence of CONFIG_DEBUG_INFO=y:

This in itself is insufficient to show that the kernel is built with
debuginfo, but a cursory analysis of the debuginfo provided and the
size of the kernel strongly suggests that it was built with
CONFIG_DEBUG_INFO=y while the modules were not. Installing
linux-image-$(uname -r)-dbgsym had no obvious effect on the debuginfo
provided by either the modules or the kernel.

The consequence is that issue reports from distributions such as Ubuntu
and its derivatives, which build kernel modules without debuginfo, contain
nonsensical backtraces. It is therefore desirable to force generation
of debuginfo, so we implement --enable-debuginfo. Since the build system
can build both userspace components and kernel modules, the generic
--enable-debuginfo option will force debuginfo for both. However, it
also supports --enable-debuginfo=kernel and --enable-debuginfo=user for
finer grained control.

Enabling debuginfo for the kernel modules works by injecting
CONFIG_DEBUG_INFO=y into the make environment. This enables
generation of debuginfo by the kernel build systems on all Linux
kernels, but the build environment is slightly different in that
CONFIG_DEBUG_INFO has not been in the CPP. Adding -DCONFIG_DEBUG_INFO
would fix that, but it would also cause build failures on kernels where
CONFIG_DEBUG_INFO=y is already set. That would complicate its use in
DKMS environments that support a range of kernels and is therefore
undesirable. We could write a compatibility shim to enable
CONFIG_DEBUG_INFO only when it is explicitly disabled, but we forgo
doing that because it is unnecessary. Nothing in ZoL or the kernel uses
CONFIG_DEBUG_INFO in the CPP at this time and that is unlikely to
change.

Enabling debuginfo for the userspace components is done by injecting -g
into CPPFLAGS. This is not necessary because the build system honors the
environment's CPPFLAGS by appending them to the actual CPPFLAGS used,
but it is supported for consistency.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734
2018-03-14 16:10:35 -07:00
Richard Yao 0f1ff38476 Make --enable-debug fail when given bogus args
Currently, bogus options to --enable-debug become --disable-debug. That
means that passing --enable-debug=true is analogous to --disable-debug,
but the result is counterintuitive. We switch to AS_CASE to allow us to
fail when given a bogus option.

Also, we modify the text printed to clarify that --enable-debug enables
assertions.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734
2018-03-14 16:10:35 -07:00
Tony Hutter e3b28e16ce Tag zfs-0.7.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-02-01 10:02:58 -08:00
LOLi 2f62fdd644 Fix 'zfs receive -o' when used with '-e|-d'
When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().

Fix this by correctly detecting the toplevel dataset.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7088

Requires-spl: refs/pull/679/head
2018-01-30 10:27:32 -06:00
Brian Behlendorf 137b3e6cff Extend zloop.sh for automated testing
In order to debug issues encountered by ztest during automated
testing it's important that as much debugging information as
possible by dumped at the time of the failure.  The following
changes extend the zloop.sh script in order to make it easier
to integrate with buildbot.

* Add the `-m <maximum cores>` option to zloop.sh to place a
  limit of the number of core dumps generated.  By default, the
  existing behavior is maintained and no limit is set.

* Add the `-l` option to create a 'ztest.core.N' symlink in the
  current directory to the core directory. This functionality
  is provided primarily for buildbot which expects log files to
  have well known names.

* Rename 'ztest.ddt' to 'ztest.zdb' and extend it to dump
  additional basic information on failure for latter analysis.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6999

Conflicts:
	scripts/zloop.sh
2018-01-30 10:27:31 -06:00
Don Brady d1630dda58 Cleanup zloop working directory after each pass
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: John Kennedy <jwk404@gmail.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Issue #6595
Closes #6663
2018-01-30 10:27:31 -06:00
Alexander Motin 701ebd014a OpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads
In case of misaligned I/O sequential requests are not detected as such
due to overlaps in logical block sequence:

    dmu_zfetch(fffff80198dd0ae0, 27347, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27355, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27363, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27371, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27379, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27387, 9, 1)

This patch makes single block overlap to be counted as a stream hit,
improving performance up to several times.
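
A toy stream-hit check using the block numbers from the trace above,
tolerating a one-block overlap (the real zfetch logic is considerably
more elaborate):

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        uint64_t stream_next = 27347 + 9;   /* block the stream expects next (27356) */
        uint64_t req_blkid = 27355;         /* misaligned request overlaps by one block */

        int strict_hit = (req_blkid == stream_next);
        int relaxed_hit = strict_hit || (req_blkid + 1 == stream_next);

        printf("strict=%d relaxed=%d\n", strict_hit, relaxed_hit);
        return (0);
    }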

Authored by: Alexander Motin <mav@FreeBSD.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8835
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/aab6dd482a
Closes #7062
2018-01-30 10:27:31 -06:00
LOLi 5b8ec2cf39 Fix Debian packaging on ARMv7/ARM64
When building packages on Debian-based systems specify the target
architecture used by 'alien' to convert .rpm packages into .deb: this
avoids detecting an incorrect value which results in the following
errors:

<package>.aarch64.rpm is for architecture aarch64 ; the package cannot be built on this system
<package>.armv7l.rpm is for architecture armel ; the package cannot be built on this system

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7046
Closes #7058
2018-01-30 10:27:31 -06:00
Brian Behlendorf 9d1a39cec6 Fix shellcheck v0.4.6 warnings
Resolve new warnings reported after upgrading to shellcheck
version 0.4.6.  This patch contains no functional changes.

* egrep is non-standard and deprecated. Use grep -E instead. [SC2196]
* Check exit code directly with e.g. 'if mycmd;', not indirectly
  with $?.  [SC2181]  Suppressed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7040
2018-01-30 10:27:31 -06:00
DeHackEd 2a7b736dce Remove l2arc_nocompress from zfs-module-parameters(5)
Parameter was removed in d3c2ae1c08
(OpenZFS 6950 - ARC should cache compressed data)

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7043
2018-01-30 10:27:31 -06:00
Richard Yao ecc8af1812 Fix incompatibility with Reiser4 patched kernels
In ZFSOnLinux, our sources and build system are self contained such that
we do not need to make changes to the Linux kernel sources. Reiser4 on
the other hand exists solely as a kernel tree patch and opts to make
changes to the kernel rather than adapt to it. After Linux 4.1 made a
VFS change that replaced new_sync_read with do_sync_read, Reiser4's
maintainer decided to modify the kernel VFS to export the old function.
This caused our autotools check to misidentify the kernel API as
predating Linux 4.1 on kernels that have been patched with Reiser4
support, which breaks our build.

Reiser4 really should be patched to stop doing this, but lets modify our
check to be more strict to help the affected users of both filesystems.

Also, we were not checking the types of arguments and return value of
new_sync_read() and new_sync_write() . Lets fix that too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #6241
Closes #7021
2018-01-30 10:27:31 -06:00
Alex Zhuravlev 129e3e8dc3 Use zap_count instead of cached z_size for unlink
As a performance optimization Lustre does not strictly update
the SA_ZPL_SIZE when adding/removing from non-directory entries.
This results in entries which cannot be removed through the ZPL
layer even though the ZAP is empty and safe to remove.

Resolve this issue by checking the zap_count() directly instead
of relying on the cached SA_ZPL_SIZE.  Micro-benchmarks show no
significant performance impact due to the additional overhead
of using zap_count().

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7019
2018-01-30 10:27:31 -06:00
Nathaniel Wesley Filardo 9fb09f79e5 Revert raidz_map and _col structure types
As part of the refactoring of ab9f4b0b82,
several uint64_t-s and uint8_t-s were changed to other types.  This
caused ZoL github issue #6981, an overflow of a size_t on a 32-bit ARM
machine.  In the absence of any strong motivation for the type changes, this
simply puts them back, modulo the changes accumulated for ABD.

Compile-tested on amd64 and run-tested on armhf.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #6981
Closes #7023
2018-01-30 10:27:31 -06:00
Nathaniel Wesley Filardo a2ee6568c6 zhack: fix getopt return type
This fixes zhack's command processing on ARM.  On ARM char
is unsigned, and so, in promotion to an int, it will never
compare equal to -1.
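
The underlying C pitfall, shown as a minimal standalone example (this
is not the zhack source):

    #include <stdio.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        int c;   /* must be int: where plain char is unsigned (e.g. ARM), a char
                  * could never compare equal to getopt()'s -1 end-of-options value */

        while ((c = getopt(argc, argv, "v")) != -1) {
            if (c == 'v')
                printf("verbose\n");
        }
        return (0);
    }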

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #7016
2018-01-30 10:27:31 -06:00
Brian Behlendorf 9c1a8eaa51 Fix ARC hit rate
When the compressed ARC feature was added in commit d3c2ae1
the method of reference counting in the ARC was modified.  As
part of this accounting change the arc_buf_add_ref() function
was removed entirely.

This would have been fine, but the arc_buf_add_ref() function
served a second undocumented purpose of updating the ARC access
information when taking a hold on a dbuf.  Without this logic
in place a cached dbuf would not migrate its associated
arc_buf_hdr_t to the MFU list.  This would negatively impact
the ARC hit rate, particularly on systems with a small ARC.

This change reinstates the missing call to arc_access() from
dbuf_hold() by implementing a new arc_buf_access() function.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6171
Closes #6852
Closes #6989
2018-01-30 10:27:31 -06:00
LOLi a8fa31b50b Fix 'zpool add' handling of nested interior VDEVs
When replacing a faulted device which was previously handled by a spare,
multiple levels of nested interior VDEVs will be present in the pool
configuration; the following example illustrates one of the possible
situations:

   NAME                          STATE     READ WRITE CKSUM
   testpool                      DEGRADED     0     0     0
     raidz1-0                    DEGRADED     0     0     0
       spare-0                   DEGRADED     0     0     0
         replacing-0             DEGRADED     0     0     0
           /var/tmp/fault-dev    UNAVAIL      0     0     0  cannot open
           /var/tmp/replace-dev  ONLINE       0     0     0
         /var/tmp/spare-dev1     ONLINE       0     0     0
       /var/tmp/safe-dev         ONLINE       0     0     0
   spares
     /var/tmp/spare-dev1         INUSE     currently in use

This is safe and allowed, but get_replication() needs to handle this
situation gracefully to let zpool add new devices to the pool.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6678
Closes #6996
2018-01-30 10:27:31 -06:00
lidongyang 8d82a19def Call commit callbacks from the tail of the list
Our zfs backed Lustre MDT had soft lockups while under heavy metadata
workloads while handling transaction callbacks from osd_zfs.

The problem is that zfs is not taking advantage of the fast path in
Lustre's trans callback handling, where Lustre will skip the calls
to ptlrpc_commit_replies() when it has already seen a higher transaction
number.

This patch corrects that; it also has a positive impact on metadata
performance on Lustre with osd_zfs, plus some cleanup in the headers.

A similar issue for ext4/ldiskfs is described on:
https://jira.hpdd.intel.com/browse/LU-6527

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Closes #6986
2018-01-30 10:27:31 -06:00
Giuseppe Di Natale c2aacf2087 Handle broken pipes in arc_summary
Using a command similar to 'arc_summary.py | head' causes
a broken pipe exception. Gracefully exit in the case of a
broken pipe in arc_summary.py.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6965 
Closes #6969
2018-01-30 10:27:31 -06:00
LOLi 9a6c57845a Handle invalid options in arc_summary
If an invalid option is provided to arc_summary.py we handle any error
thrown from the getopt Python module and print the usage help message.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6983
2018-01-30 10:27:31 -06:00
Dominik Hassler d27a40d28f OpenZFS 8794 - cstyle generates warnings with recent perl
Authored by: Dominik Hassler <hadfl@omniosce.org>
Reviewed by: Andy Fiddaman <andy@omniosce.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8794
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/578f67364c
Closes #6973
2018-01-30 10:27:31 -06:00
Brian Behlendorf aebc5df418 Update for cppcheck v1.80
Resolve new warnings and errors from cppcheck v1.80.

* [lib/libshare/libshare.c:543]: (warning)
  Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
  Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
  Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
   Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p

Convert the following suppression to inline.

* [module/zfs/zfs_vnops.c:840]: (error)
  Possible null pointer dereference: aiov

Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macros will never be defined until this functionality
is implemented.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6879
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 7a8bef3983 Fix data on evict_skips in arc_summary.py
Display correct data from kstat arcstats for evict_skips,
which is currently repeating the data from mutex_misses.
Fixes #6882

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6882 
Closes #6883
2018-01-30 10:27:31 -06:00
Scot W. Stevenson d486dee89e Minor code cleanups in arc_python.py
Remove unused library re and associated variable kstat_pobj. Add note
to documentation at start of program about required support for old
versions of Python. Change variable "format" (which is a built-in
function) to "fmt".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6869
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 7de8fb33a2 Fix arc_summary.py -d crash with Python3
Prevents arc_summary.py crashing when called with parameter -d or
long form --description with Python3.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6849 
Closes #6850
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 904c03672b Sort output of tunables in arc_summary.py
Sort list of tunables printed by _tunable_summary()
alphabetically

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6828
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 03955e3488 Add documentation strings to arc_summary.py
Include docstrings (PEP8, PEP257) for module and all functions.
Separately, remove outdated section in comment at start of
module. Separately, remove unused global constant "usetunable".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6818
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 88e4e0d5dd Rewrite fHits() in arc_summary.py with SI units
Complete rewrite of fHits(). Move units from non-standard English
abbreviations to SI units, thereby avoiding confusion because of
"long scale" and "short scale" numbers. Remove unused parameter
"Decimal". Add function string. Aim to confirm to PEP8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6815
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 03f638a8ef Minor code cleanup in arc_summary.py
Simplify and inline single-use function div1(); inline twice-used
function div2(); add function comment to zfs_header(); replace
variable "unused" in get_Kstat() with "_" following convention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6802
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 5dc25de668 Rewrite of function fBytes() in arc_summary.py
Replace if-elif-else construction with shorter loop;
remove unused parameter "Decimal"; centralize format
string; add function documentation string; conform to
PEP8.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6784
2018-01-30 10:27:30 -06:00
David Quigley 53e5890cff Fix bug in distclean which removes needed files
Running distclean removes the following files because of an error
in Makefile.am

deleted:    tests/zfs-tests/include/commands.cfg
deleted:    tests/zfs-tests/include/libtest.shlib
deleted:    tests/zfs-tests/include/math.shlib
deleted:    tests/zfs-tests/include/properties.shlib
deleted:    tests/zfs-tests/include/zpool_script.shlib

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes #6636
2018-01-30 10:27:30 -06:00
Gvozden Neskovic a94447ddf3 dmu_objset: release bonus buffer in failure path
Reported by kmemleak during testing of a new patch:

```
unreferenced object 0xffff9f1c12e38800 (size 1024):
  comm "z_upgrade", pid 17842, jiffies 4296870904 (age 8746.268s)
  backtrace:
    kmemleak_alloc+0x7a/0x100
    __kmalloc_node+0x26c/0x510
    range_tree_create+0x39/0xa0 [zfs]
    dmu_zfetch_init+0x73/0xe0 [zfs]
    dnode_create+0x12c/0x3b0 [zfs]
    dnode_hold_impl+0x1096/0x1130 [zfs]
    dnode_hold+0x23/0x30 [zfs]
    dmu_bonus_hold_impl+0x6b/0x370 [zfs]
    dmu_bonus_hold+0x1e/0x30 [zfs]
    dmu_objset_space_upgrade+0x114/0x310 [zfs]
    dmu_objset_userobjspace_upgrade_cb+0xd8/0x150 [zfs]
    dmu_objset_upgrade_task_cb+0x136/0x1e0 [zfs]    
    kthread+0x119/0x150
```

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6575
2018-01-30 10:27:30 -06:00
Chunwei Chen 7192ec7942 Fix zfs_ioc_pool_sync should not use fnvlist
Using fnvlist on user input would allow a user to easily panic zfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6529
2018-01-30 10:27:30 -06:00
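The distinction in the commit above matters because the fnvlist_* wrappers assert on failure, while the plain nvlist_* lookups return an error the caller can handle. A hedged sketch follows; the key name "force" and the surrounding function are hypothetical, not the actual ioctl handler.

```
/* Sketch only: parsing an option from an nvlist that came from user space. */
static int
parse_user_innvl_sketch(nvlist_t *innvl, boolean_t *force)
{
	/*
	 * Wrong for untrusted input: fnvlist_lookup_boolean_value()
	 * asserts if the pair is missing or has the wrong type, which
	 * in the kernel means a panic a user can trigger at will.
	 *
	 *   *force = fnvlist_lookup_boolean_value(innvl, "force");
	 */

	/* Safe: report EINVAL instead of asserting. */
	if (nvlist_lookup_boolean_value(innvl, "force", force) != 0)
		return (EINVAL);

	return (0);
}
```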
Gvozden Neskovic 06acbbc429 vdev_mirror: load balancing fixes
vdev_queue:
- Track the last position of each vdev, including the io size,
  in order to detect linear access of the following zio.
- Remove duplicate `vq_lastoffset`

vdev_mirror:
- Correctly calculate the zio offset (signedness issue)
- Deprecate `vdev_queue_register_lastoffset()`
- Add `VDEV_LABEL_START_SIZE` to zio offset of leaf vdevs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6461
2018-01-30 10:27:30 -06:00
BtbN 6116bbd744 Use /sbin/openrc-run for openrc init scripts
Using /sbin/runscript is deprecated and throws a QA warning
when still used in init scripts.

Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: BtbN <btbn@btbn.de>
Closes #6519
2018-01-30 10:27:30 -06:00
Tony Hutter a803eacf26 Tag zfs-0.7.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-18 10:57:47 -08:00
Brian Behlendorf 504bfc8b49 Fix multihost stale cache file import
When the multihost property is enabled it should be impossible to
import an active pool even using the force (-f) option.  This patch
prevents a forced import from succeeding when importing with a
stale cache file.

The root cause of the problem is that the kernel modules trusted
the hostid provided in configuration.  This is always correct when
the configuration is generated by scanning for the pool.  However,
when using an existing cache file the hostid could be stale which
would result in the activity check being skipped.

Resolve the issue by always using the hostid read from the label
configuration where the best uberblock was found.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6933
Closes #6971
2017-12-18 10:31:01 -08:00
Olaf Faaland 53a8cbd70e Fix ZTS MMP tests and ztest -M behavior
Quote "$MMP_IMPORT_MSG" when it is passed as an argument, as it is a
multi-word string.  Some tests were passing when they should not have,
because the grep was only testing for the first word.

Correct the message expected when no hostid is set and the test attempts
to enable multihost.  It did not match the actual output in that
situation.

Disable ztest_reguid() when ztest is invoked with the -M option.  If
ztest performs a reguid, a concurrent import attempt may fail with the
error "one or more devices is currently unavailable" if the guid sum is
calculated on the original device guids but compared against the guid
sum ztest wrote based on the new device guids.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6666
2017-12-18 10:14:39 -08:00
David Qian 505b97ae20 Enable QAT support in zfs-dkms RPM
Enable QAT accelerated gzip compression in the zfs-dkms RPM package when
the environment variable ICP_ROOT is set to the QAT driver source code
folder and QAT hardware is present.  Otherwise, use the default gzip
compression.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: David Qian <david.qian@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6932
2017-12-18 10:02:19 -08:00
Lalufu 30a64ebaed Add zfs-import.target services in spec file
Add missing zfs-import.target to list of systemd services in zfs
RPM spec file.

Reviewed-by: Niklas Wagner <Skaro@Skaronator.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Issue #6953
Closes #6955
2017-12-18 09:45:01 -08:00
Antonio Russo da16fc5739 Enable zfs-import.target in systemd preset (#6968)
Cherry picked line from PR #6822, this enables the new
target introduced in PR #6764.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
2017-12-18 09:43:55 -08:00
Tony Hutter 3c7fa6ca33 Tag zfs-0.7.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-07 10:25:36 -08:00
Tony Hutter 36e0ddb744 Revert "Long hold the dataset during upgrade"
This reverts commit a5c8119eba.

The commit (which was modified to remove encryption) was hitting
ASSERT(dsl_pool_config_held(dmu_objset_pool(os))) in
dmu_objset_upgrade() during automated testing.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-06 13:25:40 -06:00
Giuseppe Di Natale cf21b5b5b2 Allow test-runner to filter test groups by tag
Enable test-runner to accept a list of tags to identify
which test groups the user wishes to run.

Also allow test-runner to perform multiple iterations
of a test run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6788
2017-12-06 13:25:40 -06:00
Brian Behlendorf 1030f807ba Fix NFS sticky bit permission denied error
When zfs_sticky_remove_access() was originally adapted for Linux
a typo was made which altered the intended behavior.  As described
in the block comment, the intended behavior is that permission
should be granted when the entry is a regular file and you have
write access.  That is, S_ISREG should have been used instead of
S_ISDIR.

Restricting permission to regular files made good sense for older
systems where setting the bit on executable files would instruct
the system to save the program's text segment on the swap device.

On modern systems this behavior has been replaced by the sticky
bit acting as a restricted deletion flag and the plain file
restriction has been relaxed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6889
Closes #6910
2017-12-04 17:22:47 -08:00
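A small standalone illustration of the one-character fix above (not the ZFS function itself): the sticky-bit exception should apply to regular files the caller can write, so S_ISREG is the right predicate, not S_ISDIR. The helper and its 'have_write_access' flag are toy stand-ins for the real permission test.

```
#include <stdio.h>
#include <sys/stat.h>

/* Toy model of the check described above; illustrative only. */
static int
sticky_remove_allowed(mode_t mode, int have_write_access)
{
	/* The typo used S_ISDIR(mode) here, restricting the wrong case. */
	return (S_ISREG(mode) && have_write_access);
}

int
main(void)
{
	struct stat st;

	if (stat("/etc/hostname", &st) == 0)
		printf("removable by writer: %d\n",
		    sticky_remove_allowed(st.st_mode, 1));
	return (0);
}
```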
JKDingwall d45702bcfa Add /usr/bin/env to COPY_EXEC_LIST initramfs hook
5dc1ff29 changed the user space program used to mount a zfs snapshot
from /bin/sh to /usr/bin/env.  If the executable is not present
in the initramfs then snapshots cannot be automounted.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: James Dingwall <james.dingwall@zynstra.com>
Closes #5360
Closes #6913
Conflicts:
	contrib/initramfs/hooks/zfs
2017-12-04 17:22:36 -08:00
Brian Behlendorf ddd20dbe0b Fix 'zpool create|add' replication level check
When the pool configuration contains a hole due to a previous device
removal ignore this top level vdev.  Failure to do so will result in
the current configuration being assessed to have a non-uniform
replication level and the expected warning will be disabled.

The zpool_add_010_pos test case was extended to cover this scenario.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6907
Closes #6911
2017-12-04 17:21:39 -08:00
Brian Behlendorf 4a98780933 Preserve itx alloc size for zio_data_buf_free()
Using zio_data_buf_alloc() to allocate the itx's may be unsafe
because the itx->itx_lr.lrc_reclen field is not constant from
allocation to free.  Using a different itx->itx_lr.lrc_reclen
size in zio_data_buf_free() can result in the allocation being
returned to the wrong kmem cache.

This issue can be avoided entirely by storing the allocation size
in itx->itx_size and using that for zio_data_buf_free().

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6912
2017-12-04 17:21:39 -08:00
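A hedged sketch of the pattern described above (field and function names follow the commit text, not the literal diff): record the size used at allocation time and pass that same value back to the kmem-cache-backed free routine.

```
/* Sketch only: allocate an itx and remember the allocation size. */
itx_t *
zil_itx_create_sketch(size_t lrsize)
{
	size_t itxsize = offsetof(itx_t, itx_lr) + lrsize;
	itx_t *itx;

	itx = zio_data_buf_alloc(itxsize);
	itx->itx_size = itxsize;		/* preserved for the free */
	itx->itx_lr.lrc_reclen = lrsize;	/* may change before free */
	return (itx);
}

/* Sketch only: free with the recorded size, not lrc_reclen. */
void
zil_itx_destroy_sketch(itx_t *itx)
{
	zio_data_buf_free(itx, itx->itx_size);
}
```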
LOLi 6db8f1a0d1 Fix 'zfs get {user|group}objused@' functionality
Fix a regression accidentally introduced in 1b81ab4 that prevents
'zfs get {user|group}objused@' from correctly reporting the requested
value.

Update "userspace_003_pos.ksh" and "groupspace_003_pos.ksh" to verify
this functionality.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6908
2017-12-04 17:21:39 -08:00
Mark Wright e06711412b Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT
Fix build errors with gcc 7.2.0 on Gentoo with kernel 4.14
built with CONFIG_GCC_PLUGIN_RANDSTRUCT=y such as:

module/nvpair/nvpair.c:2810:2:error:
positional initialization of field in 'struct' declared with
'designated_init' attribute [-Werror=designated-init]
  nvs_native_nvlist,
  ^~~~~~~~~~~~~~~~~

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Wright <gienah@gentoo.org>
Closes #5390
Closes #6903
2017-12-04 17:21:39 -08:00
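For readers unfamiliar with the plugin: RANDSTRUCT may reorder struct members at build time, so positional initializers can no longer be trusted, while designated initializers name each field explicitly. A minimal standalone example of the two styles (toy struct, not the nvpair code):

```
#include <stdio.h>

struct ops {
	int (*op_open)(void);
	int (*op_close)(void);
};

static int my_open(void)  { return (1); }
static int my_close(void) { return (2); }

/* Positional: breaks if the struct layout is randomized. */
static const struct ops positional_ops = { my_open, my_close };

/* Designated: each member is named, so reordering is harmless. */
static const struct ops designated_ops = {
	.op_open  = my_open,
	.op_close = my_close,
};

int
main(void)
{
	printf("%d %d\n", positional_ops.op_open(), designated_ops.op_close());
	return (0);
}
```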
Richard Laager 68ba1d2fa9 initramfs: Honor canmount=off
The initramfs script was not honoring canmount=off.  With this change,
it does.  If the administrator has asked that a filesystem not be
mounted, that should be honored.

As an exception, the initramfs script ignores canmount=off on the
rootfs.  The rootfs should not have canmount=off set either.  However,
mounting it anyway seems harmless because it is being asked for
explicitly.  The point of this exception is to avoid the risk of
breaking existing systems, just in case someone has canmount=off set on
their rootfs.

The initramfs still mounts filesystems with canmount=noauto.  This is
necessary because it is typical to set that on the rootfs so that it can
be cloned.  Without canmount=noauto, the clones' duplicate mountpoints
would conflict.

This is the remainder of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897
2017-12-04 17:21:38 -08:00
Richard Laager 4e11137989 initramfs: Honor mountpoint=none/legacy
For filesystems that are children of the rootfs, when mountpoint=none or
mountpoint=legacy, the initramfs script would assume a mountpoint based
on the dataset path.  Given that the rootfs should have mountpoint=/ and
mountpoint inheritance is the default behavior of ZFS, this behavior
seems unnecessary.  In any event, it turns mountpoint=none into a no-op.
That removes this option from the administrator, and if someone uses it,
it does not work as expected.  Worse yet, if the mountpoint directory
does not exist (which is the typical case for mountpoint=none), the
mounting and thus the boot process will fail.  For the case of
mountpoint=legacy, the assumed mountpoint may not be the correct value
set in /etc/fstab.

This change makes the initramfs script not mount the filesystem in
either case.  For mountpoint=none, this means we are correctly honoring
the setting.  For mountpoint=legacy, there are two scenarios:  If
canmount=on, the filesystem will be mounted by the normal mechanisms
later in the boot process.  If canmount=noauto, the filesystem will not
be mounted at all, unless the administrator has done something special.
If they're not doing something special and they want it mounted by the
initramfs, they can simply not set mountpoint=legacy.

This is part of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897
2017-12-04 17:21:38 -08:00
DeHackEd be9be1cc3e zpool(8): Fix "zpool import -t"
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: DHE <git@dehacked.net>
Closes #6894
2017-12-04 17:21:38 -08:00
George G eab4536081 Fix column alignment with long zpool names
`zpool status` normally aligns NAME/STATE/etc columns:

    NAME                       STATE     READ WRITE CKSUM
    dummy                      ONLINE       0     0     0
      mirror-0                 ONLINE       0     0     0
        /tmp/dummy-long-1.bin  ONLINE       0     0     0
        /tmp/dummy-long-2.bin  ONLINE       0     0     0
      mirror-1                 ONLINE       0     0     0
        /tmp/dummy-long-3.bin  ONLINE       0     0     0
        /tmp/dummy-long-4.bin  ONLINE       0     0     0

However, if the zpool name is longer than the zvol names, alignment
issues arise:

    NAME                  STATE     READ WRITE CKSUM
    dummy-very-very-long-zpool-name  ONLINE       0     0     0
      mirror-0            ONLINE       0     0     0
        /tmp/dummy-1.bin  ONLINE       0     0     0
        /tmp/dummy-2.bin  ONLINE       0     0     0
      mirror-1            ONLINE       0     0     0
        /tmp/dummy-3.bin  ONLINE       0     0     0
        /tmp/dummy-4.bin  ONLINE       0     0     0

`zpool iostat` and `zpool import` are also affected:

                  capacity     operations     bandwidth
    pool        alloc   free   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    dummy        104K  1.97G      0      0    152  9.84K
    dummy-very-very-long-zpool-name   152K  1.97G      0      1    144  13.1K
    ----------  -----  -----  -----  -----  -----  -----

    dummy-very-very-long-zpool-name  ONLINE
      mirror-0            ONLINE
        /tmp/dummy-1.bin  ONLINE
        /tmp/dummy-2.bin  ONLINE
      mirror-1            ONLINE
        /tmp/dummy-3.bin  ONLINE
        /tmp/dummy-4.bin  ONLINE

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Gaydarov <git@gg7.io>
Closes #6786
2017-12-04 17:21:38 -08:00
Brian Behlendorf 954516cec1 Emit history events for 'zpool create'
History commands and events were being suppressed for the
'zpool create' command since the history object did not
yet exist.  Create the object earlier so this history
doesn't get lost.

Split the pool_destroy event in to pool_destroy and
pool_export so they may be distinguished.

Updated events_001_pos and events_002_pos test cases.  They
now check for the expected history events and were reworked
to be more reliable.

Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6712
Closes #6486
Conflicts:
	tests/zfs-tests/tests/functional/events/events_002_pos.ksh
2017-12-04 17:21:03 -08:00
Brian Behlendorf 841cb5ee2a Fix dirty check in dmu_offset_next()
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3125 
Closes #6867
2017-11-21 13:11:29 -06:00
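A hedged sketch of the check described above (field names follow the dnode structure; not the literal patch): a dnode is dirty if it is linked onto any of the per-txg dirty lists, so all TXG_SIZE links must be inspected rather than dn_dirtyctx alone.

```
/* Sketch only: "is this dnode dirty in any open txg?" (locking elided) */
static boolean_t
dnode_is_dirty_sketch(dnode_t *dn)
{
	for (int i = 0; i < TXG_SIZE; i++) {
		if (list_link_active(&dn->dn_dirty_link[i]))
			return (B_TRUE);
	}
	return (B_FALSE);
}
```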
Brian Behlendorf d4cf31275b Disable automatic dependencies in zfs-test package
All of the ZTS test scripts specify /bin/ksh as the interpreter.
Unfortunately, as of Fedora 27 only /usr/bin/ksh is provided by
the package manager.  Rather than change all the scripts to
accommodate the latest Fedora, disable automatic dependencies
for the zfs-test package.  Functionally this will not cause
any problems since /bin is a symlink to /usr/bin.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6868
2017-11-21 13:11:29 -06:00
LOLi fedc1d96a8 Fix truncate(2) mtime and ctime handling
On Linux, ftruncate(2) always changes the file timestamps, even if the
file size is not changed. However, in case of a successful
truncate(2), the timestamps are updated only if the file size changes.
This translates to the VFS calling the ZFS Posix Layer "setattr"
function (zpl_setattr) with ATTR_MTIME and ATTR_CTIME unconditionally
set on the iattr mask only when doing a ftruncate(2), while the
truncate(2) is left to the filesystem implementation to be dealt with.

This behaviour is consistent with POSIX:2004/SUSv3 specifications
where there's no explicit requirement for file size changes to update
the timestamps only for ftruncate(2):

http://pubs.opengroup.org/onlinepubs/009695399/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/ftruncate.html

This has been later updated in POSIX:2008/SUSv4 where, for both
truncate(2)/ftruncate(2), there's no mention of this size change
requirement:

http://austingroupbugs.net/view.php?id=489
http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html

Unfortunately the Linux VFS is still calling into the ZPL without
ATTR_MTIME/ATTR_CTIME set in the truncate(2) case: we fix this by
explicitly updating the timestamps when detecting the ATTR_SIZE bit,
which is always set in do_truncate(), on the iattr mask.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6811
Closes #6819
2017-11-21 13:11:29 -06:00
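A hedged sketch of the idea in a Linux setattr path (assumed shape, not the exact diff): when the VFS passes ATTR_SIZE without the timestamp bits, add them so the size change also updates mtime and ctime.

```
/*
 * Sketch only, in a zpl_setattr()-style handler: do_truncate() always
 * sets ATTR_SIZE, so treat a size change as a timestamp change too.
 */
static void
fixup_truncate_times_sketch(struct iattr *ia)
{
	if (ia->ia_valid & ATTR_SIZE)
		ia->ia_valid |= ATTR_MTIME | ATTR_CTIME;
}
```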
benrubson 59511072b4 OpenZFS 7531 - Assign correct flags to prefetched buffers
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Authored by: abraunegg <alex.braunegg@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/7531
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/468008cb
2017-11-21 13:11:29 -06:00
Arkadiusz Bubała a5c8119eba Long hold the dataset during upgrade
If a receive or rollback is performed while the filesystem is upgrading,
the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This
will lead to a NULL pointer dereference when the upgrade tries to access
the evicted objset.

This commit adds a long hold of the dataset during the whole upgrade
process.  Receive and rollback will return an EBUSY error until the
upgrade has finished.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #5295
Closes #6837
2017-11-21 13:03:21 -06:00
Tim Chase d7881a6dca Handle compressed buffers in __dbuf_hold_impl()
In __dbuf_hold_impl(), if a buffer is currently syncing and is still
referenced from db_data, a copy is made in case it is dirtied again in
the txg.  Previously, the buffer for the copy was simply allocated with
arc_alloc_buf() which doesn't handle compressed or encrypted buffers
(which are a special case of a compressed buffer).  The result was
typically an invalid memory access because the newly-allocated buffer
was of the uncompressed size.

This commit fixes the problem by handling the 2 compressed cases,
encrypted and unencrypted, respectively, with arc_alloc_raw_buf() and
arc_alloc_compressed_buf().

Although using the proper allocation functions fixes the invalid memory
access by allocating a buffer of the compressed size, another unrelated
issue made it impossible to properly detect compressed buffers in the
first place.  The header's compression flag was set to ZIO_COMPRESS_OFF
in arc_write() when it was possible that an attached buffer was actually
compressed.  This commit adds logic to only set ZIO_COMPRESS_OFF in
the non-ZIO_RAW case, which will handle both cases of compressed buffers
(encrypted or unencrypted).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5742
Closes #6797
2017-11-21 13:01:30 -06:00
LOLi 951e62169e Fix undefined %{systemd_svcs} in RPM scriptlets
This allows RPM-based systems to properly control package installation
and removal when using systemd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6838 
Closes #6841
2017-11-20 16:48:26 -06:00
wli5 9add19b37d Bug fix in qat_compress.c when compressed size is < 4KB
When a 128KB block is compressed to less than 4KB, the pointer
to the Footer is not at the end of the compressed buffer; that's
because the Header offset was added twice for this case, so there
is a gap between the Footer and the compressed buffer.
1. Always compute the Footer pointer address from the start of the
last page.
2. Remove the unused workaround code, which has been verified as fixed
with the latest driver and this fix.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6827
2017-11-20 16:48:26 -06:00
Brian Behlendorf b2d633202d Disable automatic dependencies in DKMS package
By default additional dependencies are generated automatically for
packages.  This is normally a good thing because it helps ensure
things just work.  It doesn't make sense for the DKMS package which
requires minimal dependencies that can be easily listed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6467 
Closes #6835
2017-11-20 16:48:25 -06:00
Brian Behlendorf 414f4a9c54 Initramfs fixes
* initramfs: Fix inconsistent whitespace
* initramfs: Fix a spelling error
* initramfs: Set elevator=noop on the rpool's disks

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6807
2017-11-20 16:23:33 -06:00
Antonio Russo 1c4f5e7d92 systemd zfs-import.target and documentation
zfs-import-{cache,scan}.service must complete before any mounting of
filesystems can occur. To simplify this dependency, create a target
that is reached After (in the systemd sense) the pool is imported.

Additionally, recommend that legacy zfs mounts use the option

x-systemd.requires=zfs-import.target

to codify this requirement.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6764
2017-11-20 16:20:08 -06:00
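For example, a legacy-mounted dataset in /etc/fstab might carry the recommended option like this (the pool/dataset name and mountpoint are placeholders):

```
# /etc/fstab — illustrative entry for a legacy ZFS mount
tank/data  /data  zfs  defaults,x-systemd.requires=zfs-import.target  0  0
```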
abraunegg 246e515cf8 Update zfs module parameters man5
Update zfs module parameters man5 with missing parameter details
for multiple tunings.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Braunegg <alex.braunegg@gmail.com>
Closes #6785
2017-11-20 16:19:54 -06:00
Brian Behlendorf 2d41e75e52 Fix status command options in zpool(8)
The 'zpool status' command supports the -P option for printing full
path names.  It does not support the -p parsable option for printing
exact values.
    
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6792 
Closes #6794
2017-11-20 16:19:23 -06:00
Fabian-Gruenbichler d834d6811b arcstat: flush stdout / outfile after each line
Otherwise, if arcstat gets interrupted before the desired number of
iterations is reached, the output file will be empty (both if set via
'-o' or via shell redirection).

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #6775
2017-11-20 16:19:23 -06:00
Giuseppe Di Natale 029a1b0c20 Ensure arc_size_break is filled in arc_summary.py
Use mfu_size and mru_size pulled from the arcstats
kstat file to calculate the mfu and mru percentages
for arc size breakdown.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: AndCycle <andcycle@andcycle.idv.tw>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5526 
Closes #6770
2017-11-20 16:19:23 -06:00
Giuseppe Di Natale c45254b0ec Correct flake8 errors after STYLE builder update
Fix new flake8 errors related to bare excepts and ambiguous
variable names due to a STYLE builder update.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6776
2017-11-20 16:19:23 -06:00
wli5 318fdeb51f Support integration with new QAT products
Support integration with new QAT products: Intel(R) C62x Chipset,
or Atom(R) C3000 Processor Product Family SoC:
1. Detect new file name in auto-conf.
2. Change MAX_INSTANCES to 48.
3. Change "num_inst" to U16 to clean a build warning.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6767
2017-11-20 16:19:23 -06:00
Olaf Faaland d3d20bf442 Reimplement vdev_random_leaf and rename it
Rename it as mmp_random_leaf() since it is defined in mmp.c.

The earlier implementation could end up spinning forever if a pool had a
vdev marked writeable, none of whose children were writeable.  It also
did not guarantee that if a writeable leaf vdev existed, it would be
found.

Reimplement to recursively walk the device tree to select the leaf.  It
searches the entire tree, so that a return value of (NULL) indicates
there were no usable leaves in the pool; all were either not writeable
or had pending mmp writes.

It still chooses the starting child randomly at each level of the tree,
so if the pool's devices are healthy, the mmp writes go to random leaves
with an even distribution.  This was verified by testing using
zfs_multihost_history enabled.

Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6631 
Closes #6665
2017-11-20 16:19:23 -06:00
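A hedged sketch of the recursive selection described above (ZFS-style field names; the mmp "pending write" test is elided): pick a random starting child at each level, try each child in turn, and return NULL if no usable leaf exists anywhere below.

```
/* Sketch only: recursively pick a random writeable leaf, or NULL. */
static vdev_t *
mmp_random_leaf_sketch(vdev_t *vd)
{
	uint64_t start, i;

	if (vd->vdev_ops->vdev_op_leaf)
		return (vdev_writeable(vd) ? vd : NULL);

	if (vd->vdev_children == 0)
		return (NULL);

	/* A random starting child keeps mmp writes evenly distributed. */
	start = spa_get_random(vd->vdev_children);
	for (i = 0; i < vd->vdev_children; i++) {
		vdev_t *leaf = mmp_random_leaf_sketch(
		    vd->vdev_child[(start + i) % vd->vdev_children]);
		if (leaf != NULL)
			return (leaf);
	}
	return (NULL);
}
```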
452 changed files with 9676 additions and 2959 deletions

View File

@ -161,7 +161,7 @@ coding convention.
### Commit Message Formats
#### New Changes
Commit messages for new changes must meet the following guidelines:
* In 50 characters or less, provide a summary of the change as the
* In 72 characters or less, provide a summary of the change as the
first line in the commit message.
* A body which provides a description of the change. If necessary,
please summarize important information such as why the proposed

30
.github/codecov.yml vendored Normal file
View File

@ -0,0 +1,30 @@
codecov:
notify:
require_ci_to_pass: no
coverage:
precision: 2
round: down
range: "50...100"
status:
project:
default:
threshold: 1%
patch:
default:
threshold: 1%
parsers:
gcov:
branch_detection:
conditional: yes
loop: yes
method: no
macro: no
comment:
layout: "header, sunburst, diff"
behavior: default
require_changes: no

View File

@ -1,4 +1,3 @@
nullPointer:./module/zfs/zfs_vnops.c:839
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_avx512f.c:243
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_sse2.c:266

2
.gitignore vendored
View File

@ -19,6 +19,8 @@
*.mod.c
*~
*.swp
*.gcno
*.gcda
.deps
.libs
.dirstamp

2
META
View File

@ -1,7 +1,7 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.7.3
Version: 0.7.13
Release: 1
Release-Tags: relext
License: CDDL

View File

@ -23,16 +23,18 @@ EXTRA_DIST = autogen.sh copy-builtin
EXTRA_DIST += config/config.awk config/rpm.am config/deb.am config/tgz.am
EXTRA_DIST += META DISCLAIMER COPYRIGHT README.markdown OPENSOLARIS.LICENSE
@CODE_COVERAGE_RULES@
distclean-local::
-$(RM) -R autom4te*.cache
-find . \( -name SCCS -o -name BitKeeper -o -name .svn -o -name CVS \
-o -name .pc -o -name .hg -o -name .git \) -prune -o \
\( -name '*.orig' -o -name '*.rej' -o -name '*~' \
-o -name '*.bak' -o -name '#*#' -o -name '.*.orig' \
-o -name '.*.rej' -o -name '.script-config' -o -size 0 \
-o -name '*%' -o -name '.*.cmd' -o -name 'core' \
-o -name 'Makefile' -o -name 'Module.symvers' \
-o -name '*.order' -o -name '*.markers' \) \
-o -name '.*.rej' -o -size 0 -o -name '*%' -o -name '.*.cmd' \
-o -name 'core' -o -name 'Makefile' -o -name 'Module.symvers' \
-o -name '*.order' -o -name '*.markers' -o -name '*.gcda' \
-o -name '*.gcno' \) \
-type f -print | xargs $(RM)
dist-hook:
@ -65,10 +67,10 @@ lint: cppcheck paxcheck
cppcheck:
@if type cppcheck > /dev/null 2>&1; then \
cppcheck --quiet --force --error-exitcode=2 \
cppcheck --quiet --force --error-exitcode=2 --inline-suppr \
--suppressions-list=.github/suppressions.txt \
-UHAVE_SSE2 -UHAVE_AVX512F \
${top_srcdir}; \
-UHAVE_SSE2 -UHAVE_AVX512F -UHAVE_UIO_ZEROCOPY \
-UHAVE_DNLC ${top_srcdir}; \
fi
paxcheck:

View File

@ -1,9 +1,9 @@
<p align="center"><img src="http://zfsonlinux.org/images/zfs-linux.png"/></p>
ZFS is an advanced file system and volume manager which was originally
developed for Solaris and is now maintained by the Illumos community.
![img](http://zfsonlinux.org/images/zfs-linux.png)
ZFS on Linux, which is also known as ZoL, is currently feature complete. It
includes fully functional and stable SPA, DMU, ZVOL, and ZPL layers. And it's native!
ZFS on Linux is an advanced file system and volume manager which was originally
developed for Solaris and is now maintained by the OpenZFS community.
[![codecov](https://codecov.io/gh/zfsonlinux/zfs/branch/master/graph/badge.svg)](https://codecov.io/gh/zfsonlinux/zfs)
# Official Resources
* [Site](http://zfsonlinux.org)

View File

@ -32,38 +32,56 @@
# If you are having troubles when using this script from cron(8) please try
# adjusting your PATH before reporting problems.
#
# /usr/bin & /sbin
#
# Binaries used are:
#
# dc(1), kldstat(8), sed(1), sysctl(8) & vmstat(8)
#
# Binaries that I am working on phasing out are:
#
# dc(1) & sed(1)
# Note some of this code uses older code (eg getopt instead of argparse,
# subprocess.Popen() instead of subprocess.run()) because we need to support
# some very old versions of Python.
"""Print statistics on the ZFS Adjustable Replacement Cache (ARC)
Provides basic information on the ARC, its efficiency, the L2ARC (if present),
the Data Management Unit (DMU), Virtual Devices (VDEVs), and tunables. See the
in-source documentation and code at
https://github.com/zfsonlinux/zfs/blob/master/module/zfs/arc.c for details.
"""
import getopt
import os
import sys
import time
import getopt
import re
from os import listdir
import errno
from subprocess import Popen, PIPE
from decimal import Decimal as D
usetunable = True
show_tunable_descriptions = False
alternate_tunable_layout = False
kstat_pobj = re.compile("^([^:]+):\s+(.+)\s*$", flags=re.M)
def handle_Exception(ex_cls, ex, tb):
if ex is IOError:
if ex.errno == errno.EPIPE:
sys.exit()
if ex is KeyboardInterrupt:
sys.exit()
sys.excepthook = handle_Exception
def get_Kstat():
"""Collect information on the ZFS subsystem from the /proc virtual
file system. The name "kstat" is a holdover from the Solaris utility
of the same name.
"""
def load_proc_kstats(fn, namespace):
"""Collect information on a specific subsystem of the ARC"""
kstats = [line.strip() for line in open(fn)]
del kstats[0:2]
for kstat in kstats:
kstat = kstat.strip()
name, unused, value = kstat.split()
name, _, value = kstat.split()
Kstat[namespace + name] = D(value)
Kstat = {}
@ -77,82 +95,79 @@ def get_Kstat():
return Kstat
def div1():
sys.stdout.write("\n")
for i in range(18):
sys.stdout.write("%s" % "----")
sys.stdout.write("\n")
def fBytes(b=0):
"""Return human-readable representation of a byte value in
powers of 2 (eg "KiB" for "kibibytes", etc) to two decimal
points. Values smaller than one KiB are returned without
decimal points.
"""
prefixes = [
[2**80, "YiB"], # yobibytes (yotta)
[2**70, "ZiB"], # zebibytes (zetta)
[2**60, "EiB"], # exbibytes (exa)
[2**50, "PiB"], # pebibytes (peta)
[2**40, "TiB"], # tebibytes (tera)
[2**30, "GiB"], # gibibytes (giga)
[2**20, "MiB"], # mebibytes (mega)
[2**10, "KiB"]] # kibibytes (kilo)
def div2():
sys.stdout.write("\n")
if b >= 2**10:
for limit, unit in prefixes:
def fBytes(Bytes=0, Decimal=2):
kbytes = (2 ** 10)
mbytes = (2 ** 20)
gbytes = (2 ** 30)
tbytes = (2 ** 40)
pbytes = (2 ** 50)
ebytes = (2 ** 60)
zbytes = (2 ** 70)
ybytes = (2 ** 80)
if b >= limit:
value = b / limit
break
result = "%0.2f\t%s" % (value, unit)
if Bytes >= ybytes:
return str("%0." + str(Decimal) + "f") % (Bytes / ybytes) + "\tYiB"
elif Bytes >= zbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / zbytes) + "\tZiB"
elif Bytes >= ebytes:
return str("%0." + str(Decimal) + "f") % (Bytes / ebytes) + "\tEiB"
elif Bytes >= pbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / pbytes) + "\tPiB"
elif Bytes >= tbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / tbytes) + "\tTiB"
elif Bytes >= gbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / gbytes) + "\tGiB"
elif Bytes >= mbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / mbytes) + "\tMiB"
elif Bytes >= kbytes:
return str("%0." + str(Decimal) + "f") % (Bytes / kbytes) + "\tKiB"
elif Bytes == 0:
return str("%d" % 0) + "\tBytes"
else:
return str("%d" % Bytes) + "\tBytes"
result = "%d\tBytes" % b
return result
def fHits(Hits=0, Decimal=2):
khits = (10 ** 3)
mhits = (10 ** 6)
bhits = (10 ** 9)
thits = (10 ** 12)
qhits = (10 ** 15)
Qhits = (10 ** 18)
shits = (10 ** 21)
Shits = (10 ** 24)
def fHits(hits=0):
"""Create a human-readable representation of the number of hits.
The single-letter symbols used are SI to avoid the confusion caused
by the different "short scale" and "long scale" representations in
English, which use the same words for different values. See
https://en.wikipedia.org/wiki/Names_of_large_numbers and
https://physics.nist.gov/cuu/Units/prefixes.html
"""
numbers = [
[10**24, 'Y'], # yotta (septillion)
[10**21, 'Z'], # zetta (sextillion)
[10**18, 'E'], # exa (quintrillion)
[10**15, 'P'], # peta (quadrillion)
[10**12, 'T'], # tera (trillion)
[10**9, 'G'], # giga (billion)
[10**6, 'M'], # mega (million)
[10**3, 'k']] # kilo (thousand)
if hits >= 1000:
for limit, symbol in numbers:
if hits >= limit:
value = hits/limit
break
result = "%0.2f%s" % (value, symbol)
if Hits >= Shits:
return str("%0." + str(Decimal) + "f") % (Hits / Shits) + "S"
elif Hits >= shits:
return str("%0." + str(Decimal) + "f") % (Hits / shits) + "s"
elif Hits >= Qhits:
return str("%0." + str(Decimal) + "f") % (Hits / Qhits) + "Q"
elif Hits >= qhits:
return str("%0." + str(Decimal) + "f") % (Hits / qhits) + "q"
elif Hits >= thits:
return str("%0." + str(Decimal) + "f") % (Hits / thits) + "t"
elif Hits >= bhits:
return str("%0." + str(Decimal) + "f") % (Hits / bhits) + "b"
elif Hits >= mhits:
return str("%0." + str(Decimal) + "f") % (Hits / mhits) + "m"
elif Hits >= khits:
return str("%0." + str(Decimal) + "f") % (Hits / khits) + "k"
elif Hits == 0:
return str("%d" % 0)
else:
return str("%d" % Hits)
result = "%d" % hits
return result
def fPerc(lVal=0, rVal=0, Decimal=2):
"""Calculate percentage value and return in human-readable format"""
if rVal > 0:
return str("%0." + str(Decimal) + "f") % (100 * (lVal / rVal)) + "%"
else:
@ -160,6 +175,7 @@ def fPerc(lVal=0, rVal=0, Decimal=2):
def get_arc_summary(Kstat):
"""Collect general data on the ARC"""
output = {}
memory_throttle_count = Kstat[
@ -176,16 +192,18 @@ def get_arc_summary(Kstat):
# ARC Misc.
deleted = Kstat["kstat.zfs.misc.arcstats.deleted"]
mutex_miss = Kstat["kstat.zfs.misc.arcstats.mutex_miss"]
evict_skip = Kstat["kstat.zfs.misc.arcstats.evict_skip"]
# ARC Misc.
output["arc_misc"] = {}
output["arc_misc"]["deleted"] = fHits(deleted)
output["arc_misc"]['mutex_miss'] = fHits(mutex_miss)
output["arc_misc"]['evict_skips'] = fHits(mutex_miss)
output["arc_misc"]['evict_skips'] = fHits(evict_skip)
# ARC Sizing
arc_size = Kstat["kstat.zfs.misc.arcstats.size"]
mru_size = Kstat["kstat.zfs.misc.arcstats.p"]
mru_size = Kstat["kstat.zfs.misc.arcstats.mru_size"]
mfu_size = Kstat["kstat.zfs.misc.arcstats.mfu_size"]
target_max_size = Kstat["kstat.zfs.misc.arcstats.c_max"]
target_min_size = Kstat["kstat.zfs.misc.arcstats.c_min"]
target_size = Kstat["kstat.zfs.misc.arcstats.c"]
@ -230,27 +248,14 @@ def get_arc_summary(Kstat):
]
output['arc_size_break'] = {}
if arc_size > target_size:
mfu_size = (arc_size - mru_size)
output['arc_size_break']['recently_used_cache_size'] = {
'per': fPerc(mru_size, arc_size),
'num': fBytes(mru_size),
}
output['arc_size_break']['frequently_used_cache_size'] = {
'per': fPerc(mfu_size, arc_size),
'num': fBytes(mfu_size),
}
elif arc_size < target_size:
mfu_size = (target_size - mru_size)
output['arc_size_break']['recently_used_cache_size'] = {
'per': fPerc(mru_size, target_size),
'num': fBytes(mru_size),
}
output['arc_size_break']['frequently_used_cache_size'] = {
'per': fPerc(mfu_size, target_size),
'num': fBytes(mfu_size),
}
output['arc_size_break']['recently_used_cache_size'] = {
'per': fPerc(mru_size, mru_size + mfu_size),
'num': fBytes(mru_size),
}
output['arc_size_break']['frequently_used_cache_size'] = {
'per': fPerc(mfu_size, mru_size + mfu_size),
'num': fBytes(mfu_size),
}
# ARC Hash Breakdown
hash_chain_max = Kstat["kstat.zfs.misc.arcstats.hash_chain_max"]
@ -273,6 +278,8 @@ def get_arc_summary(Kstat):
def _arc_summary(Kstat):
"""Print information on the ARC"""
# ARC Sizing
arc = get_arc_summary(Kstat)
@ -288,7 +295,7 @@ def _arc_summary(Kstat):
sys.stdout.write("\tMutex Misses:\t\t\t\t%s\n" %
arc['arc_misc']['mutex_miss'])
sys.stdout.write("\tEvict Skips:\t\t\t\t%s\n" %
arc['arc_misc']['mutex_miss'])
arc['arc_misc']['evict_skips'])
sys.stdout.write("\n")
# ARC Sizing
@ -347,6 +354,8 @@ def _arc_summary(Kstat):
def get_arc_efficiency(Kstat):
"""Collect information on the efficiency of the ARC"""
output = {}
arc_hits = Kstat["kstat.zfs.misc.arcstats.hits"]
@ -470,6 +479,8 @@ def get_arc_efficiency(Kstat):
def _arc_efficiency(Kstat):
"""Print information on the efficiency of the ARC"""
arc = get_arc_efficiency(Kstat)
sys.stdout.write("ARC Total accesses:\t\t\t\t\t%s\n" %
@ -580,6 +591,8 @@ def _arc_efficiency(Kstat):
def get_l2arc_summary(Kstat):
"""Collection information on the L2ARC"""
output = {}
l2_abort_lowmem = Kstat["kstat.zfs.misc.arcstats.l2_abort_lowmem"]
@ -674,6 +687,7 @@ def get_l2arc_summary(Kstat):
def _l2arc_summary(Kstat):
"""Print information on the L2ARC"""
arc = get_l2arc_summary(Kstat)
@ -758,6 +772,8 @@ def _l2arc_summary(Kstat):
def get_dmu_summary(Kstat):
"""Collect information on the DMU"""
output = {}
zfetch_hits = Kstat["kstat.zfs.misc.zfetchstats.hits"]
@ -783,6 +799,7 @@ def get_dmu_summary(Kstat):
def _dmu_summary(Kstat):
"""Print information on the DMU"""
arc = get_dmu_summary(Kstat)
@ -804,6 +821,8 @@ def _dmu_summary(Kstat):
def get_vdev_summary(Kstat):
"""Collect information on the VDEVs"""
output = {}
vdev_cache_delegations = \
@ -834,6 +853,8 @@ def get_vdev_summary(Kstat):
def _vdev_summary(Kstat):
"""Print information on the VDEVs"""
arc = get_vdev_summary(Kstat)
if arc['vdev_cache_total'] > 0:
@ -853,10 +874,12 @@ def _vdev_summary(Kstat):
def _tunable_summary(Kstat):
"""Print information on tunables, including descriptions if requested"""
global show_tunable_descriptions
global alternate_tunable_layout
names = listdir("/sys/module/zfs/parameters/")
names = os.listdir("/sys/module/zfs/parameters/")
values = {}
for name in names:
@ -867,13 +890,21 @@ def _tunable_summary(Kstat):
descriptions = {}
if show_tunable_descriptions:
command = ["/sbin/modinfo", "zfs", "-0"]
try:
command = ["/sbin/modinfo", "zfs", "-0"]
p = Popen(command, stdin=PIPE, stdout=PIPE,
stderr=PIPE, shell=False, close_fds=True)
p.wait()
description_list = p.communicate()[0].strip().split('\0')
# By default, Python 2 returns a string as the first element of the
# tuple from p.communicate(), while Python 3 returns bytes which
# must be decoded first. The better way to do this would be with
# subprocess.run() or at least .check_output(), but this fails on
# CentOS 6 because of its old version of Python 2
desc = bytes.decode(p.communicate()[0])
description_list = desc.strip().split('\0')
if p.returncode == 0:
for tunable in description_list:
@ -892,19 +923,23 @@ def _tunable_summary(Kstat):
(sys.argv[0], command[0], e.strerror))
sys.stderr.write("Tunable descriptions will be disabled.\n")
sys.stdout.write("ZFS Tunable:\n")
sys.stdout.write("ZFS Tunables:\n")
names.sort()
if alternate_tunable_layout:
fmt = "\t%s=%s\n"
else:
fmt = "\t%-50s%s\n"
for name in names:
if not name:
continue
format = "\t%-50s%s\n"
if alternate_tunable_layout:
format = "\t%s=%s\n"
if show_tunable_descriptions and name in descriptions:
sys.stdout.write("\t# %s\n" % descriptions[name])
sys.stdout.write(format % (name, values[name]))
sys.stdout.write(fmt % (name, values[name]))
unSub = [
@ -918,14 +953,18 @@ unSub = [
def zfs_header():
daydate = time.strftime("%a %b %d %H:%M:%S %Y")
"""Print title string with date"""
div1()
sys.stdout.write("ZFS Subsystem Report\t\t\t\t%s" % daydate)
div2()
daydate = time.strftime('%a %b %d %H:%M:%S %Y')
sys.stdout.write('\n'+'-'*72+'\n')
sys.stdout.write('ZFS Subsystem Report\t\t\t\t%s' % daydate)
sys.stdout.write('\n')
def usage():
"""Print usage information"""
sys.stdout.write("Usage: arc_summary.py [-h] [-a] [-d] [-p PAGE]\n\n")
sys.stdout.write("\t -h, --help : "
"Print this help message and exit\n")
@ -946,12 +985,20 @@ def usage():
def main():
"""Main function"""
global show_tunable_descriptions
global alternate_tunable_layout
opts, args = getopt.getopt(
sys.argv[1:], "adp:h", ["alternate", "description", "page=", "help"]
)
try:
opts, args = getopt.getopt(
sys.argv[1:],
"adp:h", ["alternate", "description", "page=", "help"]
)
except getopt.error as e:
sys.stderr.write("Error: %s\n" % e.msg)
usage()
sys.exit(1)
args = {}
for opt, arg in opts:
@ -963,7 +1010,7 @@ def main():
args['p'] = arg
if opt in ('-h', '--help'):
usage()
sys.exit()
sys.exit(0)
Kstat = get_Kstat()
@ -978,14 +1025,14 @@ def main():
except IndexError:
sys.stderr.write('the argument to -p must be between 1 and ' +
str(len(unSub)) + '\n')
sys.exit()
sys.exit(1)
else:
pages = unSub
zfs_header()
for page in pages:
page(Kstat)
div2()
sys.stdout.write("\n")
if __name__ == '__main__':

View File

@ -112,7 +112,6 @@ cur = {}
d = {}
out = None
kstat = None
float_pobj = re.compile("^[0-9]+(\.[0-9]+)?$")
def detailed_usage():
@ -219,6 +218,7 @@ def print_values():
sep
))
sys.stdout.write("\n")
sys.stdout.flush()
def print_header():
@ -238,7 +238,7 @@ def get_terminal_lines():
data = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ, '1234')
sz = struct.unpack('hh', data)
return sz[0]
except:
except Exception:
pass
@ -279,12 +279,12 @@ def init():
"outfile",
"help",
"verbose",
"seperator",
"separator",
"columns"
]
)
except getopt.error as msg:
sys.stderr.write(msg)
sys.stderr.write("Error: %s\n" % str(msg))
usage()
opts = None
@ -298,7 +298,7 @@ def init():
hflag = True
if opt in ('-v', '--verbose'):
vflag = True
if opt in ('-s', '--seperator'):
if opt in ('-s', '--separator'):
sep = arg
i += 1
if opt in ('-f', '--columns'):

View File

@ -474,7 +474,7 @@ def main():
"help",
"infile",
"outfile",
"seperator",
"separator",
"types",
"verbose",
"extended"
@ -499,7 +499,7 @@ def main():
ofile = arg
if opt in ('-r', '--raw'):
raw += 1
if opt in ('-s', '--seperator'):
if opt in ('-s', '--separator'):
sep = arg
if opt in ('-t', '--types'):
tflag = True

View File

@ -7,6 +7,8 @@ DEFAULT_INCLUDES += \
#
# Ignore the prefix for the mount helper. It must be installed in /sbin/
# because this path is hardcoded in the mount(8) for security reasons.
# However, if needed, the configure option --with-mounthelperdir= can be used
# to override the default install location.
#
sbindir=$(mounthelperdir)
sbin_PROGRAMS = mount.zfs

View File

@ -100,10 +100,11 @@ usage() {
cat << EOF
Usage: vdev_id [-h]
vdev_id <-d device> [-c config_file] [-p phys_per_port]
[-g sas_direct|sas_switch] [-m]
[-g sas_direct|sas_switch|scsi] [-m]
-c specify name of alernate config file [default=$CONFIG]
-d specify basename of device (i.e. sda)
-e Create enclose device symlinks only (/dev/by-enclosure)
-g Storage network topology [default="$TOPOLOGY"]
-m Run in multipath mode
-p number of phy's per switch port [default=$PHYS_PER_PORT]
@ -135,7 +136,7 @@ map_channel() {
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \\$2 == ${PORT} \
{ print \\$3; exit }" $CONFIG`
;;
"sas_direct")
"sas_direct"|"scsi")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \
\\$2 == \"${PCI_ID}\" && \\$3 == ${PORT} \
{ print \\$4; exit }" $CONFIG`
@ -276,6 +277,23 @@ sas_handler() {
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
;;
"ses")
# look for this SAS path in all SCSI Enclosure Services
# (SES) enclosures
sas_address=`cat $end_device_dir/sas_address 2>/dev/null`
enclosures=`lsscsi -g | \
sed -n -e '/enclosu/s/^.* \([^ ][^ ]*\) *$/\1/p'`
for enclosure in $enclosures; do
set -- $(sg_ses -p aes $enclosure | \
awk "/device slot number:/{slot=\$12} \
/SAS address: $sas_address/\
{print slot}")
SLOT=$1
if [ -n "$SLOT" ] ; then
break
fi
done
;;
esac
if [ -z "$SLOT" ] ; then
return
@ -289,6 +307,156 @@ sas_handler() {
echo ${CHAN}${SLOT}${PART}
}
scsi_handler() {
if [ -z "$FIRST_BAY_NUMBER" ] ; then
FIRST_BAY_NUMBER=`awk "\\$1 == \"first_bay_number\" \
{print \\$2; exit}" $CONFIG`
fi
FIRST_BAY_NUMBER=${FIRST_BAY_NUMBER:-0}
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
fi
# For raw disks udev exports DEVTYPE=partition when
# handling partitions, and the rules can be written to
# take advantage of this to append a -part suffix. For
# dm devices we get DEVTYPE=disk even for partitions so
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
fi
# expect sys_path like this, for example:
# /devices/pci0000:00/0000:00:0b.0/0000:09:00.0/0000:0a:05.0/0000:0c:00.0/host3/target3:1:0/3:1:0:21/block/sdv
# Use positional parameters as an ad-hoc array
set -- $(echo "$sys_path" | tr / ' ')
num_dirs=$#
scsi_host_dir="/sys"
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
done
if [ $i = $num_dirs ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
# In scsi mode, the directory two levels beneath
# /sys/.../hostX reveals the port and slot.
port_dir=$scsi_host_dir
j=$(($i + 2))
i=$(($i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
done
set -- $(echo $port_dir | sed -e 's/^.*:\([^:]*\):\([^:]*\)$/\1 \2/')
PORT=$1
SLOT=$(($2 + $FIRST_BAY_NUMBER))
if [ -z "$SLOT" ] ; then
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
}
# Figure out the name for the enclosure symlink
enclosure_handler () {
# We get all the info we need from udev's DEVPATH variable:
#
# DEVPATH=/sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/subsystem/devices/0:0:0:0/scsi_generic/sg0
# Get the enclosure ID ("0:0:0:0")
ENC=$(basename $(readlink -m "/sys/$DEVPATH/../.."))
if [ ! -d /sys/class/enclosure/$ENC ] ; then
# Not an enclosure, bail out
return
fi
# Get the long sysfs device path to our enclosure. Looks like:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0/ ... /enclosure/0:0:0:0
ENC_DEVICE=$(readlink /sys/class/enclosure/$ENC)
# Grab the full path to the hosts port dir:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0
PORT_DIR=$(echo $ENC_DEVICE | grep -Eo '.+host[0-9]+/port-[0-9]+:[0-9]+')
# Get the port number
PORT_ID=$(echo $PORT_DIR | grep -Eo "[0-9]+$")
# The PCI directory is two directories up from the port directory
# /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
PCI_ID_LONG=$(basename $(readlink -m "/sys/$PORT_DIR/../.."))
# Strip down the PCI address from 0000:05:00.0 to 05:00.0
PCI_ID=$(echo "$PCI_ID_LONG" | sed -r 's/^[0-9]+://g')
# Name our device according to vdev_id.conf (like "L0" or "U1").
NAME=$(awk "/channel/{if (\$1 == \"channel\" && \$2 == \"$PCI_ID\" && \
\$3 == \"$PORT_ID\") {print \$4int(count[\$4])}; count[\$4]++}" $CONFIG)
echo "${NAME}"
}
alias_handler () {
# Special handling is needed to correctly append a -part suffix
# to partitions of device mapper devices. The DEVTYPE attribute
@ -344,7 +512,7 @@ alias_handler () {
done
}
while getopts 'c:d:g:mp:h' OPTION; do
while getopts 'c:d:eg:mp:h' OPTION; do
case ${OPTION} in
c)
CONFIG=${OPTARG}
@ -352,6 +520,16 @@ while getopts 'c:d:g:mp:h' OPTION; do
d)
DEV=${OPTARG}
;;
e)
# When udev sees a scsi_generic device, it calls this script with -e to
# create the enclosure device symlinks only. We also need
# "enclosure_symlinks yes" set in vdev_id.config to actually create the
# symlink.
ENCLOSURE_MODE=$(awk '{if ($1 == "enclosure_symlinks") print $2}' $CONFIG)
if [ "$ENCLOSURE_MODE" != "yes" ] ; then
exit 0
fi
;;
g)
TOPOLOGY=$OPTARG
;;
@ -371,7 +549,7 @@ if [ ! -r $CONFIG ] ; then
exit 0
fi
if [ -z "$DEV" ] ; then
if [ -z "$DEV" -a -z "$ENCLOSURE_MODE" ] ; then
echo "Error: missing required option -d"
exit 1
fi
@ -384,16 +562,37 @@ if [ -z "$BAY" ] ; then
BAY=`awk "\\$1 == \"slot\" {print \\$2; exit}" $CONFIG`
fi
TOPOLOGY=${TOPOLOGY:-sas_direct}
# Should we create /dev/by-enclosure symlinks?
if [ "$ENCLOSURE_MODE" = "yes" -a "$TOPOLOGY" = "sas_direct" ] ; then
ID_ENCLOSURE=$(enclosure_handler)
if [ -z "$ID_ENCLOSURE" ] ; then
exit 0
fi
# Just create the symlinks to the enclosure devices and then exit.
ENCLOSURE_PREFIX=$(awk '/enclosure_symlinks_prefix/{print $2}' $CONFIG)
if [ -z "$ENCLOSURE_PREFIX" ] ; then
ENCLOSURE_PREFIX="enc"
fi
echo "ID_ENCLOSURE=$ID_ENCLOSURE"
echo "ID_ENCLOSURE_PATH=by-enclosure/$ENCLOSURE_PREFIX-$ID_ENCLOSURE"
exit 0
fi
# First check if an alias was defined for this device.
ID_VDEV=`alias_handler`
if [ -z "$ID_VDEV" ] ; then
BAY=${BAY:-bay}
TOPOLOGY=${TOPOLOGY:-sas_direct}
case $TOPOLOGY in
sas_direct|sas_switch)
ID_VDEV=`sas_handler`
;;
scsi)
ID_VDEV=`scsi_handler`
;;
*)
echo "Error: unknown topology $TOPOLOGY"
exit 1


@ -24,7 +24,7 @@
* Copyright (c) 2011, 2016 by Delphix. All rights reserved.
* Copyright (c) 2014 Integros [integros.com]
* Copyright 2016 Nexenta Systems, Inc.
* Copyright (c) 2017 Lawrence Livermore National Security, LLC.
* Copyright (c) 2017, 2018 Lawrence Livermore National Security, LLC.
* Copyright (c) 2015, 2017, Intel Corporation.
*/
@ -2716,10 +2716,6 @@ dump_label(const char *dev)
exit(1);
}
if (ioctl(fd, BLKFLSBUF) != 0)
(void) printf("failed to invalidate cache '%s' : %s\n", path,
strerror(errno));
if (fstat64_blk(fd, &statbuf) != 0) {
(void) printf("failed to stat '%s': %s\n", path,
strerror(errno));
@ -2727,6 +2723,10 @@ dump_label(const char *dev)
exit(1);
}
if (S_ISBLK(statbuf.st_mode) && ioctl(fd, BLKFLSBUF) != 0)
(void) printf("failed to invalidate cache '%s' : %s\n", path,
strerror(errno));
avl_create(&config_tree, cksum_record_compare,
sizeof (cksum_record_t), offsetof(cksum_record_t, link));
avl_create(&uberblock_tree, cksum_record_compare,
@ -3313,7 +3313,7 @@ dump_block_stats(spa_t *spa)
uint64_t norm_alloc, norm_space, total_alloc, total_found;
int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA | TRAVERSE_HARD;
boolean_t leaks = B_FALSE;
int e, c;
int e, c, err;
bp_embedded_type_t i;
(void) printf("\nTraversing all blocks %s%s%s%s%s...\n\n",
@ -3354,7 +3354,7 @@ dump_block_stats(spa_t *spa)
zcb.zcb_totalasize = metaslab_class_get_alloc(spa_normal_class(spa));
zcb.zcb_start = zcb.zcb_lastprint = gethrtime();
zcb.zcb_haderrors |= traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb);
err = traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb);
/*
* If we've traversed the data blocks then we need to wait for those
@ -3370,6 +3370,12 @@ dump_block_stats(spa_t *spa)
}
}
/*
* Done after zio_wait() since zcb_haderrors is modified in
* zdb_blkptr_done()
*/
zcb.zcb_haderrors |= err;
if (zcb.zcb_haderrors) {
(void) printf("\nError counts:\n\n");
(void) printf("\t%5s %s\n", "errno", "count");
@ -3653,6 +3659,22 @@ dump_simulated_ddt(spa_t *spa)
dump_dedup_ratio(&dds_total);
}
static void
zdb_set_skip_mmp(char *target)
{
spa_t *spa;
/*
* Disable the activity check to allow examination of
* active pools.
*/
mutex_enter(&spa_namespace_lock);
if ((spa = spa_lookup(target)) != NULL) {
spa->spa_import_flags |= ZFS_IMPORT_SKIP_MMP;
}
mutex_exit(&spa_namespace_lock);
}
static void
dump_zpool(spa_t *spa)
{
@ -3889,13 +3911,6 @@ name:
return (NULL);
}
/* ARGSUSED */
static int
random_get_pseudo_bytes_cb(void *buf, size_t len, void *unused)
{
return (random_get_pseudo_bytes(buf, len));
}
/*
* Read a block from a pool and print it out. The syntax of the
* block descriptor is:
@ -4058,17 +4073,8 @@ zdb_read_block(char *thing, spa_t *spa)
* every decompress function at every inflated blocksize.
*/
enum zio_compress c;
void *pbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
void *lbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
abd_copy_to_buf(pbuf2, pabd, psize);
VERIFY0(abd_iterate_func(pabd, psize, SPA_MAXBLOCKSIZE - psize,
random_get_pseudo_bytes_cb, NULL));
VERIFY0(random_get_pseudo_bytes((uint8_t *)pbuf2 + psize,
SPA_MAXBLOCKSIZE - psize));
/*
* XXX - On the one hand, with SPA_MAXBLOCKSIZE at 16MB,
* this could take a while and we should let the user know
@ -4078,13 +4084,29 @@ zdb_read_block(char *thing, spa_t *spa)
for (lsize = psize + SPA_MINBLOCKSIZE;
lsize <= SPA_MAXBLOCKSIZE; lsize += SPA_MINBLOCKSIZE) {
for (c = 0; c < ZIO_COMPRESS_FUNCTIONS; c++) {
/*
* ZLE can easily decompress a non-ZLE stream,
* so provide an option to disable it.
*/
if (c == ZIO_COMPRESS_ZLE &&
getenv("ZDB_NO_ZLE"))
continue;
(void) fprintf(stderr,
"Trying %05llx -> %05llx (%s)\n",
(u_longlong_t)psize, (u_longlong_t)lsize,
zio_compress_table[c].ci_name);
/*
* We randomize lbuf2, and decompress to both
* lbuf and lbuf2. This way, we will know if
* decompression filled exactly to lsize.
*/
VERIFY0(random_get_pseudo_bytes(lbuf2, lsize));
if (zio_decompress_data(c, pabd,
lbuf, psize, lsize) == 0 &&
zio_decompress_data_buf(c, pbuf2,
zio_decompress_data(c, pabd,
lbuf2, psize, lsize) == 0 &&
bcmp(lbuf, lbuf2, lsize) == 0)
break;
@ -4092,11 +4114,9 @@ zdb_read_block(char *thing, spa_t *spa)
if (c != ZIO_COMPRESS_FUNCTIONS)
break;
}
umem_free(pbuf2, SPA_MAXBLOCKSIZE);
umem_free(lbuf2, SPA_MAXBLOCKSIZE);
if (lsize <= psize) {
if (lsize > SPA_MAXBLOCKSIZE) {
(void) printf("Decompress of %s failed\n", thing);
goto out;
}
@ -4135,11 +4155,12 @@ zdb_embedded_block(char *thing)
{
blkptr_t bp;
unsigned long long *words = (void *)&bp;
char buf[SPA_MAXBLOCKSIZE];
char *buf;
int err;
memset(&bp, 0, sizeof (blkptr_t));
buf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
bzero(&bp, sizeof (bp));
err = sscanf(thing, "%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:"
"%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx",
words + 0, words + 1, words + 2, words + 3,
@ -4157,6 +4178,7 @@ zdb_embedded_block(char *thing)
exit(1);
}
zdb_dump_block_raw(buf, BPE_GET_LSIZE(&bp), 0);
umem_free(buf, SPA_MAXBLOCKSIZE);
}
int
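The hunk above moves the SPA_MAXBLOCKSIZE (16 MB, per the earlier zdb comment) buffer out of zdb_embedded_block()'s stack frame and onto the heap; an allocation that large typically exceeds the default thread stack. A minimal standalone sketch of the same pattern, with malloc standing in for umem_alloc and BIG_BUF_SIZE invented for the example:
#include <stdio.h>
#include <stdlib.h>

#define	BIG_BUF_SIZE	(16 * 1024 * 1024)	/* stand-in for SPA_MAXBLOCKSIZE */

int
main(void)
{
	/* char buf[BIG_BUF_SIZE]; would likely overflow the default stack */
	char *buf = malloc(BIG_BUF_SIZE);

	if (buf == NULL)
		return (1);
	buf[0] = '\0';
	(void) printf("allocated %d bytes on the heap\n", BIG_BUF_SIZE);
	free(buf);
	return (0);
}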
@ -4171,7 +4193,7 @@ main(int argc, char **argv)
int error = 0;
char **searchdirs = NULL;
int nsearch = 0;
char *target;
char *target, *target_pool;
nvlist_t *policy = NULL;
uint64_t max_txg = UINT64_MAX;
int flags = ZFS_IMPORT_MISSING_LOG;
@ -4374,6 +4396,20 @@ main(int argc, char **argv)
error = 0;
target = argv[0];
if (strpbrk(target, "/@") != NULL) {
size_t targetlen;
target_pool = strdup(target);
*strpbrk(target_pool, "/@") = '\0';
target_is_spa = B_FALSE;
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
} else {
target_pool = target;
}
if (dump_opt['e']) {
importargs_t args = { 0 };
nvlist_t *cfg = NULL;
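The new target_pool logic in the hunk above derives the pool name from a dataset or snapshot target before attempting the import. A self-contained illustration of the same strdup()/strpbrk() pattern, using a made-up target name:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
	const char *target = "tank/fs@snap";	/* made-up dataset name */
	char *target_pool = strdup(target);

	if (target_pool == NULL)
		return (1);
	if (strpbrk(target_pool, "/@") != NULL)
		*strpbrk(target_pool, "/@") = '\0';

	(void) printf("pool component: %s\n", target_pool);	/* prints "tank" */
	free(target_pool);
	return (0);
}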
@ -4382,48 +4418,36 @@ main(int argc, char **argv)
args.path = searchdirs;
args.can_be_active = B_TRUE;
error = zpool_tryimport(g_zfs, target, &cfg, &args);
error = zpool_tryimport(g_zfs, target_pool, &cfg, &args);
if (error == 0) {
if (nvlist_add_nvlist(cfg,
ZPOOL_REWIND_POLICY, policy) != 0) {
fatal("can't open '%s': %s",
target, strerror(ENOMEM));
}
/*
* Disable the activity check to allow examination of
* active pools.
*/
if (dump_opt['C'] > 1) {
(void) printf("\nConfiguration for import:\n");
dump_nvlist(cfg, 8);
}
error = spa_import(target, cfg, NULL,
flags | ZFS_IMPORT_SKIP_MMP);
}
}
if (strpbrk(target, "/@") != NULL) {
size_t targetlen;
target_is_spa = B_FALSE;
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
}
if (error == 0) {
if (target_is_spa || dump_opt['R']) {
/*
* Disable the activity check to allow examination of
* active pools.
*/
mutex_enter(&spa_namespace_lock);
if ((spa = spa_lookup(target)) != NULL) {
spa->spa_import_flags |= ZFS_IMPORT_SKIP_MMP;
}
mutex_exit(&spa_namespace_lock);
error = spa_import(target_pool, cfg, NULL,
flags | ZFS_IMPORT_SKIP_MMP);
}
}
if (target_pool != target)
free(target_pool);
if (error == 0) {
if (target_is_spa || dump_opt['R']) {
zdb_set_skip_mmp(target);
error = spa_open_rewind(target, &spa, FTAG, policy,
NULL);
if (error) {
@ -4446,6 +4470,7 @@ main(int argc, char **argv)
}
}
} else {
zdb_set_skip_mmp(target);
error = open_objset(target, DMU_OST_ANY, FTAG, &os);
}
}


@ -69,7 +69,8 @@ dist_zedexec_SCRIPTS = \
zed.d/statechange-notify.sh \
zed.d/vdev_clear-led.sh \
zed.d/vdev_attach-led.sh \
zed.d/pool_import-led.sh
zed.d/pool_import-led.sh \
zed.d/resilver_finish-start-scrub.sh
zedconfdefaults = \
all-syslog.sh \
@ -80,7 +81,8 @@ zedconfdefaults = \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh
pool_import-led.sh \
resilver_finish-start-scrub.sh
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"


@ -10,6 +10,8 @@
: "${ZED_DEBUG_LOG:="${TMPDIR:="/tmp"}/zed.debug.log"}"
zed_exit_if_ignoring_this_event
lockfile="$(basename -- "${ZED_DEBUG_LOG}").lock"
umask 077


@ -5,6 +5,8 @@
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
zed_exit_if_ignoring_this_event
zed_log_msg "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
"${ZEVENT_POOL_GUID:+"pool_guid=${ZEVENT_POOL_GUID}"}" \
"${ZEVENT_VDEV_PATH:+"vdev_path=${ZEVENT_VDEV_PATH}"}" \


@ -0,0 +1,17 @@
#!/bin/sh
# resilver_finish-start-scrub.sh
# Run a scrub after a resilver
#
# Exit codes:
# 1: Internal error
# 2: Script wasn't enabled in zed.rc
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
[ "${ZED_SCRUB_AFTER_RESILVER}" = "1" ] || exit 2
[ -n "${ZEVENT_POOL}" ] || exit 1
[ -n "${ZEVENT_SUBCLASS}" ] || exit 1
zed_check_cmd "${ZPOOL}" || exit 1
zed_log_msg "Starting scrub after resilver on ${ZEVENT_POOL}"
"${ZPOOL}" scrub "${ZEVENT_POOL}"


@ -397,7 +397,7 @@ zed_rate_limit()
zed_lock "${lockfile}" "${lockfile_fd}"
time_now="$(date +%s)"
time_prev="$(egrep "^[0-9]+;${tag}\$" "${statefile}" 2>/dev/null \
time_prev="$(grep -E "^[0-9]+;${tag}\$" "${statefile}" 2>/dev/null \
| tail -1 | cut -d\; -f1)"
if [ -n "${time_prev}" ] \
@ -406,7 +406,7 @@ zed_rate_limit()
else
umask_bak="$(umask)"
umask 077
egrep -v "^[0-9]+;${tag}\$" "${statefile}" 2>/dev/null \
grep -E -v "^[0-9]+;${tag}\$" "${statefile}" 2>/dev/null \
> "${statefile}.$$"
echo "${time_now};${tag}" >> "${statefile}.$$"
mv -f "${statefile}.$$" "${statefile}"
@ -438,3 +438,23 @@ zed_guid_to_pool()
$ZPOOL get -H -ovalue,name guid | awk '$1=='"$guid"' {print $2}'
fi
}
# zed_exit_if_ignoring_this_event
#
# Exit the script if we should ignore this event, as determined by
# $ZED_SYSLOG_SUBCLASS_INCLUDE and $ZED_SYSLOG_SUBCLASS_EXCLUDE in zed.rc.
# This function assumes you've imported the normal zed variables.
zed_exit_if_ignoring_this_event()
{
if [ -n "${ZED_SYSLOG_SUBCLASS_INCLUDE}" ]; then
eval "case ${ZEVENT_SUBCLASS} in
${ZED_SYSLOG_SUBCLASS_INCLUDE});;
*) exit 0;;
esac"
elif [ -n "${ZED_SYSLOG_SUBCLASS_EXCLUDE}" ]; then
eval "case ${ZEVENT_SUBCLASS} in
${ZED_SYSLOG_SUBCLASS_EXCLUDE}) exit 0;;
*);;
esac"
fi
}


@ -86,6 +86,9 @@
#
ZED_USE_ENCLOSURE_LEDS=1
##
# Run a scrub after every resilver
#ZED_SCRUB_AFTER_RESILVER=1
##
# The syslog priority (e.g., specified as a "facility.level" pair).
@ -97,3 +100,14 @@ ZED_USE_ENCLOSURE_LEDS=1
#
#ZED_SYSLOG_TAG="zed"
##
# Which set of event subclasses to log
# By default, events from all subclasses are logged.
# If ZED_SYSLOG_SUBCLASS_INCLUDE is set, only subclasses
# matching the pattern are logged. Use the pipe symbol (|)
# or shell wildcards (*, ?) to match multiple subclasses.
# Otherwise, if ZED_SYSLOG_SUBCLASS_EXCLUDE is set, the
# matching subclasses are excluded from logging.
#ZED_SYSLOG_SUBCLASS_INCLUDE="checksum|scrub_*|vdev.*"
#ZED_SYSLOG_SUBCLASS_EXCLUDE="statechange|config_*|history_event"


@ -155,6 +155,8 @@ _zed_conf_display_help(const char *prog, int got_err)
"Run daemon in the foreground.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-M",
"Lock all pages in memory.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-P",
"$PATH for ZED to use (only used by ZTS).");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-Z",
"Zero state file.");
fprintf(fp, "\n");
@ -247,7 +249,7 @@ _zed_conf_parse_path(char **resultp, const char *path)
void
zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
{
const char * const opts = ":hLVc:d:p:s:vfFMZ";
const char * const opts = ":hLVc:d:p:P:s:vfFMZ";
int opt;
if (!zcp || !argv || !argv[0])
@ -275,6 +277,9 @@ zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
case 'p':
_zed_conf_parse_path(&zcp->pid_file, optarg);
break;
case 'P':
_zed_conf_parse_path(&zcp->path, optarg);
break;
case 's':
_zed_conf_parse_path(&zcp->state_file, optarg);
break;


@ -37,6 +37,7 @@ struct zed_conf {
int state_fd; /* fd to state file */
libzfs_handle_t *zfs_hdl; /* handle to libzfs */
int zevent_fd; /* fd for access to zevents */
char *path; /* custom $PATH for zedlets to use */
};
struct zed_conf *zed_conf_create(void);


@ -733,12 +733,14 @@ _zed_event_add_nvpair(uint64_t eid, zed_strings_t *zsp, nvpair_t *nvp)
/*
* Restrict various environment variables to safe and sane values
* when constructing the environment for the child process.
* when constructing the environment for the child process, unless
* we're running with a custom $PATH (like under the ZFS test suite).
*
* Reference: Secure Programming Cookbook by Viega & Messier, Section 1.1.
*/
static void
_zed_event_add_env_restrict(uint64_t eid, zed_strings_t *zsp)
_zed_event_add_env_restrict(uint64_t eid, zed_strings_t *zsp,
const char *path)
{
const char *env_restrict[][2] = {
{ "IFS", " \t\n" },
@ -753,11 +755,35 @@ _zed_event_add_env_restrict(uint64_t eid, zed_strings_t *zsp)
{ "ZFS_RELEASE", ZFS_META_RELEASE },
{ NULL, NULL }
};
/*
* If we have a custom $PATH, use the default ZFS binary locations
* instead of the hard-coded ones.
*/
const char *env_path[][2] = {
{ "IFS", " \t\n" },
{ "PATH", NULL }, /* $PATH copied in later on */
{ "ZDB", "zdb" },
{ "ZED", "zed" },
{ "ZFS", "zfs" },
{ "ZINJECT", "zinject" },
{ "ZPOOL", "zpool" },
{ "ZFS_ALIAS", ZFS_META_ALIAS },
{ "ZFS_VERSION", ZFS_META_VERSION },
{ "ZFS_RELEASE", ZFS_META_RELEASE },
{ NULL, NULL }
};
const char *(*pa)[2];
assert(zsp != NULL);
for (pa = env_restrict; *(*pa); pa++) {
pa = path != NULL ? env_path : env_restrict;
for (; *(*pa); pa++) {
/* Use our custom $PATH if we have one */
if (path != NULL && strcmp((*pa)[0], "PATH") == 0)
(*pa)[1] = path;
_zed_event_add_var(eid, zsp, NULL, (*pa)[0], "%s", (*pa)[1]);
}
}
@ -902,7 +928,7 @@ zed_event_service(struct zed_conf *zcp)
while ((nvp = nvlist_next_nvpair(nvl, nvp)))
_zed_event_add_nvpair(eid, zsp, nvp);
_zed_event_add_env_restrict(eid, zsp);
_zed_event_add_env_restrict(eid, zsp, zcp->path);
_zed_event_add_env_preserve(eid, zsp);
_zed_event_add_var(eid, zsp, ZED_VAR_PREFIX, "PID",


@ -6072,7 +6072,7 @@ share_mount_one(zfs_handle_t *zhp, int op, int flags, char *protocol,
(void) fprintf(stderr, gettext("cannot %s '%s': "
"Contains partially-completed state from "
"\"zfs receive -r\", which can be resumed with "
"\"zfs receive -s\", which can be resumed with "
"\"zfs send -t\"\n"),
cmdname, zfs_get_name(zhp));
return (1);
@ -7041,6 +7041,7 @@ main(int argc, char **argv)
int ret = 0;
int i = 0;
char *cmdname;
char **newargv;
(void) setlocale(LC_ALL, "");
(void) textdomain(TEXT_DOMAIN);
@ -7095,17 +7096,26 @@ main(int argc, char **argv)
libzfs_print_on_error(g_zfs, B_TRUE);
/*
* Many commands modify input strings for string parsing reasons.
* We create a copy to protect the original argv.
*/
newargv = malloc((argc + 1) * sizeof (newargv[0]));
for (i = 0; i < argc; i++)
newargv[i] = strdup(argv[i]);
newargv[argc] = NULL;
/*
* Run the appropriate command.
*/
libzfs_mnttab_cache(g_zfs, B_TRUE);
if (find_command_idx(cmdname, &i) == 0) {
current_command = &command_table[i];
ret = command_table[i].func(argc - 1, argv + 1);
ret = command_table[i].func(argc - 1, newargv + 1);
} else if (strchr(cmdname, '=') != NULL) {
verify(find_command_idx("set", &i) == 0);
current_command = &command_table[i];
ret = command_table[i].func(argc, argv);
ret = command_table[i].func(argc, newargv);
} else {
(void) fprintf(stderr, gettext("unrecognized "
"command '%s'\n"), cmdname);
@ -7113,6 +7123,10 @@ main(int argc, char **argv)
ret = 1;
}
for (i = 0; i < argc; i++)
free(newargv[i]);
free(newargv);
if (ret == 0 && log_history)
(void) zpool_log_history(g_zfs, history_str);


@ -268,7 +268,7 @@ zhack_feature_enable_sync(void *arg, dmu_tx_t *tx)
static void
zhack_do_feature_enable(int argc, char **argv)
{
char c;
int c;
char *desc, *target;
spa_t *spa;
objset_t *mos;
@ -363,7 +363,7 @@ feature_decr_sync(void *arg, dmu_tx_t *tx)
static void
zhack_do_feature_ref(int argc, char **argv)
{
char c;
int c;
char *target;
boolean_t decr = B_FALSE;
spa_t *spa;
@ -483,7 +483,7 @@ main(int argc, char **argv)
char *path[MAX_NUM_PATHS];
const char *subcommand;
int rv = 0;
char c;
int c;
g_importargs.path = path;
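These three zhack hunks change the getopt() result variable from char to int. A standalone sketch (not zhack itself) of why that matters: getopt() signals end-of-options with the int value -1, which a plain char that happens to be unsigned can never compare equal to, so the loop may never terminate.
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int c;	/* with "char c", the != -1 test can fail forever where char is unsigned */

	while ((c = getopt(argc, argv, "c:d:")) != -1) {
		switch (c) {
		case 'c':
			(void) printf("config: %s\n", optarg);
			break;
		case 'd':
			(void) printf("device: %s\n", optarg);
			break;
		default:
			return (1);
		}
	}
	return (0);
}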


@ -525,10 +525,11 @@ run_one(cmd_args_t *args, uint32_t id, uint32_t T, uint32_t N,
memset(cmd, 0, cmd_size);
cmd->cmd_magic = ZPIOS_CMD_MAGIC;
strncpy(cmd->cmd_pool, args->pool, ZPIOS_NAME_SIZE - 1);
strncpy(cmd->cmd_pre, args->pre, ZPIOS_PATH_SIZE - 1);
strncpy(cmd->cmd_post, args->post, ZPIOS_PATH_SIZE - 1);
strncpy(cmd->cmd_log, args->log, ZPIOS_PATH_SIZE - 1);
snprintf(cmd->cmd_pool, sizeof (cmd->cmd_pool), "%s", args->pool);
snprintf(cmd->cmd_pre, sizeof (cmd->cmd_pre), "%s", args->pre);
snprintf(cmd->cmd_post, sizeof (cmd->cmd_post), "%s", args->post);
snprintf(cmd->cmd_log, sizeof (cmd->cmd_log), "%s", args->log);
cmd->cmd_id = id;
cmd->cmd_chunk_size = C;
cmd->cmd_thread_count = T;
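The strncpy()-to-snprintf() change above guarantees NUL termination when the source string is at least as long as the destination buffer. A small, self-contained comparison; the buffer names and sizes here are made up for the example:
#include <stdio.h>
#include <string.h>

int
main(void)
{
	char a[4], b[4];
	const char *src = "abcdef";		/* longer than either buffer */

	(void) strncpy(a, src, sizeof (a));	/* copies "abcd", leaves no NUL */
	(void) snprintf(b, sizeof (b), "%s", src); /* copies "abc" plus the NUL */

	(void) printf("%.*s / %s\n", (int)sizeof (a), a, b);
	return (0);
}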


@ -60,9 +60,15 @@ dist_zpoolexec_SCRIPTS = \
zpool.d/pend_sec \
zpool.d/off_ucor \
zpool.d/ata_err \
zpool.d/nvme_err \
zpool.d/pwr_cyc \
zpool.d/upath \
zpool.d/vendor
zpool.d/vendor \
zpool.d/smart_test \
zpool.d/test_type \
zpool.d/test_status \
zpool.d/test_progress \
zpool.d/test_ended
zpoolconfdefaults = \
enc \
@ -98,9 +104,15 @@ zpoolconfdefaults = \
pend_sec \
off_ucor \
ata_err \
nvme_err \
pwr_cyc \
upath \
vendor
vendor \
smart_test \
test_type \
test_status \
test_progress \
test_ended
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zpoolconfdir)"

cmd/zpool/zpool.d/nvme_err Symbolic link (1 line)

@ -0,0 +1 @@
smart


@ -23,8 +23,45 @@ off_ucor: Show SMART offline uncorrectable errors (ATA).
ata_err: Show SMART ATA errors (ATA).
pwr_cyc: Show SMART power cycle count (ATA).
serial: Show disk serial number.
nvme_err: Show SMART NVMe errors (NVMe).
smart_test: Show SMART self-test results summary.
test_type: Show SMART self-test type (short, long... ).
test_status: Show SMART self-test status.
test_progress: Show SMART self-test percentage done.
test_ended: Show when the last SMART self-test ended (if supported).
"
# Hack for developer testing
#
# If you set $samples to a directory containing smartctl output text files,
# we will use them instead of running smartctl on the vdevs. This can be
# useful if you want to test a bunch of different smartctl outputs. Also, if
# $samples is set, an additional 'file' column is added to the zpool output
# showing the filename.
samples=
# get_filename_from_dir DIR
#
# Look in directory DIR and return a filename from it. The filename returned
# is chosen quasi-sequentially (based off our PID). This allows us to return
# a different filename every time this script is invoked (which we do for each
# vdev), without having to maintain state.
get_filename_from_dir()
{
dir=$1
pid="$$"
num_files=$(find "$dir" -maxdepth 1 -type f | wc -l)
mod=$((pid % num_files))
i=0
find "$dir" -type f -printf "%f\n" | while read -r file ; do
if [ "$mod" = "$i" ] ; then
echo "$file"
break
fi
i=$((i+1))
done
}
script=$(basename "$0")
if [ "$1" = "-h" ] ; then
@ -34,10 +71,18 @@ fi
smartctl_path=$(which smartctl)
if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ]; then
raw_out=$(eval "sudo $smartctl_path -a $VDEV_UPATH")
if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ] || [ -n "$samples" ] ; then
if [ -n "$samples" ] ; then
# cat a smartctl output text file instead of running smartctl
# on a vdev (only used for developer testing).
file=$(get_filename_from_dir $samples)
echo "file=$file"
raw_out=$(cat "$samples/$file")
else
raw_out=$(eval "sudo $smartctl_path -a $VDEV_UPATH")
fi
# Are we a SAS or ATA drive? Look for the right line in smartctl:
# What kind of drive are we? Look for the right line in smartctl:
#
# SAS:
# Transport protocol: SAS
@ -45,7 +90,9 @@ if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ]; then
# SATA:
# ATA Version is: 8
#
type=$(echo "$raw_out" | grep -m 1 -Eo '^ATA|SAS$')
# NVMe:
# SMART/Health Information (NVMe Log 0xnn, NSID 0xnn)
#
out=$(echo "$raw_out" | awk '
# SAS specific
/read:/{print "rrd="$4"\nr_cor="$5"\nr_proc="$7"\nr_ucor="$8}
@ -54,10 +101,11 @@ if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ]; then
/Elements in grown defect list/{print "defect="$6}
# SAS common
/SAS/{type="sas"}
/Drive Temperature:/{print "temp="$4}
# Status can be a long string; replace spaces with '_'
/SMART Health Status:/{printf "health="; for(i=4;i<=NF-1;i++){printf "%s_", $i}; printf "%s\n", $i}
/number of hours powered up/{print "hours_on="$7}
/number of hours powered up/{print "hours_on="$7; hours_on=int($7)}
/Serial number:/{print "serial="$3}
# SATA specific
@ -70,40 +118,111 @@ if [ -b "$VDEV_UPATH" ] && [ -x "$smartctl_path" ]; then
/Power_Cycle_Count/{print "pwr_cyc="$10}
# SATA common
/SATA/{type="sata"}
/Temperature_Celsius/{print "temp="$10}
/Airflow_Temperature_Cel/{print "temp="$10}
/Current Temperature:/{print "temp="$3}
/SMART overall-health self-assessment test result:/{print "health="$6}
/Power_On_Hours/{print "hours_on="$10}
/Power_On_Hours/{print "hours_on="$10; hours_on=int($10)}
/Serial Number:/{print "serial="$3}
END {ORS="\n"; print ""}
# NVMe common
/NVMe/{type="nvme"}
/Temperature:/{print "temp="$2}
/SMART overall-health self-assessment test result:/{print "health="$6}
/Power On Hours:/{gsub("[^0-9]","",$4); print "hours_on="$4}
/Serial Number:/{print "serial="$3}
/Power Cycles:/{print "pwr_cyc="$3}
# NVMe specific
/Media and Data Integrity Errors:/{print "nvme_err="$6}
# SMART self-test info
/Self-test execution status:/{progress=tolower($4)} # SAS
/SMART Self-test log/{test_seen=1} # SAS
/SMART Extended Self-test Log/{test_seen=1} # SATA
/# 1/{
test_type=tolower($3"_"$4);
# Status could be one word ("Completed") or multiple ("Completed: read
# failure"). Look for the ":" to see if we need to grab more words.
if ($5 ~ ":")
status=tolower($5""$6"_"$7)
else
status=tolower($5)
if (status=="self")
status="running";
if (type == "sas") {
hours=int($(NF-4))
} else {
hours=int($(NF-1))
# SATA reports percent remaining, rather than percent done
# Convert it to percent done.
progress=(100-int($(NF-2)))"%"
}
# When we int()-ify "hours", it converts stuff like "NOW" and "-" into
# 0. In those cases, set it to hours_on, so they will cancel out in
# the "hours_ago" calculation later on.
if (hours == 0)
hours=hours_on
if (test_seen) {
print "test="hours_on
print "test_type="test_type
print "test_status="status
print "test_progress="progress
}
# Not all drives report hours_on
if (hours_on && hours) {
total_hours_ago=(hours_on-hours)
days_ago=int(total_hours_ago/24)
hours_ago=(total_hours_ago % 24)
if (days_ago != 0)
ago_str=days_ago"d"
if (hours_ago !=0)
ago_str=ago_str""hours_ago"h"
print "test_ended="ago_str
}
}
END {print "type="type; ORS="\n"; print ""}
');
fi
type=$(echo "$out" | grep '^type=' | cut -d '=' -f 2)
# if type is not set by now, either we don't have a block device
# or smartctl failed. Either way, default to ATA and set out to
# nothing
# If type is not set by now, either we don't have a block device
# or smartctl failed. Either way, default to ATA and set $out to
# nothing.
if [ -z "$type" ]; then
type="ATA"
type="sata"
out=
fi
case $script in
smart)
# Print temperature plus common predictors of drive failure
if [ "$type" = "SAS" ] ; then
if [ "$type" = "sas" ] ; then
scripts="temp|health|r_ucor|w_ucor"
elif [ "$type" = "ATA" ] ; then
elif [ "$type" = "sata" ] ; then
scripts="temp|health|ata_err|realloc|rep_ucor|cmd_to|pend_sec|off_ucor"
elif [ "$type" = "nvme" ] ; then
scripts="temp|health|nvme_err"
fi
;;
smartx)
# Print some other interesting stats
if [ "$type" = "SAS" ] ; then
if [ "$type" = "sas" ] ; then
scripts="hours_on|defect|nonmed|r_proc|w_proc"
elif [ "$type" = "ATA" ] ; then
elif [ "$type" = "sata" ] ; then
scripts="hours_on|pwr_cyc"
elif [ "$type" = "nvme" ] ; then
scripts="hours_on|pwr_cyc"
fi
;;
smart_test)
scripts="test_type|test_status|test_progress|test_ended"
;;
*)
scripts="$script"
esac


@ -0,0 +1 @@
smart


@ -0,0 +1 @@
smart


@ -0,0 +1 @@
smart


@ -0,0 +1 @@
smart

cmd/zpool/zpool.d/test_type Symbolic link (1 line)

@ -0,0 +1 @@
smart


@ -2211,7 +2211,8 @@ show_import(nvlist_t *config)
(void) printf(gettext(" config:\n\n"));
cb.cb_namewidth = max_width(NULL, nvroot, 0, 0, VDEV_NAME_TYPE_ID);
cb.cb_namewidth = max_width(NULL, nvroot, 0, strlen(name),
VDEV_NAME_TYPE_ID);
if (cb.cb_namewidth < 10)
cb.cb_namewidth = 10;
@ -3492,7 +3493,7 @@ single_histo_average(uint64_t *histo, unsigned int buckets)
static void
print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
nvlist_t *newnv, double scale)
nvlist_t *newnv)
{
int i;
uint64_t val;
@ -3522,7 +3523,7 @@ print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
format = ZFS_NICENUM_1024;
for (i = 0; i < ARRAY_SIZE(names); i++) {
val = nva[i].data[0] * scale;
val = nva[i].data[0];
print_one_stat(val, format, column_width, cb->cb_scripted);
}
@ -3531,7 +3532,7 @@ print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
static void
print_iostat_latency(iostat_cbdata_t *cb, nvlist_t *oldnv,
nvlist_t *newnv, double scale)
nvlist_t *newnv)
{
int i;
uint64_t val;
@ -3561,7 +3562,7 @@ print_iostat_latency(iostat_cbdata_t *cb, nvlist_t *oldnv,
/* Print our avg latencies on the line */
for (i = 0; i < ARRAY_SIZE(names); i++) {
/* Compute average latency for a latency histo */
val = single_histo_average(nva[i].data, nva[i].count) * scale;
val = single_histo_average(nva[i].data, nva[i].count);
print_one_stat(val, format, column_width, cb->cb_scripted);
}
free_calc_stats(nva, ARRAY_SIZE(names));
@ -3700,9 +3701,9 @@ print_vdev_stats(zpool_handle_t *zhp, const char *name, nvlist_t *oldnv,
print_iostat_default(calcvs, cb, scale);
}
if (cb->cb_flags & IOS_LATENCY_M)
print_iostat_latency(cb, oldnv, newnv, scale);
print_iostat_latency(cb, oldnv, newnv);
if (cb->cb_flags & IOS_QUEUES_M)
print_iostat_queues(cb, oldnv, newnv, scale);
print_iostat_queues(cb, oldnv, newnv);
if (cb->cb_flags & IOS_ANYHISTO_M) {
printf("\n");
print_iostat_histos(cb, oldnv, newnv, scale, name);
@ -3901,7 +3902,7 @@ get_namewidth(zpool_handle_t *zhp, void *data)
&nvroot) == 0);
unsigned int poolname_len = strlen(zpool_get_name(zhp));
if (!cb->cb_verbose)
cb->cb_namewidth = poolname_len;
cb->cb_namewidth = MAX(poolname_len, cb->cb_namewidth);
else
cb->cb_namewidth = MAX(poolname_len,
max_width(zhp, nvroot, 0, cb->cb_namewidth,
@ -6225,7 +6226,8 @@ status_callback(zpool_handle_t *zhp, void *data)
&nvroot) == 0);
verify(nvlist_lookup_uint64_array(nvroot, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
health = zpool_state_to_name(vs->vs_state, vs->vs_aux);
health = zpool_get_state_str(zhp);
(void) printf(gettext(" pool: %s\n"), zpool_get_name(zhp));
(void) printf(gettext(" state: %s\n"), health);
@ -6394,6 +6396,15 @@ status_callback(zpool_handle_t *zhp, void *data)
"to be recovered.\n"));
break;
case ZPOOL_STATUS_IO_FAILURE_MMP:
(void) printf(gettext("status: The pool is suspended because "
"multihost writes failed or were delayed;\n\tanother "
"system could import the pool undetected.\n"));
(void) printf(gettext("action: Make sure the pool's devices "
"are connected, then reboot your system and\n\timport the "
"pool.\n"));
break;
case ZPOOL_STATUS_IO_FAILURE_WAIT:
case ZPOOL_STATUS_IO_FAILURE_CONTINUE:
(void) printf(gettext("status: One or more devices are "
@ -7960,6 +7971,7 @@ main(int argc, char **argv)
int ret = 0;
int i = 0;
char *cmdname;
char **newargv;
(void) setlocale(LC_ALL, "");
(void) textdomain(TEXT_DOMAIN);
@ -7994,16 +8006,25 @@ main(int argc, char **argv)
zfs_save_arguments(argc, argv, history_str, sizeof (history_str));
/*
* Many commands modify input strings for string parsing reasons.
* We create a copy to protect the original argv.
*/
newargv = malloc((argc + 1) * sizeof (newargv[0]));
for (i = 0; i < argc; i++)
newargv[i] = strdup(argv[i]);
newargv[argc] = NULL;
/*
* Run the appropriate command.
*/
if (find_command_idx(cmdname, &i) == 0) {
current_command = &command_table[i];
ret = command_table[i].func(argc - 1, argv + 1);
ret = command_table[i].func(argc - 1, newargv + 1);
} else if (strchr(cmdname, '=')) {
verify(find_command_idx("set", &i) == 0);
current_command = &command_table[i];
ret = command_table[i].func(argc, argv);
ret = command_table[i].func(argc, newargv);
} else if (strcmp(cmdname, "freeze") == 0 && argc == 3) {
/*
* 'freeze' is a vile debugging abomination, so we treat
@ -8020,6 +8041,10 @@ main(int argc, char **argv)
ret = 1;
}
for (i = 0; i < argc; i++)
free(newargv[i]);
free(newargv);
if (ret == 0 && log_history)
(void) zpool_log_history(g_zfs, history_str);


@ -191,6 +191,7 @@ static vdev_disk_db_entry_t vdev_disk_database[] = {
{"ATA INTEL SSDSC2BP24", 4096},
{"ATA INTEL SSDSC2BP48", 4096},
{"NA SmrtStorSDLKAE9W", 4096},
{"NVMe Amazon EC2 NVMe ", 4096},
/* Imported from Open Solaris */
{"ATA MARVELL SD88SA02", 4096},
/* Advanced format Hard drives */
@ -800,8 +801,11 @@ get_replication(nvlist_t *nvroot, boolean_t fatal)
if (is_log)
continue;
verify(nvlist_lookup_string(nv, ZPOOL_CONFIG_TYPE,
&type) == 0);
/* Ignore holes introduced by removing aux devices */
verify(nvlist_lookup_string(nv, ZPOOL_CONFIG_TYPE, &type) == 0);
if (strcmp(type, VDEV_TYPE_HOLE) == 0)
continue;
if (nvlist_lookup_nvlist_array(nv, ZPOOL_CONFIG_CHILDREN,
&child, &children) != 0) {
/*
@ -857,9 +861,11 @@ get_replication(nvlist_t *nvroot, boolean_t fatal)
/*
* If this is a replacing or spare vdev, then
* get the real first child of the vdev.
* get the real first child of the vdev: do this
* in a loop because replacing and spare vdevs
* can be nested.
*/
if (strcmp(childtype,
while (strcmp(childtype,
VDEV_TYPE_REPLACING) == 0 ||
strcmp(childtype, VDEV_TYPE_SPARE) == 0) {
nvlist_t **rchild;


@ -171,8 +171,8 @@ typedef struct ztest_shared_opts {
} ztest_shared_opts_t;
static const ztest_shared_opts_t ztest_opts_defaults = {
.zo_pool = { 'z', 't', 'e', 's', 't', '\0' },
.zo_dir = { '/', 't', 'm', 'p', '\0' },
.zo_pool = "ztest",
.zo_dir = "/tmp",
.zo_alt_ztest = { '\0' },
.zo_alt_libpath = { '\0' },
.zo_vdevs = 5,
@ -197,7 +197,8 @@ extern uint64_t metaslab_gang_bang;
extern uint64_t metaslab_df_alloc_threshold;
extern int metaslab_preload_limit;
extern boolean_t zfs_compressed_arc_enabled;
extern int zfs_abd_scatter_enabled;
extern int zfs_abd_scatter_enabled;
extern int dmu_object_alloc_chunk_shift;
static ztest_shared_opts_t *ztest_shared_opts;
static ztest_shared_opts_t ztest_opts;
@ -310,6 +311,7 @@ static ztest_shared_callstate_t *ztest_shared_callstate;
ztest_func_t ztest_dmu_read_write;
ztest_func_t ztest_dmu_write_parallel;
ztest_func_t ztest_dmu_object_alloc_free;
ztest_func_t ztest_dmu_object_next_chunk;
ztest_func_t ztest_dmu_commit_callbacks;
ztest_func_t ztest_zap;
ztest_func_t ztest_zap_parallel;
@ -357,6 +359,7 @@ ztest_info_t ztest_info[] = {
ZTI_INIT(ztest_dmu_read_write, 1, &zopt_always),
ZTI_INIT(ztest_dmu_write_parallel, 10, &zopt_always),
ZTI_INIT(ztest_dmu_object_alloc_free, 1, &zopt_always),
ZTI_INIT(ztest_dmu_object_next_chunk, 1, &zopt_sometimes),
ZTI_INIT(ztest_dmu_commit_callbacks, 1, &zopt_always),
ZTI_INIT(ztest_zap, 30, &zopt_always),
ZTI_INIT(ztest_zap_parallel, 100, &zopt_always),
@ -1186,7 +1189,7 @@ ztest_spa_prop_set_uint64(zpool_prop_t prop, uint64_t value)
*/
typedef struct {
list_node_t z_lnode;
refcount_t z_refcnt;
zfs_refcount_t z_refcnt;
uint64_t z_object;
zfs_rlock_t z_range_lock;
} ztest_znode_t;
@ -1202,7 +1205,7 @@ ztest_znode_init(uint64_t object)
ztest_znode_t *zp = umem_alloc(sizeof (*zp), UMEM_NOFAIL);
list_link_init(&zp->z_lnode);
refcount_create(&zp->z_refcnt);
zfs_refcount_create(&zp->z_refcnt);
zp->z_object = object;
zfs_rlock_init(&zp->z_range_lock);
@ -1212,10 +1215,10 @@ ztest_znode_init(uint64_t object)
static void
ztest_znode_fini(ztest_znode_t *zp)
{
ASSERT(refcount_is_zero(&zp->z_refcnt));
ASSERT(zfs_refcount_is_zero(&zp->z_refcnt));
zfs_rlock_destroy(&zp->z_range_lock);
zp->z_object = 0;
refcount_destroy(&zp->z_refcnt);
zfs_refcount_destroy(&zp->z_refcnt);
list_link_init(&zp->z_lnode);
umem_free(zp, sizeof (*zp));
}
@ -1245,13 +1248,13 @@ ztest_znode_get(ztest_ds_t *zd, uint64_t object)
for (zp = list_head(&zll->z_list); (zp);
zp = list_next(&zll->z_list, zp)) {
if (zp->z_object == object) {
refcount_add(&zp->z_refcnt, RL_TAG);
zfs_refcount_add(&zp->z_refcnt, RL_TAG);
break;
}
}
if (zp == NULL) {
zp = ztest_znode_init(object);
refcount_add(&zp->z_refcnt, RL_TAG);
zfs_refcount_add(&zp->z_refcnt, RL_TAG);
list_insert_head(&zll->z_list, zp);
}
mutex_exit(&zll->z_lock);
@ -1265,8 +1268,8 @@ ztest_znode_put(ztest_ds_t *zd, ztest_znode_t *zp)
ASSERT3U(zp->z_object, !=, 0);
zll = &zd->zd_range_lock[zp->z_object & (ZTEST_OBJECT_LOCKS - 1)];
mutex_enter(&zll->z_lock);
refcount_remove(&zp->z_refcnt, RL_TAG);
if (refcount_is_zero(&zp->z_refcnt)) {
zfs_refcount_remove(&zp->z_refcnt, RL_TAG);
if (zfs_refcount_is_zero(&zp->z_refcnt)) {
list_remove(&zll->z_list, zp);
ztest_znode_fini(zp);
}
@ -3927,6 +3930,26 @@ ztest_dmu_object_alloc_free(ztest_ds_t *zd, uint64_t id)
umem_free(od, size);
}
/*
* Rewind the global allocator to verify object allocation backfilling.
*/
void
ztest_dmu_object_next_chunk(ztest_ds_t *zd, uint64_t id)
{
objset_t *os = zd->zd_os;
int dnodes_per_chunk = 1 << dmu_object_alloc_chunk_shift;
uint64_t object;
/*
* Rewind the global allocator randomly back to a lower object number
* to force backfilling and reclamation of recently freed dnodes.
*/
mutex_enter(&os->os_obj_lock);
object = ztest_random(os->os_obj_next_chunk);
os->os_obj_next_chunk = P2ALIGN(object, dnodes_per_chunk);
mutex_exit(&os->os_obj_lock);
}
#undef OD_ARRAY_SIZE
#define OD_ARRAY_SIZE 2
@ -5665,6 +5688,9 @@ ztest_reguid(ztest_ds_t *zd, uint64_t id)
uint64_t orig, load;
int error;
if (ztest_opts.zo_mmp_test)
return;
orig = spa_guid(spa);
load = spa_load_guid(spa);


@ -55,11 +55,12 @@ main(int argc, char **argv)
{
int fd, error = 0;
char zvol_name[ZFS_MAX_DATASET_NAME_LEN];
char zvol_name_part[ZFS_MAX_DATASET_NAME_LEN];
char *zvol_name_part = NULL;
char *dev_name;
struct stat64 statbuf;
int dev_minor, dev_part;
int i;
int rc;
if (argc < 2) {
printf("Usage: %s /dev/zvol_device_node\n", argv[0]);
@ -88,11 +89,13 @@ main(int argc, char **argv)
return (errno);
}
if (dev_part > 0)
snprintf(zvol_name_part, ZFS_MAX_DATASET_NAME_LEN,
"%s-part%d", zvol_name, dev_part);
rc = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
dev_part);
else
snprintf(zvol_name_part, ZFS_MAX_DATASET_NAME_LEN,
"%s", zvol_name);
rc = asprintf(&zvol_name_part, "%s", zvol_name);
if (rc == -1 || zvol_name_part == NULL)
goto error;
for (i = 0; i < strlen(zvol_name_part); i++) {
if (isblank(zvol_name_part[i]))
@ -100,6 +103,8 @@ main(int argc, char **argv)
}
printf("%s\n", zvol_name_part);
free(zvol_name_part);
error:
close(fd);
return (error);
}


@ -6,6 +6,7 @@ AM_CFLAGS += ${NO_UNUSED_BUT_SET_VARIABLE}
AM_CFLAGS += ${NO_BOOL_COMPARE}
AM_CFLAGS += -fno-strict-aliasing
AM_CFLAGS += -std=gnu99
AM_CFLAGS += $(CODE_COVERAGE_CFLAGS)
AM_CPPFLAGS = -D_GNU_SOURCE -D__EXTENSIONS__ -D_REENTRANT
AM_CPPFLAGS += -D_POSIX_PTHREAD_SEMANTICS -D_FILE_OFFSET_BITS=64
AM_CPPFLAGS += -D_LARGEFILE64_SOURCE -DHAVE_LARGE_STACKS=1
@ -14,3 +15,4 @@ AM_CPPFLAGS += -DLIBEXECDIR=\"$(libexecdir)\"
AM_CPPFLAGS += -DRUNSTATEDIR=\"$(runstatedir)\"
AM_CPPFLAGS += -DSBINDIR=\"$(sbindir)\"
AM_CPPFLAGS += -DSYSCONFDIR=\"$(sysconfdir)\"
AM_CPPFLAGS += $(CODE_COVERAGE_CPPFLAGS)

config/ax_code_coverage.m4 Normal file (264 lines)

@ -0,0 +1,264 @@
# ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_code_coverage.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_CODE_COVERAGE()
#
# DESCRIPTION
#
# Defines CODE_COVERAGE_CPPFLAGS, CODE_COVERAGE_CFLAGS,
# CODE_COVERAGE_CXXFLAGS and CODE_COVERAGE_LIBS which should be included
# in the CPPFLAGS, CFLAGS CXXFLAGS and LIBS/LIBADD variables of every
# build target (program or library) which should be built with code
# coverage support. Also defines CODE_COVERAGE_RULES which should be
# substituted in your Makefile; and $enable_code_coverage which can be
# used in subsequent configure output. CODE_COVERAGE_ENABLED is defined
# and substituted, and corresponds to the value of the
# --enable-code-coverage option, which defaults to being disabled.
#
# Test also for gcov program and create GCOV variable that could be
# substituted.
#
# Note that all optimization flags in CFLAGS must be disabled when code
# coverage is enabled.
#
# Usage example:
#
# configure.ac:
#
# AX_CODE_COVERAGE
#
# Makefile.am:
#
# @CODE_COVERAGE_RULES@
# my_program_LIBS = ... $(CODE_COVERAGE_LIBS) ...
# my_program_CPPFLAGS = ... $(CODE_COVERAGE_CPPFLAGS) ...
# my_program_CFLAGS = ... $(CODE_COVERAGE_CFLAGS) ...
# my_program_CXXFLAGS = ... $(CODE_COVERAGE_CXXFLAGS) ...
#
# This results in a "check-code-coverage" rule being added to any
# Makefile.am which includes "@CODE_COVERAGE_RULES@" (assuming the module
# has been configured with --enable-code-coverage). Running `make
# check-code-coverage` in that directory will run the module's test suite
# (`make check`) and build a code coverage report detailing the code which
# was touched, then print the URI for the report.
#
# In earlier versions of this macro, CODE_COVERAGE_LDFLAGS was defined
# instead of CODE_COVERAGE_LIBS. They are both still defined, but use of
# CODE_COVERAGE_LIBS is preferred for clarity; CODE_COVERAGE_LDFLAGS is
# deprecated. They have the same value.
#
# This code was derived from Makefile.decl in GLib, originally licenced
# under LGPLv2.1+.
#
# LICENSE
#
# Copyright (c) 2012, 2016 Philip Withnall
# Copyright (c) 2012 Xan Lopez
# Copyright (c) 2012 Christian Persch
# Copyright (c) 2012 Paolo Borelli
# Copyright (c) 2012 Dan Winship
# Copyright (c) 2015 Bastien ROUCARIES
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or (at
# your option) any later version.
#
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser
# General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#serial 25
AC_DEFUN([AX_CODE_COVERAGE],[
dnl Check for --enable-code-coverage
AC_REQUIRE([AC_PROG_SED])
# allow to override gcov location
AC_ARG_WITH([gcov],
[AS_HELP_STRING([--with-gcov[=GCOV]], [use given GCOV for coverage (GCOV=gcov).])],
[_AX_CODE_COVERAGE_GCOV_PROG_WITH=$with_gcov],
[_AX_CODE_COVERAGE_GCOV_PROG_WITH=gcov])
AC_MSG_CHECKING([whether to build with code coverage support])
AC_ARG_ENABLE([code-coverage],
AS_HELP_STRING([--enable-code-coverage],
[Whether to enable code coverage support]),,
enable_code_coverage=no)
AM_CONDITIONAL([CODE_COVERAGE_ENABLED], [test x$enable_code_coverage = xyes])
AC_SUBST([CODE_COVERAGE_ENABLED], [$enable_code_coverage])
AC_MSG_RESULT($enable_code_coverage)
AS_IF([ test "$enable_code_coverage" = "yes" ], [
# check for gcov
AC_CHECK_TOOL([GCOV],
[$_AX_CODE_COVERAGE_GCOV_PROG_WITH],
[:])
AS_IF([test "X$GCOV" = "X:"],
[AC_MSG_ERROR([gcov is needed to do coverage])])
AC_SUBST([GCOV])
dnl Check if gcc is being used
AS_IF([ test "$GCC" = "no" ], [
AC_MSG_ERROR([not compiling with gcc, which is required for gcov code coverage])
])
AC_CHECK_PROG([LCOV], [lcov], [lcov])
AC_CHECK_PROG([GENHTML], [genhtml], [genhtml])
AS_IF([ test -z "$LCOV" ], [
AC_MSG_ERROR([To enable code coverage reporting you must have lcov installed])
])
AS_IF([ test -z "$GENHTML" ], [
AC_MSG_ERROR([Could not find genhtml from the lcov package])
])
dnl Build the code coverage flags
dnl Define CODE_COVERAGE_LDFLAGS for backwards compatibility
CODE_COVERAGE_CPPFLAGS=""
CODE_COVERAGE_CFLAGS="-O0 -g -fprofile-arcs -ftest-coverage"
CODE_COVERAGE_CXXFLAGS="-O0 -g -fprofile-arcs -ftest-coverage"
CODE_COVERAGE_LIBS="-lgcov"
CODE_COVERAGE_LDFLAGS="$CODE_COVERAGE_LIBS"
AC_SUBST([CODE_COVERAGE_CPPFLAGS])
AC_SUBST([CODE_COVERAGE_CFLAGS])
AC_SUBST([CODE_COVERAGE_CXXFLAGS])
AC_SUBST([CODE_COVERAGE_LIBS])
AC_SUBST([CODE_COVERAGE_LDFLAGS])
[CODE_COVERAGE_RULES_CHECK='
-$(A''M_V_at)$(MAKE) $(AM_MAKEFLAGS) -k check
$(A''M_V_at)$(MAKE) $(AM_MAKEFLAGS) code-coverage-capture
']
[CODE_COVERAGE_RULES_CAPTURE='
$(code_coverage_v_lcov_cap)$(LCOV) $(code_coverage_quiet) $(addprefix --directory ,$(CODE_COVERAGE_DIRECTORY)) --capture --output-file "$(CODE_COVERAGE_OUTPUT_FILE).tmp" --test-name "$(call code_coverage_sanitize,$(PACKAGE_NAME)-$(PACKAGE_VERSION))" --no-checksum --compat-libtool $(CODE_COVERAGE_LCOV_SHOPTS) $(CODE_COVERAGE_LCOV_OPTIONS)
$(code_coverage_v_lcov_ign)$(LCOV) $(code_coverage_quiet) $(addprefix --directory ,$(CODE_COVERAGE_DIRECTORY)) --remove "$(CODE_COVERAGE_OUTPUT_FILE).tmp" "/tmp/*" $(CODE_COVERAGE_IGNORE_PATTERN) --output-file "$(CODE_COVERAGE_OUTPUT_FILE)" $(CODE_COVERAGE_LCOV_SHOPTS) $(CODE_COVERAGE_LCOV_RMOPTS)
-@rm -f $(CODE_COVERAGE_OUTPUT_FILE).tmp
$(code_coverage_v_genhtml)LANG=C $(GENHTML) $(code_coverage_quiet) $(addprefix --prefix ,$(CODE_COVERAGE_DIRECTORY)) --output-directory "$(CODE_COVERAGE_OUTPUT_DIRECTORY)" --title "$(PACKAGE_NAME)-$(PACKAGE_VERSION) Code Coverage" --legend --show-details "$(CODE_COVERAGE_OUTPUT_FILE)" $(CODE_COVERAGE_GENHTML_OPTIONS)
@echo "file://$(abs_builddir)/$(CODE_COVERAGE_OUTPUT_DIRECTORY)/index.html"
']
[CODE_COVERAGE_RULES_CLEAN='
clean: code-coverage-clean
distclean: code-coverage-clean
code-coverage-clean:
-$(LCOV) --directory $(top_builddir) -z
-rm -rf $(CODE_COVERAGE_OUTPUT_FILE) $(CODE_COVERAGE_OUTPUT_FILE).tmp $(CODE_COVERAGE_OUTPUT_DIRECTORY)
-find . \( -name "*.gcda" -o -name "*.gcno" -o -name "*.gcov" \) -delete
']
], [
[CODE_COVERAGE_RULES_CHECK='
@echo "Need to reconfigure with --enable-code-coverage"
']
CODE_COVERAGE_RULES_CAPTURE="$CODE_COVERAGE_RULES_CHECK"
CODE_COVERAGE_RULES_CLEAN=''
])
[CODE_COVERAGE_RULES='
# Code coverage
#
# Optional:
# - CODE_COVERAGE_DIRECTORY: Top-level directory for code coverage reporting.
# Multiple directories may be specified, separated by whitespace.
# (Default: $(top_builddir))
# - CODE_COVERAGE_OUTPUT_FILE: Filename and path for the .info file generated
# by lcov for code coverage. (Default:
# $(PACKAGE_NAME)-$(PACKAGE_VERSION)-coverage.info)
# - CODE_COVERAGE_OUTPUT_DIRECTORY: Directory for generated code coverage
# reports to be created. (Default:
# $(PACKAGE_NAME)-$(PACKAGE_VERSION)-coverage)
# - CODE_COVERAGE_BRANCH_COVERAGE: Set to 1 to enforce branch coverage,
# set to 0 to disable it and leave empty to stay with the default.
# (Default: empty)
# - CODE_COVERAGE_LCOV_SHOPTS_DEFAULT: Extra options shared between both lcov
# instances. (Default: based on $CODE_COVERAGE_BRANCH_COVERAGE)
# - CODE_COVERAGE_LCOV_SHOPTS: Extra options shared between both lcov
# instances. (Default: $CODE_COVERAGE_LCOV_SHOPTS_DEFAULT)
# - CODE_COVERAGE_LCOV_OPTIONS_GCOVPATH: --gcov-tool pathtogcov
# - CODE_COVERAGE_LCOV_OPTIONS_DEFAULT: Extra options to pass to the
# collecting lcov instance. (Default: $CODE_COVERAGE_LCOV_OPTIONS_GCOVPATH)
# - CODE_COVERAGE_LCOV_OPTIONS: Extra options to pass to the collecting lcov
# instance. (Default: $CODE_COVERAGE_LCOV_OPTIONS_DEFAULT)
# - CODE_COVERAGE_LCOV_RMOPTS_DEFAULT: Extra options to pass to the filtering
# lcov instance. (Default: empty)
# - CODE_COVERAGE_LCOV_RMOPTS: Extra options to pass to the filtering lcov
# instance. (Default: $CODE_COVERAGE_LCOV_RMOPTS_DEFAULT)
# - CODE_COVERAGE_GENHTML_OPTIONS_DEFAULT: Extra options to pass to the
# genhtml instance. (Default: based on $CODE_COVERAGE_BRANCH_COVERAGE)
# - CODE_COVERAGE_GENHTML_OPTIONS: Extra options to pass to the genhtml
# instance. (Default: $CODE_COVERAGE_GENHTML_OPTIONS_DEFAULT)
# - CODE_COVERAGE_IGNORE_PATTERN: Extra glob pattern of files to ignore
#
# The generated report will be titled using the $(PACKAGE_NAME) and
# $(PACKAGE_VERSION). In order to add the current git hash to the title,
# use the git-version-gen script, available online.
# Optional variables
CODE_COVERAGE_DIRECTORY ?= $(top_builddir)
CODE_COVERAGE_OUTPUT_FILE ?= $(PACKAGE_NAME)-$(PACKAGE_VERSION)-coverage.info
CODE_COVERAGE_OUTPUT_DIRECTORY ?= $(PACKAGE_NAME)-$(PACKAGE_VERSION)-coverage
CODE_COVERAGE_BRANCH_COVERAGE ?=
CODE_COVERAGE_LCOV_SHOPTS_DEFAULT ?= $(if $(CODE_COVERAGE_BRANCH_COVERAGE),\
--rc lcov_branch_coverage=$(CODE_COVERAGE_BRANCH_COVERAGE))
CODE_COVERAGE_LCOV_SHOPTS ?= $(CODE_COVERAGE_LCOV_SHOPTS_DEFAULT)
CODE_COVERAGE_LCOV_OPTIONS_GCOVPATH ?= --gcov-tool "$(GCOV)"
CODE_COVERAGE_LCOV_OPTIONS_DEFAULT ?= $(CODE_COVERAGE_LCOV_OPTIONS_GCOVPATH)
CODE_COVERAGE_LCOV_OPTIONS ?= $(CODE_COVERAGE_LCOV_OPTIONS_DEFAULT)
CODE_COVERAGE_LCOV_RMOPTS_DEFAULT ?=
CODE_COVERAGE_LCOV_RMOPTS ?= $(CODE_COVERAGE_LCOV_RMOPTS_DEFAULT)
CODE_COVERAGE_GENHTML_OPTIONS_DEFAULT ?=\
$(if $(CODE_COVERAGE_BRANCH_COVERAGE),\
--rc genhtml_branch_coverage=$(CODE_COVERAGE_BRANCH_COVERAGE))
CODE_COVERAGE_GENHTML_OPTIONS ?= $(CODE_COVERAGE_GENHTML_OPTIONS_DEFAULT)
CODE_COVERAGE_IGNORE_PATTERN ?=
GITIGNOREFILES ?=
GITIGNOREFILES += $(CODE_COVERAGE_OUTPUT_FILE) $(CODE_COVERAGE_OUTPUT_DIRECTORY)
code_coverage_v_lcov_cap = $(code_coverage_v_lcov_cap_$(V))
code_coverage_v_lcov_cap_ = $(code_coverage_v_lcov_cap_$(AM_DEFAULT_VERBOSITY))
code_coverage_v_lcov_cap_0 = @echo " LCOV --capture"\
$(CODE_COVERAGE_OUTPUT_FILE);
code_coverage_v_lcov_ign = $(code_coverage_v_lcov_ign_$(V))
code_coverage_v_lcov_ign_ = $(code_coverage_v_lcov_ign_$(AM_DEFAULT_VERBOSITY))
code_coverage_v_lcov_ign_0 = @echo " LCOV --remove /tmp/*"\
$(CODE_COVERAGE_IGNORE_PATTERN);
code_coverage_v_genhtml = $(code_coverage_v_genhtml_$(V))
code_coverage_v_genhtml_ = $(code_coverage_v_genhtml_$(AM_DEFAULT_VERBOSITY))
code_coverage_v_genhtml_0 = @echo " GEN " $(CODE_COVERAGE_OUTPUT_DIRECTORY);
code_coverage_quiet = $(code_coverage_quiet_$(V))
code_coverage_quiet_ = $(code_coverage_quiet_$(AM_DEFAULT_VERBOSITY))
code_coverage_quiet_0 = --quiet
# sanitizes the test-name: replaces dashes and dots with underscores
code_coverage_sanitize = $(subst -,_,$(subst .,_,$(1)))
# Use recursive makes in order to ignore errors during check
check-code-coverage:'"$CODE_COVERAGE_RULES_CHECK"'
# Capture code coverage data
code-coverage-capture: code-coverage-capture-hook'"$CODE_COVERAGE_RULES_CAPTURE"'
# Hook rule executed before code-coverage-capture, overridable by the user
code-coverage-capture-hook:
'"$CODE_COVERAGE_RULES_CLEAN"'
A''M_DISTCHECK_CONFIGURE_FLAGS ?=
A''M_DISTCHECK_CONFIGURE_FLAGS += --disable-code-coverage
.PHONY: check-code-coverage code-coverage-capture code-coverage-capture-hook code-coverage-clean
']
AC_SUBST([CODE_COVERAGE_RULES])
m4_ifdef([_AM_SUBST_NOTMAKE], [_AM_SUBST_NOTMAKE([CODE_COVERAGE_RULES])])
])


@ -2,24 +2,25 @@ deb-local:
@(if test "${HAVE_DPKGBUILD}" = "no"; then \
echo -e "\n" \
"*** Required util ${DPKGBUILD} missing. Please install the\n" \
"*** package for your distribution which provides ${DPKGBUILD},\n" \
"*** package for your distribution which provides ${DPKGBUILD},\n" \
"*** re-run configure, and try again.\n"; \
exit 1; \
exit 1; \
fi; \
if test "${HAVE_ALIEN}" = "no"; then \
echo -e "\n" \
"*** Required util ${ALIEN} missing. Please install the\n" \
"*** package for your distribution which provides ${ALIEN},\n" \
"*** package for your distribution which provides ${ALIEN},\n" \
"*** re-run configure, and try again.\n"; \
exit 1; \
exit 1; \
fi)
deb-kmod: deb-local rpm-kmod
name=${PACKAGE}; \
version=${VERSION}-${RELEASE}; \
arch=`$(RPM) -qp $${name}-kmod-$${version}.src.rpm --qf %{arch} | tail -1`; \
debarch=`$(DPKG) --print-architecture`; \
pkg1=kmod-$${name}*$${version}.$${arch}.rpm; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb $$pkg1; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1; \
$(RM) $$pkg1
@ -27,14 +28,16 @@ deb-dkms: deb-local rpm-dkms
name=${PACKAGE}; \
version=${VERSION}-${RELEASE}; \
arch=`$(RPM) -qp $${name}-dkms-$${version}.src.rpm --qf %{arch} | tail -1`; \
debarch=`$(DPKG) --print-architecture`; \
pkg1=$${name}-dkms-$${version}.$${arch}.rpm; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb $$pkg1; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1; \
$(RM) $$pkg1
deb-utils: deb-local rpm-utils
name=${PACKAGE}; \
version=${VERSION}-${RELEASE}; \
arch=`$(RPM) -qp $${name}-$${version}.src.rpm --qf %{arch} | tail -1`; \
debarch=`$(DPKG) --print-architecture`; \
pkg1=$${name}-$${version}.$${arch}.rpm; \
pkg2=libnvpair1-$${version}.$${arch}.rpm; \
pkg3=libuutil1-$${version}.$${arch}.rpm; \
@ -57,7 +60,7 @@ deb-utils: deb-local rpm-utils
## which should NOT be mixed with the alien-generated debs created here
chmod +x $${path_prepend}/dh_shlibdeps; \
env PATH=$${path_prepend}:$${PATH} \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch \
$$pkg1 $$pkg2 $$pkg3 $$pkg4 $$pkg5 $$pkg6 $$pkg7 \
$$pkg8 $$pkg9; \
$(RM) $${path_prepend}/dh_shlibdeps; \


@ -0,0 +1,21 @@
dnl #
dnl # Linux 5.0: access_ok() drops 'type' parameter:
dnl #
dnl # - access_ok(type, addr, size)
dnl # + access_ok(addr, size)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ACCESS_OK_TYPE], [
AC_MSG_CHECKING([whether access_ok() has 'type' parameter])
ZFS_LINUX_TRY_COMPILE([
#include <linux/uaccess.h>
],[
const void __user __attribute__((unused)) *addr = (void *) 0xdeadbeef;
unsigned long __attribute__((unused)) size = 1;
int error __attribute__((unused)) = access_ok(0, addr, size);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ACCESS_OK_TYPE, 1, [kernel has access_ok with 'type' parameter])
],[
AC_MSG_RESULT(no)
])
])
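The configure check above only defines HAVE_ACCESS_OK_TYPE; consuming code would typically wrap the call so both calling conventions work. A hedged sketch of that pattern, where the zfs_access_ok name is invented for illustration rather than taken from the tree:
/*
 * Hypothetical compat shim keyed off HAVE_ACCESS_OK_TYPE from the check above.
 */
#if defined(HAVE_ACCESS_OK_TYPE)
#define	zfs_access_ok(type, addr, size)	access_ok(type, addr, size)
#else
#define	zfs_access_ok(type, addr, size)	access_ok(addr, size)
#endif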


@ -0,0 +1,20 @@
dnl #
dnl # 4.16 kernel: check if struct posix_acl acl.a_refcount is a refcount_t.
dnl # It's an atomic_t on older kernels.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ACL_HAS_REFCOUNT], [
AC_MSG_CHECKING([whether posix_acl has refcount_t])
ZFS_LINUX_TRY_COMPILE([
#include <linux/backing-dev.h>
#include <linux/refcount.h>
#include <linux/posix_acl.h>
],[
struct posix_acl acl;
refcount_t *r __attribute__ ((unused)) = &acl.a_refcount;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ACL_REFCOUNT, 1, [posix_acl has refcount_t])
],[
AC_MSG_RESULT(no)
])
])
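Code built on HAVE_ACL_REFCOUNT then picks the matching initializer for a_refcount. A hedged sketch of that pattern; the zfs_acl_refcount_init helper name is invented here and only the #ifdef structure is the point:
#include <linux/posix_acl.h>
#include <linux/refcount.h>

/* Hypothetical helper, assuming the HAVE_ACL_REFCOUNT result from above. */
static inline void
zfs_acl_refcount_init(struct posix_acl *acl)
{
#ifdef HAVE_ACL_REFCOUNT
	refcount_set(&acl->a_refcount, 1);
#else
	atomic_set(&acl->a_refcount, 1);
#endif
}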


@ -184,6 +184,7 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_PERMISSION_WITH_NAMEIDATA], [
AC_MSG_CHECKING([whether iops->permission() wants nameidata])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
#include <linux/sched.h>
int permission_fn(struct inode *inode, int mask,
struct nameidata *nd) { return 0; }


@ -1,34 +0,0 @@
dnl #
dnl # 2.6.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS], [
AC_MSG_CHECKING([block device operation prototypes])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
int blk_open(struct block_device *bdev, fmode_t mode)
{ return 0; }
int blk_ioctl(struct block_device *bdev, fmode_t mode,
unsigned x, unsigned long y) { return 0; }
int blk_compat_ioctl(struct block_device * bdev, fmode_t mode,
unsigned x, unsigned long y) { return 0; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = blk_open,
.release = NULL,
.ioctl = blk_ioctl,
.compat_ioctl = blk_compat_ioctl,
};
],[
],[
AC_MSG_RESULT(struct block_device)
AC_DEFINE(HAVE_BDEV_BLOCK_DEVICE_OPERATIONS, 1,
[struct block_device_operations use bdevs])
],[
AC_MSG_RESULT(struct inode)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -1,10 +1,10 @@
dnl #
dnl # Linux 4.14 API,
dnl #
dnl # The bio_set_dev() helper was introduced as part of the transition
dnl # The bio_set_dev() helper macro was introduced as part of the transition
dnl # to have struct gendisk in struct bio.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV_MACRO], [
AC_MSG_CHECKING([whether bio_set_dev() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/bio.h>
@ -20,3 +20,34 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 5.0 API,
dnl #
dnl # The bio_set_dev() helper macro was updated to internally depend on
dnl # the bio_associate_blkg() symbol, which is exported GPL-only.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV_GPL_ONLY], [
AC_MSG_CHECKING([whether bio_set_dev() is GPL-only])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <linux/bio.h>
#include <linux/fs.h>
MODULE_LICENSE("$ZFS_META_LICENSE");
],[
struct block_device *bdev = NULL;
struct bio *bio = NULL;
bio_set_dev(bio, bdev);
],[
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_SET_DEV_GPL_ONLY, 1,
[bio_set_dev() GPL-only])
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
ZFS_AC_KERNEL_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_BIO_SET_DEV_GPL_ONLY
])
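
For illustration only, a sketch of the kind of non-GPL fallback such a check might gate; zfs_bio_set_dev() is a hypothetical name, and this version deliberately skips the blkg/cgroup association that the 5.0 macro performs.

    #include <linux/bio.h>
    #include <linux/blkdev.h>

    #if defined(HAVE_BIO_SET_DEV_GPL_ONLY)
    /* Assign the 4.14+ gendisk/partition fields directly. */
    #define zfs_bio_set_dev(bio, bdev)                    \
    do {                                                  \
            (bio)->bi_disk = (bdev)->bd_disk;             \
            (bio)->bi_partno = (bdev)->bd_partno;         \
    } while (0)
    #else
    #define zfs_bio_set_dev(bio, bdev)  bio_set_dev((bio), (bdev))
    #endif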

View File

@ -0,0 +1,38 @@
dnl #
dnl # API change
dnl # https://github.com/torvalds/linux/commit/8814ce8
dnl # Introduction of blk_queue_flag_set and blk_queue_flag_clear
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET], [
AC_MSG_CHECKING([whether blk_queue_flag_set() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <linux/blkdev.h>
],[
struct request_queue *q = NULL;
blk_queue_flag_set(0, q);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_FLAG_SET, 1, [blk_queue_flag_set() exists])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLAG_CLEAR], [
AC_MSG_CHECKING([whether blk_queue_flag_clear() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <linux/blkdev.h>
],[
struct request_queue *q = NULL;
blk_queue_flag_clear(0, q);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_FLAG_CLEAR, 1, [blk_queue_flag_clear() exists])
],[
AC_MSG_RESULT(no)
])
])
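
A hedged sketch of how the two defines above might be used; the helper name and the choice of QUEUE_FLAG_NONROT are illustrative, not taken from the diff.

    #include <linux/blkdev.h>

    static inline void
    zfs_mark_queue_nonrot(struct request_queue *q)
    {
    #if defined(HAVE_BLK_QUEUE_FLAG_SET)
            blk_queue_flag_set(QUEUE_FLAG_NONROT, q);        /* 4.16+ helper */
    #else
            queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);   /* older kernels */
    #endif
    }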

View File

@ -1,29 +0,0 @@
dnl #
dnl # 3.10.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
AC_MSG_CHECKING([whether block_device_operations.release is void])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = NULL,
.release = blk_release,
.ioctl = NULL,
.compat_ioctl = NULL,
};
],[
],[
AC_MSG_RESULT(void)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID, 1,
[struct block_device_operations.release returns void])
],[
AC_MSG_RESULT(int)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -0,0 +1,57 @@
dnl #
dnl # 2.6.38 API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
AC_MSG_CHECKING([whether bops->check_events() exists])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS, 1,
[bops->check_events() exists])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])
dnl #
dnl # 3.10.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
AC_MSG_CHECKING([whether bops->release() is void])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = NULL,
.release = blk_release,
.ioctl = NULL,
.compat_ioctl = NULL,
};
],[
],[
AC_MSG_RESULT(void)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID, 1,
[bops->release() returns void])
],[
AC_MSG_RESULT(int)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -5,6 +5,7 @@ AC_DEFUN([ZFS_AC_KERNEL_CREATE_NAMEIDATA], [
AC_MSG_CHECKING([whether iops->create() passes nameidata])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
#include <linux/sched.h>
#ifdef HAVE_MKDIR_UMODE_T
int inode_create(struct inode *inode ,struct dentry *dentry,

View File

@ -1,15 +1,14 @@
dnl #
dnl # 4.9, current_time() added
dnl # 4.18, return type changed from timespec to timespec64
dnl #
AC_DEFUN([ZFS_AC_KERNEL_CURRENT_TIME],
[AC_MSG_CHECKING([whether current_time() exists])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
struct inode ip;
struct timespec now __attribute__ ((unused));
now = current_time(&ip);
struct inode ip __attribute__ ((unused));
ip.i_atime = current_time(&ip);
], [current_time], [fs/inode.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CURRENT_TIME, 1, [current_time() exists])

View File

@ -5,6 +5,7 @@ AC_DEFUN([ZFS_AC_KERNEL_D_REVALIDATE_NAMEIDATA], [
AC_MSG_CHECKING([whether dops->d_revalidate() takes struct nameidata])
ZFS_LINUX_TRY_COMPILE([
#include <linux/dcache.h>
#include <linux/sched.h>
int revalidate (struct dentry *dentry,
struct nameidata *nidata) { return 0; }

View File

@ -1,6 +1,6 @@
dnl #
dnl # 2.6.36 API change
dnl # Verify the elevator_change() symbol is available.
dnl # 2.6.36 API, exported elevator_change() symbol
dnl # 4.12 API, removed elevator_change() symbol
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ELEVATOR_CHANGE], [
AC_MSG_CHECKING([whether elevator_change() is available])

View File

@ -1,18 +1,41 @@
dnl #
dnl # 4.2 API change
dnl # asm/i387.h is replaced by asm/fpu/api.h
dnl # Handle differences in kernel FPU code.
dnl #
dnl # Kernel
dnl # 5.0: All kernel fpu functions are GPL only, so we can't use them.
dnl # (nothing defined)
dnl #
dnl # 4.2: Use __kernel_fpu_{begin,end}()
dnl # HAVE_UNDERSCORE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
dnl # Pre-4.2: Use kernel_fpu_{begin,end}()
dnl # HAVE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FPU], [
AC_MSG_CHECKING([whether asm/fpu/api.h exists])
AC_MSG_CHECKING([which kernel_fpu function to use])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <asm/fpu/api.h>
#include <asm/i387.h>
#include <asm/xcr.h>
],[
__kernel_fpu_begin();
kernel_fpu_begin();
kernel_fpu_end();
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FPU_API_H, 1, [kernel has <asm/fpu/api.h> interface])
AC_MSG_RESULT(kernel_fpu_*)
AC_DEFINE(HAVE_KERNEL_FPU, 1, [kernel has kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1, [kernel exports FPU functions])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <asm/fpu/api.h>
],[
__kernel_fpu_begin();
__kernel_fpu_end();
],[
AC_MSG_RESULT(__kernel_fpu_*)
AC_DEFINE(HAVE_UNDERSCORE_KERNEL_FPU, 1, [kernel has __kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1, [kernel exports FPU functions])
],[
AC_MSG_RESULT(not exported)
])
])
])
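
To illustrate the three outcomes described above, a sketch of how the resulting defines might wrap a SIMD section; kfpu_begin()/kfpu_end() are illustrative macro names and the header locations are noted as assumptions.

    #if defined(HAVE_KERNEL_FPU)
    #include <asm/i387.h>              /* pre-4.2 header location */
    #define kfpu_begin()    kernel_fpu_begin()
    #define kfpu_end()      kernel_fpu_end()
    #elif defined(HAVE_UNDERSCORE_KERNEL_FPU)
    #include <asm/fpu/api.h>           /* 4.2+ header location */
    #define kfpu_begin()    __kernel_fpu_begin()
    #define kfpu_end()      __kernel_fpu_end()
    #else
    /* 5.0+: the FPU functions are GPL-only, so vector code paths stay disabled. */
    #define kfpu_begin()    ((void)0)
    #define kfpu_end()      ((void)0)
    #endif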

View File

@ -0,0 +1,28 @@
dnl #
dnl # 2.6.38 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FST_MOUNT], [
AC_MSG_CHECKING([whether fst->mount() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
static struct dentry *
mount(struct file_system_type *fs_type, int flags,
const char *osname, void *data) {
struct dentry *d = NULL;
return (d);
}
static struct file_system_type fst __attribute__ ((unused)) = {
.mount = mount,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FST_MOUNT, 1, [fst->mount() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,19 @@
dnl #
dnl # 4.16 API change
dnl # Verify if get_disk_and_module() symbol is available.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_AND_MODULE],
[AC_MSG_CHECKING([whether get_disk_and_module() is available])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/genhd.h>
], [
struct gendisk *disk = NULL;
(void) get_disk_and_module(disk);
], [get_disk_and_module], [block/genhd.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_DISK_AND_MODULE,
1, [get_disk_and_module() is available])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -41,7 +41,7 @@ AC_DEFUN([ZFS_AC_KERNEL_FOLLOW_LINK], [
AC_DEFINE(HAVE_FOLLOW_LINK_NAMEIDATA, 1,
[iops->follow_link() nameidata])
],[
AC_MSG_ERROR(no; please file a bug report)
AC_MSG_ERROR(no; please file a bug report)
])
])
])

View File

@ -0,0 +1,109 @@
dnl #
dnl # 4.8 API change
dnl #
dnl # 75ef71840539 mm, vmstat: add infrastructure for per-node vmstats
dnl # 599d0c954f91 mm, vmscan: move LRU lists to node
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_NODE_PAGE_STATE], [
AC_MSG_CHECKING([whether global_node_page_state() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
#include <linux/vmstat.h>
],[
(void) global_node_page_state(0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GLOBAL_NODE_PAGE_STATE, 1, [global_node_page_state() exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 4.14 API change
dnl #
dnl # c41f012ade0b mm: rename global_page_state to global_zone_page_state
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE], [
AC_MSG_CHECKING([whether global_zone_page_state() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
#include <linux/vmstat.h>
],[
(void) global_zone_page_state(0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GLOBAL_ZONE_PAGE_STATE, 1, [global_zone_page_state() exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Create a define and autoconf variable for an enum member
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ENUM_MEMBER], [
AC_MSG_CHECKING([whether enum $2 contains $1])
AS_IF([AC_TRY_COMMAND("${srcdir}/scripts/enum-extract.pl" "$2" "$3" | egrep -qx $1)],[
AC_MSG_RESULT([yes])
AC_DEFINE(m4_join([_], [ZFS_ENUM], m4_toupper($2), $1), 1, [enum $2 contains $1])
m4_join([_], [ZFS_ENUM], m4_toupper($2), $1)=1
],[
AC_MSG_RESULT([no])
])
])
dnl #
dnl # Sanity check helpers
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_ERROR],[
AC_MSG_RESULT(no)
AC_MSG_RESULT([$1 in either node_stat_item or zone_stat_item: $2])
AC_MSG_RESULT([configure needs updating, see: config/kernel-global_page_state.m4])
AC_MSG_FAILURE([SHUT 'ER DOWN CLANCY, SHE'S PUMPIN' MUD!])
])
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK], [
enum_check_a="m4_join([_], [$ZFS_ENUM_NODE_STAT_ITEM], $1)"
enum_check_b="m4_join([_], [$ZFS_ENUM_ZONE_STAT_ITEM], $1)"
AS_IF([test -n "$enum_check_a" -a -n "$enum_check_b"],[
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_ERROR([$1], [DUPLICATE])
])
AS_IF([test -z "$enum_check_a" -a -z "$enum_check_b"],[
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_ERROR([$1], [NOT FOUND])
])
])
dnl #
dnl # Ensure the config tests are finding one and only one of each enum of interest
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE_SANITY], [
AC_MSG_CHECKING([global_page_state enums are sane])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_FILE_PAGES])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_INACTIVE_ANON])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_INACTIVE_FILE])
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE_ENUM_CHECK([NR_SLAB_RECLAIMABLE])
AC_MSG_RESULT(yes)
])
dnl #
dnl # enum members in which we're interested
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GLOBAL_PAGE_STATE], [
ZFS_AC_KERNEL_GLOBAL_NODE_PAGE_STATE
ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE
ZFS_AC_KERNEL_ENUM_MEMBER([NR_FILE_PAGES], [node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_ANON], [node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_FILE], [node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_SLAB_RECLAIMABLE], [node_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_FILE_PAGES], [zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_ANON], [zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_INACTIVE_FILE], [zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_ENUM_MEMBER([NR_SLAB_RECLAIMABLE], [zone_stat_item], [$LINUX/include/linux/mmzone.h])
ZFS_AC_KERNEL_GLOBAL_ZONE_PAGE_STATE_SANITY
])
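
A hedged sketch of how the ZFS_GLOBAL_*_PAGE_STATE and ZFS_ENUM_* results might be combined to read one counter; the helper name is illustrative.

    #include <linux/mm.h>
    #include <linux/vmstat.h>

    static inline unsigned long
    zfs_nr_inactive_file(void)
    {
    #if defined(ZFS_ENUM_NODE_STAT_ITEM_NR_INACTIVE_FILE)
            /* 4.8+: NR_INACTIVE_FILE lives in node_stat_item. */
            return (global_node_page_state(NR_INACTIVE_FILE));
    #else
            /* Older kernels: it is a zone_stat_item. */
            return (global_page_state(NR_INACTIVE_FILE));
    #endif
    }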

View File

@ -0,0 +1,20 @@
dnl #
dnl # 4.5 API change
dnl # Added in_compat_syscall() which can be overridden on a per-
dnl # architecture basis. Prior to this is_compat_task() was the
dnl # provided interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_IN_COMPAT_SYSCALL], [
AC_MSG_CHECKING([whether in_compat_syscall() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/compat.h>
],[
in_compat_syscall();
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IN_COMPAT_SYSCALL, 1,
[in_compat_syscall() is available])
],[
AC_MSG_RESULT(no)
])
])
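
A minimal sketch of a consumer, assuming the usual pattern of falling back to is_compat_task(); the wrapper name is hypothetical.

    #include <linux/compat.h>

    static inline int
    zfs_in_compat_syscall(void)
    {
    #if defined(HAVE_IN_COMPAT_SYSCALL)
            return (in_compat_syscall());   /* 4.5+, per-architecture aware */
    #else
            return (is_compat_task());      /* older interface noted above */
    #endif
    }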

View File

@ -0,0 +1,19 @@
dnl #
dnl # 4.16 API change
dnl # inode_set_iversion introduced to set i_version
dnl #
AC_DEFUN([ZFS_AC_KERNEL_INODE_SET_IVERSION], [
AC_MSG_CHECKING([whether inode_set_iversion() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/iversion.h>
],[
struct inode inode;
inode_set_iversion(&inode, 1);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_SET_IVERSION, 1,
[inode_set_iversion() exists])
],[
AC_MSG_RESULT(no)
])
])
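
A sketch, for context only, of the kind of shim HAVE_INODE_SET_IVERSION typically gates; the fallback assumes the older u64 inode->i_version field.

    #include <linux/fs.h>

    #if defined(HAVE_INODE_SET_IVERSION)
    #include <linux/iversion.h>        /* 4.16+ provides inode_set_iversion() */
    #else
    /* Older kernels expose i_version directly on struct inode. */
    static inline void
    inode_set_iversion(struct inode *ip, u64 val)
    {
            ip->i_version = val;
    }
    #endif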

View File

@ -5,6 +5,7 @@ AC_DEFUN([ZFS_AC_KERNEL_LOOKUP_NAMEIDATA], [
AC_MSG_CHECKING([whether iops->lookup() passes nameidata])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
#include <linux/sched.h>
struct dentry *inode_lookup(struct inode *inode,
struct dentry *dentry, struct nameidata *nidata)

View File

@ -0,0 +1,26 @@
dnl #
dnl # Determine an available miscellaneous minor number which can be used
dnl # for the /dev/zfs device. This is needed because kernel module
dnl # auto-loading depends on registering a reserved non-conflicting minor
dnl # number. Start with a large known available unreserved minor and work
dnl # our way down to a lower value if a collision is detected.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_MISC_MINOR], [
AC_MSG_CHECKING([for available /dev/zfs minor])
for i in $(seq 249 -1 200); do
if ! grep -q "^#define\s\+.*_MINOR\s\+.*$i" \
${LINUX}/include/linux/miscdevice.h; then
ZFS_DEVICE_MINOR="$i"
AC_MSG_RESULT($ZFS_DEVICE_MINOR)
AC_DEFINE_UNQUOTED([ZFS_DEVICE_MINOR],
[$ZFS_DEVICE_MINOR], [/dev/zfs minor])
break
fi
done
AS_IF([ test -z "$ZFS_DEVICE_MINOR"], [
AC_MSG_ERROR([
*** No available misc minor numbers available for use.])
])
])
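
For context, a hedged sketch of how the ZFS_DEVICE_MINOR value found here could be used when registering the /dev/zfs misc device; the structure and names below are illustrative, not taken from the diff.

    #include <linux/module.h>
    #include <linux/fs.h>
    #include <linux/miscdevice.h>

    static const struct file_operations zfs_fops = {
            .owner = THIS_MODULE,
    };

    static struct miscdevice zfs_misc = {
            .minor = ZFS_DEVICE_MINOR,      /* reserved, non-conflicting minor */
            .name  = "zfs",
            .fops  = &zfs_fops,
    };

    /* misc_register(&zfs_misc) then creates /dev/zfs with that fixed minor,
     * which is what lets module auto-loading work reliably. */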

View File

@ -1,20 +0,0 @@
dnl #
dnl # 2.6.39 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure. When using the new
dnl # interface the caller must now use the mount_nodev() helper.
dnl # This updated callback and helper no longer pass the vfsmount.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_MOUNT_NODEV],
[AC_MSG_CHECKING([whether mount_nodev() is available])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
mount_nodev(NULL, 0, NULL, NULL);
], [mount_nodev], [fs/super.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MOUNT_NODEV, 1, [mount_nodev() is available])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,67 @@
dnl #
dnl # 2.6.38 API change
dnl # ns_capable() was introduced
dnl #
AC_DEFUN([ZFS_AC_KERNEL_NS_CAPABLE], [
AC_MSG_CHECKING([whether ns_capable exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/capability.h>
],[
ns_capable((struct user_namespace *)NULL, CAP_SYS_ADMIN);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_NS_CAPABLE, 1,
[ns_capable exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 2.6.39 API change
dnl # struct user_namespace was added to struct cred_t as
dnl # cred->user_ns member
dnl # Note that current_user_ns() was added in 2.6.28.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_CRED_USER_NS], [
AC_MSG_CHECKING([whether cred_t->user_ns exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/cred.h>
],[
struct cred cr;
cr.user_ns = (struct user_namespace *)NULL;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CRED_USER_NS, 1,
[cred_t->user_ns exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 3.4 API change
dnl # kuid_has_mapping() and kgid_has_mapping() were added to distinguish
dnl # between internal kernel uids/gids and user namespace uids/gids.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_KUID_HAS_MAPPING], [
AC_MSG_CHECKING([whether kuid_has_mapping/kgid_has_mapping exist])
ZFS_LINUX_TRY_COMPILE([
#include <linux/uidgid.h>
],[
kuid_has_mapping((struct user_namespace *)NULL, KUIDT_INIT(0));
kgid_has_mapping((struct user_namespace *)NULL, KGIDT_INIT(0));
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KUID_HAS_MAPPING, 1,
[kuid_has_mapping/kgid_has_mapping exist])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_USERNS_CAPABILITIES], [
ZFS_AC_KERNEL_NS_CAPABLE
ZFS_AC_KERNEL_CRED_USER_NS
ZFS_AC_KERNEL_KUID_HAS_MAPPING
])
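
A sketch combining the three probes above into a namespace-aware privilege check; zfs_ns_admin() is a hypothetical helper and the logic is illustrative only.

    #include <linux/errno.h>
    #include <linux/capability.h>
    #include <linux/cred.h>
    #include <linux/uidgid.h>

    static inline int
    zfs_ns_admin(const struct cred *cr, kuid_t uid)
    {
    #if defined(HAVE_NS_CAPABLE) && defined(HAVE_CRED_USER_NS) && \
        defined(HAVE_KUID_HAS_MAPPING)
            /* Refuse uids that have no mapping in the caller's namespace. */
            if (!kuid_has_mapping(cr->user_ns, uid))
                    return (-EPERM);
            return (ns_capable(cr->user_ns, CAP_SYS_ADMIN) ? 0 : -EPERM);
    #else
            return (capable(CAP_SYS_ADMIN) ? 0 : -EPERM);
    #endif
    }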

View File

@ -23,16 +23,27 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
dnl #
dnl # 3.11 API change
dnl #
dnl # RHEL 7.5 compatibility; the fops.iterate() method was
dnl # added to the file_operations structure but in order to
dnl # maintain KABI compatibility all callers must set
dnl # FMODE_KABI_ITERATE which is checked in iterate_dir().
dnl # When detected, ignore this interface and fall back
dnl # to using fops.readdir() to retain KABI compatibility.
dnl #
AC_MSG_CHECKING([whether fops->iterate() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int iterate(struct file *filp, struct dir_context * context)
{ return 0; }
int iterate(struct file *filp,
struct dir_context *context) { return 0; }
static const struct file_operations fops
__attribute__ ((unused)) = {
.iterate = iterate,
};
#if defined(FMODE_KABI_ITERATE)
#error "RHEL 7.5, FMODE_KABI_ITERATE interface"
#endif
],[
],[
AC_MSG_RESULT(yes)
@ -44,8 +55,8 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
AC_MSG_CHECKING([whether fops->readdir() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int readdir(struct file *filp, void *entry, filldir_t func)
{ return 0; }
int readdir(struct file *filp, void *entry,
filldir_t func) { return 0; }
static const struct file_operations fops
__attribute__ ((unused)) = {
@ -57,7 +68,7 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
AC_DEFINE(HAVE_VFS_READDIR, 1,
[fops->readdir() is available])
],[
AC_MSG_ERROR(no; file a bug report with ZFSOnLinux)
AC_MSG_ERROR(no; file a bug report with ZoL)
])
])
])
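
To show how these probes are typically consumed, a hedged sketch follows; the HAVE_VFS_ITERATE name and the zpl_* stubs are assumptions, since only the HAVE_VFS_READDIR define is visible in this hunk.

    #include <linux/fs.h>

    #if defined(HAVE_VFS_ITERATE)
    static int zpl_iterate(struct file *filp, struct dir_context *ctx)
    { return (0); }
    static const struct file_operations zpl_dir_ops __attribute__((unused)) = {
            .iterate = zpl_iterate,   /* 3.11+, skipped on RHEL 7.5 KABI kernels */
    };
    #else
    static int zpl_readdir(struct file *filp, void *dirent, filldir_t filldir)
    { return (0); }
    static const struct file_operations zpl_dir_ops __attribute__((unused)) = {
            .readdir = zpl_readdir,   /* legacy interface probed above */
    };
    #endif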

View File

@ -32,15 +32,23 @@ dnl #
dnl # Linux 4.1 API
dnl #
AC_DEFUN([ZFS_AC_KERNEL_NEW_SYNC_READ],
[AC_MSG_CHECKING([whether new_sync_read() is available])
[AC_MSG_CHECKING([whether new_sync_read/write() are available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
],[
new_sync_read(NULL, NULL, 0, NULL);
ssize_t ret __attribute__ ((unused));
struct file *filp = NULL;
char __user *rbuf = NULL;
const char __user *wbuf = NULL;
size_t len = 0;
loff_t ppos;
ret = new_sync_read(filp, rbuf, len, &ppos);
ret = new_sync_write(filp, wbuf, len, &ppos);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_NEW_SYNC_READ, 1,
[new_sync_read() is available])
[new_sync_read()/new_sync_write() are available])
],[
AC_MSG_RESULT(no)
])

View File

@ -1,22 +0,0 @@
dnl #
dnl # 4.8 API change
dnl # kernel vm counters change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_VM_NODE_STAT], [
AC_MSG_CHECKING([whether to use vm_node_stat based fn's])
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
#include <linux/vmstat.h>
],[
int a __attribute__ ((unused)) = NR_VM_NODE_STAT_ITEMS;
long x __attribute__ ((unused)) =
atomic_long_read(&vm_node_stat[0]);
(void) global_node_page_state(0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GLOBAL_NODE_PAGE_STATE, 1,
[using global_node_page_state()])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -5,14 +5,16 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL
ZFS_AC_SPL
ZFS_AC_QAT
ZFS_AC_KERNEL_ACCESS_OK_TYPE
ZFS_AC_TEST_MODULE
ZFS_AC_KERNEL_MISC_MINOR
ZFS_AC_KERNEL_OBJTOOL
ZFS_AC_KERNEL_CONFIG
ZFS_AC_KERNEL_DECLARE_EVENT_CLASS
ZFS_AC_KERNEL_CURRENT_BIO_TAIL
ZFS_AC_KERNEL_SUPER_USER_NS
ZFS_AC_KERNEL_SUBMIT_BIO
ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_TYPE_FMODE_T
ZFS_AC_KERNEL_3ARG_BLKDEV_GET
@ -35,11 +37,14 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_BIO_RW_BARRIER
ZFS_AC_KERNEL_BIO_RW_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_BDI
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_CLEAR
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET
ZFS_AC_KERNEL_BLK_QUEUE_FLUSH
ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS
ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BIO_RW_UNPLUG
ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BLK_PLUG
ZFS_AC_KERNEL_GET_DISK_AND_MODULE
ZFS_AC_KERNEL_GET_DISK_RO
ZFS_AC_KERNEL_GET_GENDISK
ZFS_AC_KERNEL_HAVE_BIO_SET_OP_ATTRS
@ -65,6 +70,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL
ZFS_AC_KERNEL_INODE_OPERATIONS_GETATTR
ZFS_AC_KERNEL_INODE_SET_FLAGS
ZFS_AC_KERNEL_INODE_SET_IVERSION
ZFS_AC_KERNEL_GET_ACL_HANDLE_CACHE
ZFS_AC_KERNEL_SHOW_OPTIONS
ZFS_AC_KERNEL_FILE_INODE
@ -98,7 +104,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_6ARGS_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_CALLBACK_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_MOUNT_NODEV
ZFS_AC_KERNEL_FST_MOUNT
ZFS_AC_KERNEL_SHRINK
ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_S_INSTANCES_LIST_HEAD
@ -122,7 +128,10 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_RENAME_WANTS_FLAGS
ZFS_AC_KERNEL_HAVE_GENERIC_SETXATTR
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_VM_NODE_STAT
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE
ZFS_AC_KERNEL_ACL_HAS_REFCOUNT
ZFS_AC_KERNEL_USERNS_CAPABILITIES
ZFS_AC_KERNEL_IN_COMPAT_SYSCALL
AS_IF([test "$LINUX_OBJ" != "$LINUX"], [
KERNELMAKE_PARAMS="$KERNELMAKE_PARAMS O=$LINUX_OBJ"
@ -248,7 +257,7 @@ AC_DEFUN([ZFS_AC_KERNEL], [
AS_IF([test "$utsrelease"], [
kernsrcver=`(echo "#include <$utsrelease>";
echo "kernsrcver=UTS_RELEASE") |
cpp -I $kernelbuild/include |
${CPP} -I $kernelbuild/include - |
grep "^kernsrcver=" | cut -d \" -f 2`
AS_IF([test -z "$kernsrcver"], [
@ -532,10 +541,10 @@ AC_DEFUN([ZFS_AC_QAT], [
AC_MSG_RESULT([$qatbuild])
QAT_OBJ=${qatbuild}
AS_IF([ ! test -e "$QAT_OBJ/icp_qa_al.ko"], [
AS_IF([ ! test -e "$QAT_OBJ/icp_qa_al.ko" && ! test -e "$QAT_OBJ/qat_api.ko"], [
AC_MSG_ERROR([
*** Please make sure the qat driver is installed then try again.
*** Failed to find icp_qa_al.ko in:
*** Failed to find icp_qa_al.ko or qat_api.ko in:
$QAT_OBJ])
])
@ -721,7 +730,7 @@ AC_DEFUN([ZFS_LINUX_COMPILE_IFELSE], [
modpost_flag=''
test "x$enable_linux_builtin" = xyes && modpost_flag='modpost=true' # fake modpost stage
AS_IF(
[AC_TRY_COMMAND(cp conftest.c conftest.h build && make [$2] -C $LINUX_OBJ EXTRA_CFLAGS="-Werror $EXTRA_KCFLAGS" $ARCH_UM M=$PWD/build $modpost_flag) >/dev/null && AC_TRY_COMMAND([$3])],
[AC_TRY_COMMAND(cp conftest.c conftest.h build && make [$2] -C $LINUX_OBJ EXTRA_CFLAGS="-Werror $FRAME_LARGER_THAN $EXTRA_KCFLAGS" $ARCH_UM M=$PWD/build $modpost_flag) >/dev/null && AC_TRY_COMMAND([$3])],
[$4],
[_AC_MSG_LOG_CONFTEST m4_ifvaln([$5],[$5])]
)

View File

@ -2,9 +2,9 @@ tgz-local:
@(if test "${HAVE_ALIEN}" = "no"; then \
echo -e "\n" \
"*** Required util ${ALIEN} missing. Please install the\n" \
"*** package for your distribution which provides ${ALIEN},\n" \
"*** package for your distribution which provides ${ALIEN},\n" \
"*** re-run configure, and try again.\n"; \
exit 1; \
exit 1; \
fi)
tgz-kmod: tgz-local rpm-kmod

14
config/user-libaio.m4 Normal file
View File

@ -0,0 +1,14 @@
dnl #
dnl # Check for libaio - only used for the libaio test cases.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBAIO], [
LIBAIO=
AC_CHECK_HEADER([libaio.h], [
user_libaio=yes
AC_SUBST([LIBAIO], ["-laio"])
AC_DEFINE([HAVE_LIBAIO], 1, [Define if you have libaio])
], [
user_libaio=no
])
])

View File

@ -1,12 +0,0 @@
dnl #
dnl # Check for libattr
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBATTR], [
LIBATTR=
AC_CHECK_HEADER([attr/xattr.h], [], [AC_MSG_FAILURE([
*** attr/xattr.h missing, libattr-devel package required])])
AC_SUBST([LIBATTR], ["-lattr"])
AC_DEFINE([HAVE_LIBATTR], 1, [Define if you have libattr])
])

View File

@ -6,7 +6,7 @@ AC_DEFUN([ZFS_AC_CONFIG_USER_LIBBLKID], [
LIBBLKID=
AC_CHECK_HEADER([blkid/blkid.h], [], [AC_MSG_FAILURE([
*** blkid.h missing, libblkid-devel package required])])
*** blkid.h missing, libblkid-devel package required])])
AC_SUBST([LIBBLKID], ["-lblkid"])
AC_DEFINE([HAVE_LIBBLKID], 1, [Define if you have libblkid])

View File

@ -2,7 +2,8 @@ AC_DEFUN([ZFS_AC_CONFIG_USER_SYSTEMD], [
AC_ARG_ENABLE(systemd,
AC_HELP_STRING([--enable-systemd],
[install systemd unit/preset files [[default: yes]]]),
[],enable_systemd=yes)
[enable_systemd=$enableval],
[enable_systemd=check])
AC_ARG_WITH(systemdunitdir,
AC_HELP_STRING([--with-systemdunitdir=DIR],
@ -19,16 +20,27 @@ AC_DEFUN([ZFS_AC_CONFIG_USER_SYSTEMD], [
[install systemd module load files into dir [[/usr/lib/modules-load.d]]]),
systemdmoduleloaddir=$withval,systemdmodulesloaddir=/usr/lib/modules-load.d)
AS_IF([test "x$enable_systemd" = xcheck], [
AS_IF([systemctl --version >/dev/null 2>&1],
[enable_systemd=yes],
[enable_systemd=no])
])
AS_IF([test "x$enable_systemd" = xyes],
[
AC_MSG_CHECKING(for systemd support)
AC_MSG_RESULT([$enable_systemd])
AS_IF([test "x$enable_systemd" = xyes], [
ZFS_INIT_SYSTEMD=systemd
ZFS_MODULE_LOAD=modules-load.d
DEFINE_SYSTEMD='--with systemd --define "_unitdir $(systemdunitdir)" --define "_presetdir $(systemdpresetdir)"'
modulesloaddir=$systemdmodulesloaddir
])
],[
DEFINE_SYSTEMD='--without systemd'
])
AC_SUBST(ZFS_INIT_SYSTEMD)
AC_SUBST(ZFS_MODULE_LOAD)
AC_SUBST(DEFINE_SYSTEMD)
AC_SUBST(systemdunitdir)
AC_SUBST(systemdpresetdir)
AC_SUBST(modulesloaddir)

View File

@ -11,9 +11,9 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBUUID
ZFS_AC_CONFIG_USER_LIBTIRPC
ZFS_AC_CONFIG_USER_LIBBLKID
ZFS_AC_CONFIG_USER_LIBATTR
ZFS_AC_CONFIG_USER_LIBUDEV
ZFS_AC_CONFIG_USER_FRAME_LARGER_THAN
ZFS_AC_CONFIG_USER_LIBAIO
ZFS_AC_CONFIG_USER_RUNSTATEDIR
ZFS_AC_CONFIG_USER_MAKEDEV_IN_SYSMACROS
ZFS_AC_CONFIG_USER_MAKEDEV_IN_MKDEV

View File

@ -6,37 +6,75 @@ AC_DEFUN([ZFS_AC_LICENSE], [
AC_MSG_RESULT([$ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_DEBUG_ENABLE], [
KERNELCPPFLAGS="${KERNELCPPFLAGS} -DDEBUG -Werror"
HOSTCFLAGS="${HOSTCFLAGS} -DDEBUG -Werror"
DEBUG_CFLAGS="-DDEBUG -Werror"
DEBUG_ZFS="_with_debug"
AC_DEFINE(ZFS_DEBUG, 1, [zfs debugging enabled])
])
AC_DEFUN([ZFS_AC_DEBUG_DISABLE], [
KERNELCPPFLAGS="${KERNELCPPFLAGS} -DNDEBUG "
HOSTCFLAGS="${HOSTCFLAGS} -DNDEBUG "
DEBUG_CFLAGS="-DNDEBUG"
DEBUG_STACKFLAGS=""
DEBUG_ZFS="_without_debug"
])
AC_DEFUN([ZFS_AC_DEBUG], [
AC_MSG_CHECKING([whether debugging is enabled])
AC_MSG_CHECKING([whether assertion support will be enabled])
AC_ARG_ENABLE([debug],
[AS_HELP_STRING([--enable-debug],
[Enable generic debug support @<:@default=no@:>@])],
[Enable assertion support @<:@default=no@:>@])],
[],
[enable_debug=no])
AS_IF([test "x$enable_debug" = xyes],
[
KERNELCPPFLAGS="${KERNELCPPFLAGS} -DDEBUG -Werror"
HOSTCFLAGS="${HOSTCFLAGS} -DDEBUG -Werror"
DEBUG_CFLAGS="-DDEBUG -Werror"
DEBUG_STACKFLAGS="-fstack-check"
DEBUG_ZFS="_with_debug"
AC_DEFINE(ZFS_DEBUG, 1, [zfs debugging enabled])
],
[
KERNELCPPFLAGS="${KERNELCPPFLAGS} -DNDEBUG "
HOSTCFLAGS="${HOSTCFLAGS} -DNDEBUG "
DEBUG_CFLAGS="-DNDEBUG"
DEBUG_STACKFLAGS=""
DEBUG_ZFS="_without_debug"
])
AS_CASE(["x$enable_debug"],
["xyes"],
[ZFS_AC_DEBUG_ENABLE],
["xno"],
[ZFS_AC_DEBUG_DISABLE],
[AC_MSG_ERROR([Unknown option $enable_debug])])
AC_SUBST(DEBUG_CFLAGS)
AC_SUBST(DEBUG_STACKFLAGS)
AC_SUBST(DEBUG_ZFS)
AC_MSG_RESULT([$enable_debug])
])
AC_DEFUN([ZFS_AC_DEBUGINFO_KERNEL], [
KERNELMAKE_PARAMS="$KERNELMAKE_PARAMS CONFIG_DEBUG_INFO=y"
KERNELCPPFLAGS="${KERNELCPPFLAGS} -fno-inline"
])
AC_DEFUN([ZFS_AC_DEBUGINFO_USER], [
DEBUG_CFLAGS="${DEBUG_CFLAGS} -g -fno-inline"
])
AC_DEFUN([ZFS_AC_DEBUGINFO], [
AC_MSG_CHECKING([whether debuginfo support will be forced])
AC_ARG_ENABLE([debuginfo],
[AS_HELP_STRING([--enable-debuginfo],
[Force generation of debuginfo @<:@default=no@:>@])],
[],
[enable_debuginfo=no])
AS_CASE(["x$enable_debuginfo"],
["xyes"],
[ZFS_AC_DEBUGINFO_KERNEL
ZFS_AC_DEBUGINFO_USER],
["xkernel"],
[ZFS_AC_DEBUGINFO_KERNEL],
["xuser"],
[ZFS_AC_DEBUGINFO_USER],
["xno"],
[],
[AC_MSG_ERROR([Unknown option $enable_debug])])
AC_SUBST(DEBUG_CFLAGS)
AC_MSG_RESULT([$enable_debuginfo])
])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_NO_BOOL_COMPARE
@ -79,11 +117,11 @@ AC_DEFUN([ZFS_AC_CONFIG], [
AM_CONDITIONAL([CONFIG_KERNEL],
[test "$ZFS_CONFIG" = kernel -o "$ZFS_CONFIG" = all] &&
[test "x$enable_linux_builtin" != xyes ])
AM_CONDITIONAL([WANT_DEVNAME2DEVID],
[test "x$user_libudev" = xyes ])
AM_CONDITIONAL([CONFIG_QAT],
[test "$ZFS_CONFIG" = kernel -o "$ZFS_CONFIG" = all] &&
[test "x$qatsrc" != x ])
AM_CONDITIONAL([WANT_DEVNAME2DEVID], [test "x$user_libudev" = xyes ])
AM_CONDITIONAL([WANT_MMAP_LIBAIO], [test "x$user_libaio" = xyes ])
])
dnl #
@ -121,10 +159,45 @@ AC_DEFUN([ZFS_AC_RPM], [
])
RPM_DEFINE_COMMON='--define "$(DEBUG_ZFS) 1"'
RPM_DEFINE_UTIL='--define "_dracutdir $(dracutdir)" --define "_udevdir $(udevdir)" --define "_udevruledir $(udevruledir)" --define "_initconfdir $(DEFAULT_INITCONF_DIR)" $(DEFINE_INITRAMFS)'
RPM_DEFINE_UTIL=' --define "_initconfdir $(DEFAULT_INITCONF_DIR)"'
dnl # Make the next three RPM_DEFINE_UTIL additions conditional, since
dnl # their values may not be set when running:
dnl #
dnl # ./configure --with-config=srpm
dnl #
AS_IF([test -n "$dracutdir" ], [
RPM_DEFINE_UTIL='--define "_dracutdir $(dracutdir)"'
])
AS_IF([test -n "$udevdir" ], [
RPM_DEFINE_UTIL+=' --define "_udevdir $(udevdir)"'
])
AS_IF([test -n "$udevruledir" ], [
RPM_DEFINE_UTIL+=' --define "_udevdir $(udevruledir)"'
])
RPM_DEFINE_UTIL+=' $(DEFINE_INITRAMFS)'
RPM_DEFINE_UTIL+=' $(DEFINE_SYSTEMD)'
RPM_DEFINE_KMOD='--define "kernels $(LINUX_VERSION)" --define "require_spldir $(SPL)" --define "require_splobj $(SPL_OBJ)" --define "ksrc $(LINUX)" --define "kobj $(LINUX_OBJ)"'
RPM_DEFINE_KMOD+=' --define "_wrong_version_format_terminate_build 0"'
RPM_DEFINE_DKMS=
dnl # Override default lib directory on Debian/Ubuntu systems. The provided
dnl # /usr/lib/rpm/platform/<arch>/macros files do not specify the correct
dnl # path for multiarch systems as described by the packaging guidelines.
dnl #
dnl # https://wiki.ubuntu.com/MultiarchSpec
dnl # https://wiki.debian.org/Multiarch/Implementation
dnl #
AS_IF([test "$DEFAULT_PACKAGE" = "deb"], [
MULTIARCH_LIBDIR="lib/$(dpkg-architecture -qDEB_HOST_MULTIARCH)"
RPM_DEFINE_UTIL+=' --define "_lib $(MULTIARCH_LIBDIR)"'
AC_SUBST(MULTIARCH_LIBDIR)
])
SRPM_DEFINE_COMMON='--define "build_src_rpm 1"'
SRPM_DEFINE_UTIL=
SRPM_DEFINE_KMOD=

View File

@ -50,11 +50,13 @@ AC_PROG_CC
AC_PROG_LIBTOOL
AM_PROG_AS
AM_PROG_CC_C_O
AX_CODE_COVERAGE
ZFS_AC_LICENSE
ZFS_AC_PACKAGE
ZFS_AC_CONFIG
ZFS_AC_DEBUG
ZFS_AC_DEBUGINFO
AC_CONFIG_FILES([
Makefile
@ -120,6 +122,9 @@ AC_CONFIG_FILES([
contrib/dracut/02zfsexpandknowledge/Makefile
contrib/dracut/90zfs/Makefile
contrib/initramfs/Makefile
contrib/initramfs/hooks/Makefile
contrib/initramfs/scripts/Makefile
contrib/initramfs/scripts/local-top/Makefile
module/Makefile
module/avl/Makefile
module/nvpair/Makefile
@ -151,6 +156,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/callbacks/Makefile
tests/zfs-tests/cmd/Makefile
tests/zfs-tests/cmd/chg_usr_exec/Makefile
tests/zfs-tests/cmd/user_ns_exec/Makefile
tests/zfs-tests/cmd/devname2devid/Makefile
tests/zfs-tests/cmd/dir_rd_update/Makefile
tests/zfs-tests/cmd/file_check/Makefile
@ -162,6 +168,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/cmd/mkfiles/Makefile
tests/zfs-tests/cmd/mktree/Makefile
tests/zfs-tests/cmd/mmap_exec/Makefile
tests/zfs-tests/cmd/mmap_libaio/Makefile
tests/zfs-tests/cmd/mmapwrite/Makefile
tests/zfs-tests/cmd/randfree_file/Makefile
tests/zfs-tests/cmd/readmmap/Makefile
@ -234,6 +241,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/Makefile
tests/zfs-tests/tests/functional/cli_user/zpool_list/Makefile
tests/zfs-tests/tests/functional/compression/Makefile
tests/zfs-tests/tests/functional/cp_files/Makefile
tests/zfs-tests/tests/functional/ctime/Makefile
tests/zfs-tests/tests/functional/delegate/Makefile
tests/zfs-tests/tests/functional/devices/Makefile
@ -248,6 +256,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/history/Makefile
tests/zfs-tests/tests/functional/inheritance/Makefile
tests/zfs-tests/tests/functional/inuse/Makefile
tests/zfs-tests/tests/functional/kstat/Makefile
tests/zfs-tests/tests/functional/large_files/Makefile
tests/zfs-tests/tests/functional/largest_pool/Makefile
tests/zfs-tests/tests/functional/link_count/Makefile
@ -282,6 +291,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/threadsappend/Makefile
tests/zfs-tests/tests/functional/tmpfile/Makefile
tests/zfs-tests/tests/functional/truncate/Makefile
tests/zfs-tests/tests/functional/user_namespace/Makefile
tests/zfs-tests/tests/functional/userquota/Makefile
tests/zfs-tests/tests/functional/upgrade/Makefile
tests/zfs-tests/tests/functional/vdev_zaps/Makefile

View File

@ -24,6 +24,7 @@ $(pkgdracut_SCRIPTS):%:%.in
-e 's,@udevruledir\@,$(udevruledir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
-e 's,@systemdunitdir\@,$(systemdunitdir),g' \
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
distclean-local::

View File

@ -5,7 +5,7 @@ check() {
[ "${1}" = "-d" ] && return 0
# Verify the zfs tool chain
for tool in "@sbindir@/zpool" "@sbindir@/zfs" "@sbindir@/mount.zfs" ; do
for tool in "@sbindir@/zpool" "@sbindir@/zfs" "@mounthelperdir@/mount.zfs" ; do
test -x "$tool" || return 1
done
# Verify grep exists
@ -53,7 +53,7 @@ install() {
# Fallback: Guess the path and include all matches
dracut_install /usr/lib/gcc/*/*/libgcc_s.so*
fi
dracut_install @sbindir@/mount.zfs
dracut_install @mounthelperdir@/mount.zfs
dracut_install @udevdir@/vdev_id
dracut_install awk
dracut_install head

View File

@ -34,6 +34,7 @@ info "ZFS: No sysroot.mount exists or zfs-generator did not extend it."
info "ZFS: Mounting root with the traditional mount-zfs.sh instead."
# Delay until all required block devices are present.
modprobe zfs 2>/dev/null
udevadm settle
if [ "${root}" = "zfs:AUTO" ] ; then

View File

@ -1,16 +1,17 @@
initrddir = $(datarootdir)/initramfs-tools
initrd_SCRIPTS = conf-hooks.d/zfs hooks/zfs scripts/zfs scripts/local-top/zfs
initrd_SCRIPTS = \
conf.d/zfs conf-hooks.d/zfs hooks/zfs scripts/zfs scripts/local-top/zfs
SUBDIRS = hooks scripts
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/conf.d/zfs \
$(top_srcdir)/contrib/initramfs/conf-hooks.d/zfs \
$(top_srcdir)/contrib/initramfs/hooks/zfs \
$(top_srcdir)/contrib/initramfs/scripts/zfs \
$(top_srcdir)/contrib/initramfs/scripts/local-top/zfs \
$(top_srcdir)/contrib/initramfs/README.initramfs.markdown
install-initrdSCRIPTS: $(EXTRA_DIST)
for d in conf-hooks.d hooks scripts scripts/local-top; do \
for d in conf.d conf-hooks.d hooks scripts scripts/local-top; do \
$(MKDIR_P) $(DESTDIR)$(initrddir)/$$d; \
cp $(top_srcdir)/contrib/initramfs/$$d/zfs \
$(DESTDIR)$(initrddir)/$$d/; \

View File

@ -0,0 +1,8 @@
for x in $(cat /proc/cmdline)
do
case $x in
root=ZFS=*|root=zfs:*)
BOOT=zfs
;;
esac
done

1
contrib/initramfs/hooks/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
zfs

View File

@ -0,0 +1,21 @@
hooksdir = $(datarootdir)/initramfs-tools/hooks
hooks_SCRIPTS = \
zfs
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/hooks/zfs.in
$(hooks_SCRIPTS):%:%.in
-$(SED) -e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
-e 's,@udevdir\@,$(udevdir),g' \
-e 's,@udevruledir\@,$(udevruledir),g' \
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
clean-local::
-$(RM) $(hooks_SCRIPTS)
distclean-local::
-$(RM) $(hooks_SCRIPTS)

View File

@ -8,14 +8,17 @@ PREREQ="zdev"
# These prerequisites are provided by the zfsutils package. The zdb utility is
# not strictly required, but it can be useful at the initramfs recovery prompt.
COPY_EXEC_LIST="/sbin/zdb /sbin/zpool /sbin/zfs /sbin/mount.zfs"
COPY_EXEC_LIST="$COPY_EXEC_LIST /usr/bin/dirname /lib/udev/vdev_id"
COPY_FILE_LIST="/etc/hostid /etc/zfs/zpool.cache /etc/default/zfs"
COPY_FILE_LIST="$COPY_FILE_LIST /etc/zfs/zfs-functions /etc/zfs/vdev_id.conf"
COPY_FILE_LIST="$COPY_FILE_LIST /lib/udev/rules.d/69-vdev.rules"
COPY_EXEC_LIST="@sbindir@/zdb @sbindir@/zpool @sbindir@/zfs"
COPY_EXEC_LIST="$COPY_EXEC_LIST @mounthelperdir@/mount.zfs @udevdir@/vdev_id"
COPY_FILE_LIST="/etc/hostid @sysconfdir@/zfs/zpool.cache"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/default/zfs"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/zfs/zfs-functions"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/zfs/vdev_id.conf"
COPY_FILE_LIST="$COPY_FILE_LIST @udevruledir@/69-vdev.rules"
# These prerequisites are provided by the base system.
COPY_EXEC_LIST="$COPY_EXEC_LIST /bin/hostname /sbin/blkid"
COPY_EXEC_LIST="$COPY_EXEC_LIST /usr/bin/dirname /bin/hostname /sbin/blkid"
COPY_EXEC_LIST="$COPY_EXEC_LIST /usr/bin/env"
# Explicitly specify all kernel modules because automatic dependency resolution
# is unreliable on many systems.

1
contrib/initramfs/scripts/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
zfs

View File

@ -0,0 +1,20 @@
scriptsdir = $(datarootdir)/initramfs-tools/scripts
scripts_SCRIPTS = \
zfs
SUBDIRS = local-top
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/scripts/zfs.in
$(scripts_SCRIPTS):%:%.in
-$(SED) -e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
clean-local::
-$(RM) $(scripts_SCRIPTS)
distclean-local::
-$(RM) $(scripts_SCRIPTS)

View File

@ -0,0 +1,3 @@
localtopdir = $(datarootdir)/initramfs-tools/scripts/local-top
EXTRA_DIST = zfs

View File

@ -11,9 +11,9 @@
# Paths to what we need - in the initrd, these paths are hardcoded,
# so override the defines in zfs-functions.
ZFS="/sbin/zfs"
ZPOOL="/sbin/zpool"
ZPOOL_CACHE="/etc/zfs/zpool.cache"
ZFS="@sbindir@/zfs"
ZPOOL="@sbindir@/zpool"
ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
export ZFS ZPOOL ZPOOL_CACHE
# This runs any scripts that should run before we start importing
@ -150,7 +150,7 @@ get_pools()
fi
fi
# Filter out any exceptions...
# Filter out any exceptions...
if [ -n "$ZFS_POOL_EXCEPTIONS" ]
then
local found=""
@ -193,7 +193,7 @@ import_pool()
# Verify that the pool isn't already imported
# Make as sure as we can to not require '-f' to import.
"${ZPOOL}" status "$pool" > /dev/null 2>&1 && return 0
"${ZPOOL}" get name,guid -o value -H 2>/dev/null | grep -Fxq "$pool" && return 0
# For backwards compatibility, make sure that ZPOOL_IMPORT_PATH is set
# to something we can use later with the real import(s). We want to
@ -317,6 +317,14 @@ mount_fs()
"${ZFS}" list -oname -tfilesystem -H "${fs}" > /dev/null 2>&1
[ "$?" -ne 0 ] && return 1
# Skip filesystems with canmount=off. The root fs should not have
# canmount=off, but ignore it for backwards compatibility just in case.
if [ "$fs" != "${ZFS_BOOTFS}" ]
then
canmount=$(get_fs_value "$fs" canmount)
[ "$canmount" = "off" ] && return 0
fi
# Need the _original_ datasets mountpoint!
mountpoint=$(get_fs_value "$fs" mountpoint)
if [ "$mountpoint" = "legacy" -o "$mountpoint" = "none" ]; then
@ -329,11 +337,9 @@ mount_fs()
"$mountpoint" = "-" ]
then
if [ "$fs" != "${ZFS_BOOTFS}" ]; then
# We don't have a proper mountpoint, this
# isn't the root fs. So extract the root fs
# value from the filesystem, and we should
# (hopefully!) have a mountpoint we can use.
mountpoint="${fs##$ZFS_BOOTFS}"
# We don't have a proper mountpoint and this
# isn't the root fs.
return 0
else
# Last hail-mary: Hope 'rootmnt' is set!
mountpoint=""
@ -472,7 +478,7 @@ destroy_fs()
echo "Message: $ZFS_STDERR"
echo "Error: $ZFS_ERROR"
echo ""
echo "Failed to destroy '$fs'. Please make sure that '$fs' is not availible."
echo "Failed to destroy '$fs'. Please make sure that '$fs' is not available."
echo "Hint: Try: zfs destroy -Rfn $fs"
echo "If this dryrun looks good, then remove the 'n' from '-Rfn' and try again."
/bin/sh
@ -766,6 +772,7 @@ mountroot()
# root=zfs:<pool>/<dataset> (uses this for rpool - first part, without 'zfs:')
#
# Option <dataset> could also be <snapshot>
# Option <pool> could also be <guid>
# ------------
# Support force option
@ -883,6 +890,32 @@ mountroot()
/bin/sh
fi
# In case the pool was specified as guid, resolve guid to name
pool="$("${ZPOOL}" get name,guid -o name,value -H | \
awk -v pool="${ZFS_RPOOL}" '$2 == pool { print $1 }')"
if [ -n "$pool" ]; then
ZFS_BOOTFS="${pool}/${ZFS_BOOTFS#*/}"
ZFS_RPOOL="${pool}"
fi
# Set elevator=noop on the root pool's vdevs' disks. ZFS already
# does this for wholedisk vdevs (for all pools), so this is only
# important for partitions.
"${ZPOOL}" status -L "${ZFS_RPOOL}" 2> /dev/null |
awk '/^\t / && !/(mirror|raidz)/ {
dev=$1;
sub(/[0-9]+$/, "", dev);
print dev
}' |
while read i
do
if [ -e "/sys/block/$i/queue/scheduler" ]
then
echo noop > "/sys/block/$i/queue/scheduler"
fi
done
# ----------------------------------------------------------------
# P R E P A R E R O O T F I L E S Y S T E M
@ -925,7 +958,7 @@ mountroot()
# NOTE: Mounted in the order specified in the
# ZFS_INITRD_ADDITIONAL_DATASETS variable so take care!
# Go through the complete list (recursivly) of all filesystems below
# Go through the complete list (recursively) of all filesystems below
# the real root dataset
filesystems=$("${ZFS}" list -oname -tfilesystem -H -r "${ZFS_BOOTFS}")
for fs in $filesystems $ZFS_INITRD_ADDITIONAL_DATASETS

View File

@ -22,7 +22,7 @@ $(init_SCRIPTS) $(initconf_SCRIPTS) $(initcommon_SCRIPTS):%:%.in
NFS_SRV=nfs; \
fi; \
if [ -e /sbin/openrc-run ]; then \
SHELL=/sbin/runscript; \
SHELL=/sbin/openrc-run; \
else \
SHELL=/bin/sh; \
fi; \

0
etc/init.d/zfs-import.in Executable file → Normal file
View File

0
etc/init.d/zfs-mount.in Executable file → Normal file
View File

0
etc/init.d/zfs-share.in Executable file → Normal file
View File

0
etc/init.d/zfs-zed.in Executable file → Normal file
View File

View File

@ -1,3 +1,3 @@
# Always load kernel modules at boot. The default behavior is to load the
# kernel modules in the zfs-import-*.service or when blkid(8) detects a pool.
# The default behavior is to allow udev to load the kernel modules on demand.
# Uncomment the following line to unconditionally load them at boot.
#zfs

View File

@ -1,6 +1,7 @@
# ZFS is enabled by default
enable zfs-import-cache.service
disable zfs-import-scan.service
enable zfs-import.target
enable zfs-mount.service
enable zfs-share.service
enable zfs-zed.service

View File

@ -7,6 +7,7 @@ systemdunit_DATA = \
zfs-import-scan.service \
zfs-mount.service \
zfs-share.service \
zfs-import.target \
zfs.target
EXTRA_DIST = \
@ -15,6 +16,7 @@ EXTRA_DIST = \
$(top_srcdir)/etc/systemd/system/zfs-import-scan.service.in \
$(top_srcdir)/etc/systemd/system/zfs-mount.service.in \
$(top_srcdir)/etc/systemd/system/zfs-share.service.in \
$(top_srcdir)/etc/systemd/system/zfs-import.target.in \
$(top_srcdir)/etc/systemd/system/zfs.target.in \
$(top_srcdir)/etc/systemd/system/50-zfs.preset.in

View File

@ -6,14 +6,13 @@ After=systemd-udev-settle.service
After=cryptsetup.target
After=systemd-remount-fs.service
Before=dracut-mount.service
Before=zfs-import.target
ConditionPathExists=@sysconfdir@/zfs/zpool.cache
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/sbin/modprobe zfs
ExecStart=@sbindir@/zpool import -c @sysconfdir@/zfs/zpool.cache -aN
[Install]
WantedBy=zfs-mount.service
WantedBy=zfs.target
WantedBy=zfs-import.target

View File

@ -5,14 +5,13 @@ Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
Before=dracut-mount.service
Before=zfs-import.target
ConditionPathExists=!@sysconfdir@/zfs/zpool.cache
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/sbin/modprobe zfs
ExecStart=@sbindir@/zpool import -aN -o cachefile=none
[Install]
WantedBy=zfs-mount.service
WantedBy=zfs.target
WantedBy=zfs-import.target

View File

@ -0,0 +1,6 @@
[Unit]
Description=ZFS pool import target
[Install]
WantedBy=zfs-mount.service
WantedBy=zfs.target

View File

@ -2,8 +2,7 @@
Description=Mount ZFS filesystems
DefaultDependencies=no
After=systemd-udev-settle.service
After=zfs-import-cache.service
After=zfs-import-scan.service
After=zfs-import.target
After=systemd-remount-fs.service
Before=local-fs.target

View File

@ -4,6 +4,7 @@ pkgsysconf_DATA = \
vdev_id.conf.alias.example \
vdev_id.conf.sas_direct.example \
vdev_id.conf.sas_switch.example \
vdev_id.conf.multipath.example
vdev_id.conf.multipath.example \
vdev_id.conf.scsi.example
EXTRA_DIST = $(pkgsysconf_DATA)

Some files were not shown because too many files have changed in this diff.