Compare commits

100 Commits

Author SHA1 Message Date
Tony Hutter a8c2b7ebc6 Tag zfs-0.7.13
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-02-22 09:47:55 -08:00
John Wren Kennedy 2af898ee24 test-runner: python3 support
Updated to be compatible with Python 2.6, 2.7, 3.5 or newer.

Reviewed-by: John Ramsden <johnramsden@riseup.net>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Wren Kennedy <john.kennedy@delphix.com>
Closes #8096
2019-02-22 09:47:34 -08:00
Gregor Kopka c32c2f17d0 Fix flake 8 style warnings
Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/
through 2to3 (https://docs.python.org/2/library/2to3.html).
Checked the result and fixed:
- 'maxint' -> 'maxsize', which 2to3 missed.
- replaced the 'cmp=' parameter of a 'sorted()' call with a 'key='
  version.
- wrapped the configparser import in try/except, as there are still
  python 2.7 systems that lack a compatibility shim

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7925
Closes #7952
2019-02-22 09:47:34 -08:00
Tony Hutter 2254b2bbbe GCC 9.0: Fix ztest "directive argument is not a nul-terminated string"
GCC 9.0 is complaining because we're trying to print strings that
are defined like this:

.zo_pool = { 'z', 't', 'e', 's', 't', '\0' },

Fix them by making them actual strings.
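
For illustration, a minimal userspace sketch of the two initializer
forms. Only the .zo_pool field name comes from the message above; the
struct name, the buffer size and the helper are assumptions made for
this example, not the actual ztest definitions.

```c
#include <string.h>

/* Hypothetical buffer size; the real ztest structure may differ. */
#define EXAMPLE_NAMELEN 32

struct example_opts {
    char zo_pool[EXAMPLE_NAMELEN];
};

/* Old form: a brace-enclosed list of characters. */
const struct example_opts opts_chars = {
    .zo_pool = { 'z', 't', 'e', 's', 't', '\0' },
};

/* New form: an actual string literal. */
const struct example_opts opts_string = {
    .zo_pool = "ztest",
};

/* Returns 1 when both initializers produce identical bytes. */
int
example_same_bytes(void)
{
    return (memcmp(opts_chars.zo_pool, opts_string.zo_pool,
        EXAMPLE_NAMELEN) == 0);
}
```

Both forms zero-fill the rest of the array, so the stored bytes are the
same; the string literal simply makes the intent visible to the
compiler.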

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8330
2019-02-22 09:47:34 -08:00
Brian Behlendorf 5c4ec382a7 Linux 5.0 compat: Fix bio_set_dev()
The Linux 5.0 kernel updated the bio_set_dev() macro so it calls the
GPL-only bio_associate_blkg() symbol, thus inadvertently making the
entire macro GPL-only.  Provide a minimal version which always assigns
the request queue's root_blkg to the bio.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8287
2019-02-22 09:47:34 -08:00
Tony Hutter e22bfd8149 Linux 5.0 compat: Disable vector instructions on 5.0+ kernels
The 5.0 kernel no longer exports the functions we need to do vector
(SSE/SSE2/SSE3/AVX...) instructions.  Disable vector-based checksum
algorithms when building against those kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8259
2019-02-22 09:47:34 -08:00
Tony Hutter f45ad7bff6 Linux 5.0 compat: Fix SUBDIRs
SUBDIRs has been deprecated for a long time, and was finally removed in
the 5.0 kernel.  Use "M=" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8257
2019-02-22 09:47:34 -08:00
Tony Hutter 0a3a4d067a Linux 5.0 compat: Convert MS_* macros to SB_*
In the 5.0 kernel, only the mount namespace code should use the MS_*
macros. Filesystems should use the SB_* ones.

https://patchwork.kernel.org/patch/10552493/

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8264
2019-02-22 09:47:34 -08:00
Tony Hutter ba8024a284 Linux 5.0 compat: Use totalram_pages()
totalram_pages was converted to an atomic variable in 5.0:

https://patchwork.kernel.org/patch/10652795/

Its value should now be read through the totalram_pages() helper
function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8263
2019-02-22 09:47:34 -08:00
Tony Hutter edc2675aed Linux 5.0 compat: access_ok() drops 'type' parameter
access_ok no longer needs a 'type' parameter in the 5.0 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8261
2019-02-22 09:47:34 -08:00
ilbsmart 98bb45e27a deadlock between mm_sem and tx assign in zfs_write() and page fault
The time sequence of the bug:
1. Thread #1, in `zfs_write`, assigns txg "n".
2. In the same process, thread #2 takes an mmap page fault (which
   means `mm_sem` is held); `zfs_dirty_inode` fails to open a txg
   and waits for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write, but a page fault occurs in
   `uiomove`, which needs `mm_sem`.  Since `mm_sem` is held by
   thread #2, thread #1 is stuck and cannot complete, so txg "n"
   never completes.

So thread #1 and thread #2 are deadlocked.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Grady Wong <grady.w@xtaotech.com>
Closes #7939
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) 44f463824b dkms: Enable debuginfo option to be set with zfs sysconfig file
On some Linux distributions, the kernel module build does not
default to building with debuginfo symbols, which can make
debugging and testing difficult.

For this case, we provide a flag to override the build to force
debuginfo to be produced for the kernel module build.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Neal Gompa <ngompa@datto.com>
Co-authored-by: Simon Watson <swatson@datto.com>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Simon Watson <swatson@datto.com>
Closes #8304
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) b0d579bc55 Bump commit subject length to 72 characters
There's not really a reason to keep the subject length so short,
since the reason to make it this short was for making nice renders
of a summary list of the git log. With 72 characters, this still
works out fine, so let's just raise it to that so that it's easier
to give slightly more descriptive change summaries.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #8250
2019-02-22 09:47:34 -08:00
Benjamin Gentil 7e5def8ae0 zfs.8 uses wrong snapshot names in Example 15
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: Benjamin Gentil <benjamin@gentil.io>
Closes #8241
2019-02-22 09:47:34 -08:00
Tony Hutter 89019a846b Add enclosure_symlinks option to vdev_id
Add an 'enclosure_symlinks' option to vdev_id.conf.  This creates
consistently named symlinks to the enclosure devices (/dev/sg*) based
off the configuration in vdev_id.conf.  The enclosure symlinks show
up in /dev/by-enclosure/<prefix>-<channel><num>.  The links make it
easy to run sg_ses on a particular enclosure device.  The
enclosure links are created in addition to the normal
/dev/disk/by-vdev links.

'enclosure_symlinks' is only valid in sas_direct configurations.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Simon Guest <simon.guest@tesujimath.org>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8194
2019-02-22 09:47:34 -08:00
Simon Guest 41f7723e9c vdev_id: new slot type ses
This extends vdev_id to support a new slot type, ses, for SCSI Enclosure
Services.  With slot type ses, the disk slot numbers are determined by
using the device slot number reported by sg_ses for the device with
matching SAS address, found by querying all available enclosures.

This is primarily of use on systems with a deficient driver omitting
support for bay_identifier in /sys/devices.  In my testing, I found that
the existing slot types of port and id were not stable across disk
replacement, so an alternative was required.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6956
2019-02-22 09:47:34 -08:00
Simon Guest 2b8c3cb0c8 vdev_id: extension for new scsi topology
On systems with SCSI rather than SAS disk topology, this change enables
the vdev_id script to match against the block device path, and therefore
create a vdev alias in /dev/disk/by-vdev.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6592
2019-02-22 09:47:34 -08:00
Olaf Faaland f325d76e96 Rename macro ZFS_MINOR due to Lustre conflict
Macro ZFS_MINOR, introduced in commit a6cc9756 to record the chosen
static minor number for /dev/zfs, conflicts with an existing macro
in Lustre.  The lustre macro (along with _MAJOR, _PATCH, _FIX) is
used to record the zfsonlinux version Lustre is being built against.

Since the Lustre macro came first, and is used in past versions of
lustre at least going back to 2.10, it makes sense to rename the
macro in ZFS instead of doing so in Lustre which would require
backporting the patch.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #8195
2019-02-22 09:47:34 -08:00
Brian Behlendorf e3fb781c5f Add kernel module auto-loading
Historically a dynamic misc minor number was registered for the
/dev/zfs device in order to prevent minor number collisions.  This
was fine but it prevented us from being able to use the kernel
module auto-loader, which requires a known reserved value.

Resolve this issue by adding a configure test to find an available
misc minor number which can then be used in MODULE_ALIAS_MISCDEV at
build time.  By adding this alias the zfs kmod is added to the list
of known static-nodes and the systemd-tmpfiles-setup-dev service
will create a /dev/zfs character device at boot time.

This in turn allows us to update the 90-zfs.rules file to make it
aware this is a static node.  The upshot of this is that whenever
a process (zpool, zfs, zed) opens /dev/zfs the kmods will be
automatically loaded.  This even works for unprivileged users so there
is no longer a need to manually load the modules at boot time.

As an additional bonus the zed now no longer needs to start after
the zfs-import.service since it will trigger the module load.

In the unlikely event the minor number we selected conflicts with
another out of tree unregistered minor number the code falls back
to dynamically allocating it.  In this case the modules again
must be manually loaded.

Note that due to the change in the method of registering the minor
number the zimport.sh test case may incorrectly fail when the
static node for the installed packages is created instead of the
dynamic one.  This issue will only transiently impact zimport.sh
for this single commit when we transition and are mixing and
matching methods.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
Closes #7287
2019-02-22 09:47:34 -08:00
Ben Wolsieffer 14a5e48fb9 Use autoconf variable for C preprocessor
This fixes the build when cross-compiling, where the preprocessor might
be prefixed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Closes #8180
2019-02-22 09:47:34 -08:00
Matthew Ahrens 01937958ce OpenZFS 9577 - remove zfs_dbuf_evict_key tsd
The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary -
we can instead pass a flag down in a few places to prevent recursive
dbuf eviction. Making this change has 3 benefits:

1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing
   TSD values (by setting to NULL vs non-NULL) is expensive, and
   we do it very often.
3. According to Nexenta, the current semantics can cause a
   deadlock when concurrently calling dmu_objset_evict_dbufs()
   (which is rare today, but they are working on a "parallel
   unmount" change that triggers this more easily):

Porting Notes:
* Minor conflict with OpenZFS 9337 which has not yet been ported.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/9577
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/645
External-issue: DLPX-58547
Closes #7602
2019-02-22 09:47:34 -08:00
LOLi edb504f9db Honor --with-mounthelperdir where applicable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6962
2019-02-22 09:47:34 -08:00
LOLi 2428fbbfcf contrib/initramfs: switch to automake
Use automake to build initramfs scripts and hooks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6761
2019-02-22 09:47:33 -08:00
Tony Hutter 16d298188f Tag zfs-0.7.12
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-11-08 14:38:37 -08:00
Tony Hutter f42f8702ce Add BuildRequires gcc, make, elfutils-libelf-devel
This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files.  gcc has been a packaging requirement for
a while now:

https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

These additional BuildRequires allow us to mock build in
Fedora 29.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Tony Hutter <hutter2@llnl.gov>
Closes #8095
Closes #8102
2018-11-08 14:38:28 -08:00
Brian Behlendorf 9e58d5ef38 Fix flake8 "invalid escape sequence 'x'" warning
From, https://lintlyci.github.io/Flake8Rules/rules/W605.html

As of Python 3.6, a backslash-character pair that is not a valid
escape sequence now generates a DeprecationWarning. Although this
will eventually become a SyntaxError, that will not be for several
Python releases.

Note 'float_pobj' was simply removed from arcstat.py since it
was entirely unused.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8056
2018-11-08 14:38:28 -08:00
Brian Behlendorf 320f9de8ab ZTS: Update O_TMPFILE support check
In CentOS 7.5 the kernel provided a compatibility wrapper to support
O_TMPFILE.  This results in the test setup script correctly detecting
kernel support.  But the ZFS module was built without O_TMPFILE
support due to the non-standard CentOS kernel interface.

Handle this case by updating the setup check to fail either when
the kernel or the ZFS module fail to provide support.  The reason
will be clearly logged in the test results.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7528
2018-11-08 14:38:28 -08:00
George Melikov 262275ab26 Allow use of pool GUID as root pool
This is helpful when there are pools with the same name,
but only one of them should be used.

The main case is twin servers where some software
(e.g. Proxmox) requires the pools to have the same name.

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Igor ‘guardian’ Lidin of Moscow, Russia
Closes #8052
2018-11-08 14:38:28 -08:00
Brian Behlendorf 55f39a01e6 Fix arc_release() refcount
Update arc_release to use arc_buf_size().  This hunk was accidentally
dropped when porting compressed send/recv, 2aa34383b.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8000
2018-11-08 14:38:28 -08:00
Tim Schumacher b884768e46 Prefix all refcount functions with zfs_
Recent changes in the Linux kernel made it necessary to prefix
the refcount_add() function with zfs_ due to a name collision.

To bring the other functions in line with that and to avoid future
collisions, prefix the other refcount functions as well.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7963
2018-11-08 14:38:28 -08:00
Tim Schumacher f8f4e13776 Linux 4.19-rc3+ compat: Remove refcount_t compat
torvalds/linux@59b57717f ("blkcg: delay blkg destruction until
after writeback has finished") added a refcount_t to the blkcg
structure. Due to the refcount_t compatibility code, zfs_refcount_t
was used by mistake.

Resolve this by removing the compatibility code and replacing the
occurrences of refcount_t with zfs_refcount_t.

Reviewed-by: Franz Pletz <fpletz@fnordicwalking.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7885
Closes #7932
2018-11-08 14:38:28 -08:00
Gregor Kopka 5f07d51751 Zpool iostat: remove latency/queue scaling
Bandwidth and iops are average per second while *_wait are averages
per request for latency or, for queue depths, an instantaneous
measurement at the end of an interval (according to man zpool).

When calculating the first two it makes sense to do
x/interval_duration (x being the increase in total bytes or number of
requests over the duration of the interval, interval_duration in
seconds) to 'scale' from amount/interval_duration to amount/second.

But applying the same math for the latter (*_wait latencies/queue) is
wrong as there is no interval_duration component in the values (these
are time/requests to get to average_time/request or already an
absolute number).

Because of this bug, the continuous *_wait figures for both
latencies and queue depths from 'zpool iostat -l/q' are only correct
with duration=1, as then the wrong math cancels itself (x/1 is a nop).

This removes temporal scaling from latency and queue depth figures.
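
The distinction can be sketched in a few lines of userspace C. The
struct, field and function names below are hypothetical and only
illustrate the arithmetic described above; they are not the zpool
iostat code.

```c
#include <stdint.h>

/* Hypothetical cumulative counters sampled at both ends of an interval. */
struct example_sample {
    uint64_t bytes;     /* total bytes transferred */
    uint64_t requests;  /* total requests completed */
    uint64_t wait_ns;   /* total time requests spent waiting */
};

/* Bandwidth is an amount per interval, so scale it to amount/second. */
double
example_bandwidth(const struct example_sample *start,
    const struct example_sample *end, double interval_sec)
{
    return ((double)(end->bytes - start->bytes) / interval_sec);
}

/*
 * Average wait is time per request.  There is no interval component,
 * so the interval duration is deliberately not part of the formula.
 */
double
example_avg_wait_ns(const struct example_sample *start,
    const struct example_sample *end)
{
    uint64_t reqs = end->requests - start->requests;

    if (reqs == 0)
        return (0.0);
    return ((double)(end->wait_ns - start->wait_ns) / (double)reqs);
}
```

With a 5 second interval, dividing the average wait by 5 as well would
under-report latency by a factor of 5, which is the error described
above.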

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7945
Closes #7694
2018-11-08 14:38:28 -08:00
Brian Behlendorf b2f003c4f4 Fix statfs(2) for 32-bit user space
When handling a 32-bit statfs() system call the returned fields,
although 64-bit in the kernel, must be limited to 32-bits or an
EOVERFLOW error will be returned.

This is less of an issue for block counts since the default
reported block size is 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.

Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.
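
A minimal sketch of the two adjustments, assuming a simplified
structure. The names example_statfs and example_fit_32bit are
hypothetical, and the doubling loop is one possible way to scale the
values, not necessarily the one used by the patch.

```c
#include <stdint.h>

/* Hypothetical, simplified subset of the statfs fields. */
struct example_statfs {
    uint64_t f_bsize;   /* reported block size in bytes */
    uint64_t f_blocks;  /* total blocks, in f_bsize units */
    uint64_t f_files;   /* total possible file count */
};

void
example_fit_32bit(struct example_statfs *st)
{
    /* Scale: double the block size until the block count fits. */
    while (st->f_blocks > UINT32_MAX) {
        st->f_bsize <<= 1;
        st->f_blocks >>= 1;
    }

    /* Cap: the file count cannot be scaled, so clamp it at 2^32-1. */
    if (st->f_files > UINT32_MAX)
        st->f_files = UINT32_MAX;
}
```

Scaling keeps the reported capacity (f_bsize * f_blocks) roughly
unchanged, while capping only affects what statfs(2) reports.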

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7927
Closes #7122
Closes #7937
2018-11-08 14:38:28 -08:00
Olaf Faaland 9014da2b01 Skip import activity test in more zdb code paths
Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314@163.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7797
Closes #7801
2018-11-08 14:38:28 -08:00
Matthew Ahrens 45579c9515 Reduce taskq and context-switch cost of zio pipe
When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).
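
The "return the next item to execute" pattern can be sketched in
userspace as follows. All names are hypothetical stand-ins chosen for
this example; the real zio_done() and zio_execute() are considerably
more involved.

```c
#include <stddef.h>

/* Hypothetical work item; next is the follow-up work, or NULL. */
struct example_item {
    struct example_item *next;
    int done;
};

/* Counts how many times work was handed to the (simulated) task queue. */
int example_dispatch_count;

/* Finish one item and hand back the next one to run (may be NULL). */
struct example_item *
example_done(struct example_item *it)
{
    it->done = 1;
    return (it->next);
}

/* Run the whole chain inside the current task queue entry. */
void
example_execute(struct example_item *it)
{
    while (it != NULL)
        it = example_done(it);
}

/* Stand-in for a single task queue dispatch. */
int
example_dispatch(struct example_item *first)
{
    example_dispatch_count++;
    example_execute(first);
    return (example_dispatch_count);
}
```

A chain of three items is completed with one dispatch instead of
three, which is where the saved locking and context switches come
from.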

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736
2018-11-08 14:38:28 -08:00
Tom Caputi b32f1279d4 Fix race in dnode_check_slots_free()
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run, during which the type is set to DMU_OT_NONE but the dnode is
not yet evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7147
Closes #7388
2018-11-08 14:38:28 -08:00
Tony Hutter 1b0cd07131 Tag zfs-0.7.11
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-13 10:13:41 -07:00
Dr. András Korn 8c6867dae4 tx_waited -> tx_dirty_delayed in trace_dmu.h
This change was missed in 0735ecb334.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: András Korn <korn-github.com@elan.rulez.org>
Closes #7096
2018-09-13 10:12:22 -07:00
Tony Hutter 99310c0aa0 Revert "zpool reopen should detect expanded devices"
This reverts commit 2a16d4cfaf.

The commit was causing an "attempt to access beyond the end
of device" error:

list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032217.html
2018-09-13 10:11:42 -07:00
Tony Hutter d126980e5f Tag zfs-0.7.10
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-05 10:37:32 -07:00
Chris Siebenmann 88ef5b238b Correctly handle errors from kern_path
As a regular kernel function, kern_path() returns errors as negative
errnos, such as -ELOOP. zfsctl_snapdir_vget() must convert these into
the positive errnos used throughout the ZFS code when it returns them
to other ZFS functions so that the ZFS code properly sees them as
errors.
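
A minimal userspace sketch of the sign conversion. Both functions are
hypothetical: example_kernel_lookup() merely imitates a kernel-style
function that returns a negative errno.

```c
#include <errno.h>

/* Hypothetical stand-in for a kernel function: 0 or a negative errno. */
int
example_kernel_lookup(int should_fail)
{
    return (should_fail ? -ELOOP : 0);
}

/* Caller following the positive errno convention. */
int
example_lookup(int should_fail)
{
    int error = example_kernel_lookup(should_fail);

    /* Negate so callers see a positive errno; 0 stays 0. */
    return (-error);
}
```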

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Siebenmann <cks.git01@cs.toronto.edu>
Closes #7764
Closes #7864
2018-07-06 02:46:51 -07:00
Georgy Yakovlev 30d8b85702 Fix build with CONFIG_GCC_PLUGIN_RANDSTRUCT
fs/zfs/zfs/metaslab.c:1055:2: error: positional initialization of field
in ‘struct’ declared with ‘designated_init’ attribute
[-Werror=designated-init]
  metaslab_rt_remove,
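
A minimal sketch of the designated-initializer form the warning asks
for. The struct and function names are hypothetical and do not mirror
the actual metaslab callback table.

```c
/* Hypothetical callback table; the field names are illustrative. */
struct example_ops {
    int (*op_add)(int);
    int (*op_remove)(int);
};

int
example_add(int x)
{
    return (x + 1);
}

int
example_remove(int x)
{
    return (x - 1);
}

/*
 * Each field is named explicitly, so the initializer stays correct
 * even if the compiler plugin reorders the structure's fields.
 */
const struct example_ops example_table = {
    .op_add = example_add,
    .op_remove = example_remove,
};
```

A positional initializer such as { example_add, example_remove }
depends on the declared field order, which is what the error above
rejects.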

Signed-off-by: Georgy Yakovlev <ya@sysdump.net>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes: #7069
2018-07-06 02:46:51 -07:00
Tom Caputi 45f0437912 Fix 'zfs recv' of non large_dnode send streams
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that a value of 0 is treated as
DNODE_MIN_SLOTS.
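
The rule can be sketched as a small helper. The helper name is
hypothetical, and the value used for the minimum slot count is an
assumption for this example; consult the dnode headers for the real
definition.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define EXAMPLE_DNODE_MIN_SLOTS 1

/* Older send streams leave the slot count at 0; treat that as the minimum. */
uint8_t
example_effective_slots(uint8_t drr_dn_slots)
{
    return (drr_dn_slots != 0 ? drr_dn_slots : EXAMPLE_DNODE_MIN_SLOTS);
}
```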

Tested-by:  DHE <git@dehacked.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7617
Closes #7662
2018-07-06 02:46:51 -07:00
Tom Caputi dc3eea871a Fix object reclaim when using large dnodes
Currently, when the receive_object() code wants to reclaim an
object, it always assumes that the dnode is the legacy 512 bytes,
even when the incoming bonus buffer exceeds this length. This
causes a buffer overflow if --enable-debug is not provided and
triggers an ASSERT if it is. This patch resolves this issue and
adds an ASSERT to ensure this can't happen again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7097
Closes #7433
2018-07-06 02:46:51 -07:00
Tim Chase d2c8103a68 Fix problems receiving reallocated dnodes
This is a port of 047116ac - Raw sends must be able to decrease nlevels,
to the zfs-0.7-stable branch.  It includes the various fixes to the
problem of receiving incremental streams which include reallocated dnodes
in which the number of dnode slots has changed but excludes the parts
which are related to raw streams.

From 047116ac:

    Currently, when a raw zfs send file includes a
    DRR_OBJECT record that would decrease the number of
    levels of an existing object, the object is reallocated
    with dmu_object_reclaim() which creates the new dnode
    using the old object's nlevels. For non-raw sends this
    doesn't really matter, but raw sends require that
    nlevels on the receive side match that of the send
    side so that the checksum-of-MAC tree can be properly
    maintained. This patch corrects the issue by freeing
    the object completely before allocating it again in
    this case.

    This patch also corrects several issues with
    dnode_hold_impl() and related functions that prevented
    dnodes (particularly multi-slot dnodes) from being
    reallocated properly due to the fact that existing
    dnodes were not being fully cleaned up when they
    were freed.

    This patch adds a test to make sure that zfs recv
    functions properly with incremental streams containing
    dnodes of different sizes.

This also includes a one-liner fix from loli10K to fix a test failure:
https://github.com/zfsonlinux/zfs/pull/7792#discussion_r212769264

Authored-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Ported-by: Tim Chase <tim@chase2k.com>

Closes #6821
Closes #6864

NOTE: This is the first of the port of 3 related patches to the
zfs-0.7-release branch of ZoL.  The other two patches should immediately
follow this one.
2018-07-06 02:46:51 -07:00
Joao Carlos Mendes Luis 3ea1f7f193 Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of truncation compiler warnings that show up
on Fedora 28 (GCC 8.0.1).

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7368
Closes #7826
Closes #7830
2018-07-06 02:46:51 -07:00
LOLi 4356dd23a9 Fix libaio-devel requirement for Debian-based distributions
BuildRequires tags for "-devel" packages in the RPM spec file do not
work when building on Debian-based distributions.

Fix this issue by making this requirement conditional to RPM-based
distributions.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7829
Closes #7831
2018-07-06 02:46:51 -07:00
Brian Behlendorf 75318ec497 Add libaio-devel BuildRequires
The zfs-test package needs a build requirement on the libaio-devel
package.  Without it ./configure will correctly determine that
mmap_libaio cannot be built and it will be skipped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7821
Closes #7824
2018-07-06 02:46:51 -07:00
Brian Behlendorf c1629734ab Add missing zfs-dracut RPM dependencies
The zfs-dracut package requires the hostid, basename, head, awk,
and grep utilities be installed.  The first three are provided by
coreutils but additional dependencies are required for awk and grep.

Reviewed-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7729
Closes #7747
2018-07-06 02:46:51 -07:00
DeHackEd 778290d5bc Don't modify argv[] in user tools
argv[] gets modified during string parsing for input arguments. This
is reflected in the live process listing. Don't do that.
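
One way to avoid writing into argv[] is to copy the part that needs a
terminator. The helper below is a hypothetical sketch of that idea for
a "name=value" argument; it is not the code from this change.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split "name=value" without writing to the caller's string.
 * Returns a newly allocated copy of the name (the caller frees it),
 * or NULL if there is no '=' or the allocation fails.  On success,
 * *value points into the original, unmodified argument.
 */
char *
example_split_arg(const char *arg, const char **value)
{
    const char *eq = strchr(arg, '=');
    size_t len;
    char *name;

    if (eq == NULL)
        return (NULL);

    len = (size_t)(eq - arg);
    name = malloc(len + 1);
    if (name == NULL)
        return (NULL);

    memcpy(name, arg, len);
    name[len] = '\0';
    *value = eq + 1;
    return (name);
}
```

Because the '=' is never overwritten with a NUL byte, the argument
string shown in the process listing stays intact.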

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7760
2018-07-06 02:46:51 -07:00
LOLi 98bc8e0b23 Fix arcstat.py handling of unsupported options
This change allows the arcstat.py script to handle unsupported options
gracefully and print both error and usage messages when one such option
is provided.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7799
2018-07-06 02:46:51 -07:00
LOLi caafa436eb Allow inherited properties in zfs_check_settable()
This change modifies how 'checksum' and 'dedup' properties are verified
in zfs_check_settable() handling the case where they are explicitly
inherited in the dataset hierarchy when receiving a recursive send
stream.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7755
Closes #7576
Closes #7757
2018-07-06 02:46:51 -07:00
LOLi fe8de1c8a6 Fix zfs incremental send remove '-o' properties
When receiving an incremental send stream with intermediary snapshots
zfs_receive_one() does not correctly identify the top-level dataset:
consequently we restore said snapshots as if they were children
datasets in the hierarchy, forcing inheritance of any property received
with 'zfs send -o' and effectively removing any locally set value.

The test case did not correctly verify this situation because it uses
adjacent snapshots, basically testing 'zfs send -i' instead of
'zfs send -I': this commit adds an additional intermediary snapshot to
the test script.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7478
2018-07-06 02:46:51 -07:00
Toomas Soome 1bd93ea1e0 OpenZFS 8906 - uts: illumos rootfs should support salted cksum
Porting notes:
* As of grub-2.02 these checksums are not supported.  However, as
  pointed out in #6501 there are alternatives such as EFISTUB which
  work and have no such restriction.  A warning was added to the
  checksum property section of the zfs.8 man page.

Authored by: Toomas Soome <tsoome@me.com>
Reviewed by: C Fraire <cfraire@me.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/8906
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7dec52f
Closes #6501
Closes #7714
2018-07-06 02:46:51 -07:00
Brian Behlendorf 6857950e46 Fix zpl_mount() deadlock
Commit 93b43af10 inadvertently introduced the following scenario which
can result in a deadlock.  This issue was most easily reproduced by
LXD containers using a ZFS storage backend but should be reproducible
under any workload which is frequently mounting and unmounting.

-- THREAD A --
spa_sync()
  spa_sync_upgrades()
    rrw_enter(&dp->dp_config_rwlock, RW_WRITER, FTAG); <- Waiting on B

-- THREAD B --
mount_fs()
  zpl_mount()
    zpl_mount_impl()
      dmu_objset_hold()
        dmu_objset_hold_flags()
          dsl_pool_hold()
            dsl_pool_config_enter()
              rrw_enter(&dp->dp_config_rwlock, RW_READER, tag);
    sget()
      sget_userns()
        grab_super()
          down_write(&s->s_umount); <- Waiting on C

-- THREAD C --
cleanup_mnt()
  deactivate_super()
    down_write(&s->s_umount);
    deactivate_locked_super()
      zpl_kill_sb()
        kill_anon_super()
          generic_shutdown_super()
            sync_filesystem()
              zpl_sync_fs()
                zfs_sync()
                  zil_commit()
                    txg_wait_synced() <- Waiting on A
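
The three traces above form a circular wait: A waits on B, B on C, and C on A. A minimal sketch of how such a cycle can be found in a wait-for graph follows; the graph encoding and the `find_cycle` helper are illustrative assumptions, not code from this commit.

```python
# Wait-for graph transcribed from the stack traces above:
# each thread maps to the thread it is blocked on.
WAITS_ON = {"A": "B", "B": "C", "C": "A"}


def find_cycle(waits_on, start):
    """Follow wait-for edges from `start`; return the cycle, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_on.get(node)
    if node is None:
        return None  # the chain ended at a thread that is not blocked
    # `node` was seen before, so everything from its first visit is the cycle.
    return path[path.index(node):] + [node]


cycle = find_cycle(WAITS_ON, "A")
```

Breaking any single edge (for example, not holding the config lock across `sget()`) removes the cycle in this model.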

Reviewed by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7598 
Closes #7659 
Closes #7691 
Closes #7693
2018-07-06 02:46:51 -07:00
Brian Behlendorf 716ce2b89e Fix kernel unaligned access on sparc64
Update the SA_COPY_DATA macro to check if the architecture supports
efficient unaligned memory accesses at compile time.  Otherwise
fall back to using the sa_copy_data() function.

The kernel provided CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
used to determine availability in kernel space.  In user space
the x86_64, x86, powerpc, and sometimes arm architectures will
define the HAVE_EFFICIENT_UNALIGNED_ACCESS macro.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7642
Closes #7684
2018-07-06 02:46:51 -07:00
Troels Nørgaard 9daae583d8 Default ashift for Amazon EC2 NVMe devices
Add a default 4 KiB ashift for Amazon EC2 NVMe devices on instances with
NVMe ephemeral devices, such as the types c5d, f1, i3 and m5d.
As per the official documentation [1] a 4096 byte blocksize should be
used to match the underlying hardware.
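
For reference, ashift is the base-2 logarithm of the sector size, so a 4096-byte block size corresponds to ashift 12. A small sketch of that arithmetic; the helper name is hypothetical.

```python
def ashift_for_block_size(block_size):
    """Return log2(block_size); reject sizes that are not powers of two."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1
```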

The string was identified via:

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/
    storage-optimized-instances.html
    Retrieved 2018-07-03

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Troels Nørgaard <tnn@tradeshift.com>
Closes #7676
2018-07-06 02:46:51 -07:00
Brian Behlendorf b5ee3df776 Linux 4.14 compat: blk_queue_stackable()
The blk_queue_stackable() function was replaced in the 4.14 kernel
by queue_is_rq_based(), commit torvalds/linux@5fdee212.  This change
resulted in the default elevator being used which can negatively
impact performance.

Rather than adding additional compatibility code to detect the
new interface unconditionally attempt to set the elevator.  Since
we expect this to fail for block devices without an elevator the
error message has been moved in to zfs_dbgmsg().

Finally, it was observed that the elevator_change() was removed
from the 4.12 kernel, commit torvalds/linux@c033269.  Update the
comment to clearly specify which kernels are expected to export the
elevator_change() symbol.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7645
2018-07-06 02:46:51 -07:00
Tony Hutter 17cd9a8e0c Add pool state /proc entry, "SUSPENDED" pools
1. Add a proc entry to display the pool's state:

$ cat /proc/spl/kstat/zfs/tank/state
ONLINE

This is done without using the spa config locks, so it will
never hang.

2. Fix 'zpool status' and 'zpool list -o health' output to print
"SUSPENDED" instead of "ONLINE" for suspended pools.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7331
Closes #7563
2018-07-06 02:46:51 -07:00
Sara Hartse 2a16d4cfaf zpool reopen should detect expanded devices
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the EFI partition and partition alignment) and therefore detect
expanded space.

Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.
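
The alignment rule can be sketched as rounding the newly available space down to a whole number of metaslabs, which also yields zero when less than one metaslab fits. The function and parameter names below are assumptions for illustration; the real logic lives in vdev_get_stats_ex.

```python
def reported_expandsize(new_space, metaslab_size):
    """Round new_space down to a metaslab multiple (0 if under one metaslab)."""
    if metaslab_size <= 0:
        raise ValueError("metaslab_size must be positive")
    return (new_space // metaslab_size) * metaslab_size
```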

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes #7546
Issue #7582
2018-07-06 02:46:51 -07:00
Antonio Russo 3350a33908 Support Debian DKMS builds
scripts/dkms.mkconf calls configure with
`--with-linux=${kernel_source_dir}`, but Debian puts its kernel source at
`/lib/modules/<version>/source`. This patch adds the same logic to the
DKMS file produced by `scripts/dkms.mkconf` that Debian has shipped in
its official ZFS packaging: at DKMS build time, it checks if the system
is a Debian system, and adjusts the path accordingly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7358 
Closes #7540 
Closes #7554
2018-07-06 02:46:51 -07:00
Olaf Faaland 3eef58c9b6 module param callbacks check for initialized spa
Callbacks provided for module parameters are executed both
after the module is loaded, when a user alters one via sysfs, e.g.
	echo bar > /sys/module/zfs/parameters/foo

as well as when the module is loaded with an argument, e.g.
	modprobe zfs foo=bar

In the latter case, the init functions likely have not run yet,
including spa_init() which initializes the namespace lock so it is safe
to use.

Instead of immediately taking the namespace lock and attempting to
iterate over initialized spa structures, check whether spa_mode_global
is nonzero.  This is set by spa_init() after it has initialized the
namespace lock.
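
A toy model of the guard, assuming a callback that always records the new value but only walks the pools once initialization has happened. The class and its members are hypothetical stand-ins, not the kernel structures.

```python
import threading


class ZfsModuleModel:
    """Toy stand-in for module state; not the real kernel structures."""

    def __init__(self):
        self.spa_mode_global = 0    # stays 0 until init() has run
        self.namespace_lock = None  # not usable before init()
        self.pools = []             # initialized pools
        self.param_foo = None

    def init(self):
        self.namespace_lock = threading.Lock()
        self.spa_mode_global = 1

    def set_param_foo(self, value):
        """Parameter callback: store the value, touch pools only if safe."""
        self.param_foo = value
        if self.spa_mode_global == 0:
            return 0  # e.g. set via modprobe arguments, before init()
        with self.namespace_lock:
            for pool in self.pools:
                pool["foo"] = value
            return len(self.pools)
```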

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7496 
Closes #7521
2018-07-06 02:46:51 -07:00
Brian Behlendorf 4805781c74 Trim new line from zfs_vdev_scheduler
Add a helper function to trim the trailing newline.  While we're
here use this new hook to immediately apply the new scheduler.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3356 
Closes #6573
2018-07-06 02:46:51 -07:00
Chunwei Chen b06f40ea9b Fix ENOSPC in "Handle zap_add() failures in ..."
Commit cc63068 caused ENOSPC errors when copying a large number of files
between two directories. The reason is that the patch limits zap leaf
expansion to 2 retries, and returns ENOSPC when they fail.

The intent for limiting retries is to prevent pointlessly growing table
to max size when adding a block full of entries with same name in
different case in mixed mode. However, it turns out we cannot use any
limit on the retry. When we copy files from one directory in readdir
order, we are copying in hash order, one leaf block at a time. Which
means that if the leaf block in source directory has expanded 6 times,
and you copy those entries in that block, by the time you need to expand
the leaf in destination directory, you need to expand it 6 times in one
go. So any limit on the retry will result in error where it shouldn't.

Note that while we do use different salt for different directories, it
seems that the salt/hash function doesn't provide enough randomization
to the hash distance to prevent this from happening.

Since cc63068 has already been reverted, this patch adds it back and
removes the retry limit.

Also, as it turns out, failing on zap_add() has a serious side effect for
mzap_upgrade(). When upgrading from micro zap to fat zap, it will
call zap_add() to transfer entries one at a time. If it hits any error
halfway through, the remaining entries will be lost, causing those files
to become orphaned. This patch adds a VERIFY to catch it.

Reviewed-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7401 
Closes #7421
2018-07-06 02:46:51 -07:00
Olaf Faaland 6b5cc49d81 Fix divide-by-zero in mmp_delay_update()
vdev_count_leaves() in the denominator may return 0, caught by Coverity.
Introduced by

* 533ea04 Update mmp_delay on sync or skipped, failed write
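
A sketch of the guarded division, assuming the leaf count is clamped to at least one before dividing. The clamp and the names below are assumptions about the shape of the fix, not the patch itself.

```python
NSEC_PER_MSEC = 1_000_000


def mmp_delay_floor_ns(multihost_interval_ms, leaf_count):
    """Interval in ns divided by the leaf count, treating zero leaves as one."""
    # Assumption: clamp the denominator so an empty vdev tree cannot divide by 0.
    return (multihost_interval_ms * NSEC_PER_MSEC) // max(1, leaf_count)
```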

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7391
2018-07-06 02:46:51 -07:00
Prakash Surya ef7a79488a OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, its `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics was
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issue`, leading to the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issue` can achieve the semantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.
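
The bitmask idea can be sketched as follows. The numeric values and the `describe_txg_how` helper are illustrative assumptions; only the flag names and their combinations come from the description above.

```python
# Assumed bit values, chosen for illustration only.
TXG_NOWAIT = 0
TXG_WAIT = 1 << 0
TXG_NOTHROTTLE = 1 << 1


def describe_txg_how(txg_how):
    """Decode a txg_how bitmask into its two independent behaviours."""
    return {
        "wait": bool(txg_how & TXG_WAIT),
        "throttle": not (txg_how & TXG_NOTHROTTLE),
    }
```

In this model `TXG_WAIT | TXG_NOTHROTTLE` blocks without the dirty-delay penalty, while `TXG_NOWAIT | TXG_NOTHROTTLE` reproduces the old `TXG_WAITED` behaviour.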

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-07-06 02:46:51 -07:00
Brian Behlendorf a2f759146d Linux compat 4.18: check_disk_size_change()
Added support for the bops->check_events() interface which was
added in the 2.6.38 kernel to replace bops->media_changed().
Fully implementing this functionality allows the volume resize
code to rely on revalidate_disk(), which is the preferred
mechanism, and removes the need to use check_disk_size_change().

In order for bops->check_events() to lookup the zvol_state_t
stored in the disk->private_data the zvol_state_lock needs to
be held.  Since the check events interface may poll the mutex
has been converted to a rwlock for better concurrency.  The
rwlock need only be taken as a writer in the zvol_free() path
when disk->private_data is set to NULL.

The configure checks for the block_device_operations structure
were consolidated in a single kernel-block-device-operations.m4
file.

The ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS configure checks
and associated dead code were removed.  This interface was added
to the 2.6.28 kernel which predates the oldest supported 2.6.32
kernel and will therefore always be available.

Updated maximum Linux version in META file.  The 4.17 kernel
was released on 2018-06-03 and ZoL is compatible with the
finalized kernel.

Reviewed-by: Boris Protopopov <boris.protopopov@actifio.com>
Reviewed-by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7611
2018-07-06 02:46:51 -07:00
Brian Behlendorf f79c0de208 Linux 4.18 compat: inode timespec -> timespec64
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members from timespec's to timespec64's to make them
2038 safe.  As part of this change the current_time() function was
also updated to return the timespec64 type.

Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode.  It should
be used when working with inode timestamps to ensure matching types.

The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t.  Rather than incorrectly
define this type all timespec_t types have been replaced by the new
inode_timespec_t type.

Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other.  They define as appropriate for the context several
constants as macros and include static inline implementation of
gethrestime(), gethrestime_sec(), and gethrtime().

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7643
Backported-by: Richard Yao <ryao@gentoo.org>
2018-07-06 02:46:51 -07:00
Boris Protopopov 1667816089 zv_suspend_lock in zvol_open()/zvol_release()
Acquire zv_suspend_lock on first open and last close only.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Closes #6342
2018-07-06 02:46:51 -07:00
Tony Hutter d1ed1be3cd Tag zfs-0.7.9
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-05-08 13:33:38 -07:00
Tony Hutter e749242a99 Remove DEBUG_STACKFLAGS to bypass compiler error
'Support -fsanitize=address with --enable-asan' (fed9035) removed
DEBUG_STACKFLAGS="-fstack-check" from zfs-build.m4 in master.
However, that's too heavyweight a patch to merge in to the 0.7.x branch,
so just take the one-liner we need to get around a compiler error
on Fedora 28:

$ ./configure --enable-debug --enable-debuginfo && make pkg-utils
  CC       gethrtime.lo
cc1: error: '-fstack-check=' and '-fstack-clash_protection' are mutually
exclusive.  Disabling '-fstack-check=' [-Werror]

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>

Requires-spl: #701
2018-05-07 17:19:58 -07:00
Tony Hutter 9267ef84fd Fedora 28: Add BuildRequires: libtirpc-devel
Add "BuildRequires: libtirpc-devel" to fix mock builds on Fedora 28.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7494
Closes #7495
2018-05-07 17:19:57 -07:00
Brian Behlendorf 0ee129199f RHEL 7.5 compat: FMODE_KABI_ITERATE
As of RHEL 7.5 the mainline fops.iterate() method was added to
the file_operations structure and is correctly detected by the
configure script.

Normally this is what we want, but in order to maintain KABI
compatibility the RHEL change additionally does the following:

* Requires that callers intending to use this extended interface
  set the FMODE_KABI_ITERATE flag on the file structure when
  opening the directory.
* Adds the fops.iterate() method to the end of the structure,
  without removing fops.readdir().

This change updates the configure check to ignore the RHEL 7.5+
variant of fops.iterate() when detected.  Instead fall back to
the fops.readdir() interface which will be available.

Finally, add the 'zpl_' prefix to the directory context wrappers
to avoid colliding with the kernel provided symbols when both
the fops.iterate() and fops.readdir() are provided by the kernel.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7460
Closes #7463
2018-05-07 17:19:57 -07:00
George Melikov 245be00597 Add back iostat -y or -w descriptions
The iostat -y and -w descriptions were left in cda0317e,
get them back.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #7479
Closes #7483
2018-05-07 17:19:57 -07:00
Antonio Russo c38d702330 Add test with two kinds of file creation orders
Data loss was identified in #7401 when many small files were copied.
This adds a reproducer for this bug and other similar ones: randomly
generate N files. Then, listing M of them by `ls -U` order, produce
those same files in a directory of the same name.

This triggers the bug consistently, provided N and M are large enough.
Here, N=2^16 and M=2^13.
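
A scaled-down sketch of the reproducer's shape, using `os.scandir`, which yields entries in unsorted directory order much like `ls -U`. The function names, the use of temporary directories, and the small default N and M are assumptions; the actual test is a shell script in the test suite.

```python
import os
import tempfile


def copy_in_listing_order(src, dst, count):
    """Recreate the first `count` entries of src in dst, in directory order."""
    os.makedirs(dst, exist_ok=True)
    created = []
    with os.scandir(src) as entries:
        for entry in entries:
            if len(created) >= count:
                break
            open(os.path.join(dst, entry.name), "w").close()
            created.append(entry.name)
    return created


def run_reproducer(n=64, m=8):
    """Create n randomly named files, recreate m of them, check none is lost."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src")
        dst = os.path.join(root, "dst")
        os.makedirs(src)
        for _ in range(n):
            fd, _path = tempfile.mkstemp(dir=src)
            os.close(fd)
        created = copy_in_listing_order(src, dst, m)
        return len(created) == m and sorted(created) == sorted(os.listdir(dst))
```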

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7411
2018-05-07 17:19:57 -07:00
Seth Forshee 3f729907c8 Allow mounting datasets more than once
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with its own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.
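
The lookup-or-create pattern can be modelled roughly as below. `SuperBlock`, `MountTable` and `same_objset` are hypothetical stand-ins for the kernel's super block list, `sget()`, and the test callback.

```python
class SuperBlock:
    def __init__(self, objset):
        self.objset = objset
        self.refs = 1


class MountTable:
    """Toy model of the kernel's list of active super blocks."""

    def __init__(self):
        self.supers = []

    def sget(self, test, objset):
        """Return an existing super block accepted by `test`, else a new one."""
        for sb in self.supers:
            if test(sb, objset):
                sb.refs += 1  # reuse: hand out another reference
                return sb
        sb = SuperBlock(objset)
        self.supers.append(sb)
        return sb


def same_objset(sb, objset):
    """Test callback: a super block matches if it wraps the same objset."""
    return sb.objset is objset
```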

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes #5796
Closes #7207
2018-05-07 17:19:57 -07:00
beren12 cca220d7c6 Fix zfs_arc_max minimum tuning
When setting `zfs_arc_max` its minimum value is allowed
to be 64 MiB.  There was an off-by-1 error which can matter
on tiny systems.
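
The off-by-one amounts to whether the boundary value itself is accepted. A sketch under the assumption that the check changed from a strict to an inclusive comparison; the helper and its `inclusive` switch are illustrative only.

```python
MIN_ARC_MAX = 64 << 20  # 64 MiB


def arc_max_accepted(value, inclusive=True):
    """Return True if `value` passes the minimum check for zfs_arc_max."""
    # Assumption: the inclusive form is the intended behaviour;
    # the strict form rejects exactly 64 MiB.
    return value >= MIN_ARC_MAX if inclusive else value > MIN_ARC_MAX
```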

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Zubrzycki <github@mid-earth.net>
Closes #7417
2018-05-07 17:19:57 -07:00
Brian Behlendorf 4ed30958ce Linux compat 4.16: blk_queue_flag_{set,clear}
The HAVE_BLK_QUEUE_WRITE_CACHE_GPL_ONLY case was overlooked in
the original 10f88c5c commit because blk_queue_write_cache()
was available for the in-kernel builds.

Update the blk_queue_flag_{set,clear} wrappers to call the locked
versions to avoid confusion.  This is safe for all existing callers.

The blk_queue_set_write_cache() function has been updated to use
these wrappers.  This means setting/clearing both QUEUE_FLAG_WC
and QUEUE_FLAG_FUA is no longer atomic but this is only done early
in zvol_alloc() prior to any requests so there is no issue.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Kash Pande <kash@tripleback.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7428
Closes #7431
2018-05-07 17:19:57 -07:00
Giuseppe Di Natale 2f118072cb Linux compat 4.16: blk_queue_flag_{set,clear}
queue_flag_{set,clear}_unlocked are now private interfaces in
the Linux kernel (https://github.com/torvalds/linux/commit/8a0ac14).
Use blk_queue_flag_{set,clear} interfaces which were introduced as
of https://github.com/torvalds/linux/commit/8814ce8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7410
2018-05-07 17:19:57 -07:00
Brian Behlendorf 7440f10ec1 Fix 'zfs send/recv' hang with 16M blocks
When using 16MB blocks the send/recv queues aren't quite big
enough.  This change leaves the default 16M queue size which is a
good value for most pools.  But it additionally ensures that the
queue sizes are at least twice the allowed zfs_max_recordsize.
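
In other words, the effective queue length is the larger of the configured value and twice the maximum record size. A sketch with hypothetical names:

```python
DEFAULT_QUEUE_LENGTH = 16 << 20  # 16 MiB default described above


def effective_queue_length(max_recordsize, configured=DEFAULT_QUEUE_LENGTH):
    """Keep the configured size unless it is under twice the max record size."""
    return max(configured, 2 * max_recordsize)
```

With 16 MiB records this gives 32 MiB; with 1 MiB records the 16 MiB default is unchanged.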

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7365
Closes #7404
2018-05-07 17:19:57 -07:00
Giuseppe Di Natale 8bb800d6b4 Clean up (k)shlib and cfg file shebangs
Most kshlib files are imported by other scripts
and do not have a shebang at the top of their files.
Make all kshlib follow this convention.

Remove shebangs from cfg files as well.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Close #7406
2018-05-07 17:19:57 -07:00
Tony Hutter bbf61c118f Fix "file is executable, but no shebang" warnings
Fedora 28's RPM build checks warn when executable files don't have a
shebang line.  These warnings are caused when we (incorrectly)
include data & config files in the _SCRIPTS automake lines. Files in
_SCRIPTS are marked executable by automake. This patch fixes the
issue by including non-executable scripts in a _DATA line instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7359
Closes #7395
2018-05-07 17:19:57 -07:00
Tony Hutter d296b09456 Exclude python scripts from RPM shebang check
The newest Fedora packaging rules print warnings for scripts using the
/usr/bin/python shebang:

    *** WARNING: mangling shebang in /usr/bin/arc_summary.py from
    #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR,
    fix it manually!

Fedora wants all cross compatible scripts to pick python3.  Since we
don't want our users to have to pick a specific version of python, we
exclude our scripts from the RPM build check.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7360
Closes #7399
2018-05-07 17:19:57 -07:00
Olaf Faaland 5ac017fc04 Update mmp_delay on sync or skipped, failed write
When an MMP write is skipped, or fails, and time since
mts->mmp_last_write is already greater than mts->mmp_delay, increase
mts->mmp_delay.  The original code only updated mts->mmp_delay when a
write succeeded, but this results in the write(s) after delays and
failed write(s) reporting an ub_mmp_delay which is too low.

Update mmp_last_write and mmp_delay if a txg sync was successful.  At
least one uberblock was written, thus extending the time we can be sure
the pool will not be imported by another host.

Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) /
vdev_count_leaves()) so that a period of frequent successful MMP writes,
e.g. due to frequent txg syncs, does not result in an import activity
check so short it is not reliable based on mmp thread writes alone.

Remove unnecessary local variable, start.  We do not use the start time
of the loop iteration.

Add a debug message in spa_activity_check() to allow verification of the
import_delay value and to prove the activity check occurred.

Alter the tests that import pools and attempt to detect an activity
check.  Calculate the expected duration of spa_activity_check() based on
module parameters at the time the import is performed, rather than a
fixed time set in mmp.cfg.  The fixed time may be wrong.  Also, use the
default zfs_multihost_interval value so the activity check is longer and
easier to recognize.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7330
2018-05-07 17:19:57 -07:00
Tony Hutter f5ecab3aef Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7361
Closes #7368
2018-05-07 17:19:57 -07:00
LOLi fd01167ffd Fix hung z_zvol tasks during 'zfs receive'
During a receive operation zvol_create_minors_impl() can wait
needlessly for the prefetch thread because both share the same tasks
queue.  This results in hung tasks:

<3>INFO: task z_zvol:5541 blocked for more than 120 seconds.
<3>      Tainted: P           O  3.16.0-4-amd64
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The first z_zvol:5541 (zvol_task_cb) is waiting for the long running
traverse_prefetch_thread:260

root@linux:~# cat /proc/spl/taskq
taskq                       act  nthr  spwn  maxt   pri  mina
spl_system_taskq/0            1     2     0    64   100     1
	active: [260]traverse_prefetch_thread [zfs](0xffff88003347ae40)
	wait: 5541
spl_delay_taskq/0             0     1     0     4   100     1
	delay: spa_deadman [zfs](0xffff880039924000)
z_zvol/1                      1     1     0     1   120     1
	active: [5541]zvol_task_cb [zfs](0xffff88001fde6400)
	pend: zvol_task_cb [zfs](0xffff88001fde6800)

This change adds a dedicated, per-pool, prefetch taskq to prevent the
traverse code from monopolizing the global (and limited) system_taskq by
inappropriately scheduling long running tasks on it.

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6330
Closes #6890
Closes #7343
2018-05-07 17:19:57 -07:00
Don Brady 3b118f0a34 Add support for nvme based devids
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #7356
2018-05-07 17:19:57 -07:00
Tony Hutter ebe443c8ff chmod -x on etc/init.d/zfs-*.in automake files
Clear executable bit on zfs-import.in, zfs-mount.in,
zfs-share.in, and zfs-zed.in.  These are automake files and
should not be marked executable.  This fixes a RPM build error
on Fedora 28.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7355
Closes #7327
2018-05-07 17:19:57 -07:00
Brian Behlendorf 63f3396233 Fix mmap / libaio deadlock
Calling uiomove() in mappedread() under the page lock can result
in a deadlock if the user space page needs to be faulted in.

Resolve the issue by dropping the page lock before the uiomove().
The inode range lock protects against concurrent updates via
zfs_read() and zfs_write().

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7335
Closes #7339
2018-05-07 17:19:57 -07:00
DeHackEd 2deb4526ee Remove libattr requirement
RHEL/CentOS 6 supports sys/xattr.h eliminating the need for
libattr-devel as a dependency.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7344
Closes #7351
2018-05-07 17:19:57 -07:00
Tony Hutter a1662ffcaa Fedora 28: Fix "Macro %_dracutdir has empty body"
If you run ./configure --with-config=srpm, it will not trigger
the user m4 scripts to populate the dracut and udev directories.
This causes a build error on Fedora 28.  Make the dracut and
udev lines conditional to get around this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7326
Closes #7328
2018-05-07 17:19:57 -07:00
kpande ea921bf6a6 modprobe zfs during dracut mount
Resolves importing root pool during boot in dracut.  This case was
inadvertently broken with the module autoloading change in #7287.

Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Kash Pande <kash@tripleback.net>
Closes #7322
2018-05-07 17:19:57 -07:00
timor 6e627cc468 Add support for nvme disk detection
This treats /dev/nvme.. devices the same way as /dev/sd... devices.  The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because DKC_UNKNOWN was returned for such devices.

Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: timor <timor.dd@googlemail.com>
Closes #7304
2018-05-07 17:19:56 -07:00
Olaf Faaland 3eb3a13628 Report pool suspended due to MMP
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.

Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.

When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.

In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.

In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7296
2018-05-07 17:19:56 -07:00
Tim Chase c234706270 Add zfs_scan_ignore_errors tunable
When it is set, a DTL range is cleared even if its scan/scrub had
errors.  This makes it possible to work around a resilver/scrub upon
import when the pool has errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #7293
2018-05-07 17:19:56 -07:00
Tony Hutter 6059ba27c4 Allow to limit zed's syslog chattiness
Some usage patterns like send/recv of replication streams can
produce a large number of events. In such a case, the current
all-syslog.sh zedlet will live up to its name, and flood the
logs with mostly redundant information. To mitigate this
situation, this changeset introduces two new variables
ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE
to zed.rc that give more control over which event classes end
up in the syslog.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Closes #6886
Closes #7260
2018-05-07 17:19:56 -07:00
Olaf Faaland 927f40d089 Record skipped MMP writes in multihost_history
Once per pass through the MMP thread's loop, the vdev tree is walked to
find a suitable leaf to write the next MMP block to.  If no such leaf is
found, the thread sleeps for a while and resumes at the top of the loop.

Add an entry to multihost_history when no leaf can be found, and record
the reason in the error column.  The error code for such entries is a
bitfield, displayed in hex:

0x1  At least one vdev (interior or leaf) was not writeable.
0x2  At least one writeable leaf vdev was found, but it had a pending
MMP write.

timestamp = the time in seconds since the epoch when no leaf could be
found originally.

duration = the time (in ns) during which no MMP block was written for
this reason.  This does not include the preceding inter-write period
nor the following inter-write period.

vdev_guid = the number of sequential cycles of the MMP thread loop when
this occurred.

For records of skipped MMP writes the right-most column, vdev_path, is
reported as "-".

Sample output, truncated to fit:

id   txg  timestamp   error  duration    mmp_delay  vdev_guid     ...
936  11   1520036441  0      146264      891422313  1740883117838 ...
937  11   1520036441  0      163956      888356657  7320395061548 ...
938  11   1520036442  0      130690      885314969  7320395061548 ...
939  11   1520036442  0      2001068577  882296582  1740883117838 ...
940  11   1520036443  0      161806      882296582  7320395061548 ...
941  11   1520036443  0x2    0           998020546  1             ...
942  11   1520036444  0      136585      998020546  7320395061548 ...
943  11   1520036444  0x2    0           998020257  1             ...
944  11   1520036445  5      2002662964  994160219  1740883117838 ...
945  11   1520036445  0x2    998073118   994160219  3             ...
946  11   1520036447  0      247136      994160219  7320395061548 ...

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7212
2018-05-07 17:19:56 -07:00
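As a reading aid for the error bitfield described in the commit above, here is a minimal Python sketch that decodes the value of the error column for a skipped-write record. The function and table names are hypothetical. It should only be applied to records of skipped writes (those whose vdev_path is "-"), since for ordinary records the error column is not this bitfield.

```python
# Bit meanings as listed in the commit message.
MMP_SKIP_REASONS = {
    0x1: "at least one vdev (interior or leaf) was not writeable",
    0x2: "a writeable leaf vdev was found, but it had a pending MMP write",
}


def decode_skip_error(field):
    """Return the reasons encoded in a hex error field such as '0x2'."""
    value = int(field, 16)
    return [text for bit, text in sorted(MMP_SKIP_REASONS.items())
            if value & bit]
```

For example, `decode_skip_error("0x3")` returns both reasons, because both bits are set.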
Giuseppe Di Natale 6356d50e67 Introduce a destroy_dataset helper
Datasets can be busy when calling zfs destroy. Introduce
a helper function to destroy datasets and use it to destroy
datasets in zfs_allow_004_pos, zfs_promote_008_pos, and
zfs_destroy_002_pos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7224
Closes #7246
Closes #7249
Closes #7267
2018-05-07 17:19:56 -07:00
Tony Hutter bd69ae3b53 Tag zfs-0.7.8
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-04-09 14:31:57 -07:00
Tony Hutter 9a2e90c9fc Revert "Handle zap_add() failures in mixed ... "
This reverts commit cc63068e95.

Under certain circumstances this change can result in an ENOSPC
error when adding new files to a directory.  See #7401 for full
details.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Issue #7401
Closes #7416
2018-04-09 17:29:59 -04:00
310 changed files with 5527 additions and 1938 deletions

View File

@ -161,7 +161,7 @@ coding convention.
### Commit Message Formats
#### New Changes
Commit messages for new changes must meet the following guidelines:
* In 50 characters or less, provide a summary of the change as the
* In 72 characters or less, provide a summary of the change as the
first line in the commit message.
* A body which provides a description of the change. If necessary,
please summarize important information such as why the proposed

2
META
View File

@ -1,7 +1,7 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.7.7
Version: 0.7.13
Release: 1
Release-Tags: relext
License: CDDL

View File

@ -112,7 +112,6 @@ cur = {}
d = {}
out = None
kstat = None
float_pobj = re.compile("^[0-9]+(\.[0-9]+)?$")
def detailed_usage():
@ -285,7 +284,7 @@ def init():
]
)
except getopt.error as msg:
sys.stderr.write(msg)
sys.stderr.write("Error: %s\n" % str(msg))
usage()
opts = None
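The reason for the change above can be reproduced in isolation. The sketch below assumes Python 3 and uses an in-memory `io.StringIO` buffer in place of `sys.stderr`; the option strings passed to `getopt.getopt` are made up for the demonstration. It shows that writing the exception object directly raises `TypeError`, while formatting it into a string works.

```python
import getopt
import io

# Trigger a getopt error with an option the parser does not know.
try:
    getopt.getopt(["--bogus"], "h", ["help"])
except getopt.error as msg:
    caught = msg

buf = io.StringIO()  # stand-in for sys.stderr

# Old behaviour: passing the exception object itself to write() fails,
# because text streams only accept str.
try:
    buf.write(caught)
    raw_write_failed = False
except TypeError:
    raw_write_failed = True

# New behaviour: convert the exception to a string first.
buf.write("Error: %s\n" % str(caught))
```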

View File

@ -7,6 +7,8 @@ DEFAULT_INCLUDES += \
#
# Ignore the prefix for the mount helper. It must be installed in /sbin/
# because this path is hardcoded in the mount(8) for security reasons.
# However, if needed, the configure option --with-mounthelperdir= can be used
# to override the default install location.
#
sbindir=$(mounthelperdir)
sbin_PROGRAMS = mount.zfs

View File

@ -100,10 +100,11 @@ usage() {
cat << EOF
Usage: vdev_id [-h]
vdev_id <-d device> [-c config_file] [-p phys_per_port]
[-g sas_direct|sas_switch] [-m]
[-g sas_direct|sas_switch|scsi] [-m]
-c specify name of alternate config file [default=$CONFIG]
-d specify basename of device (i.e. sda)
-e Create enclosure device symlinks only (/dev/by-enclosure)
-g Storage network topology [default="$TOPOLOGY"]
-m Run in multipath mode
-p number of phy's per switch port [default=$PHYS_PER_PORT]
@ -135,7 +136,7 @@ map_channel() {
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \\$2 == ${PORT} \
{ print \\$3; exit }" $CONFIG`
;;
"sas_direct")
"sas_direct"|"scsi")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \
\\$2 == \"${PCI_ID}\" && \\$3 == ${PORT} \
{ print \\$4; exit }" $CONFIG`
@ -276,6 +277,23 @@ sas_handler() {
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
;;
"ses")
# look for this SAS path in all SCSI Enclosure Services
# (SES) enclosures
sas_address=`cat $end_device_dir/sas_address 2>/dev/null`
enclosures=`lsscsi -g | \
sed -n -e '/enclosu/s/^.* \([^ ][^ ]*\) *$/\1/p'`
for enclosure in $enclosures; do
set -- $(sg_ses -p aes $enclosure | \
awk "/device slot number:/{slot=\$12} \
/SAS address: $sas_address/\
{print slot}")
SLOT=$1
if [ -n "$SLOT" ] ; then
break
fi
done
;;
esac
if [ -z "$SLOT" ] ; then
return
@ -289,6 +307,156 @@ sas_handler() {
echo ${CHAN}${SLOT}${PART}
}
scsi_handler() {
if [ -z "$FIRST_BAY_NUMBER" ] ; then
FIRST_BAY_NUMBER=`awk "\\$1 == \"first_bay_number\" \
{print \\$2; exit}" $CONFIG`
fi
FIRST_BAY_NUMBER=${FIRST_BAY_NUMBER:-0}
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
fi
# For raw disks udev exports DEVTYPE=partition when
# handling partitions, and the rules can be written to
# take advantage of this to append a -part suffix. For
# dm devices we get DEVTYPE=disk even for partitions so
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
fi
# expect sys_path like this, for example:
# /devices/pci0000:00/0000:00:0b.0/0000:09:00.0/0000:0a:05.0/0000:0c:00.0/host3/target3:1:0/3:1:0:21/block/sdv
# Use positional parameters as an ad-hoc array
set -- $(echo "$sys_path" | tr / ' ')
num_dirs=$#
scsi_host_dir="/sys"
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
done
if [ $i = $num_dirs ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
# In scsi mode, the directory two levels beneath
# /sys/.../hostX reveals the port and slot.
port_dir=$scsi_host_dir
j=$(($i + 2))
i=$(($i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
done
set -- $(echo $port_dir | sed -e 's/^.*:\([^:]*\):\([^:]*\)$/\1 \2/')
PORT=$1
SLOT=$(($2 + $FIRST_BAY_NUMBER))
if [ -z "$SLOT" ] ; then
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
}
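The path handling in `scsi_handler` above can be hard to follow in shell. The following Python sketch models only the parsing step, under the assumption that the sysfs path has the shape shown in the script's comment. The function name `parse_scsi_sys_path` is hypothetical, and the channel and slot mapping done by `map_channel` and `map_slot` is not reproduced.

```python
import re


def parse_scsi_sys_path(sys_path, first_bay_number=0):
    """Return (pci_id, port, slot) from a sysfs path, or None if no hostN."""
    parts = [p for p in sys_path.split("/") if p]
    host_idx = next(
        (i for i, p in enumerate(parts) if re.fullmatch(r"host[0-9]+", p)),
        None)
    # Need a PCI directory before hostN and two directories after it.
    if host_idx is None or host_idx == 0 or host_idx + 2 >= len(parts):
        return None

    # "0000:0c:00.0" -> "0c:00.0" (second and third colon-separated fields).
    pci_id = ":".join(parts[host_idx - 1].split(":")[1:3])

    # Two levels below hostN, e.g. "3:1:0:21": the last two fields are
    # the port and the slot.
    fields = parts[host_idx + 2].split(":")
    if len(fields) < 2:
        return None
    port = fields[-2]
    slot = int(fields[-1]) + first_bay_number
    return pci_id, port, slot
```

With the example path from the script's comment, this yields PCI ID `0c:00.0`, port `0`, and slot `21` when `first_bay_number` is 0.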
# Figure out the name for the enclosure symlink
enclosure_handler () {
# We get all the info we need from udev's DEVPATH variable:
#
# DEVPATH=/sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/subsystem/devices/0:0:0:0/scsi_generic/sg0
# Get the enclosure ID ("0:0:0:0")
ENC=$(basename $(readlink -m "/sys/$DEVPATH/../.."))
if [ ! -d /sys/class/enclosure/$ENC ] ; then
# Not an enclosure, bail out
return
fi
# Get the long sysfs device path to our enclosure. Looks like:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0/ ... /enclosure/0:0:0:0
ENC_DEVICE=$(readlink /sys/class/enclosure/$ENC)
# Grab the full path to the hosts port dir:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0
PORT_DIR=$(echo $ENC_DEVICE | grep -Eo '.+host[0-9]+/port-[0-9]+:[0-9]+')
# Get the port number
PORT_ID=$(echo $PORT_DIR | grep -Eo "[0-9]+$")
# The PCI directory is two directories up from the port directory
# /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
PCI_ID_LONG=$(basename $(readlink -m "/sys/$PORT_DIR/../.."))
# Strip down the PCI address from 0000:05:00.0 to 05:00.0
PCI_ID=$(echo "$PCI_ID_LONG" | sed -r 's/^[0-9]+://g')
# Name our device according to vdev_id.conf (like "L0" or "U1").
NAME=$(awk "/channel/{if (\$1 == \"channel\" && \$2 == \"$PCI_ID\" && \
\$3 == \"$PORT_ID\") {print \$4int(count[\$4])}; count[\$4]++}" $CONFIG)
echo "${NAME}"
}
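The awk one-liner at the end of `enclosure_handler` builds the enclosure name from the channel name plus a running count of earlier lines that used the same channel name. A rough Python equivalent is sketched below. The function name and the sample configuration lines used to exercise it are illustrative assumptions, not taken from a real vdev_id.conf.

```python
def enclosure_name(config_lines, pci_id, port_id):
    """Mimic the awk naming logic: channel name + count of prior uses."""
    counts = {}
    names = []
    for line in config_lines:
        # awk's /channel/ pattern matches any line containing the word.
        if "channel" not in line:
            continue
        fields = line.split()
        name = fields[3] if len(fields) > 3 else ""
        if fields[:3] == ["channel", pci_id, port_id]:
            names.append("%s%d" % (name, counts.get(name, 0)))
        # The counter advances for every matching line, selected or not.
        counts[name] = counts.get(name, 0) + 1
    return "\n".join(names)
```

Given two hypothetical HBAs that both define a channel named `A`, the first is named `A0` and the second `A1`, which matches the "L0" or "U1" style mentioned in the script's comment.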
alias_handler () {
# Special handling is needed to correctly append a -part suffix
# to partitions of device mapper devices. The DEVTYPE attribute
@ -344,7 +512,7 @@ alias_handler () {
done
}
while getopts 'c:d:g:mp:h' OPTION; do
while getopts 'c:d:eg:mp:h' OPTION; do
case ${OPTION} in
c)
CONFIG=${OPTARG}
@ -352,6 +520,16 @@ while getopts 'c:d:g:mp:h' OPTION; do
d)
DEV=${OPTARG}
;;
e)
# When udev sees a scsi_generic device, it calls this script with -e to
# create the enclosure device symlinks only. We also need
# "enclosure_symlinks yes" set in vdev_id.conf to actually create the
# symlink.
ENCLOSURE_MODE=$(awk '{if ($1 == "enclosure_symlinks") print $2}' $CONFIG)
if [ "$ENCLOSURE_MODE" != "yes" ] ; then
exit 0
fi
;;
g)
TOPOLOGY=$OPTARG
;;
@ -371,7 +549,7 @@ if [ ! -r $CONFIG ] ; then
exit 0
fi
if [ -z "$DEV" ] ; then
if [ -z "$DEV" -a -z "$ENCLOSURE_MODE" ] ; then
echo "Error: missing required option -d"
exit 1
fi
@ -384,16 +562,37 @@ if [ -z "$BAY" ] ; then
BAY=`awk "\\$1 == \"slot\" {print \\$2; exit}" $CONFIG`
fi
TOPOLOGY=${TOPOLOGY:-sas_direct}
# Should we create /dev/by-enclosure symlinks?
if [ "$ENCLOSURE_MODE" = "yes" -a "$TOPOLOGY" = "sas_direct" ] ; then
ID_ENCLOSURE=$(enclosure_handler)
if [ -z "$ID_ENCLOSURE" ] ; then
exit 0
fi
# Just create the symlinks to the enclosure devices and then exit.
ENCLOSURE_PREFIX=$(awk '/enclosure_symlinks_prefix/{print $2}' $CONFIG)
if [ -z "$ENCLOSURE_PREFIX" ] ; then
ENCLOSURE_PREFIX="enc"
fi
echo "ID_ENCLOSURE=$ID_ENCLOSURE"
echo "ID_ENCLOSURE_PATH=by-enclosure/$ENCLOSURE_PREFIX-$ID_ENCLOSURE"
exit 0
fi
# First check if an alias was defined for this device.
ID_VDEV=`alias_handler`
if [ -z "$ID_VDEV" ] ; then
BAY=${BAY:-bay}
TOPOLOGY=${TOPOLOGY:-sas_direct}
case $TOPOLOGY in
sas_direct|sas_switch)
ID_VDEV=`sas_handler`
;;
scsi)
ID_VDEV=`scsi_handler`
;;
*)
echo "Error: unknown topology $TOPOLOGY"
exit 1

View File

@ -24,7 +24,7 @@
* Copyright (c) 2011, 2016 by Delphix. All rights reserved.
* Copyright (c) 2014 Integros [integros.com]
* Copyright 2016 Nexenta Systems, Inc.
* Copyright (c) 2017 Lawrence Livermore National Security, LLC.
* Copyright (c) 2017, 2018 Lawrence Livermore National Security, LLC.
* Copyright (c) 2015, 2017, Intel Corporation.
*/
@ -3659,6 +3659,22 @@ dump_simulated_ddt(spa_t *spa)
dump_dedup_ratio(&dds_total);
}
static void
zdb_set_skip_mmp(char *target)
{
spa_t *spa;
/*
* Disable the activity check to allow examination of
* active pools.
*/
mutex_enter(&spa_namespace_lock);
if ((spa = spa_lookup(target)) != NULL) {
spa->spa_import_flags |= ZFS_IMPORT_SKIP_MMP;
}
mutex_exit(&spa_namespace_lock);
}
static void
dump_zpool(spa_t *spa)
{
@ -4412,14 +4428,15 @@ main(int argc, char **argv)
target, strerror(ENOMEM));
}
/*
* Disable the activity check to allow examination of
* active pools.
*/
if (dump_opt['C'] > 1) {
(void) printf("\nConfiguration for import:\n");
dump_nvlist(cfg, 8);
}
/*
* Disable the activity check to allow examination of
* active pools.
*/
error = spa_import(target_pool, cfg, NULL,
flags | ZFS_IMPORT_SKIP_MMP);
}
@ -4430,16 +4447,7 @@ main(int argc, char **argv)
if (error == 0) {
if (target_is_spa || dump_opt['R']) {
/*
* Disable the activity check to allow examination of
* active pools.
*/
mutex_enter(&spa_namespace_lock);
if ((spa = spa_lookup(target)) != NULL) {
spa->spa_import_flags |= ZFS_IMPORT_SKIP_MMP;
}
mutex_exit(&spa_namespace_lock);
zdb_set_skip_mmp(target);
error = spa_open_rewind(target, &spa, FTAG, policy,
NULL);
if (error) {
@ -4462,6 +4470,7 @@ main(int argc, char **argv)
}
}
} else {
zdb_set_skip_mmp(target);
error = open_objset(target, DMU_OST_ANY, FTAG, &os);
}
}

View File

@ -10,6 +10,8 @@
: "${ZED_DEBUG_LOG:="${TMPDIR:="/tmp"}/zed.debug.log"}"
zed_exit_if_ignoring_this_event
lockfile="$(basename -- "${ZED_DEBUG_LOG}").lock"
umask 077

View File

@ -5,6 +5,8 @@
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
zed_exit_if_ignoring_this_event
zed_log_msg "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
"${ZEVENT_POOL_GUID:+"pool_guid=${ZEVENT_POOL_GUID}"}" \
"${ZEVENT_VDEV_PATH:+"vdev_path=${ZEVENT_VDEV_PATH}"}" \

View File

@ -438,3 +438,23 @@ zed_guid_to_pool()
$ZPOOL get -H -ovalue,name guid | awk '$1=='"$guid"' {print $2}'
fi
}
# zed_exit_if_ignoring_this_event
#
# Exit the script if we should ignore this event, as determined by
# $ZED_SYSLOG_SUBCLASS_INCLUDE and $ZED_SYSLOG_SUBCLASS_EXCLUDE in zed.rc.
# This function assumes you've imported the normal zed variables.
zed_exit_if_ignoring_this_event()
{
if [ -n "${ZED_SYSLOG_SUBCLASS_INCLUDE}" ]; then
eval "case ${ZEVENT_SUBCLASS} in
${ZED_SYSLOG_SUBCLASS_INCLUDE});;
*) exit 0;;
esac"
elif [ -n "${ZED_SYSLOG_SUBCLASS_EXCLUDE}" ]; then
eval "case ${ZEVENT_SUBCLASS} in
${ZED_SYSLOG_SUBCLASS_EXCLUDE}) exit 0;;
*);;
esac"
fi
}

View File

@ -100,3 +100,14 @@ ZED_USE_ENCLOSURE_LEDS=1
#
#ZED_SYSLOG_TAG="zed"
##
# Which set of event subclasses to log
# By default, events from all subclasses are logged.
# If ZED_SYSLOG_SUBCLASS_INCLUDE is set, only subclasses
# matching the pattern are logged. Use the pipe symbol (|)
# or shell wildcards (*, ?) to match multiple subclasses.
# Otherwise, if ZED_SYSLOG_SUBCLASS_EXCLUDE is set, the
# matching subclasses are excluded from logging.
#ZED_SYSLOG_SUBCLASS_INCLUDE="checksum|scrub_*|vdev.*"
#ZED_SYSLOG_SUBCLASS_EXCLUDE="statechange|config_*|history_event"
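The include/exclude rules documented above can be modeled in a few lines. This is a hedged Python sketch of the decision only, not the zedlet code. `should_log` is a hypothetical helper, and `fnmatch` is used as an approximation of shell `case` pattern matching, which may differ in corner cases.

```python
from fnmatch import fnmatchcase


def _matches(subclass, patterns):
    """True if subclass matches any of the '|'-separated wildcard patterns."""
    return any(fnmatchcase(subclass, p) for p in patterns.split("|"))


def should_log(subclass, include="", exclude=""):
    """Decide whether an event subclass would be written to syslog."""
    if include:
        # An include list wins: only matching subclasses are logged.
        return _matches(subclass, include)
    if exclude:
        # Otherwise an exclude list suppresses matching subclasses.
        return not _matches(subclass, exclude)
    # Neither variable set: log everything.
    return True
```

Note that the exclude list is consulted only when the include list is empty, which mirrors the "Otherwise" wording in the zed.rc comment.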

View File

@ -7041,6 +7041,7 @@ main(int argc, char **argv)
int ret = 0;
int i = 0;
char *cmdname;
char **newargv;
(void) setlocale(LC_ALL, "");
(void) textdomain(TEXT_DOMAIN);
@ -7095,17 +7096,26 @@ main(int argc, char **argv)
libzfs_print_on_error(g_zfs, B_TRUE);
/*
* Many commands modify input strings for string parsing reasons.
* We create a copy to protect the original argv.
*/
newargv = malloc((argc + 1) * sizeof (newargv[0]));
for (i = 0; i < argc; i++)
newargv[i] = strdup(argv[i]);
newargv[argc] = NULL;
/*
* Run the appropriate command.
*/
libzfs_mnttab_cache(g_zfs, B_TRUE);
if (find_command_idx(cmdname, &i) == 0) {
current_command = &command_table[i];
ret = command_table[i].func(argc - 1, argv + 1);
ret = command_table[i].func(argc - 1, newargv + 1);
} else if (strchr(cmdname, '=') != NULL) {
verify(find_command_idx("set", &i) == 0);
current_command = &command_table[i];
ret = command_table[i].func(argc, argv);
ret = command_table[i].func(argc, newargv);
} else {
(void) fprintf(stderr, gettext("unrecognized "
"command '%s'\n"), cmdname);
@ -7113,6 +7123,10 @@ main(int argc, char **argv)
ret = 1;
}
for (i = 0; i < argc; i++)
free(newargv[i]);
free(newargv);
if (ret == 0 && log_history)
(void) zpool_log_history(g_zfs, history_str);

View File

@ -525,10 +525,11 @@ run_one(cmd_args_t *args, uint32_t id, uint32_t T, uint32_t N,
memset(cmd, 0, cmd_size);
cmd->cmd_magic = ZPIOS_CMD_MAGIC;
strncpy(cmd->cmd_pool, args->pool, ZPIOS_NAME_SIZE - 1);
strncpy(cmd->cmd_pre, args->pre, ZPIOS_PATH_SIZE - 1);
strncpy(cmd->cmd_post, args->post, ZPIOS_PATH_SIZE - 1);
strncpy(cmd->cmd_log, args->log, ZPIOS_PATH_SIZE - 1);
snprintf(cmd->cmd_pool, sizeof (cmd->cmd_pool), "%s", args->pool);
snprintf(cmd->cmd_pre, sizeof (cmd->cmd_pre), "%s", args->pre);
snprintf(cmd->cmd_post, sizeof (cmd->cmd_post), "%s", args->post);
snprintf(cmd->cmd_log, sizeof (cmd->cmd_log), "%s", args->log);
cmd->cmd_id = id;
cmd->cmd_chunk_size = C;
cmd->cmd_thread_count = T;

View File

@ -3493,7 +3493,7 @@ single_histo_average(uint64_t *histo, unsigned int buckets)
static void
print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
nvlist_t *newnv, double scale)
nvlist_t *newnv)
{
int i;
uint64_t val;
@ -3523,7 +3523,7 @@ print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
format = ZFS_NICENUM_1024;
for (i = 0; i < ARRAY_SIZE(names); i++) {
val = nva[i].data[0] * scale;
val = nva[i].data[0];
print_one_stat(val, format, column_width, cb->cb_scripted);
}
@ -3532,7 +3532,7 @@ print_iostat_queues(iostat_cbdata_t *cb, nvlist_t *oldnv,
static void
print_iostat_latency(iostat_cbdata_t *cb, nvlist_t *oldnv,
nvlist_t *newnv, double scale)
nvlist_t *newnv)
{
int i;
uint64_t val;
@ -3562,7 +3562,7 @@ print_iostat_latency(iostat_cbdata_t *cb, nvlist_t *oldnv,
/* Print our avg latencies on the line */
for (i = 0; i < ARRAY_SIZE(names); i++) {
/* Compute average latency for a latency histo */
val = single_histo_average(nva[i].data, nva[i].count) * scale;
val = single_histo_average(nva[i].data, nva[i].count);
print_one_stat(val, format, column_width, cb->cb_scripted);
}
free_calc_stats(nva, ARRAY_SIZE(names));
@ -3701,9 +3701,9 @@ print_vdev_stats(zpool_handle_t *zhp, const char *name, nvlist_t *oldnv,
print_iostat_default(calcvs, cb, scale);
}
if (cb->cb_flags & IOS_LATENCY_M)
print_iostat_latency(cb, oldnv, newnv, scale);
print_iostat_latency(cb, oldnv, newnv);
if (cb->cb_flags & IOS_QUEUES_M)
print_iostat_queues(cb, oldnv, newnv, scale);
print_iostat_queues(cb, oldnv, newnv);
if (cb->cb_flags & IOS_ANYHISTO_M) {
printf("\n");
print_iostat_histos(cb, oldnv, newnv, scale, name);
@ -6226,7 +6226,8 @@ status_callback(zpool_handle_t *zhp, void *data)
&nvroot) == 0);
verify(nvlist_lookup_uint64_array(nvroot, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
health = zpool_state_to_name(vs->vs_state, vs->vs_aux);
health = zpool_get_state_str(zhp);
(void) printf(gettext(" pool: %s\n"), zpool_get_name(zhp));
(void) printf(gettext(" state: %s\n"), health);
@ -6395,6 +6396,15 @@ status_callback(zpool_handle_t *zhp, void *data)
"to be recovered.\n"));
break;
case ZPOOL_STATUS_IO_FAILURE_MMP:
(void) printf(gettext("status: The pool is suspended because "
"multihost writes failed or were delayed;\n\tanother "
"system could import the pool undetected.\n"));
(void) printf(gettext("action: Make sure the pool's devices "
"are connected, then reboot your system and\n\timport the "
"pool.\n"));
break;
case ZPOOL_STATUS_IO_FAILURE_WAIT:
case ZPOOL_STATUS_IO_FAILURE_CONTINUE:
(void) printf(gettext("status: One or more devices are "
@ -7961,6 +7971,7 @@ main(int argc, char **argv)
int ret = 0;
int i = 0;
char *cmdname;
char **newargv;
(void) setlocale(LC_ALL, "");
(void) textdomain(TEXT_DOMAIN);
@ -7995,16 +8006,25 @@ main(int argc, char **argv)
zfs_save_arguments(argc, argv, history_str, sizeof (history_str));
/*
* Many commands modify input strings for string parsing reasons.
* We create a copy to protect the original argv.
*/
newargv = malloc((argc + 1) * sizeof (newargv[0]));
for (i = 0; i < argc; i++)
newargv[i] = strdup(argv[i]);
newargv[argc] = NULL;
/*
* Run the appropriate command.
*/
if (find_command_idx(cmdname, &i) == 0) {
current_command = &command_table[i];
ret = command_table[i].func(argc - 1, argv + 1);
ret = command_table[i].func(argc - 1, newargv + 1);
} else if (strchr(cmdname, '=')) {
verify(find_command_idx("set", &i) == 0);
current_command = &command_table[i];
ret = command_table[i].func(argc, argv);
ret = command_table[i].func(argc, newargv);
} else if (strcmp(cmdname, "freeze") == 0 && argc == 3) {
/*
* 'freeze' is a vile debugging abomination, so we treat
@ -8021,6 +8041,10 @@ main(int argc, char **argv)
ret = 1;
}
for (i = 0; i < argc; i++)
free(newargv[i]);
free(newargv);
if (ret == 0 && log_history)
(void) zpool_log_history(g_zfs, history_str);

View File

@ -191,6 +191,7 @@ static vdev_disk_db_entry_t vdev_disk_database[] = {
{"ATA INTEL SSDSC2BP24", 4096},
{"ATA INTEL SSDSC2BP48", 4096},
{"NA SmrtStorSDLKAE9W", 4096},
{"NVMe Amazon EC2 NVMe ", 4096},
/* Imported from Open Solaris */
{"ATA MARVELL SD88SA02", 4096},
/* Advanced format Hard drives */

View File

@ -171,8 +171,8 @@ typedef struct ztest_shared_opts {
} ztest_shared_opts_t;
static const ztest_shared_opts_t ztest_opts_defaults = {
.zo_pool = { 'z', 't', 'e', 's', 't', '\0' },
.zo_dir = { '/', 't', 'm', 'p', '\0' },
.zo_pool = "ztest",
.zo_dir = "/tmp",
.zo_alt_ztest = { '\0' },
.zo_alt_libpath = { '\0' },
.zo_vdevs = 5,
@ -197,7 +197,8 @@ extern uint64_t metaslab_gang_bang;
extern uint64_t metaslab_df_alloc_threshold;
extern int metaslab_preload_limit;
extern boolean_t zfs_compressed_arc_enabled;
extern int zfs_abd_scatter_enabled;
extern int zfs_abd_scatter_enabled;
extern int dmu_object_alloc_chunk_shift;
static ztest_shared_opts_t *ztest_shared_opts;
static ztest_shared_opts_t ztest_opts;
@ -310,6 +311,7 @@ static ztest_shared_callstate_t *ztest_shared_callstate;
ztest_func_t ztest_dmu_read_write;
ztest_func_t ztest_dmu_write_parallel;
ztest_func_t ztest_dmu_object_alloc_free;
ztest_func_t ztest_dmu_object_next_chunk;
ztest_func_t ztest_dmu_commit_callbacks;
ztest_func_t ztest_zap;
ztest_func_t ztest_zap_parallel;
@ -357,6 +359,7 @@ ztest_info_t ztest_info[] = {
ZTI_INIT(ztest_dmu_read_write, 1, &zopt_always),
ZTI_INIT(ztest_dmu_write_parallel, 10, &zopt_always),
ZTI_INIT(ztest_dmu_object_alloc_free, 1, &zopt_always),
ZTI_INIT(ztest_dmu_object_next_chunk, 1, &zopt_sometimes),
ZTI_INIT(ztest_dmu_commit_callbacks, 1, &zopt_always),
ZTI_INIT(ztest_zap, 30, &zopt_always),
ZTI_INIT(ztest_zap_parallel, 100, &zopt_always),
@ -1186,7 +1189,7 @@ ztest_spa_prop_set_uint64(zpool_prop_t prop, uint64_t value)
*/
typedef struct {
list_node_t z_lnode;
refcount_t z_refcnt;
zfs_refcount_t z_refcnt;
uint64_t z_object;
zfs_rlock_t z_range_lock;
} ztest_znode_t;
@ -1202,7 +1205,7 @@ ztest_znode_init(uint64_t object)
ztest_znode_t *zp = umem_alloc(sizeof (*zp), UMEM_NOFAIL);
list_link_init(&zp->z_lnode);
refcount_create(&zp->z_refcnt);
zfs_refcount_create(&zp->z_refcnt);
zp->z_object = object;
zfs_rlock_init(&zp->z_range_lock);
@ -1212,10 +1215,10 @@ ztest_znode_init(uint64_t object)
static void
ztest_znode_fini(ztest_znode_t *zp)
{
ASSERT(refcount_is_zero(&zp->z_refcnt));
ASSERT(zfs_refcount_is_zero(&zp->z_refcnt));
zfs_rlock_destroy(&zp->z_range_lock);
zp->z_object = 0;
refcount_destroy(&zp->z_refcnt);
zfs_refcount_destroy(&zp->z_refcnt);
list_link_init(&zp->z_lnode);
umem_free(zp, sizeof (*zp));
}
@ -1245,13 +1248,13 @@ ztest_znode_get(ztest_ds_t *zd, uint64_t object)
for (zp = list_head(&zll->z_list); (zp);
zp = list_next(&zll->z_list, zp)) {
if (zp->z_object == object) {
refcount_add(&zp->z_refcnt, RL_TAG);
zfs_refcount_add(&zp->z_refcnt, RL_TAG);
break;
}
}
if (zp == NULL) {
zp = ztest_znode_init(object);
refcount_add(&zp->z_refcnt, RL_TAG);
zfs_refcount_add(&zp->z_refcnt, RL_TAG);
list_insert_head(&zll->z_list, zp);
}
mutex_exit(&zll->z_lock);
@ -1265,8 +1268,8 @@ ztest_znode_put(ztest_ds_t *zd, ztest_znode_t *zp)
ASSERT3U(zp->z_object, !=, 0);
zll = &zd->zd_range_lock[zp->z_object & (ZTEST_OBJECT_LOCKS - 1)];
mutex_enter(&zll->z_lock);
refcount_remove(&zp->z_refcnt, RL_TAG);
if (refcount_is_zero(&zp->z_refcnt)) {
zfs_refcount_remove(&zp->z_refcnt, RL_TAG);
if (zfs_refcount_is_zero(&zp->z_refcnt)) {
list_remove(&zll->z_list, zp);
ztest_znode_fini(zp);
}
@ -3927,6 +3930,26 @@ ztest_dmu_object_alloc_free(ztest_ds_t *zd, uint64_t id)
umem_free(od, size);
}
/*
* Rewind the global allocator to verify object allocation backfilling.
*/
void
ztest_dmu_object_next_chunk(ztest_ds_t *zd, uint64_t id)
{
objset_t *os = zd->zd_os;
int dnodes_per_chunk = 1 << dmu_object_alloc_chunk_shift;
uint64_t object;
/*
* Rewind the global allocator randomly back to a lower object number
* to force backfilling and reclamation of recently freed dnodes.
*/
mutex_enter(&os->os_obj_lock);
object = ztest_random(os->os_obj_next_chunk);
os->os_obj_next_chunk = P2ALIGN(object, dnodes_per_chunk);
mutex_exit(&os->os_obj_lock);
}
#undef OD_ARRAY_SIZE
#define OD_ARRAY_SIZE 2
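The arithmetic in `ztest_dmu_object_next_chunk` above comes down to picking a random object number below the current chunk pointer and rounding it down to a chunk boundary. A small Python sketch of that calculation follows. The helper names are hypothetical, locking is omitted, and the chunk shift of 7 used in the usage note is an assumed value for illustration.

```python
import random


def p2align(x, align):
    """Round x down to a multiple of align (align must be a power of two)."""
    return x & -align


def rewind_next_chunk(next_chunk, chunk_shift, rng=random):
    """Pick a random lower object number and align it to a chunk boundary."""
    dnodes_per_chunk = 1 << chunk_shift
    # Random value in [0, next_chunk); fall back to 0 for an empty range.
    obj = rng.randrange(next_chunk) if next_chunk > 0 else 0
    return p2align(obj, dnodes_per_chunk)
```

With a chunk shift of 7, `p2align(300, 128)` gives 256, so an allocator rewound to object 300 would resume scanning at the start of the chunk containing it.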

View File

@ -55,11 +55,12 @@ main(int argc, char **argv)
{
int fd, error = 0;
char zvol_name[ZFS_MAX_DATASET_NAME_LEN];
char zvol_name_part[ZFS_MAX_DATASET_NAME_LEN];
char *zvol_name_part = NULL;
char *dev_name;
struct stat64 statbuf;
int dev_minor, dev_part;
int i;
int rc;
if (argc < 2) {
printf("Usage: %s /dev/zvol_device_node\n", argv[0]);
@ -88,11 +89,13 @@ main(int argc, char **argv)
return (errno);
}
if (dev_part > 0)
snprintf(zvol_name_part, ZFS_MAX_DATASET_NAME_LEN,
"%s-part%d", zvol_name, dev_part);
rc = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
dev_part);
else
snprintf(zvol_name_part, ZFS_MAX_DATASET_NAME_LEN,
"%s", zvol_name);
rc = asprintf(&zvol_name_part, "%s", zvol_name);
if (rc == -1 || zvol_name_part == NULL)
goto error;
for (i = 0; i < strlen(zvol_name_part); i++) {
if (isblank(zvol_name_part[i]))
@ -100,6 +103,8 @@ main(int argc, char **argv)
}
printf("%s\n", zvol_name_part);
free(zvol_name_part);
error:
close(fd);
return (error);
}

View File

@ -0,0 +1,21 @@
dnl #
dnl # Linux 5.0: access_ok() drops 'type' parameter:
dnl #
dnl # - access_ok(type, addr, size)
dnl # + access_ok(addr, size)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ACCESS_OK_TYPE], [
AC_MSG_CHECKING([whether access_ok() has 'type' parameter])
ZFS_LINUX_TRY_COMPILE([
#include <linux/uaccess.h>
],[
const void __user __attribute__((unused)) *addr = (void *) 0xdeadbeef;
unsigned long __attribute__((unused)) size = 1;
int error __attribute__((unused)) = access_ok(0, addr, size);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ACCESS_OK_TYPE, 1, [kernel has access_ok with 'type' parameter])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -1,34 +0,0 @@
dnl #
dnl # 2.6.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS], [
AC_MSG_CHECKING([block device operation prototypes])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
int blk_open(struct block_device *bdev, fmode_t mode)
{ return 0; }
int blk_ioctl(struct block_device *bdev, fmode_t mode,
unsigned x, unsigned long y) { return 0; }
int blk_compat_ioctl(struct block_device * bdev, fmode_t mode,
unsigned x, unsigned long y) { return 0; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = blk_open,
.release = NULL,
.ioctl = blk_ioctl,
.compat_ioctl = blk_compat_ioctl,
};
],[
],[
AC_MSG_RESULT(struct block_device)
AC_DEFINE(HAVE_BDEV_BLOCK_DEVICE_OPERATIONS, 1,
[struct block_device_operations use bdevs])
],[
AC_MSG_RESULT(struct inode)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -1,10 +1,10 @@
dnl #
dnl # Linux 4.14 API,
dnl #
dnl # The bio_set_dev() helper was introduced as part of the transition
dnl # The bio_set_dev() helper macro was introduced as part of the transition
dnl # to have struct gendisk in struct bio.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV_MACRO], [
AC_MSG_CHECKING([whether bio_set_dev() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/bio.h>
@ -20,3 +20,34 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 5.0 API,
dnl #
dnl # The bio_set_dev() helper macro was updated to internally depend on
dnl # bio_associate_blkg() symbol which is exported GPL-only.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV_GPL_ONLY], [
AC_MSG_CHECKING([whether bio_set_dev() is GPL-only])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <linux/bio.h>
#include <linux/fs.h>
MODULE_LICENSE("$ZFS_META_LICENSE");
],[
struct block_device *bdev = NULL;
struct bio *bio = NULL;
bio_set_dev(bio, bdev);
],[
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_SET_DEV_GPL_ONLY, 1,
[bio_set_dev() GPL-only])
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
ZFS_AC_KERNEL_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_BIO_SET_DEV_GPL_ONLY
])

View File

@ -0,0 +1,38 @@
dnl #
dnl # API change
dnl # https://github.com/torvalds/linux/commit/8814ce8
dnl # Introduction of blk_queue_flag_set and blk_queue_flag_clear
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET], [
AC_MSG_CHECKING([whether blk_queue_flag_set() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <linux/blkdev.h>
],[
struct request_queue *q = NULL;
blk_queue_flag_set(0, q);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_FLAG_SET, 1, [blk_queue_flag_set() exists])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLAG_CLEAR], [
AC_MSG_CHECKING([whether blk_queue_flag_clear() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <linux/blkdev.h>
],[
struct request_queue *q = NULL;
blk_queue_flag_clear(0, q);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_FLAG_CLEAR, 1, [blk_queue_flag_clear() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -1,29 +0,0 @@
dnl #
dnl # 3.10.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
AC_MSG_CHECKING([whether block_device_operations.release is void])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = NULL,
.release = blk_release,
.ioctl = NULL,
.compat_ioctl = NULL,
};
],[
],[
AC_MSG_RESULT(void)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID, 1,
[struct block_device_operations.release returns void])
],[
AC_MSG_RESULT(int)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -0,0 +1,57 @@
dnl #
dnl # 2.6.38 API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
AC_MSG_CHECKING([whether bops->check_events() exists])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS, 1,
[bops->check_events() exists])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])
dnl #
dnl # 3.10.x API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
AC_MSG_CHECKING([whether bops->release() is void])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.open = NULL,
.release = blk_release,
.ioctl = NULL,
.compat_ioctl = NULL,
};
],[
],[
AC_MSG_RESULT(void)
AC_DEFINE(HAVE_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID, 1,
[bops->release() returns void])
],[
AC_MSG_RESULT(int)
])
EXTRA_KCFLAGS="$tmp_flags"
])

@ -1,15 +1,14 @@
dnl #
dnl # 4.9, current_time() added
dnl # 4.18, return type changed from timespec to timespec64
dnl #
AC_DEFUN([ZFS_AC_KERNEL_CURRENT_TIME],
[AC_MSG_CHECKING([whether current_time() exists])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
struct inode ip;
struct timespec now __attribute__ ((unused));
now = current_time(&ip);
struct inode ip __attribute__ ((unused));
ip.i_atime = current_time(&ip);
], [current_time], [fs/inode.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CURRENT_TIME, 1, [current_time() exists])

@ -1,6 +1,6 @@
dnl #
dnl # 2.6.36 API change
dnl # Verify the elevator_change() symbol is available.
dnl # 2.6.36 API, exported elevator_change() symbol
dnl # 4.12 API, removed elevator_change() symbol
dnl #
AC_DEFUN([ZFS_AC_KERNEL_ELEVATOR_CHANGE], [
AC_MSG_CHECKING([whether elevator_change() is available])

@ -1,18 +1,41 @@
dnl #
dnl # Handle differences in kernel FPU code.
dnl #
dnl # 4.2 API change
dnl # asm/i387.h is replaced by asm/fpu/api.h
dnl # Kernel
dnl # 5.0: All kernel fpu functions are GPL only, so we can't use them.
dnl # (nothing defined)
dnl #
dnl # 4.2: Use __kernel_fpu_{begin,end}()
dnl # HAVE_UNDERSCORE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
dnl # Pre-4.2: Use kernel_fpu_{begin,end}()
dnl # HAVE_KERNEL_FPU & KERNEL_EXPORTS_X86_FPU
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FPU], [
AC_MSG_CHECKING([whether asm/fpu/api.h exists])
AC_MSG_CHECKING([which kernel_fpu function to use])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <asm/fpu/api.h>
#include <asm/i387.h>
#include <asm/xcr.h>
],[
__kernel_fpu_begin();
kernel_fpu_begin();
kernel_fpu_end();
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FPU_API_H, 1, [kernel has <asm/fpu/api.h> interface])
AC_MSG_RESULT(kernel_fpu_*)
AC_DEFINE(HAVE_KERNEL_FPU, 1, [kernel has kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1, [kernel exports FPU functions])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
#include <asm/fpu/api.h>
],[
__kernel_fpu_begin();
__kernel_fpu_end();
],[
AC_MSG_RESULT(__kernel_fpu_*)
AC_DEFINE(HAVE_UNDERSCORE_KERNEL_FPU, 1, [kernel has __kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1, [kernel exports FPU functions])
],[
AC_MSG_RESULT(not exported)
])
])
])

@ -0,0 +1,28 @@
dnl #
dnl # 2.6.38 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FST_MOUNT], [
AC_MSG_CHECKING([whether fst->mount() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
static struct dentry *
mount(struct file_system_type *fs_type, int flags,
const char *osname, void *data) {
struct dentry *d = NULL;
return (d);
}
static struct file_system_type fst __attribute__ ((unused)) = {
.mount = mount,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FST_MOUNT, 1, [fst->mount() exists])
],[
AC_MSG_RESULT(no)
])
])

@ -0,0 +1,20 @@
dnl #
dnl # 4.5 API change
dnl # Added in_compat_syscall() which can be overridden on a per-
dnl # architecture basis. Prior to this is_compat_task() was the
dnl # provided interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_IN_COMPAT_SYSCALL], [
AC_MSG_CHECKING([whether in_compat_syscall() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/compat.h>
],[
in_compat_syscall();
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IN_COMPAT_SYSCALL, 1,
[in_compat_syscall() is available])
],[
AC_MSG_RESULT(no)
])
])

@ -0,0 +1,26 @@
dnl #
dnl # Determine an available miscellaneous minor number which can be used
dnl # for the /dev/zfs device. This is needed because kernel module
dnl # auto-loading depends on registering a reserved non-conflicting minor
dnl # number. Start with a large known available unreserved minor and work
dnl # our way down to a lower value if a collision is detected.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_MISC_MINOR], [
AC_MSG_CHECKING([for available /dev/zfs minor])
for i in $(seq 249 -1 200); do
if ! grep -q "^#define\s\+.*_MINOR\s\+.*$i" \
${LINUX}/include/linux/miscdevice.h; then
ZFS_DEVICE_MINOR="$i"
AC_MSG_RESULT($ZFS_DEVICE_MINOR)
AC_DEFINE_UNQUOTED([ZFS_DEVICE_MINOR],
[$ZFS_DEVICE_MINOR], [/dev/zfs minor])
break
fi
done
AS_IF([ test -z "$ZFS_DEVICE_MINOR"], [
AC_MSG_ERROR([
*** No misc minor numbers available for use.])
])
])
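
Review note: the minor-number search can be tried outside of configure. This is a minimal shell sketch, not the macro itself. `find_free_minor` is a hypothetical helper name, the header path is passed as an argument instead of being derived from `${LINUX}`, and the pattern is anchored more tightly than the original so that a value such as 1249 does not count as 249.

```shell
#!/bin/sh
# Hypothetical helper: print the highest minor in 249..200 that the given
# header does not reserve through a "#define ..._MINOR <n>" line.
# Returns 1 if every candidate is taken.
find_free_minor() {
	header=$1
	i=249
	while [ "$i" -ge 200 ]; do
		# Assumption: match the number exactly, not as a substring.
		if ! grep -Eq "^#define[[:space:]]+[A-Za-z0-9_]*_MINOR[[:space:]]+$i([^0-9]|\$)" "$header"; then
			echo "$i"
			return 0
		fi
		i=$((i - 1))
	done
	return 1
}
```

Against a real tree this would presumably be called as `find_free_minor "$LINUX/include/linux/miscdevice.h"`.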

@ -1,20 +0,0 @@
dnl #
dnl # 2.6.39 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure. When using the new
dnl # interface the caller must now use the mount_nodev() helper.
dnl # This updated callback and helper no longer pass the vfsmount.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_MOUNT_NODEV],
[AC_MSG_CHECKING([whether mount_nodev() is available])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
mount_nodev(NULL, 0, NULL, NULL);
], [mount_nodev], [fs/super.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MOUNT_NODEV, 1, [mount_nodev() is available])
], [
AC_MSG_RESULT(no)
])
])

@ -23,16 +23,27 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
dnl #
dnl # 3.11 API change
dnl #
dnl # RHEL 7.5 compatibility; the fops.iterate() method was
dnl # added to the file_operations structure but in order to
dnl # maintain KABI compatibility all callers must set
dnl # FMODE_KABI_ITERATE which is checked in iterate_dir().
dnl # When detected, ignore this interface and fall back to
dnl # using fops.readdir() to retain KABI compatibility.
dnl #
AC_MSG_CHECKING([whether fops->iterate() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int iterate(struct file *filp, struct dir_context * context)
{ return 0; }
int iterate(struct file *filp,
struct dir_context *context) { return 0; }
static const struct file_operations fops
__attribute__ ((unused)) = {
.iterate = iterate,
};
#if defined(FMODE_KABI_ITERATE)
#error "RHEL 7.5, FMODE_KABI_ITERATE interface"
#endif
],[
],[
AC_MSG_RESULT(yes)
@ -44,8 +55,8 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
AC_MSG_CHECKING([whether fops->readdir() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int readdir(struct file *filp, void *entry, filldir_t func)
{ return 0; }
int readdir(struct file *filp, void *entry,
filldir_t func) { return 0; }
static const struct file_operations fops
__attribute__ ((unused)) = {
@ -57,7 +68,7 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_ITERATE], [
AC_DEFINE(HAVE_VFS_READDIR, 1,
[fops->readdir() is available])
],[
AC_MSG_ERROR(no; file a bug report with ZFSOnLinux)
AC_MSG_ERROR(no; file a bug report with ZoL)
])
])
])

@ -5,14 +5,16 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL
ZFS_AC_SPL
ZFS_AC_QAT
ZFS_AC_KERNEL_ACCESS_OK_TYPE
ZFS_AC_TEST_MODULE
ZFS_AC_KERNEL_MISC_MINOR
ZFS_AC_KERNEL_OBJTOOL
ZFS_AC_KERNEL_CONFIG
ZFS_AC_KERNEL_DECLARE_EVENT_CLASS
ZFS_AC_KERNEL_CURRENT_BIO_TAIL
ZFS_AC_KERNEL_SUPER_USER_NS
ZFS_AC_KERNEL_SUBMIT_BIO
ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_TYPE_FMODE_T
ZFS_AC_KERNEL_3ARG_BLKDEV_GET
@ -35,6 +37,8 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_BIO_RW_BARRIER
ZFS_AC_KERNEL_BIO_RW_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_BDI
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_CLEAR
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET
ZFS_AC_KERNEL_BLK_QUEUE_FLUSH
ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS
@ -100,7 +104,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_6ARGS_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_CALLBACK_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_MOUNT_NODEV
ZFS_AC_KERNEL_FST_MOUNT
ZFS_AC_KERNEL_SHRINK
ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_S_INSTANCES_LIST_HEAD
@ -127,6 +131,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_GLOBAL_PAGE_STATE
ZFS_AC_KERNEL_ACL_HAS_REFCOUNT
ZFS_AC_KERNEL_USERNS_CAPABILITIES
ZFS_AC_KERNEL_IN_COMPAT_SYSCALL
AS_IF([test "$LINUX_OBJ" != "$LINUX"], [
KERNELMAKE_PARAMS="$KERNELMAKE_PARAMS O=$LINUX_OBJ"
@ -252,7 +257,7 @@ AC_DEFUN([ZFS_AC_KERNEL], [
AS_IF([test "$utsrelease"], [
kernsrcver=`(echo "#include <$utsrelease>";
echo "kernsrcver=UTS_RELEASE") |
cpp -I $kernelbuild/include |
${CPP} -I $kernelbuild/include - |
grep "^kernsrcver=" | cut -d \" -f 2`
AS_IF([test -z "$kernsrcver"], [

config/user-libaio.m4 Normal file

@ -0,0 +1,14 @@
dnl #
dnl # Check for libaio - only used for libaio test cases.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBAIO], [
LIBAIO=
AC_CHECK_HEADER([libaio.h], [
user_libaio=yes
AC_SUBST([LIBAIO], ["-laio"])
AC_DEFINE([HAVE_LIBAIO], 1, [Define if you have libaio])
], [
user_libaio=no
])
])

@ -1,12 +0,0 @@
dnl #
dnl # Check for libattr
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBATTR], [
LIBATTR=
AC_CHECK_HEADER([attr/xattr.h], [], [AC_MSG_FAILURE([
*** attr/xattr.h missing, libattr-devel package required])])
AC_SUBST([LIBATTR], ["-lattr"])
AC_DEFINE([HAVE_LIBATTR], 1, [Define if you have libattr])
])

@ -11,9 +11,9 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBUUID
ZFS_AC_CONFIG_USER_LIBTIRPC
ZFS_AC_CONFIG_USER_LIBBLKID
ZFS_AC_CONFIG_USER_LIBATTR
ZFS_AC_CONFIG_USER_LIBUDEV
ZFS_AC_CONFIG_USER_FRAME_LARGER_THAN
ZFS_AC_CONFIG_USER_LIBAIO
ZFS_AC_CONFIG_USER_RUNSTATEDIR
ZFS_AC_CONFIG_USER_MAKEDEV_IN_SYSMACROS
ZFS_AC_CONFIG_USER_MAKEDEV_IN_MKDEV

@ -10,7 +10,6 @@ AC_DEFUN([ZFS_AC_DEBUG_ENABLE], [
KERNELCPPFLAGS="${KERNELCPPFLAGS} -DDEBUG -Werror"
HOSTCFLAGS="${HOSTCFLAGS} -DDEBUG -Werror"
DEBUG_CFLAGS="-DDEBUG -Werror"
DEBUG_STACKFLAGS="-fstack-check"
DEBUG_ZFS="_with_debug"
AC_DEFINE(ZFS_DEBUG, 1, [zfs debugging enabled])
])
@ -118,11 +117,11 @@ AC_DEFUN([ZFS_AC_CONFIG], [
AM_CONDITIONAL([CONFIG_KERNEL],
[test "$ZFS_CONFIG" = kernel -o "$ZFS_CONFIG" = all] &&
[test "x$enable_linux_builtin" != xyes ])
AM_CONDITIONAL([WANT_DEVNAME2DEVID],
[test "x$user_libudev" = xyes ])
AM_CONDITIONAL([CONFIG_QAT],
[test "$ZFS_CONFIG" = kernel -o "$ZFS_CONFIG" = all] &&
[test "x$qatsrc" != x ])
AM_CONDITIONAL([WANT_DEVNAME2DEVID], [test "x$user_libudev" = xyes ])
AM_CONDITIONAL([WANT_MMAP_LIBAIO], [test "x$user_libaio" = xyes ])
])
dnl #
@ -160,7 +159,27 @@ AC_DEFUN([ZFS_AC_RPM], [
])
RPM_DEFINE_COMMON='--define "$(DEBUG_ZFS) 1"'
RPM_DEFINE_UTIL='--define "_dracutdir $(dracutdir)" --define "_udevdir $(udevdir)" --define "_udevruledir $(udevruledir)" --define "_initconfdir $(DEFAULT_INITCONF_DIR)" $(DEFINE_INITRAMFS) $(DEFINE_SYSTEMD)'
RPM_DEFINE_UTIL=' --define "_initconfdir $(DEFAULT_INITCONF_DIR)"'
dnl # Make the next three RPM_DEFINE_UTIL additions conditional, since
dnl # their values may not be set when running:
dnl #
dnl # ./configure --with-config=srpm
dnl #
AS_IF([test -n "$dracutdir" ], [
RPM_DEFINE_UTIL+=' --define "_dracutdir $(dracutdir)"'
])
AS_IF([test -n "$udevdir" ], [
RPM_DEFINE_UTIL+=' --define "_udevdir $(udevdir)"'
])
AS_IF([test -n "$udevruledir" ], [
RPM_DEFINE_UTIL+=' --define "_udevruledir $(udevruledir)"'
])
RPM_DEFINE_UTIL+=' $(DEFINE_INITRAMFS)'
RPM_DEFINE_UTIL+=' $(DEFINE_SYSTEMD)'
RPM_DEFINE_KMOD='--define "kernels $(LINUX_VERSION)" --define "require_spldir $(SPL)" --define "require_splobj $(SPL_OBJ)" --define "ksrc $(LINUX)" --define "kobj $(LINUX_OBJ)"'
RPM_DEFINE_KMOD+=' --define "_wrong_version_format_terminate_build 0"'
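
Review note: the conditional assembly of `RPM_DEFINE_UTIL` can be exercised on its own. The sketch below is an illustration under assumptions: `build_rpm_define_util` is a hypothetical function name, it uses portable string concatenation rather than `+=`, and the `$(...)` parts are kept in single quotes so they stay literal for make to expand later.

```shell
#!/bin/sh
# Hypothetical helper: print the rpmbuild --define arguments, adding the
# directory macros only when the corresponding shell variable is set.
build_rpm_define_util() {
	defs=' --define "_initconfdir $(DEFAULT_INITCONF_DIR)"'
	if [ -n "$dracutdir" ]; then
		defs="$defs"' --define "_dracutdir $(dracutdir)"'
	fi
	if [ -n "$udevdir" ]; then
		defs="$defs"' --define "_udevdir $(udevdir)"'
	fi
	if [ -n "$udevruledir" ]; then
		defs="$defs"' --define "_udevruledir $(udevruledir)"'
	fi
	defs="$defs"' $(DEFINE_INITRAMFS) $(DEFINE_SYSTEMD)'
	printf '%s\n' "$defs"
}
```

Every branch appends to the string, so the `_initconfdir` definition survives regardless of which directories are known.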

@ -122,6 +122,9 @@ AC_CONFIG_FILES([
contrib/dracut/02zfsexpandknowledge/Makefile
contrib/dracut/90zfs/Makefile
contrib/initramfs/Makefile
contrib/initramfs/hooks/Makefile
contrib/initramfs/scripts/Makefile
contrib/initramfs/scripts/local-top/Makefile
module/Makefile
module/avl/Makefile
module/nvpair/Makefile
@ -165,6 +168,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/cmd/mkfiles/Makefile
tests/zfs-tests/cmd/mktree/Makefile
tests/zfs-tests/cmd/mmap_exec/Makefile
tests/zfs-tests/cmd/mmap_libaio/Makefile
tests/zfs-tests/cmd/mmapwrite/Makefile
tests/zfs-tests/cmd/randfree_file/Makefile
tests/zfs-tests/cmd/readmmap/Makefile
@ -237,6 +241,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/Makefile
tests/zfs-tests/tests/functional/cli_user/zpool_list/Makefile
tests/zfs-tests/tests/functional/compression/Makefile
tests/zfs-tests/tests/functional/cp_files/Makefile
tests/zfs-tests/tests/functional/ctime/Makefile
tests/zfs-tests/tests/functional/delegate/Makefile
tests/zfs-tests/tests/functional/devices/Makefile
@ -251,6 +256,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/history/Makefile
tests/zfs-tests/tests/functional/inheritance/Makefile
tests/zfs-tests/tests/functional/inuse/Makefile
tests/zfs-tests/tests/functional/kstat/Makefile
tests/zfs-tests/tests/functional/large_files/Makefile
tests/zfs-tests/tests/functional/largest_pool/Makefile
tests/zfs-tests/tests/functional/link_count/Makefile

@ -24,6 +24,7 @@ $(pkgdracut_SCRIPTS):%:%.in
-e 's,@udevruledir\@,$(udevruledir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
-e 's,@systemdunitdir\@,$(systemdunitdir),g' \
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
distclean-local::

@ -5,7 +5,7 @@ check() {
[ "${1}" = "-d" ] && return 0
# Verify the zfs tool chain
for tool in "@sbindir@/zpool" "@sbindir@/zfs" "@sbindir@/mount.zfs" ; do
for tool in "@sbindir@/zpool" "@sbindir@/zfs" "@mounthelperdir@/mount.zfs" ; do
test -x "$tool" || return 1
done
# Verify grep exists
@ -53,7 +53,7 @@ install() {
# Fallback: Guess the path and include all matches
dracut_install /usr/lib/gcc/*/*/libgcc_s.so*
fi
dracut_install @sbindir@/mount.zfs
dracut_install @mounthelperdir@/mount.zfs
dracut_install @udevdir@/vdev_id
dracut_install awk
dracut_install head

@ -34,6 +34,7 @@ info "ZFS: No sysroot.mount exists or zfs-generator did not extend it."
info "ZFS: Mounting root with the traditional mount-zfs.sh instead."
# Delay until all required block devices are present.
modprobe zfs 2>/dev/null
udevadm settle
if [ "${root}" = "zfs:AUTO" ] ; then

@ -3,12 +3,11 @@ initrddir = $(datarootdir)/initramfs-tools
initrd_SCRIPTS = \
conf.d/zfs conf-hooks.d/zfs hooks/zfs scripts/zfs scripts/local-top/zfs
SUBDIRS = hooks scripts
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/conf.d/zfs \
$(top_srcdir)/contrib/initramfs/conf-hooks.d/zfs \
$(top_srcdir)/contrib/initramfs/hooks/zfs \
$(top_srcdir)/contrib/initramfs/scripts/zfs \
$(top_srcdir)/contrib/initramfs/scripts/local-top/zfs \
$(top_srcdir)/contrib/initramfs/README.initramfs.markdown
install-initrdSCRIPTS: $(EXTRA_DIST)

contrib/initramfs/hooks/.gitignore vendored Normal file

@ -0,0 +1 @@
zfs

@ -0,0 +1,21 @@
hooksdir = $(datarootdir)/initramfs-tools/hooks
hooks_SCRIPTS = \
zfs
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/hooks/zfs.in
$(hooks_SCRIPTS):%:%.in
-$(SED) -e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
-e 's,@udevdir\@,$(udevdir),g' \
-e 's,@udevruledir\@,$(udevruledir),g' \
-e 's,@mounthelperdir\@,$(mounthelperdir),g' \
$< >'$@'
clean-local::
-$(RM) $(hooks_SCRIPTS)
distclean-local::
-$(RM) $(hooks_SCRIPTS)
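
Review note: the `%:%.in` rule is a plain placeholder substitution. A minimal shell sketch of the same idea follows. `expand_template` is a hypothetical name, and the default directory values are stand-ins for whatever configure would provide. The Makefile writes `@sbindir\@` presumably to shield the pattern from configure's own substitution; that escaping is not needed here.

```shell
#!/bin/sh
# Hypothetical stand-ins for the values configure would normally supply.
sbindir=${sbindir:-/sbin}
sysconfdir=${sysconfdir:-/etc}
mounthelperdir=${mounthelperdir:-/sbin}

# Replace @name@ placeholders read from stdin and write the result to stdout.
# Assumption: the directory values contain no commas, since ',' is the
# sed delimiter, as in the Makefile rule.
expand_template() {
	sed -e "s,@sbindir@,$sbindir,g" \
	    -e "s,@sysconfdir@,$sysconfdir,g" \
	    -e "s,@mounthelperdir@,$mounthelperdir,g"
}
```

Usage would look like `expand_template < zfs.in > zfs`.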

@ -8,11 +8,13 @@ PREREQ="zdev"
# These prerequisites are provided by the zfsutils package. The zdb utility is
# not strictly required, but it can be useful at the initramfs recovery prompt.
COPY_EXEC_LIST="/sbin/zdb /sbin/zpool /sbin/zfs /sbin/mount.zfs"
COPY_EXEC_LIST="$COPY_EXEC_LIST /usr/bin/dirname /lib/udev/vdev_id"
COPY_FILE_LIST="/etc/hostid /etc/zfs/zpool.cache /etc/default/zfs"
COPY_FILE_LIST="$COPY_FILE_LIST /etc/zfs/zfs-functions /etc/zfs/vdev_id.conf"
COPY_FILE_LIST="$COPY_FILE_LIST /lib/udev/rules.d/69-vdev.rules"
COPY_EXEC_LIST="@sbindir@/zdb @sbindir@/zpool @sbindir@/zfs"
COPY_EXEC_LIST="$COPY_EXEC_LIST @mounthelperdir@/mount.zfs @udevdir@/vdev_id"
COPY_FILE_LIST="/etc/hostid @sysconfdir@/zfs/zpool.cache"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/default/zfs"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/zfs/zfs-functions"
COPY_FILE_LIST="$COPY_FILE_LIST @sysconfdir@/zfs/vdev_id.conf"
COPY_FILE_LIST="$COPY_FILE_LIST @udevruledir@/69-vdev.rules"
# These prerequisites are provided by the base system.
COPY_EXEC_LIST="$COPY_EXEC_LIST /usr/bin/dirname /bin/hostname /sbin/blkid"
@ -83,7 +85,7 @@ else
fi
for ii in zfs zfs.conf spl spl.conf
do
do
if [ -f "/etc/modprobe.d/$ii" ]; then
if [ ! -d "$DESTDIR/etc/modprobe.d" ]; then
mkdir -p $DESTDIR/etc/modprobe.d

contrib/initramfs/scripts/.gitignore vendored Normal file

@ -0,0 +1 @@
zfs

@ -0,0 +1,20 @@
scriptsdir = $(datarootdir)/initramfs-tools/scripts
scripts_SCRIPTS = \
zfs
SUBDIRS = local-top
EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/scripts/zfs.in
$(scripts_SCRIPTS):%:%.in
-$(SED) -e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
clean-local::
-$(RM) $(scripts_SCRIPTS)
distclean-local::
-$(RM) $(scripts_SCRIPTS)

@ -0,0 +1,3 @@
localtopdir = $(datarootdir)/initramfs-tools/scripts/local-top
EXTRA_DIST = zfs

@ -11,9 +11,9 @@
# Paths to what we need - in the initrd, these paths are hardcoded,
# so override the defines in zfs-functions.
ZFS="/sbin/zfs"
ZPOOL="/sbin/zpool"
ZPOOL_CACHE="/etc/zfs/zpool.cache"
ZFS="@sbindir@/zfs"
ZPOOL="@sbindir@/zpool"
ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
export ZFS ZPOOL ZPOOL_CACHE
# This runs any scripts that should run before we start importing
@ -193,7 +193,7 @@ import_pool()
# Verify that the pool isn't already imported
# Make as sure as we can to not require '-f' to import.
"${ZPOOL}" status "$pool" > /dev/null 2>&1 && return 0
"${ZPOOL}" get name,guid -o value -H 2>/dev/null | grep -Fxq "$pool" && return 0
# For backwards compatibility, make sure that ZPOOL_IMPORT_PATH is set
# to something we can use later with the real import(s). We want to
@ -616,7 +616,7 @@ setup_snapshot_booting()
# Separate the full snapshot ('$snap') into its filesystem and
# snapshot names. Would have been nice with a split() function..
rootfs="${snap%%@*}"
snapname="${snap##*@}"
snapname="${snap##*@}"
ZFS_BOOTFS="${rootfs}_${snapname}"
if ! grep -qiE '(^|[^\\](\\\\)* )(rollback)=(on|yes|1)( |$)' /proc/cmdline
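
Review note: the snapshot split in this hunk relies only on POSIX parameter expansion. A small sketch shows the effect. `snapshot_clone_name` is a hypothetical helper name and the dataset in the usage line is made up.

```shell
#!/bin/sh
# Hypothetical helper: given "filesystem@snapname", print the
# "<filesystem>_<snapname>" name that the script stores in ZFS_BOOTFS.
snapshot_clone_name() {
	snap=$1
	rootfs="${snap%%@*}"     # strip the longest suffix starting at '@'
	snapname="${snap##*@}"   # strip the longest prefix ending at '@'
	printf '%s_%s\n' "$rootfs" "$snapname"
}
```

For example, `snapshot_clone_name rpool/ROOT/debian@before-upgrade` prints `rpool/ROOT/debian_before-upgrade`.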
@ -772,6 +772,7 @@ mountroot()
# root=zfs:<pool>/<dataset> (uses this for rpool - first part, without 'zfs:')
#
# Option <dataset> could also be <snapshot>
# Option <pool> could also be <guid>
# ------------
# Support force option
@ -889,6 +890,14 @@ mountroot()
/bin/sh
fi
# In case the pool was specified as guid, resolve guid to name
pool="$("${ZPOOL}" get name,guid -o name,value -H | \
awk -v pool="${ZFS_RPOOL}" '$2 == pool { print $1 }')"
if [ -n "$pool" ]; then
ZFS_BOOTFS="${pool}/${ZFS_BOOTFS#*/}"
ZFS_RPOOL="${pool}"
fi
# Set elevator=noop on the root pool's vdevs' disks. ZFS already
# does this for wholedisk vdevs (for all pools), so this is only
# important for partitions.
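
Review note: the guid-to-name resolution added to mountroot() can be checked without a pool. The sketch below reads rows in the `name<TAB>value` shape that `zpool get name,guid -o name,value -H` is expected to print, so the listing can be faked in a test. `resolve_pool` and `rebase_bootfs` are hypothetical helper names, not functions from the script.

```shell
#!/bin/sh
# Hypothetical helper: read "pool<TAB>value" rows on stdin and print the
# pool whose name or guid equals $1.
resolve_pool() {
	awk -v pool="$1" '$2 == pool { print $1 }'
}

# Hypothetical helper: swap the pool component of a dataset path.
# Usage: rebase_bootfs <pool> <bootfs>
rebase_bootfs() {
	printf '%s/%s\n' "$1" "${2#*/}"
}
```

Each pool contributes one row for its name and one for its guid, so the same filter returns the pool name whichever form was given on the kernel command line.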

etc/init.d/zfs-import.in Executable file → Normal file
etc/init.d/zfs-mount.in Executable file → Normal file
etc/init.d/zfs-share.in Executable file → Normal file
etc/init.d/zfs-zed.in Executable file → Normal file

@ -1,3 +1,3 @@
# Always load kernel modules at boot. The default behavior is to load the
# kernel modules in the zfs-import-*.service or when blkid(8) detects a pool.
# The default behavior is to allow udev to load the kernel modules on demand.
# Uncomment the following line to unconditionally load them at boot.
#zfs

@ -12,7 +12,6 @@ ConditionPathExists=@sysconfdir@/zfs/zpool.cache
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-/sbin/modprobe zfs
ExecStart=@sbindir@/zpool import -c @sysconfdir@/zfs/zpool.cache -aN
[Install]

@ -11,7 +11,6 @@ ConditionPathExists=!@sysconfdir@/zfs/zpool.cache
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=-/sbin/modprobe zfs
ExecStart=@sbindir@/zpool import -aN -o cachefile=none
[Install]

@ -4,6 +4,7 @@ pkgsysconf_DATA = \
vdev_id.conf.alias.example \
vdev_id.conf.sas_direct.example \
vdev_id.conf.sas_switch.example \
vdev_id.conf.multipath.example
vdev_id.conf.multipath.example \
vdev_id.conf.scsi.example
EXTRA_DIST = $(pkgsysconf_DATA)

@ -2,6 +2,9 @@ multipath no
topology sas_direct
phys_per_port 4
# Additionally create /dev/by-enclosure/ symlinks for enclosure devices
enclosure_symlinks yes
# PCI_ID HBA PORT CHANNEL NAME
channel 85:00.0 1 A
channel 85:00.0 0 B

@ -0,0 +1,9 @@
multipath no
topology scsi
phys_per_port 1
# Usually scsi disks are numbered from 0, but this can be offset, to
# match the physical bay numbers, as follows:
first_bay_number 1
# PCI_ID HBA PORT CHANNEL NAME
channel 0c:00.0 0 Y

@ -296,6 +296,8 @@ int zfs_dev_is_whole_disk(char *dev_name);
char *zfs_get_underlying_path(char *dev_name);
char *zfs_get_enclosure_sysfs_path(char *dev_name);
const char *zpool_get_state_str(zpool_handle_t *);
/*
* Functions to manage pool properties
*/
@ -331,6 +333,7 @@ typedef enum {
ZPOOL_STATUS_HOSTID_REQUIRED, /* multihost=on and hostid=0 */
ZPOOL_STATUS_IO_FAILURE_WAIT, /* failed I/O, failmode 'wait' */
ZPOOL_STATUS_IO_FAILURE_CONTINUE, /* failed I/O, failmode 'continue' */
ZPOOL_STATUS_IO_FAILURE_MMP, /* failed MMP, failmode not 'panic' */
ZPOOL_STATUS_BAD_LOG, /* cannot read log chain(s) */
ZPOOL_STATUS_ERRATA, /* informational errata available */

@ -32,11 +32,28 @@
#include <linux/blkdev.h>
#include <linux/elevator.h>
#include <linux/backing-dev.h>
#include <linux/msdos_fs.h> /* for SECTOR_* */
#ifndef HAVE_FMODE_T
typedef unsigned __bitwise__ fmode_t;
#endif /* HAVE_FMODE_T */
#ifndef HAVE_BLK_QUEUE_FLAG_SET
static inline void
blk_queue_flag_set(unsigned int flag, struct request_queue *q)
{
queue_flag_set(flag, q);
}
#endif
#ifndef HAVE_BLK_QUEUE_FLAG_CLEAR
static inline void
blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
{
queue_flag_clear(flag, q);
}
#endif
/*
* 4.7 - 4.x API,
* The blk_queue_write_cache() interface has replaced blk_queue_flush()
@ -56,16 +73,14 @@ static inline void
blk_queue_set_write_cache(struct request_queue *q, bool wc, bool fua)
{
#if defined(HAVE_BLK_QUEUE_WRITE_CACHE_GPL_ONLY)
spin_lock_irq(q->queue_lock);
if (wc)
queue_flag_set(QUEUE_FLAG_WC, q);
blk_queue_flag_set(QUEUE_FLAG_WC, q);
else
queue_flag_clear(QUEUE_FLAG_WC, q);
blk_queue_flag_clear(QUEUE_FLAG_WC, q);
if (fua)
queue_flag_set(QUEUE_FLAG_FUA, q);
blk_queue_flag_set(QUEUE_FLAG_FUA, q);
else
queue_flag_clear(QUEUE_FLAG_FUA, q);
spin_unlock_irq(q->queue_lock);
blk_queue_flag_clear(QUEUE_FLAG_FUA, q);
#elif defined(HAVE_BLK_QUEUE_WRITE_CACHE)
blk_queue_write_cache(q, wc, fua);
#elif defined(HAVE_BLK_QUEUE_FLUSH_GPL_ONLY)
@ -90,17 +105,6 @@ blk_queue_set_write_cache(struct request_queue *q, bool wc, bool fua)
#define blk_fs_request(rq) ((rq)->cmd_type == REQ_TYPE_FS)
#endif
/*
* 2.6.27 API change,
* The blk_queue_stackable() queue flag was added in 2.6.27 to handle dm
* stacking drivers. Prior to this request stacking drivers were detected
* by checking (q->request_fn == NULL), for earlier kernels we revert to
* this legacy behavior.
*/
#ifndef blk_queue_stackable
#define blk_queue_stackable(q) ((q)->request_fn == NULL)
#endif
/*
* 2.6.34 API change,
* The blk_queue_max_hw_sectors() function replaces blk_queue_max_sectors().

@ -27,6 +27,7 @@
#define _ZFS_KMAP_H
#include <linux/highmem.h>
#include <linux/uaccess.h>
#ifdef HAVE_1ARG_KMAP_ATOMIC
/* 2.6.37 API change */
@ -37,4 +38,11 @@
#define zfs_kunmap_atomic(addr, km_type) kunmap_atomic(addr, km_type)
#endif
/* 5.0 API change - no more 'type' argument for access_ok() */
#ifdef HAVE_ACCESS_OK_TYPE
#define zfs_access_ok(type, addr, size) access_ok(type, addr, size)
#else
#define zfs_access_ok(type, addr, size) access_ok(addr, size)
#endif
#endif /* _ZFS_KMAP_H */

@ -81,7 +81,7 @@
#endif
#if defined(_KERNEL)
#if defined(HAVE_FPU_API_H)
#if defined(HAVE_UNDERSCORE_KERNEL_FPU)
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
#define kfpu_begin() \
@ -94,12 +94,18 @@
__kernel_fpu_end(); \
preempt_enable(); \
}
#else
#elif defined(HAVE_KERNEL_FPU)
#include <asm/i387.h>
#include <asm/xcr.h>
#define kfpu_begin() kernel_fpu_begin()
#define kfpu_end() kernel_fpu_end()
#endif /* defined(HAVE_FPU_API_H) */
#else
/* Kernel doesn't export any kernel_fpu_* functions */
#include <asm/fpu/internal.h> /* For kernel xgetbv() */
#define kfpu_begin() panic("This code should never run")
#define kfpu_end() panic("This code should never run")
#endif /* defined(HAVE_KERNEL_FPU) */
#else
/*
* fpu dummy methods for userspace
@ -278,11 +284,13 @@ __simd_state_enabled(const uint64_t state)
boolean_t has_osxsave;
uint64_t xcr0;
#if defined(_KERNEL) && defined(X86_FEATURE_OSXSAVE)
#if defined(_KERNEL)
#if defined(X86_FEATURE_OSXSAVE) && defined(KERNEL_EXPORTS_X86_FPU)
has_osxsave = !!boot_cpu_has(X86_FEATURE_OSXSAVE);
#elif defined(_KERNEL) && !defined(X86_FEATURE_OSXSAVE)
has_osxsave = B_FALSE;
#else
has_osxsave = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_osxsave = __cpuid_has_osxsave();
#endif
@ -307,8 +315,12 @@ static inline boolean_t
zfs_sse_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_XMM));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_sse());
#endif
}
@ -320,8 +332,12 @@ static inline boolean_t
zfs_sse2_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_XMM2));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_sse2());
#endif
}
@ -333,8 +349,12 @@ static inline boolean_t
zfs_sse3_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_XMM3));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_sse3());
#endif
}
@ -346,8 +366,12 @@ static inline boolean_t
zfs_ssse3_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_SSSE3));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_ssse3());
#endif
}
@ -359,8 +383,12 @@ static inline boolean_t
zfs_sse4_1_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_XMM4_1));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_sse4_1());
#endif
}
@ -372,8 +400,12 @@ static inline boolean_t
zfs_sse4_2_available(void)
{
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_XMM4_2));
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_sse4_2());
#endif
}
@ -386,8 +418,12 @@ zfs_avx_available(void)
{
boolean_t has_avx;
#if defined(_KERNEL)
#if defined(KERNEL_EXPORTS_X86_FPU)
has_avx = !!boot_cpu_has(X86_FEATURE_AVX);
#else
has_avx = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx = __cpuid_has_avx();
#endif
@ -401,11 +437,13 @@ static inline boolean_t
zfs_avx2_available(void)
{
boolean_t has_avx2;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX2)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX2) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx2 = !!boot_cpu_has(X86_FEATURE_AVX2);
#elif defined(_KERNEL) && !defined(X86_FEATURE_AVX2)
has_avx2 = B_FALSE;
#else
has_avx2 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx2 = __cpuid_has_avx2();
#endif
@ -418,11 +456,13 @@ zfs_avx2_available(void)
static inline boolean_t
zfs_bmi1_available(void)
{
#if defined(_KERNEL) && defined(X86_FEATURE_BMI1)
#if defined(_KERNEL)
#if defined(X86_FEATURE_BMI1) && defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_BMI1));
#elif defined(_KERNEL) && !defined(X86_FEATURE_BMI1)
return (B_FALSE);
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_bmi1());
#endif
}
@ -433,16 +473,17 @@ zfs_bmi1_available(void)
static inline boolean_t
zfs_bmi2_available(void)
{
#if defined(_KERNEL) && defined(X86_FEATURE_BMI2)
#if defined(_KERNEL)
#if defined(X86_FEATURE_BMI2) && defined(KERNEL_EXPORTS_X86_FPU)
return (!!boot_cpu_has(X86_FEATURE_BMI2));
#elif defined(_KERNEL) && !defined(X86_FEATURE_BMI2)
return (B_FALSE);
#else
return (B_FALSE);
#endif
#elif !defined(_KERNEL)
return (__cpuid_has_bmi2());
#endif
}
/*
* AVX-512 family of instruction sets:
*
@ -466,8 +507,12 @@ zfs_avx512f_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512F)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512F) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = !!boot_cpu_has(X86_FEATURE_AVX512F);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512f();
#endif
@ -481,9 +526,13 @@ zfs_avx512cd_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512CD)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512CD) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512CD);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512cd();
#endif
@ -497,9 +546,13 @@ zfs_avx512er_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512ER)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512ER) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512ER);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512er();
#endif
@ -513,9 +566,13 @@ zfs_avx512pf_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512PF)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512PF) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512PF);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512pf();
#endif
@ -529,9 +586,13 @@ zfs_avx512bw_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512BW)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512BW) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512BW);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512bw();
#endif
@ -545,9 +606,13 @@ zfs_avx512dq_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512DQ)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512DQ) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512DQ);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512dq();
#endif
@ -561,9 +626,13 @@ zfs_avx512vl_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512VL)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512VL) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512VL);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512vl();
#endif
@ -577,9 +646,13 @@ zfs_avx512ifma_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512IFMA)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512IFMA) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512IFMA);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512ifma();
#endif
@ -593,9 +666,13 @@ zfs_avx512vbmi_available(void)
{
boolean_t has_avx512 = B_FALSE;
#if defined(_KERNEL) && defined(X86_FEATURE_AVX512VBMI)
#if defined(_KERNEL)
#if defined(X86_FEATURE_AVX512VBMI) && defined(KERNEL_EXPORTS_X86_FPU)
has_avx512 = boot_cpu_has(X86_FEATURE_AVX512F) &&
boot_cpu_has(X86_FEATURE_AVX512VBMI);
#else
has_avx512 = B_FALSE;
#endif
#elif !defined(_KERNEL)
has_avx512 = __cpuid_has_avx512f() &&
__cpuid_has_avx512vbmi();

View File

@ -30,6 +30,7 @@
#include <sys/taskq.h>
#include <sys/cred.h>
#include <linux/backing-dev.h>
#include <linux/compat.h>
/*
* 2.6.28 API change,
@ -182,6 +183,30 @@ zpl_bdi_destroy(struct super_block *sb)
}
#endif
/*
* 4.14 adds SB_* flag definitions, define them to MS_* equivalents
* if not set.
*/
#ifndef SB_RDONLY
#define SB_RDONLY MS_RDONLY
#endif
#ifndef SB_SILENT
#define SB_SILENT MS_SILENT
#endif
#ifndef SB_ACTIVE
#define SB_ACTIVE MS_ACTIVE
#endif
#ifndef SB_POSIXACL
#define SB_POSIXACL MS_POSIXACL
#endif
#ifndef SB_MANDLOCK
#define SB_MANDLOCK MS_MANDLOCK
#endif
/*
* 2.6.38 API change,
* LOOKUP_RCU flag introduced to distinguish rcu-walk from ref-walk cases.
@ -272,9 +297,6 @@ lseek_execute(
* This is several orders of magnitude larger than expected grace period.
* At 60 seconds the kernel will also begin issuing RCU stall warnings.
*/
#ifdef refcount_t
#undef refcount_t
#endif
#include <linux/posix_acl.h>
@ -405,8 +427,6 @@ typedef mode_t zpl_equivmode_t;
#define zpl_posix_acl_valid(ip, acl) posix_acl_valid(acl)
#endif
#define refcount_t zfs_refcount_t
#endif /* CONFIG_FS_POSIX_ACL */
/*
@ -602,4 +622,21 @@ inode_set_iversion(struct inode *ip, u64 val)
}
#endif
/*
* Returns true when called in the context of a 32-bit system call.
*/
static inline int
zpl_is_32bit_api(void)
{
#ifdef CONFIG_COMPAT
#ifdef HAVE_IN_COMPAT_SYSCALL
return (in_compat_syscall());
#else
return (is_compat_task());
#endif
#else
return (BITS_PER_LONG == 32);
#endif
}
#endif /* _ZFS_VFS_H */

View File

@ -52,7 +52,7 @@ typedef struct abd {
abd_flags_t abd_flags;
uint_t abd_size; /* excludes scattered abd_offset */
struct abd *abd_parent;
refcount_t abd_children;
zfs_refcount_t abd_children;
union {
struct abd_scatter {
uint_t abd_offset;

View File

@ -76,7 +76,7 @@ struct arc_prune {
void *p_private;
uint64_t p_adjust;
list_node_t p_node;
refcount_t p_refcnt;
zfs_refcount_t p_refcnt;
};
typedef enum arc_strategy {

View File

@ -74,12 +74,12 @@ typedef struct arc_state {
/*
* total amount of evictable data in this state
*/
refcount_t arcs_esize[ARC_BUFC_NUMTYPES];
zfs_refcount_t arcs_esize[ARC_BUFC_NUMTYPES];
/*
* total amount of data in this state; this includes: evictable,
* non-evictable, ARC_BUFC_DATA, and ARC_BUFC_METADATA.
*/
refcount_t arcs_size;
zfs_refcount_t arcs_size;
/*
* supports the "dbufs" kstat
*/
@ -163,7 +163,7 @@ typedef struct l1arc_buf_hdr {
uint32_t b_l2_hits;
/* self protecting */
refcount_t b_refcnt;
zfs_refcount_t b_refcnt;
arc_callback_t *b_acb;
abd_t *b_pabd;
@ -180,7 +180,7 @@ typedef struct l2arc_dev {
kmutex_t l2ad_mtx; /* lock for buffer list */
list_t l2ad_buflist; /* buffer list */
list_node_t l2ad_node; /* device list node */
refcount_t l2ad_alloc; /* allocated bytes */
zfs_refcount_t l2ad_alloc; /* allocated bytes */
} l2arc_dev_t;
typedef struct l2arc_buf_hdr {

View File

@ -20,7 +20,7 @@
*/
/*
* Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2012, 2015 by Delphix. All rights reserved.
* Copyright (c) 2012, 2018 by Delphix. All rights reserved.
* Copyright (c) 2013 by Saso Kiselkov. All rights reserved.
* Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
*/
@ -212,7 +212,7 @@ typedef struct dmu_buf_impl {
* If nonzero, the buffer can't be destroyed.
* Protected by db_mtx.
*/
refcount_t db_holds;
zfs_refcount_t db_holds;
/* buffer holding our data */
arc_buf_t *db_buf;
@ -294,7 +294,7 @@ boolean_t dbuf_try_add_ref(dmu_buf_t *db, objset_t *os, uint64_t obj,
uint64_t dbuf_refcount(dmu_buf_impl_t *db);
void dbuf_rele(dmu_buf_impl_t *db, void *tag);
void dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag);
void dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag, boolean_t evicting);
dmu_buf_impl_t *dbuf_find(struct objset *os, uint64_t object, uint8_t level,
uint64_t blkid);

View File

@ -227,11 +227,14 @@ typedef enum dmu_object_type {
DMU_OTN_ZAP_METADATA = DMU_OT(DMU_BSWAP_ZAP, B_TRUE),
} dmu_object_type_t;
typedef enum txg_how {
TXG_WAIT = 1,
TXG_NOWAIT,
TXG_WAITED,
} txg_how_t;
/*
* These flags are intended to be used to specify the "txg_how"
* parameter when calling the dmu_tx_assign() function. See the comment
* above dmu_tx_assign() for more details on the meaning of these flags.
*/
#define TXG_NOWAIT (0ULL)
#define TXG_WAIT (1ULL<<0)
#define TXG_NOTHROTTLE (1ULL<<1)
void byteswap_uint64_array(void *buf, size_t size);
void byteswap_uint32_array(void *buf, size_t size);
@ -694,7 +697,7 @@ void dmu_tx_hold_spill(dmu_tx_t *tx, uint64_t object);
void dmu_tx_hold_sa(dmu_tx_t *tx, struct sa_handle *hdl, boolean_t may_grow);
void dmu_tx_hold_sa_create(dmu_tx_t *tx, int total_size);
void dmu_tx_abort(dmu_tx_t *tx);
int dmu_tx_assign(dmu_tx_t *tx, enum txg_how txg_how);
int dmu_tx_assign(dmu_tx_t *tx, uint64_t txg_how);
void dmu_tx_wait(dmu_tx_t *tx);
void dmu_tx_commit(dmu_tx_t *tx);
void dmu_tx_mark_netfree(dmu_tx_t *tx);
@ -891,7 +894,7 @@ uint64_t dmu_objset_fsid_guid(objset_t *os);
/*
* Get the [cm]time for an objset's snapshot dir
*/
timestruc_t dmu_objset_snap_cmtime(objset_t *os);
inode_timespec_t dmu_objset_snap_cmtime(objset_t *os);
int dmu_objset_is_snapshot(objset_t *os);
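
Review note on the `txg_how` change above: the enum (`TXG_WAIT = 1`, `TXG_NOWAIT`, `TXG_WAITED`) becomes a set of bit flags, so `TXG_NOTHROTTLE` can be OR-ed with `TXG_WAIT`, which the old enum values could not express. A minimal sketch of testing the flags follows. The macro values are copied from the hunk; the two helper functions are hypothetical names used only for illustration, and how `dmu_tx_assign()` actually interprets the flags is described in the comment above that function, not here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the hunk above. */
#define	TXG_NOWAIT	(0ULL)
#define	TXG_WAIT	(1ULL<<0)
#define	TXG_NOTHROTTLE	(1ULL<<1)

/* Hypothetical helper: true when the caller asked to block. */
bool
txg_how_waits(uint64_t txg_how)
{
	return ((txg_how & TXG_WAIT) != 0);
}

/* Hypothetical helper: true when the caller opted out of throttling. */
bool
txg_how_nothrottle(uint64_t txg_how)
{
	return ((txg_how & TXG_NOTHROTTLE) != 0);
}
```

Because `TXG_NOWAIT` is now zero, it cannot be tested with a bitwise AND; it is simply the absence of `TXG_WAIT`.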

View File

@ -161,6 +161,7 @@ extern "C" {
* dn_allocated_txg
* dn_free_txg
* dn_assigned_txg
* dn_dirty_txg
* dd_assigned_tx
* dn_notxholds
* dn_dirtyctx

View File

@ -179,7 +179,7 @@ int dmu_objset_find_dp(struct dsl_pool *dp, uint64_t ddobj,
int func(struct dsl_pool *, struct dsl_dataset *, void *),
void *arg, int flags);
void dmu_objset_evict_dbufs(objset_t *os);
timestruc_t dmu_objset_snap_cmtime(objset_t *os);
inode_timespec_t dmu_objset_snap_cmtime(objset_t *os);
/* called from dsl */
void dmu_objset_sync(objset_t *os, zio_t *zio, dmu_tx_t *tx);

View File

@ -67,9 +67,6 @@ struct dmu_tx {
/* placeholder for syncing context, doesn't need specific holds */
boolean_t tx_anyobj;
/* has this transaction already been delayed? */
boolean_t tx_waited;
/* transaction is marked as being a "net free" of space */
boolean_t tx_netfree;
@ -79,6 +76,9 @@ struct dmu_tx {
/* need to wait for sufficient dirty space */
boolean_t tx_wait_dirty;
/* has this transaction already been delayed? */
boolean_t tx_dirty_delayed;
int tx_err;
};
@ -97,8 +97,8 @@ typedef struct dmu_tx_hold {
dmu_tx_t *txh_tx;
list_node_t txh_node;
struct dnode *txh_dnode;
refcount_t txh_space_towrite;
refcount_t txh_memory_tohold;
zfs_refcount_t txh_space_towrite;
zfs_refcount_t txh_memory_tohold;
enum dmu_tx_hold_type txh_type;
uint64_t txh_arg1;
uint64_t txh_arg2;
@ -138,7 +138,7 @@ extern dmu_tx_stats_t dmu_tx_stats;
* These routines are defined in dmu.h, and are called by the user.
*/
dmu_tx_t *dmu_tx_create(objset_t *dd);
int dmu_tx_assign(dmu_tx_t *tx, txg_how_t txg_how);
int dmu_tx_assign(dmu_tx_t *tx, uint64_t txg_how);
void dmu_tx_commit(dmu_tx_t *tx);
void dmu_tx_abort(dmu_tx_t *tx);
uint64_t dmu_tx_get_txg(dmu_tx_t *tx);

View File

@ -20,7 +20,7 @@
*/
/*
* Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2012, 2017 by Delphix. All rights reserved.
* Copyright (c) 2012, 2018 by Delphix. All rights reserved.
* Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
*/
@ -260,13 +260,14 @@ struct dnode {
uint64_t dn_allocated_txg;
uint64_t dn_free_txg;
uint64_t dn_assigned_txg;
uint64_t dn_dirty_txg; /* txg dnode was last dirtied */
kcondvar_t dn_notxholds;
enum dnode_dirtycontext dn_dirtyctx;
uint8_t *dn_dirtyctx_firstset; /* dbg: contents meaningless */
/* protected by own devices */
refcount_t dn_tx_holds;
refcount_t dn_holds;
zfs_refcount_t dn_tx_holds;
zfs_refcount_t dn_holds;
kmutex_t dn_dbufs_mtx;
/*
@ -338,7 +339,7 @@ int dnode_hold_impl(struct objset *dd, uint64_t object, int flag, int dn_slots,
void *ref, dnode_t **dnp);
boolean_t dnode_add_ref(dnode_t *dn, void *ref);
void dnode_rele(dnode_t *dn, void *ref);
void dnode_rele_and_unlock(dnode_t *dn, void *tag);
void dnode_rele_and_unlock(dnode_t *dn, void *tag, boolean_t evicting);
void dnode_setdirty(dnode_t *dn, dmu_tx_t *tx);
void dnode_sync(dnode_t *dn, dmu_tx_t *tx);
void dnode_allocate(dnode_t *dn, dmu_object_type_t ot, int blocksize, int ibs,
@ -360,6 +361,10 @@ int dnode_next_offset(dnode_t *dn, int flags, uint64_t *off,
int minlvl, uint64_t blkfill, uint64_t txg);
void dnode_evict_dbufs(dnode_t *dn);
void dnode_evict_bonus(dnode_t *dn);
void dnode_free_interior_slots(dnode_t *dn);
#define DNODE_IS_DIRTY(_dn) \
((_dn)->dn_dirty_txg >= spa_syncing_txg((_dn)->dn_objset->os_spa))
#define DNODE_IS_CACHEABLE(_dn) \
((_dn)->dn_objset->os_primary_cache == ZFS_CACHE_ALL || \
@ -453,6 +458,11 @@ typedef struct dnode_stats {
* which had already been unlinked in an earlier txg.
*/
kstat_named_t dnode_hold_free_txg;
/*
* Number of times dnode_free_interior_slots() needed to retry
* acquiring a slot zrl lock due to contention.
*/
kstat_named_t dnode_free_interior_lock_retry;
/*
* Number of new dnodes allocated by dnode_allocate().
*/
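
Review note on `DNODE_IS_DIRTY()` added in this file: the macro compares the new `dn_dirty_txg` field against the pool's currently syncing txg. The sketch below reproduces that comparison with stand-in types so it can be compiled outside the ZFS tree. Every `mock_`/`MOCK_` name is an assumption made for illustration; the real `spa_t`, `objset_t` and `dnode_t` carry many more fields.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in types; only the fields the macro touches are modelled. */
typedef struct mock_spa {
	uint64_t syncing_txg;
} mock_spa_t;

typedef struct mock_objset {
	mock_spa_t *os_spa;
} mock_objset_t;

typedef struct mock_dnode {
	mock_objset_t *dn_objset;
	uint64_t dn_dirty_txg;	/* txg dnode was last dirtied */
} mock_dnode_t;

/* Stand-in for spa_syncing_txg(). */
uint64_t
mock_spa_syncing_txg(mock_spa_t *spa)
{
	return (spa->syncing_txg);
}

/* Same shape as DNODE_IS_DIRTY() in the hunk, using the mock types. */
#define	MOCK_DNODE_IS_DIRTY(_dn) \
	((_dn)->dn_dirty_txg >= mock_spa_syncing_txg((_dn)->dn_objset->os_spa))
```

Under this comparison a dnode stops counting as dirty once the syncing txg has moved past the txg in which it was last dirtied.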

View File

@ -186,7 +186,7 @@ typedef struct dsl_dataset {
* Owning counts as a long hold. See the comments above
* dsl_pool_hold() for details.
*/
refcount_t ds_longholds;
zfs_refcount_t ds_longholds;
/* no locking; only for making guesses */
uint64_t ds_trysnap_txg;

View File

@ -103,7 +103,7 @@ struct dsl_dir {
/* Protected by dd_lock */
kmutex_t dd_lock;
list_t dd_props; /* list of dsl_prop_record_t's */
timestruc_t dd_snap_cmtime; /* last time snapshot namespace changed */
inode_timespec_t dd_snap_cmtime; /* last snapshot namespace change */
uint64_t dd_origin_txg;
/* gross estimate of space used by in-flight tx's */
@ -159,7 +159,7 @@ boolean_t dsl_dir_is_clone(dsl_dir_t *dd);
void dsl_dir_new_refreservation(dsl_dir_t *dd, struct dsl_dataset *ds,
uint64_t reservation, cred_t *cr, dmu_tx_t *tx);
void dsl_dir_snap_cmtime_update(dsl_dir_t *dd);
timestruc_t dsl_dir_snap_cmtime(dsl_dir_t *dd);
inode_timespec_t dsl_dir_snap_cmtime(dsl_dir_t *dd);
void dsl_dir_set_reservation_sync_impl(dsl_dir_t *dd, uint64_t value,
dmu_tx_t *tx);
void dsl_dir_zapify(dsl_dir_t *dd, dmu_tx_t *tx);

View File

@ -638,6 +638,7 @@ typedef struct zpool_rewind_policy {
#define ZPOOL_CONFIG_RESILVER_TXG "resilver_txg"
#define ZPOOL_CONFIG_COMMENT "comment"
#define ZPOOL_CONFIG_SUSPENDED "suspended" /* not stored on disk */
#define ZPOOL_CONFIG_SUSPENDED_REASON "suspended_reason" /* not stored */
#define ZPOOL_CONFIG_TIMESTAMP "timestamp" /* not stored on disk */
#define ZPOOL_CONFIG_BOOTFS "bootfs" /* not stored on disk */
#define ZPOOL_CONFIG_MISSING_DEVICES "missing_vdevs" /* not stored on disk */

View File

@ -179,8 +179,7 @@ struct metaslab_class {
* number of allocations allowed.
*/
uint64_t mc_alloc_max_slots;
refcount_t mc_alloc_slots;
zfs_refcount_t mc_alloc_slots;
uint64_t mc_alloc_groups; /* # of allocatable groups */
uint64_t mc_alloc; /* total allocated space */
@ -230,7 +229,7 @@ struct metaslab_group {
* are unable to handle their share of allocations.
*/
uint64_t mg_max_alloc_queue_depth;
refcount_t mg_alloc_queue_depth;
zfs_refcount_t mg_alloc_queue_depth;
/*
* A metaslab group that can no longer allocate the minimum block

View File

@ -43,6 +43,7 @@ typedef struct mmp_thread {
uberblock_t mmp_ub; /* last ub written by sync */
zio_t *mmp_zio_root; /* root of mmp write zios */
uint64_t mmp_kstat_id; /* unique id for next MMP write kstat */
int mmp_skip_error; /* reason for last skipped write */
} mmp_thread_t;

View File

@ -41,17 +41,6 @@ extern "C" {
*/
#define FTAG ((char *)__func__)
/*
* Starting with 4.11, torvalds/linux@f405df5, the linux kernel defines a
* refcount_t type of its own. The macro below effectively changes references
* in the ZFS code from refcount_t to zfs_refcount_t at compile time, so that
* existing code need not be altered, reducing conflicts when landing openZFS
* patches.
*/
#define refcount_t zfs_refcount_t
#define refcount_add zfs_refcount_add
#ifdef ZFS_DEBUG
typedef struct reference {
list_node_t ref_link;
@ -69,57 +58,60 @@ typedef struct refcount {
uint64_t rc_removed_count;
} zfs_refcount_t;
/* Note: refcount_t must be initialized with refcount_create[_untracked]() */
/*
* Note: zfs_refcount_t must be initialized with
* refcount_create[_untracked]()
*/
void refcount_create(refcount_t *rc);
void refcount_create_untracked(refcount_t *rc);
void refcount_create_tracked(refcount_t *rc);
void refcount_destroy(refcount_t *rc);
void refcount_destroy_many(refcount_t *rc, uint64_t number);
int refcount_is_zero(refcount_t *rc);
int64_t refcount_count(refcount_t *rc);
int64_t zfs_refcount_add(refcount_t *rc, void *holder_tag);
int64_t refcount_remove(refcount_t *rc, void *holder_tag);
int64_t refcount_add_many(refcount_t *rc, uint64_t number, void *holder_tag);
int64_t refcount_remove_many(refcount_t *rc, uint64_t number, void *holder_tag);
void refcount_transfer(refcount_t *dst, refcount_t *src);
void refcount_transfer_ownership(refcount_t *, void *, void *);
boolean_t refcount_held(refcount_t *, void *);
boolean_t refcount_not_held(refcount_t *, void *);
void zfs_refcount_create(zfs_refcount_t *);
void zfs_refcount_create_untracked(zfs_refcount_t *);
void zfs_refcount_create_tracked(zfs_refcount_t *);
void zfs_refcount_destroy(zfs_refcount_t *);
void zfs_refcount_destroy_many(zfs_refcount_t *, uint64_t);
int zfs_refcount_is_zero(zfs_refcount_t *);
int64_t zfs_refcount_count(zfs_refcount_t *);
int64_t zfs_refcount_add(zfs_refcount_t *, void *);
int64_t zfs_refcount_remove(zfs_refcount_t *, void *);
int64_t zfs_refcount_add_many(zfs_refcount_t *, uint64_t, void *);
int64_t zfs_refcount_remove_many(zfs_refcount_t *, uint64_t, void *);
void zfs_refcount_transfer(zfs_refcount_t *, zfs_refcount_t *);
void zfs_refcount_transfer_ownership(zfs_refcount_t *, void *, void *);
boolean_t zfs_refcount_held(zfs_refcount_t *, void *);
boolean_t zfs_refcount_not_held(zfs_refcount_t *, void *);
void refcount_init(void);
void refcount_fini(void);
void zfs_refcount_init(void);
void zfs_refcount_fini(void);
#else /* ZFS_DEBUG */
typedef struct refcount {
uint64_t rc_count;
} refcount_t;
} zfs_refcount_t;
#define refcount_create(rc) ((rc)->rc_count = 0)
#define refcount_create_untracked(rc) ((rc)->rc_count = 0)
#define refcount_create_tracked(rc) ((rc)->rc_count = 0)
#define refcount_destroy(rc) ((rc)->rc_count = 0)
#define refcount_destroy_many(rc, number) ((rc)->rc_count = 0)
#define refcount_is_zero(rc) ((rc)->rc_count == 0)
#define refcount_count(rc) ((rc)->rc_count)
#define zfs_refcount_create(rc) ((rc)->rc_count = 0)
#define zfs_refcount_create_untracked(rc) ((rc)->rc_count = 0)
#define zfs_refcount_create_tracked(rc) ((rc)->rc_count = 0)
#define zfs_refcount_destroy(rc) ((rc)->rc_count = 0)
#define zfs_refcount_destroy_many(rc, number) ((rc)->rc_count = 0)
#define zfs_refcount_is_zero(rc) ((rc)->rc_count == 0)
#define zfs_refcount_count(rc) ((rc)->rc_count)
#define zfs_refcount_add(rc, holder) atomic_inc_64_nv(&(rc)->rc_count)
#define refcount_remove(rc, holder) atomic_dec_64_nv(&(rc)->rc_count)
#define refcount_add_many(rc, number, holder) \
#define zfs_refcount_remove(rc, holder) atomic_dec_64_nv(&(rc)->rc_count)
#define zfs_refcount_add_many(rc, number, holder) \
atomic_add_64_nv(&(rc)->rc_count, number)
#define refcount_remove_many(rc, number, holder) \
#define zfs_refcount_remove_many(rc, number, holder) \
atomic_add_64_nv(&(rc)->rc_count, -number)
#define refcount_transfer(dst, src) { \
#define zfs_refcount_transfer(dst, src) { \
uint64_t __tmp = (src)->rc_count; \
atomic_add_64(&(src)->rc_count, -__tmp); \
atomic_add_64(&(dst)->rc_count, __tmp); \
}
#define refcount_transfer_ownership(rc, current_holder, new_holder) (void)0
#define refcount_held(rc, holder) ((rc)->rc_count > 0)
#define refcount_not_held(rc, holder) (B_TRUE)
#define zfs_refcount_transfer_ownership(rc, current_holder, new_holder) (void)0
#define zfs_refcount_held(rc, holder) ((rc)->rc_count > 0)
#define zfs_refcount_not_held(rc, holder) (B_TRUE)
#define refcount_init()
#define refcount_fini()
#define zfs_refcount_init()
#define zfs_refcount_fini()
#endif /* ZFS_DEBUG */
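
Review note on the non-debug branch above: without `ZFS_DEBUG`, `zfs_refcount_t` is a bare 64-bit counter and the holder tag is ignored. The sketch below mirrors that behaviour using C11 `<stdatomic.h>` in place of the `atomic_inc_64_nv()` family, which is an assumption made so the example compiles outside the ZFS tree. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct sketch_refcount {
	_Atomic uint64_t rc_count;
} sketch_refcount_t;

void
sketch_refcount_create(sketch_refcount_t *rc)
{
	atomic_store(&rc->rc_count, 0);
}

/* Returns the new count, like atomic_inc_64_nv(); the tag is unused. */
int64_t
sketch_refcount_add(sketch_refcount_t *rc, void *holder)
{
	(void) holder;
	return ((int64_t)(atomic_fetch_add(&rc->rc_count, 1) + 1));
}

/* Returns the new count, like atomic_dec_64_nv(); the tag is unused. */
int64_t
sketch_refcount_remove(sketch_refcount_t *rc, void *holder)
{
	(void) holder;
	return ((int64_t)(atomic_fetch_sub(&rc->rc_count, 1) - 1));
}

int
sketch_refcount_is_zero(sketch_refcount_t *rc)
{
	return (atomic_load(&rc->rc_count) == 0);
}
```

Since the tag is dropped here, holder tracking such as `zfs_refcount_held()` can only be approximated in this mode; the hunk defines it as `rc_count > 0`.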

View File

@ -57,8 +57,8 @@ typedef struct rrwlock {
kmutex_t rr_lock;
kcondvar_t rr_cv;
kthread_t *rr_writer;
refcount_t rr_anon_rcount;
refcount_t rr_linked_rcount;
zfs_refcount_t rr_anon_rcount;
zfs_refcount_t rr_linked_rcount;
boolean_t rr_writer_wanted;
boolean_t rr_track_all;
} rrwlock_t;

View File

@ -110,7 +110,7 @@ typedef struct sa_idx_tab {
list_node_t sa_next;
sa_lot_t *sa_layout;
uint16_t *sa_variable_lengths;
refcount_t sa_refcount;
zfs_refcount_t sa_refcount;
uint32_t *sa_idx_tab; /* array of offsets */
} sa_idx_tab_t;

View File

@ -730,6 +730,7 @@ typedef struct spa_stats {
spa_stats_history_t tx_assign_histogram;
spa_stats_history_t io_history;
spa_stats_history_t mmp_history;
spa_stats_history_t state; /* pool state */
} spa_stats_t;
typedef enum txg_state {
@ -759,10 +760,12 @@ extern txg_stat_t *spa_txg_history_init_io(spa_t *, uint64_t,
struct dsl_pool *);
extern void spa_txg_history_fini_io(spa_t *, txg_stat_t *);
extern void spa_tx_assign_add_nsecs(spa_t *spa, uint64_t nsecs);
extern int spa_mmp_history_set_skip(spa_t *spa, uint64_t mmp_kstat_id);
extern int spa_mmp_history_set(spa_t *spa, uint64_t mmp_kstat_id, int io_error,
hrtime_t duration);
extern void spa_mmp_history_add(uint64_t txg, uint64_t timestamp,
uint64_t mmp_delay, vdev_t *vd, int label, uint64_t mmp_kstat_id);
extern void *spa_mmp_history_add(spa_t *spa, uint64_t txg, uint64_t timestamp,
uint64_t mmp_delay, vdev_t *vd, int label, uint64_t mmp_kstat_id,
int error);
/* Pool configuration locks */
extern int spa_config_tryenter(spa_t *spa, int locks, void *tag, krw_t rw);
@ -887,6 +890,8 @@ extern void spa_history_log_internal_ds(struct dsl_dataset *ds, const char *op,
extern void spa_history_log_internal_dd(dsl_dir_t *dd, const char *operation,
dmu_tx_t *tx, const char *fmt, ...);
extern const char *spa_state_to_name(spa_t *spa);
/* error handling */
struct zbookmark_phys;
extern void spa_log_error(spa_t *spa, zio_t *zio);

View File

@ -78,7 +78,7 @@ typedef struct spa_config_lock {
kthread_t *scl_writer;
int scl_write_wanted;
kcondvar_t scl_cv;
refcount_t scl_count;
zfs_refcount_t scl_count;
} spa_config_lock_t;
typedef struct spa_config_dirent {
@ -153,7 +153,7 @@ struct spa {
uint64_t spa_freeze_txg; /* freeze pool at this txg */
uint64_t spa_load_max_txg; /* best initial ub_txg */
uint64_t spa_claim_max_txg; /* highest claimed birth txg */
timespec_t spa_loaded_ts; /* 1st successful open time */
inode_timespec_t spa_loaded_ts; /* 1st successful open time */
objset_t *spa_meta_objset; /* copy of dp->dp_meta_objset */
kmutex_t spa_evicting_os_lock; /* Evicting objset list lock */
list_t spa_evicting_os_list; /* Objsets being evicted. */
@ -233,7 +233,7 @@ struct spa {
zio_t *spa_suspend_zio_root; /* root of all suspended I/O */
kmutex_t spa_suspend_lock; /* protects suspend_zio_root */
kcondvar_t spa_suspend_cv; /* notification of resume */
uint8_t spa_suspended; /* pool is suspended */
zio_suspend_reason_t spa_suspended; /* pool is suspended */
uint8_t spa_claiming; /* pool is doing zil_claim() */
boolean_t spa_debug; /* debug enabled? */
boolean_t spa_is_root; /* pool is root */
@ -275,17 +275,18 @@ struct spa {
spa_stats_t spa_stats; /* assorted spa statistics */
hrtime_t spa_ccw_fail_time; /* Conf cache write fail time */
taskq_t *spa_zvol_taskq; /* Taskq for minor management */
taskq_t *spa_prefetch_taskq; /* Taskq for prefetch threads */
uint64_t spa_multihost; /* multihost aware (mmp) */
mmp_thread_t spa_mmp; /* multihost mmp thread */
/*
* spa_refcount & spa_config_lock must be the last elements
* because refcount_t changes size based on compilation options.
* because zfs_refcount_t changes size based on compilation options.
* In order for the MDB module to function correctly, the other
* fields must remain in the same location.
*/
spa_config_lock_t spa_config_lock[SCL_LOCKS]; /* config changes */
refcount_t spa_refcount; /* number of opens */
zfs_refcount_t spa_refcount; /* number of opens */
taskq_t *spa_upgrade_taskq; /* taskq for upgrade jobs */
};

View File

@ -71,7 +71,7 @@
__entry->db_offset = db->db.db_offset; \
__entry->db_size = db->db.db_size; \
__entry->db_state = db->db_state; \
__entry->db_holds = refcount_count(&db->db_holds); \
__entry->db_holds = zfs_refcount_count(&db->db_holds); \
snprintf(__get_str(msg), TRACE_DBUF_MSG_MAX, \
DBUF_TP_PRINTK_FMT, DBUF_TP_PRINTK_ARGS); \
} else { \

View File

@ -50,7 +50,7 @@ DECLARE_EVENT_CLASS(zfs_delay_mintime_class,
__field(uint64_t, tx_lastsnap_txg)
__field(uint64_t, tx_lasttried_txg)
__field(boolean_t, tx_anyobj)
__field(boolean_t, tx_waited)
__field(boolean_t, tx_dirty_delayed)
__field(hrtime_t, tx_start)
__field(boolean_t, tx_wait_dirty)
__field(int, tx_err)
@ -62,7 +62,7 @@ DECLARE_EVENT_CLASS(zfs_delay_mintime_class,
__entry->tx_lastsnap_txg = tx->tx_lastsnap_txg;
__entry->tx_lasttried_txg = tx->tx_lasttried_txg;
__entry->tx_anyobj = tx->tx_anyobj;
__entry->tx_waited = tx->tx_waited;
__entry->tx_dirty_delayed = tx->tx_dirty_delayed;
__entry->tx_start = tx->tx_start;
__entry->tx_wait_dirty = tx->tx_wait_dirty;
__entry->tx_err = tx->tx_err;
@ -70,11 +70,12 @@ DECLARE_EVENT_CLASS(zfs_delay_mintime_class,
__entry->min_tx_time = min_tx_time;
),
TP_printk("tx { txg %llu lastsnap_txg %llu tx_lasttried_txg %llu "
"anyobj %d waited %d start %llu wait_dirty %d err %i "
"anyobj %d dirty_delayed %d start %llu wait_dirty %d err %i "
"} dirty %llu min_tx_time %llu",
__entry->tx_txg, __entry->tx_lastsnap_txg,
__entry->tx_lasttried_txg, __entry->tx_anyobj, __entry->tx_waited,
__entry->tx_start, __entry->tx_wait_dirty, __entry->tx_err,
__entry->tx_lasttried_txg, __entry->tx_anyobj,
__entry->tx_dirty_delayed, __entry->tx_start,
__entry->tx_wait_dirty, __entry->tx_err,
__entry->dirty, __entry->min_tx_time)
);
/* END CSTYLED */

View File

@ -42,7 +42,7 @@
#include <sys/uio.h>
extern int uiomove(void *, size_t, enum uio_rw, uio_t *);
extern void uio_prefaultpages(ssize_t, uio_t *);
extern int uio_prefaultpages(ssize_t, uio_t *);
extern int uiocopy(void *, size_t, enum uio_rw, uio_t *, size_t *);
extern void uioskip(uio_t *, size_t);

View File

@ -47,7 +47,7 @@
* Structure of all optional attributes.
*/
typedef struct xoptattr {
timestruc_t xoa_createtime; /* Create time of file */
inode_timespec_t xoa_createtime; /* Create time of file */
uint8_t xoa_archive;
uint8_t xoa_system;
uint8_t xoa_readonly;

View File

@ -226,7 +226,7 @@ int zap_lookup_norm_by_dnode(dnode_t *dn, const char *name,
boolean_t *ncp);
int zap_count_write_by_dnode(dnode_t *dn, const char *name,
int add, refcount_t *towrite, refcount_t *tooverwrite);
int add, zfs_refcount_t *towrite, zfs_refcount_t *tooverwrite);
/*
* Create an attribute with the given name and value.

View File

@ -527,7 +527,7 @@ extern char *vn_dumpdir;
#define AV_SCANSTAMP_SZ 32 /* length of anti-virus scanstamp */
typedef struct xoptattr {
timestruc_t xoa_createtime; /* Create time of file */
inode_timespec_t xoa_createtime; /* Create time of file */
uint8_t xoa_archive;
uint8_t xoa_system;
uint8_t xoa_readonly;
@ -640,13 +640,6 @@ extern void delay(clock_t ticks);
#define USEC_TO_TICK(usec) ((usec) / (MICROSEC / hz))
#define NSEC_TO_TICK(usec) ((usec) / (NANOSEC / hz))
#define gethrestime_sec() time(NULL)
#define gethrestime(t) \
do {\
(t)->tv_sec = gethrestime_sec();\
(t)->tv_nsec = 0;\
} while (0);
#define max_ncpus 64
#define boot_ncpus (sysconf(_SC_NPROCESSORS_ONLN))

View File

@ -32,6 +32,7 @@
#include <sys/zil.h>
#include <sys/sa.h>
#include <sys/rrwlock.h>
#include <sys/dsl_dataset.h>
#include <sys/zfs_ioctl.h>
#ifdef __cplusplus

View File

@ -54,7 +54,7 @@ extern int zfs_mkdir(struct inode *dip, char *dirname, vattr_t *vap,
struct inode **ipp, cred_t *cr, int flags, vsecattr_t *vsecp);
extern int zfs_rmdir(struct inode *dip, char *name, struct inode *cwd,
cred_t *cr, int flags);
extern int zfs_readdir(struct inode *ip, struct dir_context *ctx, cred_t *cr);
extern int zfs_readdir(struct inode *ip, zpl_dir_context_t *ctx, cred_t *cr);
extern int zfs_fsync(struct inode *ip, int syncflag, cred_t *cr);
extern int zfs_getattr(struct inode *ip, vattr_t *vap, int flag, cred_t *cr);
extern int zfs_getattr_fast(struct inode *ip, struct kstat *sp);

View File

@ -209,7 +209,7 @@ typedef struct znode_hold {
uint64_t zh_obj; /* object id */
kmutex_t zh_lock; /* lock serializing object access */
avl_node_t zh_node; /* avl tree linkage */
refcount_t zh_refcount; /* active consumer reference count */
zfs_refcount_t zh_refcount; /* active consumer reference count */
} znode_hold_t;
/*
@ -270,19 +270,36 @@ typedef struct znode_hold {
extern unsigned int zfs_object_mutex_size;
/* Encode ZFS stored time values from a struct timespec */
/*
* Encode ZFS stored time values from a struct timespec / struct timespec64.
*/
#define ZFS_TIME_ENCODE(tp, stmp) \
{ \
do { \
(stmp)[0] = (uint64_t)(tp)->tv_sec; \
(stmp)[1] = (uint64_t)(tp)->tv_nsec; \
}
} while (0)
/* Decode ZFS stored time values to a struct timespec */
#if defined(HAVE_INODE_TIMESPEC64_TIMES)
/*
* Decode ZFS stored time values to a struct timespec64
* 4.18 and newer kernels.
*/
#define ZFS_TIME_DECODE(tp, stmp) \
{ \
(tp)->tv_sec = (time_t)(stmp)[0]; \
(tp)->tv_nsec = (long)(stmp)[1]; \
}
do { \
(tp)->tv_sec = (time64_t)(stmp)[0]; \
(tp)->tv_nsec = (long)(stmp)[1]; \
} while (0)
#else
/*
* Decode ZFS stored time values to a struct timespec
* 4.17 and older kernels.
*/
#define ZFS_TIME_DECODE(tp, stmp) \
do { \
(tp)->tv_sec = (time_t)(stmp)[0]; \
(tp)->tv_nsec = (long)(stmp)[1]; \
} while (0)
#endif /* HAVE_INODE_TIMESPEC64_TIMES */
/*
* Timestamp defines
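
Review note on the `ZFS_TIME_ENCODE`/`ZFS_TIME_DECODE` change above: wrapping the macro bodies in `do { ... } while (0)` lets a call be followed by a semicolon and still behave as one statement. With the old brace-only body, `MACRO(...);` before an `else` expands to `{ ... };` and the stray semicolon detaches the `else`, which is a compile error. A minimal sketch, assuming a hypothetical caller `sketch_encode_or_clear()` and a renamed copy of the macro:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Same body as ZFS_TIME_ENCODE in the hunk; renamed for this sketch. */
#define	SKETCH_TIME_ENCODE(tp, stmp) \
do { \
	(stmp)[0] = (uint64_t)(tp)->tv_sec; \
	(stmp)[1] = (uint64_t)(tp)->tv_nsec; \
} while (0)

/* Hypothetical caller: unbraced if/else only compiles with do/while(0). */
void
sketch_encode_or_clear(const struct timespec *tp, uint64_t stmp[2])
{
	if (tp != NULL)
		SKETCH_TIME_ENCODE(tp, stmp);
	else
		stmp[0] = stmp[1] = 0;
}
```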

View File

@ -144,6 +144,12 @@ enum zio_checksum {
#define ZIO_FAILURE_MODE_CONTINUE 1
#define ZIO_FAILURE_MODE_PANIC 2
typedef enum zio_suspend_reason {
ZIO_SUSPEND_NONE = 0,
ZIO_SUSPEND_IOERR,
ZIO_SUSPEND_MMP,
} zio_suspend_reason_t;
enum zio_flag {
/*
* Flags inherited by gang, ddt, and vdev children,
@ -231,7 +237,7 @@ enum zio_child {
#define ZIO_CHILD_DDT_BIT ZIO_CHILD_BIT(ZIO_CHILD_DDT)
#define ZIO_CHILD_LOGICAL_BIT ZIO_CHILD_BIT(ZIO_CHILD_LOGICAL)
#define ZIO_CHILD_ALL_BITS \
(ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \
(ZIO_CHILD_VDEV_BIT | ZIO_CHILD_GANG_BIT | \
ZIO_CHILD_DDT_BIT | ZIO_CHILD_LOGICAL_BIT)
enum zio_wait_type {
@ -369,7 +375,7 @@ typedef struct zio_transform {
struct zio_transform *zt_next;
} zio_transform_t;
typedef int zio_pipe_stage_t(zio_t *zio);
typedef zio_t *zio_pipe_stage_t(zio_t *zio);
/*
* The io_reexecute flags are distinct from io_flags because the child must
@ -577,7 +583,7 @@ extern enum zio_checksum zio_checksum_dedup_select(spa_t *spa,
extern enum zio_compress zio_compress_select(spa_t *spa,
enum zio_compress child, enum zio_compress parent);
extern void zio_suspend(spa_t *spa, zio_t *zio);
extern void zio_suspend(spa_t *spa, zio_t *zio, zio_suspend_reason_t);
extern int zio_resume(spa_t *spa);
extern void zio_resume_wait(spa_t *spa);

View File

@ -125,56 +125,63 @@ extern const struct inode_operations zpl_ops_shares;
#if defined(HAVE_VFS_ITERATE) || defined(HAVE_VFS_ITERATE_SHARED)
#define DIR_CONTEXT_INIT(_dirent, _actor, _pos) { \
#define ZPL_DIR_CONTEXT_INIT(_dirent, _actor, _pos) { \
.actor = _actor, \
.pos = _pos, \
}
typedef struct dir_context zpl_dir_context_t;
#define zpl_dir_emit dir_emit
#define zpl_dir_emit_dot dir_emit_dot
#define zpl_dir_emit_dotdot dir_emit_dotdot
#define zpl_dir_emit_dots dir_emit_dots
#else
typedef struct dir_context {
typedef struct zpl_dir_context {
void *dirent;
const filldir_t actor;
loff_t pos;
} dir_context_t;
} zpl_dir_context_t;
#define DIR_CONTEXT_INIT(_dirent, _actor, _pos) { \
#define ZPL_DIR_CONTEXT_INIT(_dirent, _actor, _pos) { \
.dirent = _dirent, \
.actor = _actor, \
.pos = _pos, \
}
static inline bool
dir_emit(struct dir_context *ctx, const char *name, int namelen,
zpl_dir_emit(zpl_dir_context_t *ctx, const char *name, int namelen,
uint64_t ino, unsigned type)
{
return (!ctx->actor(ctx->dirent, name, namelen, ctx->pos, ino, type));
}
static inline bool
dir_emit_dot(struct file *file, struct dir_context *ctx)
zpl_dir_emit_dot(struct file *file, zpl_dir_context_t *ctx)
{
return (ctx->actor(ctx->dirent, ".", 1, ctx->pos,
file_inode(file)->i_ino, DT_DIR) == 0);
}
static inline bool
dir_emit_dotdot(struct file *file, struct dir_context *ctx)
zpl_dir_emit_dotdot(struct file *file, zpl_dir_context_t *ctx)
{
return (ctx->actor(ctx->dirent, "..", 2, ctx->pos,
parent_ino(file_dentry(file)), DT_DIR) == 0);
}
static inline bool
dir_emit_dots(struct file *file, struct dir_context *ctx)
zpl_dir_emit_dots(struct file *file, zpl_dir_context_t *ctx)
{
if (ctx->pos == 0) {
if (!dir_emit_dot(file, ctx))
if (!zpl_dir_emit_dot(file, ctx))
return (false);
ctx->pos = 1;
}
if (ctx->pos == 1) {
if (!dir_emit_dotdot(file, ctx))
if (!zpl_dir_emit_dotdot(file, ctx))
return (false);
ctx->pos = 2;
}
@ -182,4 +189,13 @@ dir_emit_dots(struct file *file, struct dir_context *ctx)
}
#endif /* HAVE_VFS_ITERATE */
/*
* Linux 4.18, inode times converted from timespec to timespec64.
*/
#if defined(HAVE_INODE_TIMESPEC64_TIMES)
#define zpl_inode_timespec_trunc(ts, gran) timespec64_trunc(ts, gran)
#else
#define zpl_inode_timespec_trunc(ts, gran) timespec_trunc(ts, gran)
#endif
#endif /* _SYS_ZPL_H */

View File

@ -300,6 +300,20 @@ efi_get_info(int fd, struct dk_cinfo *dki_info)
rval = sscanf(dev_path, "/dev/loop%[0-9]p%hu",
dki_info->dki_dname + 4,
&dki_info->dki_partition);
} else if ((strncmp(dev_path, "/dev/nvme", 9) == 0)) {
strcpy(dki_info->dki_cname, "nvme");
dki_info->dki_ctype = DKC_SCSI_CCS;
strcpy(dki_info->dki_dname, "nvme");
(void) sscanf(dev_path, "/dev/nvme%[0-9]",
dki_info->dki_dname + 4);
size_t controller_length = strlen(
dki_info->dki_dname);
strcpy(dki_info->dki_dname + controller_length,
"n");
rval = sscanf(dev_path,
"/dev/nvme%*[0-9]n%[0-9]p%hu",
dki_info->dki_dname + controller_length + 1,
&dki_info->dki_partition);
} else {
strcpy(dki_info->dki_dname, "unknown");
strcpy(dki_info->dki_cname, "unknown");
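
Review note on the NVMe branch above: the device name is built in two `sscanf()` passes, first the controller number, then the namespace and partition. The sketch below isolates that logic so it can be exercised on its own. `sketch_parse_nvme()` and `SKETCH_DNAME_LEN` are hypothetical names; the field-width limits and the check on the first `sscanf()` are additions for the sketch and are not in the hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define	SKETCH_DNAME_LEN	32

/*
 * Returns the conversion count of the second sscanf() pass
 * (2 when a partition suffix is present), or -1 on a non-NVMe path.
 */
int
sketch_parse_nvme(const char *dev_path, char dname[SKETCH_DNAME_LEN],
    unsigned short *partition)
{
	size_t controller_length;

	if (strncmp(dev_path, "/dev/nvme", 9) != 0)
		return (-1);

	strcpy(dname, "nvme");
	/* First pass: append the controller number, e.g. "nvme0". */
	if (sscanf(dev_path, "/dev/nvme%8[0-9]", dname + 4) != 1)
		return (-1);

	controller_length = strlen(dname);
	strcpy(dname + controller_length, "n");

	/* Second pass: skip the controller, read namespace and partition. */
	return (sscanf(dev_path, "/dev/nvme%*[0-9]n%8[0-9]p%hu",
	    dname + controller_length + 1, partition));
}
```

For `/dev/nvme0n1p2` this yields the name `nvme0n1` and partition 2. A path without a `pN` suffix still fills in the name but converts only one field.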

View File

@ -218,7 +218,7 @@ smb_enable_share_one(const char *sharename, const char *sharepath)
int rc;
/* Support ZFS share name regexp '[[:alnum:]_-.: ]' */
strncpy(name, sharename, sizeof (name));
strlcpy(name, sharename, sizeof (name));
name [sizeof (name)-1] = '\0';
pos = name;
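
Review note on the `strncpy()` to `strlcpy()` change above: `strncpy()` leaves the destination unterminated when the source is at least as long as the buffer, which is why the following line forces a terminator. `strlcpy()` always terminates for a non-zero size and returns the source length, so truncation can be detected. Since `strlcpy()` is not available in every libc, the sketch below uses a hypothetical stand-in, `sketch_strlcpy()`, with the same contract.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copies at most size - 1 bytes,
 * always terminates when size > 0, and returns strlen(src).
 */
size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (len);
}
```

A return value greater than or equal to the buffer size means the name was truncated.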

View File

@ -19,8 +19,6 @@ noinst_LTLIBRARIES = libspl.la
USER_C = \
getexecname.c \
gethrtime.c \
gethrestime.c \
getmntany.c \
list.c \
mkdirp.c \

View File

@ -1,38 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2008 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
#include <time.h>
#include <sys/time.h>
void
gethrestime(timestruc_t *ts)
{
struct timeval tv;
gettimeofday(&tv, NULL);
ts->tv_sec = tv.tv_sec;
ts->tv_nsec = tv.tv_usec * NSEC_PER_USEC;
}

@@ -1,45 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2008 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
#include <time.h>
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
hrtime_t
gethrtime(void)
{
struct timespec ts;
int rc;
rc = clock_gettime(CLOCK_MONOTONIC, &ts);
if (rc) {
fprintf(stderr, "Error: clock_gettime() = %d\n", rc);
abort();
}
return ((((u_int64_t)ts.tv_sec) * NANOSEC) + ts.tv_nsec);
}

@@ -55,6 +55,7 @@ extern "C" {
#endif
#define _SUNOS_VTOC_16
#define HAVE_EFFICIENT_UNALIGNED_ACCESS
/* i386 arch specific defines */
#elif defined(__i386) || defined(__i386__)
@@ -76,6 +77,7 @@ extern "C" {
#endif
#define _SUNOS_VTOC_16
#define HAVE_EFFICIENT_UNALIGNED_ACCESS
/* powerpc arch specific defines */
#elif defined(__powerpc) || defined(__powerpc__) || defined(__powerpc64__)
@@ -99,6 +101,7 @@ extern "C" {
#endif
#define _SUNOS_VTOC_16
#define HAVE_EFFICIENT_UNALIGNED_ACCESS
/* arm arch specific defines */
#elif defined(__arm) || defined(__arm__) || defined(__aarch64__)
@@ -129,6 +132,10 @@ extern "C" {
#define _SUNOS_VTOC_16
#if defined(__ARM_FEATURE_UNALIGNED)
#define HAVE_EFFICIENT_UNALIGNED_ACCESS
#endif
/* sparc arch specific defines */
#elif defined(__sparc) || defined(__sparc__)

@@ -304,6 +304,8 @@ typedef struct kstat32 {
#define KSTAT_FLAG_PERSISTENT 0x08
#define KSTAT_FLAG_DORMANT 0x10
#define KSTAT_FLAG_INVALID 0x20
#define KSTAT_FLAG_LONGSTRINGS 0x40
#define KSTAT_FLAG_NO_HEADERS 0x80
/*
* Dynamic update support

@@ -27,8 +27,9 @@
#ifndef _LIBSPL_SYS_TIME_H
#define _LIBSPL_SYS_TIME_H
#include_next <sys/time.h>
#include <time.h>
#include <sys/types.h>
#include_next <sys/time.h>
#ifndef SEC
#define SEC 1
@@ -70,13 +71,33 @@
#define SEC2NSEC(m) ((hrtime_t)(m) * (NANOSEC / SEC))
#endif
typedef long long hrtime_t;
typedef struct timespec timestruc_t;
typedef struct timespec timespec_t;
typedef struct timespec timespec_t;
typedef struct timespec inode_timespec_t;
static inline void
gethrestime(inode_timespec_t *ts)
{
struct timeval tv;
(void) gettimeofday(&tv, NULL);
ts->tv_sec = tv.tv_sec;
ts->tv_nsec = tv.tv_usec * NSEC_PER_USEC;
}
extern hrtime_t gethrtime(void);
extern void gethrestime(timestruc_t *);
static inline time_t
gethrestime_sec(void)
{
struct timeval tv;
(void) gettimeofday(&tv, NULL);
return (tv.tv_sec);
}
static inline hrtime_t
gethrtime(void)
{
struct timespec ts;
(void) clock_gettime(CLOCK_MONOTONIC, &ts);
return ((((u_int64_t)ts.tv_sec) * NANOSEC) + ts.tv_nsec);
}
#endif /* _LIBSPL_SYS_TIME_H */
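The inline `gethrtime()` in this header reads `CLOCK_MONOTONIC` and folds seconds and nanoseconds into one 64-bit nanosecond count, which suits interval measurement because the clock does not jump when the wall-clock time is adjusted. A minimal standalone sketch of that conversion follows. The names `monotonic_ns` and `elapsed_ns` are hypothetical, and the local `hrtime_t` and `NANOSEC` definitions are stand-ins so the block compiles without the libspl headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Local stand-ins for the libspl definitions (assumed values). */
typedef long long hrtime_t;
#define NANOSEC 1000000000LL

/* Hypothetical helper: current monotonic time in nanoseconds. */
static hrtime_t
monotonic_ns(void)
{
	struct timespec ts;

	/* The return value is ignored here, as in the inline version. */
	(void) clock_gettime(CLOCK_MONOTONIC, &ts);
	return (((hrtime_t)ts.tv_sec * NANOSEC) + ts.tv_nsec);
}

/* Hypothetical helper: nanoseconds elapsed since an earlier sample. */
static hrtime_t
elapsed_ns(hrtime_t start)
{
	return (monotonic_ns() - start);
}
```

A caller would sample `monotonic_ns()` before the work of interest and pass that value to `elapsed_ns()` afterwards. The removed `gethrtime.c` aborted when `clock_gettime()` failed, whereas this sketch, like the inline version, ignores the error.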

Some files were not shown because too many files have changed in this diff.