Compare commits

...

69 Commits

Author SHA1 Message Date
Tony Hutter 2bc71fa976 Prepare to release 0.6.5.11
META file and RPM release log updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-07-10 11:01:14 -07:00
Tony Hutter 5a20d4283c Linux 4.12 compat: super_setup_bdi_name() - add missing code
This includes code that was mistakenly left out of the 7dae2c8 merge into
0.6.5.10.  Its inclusion fixes a kernel warning on Kubuntu 17.04:

	WARN_ON(sb->s_bdi != &noop_backing_dev_info);

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6089
Closes #6324
(backported from zfs upstream commit 7dae2c81e7)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
2017-07-10 11:00:34 -07:00
alaviss bf04e4d442 Musl libc fixes
Musl libc's <stdio.h> doesn't include <stdarg.h>, which causes
`va_start` and `va_end` to end up as undefined symbols.
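
A minimal, hedged illustration of the fix pattern (the function and message
are illustrative, not code from this commit):

    #include <stdio.h>
    #include <stdarg.h>    /* must be included explicitly on musl libc */

    static void
    log_msg(const char *fmt, ...)
    {
        va_list ap;

        va_start(ap, fmt);
        (void) vfprintf(stderr, fmt, ap);
        va_end(ap);
    }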

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Leorize <alaviss@users.noreply.github.com>
Closes #6310
2017-07-06 15:25:39 -07:00
DHE 5e6057b574 Increase zfs_vdev_async_write_min_active to 2
Resilver operations frequently cause only a small amount of dirty data
to be written to disk at a time, causing the I/O scheduler to issue
only one write at a time to the resilvering disk. With rotational
media the drive will often travel past the next sector to be written
before receiving a write command from ZFS, significantly delaying the
write of that sector.

Raise zfs_vdev_async_write_min_active so that drives are kept fed
during resilvering.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Issue #4825
Closes #5926
2017-07-06 15:25:39 -07:00
loli10K 94d353a0bf Fix int overflow in zbookmark_is_before()
When the DSL scan code tries to resume the scrub from the saved
zbookmark calls dsl_scan_check_resume()->zbookmark_is_before() to
decide if the current dnode still needs to be visited.

A subtle int overflow condition in zbookmark_is_before(), exacerbated
by bumping the indirect block size to 128K (d7958b4), can lead to the
wrong assumption that the dnode does not need to be scanned.

This results in scrubs completing "successfully" in a matter of
minutes on pools with several TB of used space, because every time we
try to resume the dnode traversal on a dataset, zbookmark_is_before()
tells us the whole objset has already been scanned completely.

Fix this by forcing the right shift operator to be executed before
the multiplication, as done in zbookmark_compare() (fcff0f3).
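
A hedged, generic illustration of this class of bug (the values are
hypothetical, not the actual ZFS expression):

    #include <stdint.h>

    static uint64_t
    overflow_demo(void)
    {
        uint64_t blkid = 1ULL << 40;    /* hypothetical large block id */
        uint64_t span = 1ULL << 30;     /* hypothetical span in bytes */
        int shift = 9;

        /* Multiplying first overflows the 64-bit intermediate (2^70). */
        uint64_t wrong = (blkid * span) >> shift;
        /* Shifting first keeps the intermediate in range (2^61). */
        uint64_t right = blkid * (span >> shift);

        return (right - wrong);
    }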

Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
2017-07-06 15:25:39 -07:00
Tony Hutter e9fc1bd5e6 Fix RHEL 7.4 bio_set_op_attrs build error
On RHEL 7.4, include/linux/bio.h now includes a macro for
bio_set_op_attrs that conflicts with the ifndef in ZFS
include/linux/blkdev_compat.h.  This patch fixes the build.
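
A hedged sketch of the resulting compat pattern; the fallback body here is
illustrative only, while HAVE_BIO_SET_OP_ATTRS matches the configure check
added later in this diff:

    #if !defined(HAVE_BIO_SET_OP_ATTRS)
    /* fallback definition shown only for illustration */
    #define bio_set_op_attrs(bio, op, flags) ((bio)->bi_rw |= (op) | (flags))
    #endif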

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #6234
Closes #6271
2017-07-06 15:25:39 -07:00
Tony Hutter b88f4d7ba7 GCC 7.1 fixes
GCC 7.1 will warn when we're not checking the snprintf()
return code in cases where the buffer could be truncated. This
patch either checks the snprintf() return code (where applicable),
or simply disables the warnings (ztest.c).
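
A hedged example of the kind of check the warning asks for (the helper and
format are illustrative):

    #include <stdio.h>
    #include <stddef.h>

    /* Return 0 on success, -1 if the output would have been truncated. */
    static int
    format_name(char *buf, size_t buflen, const char *dev, int part)
    {
        int n = snprintf(buf, buflen, "%s-part%d", dev, part);

        if (n < 0 || (size_t)n >= buflen)
            return (-1);
        return (0);
    }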

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #6253
2017-07-06 15:25:39 -07:00
Brian Behlendorf 3e297b90f5 Remove complicated libspl assert wrappers
Effectively provide our own version of assert()/verify() for use
in user space.  This minimizes our dependencies and aligns the
user space assertion handling with what's used in the kernel.
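
A minimal hedged sketch of a user-space VERIFY() in this spirit (not the
actual libspl macro):

    #include <stdio.h>
    #include <stdlib.h>

    #define VERIFY(cond)                                            \
        do {                                                        \
            if (!(cond)) {                                          \
                (void) fprintf(stderr,                              \
                    "%s:%d: verification failed: %s\n",             \
                    __FILE__, __LINE__, #cond);                     \
                abort();                                            \
            }                                                       \
        } while (0)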

Signed-off-by: Carlo Landmeter <clandmeter@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4449
2017-07-06 15:25:39 -07:00
Justin Lecher 709f25e248 Compatibility with glibc-2.23
In glibc-2.23 <sys/sysmacros.h> isn't automatically included in
<sys/types.h> [1], so we need to explicitly include it.

https://sourceware.org/ml/libc-alpha/2015-11/msg00253.html
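
A hedged illustration of the explicit include (the helper is illustrative):

    #include <sys/sysmacros.h>  /* required for major()/minor()/makedev() */
    #include <sys/types.h>
    #include <sys/stat.h>

    static int
    same_device(const struct stat *a, const struct stat *b)
    {
        return (major(a->st_dev) == major(b->st_dev) &&
            minor(a->st_dev) == minor(b->st_dev));
    }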

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Justin Lecher <jlec@gentoo.org>
Closes #6132
2017-07-06 15:25:39 -07:00
Olaf Faaland cd2209b75e glibc 2.5 compat: use correct header for makedev() et al.
In glibc 2.5, makedev(), major(), and minor() are defined in
sys/sysmacros.h.  They are also defined in types.h for backward
compatibility, but using these definitions triggers a compile warning.
This breaks the ZFS build, as it builds with -Werror.

autoconf email threads indicate these macros may be defined in
sys/mkdev.h in some cases.

This commit adds configure checks to detect where makedev() is defined:
  sys/sysmacros.h
  sys/mkdev.h

It assumes major() and minor() are defined in the same place.

The libspl types.h then includes
	sys/sysmacros.h (preferred) or
	sys/mkdev.h (2nd choice)
if one of those defines makedev().

This is done before including the system types.h.
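
A hedged sketch of the resulting include order (the HAVE_* macro names are
assumed to be what the configure checks define):

    #if defined(HAVE_MAKEDEV_IN_SYSMACROS)
    #include <sys/sysmacros.h>      /* preferred */
    #elif defined(HAVE_MAKEDEV_IN_MKDEV)
    #include <sys/mkdev.h>          /* 2nd choice */
    #endif
    /* (the system <sys/types.h> is included after this point) */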

An alternative would be to remove uses of major, minor, and makedev,
instead comparing the st_dev returned from stat64.  These configure
checks would then be unnecessary.

This change revealed that __NORETURN was being defined unnecessarily in
libspl/include/sys/sysmacros.h.  That definition is removed.

The files in which __NORETURN is used all include types.h, and so all
will get the definition provided by feature_tests.h.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5945
2017-07-06 15:25:39 -07:00
Tony Hutter a57fa2c532 Prepare to release 0.6.5.10
META file and RPM release log updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-06-12 15:31:33 -04:00
Brian Behlendorf 590509b75e Add MS_MANDLOCK mount failure message
Commit torvalds/linux@9e8925b6 allowed for kernels to be built
without support for mandatory locking (MS_MANDLOCK).  This will
result in 'zfs mount' failing when the nbmand=on property is set
if the kernel is built without CONFIG_MANDATORY_FILE_LOCKING.

Unfortunately we can not reliably detect prior to the mount(2) system
call if the kernel was built with this support.  The best we can do
is check if the mount failed with EPERM and if we passed 'mand'
as a mount option and then print a more useful error message. e.g.

  filesystem 'tank/fs' has the 'nbmand=on' property set, this mount
  option may be disabled in your kernel.  Use 'zfs set nbmand=off'
  to disable this option and try to mount the filesystem again.

Additionally, switch the default error message case to use
strerror() to produce a more human readable message.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4729
Closes #6199
2017-06-09 14:05:15 -07:00
Matthew Ahrens d07a8deac8 OpenZFS 8005 - poor performance of 1MB writes on certain RAID-Z configurations
Authored by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Don Brady <don.brady@intel.com>
Ported-by: Matt Ahrens <mahrens@delphix.com>

RAID-Z requires that space be allocated in multiples of P+1 sectors,
because this is the minimum size block that can have the required amount
of parity.  Thus blocks on RAIDZ1 must be allocated in a multiple of 2
sectors; on RAIDZ2 multiple of 3; and on RAIDZ3 multiple of 4.  A sector
is a unit of 2^ashift bytes, typically 512B or 4KB.

To satisfy this constraint, the allocation size is rounded up to the
proper multiple, resulting in up to 3 "pad sectors" at the end of some
blocks.  The contents of these pad sectors are not used, so we do not
need to read or write these sectors.  However, some storage hardware
performs much worse (around 1/2 as fast) on mostly-contiguous writes
when there are small gaps of non-overwritten data between the writes.
Therefore, ZFS creates "optional" zio's when writing RAID-Z blocks that
include pad sectors.  If writing a pad sector will fill the gap between
two (required) writes, we will issue the optional zio, thus doubling
performance.  The gap-filling performance improvement was introduced in
July 2009.

Writing the optional zio is done by the io aggregation code in
vdev_queue.c.  The problem is that it is also subject to the limit on
the size of aggregate writes, zfs_vdev_aggregation_limit, which is by
default 128KB.  For a given block, if the amount of data plus padding
written to a leaf device exceeds zfs_vdev_aggregation_limit, the
optional zio will not be written, resulting in a ~2x performance
degradation.

The problem occurs only for certain values of ashift, compressed block
size, and RAID-Z configuration (number of parity and data disks).  It
cannot occur with the default recordsize=128KB.  If compression is
enabled, all configurations with recordsize=1MB or larger will be
impacted to some degree.

The problem notably occurs with recordsize=1MB, compression=off, with 10
disks in a RAIDZ2 or RAIDZ3 group (with 512B or 4KB sectors).  Therefore
this problem has been known as "the 1MB 10-wide RAIDZ2 (or 3) problem".
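
A hedged worked example for the 4KB-sector (ashift=12), 10-wide RAIDZ2 case
(editorial arithmetic derived from the description above):

    /*
     * data sectors   = 1MB / 4KB                     = 256
     * parity sectors = ceil(256 / 8 data disks) * 2  = 64
     * allocated      = 320, rounded up to a multiple of P+1 = 3 -> 321
     * per leaf vdev  = ceil(321 / 10) = 33 sectors   = 132KB
     *
     * 132KB exceeds the default 128KB zfs_vdev_aggregation_limit, so the
     * optional pad-sector zio is not aggregated with its neighbors.
     */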

The problem also occurs with the following configurations:

With recordsize=512KB or 256KB, compression=off, the problem occurs only
in rarely-used configurations:
* 4-wide RAIDZ1 with recordsize=512KB and ashift=12 (4KB sectors)
* 4-wide RAIDZ2 (either recordsize, either ashift)
* 5-wide RAIDZ2 with recordsize=512KB (either ashift)
* 6-wide RAIDZ2 with recordsize=512KB (either ashift)

With recordsize=1MB, compression=off, ashift=9 (512B sectors)
* RAIDZ1 with 4 or 8 disks
* RAIDZ2 with 4, 8, or 10 disks
* RAIDZ3 with 6, 8, 9, or 10 disks

With recordsize=1MB, compression=off, ashift=12 (4KB sectors)
* RAIDZ1 with 7 or 8 disks
* RAIDZ2 with 4, 5, or 10 disks
* RAIDZ3 with 6, 9, or 10 disks

With recordsize=2MB and larger (which can only be selected by changing
kernel tunables), many configurations are affected, including with
higher numbers of disks (up to 18 disks with recordsize=2MB).

Increasing zfs_vdev_aggregation_limit allows the optional zio to be
aggregated, thus eliminating the problem; setting it to 256KB fixes all
commonly-used configurations.

The solution implemented here, however, is to aggregate optional zio's
regardless of the aggregation size limit.

Analysis sponsored by Intel Corp.

OpenZFS-issue: https://www.illumos.org/issues/8005
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/321
Closes #5931
2017-06-09 14:05:15 -07:00
Chunwei Chen 69494c6aff Fix import wrong spare/l2 device when path change
If, for example, your aux device was /dev/sdc, but it has since been
removed and /dev/sdc now points to another device, zpool import will
still use that device and corrupt it.

The problem is that spa_validate_aux in spa_import, rather than
validating the on-disk label, would actually write a label to disk. We
remove these calls, since spa_load_{spares,l2cache} seems to do everything
we need and actually validates the on-disk label.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6158
2017-06-09 14:05:15 -07:00
Chunwei Chen 412e3c26a9 Fix import finding spare/l2cache when path changes
When a spare or l2cache device path changes, zpool import will not fix up
the path like it does for a normal vdev. The issue is that when you supply
a pool name argument to zpool import, it is used to filter out devices
which don't have the pool name in their label. Since spare and l2cache
devices never have that in their label, they'll always get filtered out.

We fix this by making sure we never filter out a spare or l2cache
device.

Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6158
2017-06-09 14:05:15 -07:00
LOLi ed9cb8390b Linux 4.9 compat: fix zfs_ctldir xattr handling
Since torvalds/linux@d0a5b99 IOP_XATTR is used to indicate the inode
has xattr support: clear it for the ctldir inodes to avoid EIO errors.
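
A hedged fragment of the idea ('ip' is the control-directory inode being
set up; exact placement is illustrative):

    #if defined(IOP_XATTR)
        ip->i_opflags &= ~IOP_XATTR;    /* no xattr handlers for .zfs inodes */
    #endif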

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6189
2017-06-09 14:05:15 -07:00
LOLi cb8210d125 Linux 4.12 compat: fix super_setup_bdi_name() call
Provide a format parameter to super_setup_bdi_name() so we don't
create duplicate names in '/devices/virtual/bdi' sysfs namespace which
would prevent us from mounting more than one ZFS filesystem at a time.
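
A hedged kernel-side sketch; the counter name and format string are
illustrative:

    static atomic_long_t zfs_bdi_seq = ATOMIC_LONG_INIT(0);

    static int
    zfs_setup_bdi(struct super_block *sb)
    {
        return (super_setup_bdi_name(sb, "%.28s-%ld", "zfs",
            atomic_long_inc_return(&zfs_bdi_seq)));
    }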

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6147
2017-06-09 14:05:15 -07:00
Brian Behlendorf 21fd04ec40 Linux 4.12 compat: CURRENT_TIME removed
Linux 4.9 added current_time() as the preferred interface to get
the filesystem time.  CURRENT_TIME was retired in Linux 4.12.
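
A hedged compat sketch (the wrapper name is illustrative; HAVE_CURRENT_TIME
matches the configure check added later in this diff):

    #if defined(HAVE_CURRENT_TIME)
    #define zpl_inode_timestamp(ip)     current_time(ip)
    #else
    #define zpl_inode_timestamp(ip)     CURRENT_TIME
    #endif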

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6114
2017-06-09 14:05:15 -07:00
Brian Behlendorf e4cb6ee6a5 Linux 4.12 compat: super_setup_bdi_name()
All filesystems were converted to dynamically allocated BDIs.  The
destruction of backing_dev_info structures is handled as part of
super block destruction.  Refactor the code to abstract away the
details of creating and destroying a BDI.

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6089
2017-06-09 14:05:15 -07:00
Brian Behlendorf a83a4f9d10 Limit zfs_dirty_data_max_max to 4G
Reinstate default 4G zfs_dirty_data_max_max limit.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6072
Closes #6081
2017-06-09 14:05:15 -07:00
Matthew Ahrens 1e5f75ecbe OpenZFS 8166 - zpool scrub thinks it repaired offline device
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Matthew Ahrens <mahrens@delphix.com>

If we do a scrub while a leaf device is offline (via "zpool offline"),
we will inadvertently clear the DTL (dirty time log) of the offline
device, even though it is still damaged.  When the device comes back
online, we will incompletely resilver it, thinking that the scrub
repaired blocks written before the scrub was started.  The incomplete
resilver can lead to data loss if there is a subsequent failure of a
different leaf device.

The fix is to never clear the DTL of offline devices.  Note that if a
device is onlined while a scrub is in progress, the scrub will be
restarted.

The problem can be worked around by running "zpool scrub" after
"zpool online".

OpenZFS-issue: https://www.illumos.org/issues/8166
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/372
Closes #5806
Closes #6103
2017-06-09 14:05:15 -07:00
Ned Bass 36ccb9db43 vdev_id: fix failure due to multipath -l bug
Udev may fail to create the expected symbolic links in
/dev/disk/by-vdev on systems with the
device-mapper-multipath-0.4.9-100.el6 package installed. This affects
RHEL 6.9 and possibly other downstream distributions.

That version of the multipath command may incorrectly list a drive
state as "unkown" instead of "running". The issue was introduced
in the patch for https://bugzilla.redhat.com/show_bug.cgi?id=1401769

The vdev_id udev helper uses the state reported by "multipath -l" to
detect an online component disk of a multipath device in order to
resolve its physical slot and enclosure. Changing the command
invocation to "multipath -ll" works around the above issue by causing
multipath to consult additional sources of information to determine
the drive state.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Closes #6039
2017-06-09 14:05:15 -07:00
jxiong a2c9518711 Guarantee PAGESIZE alignment for large zio buffers
In the current implementation, only zio buffers of 16KB and larger are
guaranteed PAGESIZE alignment. This breaks Lustre, since it assumes
that 'arc_buf_t::b_data' must be page aligned when zio buffers are
greater than or equal to PAGESIZE.

This patch makes zio buffers PAGESIZE aligned whenever their size is
at least PAGESIZE.

This change may waste a little memory, but that should be
fine because after ABD is introduced, zio buffers are used to hold
data temporarily and live in memory for a short while.

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Closes #6084
2017-06-09 14:05:15 -07:00
Tony Hutter cc519c4027 Fix harmless "BARRIER is deprecated" kernel warning on Centos 6.8
A one-time warning after module load that "BARRIER is deprecated" was seen
on the heavily patched 2.6.32-642.13.1.el6.x86_64 CentOS 6.8 kernel.  It seems
that kernel had both the old BARRIER and the newer FLUSH/FUA interfaces
defined.  This fixes the warning by preferring the newer FLUSH/FUA interface
if it's available.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #5739
Closes #5828
2017-06-09 14:05:15 -07:00
Chunwei Chen dbb48937ce Add kmap_atomic in dmu_bio_copy
This is needed for 32-bit systems.
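
A hedged kernel-side sketch of why the mapping is needed on 32-bit (the
helper name is illustrative):

    static void
    copy_bvec_to_buf(void *dst, struct bio_vec *bv)
    {
        /* the segment's page may be in highmem with no permanent mapping */
        void *src = kmap_atomic(bv->bv_page);

        memcpy(dst, (char *)src + bv->bv_offset, bv->bv_len);
        kunmap_atomic(src);
    }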

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-06-09 14:05:15 -07:00
Tim Chase 34a3a7c660 zdb: segfault in dump_bpobj_subobjs()
Avoid buffer overrun on all-zero bpobj subobjects by using signed
array index.  Also fix the type cast on the printf() argument.

Signed-off-by: Tim Chase <tim@onlight.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3905
2017-06-09 14:05:15 -07:00
Brian Behlendorf 4a4c57d5ff Fix atomic_sub_64() i386 assembly implementation
The atomic_sub_64() implementation should use sbbl instead of adcl.  In
user space these atomics are used for statistics tracking and aren't
critical, which explains how this was overlooked.  The kernel-space
implementations of these atomics are layered on the architecture-specific
implementations provided by the kernel.

Reviewed by: Stefan Ring <stefanrin@gmail.com>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5671
Closes #5717
2017-06-09 14:05:15 -07:00
Chunwei Chen 2094a93e87 Fix loop device becomes read-only
Commit 933ec99 removed read and write from f_op because the VFS layer will
select iter_write or aio_write automatically. However, for Linux <= 4.0,
loop_set_fd will actually check f_op->write and mark the device read-only
if it doesn't exist. This patch adds them back, using the generic
do_sync_{read,write} for aio_{read,write} and new_sync_{read,write} for
{read,write}_iter.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5776
Closes #5855
2017-06-09 14:05:15 -07:00
loli10K 03336d011c Allow ZVOL bookmarks to be listed recursively
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #4503
Closes #5072
2017-06-09 14:05:15 -07:00
Brian Behlendorf f0a4bfbe4d Fix zfs-mount.service failure on boot
The mount(8) command will helpfully try to resolve any device name
which is passed in.  It does this by applying some simple heuristics
before passing it along to the registered mount helper.

Normally this is fine.  However, one of these heuristics is to prepend
the current working directory to the passed device name.  If the
resulting directory name exists, mount(8) will perform the mount(2)
system call and never invoke the helper utility.

Since the cwd for systemd, when running as the system instance, is
the root directory, the default mount points created by zfs(8) can
cause a mount failure.

This change avoids the issue by explicitly setting the cwd to
a different path when performing the mount.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5719
2017-06-09 14:05:15 -07:00
Brian Behlendorf ebef1f2fb6 Fix iput() calls within a tx
As explicitly stated in section 2 of the 'Programming rules'
comments at the top of zfs_vnops.c.

  If you must call iput() within a tx then use zfs_iput_async().

Move iput() calls after dmu_tx_commit() / dmu_tx_abort when
possible.  When not possible convert the iput() calls to
zfs_iput_async().
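
A hedged sketch of the two remedies (the function shape is illustrative;
types are assumed from the ZFS headers):

    static void
    finish_and_release(dmu_tx_t *tx, struct inode *ip, boolean_t can_move_iput)
    {
        if (can_move_iput) {
            dmu_tx_commit(tx);
            iput(ip);               /* safe: no longer inside the tx */
        } else {
            zfs_iput_async(ip);     /* defer the final iput */
            dmu_tx_commit(tx);
        }
    }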

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5758
2017-06-09 14:05:15 -07:00
Chunwei Chen 00a1a11989 Fix off by one in zpl_lookup
Running the following command would return success, with zfs creating an
orphan object.

	touch $(for i in $(seq 256); do printf "n"; done)

The funny thing is that this will only work once for each directory, because
after it is upgraded to fzap, zfs_lookup will fail properly since it has an
additional length check.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5768
2017-06-09 14:05:15 -07:00
Olaf Faaland b4c181dc76 Linux 4.11 compat: iops.getattr and friends
In torvalds/linux@a528d35, there are changes to the getattr family of functions,
struct kstat, and the interface of inode_operations .getattr.

The inode_operations .getattr and simple_getattr() interface changed to:

int (*getattr) (const struct path *, struct dentry *, struct kstat *,
    u32 request_mask, unsigned int query_flags)

The request_mask argument indicates which field(s) the caller intends to use.
Fields the caller has not specified via request_mask may be set in the returned
struct anyway, but their values may be approximate.

The query_flags argument indicates whether the filesystem must update
the attributes from the backing store.

Currently both fields are ignored.  It is possible that getattr-related
functions within zfs could be optimized based on the request_mask.

struct kstat includes new fields:
u32               result_mask;  /* What fields the user got */
u64               attributes;   /* See STATX_ATTR_* flags */
struct timespec   btime;        /* File creation time */

Fields attributes and btime are cleared; the result_mask reflects this.  These
appear to be optional based on simple_getattr() and vfs_getattr() within the
kernel, which take the same approach.

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5875
2017-06-09 14:05:15 -07:00
Olaf Faaland 626ba3142b Linux 4.11 compat: avoid refcount_t name conflict
Linux 4.11 introduces a new type, refcount_t, which conflicts with the
type of the same name defined within ZFS.

Rename the ZFS type zfs_refcount_t.  Within the ZFS code, use a macro to
cause references to refcount_t to be changed to zfs_refcount_t at
compile time.  This reduces conflicts when later landing OpenZFS
patches.
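
A hedged sketch of the rename-by-macro approach described above:

    #include <sys/types.h>

    typedef struct zfs_refcount {
        uint64_t    rc_count;   /* (other ZFS refcount state elided) */
    } zfs_refcount_t;

    /*
     * Existing ZFS code keeps using the name refcount_t; mapping it at
     * compile time avoids colliding with the kernel's new refcount_t.
     */
    #define refcount_t  zfs_refcount_t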

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #5823
Closes #5842
2017-06-09 14:05:15 -07:00
Brian Behlendorf 0bbd80c058 Prepare to release 0.6.5.9
META file and RPM release log updated.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2017-02-03 13:11:42 -08:00
Chunwei Chen 10fbf7c406 Make zfs mount according to relatime config in dataset
Also enable lazytime in mount.zfs

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482
2017-02-03 11:58:19 -08:00
Chunwei Chen 1ad7f89628 Enable lazytime semantic for atime
Linux 4.0 introduces lazytime. The idea is that when we update the atime, we
delay writing it to disk for as long as reasonably possible.

When lazytime is enabled, dirty_inode will be called with only the I_DIRTY_TIME
flag whenever i_atime is updated. Under that condition we set z_atime_dirty.
We only write it to disk when the file is closed, the inode is evicted, or
setattr is called. Ideally, we should also write it whenever the SA is going
to be updated, but that is left as a future improvement.

There's one thing we need to take care of now that we allow i_atime to be
dirty. In the original implementation, whenever the SA is modified,
zfs_inode_update is called to overwrite everything in the inode. This causes
the dirty i_atime to be discarded. We fix this by not overwriting i_atime in
zfs_inode_update; we only overwrite i_atime when allocating a new inode or
doing zfs_rezget, via zfs_inode_update_new.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482
2017-02-03 11:58:19 -08:00
Chunwei Chen 5137c95dec Fix atime handling and relatime
The problem for atime:

We have 3 places for atime: inode->i_atime, znode->z_atime and the SA, and
their handling is a mess. A huge part of the atime mess comes from
zfs_tstamp_update_setup, zfs_inode_update, and zfs_getattr, which behave
inconsistently with those three values.

zfs_tstamp_update_setup clears z_atime_dirty unconditionally as long as you
don't pass ATTR_ATIME, which means every write(2) operation which only updates
ctime and mtime will cause atime changes to never be written to disk.

Also, zfs_inode_update from write(2) will replace inode->i_atime with what's
inside the SA (stale), but doesn't touch z_atime. So after read(2) and write(2)
you'll have i_atime(stale), z_atime(new), SA(stale) and z_atime_dirty=0.

Now, if you do stat(2), zfs_getattr will actually replace i_atime with what's
inside z_atime. So now you'll have i_atime(new), z_atime(new), SA(stale) and
z_atime_dirty=0. These will all be gone after umount, and you'll be left with
a stale atime.

The problem for relatime:

We do have a relatime config on the ZFS dataset, but how it should interact
with the MS_RELATIME mount flag is not well defined. It seems the relatime
mount option was meant to override the dataset config, showing it as temporary
in `zfs get`, while at the same time `zfs set relatime=on|off` also seems to
want to override the mount option. Not to mention that the MS_RELATIME flag is
actually never passed into ZFS, so it never really worked.

How Linux handles atime:

The Linux kernel actually handles atime completely in the VFS, except for
writing it to disk. So if we remove the atime handling in ZFS, things just
work, no matter whether it's strictatime, relatime, noatime, or even
O_NOATIME. And whenever the VFS updates i_atime, it notifies the underlying
filesystem via sb->dirty_inode().

There's also one thing to note about atime flags like MS_RELATIME and
other flags like MS_NODEV: they are mount point flags rather than
filesystem (sb) flags. Since a native Linux filesystem can be mounted at
multiple places at the same time, each mount can have different atime
settings, so these flags are never passed down to filesystem drivers.

What this patch tries to do:

We remove znode->z_atime, since we won't gain anything from it. We remove most
of the atime handling and leave it to the VFS. The only thing we do with atime
is to write it when dirty_inode() or setattr() is called. We also add
file_accessed() in zpl_read() since it's not provided by vfs_read().

After this patch, only the MS_RELATIME flag will have an effect; the setting on
the dataset won't do anything. We will make zfsutil mount ZFS with MS_RELATIME
set according to the dataset setting in a future patch.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482
2017-02-03 11:58:19 -08:00
Chunwei Chen a0e099580a Fix write(2) returns zero bug from 933ec99
For generic_write_checks with 2 args, we can exit when it returns zero because
it means count is zero. However this is not the case for generic_write_checks
with 4 args, where zero means no error.
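
Hedged fragments contrasting the two conventions (the surrounding variables
are illustrative):

    /* Newer kernels, 2-arg form: the return value is the adjusted byte
     * count, so returning when it is <= 0 is correct (0 = nothing to
     * write). */
    ssize_t count = generic_write_checks(kiocb, from);
    if (count <= 0)
        return (count);

    /* Older kernels, 4-arg form: the return value is an error code, so
     * zero means success and must not be treated as "nothing to write". */
    int error = generic_write_checks(file, &pos, &wr_count, isblk);
    if (error)
        return (error);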

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Haakan T Johansson <f96hajo@chalmers.se>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5720
Closes #5726
2017-02-03 10:25:41 -08:00
Chunwei Chen 5070e5311c Retire .write/.read file operations
The .write/.read file operations callbacks can be retired since
support for .read_iter/.write_iter and .aio_read/.aio_write has
been added.  The vfs_write()/vfs_read() entry functions will
select the correct interface for the kernel.  This is desirable
because all VFS write/read operations now rely on common code.

This change also adds the generic write checks to make sure that
ulimits are enforced correctly on write.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5587
Closes #5673
2017-02-03 10:25:37 -08:00
Chunwei Chen 110470266d Fix zmo leak when zfs_sb_create fails
zfs_sb_create normally takes ownership of zmo, and it will be freed in
zfs_sb_free. However, when zfs_sb_create fails we need to explicitly free it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5490
Closes #5496
2017-02-03 10:25:33 -08:00
Chunwei Chen d425320ac8 Fix fchange in zpl_ioctl_setflags
The fchange check in zpl_ioctl_setflags was meant to detect flag changes.
However, it was incorrect and would always fail to detect a flag change from
set to unset, allowing users without CAP_LINUX_IMMUTABLE to unset those flags.
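
A hedged, generic sketch of detecting a change in either direction (not the
actual ZFS macro):

    static int
    flag_toggled(unsigned int oldflags, unsigned int newflags, unsigned int bit)
    {
        return (!!(oldflags & bit) != !!(newflags & bit));
    }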

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:25:29 -08:00
Chunwei Chen 2a51899946 Fix wrong operator in xvattr.h
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:25:25 -08:00
Chunwei Chen f3da7a1b40 Don't count '@' for dataset namelen if not a snapshot
Don't count '@' for dataset namelen if not a snapshot.  This
fixes making a pool unimportable when the  dataset namelen
is 255.

Add test file for zfs create name length 255.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5432
Closes #5456
2017-02-03 10:25:22 -08:00
Richard Yao 625ee0a5e0 zfs_inode_update should not call dmu_object_size_from_db under spinlock
We should never block while holding a spin lock, but zfs_inode_update can
block inside the critical section of a spin lock:

zfs_inode_update -> dmu_object_size_from_db -> zrl_add -> mutex_enter

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #3858
2017-02-03 10:25:19 -08:00
Gvozden Neskovic 9dd467a271 Fix ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE check
Pass `ACL_TYPE_ACCESS` for type parameter of `set_cached_acl()` and
`forget_cached_acl()` to avoid removal of dead code after BUG() in
compile time. Tested on 3.2.0 kernel.

Introduced in 3779913

Reviewed-by: Massimo Maggi <me@massimo-maggi.eu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5378
2017-02-03 10:25:15 -08:00
Isaac Huang 6ebfe58117 Explicit block device plugging when submitting multiple BIOs
Without plugging, the default 'noop' scheduler will not merge
the BIOs which are part of a large ZIO.
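
A hedged kernel-side sketch of the plugging pattern (the helper and the
one-arg submit_bio() form are illustrative):

    static void
    submit_zio_bios(struct bio **bios, int bio_count)
    {
        struct blk_plug plug;
        int i;

        blk_start_plug(&plug);
        for (i = 0; i < bio_count; i++)
            submit_bio(bios[i]);    /* all BIOs belonging to one large ZIO */
        blk_finish_plug(&plug);
    }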

Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Isaac Huang <he.huang@intel.com>
Closes #5181
2017-02-03 10:25:12 -08:00
Tim Chase 39d65926c9 4.10 compat - BIO flag changes and others
[bio] The req_op enum was changed to req_opf.  Update the "Linux 4.8 API"
autotools checks to use an int to determine whether the various REQ_OP
values are defined.  This should work properly on kernels >= 4.8.

[bio] bio_set_op_attrs() is now an inline function and can't be detected
with #ifdef.  Add a configure check to determine whether bio_set_op_attrs()
is defined.  Move the local definition of it from vdev_disk.c to
blkdev_compat.h for consistency with other related compatibility shims.

[bio] The read/write flags and their modifiers, including WRITE_FLUSH,
WRITE_FUA and WRITE_FLUSH_FUA have been removed from fs.h.  Add the new
bio_set_flush() compatibility wrapper to replace VDEV_WRITE_FLUSH_FUA
and set the flags appropriately for each supported kernel version.

[vfs] The generic_readlink() function has been made static.  If .readlink
in inode_operations is NULL, generic_readlink() is used.

[zol typo] Completely unrelated to 4.10 compat, fix a typo in the check
for REQ_OP_SECURE_ERASE so that the proper macro is defined:

    s/HAVE_REQ_OP_SECURE_DISCARD/HAVE_REQ_OP_SECURE_ERASE/

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5499
2017-02-03 10:25:07 -08:00
Brian Behlendorf a57228e51c Reorder HAVE_BIO_RW_* checks
The HAVE_BIO_RW_* #ifdef's must appear before REQ_* #ifdef's
in the bio_is_flush() and bio_is_discard() macros.  Linux 2.6.32
era kernels defined both sets of values, and in that case the HAVE_BIO_RW_*
checks must be used.  Getting the order wrong resulted in a panic in zconfig
test 5.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #4951
Closes #4959
2017-02-03 10:25:03 -08:00
Brian Behlendorf bea68ec5bf Remove custom root pool import code
Non-Linux OpenZFS implementations require additional support to be
used as a root pool.  This code should simply be removed to avoid
confusion and improve readability.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #4951
2017-02-03 10:24:59 -08:00
Tim Chase 88fa992878 Fix sync behavior for disk vdevs
Prior to b39c22b, which was first generally available in the 0.6.5
release, ZoL never actually submitted synchronous read or write
requests to the Linux block layer.  This means the vdev_disk_dio_is_sync()
function had always returned false and, therefore, the completion in
dio_request_t.dr_comp was never actually used.

In b39c22b, synchronous ZIO operations were translated to synchronous
BIO requests in vdev_disk_io_start().  The follow-on commits 5592404 and
aa159af fixed several problems introduced by b39c22b.  In particular,
5592404 introduced the new flag parameter "wait" to __vdev_disk_physio()
but under ZoL, since vdev_disk_physio() is never actually used, the wait
flag was always zero so the new code had no effect other than to cause
a bug in the use of the dio_request_t.dr_comp which was fixed by aa159af.

The original rationale for introducing synchronous operations in b39c22b
was to hurry certain requests through the BIO layer which would have
otherwise been subject to its unplug timer, increasing their
latency.  This behavior of the unplug timer, however, went away during the
transition of the plug/unplug system between kernels 2.6.32 and 2.6.39.

To handle the unplug timer behavior on 2.6.32-2.6.35 kernels the
BIO_RW_UNPLUG flag is used as a hint to suppress the plugging behavior.

For kernels 2.6.36-2.6.38, the REQ_UNPLUG macro is available and is
used for the same purpose.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4858
2017-02-03 10:24:54 -08:00
Chunwei Chen c09af45f7b Use set_cached_acl and forget_cached_acl when possible
Originally, these two functions were inline, so their usability was tied to
posix_acl_release. However, since Linux 3.14 they are EXPORT_SYMBOL, so we
can always use them. In this patch, we create an independent test for these
two functions so we can use them when possible.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:24:50 -08:00
Chunwei Chen 64c259c509 Batch free zpl_posix_acl_release
Currently every call to zpl_posix_acl_release schedules a delayed task,
and each delayed task adds a timer. This used to be fine, aside from a
possible performance impact.

However, Linux 4.8 introduced a new timer wheel implementation[1]. In
this new implementation, the larger the delay, the less accurate the timer is.
So when we have a flood of timers from zpl_posix_acl_release, they will all
expire at the same time. Coupled with the fact that task_expire does a linear
search with a lock held, this causes an extreme amount of contention inside
the interrupt handler and can actually lock up the system.

We fix this by batching the frees to prevent a flood of delayed tasks. Every
call to zpl_posix_acl_release puts the posix_acl to be freed on a lockless
list. Every batch window (1 sec), zpl_posix_acl_free fires and frees every
posix_acl on the list that has passed the grace period. This way, we only
have one delayed task per second.

[1] https://lwn.net/Articles/646950/
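
A hedged sketch of the batching scheme (the wrapper type and names are
hypothetical, not the actual implementation):

    struct acl_rele_node {
        struct llist_node   node;
        struct posix_acl    *acl;
        unsigned long       expire;     /* jiffies when it may be freed */
    };

    static LLIST_HEAD(acl_rele_list);

    static void
    acl_rele_batched(struct posix_acl *acl)
    {
        struct acl_rele_node *n = kmalloc(sizeof (*n), GFP_KERNEL);

        if (n == NULL) {
            /* out-of-memory handling elided in this sketch */
            return;
        }
        n->acl = acl;
        n->expire = jiffies + HZ;               /* one-second grace period */
        llist_add(&n->node, &acl_rele_list);    /* lockless, no per-call timer */
        /* A single delayed work item, scheduled once per second, drains the
         * list and frees every node whose expire time has passed. */
    }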

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:24:45 -08:00
Neal Gompa (ニール・ゴンパ) 447040c31d Process all systemd services through the systemd scriptlets
This patch ensures that all systemd services are processed through the
systemd scriptlets, so that services are properly configured per the
preset file installed by the package.

Without this, zfs.target is set, but none of the services are enabled per
the preset file, meaning automounting filesystems and such won't work
out of the box.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa13@gmail.com>
Closes #5356
2017-02-03 10:24:41 -08:00
tuxoko 734e235f67 Fix cred leak in zpl_fallocate_common
This is caught by kmemleak when running compress_004_pos

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5244
Closes #5330
2017-02-03 10:24:38 -08:00
Hajo Möller ffcd0c5434 Fix lookup_bdev() on Ubuntu
Ubuntu added support for checking inode permissions to lookup_bdev() in kernel
commit 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee (merged in 4.4.0-6.21).
Upstream bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1636517

This patch adds a configure test for Ubuntu's variant of lookup_bdev() and
calls the function in the correct way.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Hajo Möller <dasjoe@gmail.com>
Closes #5336
2017-02-03 10:24:34 -08:00
LOLi d2beed9116 Fix uninitialized variable snapprops_nvlist in zfs_receive_one
The variable snapprops_nvlist was never initialized, so properties
were not applied to the received snapshot.

Additionally, add zfs_receive_013_pos.ksh script to ZFS test suite to exercise
'zfs receive' functionality for user properties.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #4338
2017-02-03 10:24:30 -08:00
Tim Chase 4c83fa9b87 Write issue taskq shouldn't be dynamic
This is as much an upstream compatibility fix as it is a bit of a performance
gain.

The illumos taskq implementation doesn't allow a TASKQ_THREADS_CPU_PCT type
to be dynamic, and in fact enforces as much with an ASSERT.

As to performance, if this taskq is dynamic it can cause excessive
contention on tq_lock as threads are created and destroyed, because it
can see bursts of many thousands of tasks in a short time, particularly
in heavy high-concurrency zvol write workloads.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5236
2017-02-03 10:24:26 -08:00
Brian Behlendorf cbf8713874 Use large stacks when available
While stack size varies by architecture, it has historically defaulted to
8K on x86_64 systems.  However, as of Linux 3.15 the default thread stack
size was increased to 16K.  These kernels are now the default in most non-
enterprise distributions, which means we no longer need to assume 8K stacks.

This patch takes advantage of that fact by appropriately reverting stack
conservation changes which were made to ensure stability but may have had
a negative impact on performance for certain workloads.  This also has the
side effect of bringing the code slightly more in line with upstream.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #4059
2017-02-03 10:24:22 -08:00
Stian Ellingsen dc3d6a6db1 Use env, not sh in zfsctl_snapshot_{,un}mount()
Call mount and umount via /usr/bin/env instead of /bin/sh in
zfsctl_snapshot_mount() and zfsctl_snapshot_unmount().

This change fixes a shell code injection flaw.  The call to /bin/sh
passed the mountpoint unescaped, only surrounded by single quotes.  A
mountpoint containing one or more single quotes would cause the command
to fail or potentially execute arbitrary shell code.

This change also provides compatibility with grsecurity patches.
Grsecurity only allows call_usermodehelper() to use helper binaries in
certain paths.  /usr/bin/* is allowed, /bin/* is not.
2017-02-03 10:24:17 -08:00
Stian Ellingsen d71db895a1 Fix use after free in zfsctl_snapshot_unmount() 2017-02-03 10:24:12 -08:00
tuxoko 42dae6d7a6 Linux 3.14 compat: assign inode->set_acl
Linux 3.14 introduces inode->set_acl(). Normally, ACL modification comes
from setxattr, which is handled by the ACL xattr_handler, and we already
handle that well. However, nfsd calls inode->set_acl directly, or returns
an error if it doesn't exist.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Massimo Maggi <me@massimo-maggi.eu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5371
Closes #5375
2017-02-03 10:24:09 -08:00
Brian Behlendorf f85c85ea06 Linux 4.9 compat: inode_change_ok() renamed setattr_prepare()
In torvalds/linux@31051c8 the inode_change_ok() function was
renamed setattr_prepare() and updated to take a dentry rather
than an inode.  Update the code to call setattr_prepare()
and add a wrapper function which calls inode_change_ok() for
older kernels.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:24:06 -08:00
Chunwei Chen 670508f080 Linux 4.9 compat: remove iops->{set,get,remove}xattr
In Linux 4.9, torvalds/linux@fd50eca, iops->{set,get,remove}xattr and
generic_{set,get,remove}xattr are removed. xattr operations will directly
go through sb->s_xattr.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:24:00 -08:00
Chunwei Chen 28172e8aa7 Linux 4.9 compat: iops->rename() wants flags
In Linux 4.9, torvalds/linux@2773bf0, iops->rename() and iops->rename2() are
merged together into iops->rename(), it now wants flags.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:23:57 -08:00
tuxoko c0716f13ef Linux 4.7 compat: Fix deadlock during lookup on case-insensitive
We must not use d_add_ci if the dentry already has the real name. Otherwise,
d_add_ci()->d_alloc_parallel() will find itself on the lookup hash and wait
on itself causing deadlock.

Tested-by: satmandu
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5124
Closes #5141
Closes #5147
Closes #5148
2017-02-03 10:23:53 -08:00
DeHackEd dbc95a682c Kernel 4.9 compat: file_operations->aio_fsync removal
Linux kernel commit 723c038475b78 removed this field.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #5393
2017-02-03 10:23:50 -08:00
Chunwei Chen 20a0763746 Remove dir inode operations from zpl_inode_operations
These operations are dir-specific; there's no point putting them in
zpl_inode_operations, which is for regular files.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
2017-02-03 10:23:47 -08:00
Brian Behlendorf e56852059f Fix uninitialized variable in avl_add()
Silence the following warning when compiling with gcc 5.4.0.
Specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609.

module/avl/avl.c: In function ‘avl_add’:
module/avl/avl.c:647:2: warning: ‘where’ may be used uninitialized
    in this function [-Wmaybe-uninitialized]
  avl_insert(tree, new_node, where);

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2017-02-03 10:23:42 -08:00
73 changed files with 1658 additions and 875 deletions

2
META
View File

@ -1,7 +1,7 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.6.5.8
Version: 0.6.5.11
Release: 1
Release-Tags: relext
License: CDDL

View File

@ -77,7 +77,10 @@ static const option_map_t option_map[] = {
{ MNTOPT_RELATIME, MS_RELATIME, ZS_COMMENT },
#endif
#ifdef MS_STRICTATIME
{ MNTOPT_DFRATIME, MS_STRICTATIME, ZS_COMMENT },
{ MNTOPT_STRICTATIME, MS_STRICTATIME, ZS_COMMENT },
#endif
#ifdef MS_LAZYTIME
{ MNTOPT_LAZYTIME, MS_LAZYTIME, ZS_COMMENT },
#endif
{ MNTOPT_CONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_FSCONTEXT, MS_COMMENT, ZS_COMMENT },
@ -605,10 +608,23 @@ main(int argc, char **argv)
"failed for unknown reason.\n"), dataset);
}
return (MOUNT_SYSERR);
#ifdef MS_MANDLOCK
case EPERM:
if (mntflags & MS_MANDLOCK) {
(void) fprintf(stderr, gettext("filesystem "
"'%s' has the 'nbmand=on' property set, "
"this mount\noption may be disabled in "
"your kernel. Use 'zfs set nbmand=off'\n"
"to disable this option and try to "
"mount the filesystem again.\n"), dataset);
return (MOUNT_SYSERR);
}
/* fallthru */
#endif
default:
(void) fprintf(stderr, gettext("filesystem "
"'%s' can not be mounted due to error "
"%d\n"), dataset, errno);
"'%s' can not be mounted: %s\n"), dataset,
strerror(errno));
return (MOUNT_USAGE);
}
}

View File

@ -184,9 +184,9 @@ sas_handler() {
return
fi
# Get the raw scsi device name from multipath -l. Strip off
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -l $DM_NAME |
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
if [ -z "$DEV" ] ; then
return

View File

@ -478,7 +478,7 @@ static void
dump_bpobj_subobjs(objset_t *os, uint64_t object, void *data, size_t size)
{
dmu_object_info_t doi;
uint64_t i;
int64_t i;
VERIFY0(dmu_object_info(os, object, &doi));
uint64_t *subobjs = kmem_alloc(doi.doi_max_offset, KM_SLEEP);
@ -497,7 +497,7 @@ dump_bpobj_subobjs(objset_t *os, uint64_t object, void *data, size_t size)
}
for (i = 0; i <= last_nonzero; i++) {
(void) printf("\t%llu\n", (longlong_t)subobjs[i]);
(void) printf("\t%llu\n", (u_longlong_t)subobjs[i]);
}
kmem_free(subobjs, doi.doi_max_offset);
}

View File

@ -444,13 +444,13 @@ zfs_for_each(int argc, char **argv, int flags, zfs_type_t types,
/*
* If we're recursive, then we always allow filesystems as
* arguments. If we also are interested in snapshots, then we
* can take volumes as well.
* arguments. If we also are interested in snapshots or
* bookmarks, then we can take volumes as well.
*/
argtype = types;
if (flags & ZFS_ITER_RECURSE) {
argtype |= ZFS_TYPE_FILESYSTEM;
if (types & ZFS_TYPE_SNAPSHOT)
if (types & (ZFS_TYPE_SNAPSHOT | ZFS_TYPE_BOOKMARK))
argtype |= ZFS_TYPE_VOLUME;
}

View File

@ -1,6 +1,8 @@
include $(top_srcdir)/config/Rules.am
AM_CFLAGS += $(DEBUG_STACKFLAGS) $(FRAME_LARGER_THAN)
# -Wnoformat-truncation to get rid of compiler warning for unchecked
# truncating snprintfs on gcc 7.1.1.
AM_CFLAGS += $(DEBUG_STACKFLAGS) $(FRAME_LARGER_THAN) $(NO_FORMAT_TRUNCATION)
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \

View File

@ -7,7 +7,8 @@ AM_CFLAGS += ${NO_BOOL_COMPARE}
AM_CFLAGS += -fno-strict-aliasing
AM_CPPFLAGS = -D_GNU_SOURCE -D__EXTENSIONS__ -D_REENTRANT
AM_CPPFLAGS += -D_POSIX_PTHREAD_SEMANTICS -D_FILE_OFFSET_BITS=64
AM_CPPFLAGS += -D_LARGEFILE64_SOURCE -DTEXT_DOMAIN=\"zfs-linux-user\"
AM_CPPFLAGS += -D_LARGEFILE64_SOURCE -DHAVE_LARGE_STACKS=1
AM_CPPFLAGS += -DTEXT_DOMAIN=\"zfs-linux-user\"
AM_CPPFLAGS += -DLIBEXECDIR=\"$(libexecdir)\"
AM_CPPFLAGS += -DRUNSTATEDIR=\"$(runstatedir)\"
AM_CPPFLAGS += -DSBINDIR=\"$(sbindir)\"

View File

@ -39,6 +39,35 @@ AC_DEFUN([ZFS_AC_KERNEL_POSIX_ACL_RELEASE], [
])
])
dnl #
dnl # 3.14 API change,
dnl # set_cached_acl() and forget_cached_acl() changed from inline to
dnl # EXPORT_SYMBOL. In the former case, they may not be usable because of
dnl # posix_acl_release. In the latter case, we can always use them.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE], [
AC_MSG_CHECKING([whether set_cached_acl() is usable])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <linux/cred.h>
#include <linux/fs.h>
#include <linux/posix_acl.h>
MODULE_LICENSE("$ZFS_META_LICENSE");
],[
struct inode *ip = NULL;
struct posix_acl *acl = posix_acl_alloc(1, 0);
set_cached_acl(ip, ACL_TYPE_ACCESS, acl);
forget_cached_acl(ip, ACL_TYPE_ACCESS);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_CACHED_ACL_USABLE, 1,
[posix_acl_release() is usable])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 3.1 API change,
dnl # posix_acl_chmod_masq() is not exported anymore and posix_acl_chmod()
@ -249,13 +278,38 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
])
])
dnl #
dnl # 3.14 API change,
dnl # Check if inode_operations contains the function set_acl
dnl #
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_MSG_CHECKING([whether iops->set_acl() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int set_acl_fn(struct inode *inode, struct posix_acl *acl, int type)
{ return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.set_acl = set_acl_fn,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 4.7 API change,
dnl # The kernel get_acl will now check cache before calling i_op->get_acl and
dnl # do set_cached_acl after that, so i_op->get_acl don't need to do that
dnl # anymore.
dnl #
AC_DEFUN([ZFS_AC_KERNE_GET_ACL_HANDLE_CACHE], [
AC_DEFUN([ZFS_AC_KERNEL_GET_ACL_HANDLE_CACHE], [
AC_MSG_CHECKING([whether uncached_acl_sentinel() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>

View File

@ -0,0 +1,21 @@
dnl #
dnl # Linux 4.9-rc5+ ABI, removal of the .aio_fsync field
dnl #
AC_DEFUN([ZFS_AC_KERNEL_AIO_FSYNC], [
AC_MSG_CHECKING([whether fops->aio_fsync() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
static const struct file_operations
fops __attribute__ ((unused)) = {
.aio_fsync = NULL,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FILE_AIO_FSYNC, 1, [fops->aio_fsync() exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -1,38 +0,0 @@
dnl #
dnl # 2.6.32 - 2.6.33, bdi_setup_and_register() is not exported.
dnl # 2.6.34 - 3.19, bdi_setup_and_register() takes 3 arguments.
dnl # 4.0 - x.y, bdi_setup_and_register() takes 2 arguments.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BDI_SETUP_AND_REGISTER], [
AC_MSG_CHECKING([whether bdi_setup_and_register() wants 2 args])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/backing-dev.h>
struct backing_dev_info bdi;
], [
char *name = "bdi";
int error __attribute__((unused)) =
bdi_setup_and_register(&bdi, name);
], [bdi_setup_and_register], [mm/backing-dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_2ARGS_BDI_SETUP_AND_REGISTER, 1,
[bdi_setup_and_register() wants 2 args])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether bdi_setup_and_register() wants 3 args])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/backing-dev.h>
struct backing_dev_info bdi;
], [
char *name = "bdi";
unsigned int cap = BDI_CAP_MAP_COPY;
int error __attribute__((unused)) =
bdi_setup_and_register(&bdi, name, cap);
], [bdi_setup_and_register], [mm/backing-dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_3ARGS_BDI_SETUP_AND_REGISTER, 1,
[bdi_setup_and_register() wants 3 args])
], [
AC_MSG_RESULT(no)
])
])
])

56
config/kernel-bdi.m4 Normal file
View File

@ -0,0 +1,56 @@
dnl #
dnl # 2.6.32 - 2.6.33, bdi_setup_and_register() is not exported.
dnl # 2.6.34 - 3.19, bdi_setup_and_register() takes 3 arguments.
dnl # 4.0 - 4.11, bdi_setup_and_register() takes 2 arguments.
dnl # 4.12 - x.y, super_setup_bdi_name() new interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BDI], [
AC_MSG_CHECKING([whether super_setup_bdi_name() exists])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
struct super_block sb;
], [
char *name = "bdi";
int error __attribute__((unused)) =
super_setup_bdi_name(&sb, name);
], [super_setup_bdi_name], [fs/super.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SUPER_SETUP_BDI_NAME, 1,
[super_setup_bdi_name() exits])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether bdi_setup_and_register() wants 2 args])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/backing-dev.h>
struct backing_dev_info bdi;
], [
char *name = "bdi";
int error __attribute__((unused)) =
bdi_setup_and_register(&bdi, name);
], [bdi_setup_and_register], [mm/backing-dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_2ARGS_BDI_SETUP_AND_REGISTER, 1,
[bdi_setup_and_register() wants 2 args])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether bdi_setup_and_register() wants 3 args])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/backing-dev.h>
struct backing_dev_info bdi;
], [
char *name = "bdi";
unsigned int cap = BDI_CAP_MAP_COPY;
int error __attribute__((unused)) =
bdi_setup_and_register(&bdi, name, cap);
], [bdi_setup_and_register], [mm/backing-dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_3ARGS_BDI_SETUP_AND_REGISTER, 1,
[bdi_setup_and_register() wants 3 args])
], [
AC_MSG_RESULT(no)
])
])
])
])

View File

@ -10,7 +10,7 @@ AC_DEFUN([ZFS_AC_KERNEL_REQ_OP_DISCARD], [
ZFS_LINUX_TRY_COMPILE([
#include <linux/blk_types.h>
],[
enum req_op op __attribute__ ((unused)) = REQ_OP_DISCARD;
int op __attribute__ ((unused)) = REQ_OP_DISCARD;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_REQ_OP_DISCARD, 1,
@ -25,10 +25,10 @@ AC_DEFUN([ZFS_AC_KERNEL_REQ_OP_SECURE_ERASE], [
ZFS_LINUX_TRY_COMPILE([
#include <linux/blk_types.h>
],[
enum req_op op __attribute__ ((unused)) = REQ_OP_SECURE_ERASE;
int op __attribute__ ((unused)) = REQ_OP_SECURE_ERASE;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_REQ_OP_SECURE_DISCARD, 1,
AC_DEFINE(HAVE_REQ_OP_SECURE_ERASE, 1,
[REQ_OP_SECURE_ERASE is defined])
],[
AC_MSG_RESULT(no)
@ -41,7 +41,7 @@ AC_DEFUN([ZFS_AC_KERNEL_REQ_OP_FLUSH], [
ZFS_LINUX_TRY_COMPILE([
#include <linux/blk_types.h>
],[
enum req_op op __attribute__ ((unused)) = REQ_OP_FLUSH;
int op __attribute__ ((unused)) = REQ_OP_FLUSH;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_REQ_OP_FLUSH, 1,
@ -65,3 +65,20 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_BI_OPF], [
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_HAVE_BIO_SET_OP_ATTRS], [
AC_MSG_CHECKING([whether bio_set_op_attrs is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/bio.h>
],[
struct bio *bio __attribute__ ((unused)) = NULL;
bio_set_op_attrs(bio, 0, 0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_SET_OP_ATTRS, 1,
[bio_set_op_attrs is available])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,44 @@
dnl #
dnl # 2.6.32-2.6.35 API - The BIO_RW_UNPLUG enum can be used as a hint
dnl # to unplug the queue.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BIO_RW_UNPLUG], [
AC_MSG_CHECKING([whether the BIO_RW_UNPLUG enum is available])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
],[
extern enum bio_rw_flags rw;
rw = BIO_RW_UNPLUG;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_HAVE_BIO_RW_UNPLUG, 1,
[BIO_RW_UNPLUG is available])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BLK_PLUG], [
AC_MSG_CHECKING([whether struct blk_plug is available])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="${NO_UNUSED_BUT_SET_VARIABLE}"
ZFS_LINUX_TRY_COMPILE([
#include <linux/blkdev.h>
],[
struct blk_plug plug;
blk_start_plug(&plug);
blk_finish_plug(&plug);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_HAVE_BLK_PLUG, 1,
[struct blk_plug is available])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])

View File

@ -0,0 +1,19 @@
dnl #
dnl # 4.9, current_time() added
dnl #
AC_DEFUN([ZFS_AC_KERNEL_CURRENT_TIME],
[AC_MSG_CHECKING([whether current_time() exists])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
struct inode ip;
struct timespec now __attribute__ ((unused));
now = current_time(&ip);
], [current_time], [fs/inode.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_CURRENT_TIME, 1, [current_time() exists])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,22 @@
dnl #
dnl # 4.10 API
dnl #
dnl # NULL inode_operations.readlink implies generic_readlink(), which
dnl # has been made static.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL], [
AC_MSG_CHECKING([whether generic_readlink is global])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
],[
int i __attribute__ ((unused));
i = generic_readlink(NULL, NULL, 0);
],[
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_GENERIC_READLINK, 1,
[generic_readlink is global])
],[
AC_MSG_RESULT([no])
])
])


@ -0,0 +1,67 @@
dnl #
dnl # Linux 4.11 API
dnl # See torvalds/linux@a528d35
dnl #
AC_DEFUN([ZFS_AC_PATH_KERNEL_IOPS_GETATTR], [
AC_MSG_CHECKING([whether iops->getattr() takes a path])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int test_getattr(
const struct path *p, struct kstat *k,
u32 request_mask, unsigned int query_flags)
{ return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.getattr = test_getattr,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_PATH_IOPS_GETATTR, 1,
[iops->getattr() takes a path])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 3.9 - 4.10 API
dnl #
AC_DEFUN([ZFS_AC_VFSMOUNT_KERNEL_IOPS_GETATTR], [
AC_MSG_CHECKING([whether iops->getattr() takes a vfsmount])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int test_getattr(
struct vfsmount *mnt, struct dentry *d,
struct kstat *k)
{ return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.getattr = test_getattr,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_VFSMOUNT_IOPS_GETATTR, 1,
[iops->getattr() takes a vfsmount])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # The interface of the getattr callback from the inode_operations
dnl # structure changed. Also, the interface of the simple_getattr()
dnl # function provided by the kernel changed.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GETATTR], [
ZFS_AC_PATH_KERNEL_IOPS_GETATTR
ZFS_AC_VFSMOUNT_KERNEL_IOPS_GETATTR
])


@ -1,17 +1,29 @@
dnl #
dnl # 2.6.27 API change
dnl # lookup_bdev() was exported.
dnl # 2.6.27, lookup_bdev() was exported.
dnl # 4.4.0-6.21 - x.y on Ubuntu, lookup_bdev() takes 2 arguments.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_LOOKUP_BDEV],
[AC_MSG_CHECKING([whether lookup_bdev() is available])
[AC_MSG_CHECKING([whether lookup_bdev() wants 1 arg])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
lookup_bdev(NULL);
], [lookup_bdev], [fs/block_dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_LOOKUP_BDEV, 1, [lookup_bdev() is available])
AC_DEFINE(HAVE_1ARG_LOOKUP_BDEV, 1, [lookup_bdev() wants 1 arg])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether lookup_bdev() wants 2 args])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
lookup_bdev(NULL, FMODE_READ);
], [lookup_bdev], [fs/block_dev.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_2ARGS_LOOKUP_BDEV, 1,
[lookup_bdev() wants 2 args])
], [
AC_MSG_RESULT(no)
])
])
])

config/kernel-rename.m4 (new file)

@ -0,0 +1,25 @@
dnl #
dnl # 4.9 API change,
dnl # iops->rename2() merged into iops->rename(), and iops->rename() now wants
dnl # flags.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_RENAME_WANTS_FLAGS], [
AC_MSG_CHECKING([whether iops->rename() wants flags])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
int rename_fn(struct inode *sip, struct dentry *sdp,
struct inode *tip, struct dentry *tdp,
unsigned int flags) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.rename = rename_fn,
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_RENAME_WANTS_FLAGS, 1, [iops->rename() wants flags])
],[
AC_MSG_RESULT(no)
])
])


@ -0,0 +1,23 @@
dnl #
dnl # 4.9 API change
dnl # The inode_change_ok() function has been renamed setattr_prepare()
dnl # and updated to take a dentry rather than an inode.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SETATTR_PREPARE],
[AC_MSG_CHECKING([whether setattr_prepare() is available])
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/fs.h>
], [
struct dentry *dentry = NULL;
struct iattr *attr = NULL;
int error;
error = setattr_prepare(dentry, attr);
], [setattr_prepare], [fs/attr.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SETATTR_PREPARE, 1,
[setattr_prepare() is available])
], [
AC_MSG_RESULT(no)
])
])


@ -1,5 +1,5 @@
dnl #
dnl # Linux 4.1.x API
dnl # Linux 3.16 API
dnl #
AC_DEFUN([ZFS_AC_KERNEL_VFS_RW_ITERATE],
[AC_MSG_CHECKING([whether fops->read/write_iter() are available])
@ -21,6 +21,47 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_RW_ITERATE],
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_VFS_RW_ITERATE, 1,
[fops->read/write_iter() are available])
ZFS_AC_KERNEL_NEW_SYNC_READ
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 4.1 API
dnl #
AC_DEFUN([ZFS_AC_KERNEL_NEW_SYNC_READ],
[AC_MSG_CHECKING([whether new_sync_read() is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
],[
new_sync_read(NULL, NULL, 0, NULL);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_NEW_SYNC_READ, 1,
[new_sync_read() is available])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 4.1.x API
dnl #
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_WRITE_CHECKS],
[AC_MSG_CHECKING([whether generic_write_checks() takes kiocb])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
],[
struct kiocb *iocb = NULL;
struct iov_iter *iov = NULL;
generic_write_checks(iocb, iov);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_WRITE_CHECKS_KIOCB, 1,
[generic_write_checks() takes kiocb])
],[
AC_MSG_RESULT(no)
])
])


@ -57,6 +57,31 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_NAME], [
])
])
dnl #
dnl # 4.9 API change,
dnl # iops->{set,get,remove}xattr and generic_{set,get,remove}xattr are
dnl # removed. xattr operations will directly go through sb->s_xattr.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_HAVE_GENERIC_SETXATTR], [
AC_MSG_CHECKING([whether generic_setxattr() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/fs.h>
#include <linux/xattr.h>
static const struct inode_operations
iops __attribute__ ((unused)) = {
.setxattr = generic_setxattr
};
],[
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_SETXATTR, 1,
[generic_setxattr() exists])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Supported xattr handler get() interfaces checked newest to oldest.
dnl #


@ -33,8 +33,12 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_BLK_QUEUE_FLUSH
ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS
ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BIO_RW_UNPLUG
ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BLK_PLUG
ZFS_AC_KERNEL_GET_DISK_RO
ZFS_AC_KERNEL_GET_GENDISK
ZFS_AC_KERNEL_HAVE_BIO_SET_OP_ATTRS
ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL
ZFS_AC_KERNEL_DISCARD_GRANULARITY
ZFS_AC_KERNEL_CONST_XATTR_HANDLER
ZFS_AC_KERNEL_XATTR_HANDLER_NAME
@ -44,6 +48,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_INODE_OWNER_OR_CAPABLE
ZFS_AC_KERNEL_POSIX_ACL_FROM_XATTR_USERNS
ZFS_AC_KERNEL_POSIX_ACL_RELEASE
ZFS_AC_KERNEL_SET_CACHED_ACL_USABLE
ZFS_AC_KERNEL_POSIX_ACL_CHMOD
ZFS_AC_KERNEL_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T
ZFS_AC_KERNEL_POSIX_ACL_VALID_WITH_NS
@ -52,7 +57,9 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_INODE_OPERATIONS_CHECK_ACL
ZFS_AC_KERNEL_INODE_OPERATIONS_CHECK_ACL_WITH_FLAGS
ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL
ZFS_AC_KERNE_GET_ACL_HANDLE_CACHE
ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL
ZFS_AC_KERNEL_INODE_OPERATIONS_GETATTR
ZFS_AC_KERNEL_GET_ACL_HANDLE_CACHE
ZFS_AC_KERNEL_SHOW_OPTIONS
ZFS_AC_KERNEL_FILE_INODE
ZFS_AC_KERNEL_FSYNC
@ -61,6 +68,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_NR_CACHED_OBJECTS
ZFS_AC_KERNEL_FREE_CACHED_OBJECTS
ZFS_AC_KERNEL_FALLOCATE
ZFS_AC_KERNEL_AIO_FSYNC
ZFS_AC_KERNEL_MKDIR_UMODE_T
ZFS_AC_KERNEL_LOOKUP_NAMEIDATA
ZFS_AC_KERNEL_CREATE_NAMEIDATA
@ -71,6 +79,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_ENCODE_FH_WITH_INODE
ZFS_AC_KERNEL_COMMIT_METADATA
ZFS_AC_KERNEL_CLEAR_INODE
ZFS_AC_KERNEL_SETATTR_PREPARE
ZFS_AC_KERNEL_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_D_MAKE_ROOT
ZFS_AC_KERNEL_D_OBTAIN_ALIAS
@ -87,17 +96,21 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_S_INSTANCES_LIST_HEAD
ZFS_AC_KERNEL_S_D_OP
ZFS_AC_KERNEL_BDI_SETUP_AND_REGISTER
ZFS_AC_KERNEL_BDI
ZFS_AC_KERNEL_SET_NLINK
ZFS_AC_KERNEL_ELEVATOR_CHANGE
ZFS_AC_KERNEL_5ARG_SGET
ZFS_AC_KERNEL_LSEEK_EXECUTE
ZFS_AC_KERNEL_VFS_ITERATE
ZFS_AC_KERNEL_VFS_RW_ITERATE
ZFS_AC_KERNEL_GENERIC_WRITE_CHECKS
ZFS_AC_KERNEL_KMAP_ATOMIC_ARGS
ZFS_AC_KERNEL_FOLLOW_DOWN_ONE
ZFS_AC_KERNEL_MAKE_REQUEST_FN
ZFS_AC_KERNEL_GENERIC_IO_ACCT
ZFS_AC_KERNEL_RENAME_WANTS_FLAGS
ZFS_AC_KERNEL_HAVE_GENERIC_SETXATTR
ZFS_AC_KERNEL_CURRENT_TIME
AS_IF([test "$LINUX_OBJ" != "$LINUX"], [
KERNELMAKE_PARAMS="$KERNELMAKE_PARAMS O=$LINUX_OBJ"
@ -468,9 +481,35 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG], [
])
])
ZFS_AC_KERNEL_CONFIG_THREAD_SIZE
ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC
])
dnl #
dnl # Check configured THREAD_SIZE
dnl #
dnl # The stack size will vary by architecture, but as of Linux 3.15 on x86_64
dnl # the default thread stack size was increased to 16K from 8K. Therefore,
dnl # on newer kernels and some architectures stack usage optimizations can be
dnl # conditionally applied to improve performance without negatively impacting
dnl # stability.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_THREAD_SIZE], [
AC_MSG_CHECKING([whether kernel was built with 16K or larger stacks])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
],[
#if (THREAD_SIZE < 16384)
#error "THREAD_SIZE is less than 16K"
#endif
],[
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_LARGE_STACKS, 1, [kernel has large stacks])
],[
AC_MSG_RESULT([no])
])
])
dnl #
dnl # Check CONFIG_DEBUG_LOCK_ALLOC
dnl #
@ -580,7 +619,7 @@ dnl #
dnl # ZFS_LINUX_CONFIG
dnl #
AC_DEFUN([ZFS_LINUX_CONFIG],
[AC_MSG_CHECKING([whether Linux was built with CONFIG_$1])
[AC_MSG_CHECKING([whether kernel was built with CONFIG_$1])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
],[

config/user-makedev.m4 (new file)

@ -0,0 +1,39 @@
dnl #
dnl # glibc 2.25
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_MAKEDEV_IN_SYSMACROS], [
AC_MSG_CHECKING([makedev() is declared in sys/sysmacros.h])
AC_TRY_COMPILE(
[
#include <sys/sysmacros.h>
],[
int k;
k = makedev(0,0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MAKEDEV_IN_SYSMACROS, 1,
[makedev() is declared in sys/sysmacros.h])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # glibc X < Y < 2.25
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_MAKEDEV_IN_MKDEV], [
AC_MSG_CHECKING([makedev() is declared in sys/mkdev.h])
AC_TRY_COMPILE(
[
#include <sys/mkdev.h>
],[
int k;
k = makedev(0,0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_MAKEDEV_IN_MKDEV, 1,
[makedev() is declared in sys/mkdev.h])
],[
AC_MSG_RESULT(no)
])
])


@ -0,0 +1,22 @@
dnl #
dnl # Check if gcc supports -Wno-format-truncation option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_NO_FORMAT_TRUNCATION], [
AC_MSG_CHECKING([for -Wno-format-truncation support])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Wno-format-truncation"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])],
[
NO_FORMAT_TRUNCATION=-Wno-format-truncation
AC_MSG_RESULT([yes])
],
[
NO_FORMAT_TRUNCATION=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_FORMAT_TRUNCATION])
])


@ -13,6 +13,9 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBBLKID
ZFS_AC_CONFIG_USER_FRAME_LARGER_THAN
ZFS_AC_CONFIG_USER_RUNSTATEDIR
ZFS_AC_CONFIG_USER_MAKEDEV_IN_SYSMACROS
ZFS_AC_CONFIG_USER_MAKEDEV_IN_MKDEV
ZFS_AC_CONFIG_USER_NO_FORMAT_TRUNCATION
dnl #
dnl # Checks for library functions
AC_CHECK_FUNCS([mlockall])


@ -11,6 +11,7 @@ Before=local-fs.target
Type=oneshot
RemainAfterExit=yes
ExecStart=@sbindir@/zfs mount -a
WorkingDirectory=-/sbin/
[Install]
WantedBy=zfs-share.service


@ -46,11 +46,6 @@
extern "C" {
#endif
#ifdef VERIFY
#undef VERIFY
#endif
#define VERIFY verify
typedef struct libzfs_fru {
char *zf_device;
char *zf_fru;


@ -261,12 +261,21 @@ bio_set_flags_failfast(struct block_device *bdev, int *flags)
/*
* 2.6.27 API change
* The function was exported for use, prior to this it existed by the
* The function was exported for use, prior to this it existed but the
* symbol was not exported.
*
* 4.4.0-6.21 API change for Ubuntu
* lookup_bdev() gained a second argument, FMODE_*, to check inode permissions.
*/
#ifndef HAVE_LOOKUP_BDEV
#define lookup_bdev(path) ERR_PTR(-ENOTSUP)
#endif
#ifdef HAVE_1ARG_LOOKUP_BDEV
#define vdev_lookup_bdev(path) lookup_bdev(path)
#else
#ifdef HAVE_2ARGS_LOOKUP_BDEV
#define vdev_lookup_bdev(path) lookup_bdev(path, 0)
#else
#define vdev_lookup_bdev(path) ERR_PTR(-ENOTSUP)
#endif /* HAVE_2ARGS_LOOKUP_BDEV */
#endif /* HAVE_1ARG_LOOKUP_BDEV */
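A minimal usage sketch for the vdev_lookup_bdev() wrapper above, illustrative only: example_blkdev_exists() and its error handling are assumptions, not part of the patch.
static int
example_blkdev_exists(const char *path)
{
	struct block_device *bdev = vdev_lookup_bdev(path);

	if (IS_ERR(bdev))
		return (PTR_ERR(bdev));

	bdput(bdev);	/* drop the reference taken by lookup_bdev() */
	return (0);
}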
/*
* 2.6.30 API change
@ -292,20 +301,59 @@ bio_set_flags_failfast(struct block_device *bdev, int *flags)
#endif /* HAVE_BDEV_LOGICAL_BLOCK_SIZE */
#endif /* HAVE_BDEV_PHYSICAL_BLOCK_SIZE */
#ifndef HAVE_BIO_SET_OP_ATTRS
/*
* 2.6.37 API change
* The WRITE_FLUSH, WRITE_FUA, and WRITE_FLUSH_FUA flags have been
* introduced as a replacement for WRITE_BARRIER. This was done to
* allow richer semantics to be expressed to the block layer. It is
* the block layers responsibility to choose the correct way to
* implement these semantics.
* Kernels without bio_set_op_attrs use bi_rw for the bio flags.
*/
#ifdef WRITE_FLUSH_FUA
#define VDEV_WRITE_FLUSH_FUA WRITE_FLUSH_FUA
#else
#define VDEV_WRITE_FLUSH_FUA WRITE_BARRIER
static inline void
bio_set_op_attrs(struct bio *bio, unsigned rw, unsigned flags)
{
bio->bi_rw |= rw | flags;
}
#endif
/*
* bio_set_flush - Set the appropriate flags in a bio to guarantee
* data are on non-volatile media on completion.
*
* 2.6.X - 2.6.36 API,
* WRITE_BARRIER - Tells the block layer to commit all previously submitted
* writes to stable storage before this one is started and that the current
* write is on stable storage upon completion. Also prevents reordering
* on both sides of the current operation.
*
* 2.6.37 - 4.8 API,
* Introduce WRITE_FLUSH, WRITE_FUA, and WRITE_FLUSH_FUA flags as a
* replacement for WRITE_BARRIER to allow expressing richer semantics
* to the block layer. It's up to the block layer to implement the
* semantics correctly. Use the WRITE_FLUSH_FUA flag combination.
*
* 4.8 - 4.9 API,
* REQ_FLUSH was renamed to REQ_PREFLUSH. For consistency with previous
* ZoL releases, prefer the WRITE_FLUSH_FUA flag set if it's available.
*
* 4.10 API,
* The read/write flags and their modifiers, including WRITE_FLUSH,
* WRITE_FUA and WRITE_FLUSH_FUA were removed from fs.h in
* torvalds/linux@70fd7614 and replaced by direct flag modification
* of the REQ_ flags in bio->bi_opf. Use REQ_PREFLUSH.
*/
static inline void
bio_set_flush(struct bio *bio)
{
#if defined(REQ_PREFLUSH) /* >= 4.10 */
bio_set_op_attrs(bio, 0, REQ_PREFLUSH);
#elif defined(WRITE_FLUSH_FUA) /* >= 2.6.37 and <= 4.9 */
bio_set_op_attrs(bio, 0, WRITE_FLUSH_FUA);
#elif defined(WRITE_BARRIER) /* < 2.6.37 */
bio_set_op_attrs(bio, 0, WRITE_BARRIER);
#else
#error "Allowing the build will cause bio_set_flush requests to be ignored."
"Please file an issue report at: "
"https://github.com/zfsonlinux/zfs/issues/new"
#endif
}
/*
* 4.8 - 4.x API,
* REQ_OP_FLUSH
@ -324,6 +372,7 @@ bio_set_flags_failfast(struct block_device *bdev, int *flags)
* and the new preflush behavior introduced in Linux 4.8. This is correct
* in all cases but may have a performance impact for some kernels. It
* has the advantage of minimizing kernel specific changes in the zvol code.
*
*/
static inline boolean_t
bio_is_flush(struct bio *bio)
@ -376,16 +425,20 @@ bio_is_fua(struct bio *bio)
*
* In all cases the normal I/O path is used for discards. The only
* difference is how the kernel tags individual I/Os as discards.
*
* Note that 2.6.32 era kernels provide both BIO_RW_DISCARD and REQ_DISCARD,
* where BIO_RW_DISCARD is the correct interface. Therefore, it is important
* that the HAVE_BIO_RW_DISCARD check occur before the REQ_DISCARD check.
*/
static inline boolean_t
bio_is_discard(struct bio *bio)
{
#if defined(HAVE_REQ_OP_DISCARD)
return (bio_op(bio) == REQ_OP_DISCARD);
#elif defined(REQ_DISCARD)
return (bio->bi_rw & REQ_DISCARD);
#elif defined(HAVE_BIO_RW_DISCARD)
return (bio->bi_rw & (1 << BIO_RW_DISCARD));
#elif defined(REQ_DISCARD)
return (bio->bi_rw & REQ_DISCARD);
#else
#error "Allowing the build will cause discard requests to become writes "
"potentially triggering the DMU_MAX_ACCESS assertion. Please file "


@ -69,45 +69,115 @@ truncate_setsize(struct inode *ip, loff_t new)
/*
* 2.6.32 - 2.6.33, bdi_setup_and_register() is not available.
* 2.6.34 - 3.19, bdi_setup_and_register() takes 3 arguments.
* 4.0 - x.y, bdi_setup_and_register() takes 2 arguments.
* 4.0 - 4.11, bdi_setup_and_register() takes 2 arguments.
* 4.12 - x.y, super_setup_bdi_name() new interface.
*/
#if defined(HAVE_2ARGS_BDI_SETUP_AND_REGISTER)
#if defined(HAVE_SUPER_SETUP_BDI_NAME)
extern atomic_long_t zfs_bdi_seq;
static inline int
zpl_bdi_setup_and_register(struct backing_dev_info *bdi, char *name)
zpl_bdi_setup(struct super_block *sb, char *name)
{
return (bdi_setup_and_register(bdi, name));
return super_setup_bdi_name(sb, "%.28s-%ld", name,
atomic_long_inc_return(&zfs_bdi_seq));
}
static inline void
zpl_bdi_destroy(struct super_block *sb)
{
}
#elif defined(HAVE_2ARGS_BDI_SETUP_AND_REGISTER)
static inline int
zpl_bdi_setup(struct super_block *sb, char *name)
{
struct backing_dev_info *bdi;
int error;
bdi = kmem_zalloc(sizeof (struct backing_dev_info), KM_SLEEP);
error = bdi_setup_and_register(bdi, name);
if (error) {
kmem_free(bdi, sizeof (struct backing_dev_info));
return (error);
}
sb->s_bdi = bdi;
return (0);
}
static inline void
zpl_bdi_destroy(struct super_block *sb)
{
struct backing_dev_info *bdi = sb->s_bdi;
bdi_destroy(bdi);
kmem_free(bdi, sizeof (struct backing_dev_info));
sb->s_bdi = NULL;
}
#elif defined(HAVE_3ARGS_BDI_SETUP_AND_REGISTER)
static inline int
zpl_bdi_setup_and_register(struct backing_dev_info *bdi, char *name)
zpl_bdi_setup(struct super_block *sb, char *name)
{
return (bdi_setup_and_register(bdi, name, BDI_CAP_MAP_COPY));
struct backing_dev_info *bdi;
int error;
bdi = kmem_zalloc(sizeof (struct backing_dev_info), KM_SLEEP);
error = bdi_setup_and_register(bdi, name, BDI_CAP_MAP_COPY);
if (error) {
kmem_free(sb->s_bdi, sizeof (struct backing_dev_info));
return (error);
}
sb->s_bdi = bdi;
return (0);
}
static inline void
zpl_bdi_destroy(struct super_block *sb)
{
struct backing_dev_info *bdi = sb->s_bdi;
bdi_destroy(bdi);
kmem_free(bdi, sizeof (struct backing_dev_info));
sb->s_bdi = NULL;
}
#else
extern atomic_long_t zfs_bdi_seq;
static inline int
zpl_bdi_setup_and_register(struct backing_dev_info *bdi, char *name)
zpl_bdi_setup(struct super_block *sb, char *name)
{
char tmp[32];
struct backing_dev_info *bdi;
int error;
bdi = kmem_zalloc(sizeof (struct backing_dev_info), KM_SLEEP);
bdi->name = name;
bdi->capabilities = BDI_CAP_MAP_COPY;
error = bdi_init(bdi);
if (error)
return (error);
sprintf(tmp, "%.28s%s", name, "-%d");
error = bdi_register(bdi, NULL, tmp,
atomic_long_inc_return(&zfs_bdi_seq));
if (error) {
bdi_destroy(bdi);
kmem_free(bdi, sizeof (struct backing_dev_info));
return (error);
}
error = bdi_register(bdi, NULL, "%.28s-%ld", name,
atomic_long_inc_return(&zfs_bdi_seq));
if (error) {
bdi_destroy(bdi);
kmem_free(bdi, sizeof (struct backing_dev_info));
return (error);
}
sb->s_bdi = bdi;
return (0);
}
static inline void
zpl_bdi_destroy(struct super_block *sb)
{
struct backing_dev_info *bdi = sb->s_bdi;
bdi_destroy(bdi);
kmem_free(bdi, sizeof (struct backing_dev_info));
sb->s_bdi = NULL;
}
#endif
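A hedged sketch of how the zpl_bdi_setup()/zpl_bdi_destroy() pair above might be consumed; example_domount() and example_umount() are hypothetical names, the real callers live in the mount/umount path of zfs_vfsops.c.
static int
example_domount(struct super_block *sb)
{
	int error;

	error = zpl_bdi_setup(sb, "zfs");
	if (error)
		return (error);

	/* ... remaining mount work ... */
	return (0);
}

static void
example_umount(struct super_block *sb)
{
	zpl_bdi_destroy(sb);
}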
@ -204,17 +274,9 @@ lseek_execute(
#include <linux/posix_acl.h>
#if defined(HAVE_POSIX_ACL_RELEASE) && !defined(HAVE_POSIX_ACL_RELEASE_GPL_ONLY)
#define zpl_posix_acl_release(arg) posix_acl_release(arg)
#define zpl_set_cached_acl(ip, ty, n) set_cached_acl(ip, ty, n)
#define zpl_forget_cached_acl(ip, ty) forget_cached_acl(ip, ty)
#else
static inline void
zpl_posix_acl_free(void *arg) {
kfree(arg);
}
void zpl_posix_acl_release_impl(struct posix_acl *);
static inline void
zpl_posix_acl_release(struct posix_acl *acl)
@ -222,12 +284,15 @@ zpl_posix_acl_release(struct posix_acl *acl)
if ((acl == NULL) || (acl == ACL_NOT_CACHED))
return;
if (atomic_dec_and_test(&acl->a_refcount)) {
taskq_dispatch_delay(system_taskq, zpl_posix_acl_free, acl,
TQ_SLEEP, ddi_get_lbolt() + 60*HZ);
}
if (atomic_dec_and_test(&acl->a_refcount))
zpl_posix_acl_release_impl(acl);
}
#endif /* HAVE_POSIX_ACL_RELEASE */
#ifdef HAVE_SET_CACHED_ACL_USABLE
#define zpl_set_cached_acl(ip, ty, n) set_cached_acl(ip, ty, n)
#define zpl_forget_cached_acl(ip, ty) forget_cached_acl(ip, ty)
#else
static inline void
zpl_set_cached_acl(struct inode *ip, int type, struct posix_acl *newer) {
struct posix_acl *older = NULL;
@ -257,7 +322,7 @@ static inline void
zpl_forget_cached_acl(struct inode *ip, int type) {
zpl_set_cached_acl(ip, type, (struct posix_acl *)ACL_NOT_CACHED);
}
#endif /* HAVE_POSIX_ACL_RELEASE */
#endif /* HAVE_SET_CACHED_ACL_USABLE */
#ifndef HAVE___POSIX_ACL_CHMOD
#ifdef HAVE_POSIX_ACL_CHMOD
@ -362,4 +427,69 @@ static inline struct inode *file_inode(const struct file *f)
#define zpl_follow_up(path) follow_up(path)
#endif
/*
* 4.9 API change
*/
#ifndef HAVE_SETATTR_PREPARE
static inline int
setattr_prepare(struct dentry *dentry, struct iattr *ia)
{
return (inode_change_ok(dentry->d_inode, ia));
}
#endif
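A minimal sketch of a setattr handler written against the 4.9+ name; on older kernels the shim above forwards to inode_change_ok(). zpl_setattr_example() is illustrative, not from the patch.
static int
zpl_setattr_example(struct dentry *dentry, struct iattr *ia)
{
	int error;

	error = setattr_prepare(dentry, ia);	/* same call on all kernels */
	if (error)
		return (error);

	/* ... apply the attribute changes ... */
	return (0);
}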
/*
* 4.11 API change
* These macros are defined by kernel 4.11. We define them so that the same
* code builds under kernels < 4.11 and >= 4.11. The macros are set to 0 so
* that it will create obvious failures if they are accidentally used when built
* against a kernel >= 4.11.
*/
#ifndef STATX_BASIC_STATS
#define STATX_BASIC_STATS 0
#endif
#ifndef AT_STATX_SYNC_AS_STAT
#define AT_STATX_SYNC_AS_STAT 0
#endif
/*
* 4.11 API change
* 4.11 takes struct path *, < 4.11 takes vfsmount *
*/
#ifdef HAVE_VFSMOUNT_IOPS_GETATTR
#define ZPL_GETATTR_WRAPPER(func) \
static int \
func(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat) \
{ \
struct path path = { .mnt = mnt, .dentry = dentry }; \
return func##_impl(&path, stat, STATX_BASIC_STATS, \
AT_STATX_SYNC_AS_STAT); \
}
#elif defined(HAVE_PATH_IOPS_GETATTR)
#define ZPL_GETATTR_WRAPPER(func) \
static int \
func(const struct path *path, struct kstat *stat, u32 request_mask, \
unsigned int query_flags) \
{ \
return (func##_impl(path, stat, request_mask, query_flags)); \
}
#else
#error
#endif
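Illustrative use of ZPL_GETATTR_WRAPPER(), assuming a path-style _impl function similar to what the zpl code provides; the body here is a simplified sketch, not the real implementation.
static int
zpl_getattr_impl(const struct path *path, struct kstat *stat,
    u32 request_mask, unsigned int query_flags)
{
	generic_fillattr(path->dentry->d_inode, stat);
	return (0);
}
/* expands to a zpl_getattr() matching whichever prototype the kernel uses */
ZPL_GETATTR_WRAPPER(zpl_getattr);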
/*
* 4.9 API change
* Preferred interface to get the current FS time.
*/
#if !defined(HAVE_CURRENT_TIME)
static inline struct timespec
current_time(struct inode *ip)
{
return (timespec_trunc(current_kernel_time(), ip->i_sb->s_time_gran));
}
#endif
#endif /* _ZFS_VFS_H */


@ -68,8 +68,9 @@
#define MNTOPT_NOFAIL "nofail" /* no failure */
#define MNTOPT_RELATIME "relatime" /* allow relative time updates */
#define MNTOPT_NORELATIME "norelatime" /* do not allow relative time updates */
#define MNTOPT_DFRATIME "strictatime" /* Deferred access time updates */
#define MNTOPT_NODFRATIME "nostrictatime" /* No Deferred access time updates */
#define MNTOPT_STRICTATIME "strictatime" /* strict access time updates */
#define MNTOPT_NOSTRICTATIME "nostrictatime" /* No strict access time updates */
#define MNTOPT_LAZYTIME "lazytime" /* Defer access time writing */
#define MNTOPT_SETUID "suid" /* Both setuid and devices allowed */
#define MNTOPT_NOSETUID "nosuid" /* Neither setuid nor devices allowed */
#define MNTOPT_OWNER "owner" /* allow owner mount */


@ -40,6 +40,17 @@ extern "C" {
*/
#define FTAG ((char *)__func__)
/*
* Starting with 4.11, torvalds/linux@f405df5, the linux kernel defines a
* refcount_t type of its own. The macro below effectively changes references
* in the ZFS code from refcount_t to zfs_refcount_t at compile time, so that
* existing code need not be altered, reducing conflicts when landing openZFS
* patches.
*/
#define refcount_t zfs_refcount_t
#define refcount_add zfs_refcount_add
#ifdef ZFS_DEBUG
typedef struct reference {
list_node_t ref_link;
@ -55,7 +66,7 @@ typedef struct refcount {
list_t rc_removed;
int64_t rc_count;
int64_t rc_removed_count;
} refcount_t;
} zfs_refcount_t;
/* Note: refcount_t must be initialized with refcount_create[_untracked]() */
@ -65,7 +76,7 @@ void refcount_destroy(refcount_t *rc);
void refcount_destroy_many(refcount_t *rc, uint64_t number);
int refcount_is_zero(refcount_t *rc);
int64_t refcount_count(refcount_t *rc);
int64_t refcount_add(refcount_t *rc, void *holder_tag);
int64_t zfs_refcount_add(refcount_t *rc, void *holder_tag);
int64_t refcount_remove(refcount_t *rc, void *holder_tag);
int64_t refcount_add_many(refcount_t *rc, uint64_t number, void *holder_tag);
int64_t refcount_remove_many(refcount_t *rc, uint64_t number, void *holder_tag);
@ -86,7 +97,7 @@ typedef struct refcount {
#define refcount_destroy_many(rc, number) ((rc)->rc_count = 0)
#define refcount_is_zero(rc) ((rc)->rc_count == 0)
#define refcount_count(rc) ((rc)->rc_count)
#define refcount_add(rc, holder) atomic_add_64_nv(&(rc)->rc_count, 1)
#define zfs_refcount_add(rc, holder) atomic_add_64_nv(&(rc)->rc_count, 1)
#define refcount_remove(rc, holder) atomic_add_64_nv(&(rc)->rc_count, -1)
#define refcount_add_many(rc, number, holder) \
atomic_add_64_nv(&(rc)->rc_count, number)
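Illustrative only: a caller written against the old names still compiles unchanged, because the compatibility defines near the top of this header rename the type and function at preprocessing time. hold_and_release() is a hypothetical example.
static void
hold_and_release(zfs_refcount_t *rc, void *tag)
{
	(void) refcount_add(rc, tag);		/* expands to zfs_refcount_add() */
	(void) refcount_remove(rc, tag);	/* unchanged by the rename */
}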


@ -56,7 +56,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__field(uint64_t, z_mapcnt)
__field(uint64_t, z_gen)
__field(uint64_t, z_size)
__array(uint64_t, z_atime, 2)
__field(uint64_t, z_links)
__field(uint64_t, z_pflags)
__field(uint64_t, z_uid)
@ -94,8 +93,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_mapcnt = zn->z_mapcnt;
__entry->z_gen = zn->z_gen;
__entry->z_size = zn->z_size;
__entry->z_atime[0] = zn->z_atime[0];
__entry->z_atime[1] = zn->z_atime[1];
__entry->z_links = zn->z_links;
__entry->z_pflags = zn->z_pflags;
__entry->z_uid = zn->z_uid;
@ -124,7 +121,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
),
TP_printk("zn { id %llu unlinked %u atime_dirty %u "
"zn_prefetch %u moved %u blksz %u seq %u "
"mapcnt %llu gen %llu size %llu atime 0x%llx:0x%llx "
"mapcnt %llu gen %llu size %llu "
"links %llu pflags %llu uid %llu gid %llu "
"sync_cnt %u mode 0x%x is_sa %d "
"is_mapped %d is_ctldir %d is_stale %d inode { "
@ -134,7 +131,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_id, __entry->z_unlinked, __entry->z_atime_dirty,
__entry->z_zn_prefetch, __entry->z_moved, __entry->z_blksz,
__entry->z_seq, __entry->z_mapcnt, __entry->z_gen,
__entry->z_size, __entry->z_atime[0], __entry->z_atime[1],
__entry->z_size,
__entry->z_links, __entry->z_pflags, __entry->z_uid,
__entry->z_gid, __entry->z_sync_cnt, __entry->z_mode,
__entry->z_is_sa, __entry->z_is_mapped,


@ -37,9 +37,5 @@ typedef struct vdev_disk {
struct block_device *vd_bdev;
} vdev_disk_t;
extern int vdev_disk_physio(struct block_device *, caddr_t,
size_t, uint64_t, int, int);
extern int vdev_disk_read_rootlabel(char *, char *, nvlist_t **);
#endif /* _KERNEL */
#endif /* _SYS_VDEV_DISK_H */


@ -225,7 +225,7 @@ typedef struct xvattr {
* of requested attributes (xva_reqattrmap[]).
*/
#define XVA_SET_REQ(xvap, attr) \
ASSERT((xvap)->xva_vattr.va_mask | AT_XVATTR); \
ASSERT((xvap)->xva_vattr.va_mask & AT_XVATTR); \
ASSERT((xvap)->xva_magic == XVA_MAGIC); \
(xvap)->xva_reqattrmap[XVA_INDEX(attr)] |= XVA_ATTRBIT(attr)
/*
@ -233,7 +233,7 @@ typedef struct xvattr {
* of requested attributes (xva_reqattrmap[]).
*/
#define XVA_CLR_REQ(xvap, attr) \
ASSERT((xvap)->xva_vattr.va_mask | AT_XVATTR); \
ASSERT((xvap)->xva_vattr.va_mask & AT_XVATTR); \
ASSERT((xvap)->xva_magic == XVA_MAGIC); \
(xvap)->xva_reqattrmap[XVA_INDEX(attr)] &= ~XVA_ATTRBIT(attr)
@ -242,7 +242,7 @@ typedef struct xvattr {
* of returned attributes (xva_rtnattrmap[]).
*/
#define XVA_SET_RTN(xvap, attr) \
ASSERT((xvap)->xva_vattr.va_mask | AT_XVATTR); \
ASSERT((xvap)->xva_vattr.va_mask & AT_XVATTR); \
ASSERT((xvap)->xva_magic == XVA_MAGIC); \
(XVA_RTNATTRMAP(xvap))[XVA_INDEX(attr)] |= XVA_ATTRBIT(attr)
@ -251,7 +251,7 @@ typedef struct xvattr {
* to see of the corresponding attribute bit is set. If so, returns non-zero.
*/
#define XVA_ISSET_REQ(xvap, attr) \
((((xvap)->xva_vattr.va_mask | AT_XVATTR) && \
((((xvap)->xva_vattr.va_mask & AT_XVATTR) && \
((xvap)->xva_magic == XVA_MAGIC) && \
((xvap)->xva_mapsize > XVA_INDEX(attr))) ? \
((xvap)->xva_reqattrmap[XVA_INDEX(attr)] & XVA_ATTRBIT(attr)) : 0)
@ -261,7 +261,7 @@ typedef struct xvattr {
* to see of the corresponding attribute bit is set. If so, returns non-zero.
*/
#define XVA_ISSET_RTN(xvap, attr) \
((((xvap)->xva_vattr.va_mask | AT_XVATTR) && \
((((xvap)->xva_vattr.va_mask & AT_XVATTR) && \
((xvap)->xva_magic == XVA_MAGIC) && \
((xvap)->xva_mapsize > XVA_INDEX(attr))) ? \
((XVA_RTNATTRMAP(xvap))[XVA_INDEX(attr)] & XVA_ATTRBIT(attr)) : 0)
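A small illustration, not part of the patch, of why the operator change matters; AT_XVATTR is only assumed here to be a single nonzero flag bit.
static int
xvattr_bit_is_set(uint64_t va_mask)
{
	/* old test: (va_mask | AT_XVATTR) is always nonzero, so it never failed */
	/* new test: nonzero only when the AT_XVATTR bit was actually set */
	return ((va_mask & AT_XVATTR) != 0);
}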


@ -64,7 +64,6 @@ typedef struct zfs_mntopts {
typedef struct zfs_sb {
struct super_block *z_sb; /* generic super_block */
struct backing_dev_info z_bdi; /* generic backing dev info */
struct zfs_sb *z_parent; /* parent fs */
objset_t *z_os; /* objset reference */
zfs_mntopts_t *z_mntopts; /* passed mount options */


@ -198,7 +198,6 @@ typedef struct znode {
uint64_t z_mapcnt; /* number of pages mapped to file */
uint64_t z_gen; /* generation (cached) */
uint64_t z_size; /* file size (cached) */
uint64_t z_atime[2]; /* atime (cached) */
uint64_t z_links; /* file links (cached) */
uint64_t z_pflags; /* pflags (cached) */
uint64_t z_uid; /* uid fuid (cached) */
@ -304,16 +303,12 @@ extern unsigned int zfs_object_mutex_size;
#define STATE_CHANGED (ATTR_CTIME)
#define CONTENT_MODIFIED (ATTR_MTIME | ATTR_CTIME)
#define ZFS_ACCESSTIME_STAMP(zsb, zp) \
if ((zsb)->z_atime && !(zfs_is_readonly(zsb))) \
zfs_tstamp_update_setup(zp, ACCESSED, NULL, NULL, B_FALSE);
extern int zfs_init_fs(zfs_sb_t *, znode_t **);
extern void zfs_set_dataprop(objset_t *);
extern void zfs_create_fs(objset_t *os, cred_t *cr, nvlist_t *,
dmu_tx_t *tx);
extern void zfs_tstamp_update_setup(znode_t *, uint_t, uint64_t [2],
uint64_t [2], boolean_t);
uint64_t [2]);
extern void zfs_grow_blocksize(znode_t *, uint64_t, dmu_tx_t *);
extern int zfs_freesp(znode_t *, uint64_t, uint64_t, int, boolean_t);
extern void zfs_znode_init(void);

View File

@ -76,7 +76,7 @@ extern ssize_t zpl_xattr_list(struct dentry *dentry, char *buf, size_t size);
extern int zpl_xattr_security_init(struct inode *ip, struct inode *dip,
const struct qstr *qstr);
#if defined(CONFIG_FS_POSIX_ACL)
extern int zpl_set_acl(struct inode *ip, int type, struct posix_acl *acl);
extern int zpl_set_acl(struct inode *ip, struct posix_acl *acl, int type);
extern struct posix_acl *zpl_get_acl(struct inode *ip, int type);
#if !defined(HAVE_GET_ACL)
#if defined(HAVE_CHECK_ACL_WITH_FLAGS)


@ -507,7 +507,7 @@
movl 16(%esp), %ebx
movl 20(%esp), %ecx
subl %eax, %ebx
adcl %edx, %ecx
sbbl %edx, %ecx
lock
cmpxchg8b (%edi)
jne 1b


@ -34,6 +34,7 @@
#include <sys/mnttab.h>
#include <sys/types.h>
#include <sys/sysmacros.h>
#include <sys/stat.h>
#include <unistd.h>


@ -31,69 +31,66 @@
#include <stdio.h>
#include <stdlib.h>
#ifndef __assert_c99
static inline void
__assert_c99(const char *expr, const char *file, int line, const char *func)
{
fprintf(stderr, "%s:%i: %s: Assertion `%s` failed.\n",
file, line, func, expr);
abort();
}
#endif /* __assert_c99 */
#ifndef verify
#if defined(__STDC__)
#if __STDC_VERSION__ - 0 >= 199901L
#define verify(EX) (void)((EX) || \
(__assert_c99(#EX, __FILE__, __LINE__, __func__), 0))
#else
#define verify(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
#endif /* __STDC_VERSION__ - 0 >= 199901L */
#else
#define verify(EX) (void)((EX) || (_assert("EX", __FILE__, __LINE__), 0))
#endif /* __STDC__ */
#endif /* verify */
#undef VERIFY
#undef ASSERT
#define VERIFY verify
#define ASSERT assert
extern void __assert(const char *, const char *, int);
#include <stdarg.h>
static inline int
assfail(const char *buf, const char *file, int line)
libspl_assert(const char *buf, const char *file, const char *func, int line)
{
__assert(buf, file, line);
return (0);
fprintf(stderr, "%s\n", buf);
fprintf(stderr, "ASSERT at %s:%d:%s()", file, line, func);
abort();
}
/* BEGIN CSTYLED */
#define VERIFY3_IMPL(LEFT, OP, RIGHT, TYPE) do { \
/* printf version of libspl_assert */
static inline void
libspl_assertf(const char *file, const char *func, int line, char *format, ...)
{
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
fprintf(stderr, "\n");
fprintf(stderr, "ASSERT at %s:%d:%s()", file, line, func);
va_end(args);
abort();
}
#ifdef verify
#undef verify
#endif
#define VERIFY(cond) \
(void) ((!(cond)) && \
libspl_assert(#cond, __FILE__, __FUNCTION__, __LINE__))
#define verify(cond) \
(void) ((!(cond)) && \
libspl_assert(#cond, __FILE__, __FUNCTION__, __LINE__))
#define VERIFY3_IMPL(LEFT, OP, RIGHT, TYPE) \
do { \
const TYPE __left = (TYPE)(LEFT); \
const TYPE __right = (TYPE)(RIGHT); \
if (!(__left OP __right)) { \
char *__buf = alloca(256); \
(void) snprintf(__buf, 256, "%s %s %s (0x%llx %s 0x%llx)", \
#LEFT, #OP, #RIGHT, \
(u_longlong_t)__left, #OP, (u_longlong_t)__right); \
assfail(__buf, __FILE__, __LINE__); \
} \
if (!(__left OP __right)) \
libspl_assertf(__FILE__, __FUNCTION__, __LINE__, \
"%s %s %s (0x%llx %s 0x%llx)", #LEFT, #OP, #RIGHT, \
(u_longlong_t)__left, #OP, (u_longlong_t)__right); \
} while (0)
/* END CSTYLED */
#define VERIFY3S(x, y, z) VERIFY3_IMPL(x, y, z, int64_t)
#define VERIFY3U(x, y, z) VERIFY3_IMPL(x, y, z, uint64_t)
#define VERIFY3P(x, y, z) VERIFY3_IMPL(x, y, z, uintptr_t)
#define VERIFY0(x) VERIFY3_IMPL(x, ==, 0, uint64_t)
#ifdef assert
#undef assert
#endif
#ifdef NDEBUG
#define ASSERT3S(x, y, z) ((void)0)
#define ASSERT3U(x, y, z) ((void)0)
#define ASSERT3P(x, y, z) ((void)0)
#define ASSERT0(x) ((void)0)
#define ASSERT(x) ((void)0)
#define assert(x) ((void)0)
#define ASSERTV(x)
#define IMPLY(A, B) ((void)0)
#define EQUIV(A, B) ((void)0)
@ -102,13 +99,17 @@ assfail(const char *buf, const char *file, int line)
#define ASSERT3U(x, y, z) VERIFY3U(x, y, z)
#define ASSERT3P(x, y, z) VERIFY3P(x, y, z)
#define ASSERT0(x) VERIFY0(x)
#define ASSERT(x) VERIFY(x)
#define assert(x) VERIFY(x)
#define ASSERTV(x) x
#define IMPLY(A, B) \
((void)(((!(A)) || (B)) || \
assfail("(" #A ") implies (" #B ")", __FILE__, __LINE__)))
libspl_assert("(" #A ") implies (" #B ")", \
__FILE__, __FUNCTION__, __LINE__)))
#define EQUIV(A, B) \
((void)((!!(A) == !!(B)) || \
assfail("(" #A ") is equivalent to (" #B ")", __FILE__, __LINE__)))
libspl_assert("(" #A ") is equivalent to (" #B ")", \
__FILE__, __FUNCTION__, __LINE__)))
#endif /* NDEBUG */
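A hedged user-space sketch of the reworked macros in use; example_checks() is illustrative and assumes the libspl <assert.h> above plus <stdint.h> are included.
static void
example_checks(uint64_t size, uint64_t limit, void *ptr)
{
	VERIFY3U(size, <=, limit);	/* always enabled, aborts via libspl_assertf() */
	ASSERT3P(ptr, !=, NULL);	/* compiled out when NDEBUG is defined */
	IMPLY(size > 0, ptr != NULL);	/* debug-only implication check */
}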


@ -42,7 +42,6 @@
#define makedevice(maj, min) makedev(maj, min)
#define _sysconf(a) sysconf(a)
#define __NORETURN __attribute__((noreturn))
/*
* Compatibility macros/typedefs needed for Solaris -> Linux port


@ -27,6 +27,12 @@
#ifndef _LIBSPL_SYS_TYPES_H
#define _LIBSPL_SYS_TYPES_H
#if defined(HAVE_MAKEDEV_IN_SYSMACROS)
#include <sys/sysmacros.h>
#elif defined(HAVE_MAKEDEV_IN_MKDEV)
#include <sys/mkdev.h>
#endif
#include <sys/isa_defs.h>
#include <sys/feature_tests.h>
#include_next <sys/types.h>


@ -3315,8 +3315,9 @@ zfs_check_snap_cb(zfs_handle_t *zhp, void *arg)
char name[ZFS_MAXNAMELEN];
int rv = 0;
(void) snprintf(name, sizeof (name),
"%s@%s", zhp->zfs_name, dd->snapname);
if (snprintf(name, sizeof (name), "%s@%s", zhp->zfs_name,
dd->snapname) >= sizeof (name))
return (EINVAL);
if (lzc_exists(name))
verify(nvlist_add_boolean(dd->nvl, name) == 0);
@ -3534,8 +3535,9 @@ zfs_snapshot_cb(zfs_handle_t *zhp, void *arg)
int rv = 0;
if (zfs_prop_get_int(zhp, ZFS_PROP_INCONSISTENT) == 0) {
(void) snprintf(name, sizeof (name),
"%s@%s", zfs_get_name(zhp), sd->sd_snapname);
if (snprintf(name, sizeof (name), "%s@%s", zfs_get_name(zhp),
sd->sd_snapname) >= sizeof (name))
return (EINVAL);
fnvlist_add_boolean(sd->sd_nvl, name);
@ -4257,8 +4259,9 @@ zfs_hold_one(zfs_handle_t *zhp, void *arg)
char name[ZFS_MAXNAMELEN];
int rv = 0;
(void) snprintf(name, sizeof (name),
"%s@%s", zhp->zfs_name, ha->snapname);
if (snprintf(name, sizeof (name), "%s@%s", zhp->zfs_name,
ha->snapname) >= sizeof (name))
return (EINVAL);
if (lzc_exists(name))
fnvlist_add_string(ha->nvl, name, ha->tag);
@ -4377,8 +4380,11 @@ zfs_release_one(zfs_handle_t *zhp, void *arg)
int rv = 0;
nvlist_t *existing_holds;
(void) snprintf(name, sizeof (name),
"%s@%s", zhp->zfs_name, ha->snapname);
if (snprintf(name, sizeof (name), "%s@%s", zhp->zfs_name,
ha->snapname) >= sizeof (name)) {
ha->error = EINVAL;
rv = EINVAL;
}
if (lzc_get_holds(name, &existing_holds) != 0) {
ha->error = ENOENT;


@ -1337,16 +1337,33 @@ zpool_find_import_impl(libzfs_handle_t *hdl, importargs_t *iarg)
if (config != NULL) {
boolean_t matched = B_TRUE;
boolean_t aux = B_FALSE;
char *pname;
if ((iarg->poolname != NULL) &&
/*
* Check if it's a spare or l2cache device. If
* it is, we need to skip the name and guid
* check since they don't exist on aux device
* label.
*/
if (iarg->poolname != NULL ||
iarg->guid != 0) {
uint64_t state;
aux = nvlist_lookup_uint64(config,
ZPOOL_CONFIG_POOL_STATE,
&state) == 0 &&
(state == POOL_STATE_SPARE ||
state == POOL_STATE_L2CACHE);
}
if ((iarg->poolname != NULL) && !aux &&
(nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &pname) == 0)) {
if (strcmp(iarg->poolname, pname))
matched = B_FALSE;
} else if (iarg->guid != 0) {
} else if (iarg->guid != 0 && !aux) {
uint64_t this_guid;
matched = nvlist_lookup_uint64(config,


@ -204,8 +204,11 @@ zfs_iter_bookmarks(zfs_handle_t *zhp, zfs_iter_f func, void *data)
bmark_name = nvpair_name(pair);
bmark_props = fnvpair_value_nvlist(pair);
(void) snprintf(name, sizeof (name), "%s#%s", zhp->zfs_name,
bmark_name);
if (snprintf(name, sizeof (name), "%s#%s", zhp->zfs_name,
bmark_name) >= sizeof (name)) {
err = EINVAL;
goto out;
}
nzhp = make_bookmark_handle(zhp, name, bmark_props);
if (nzhp == NULL)


@ -364,6 +364,14 @@ zfs_add_options(zfs_handle_t *zhp, char *options, int len)
error = zfs_add_option(zhp, options, len,
ZFS_PROP_ATIME, MNTOPT_ATIME, MNTOPT_NOATIME);
/*
* don't add relatime/strictatime when atime=off, otherwise strictatime
* will force atime=on
*/
if (strstr(options, MNTOPT_NOATIME) == NULL) {
error = zfs_add_option(zhp, options, len,
ZFS_PROP_RELATIME, MNTOPT_RELATIME, MNTOPT_STRICTATIME);
}
error = error ? error : zfs_add_option(zhp, options, len,
ZFS_PROP_DEVICES, MNTOPT_DEVICES, MNTOPT_NODEVICES);
error = error ? error : zfs_add_option(zhp, options, len,


@ -1487,9 +1487,13 @@ zfs_send(zfs_handle_t *zhp, const char *fromsnap, const char *tosnap,
drr_versioninfo, DMU_COMPOUNDSTREAM);
DMU_SET_FEATUREFLAGS(drr.drr_u.drr_begin.
drr_versioninfo, featureflags);
(void) snprintf(drr.drr_u.drr_begin.drr_toname,
if (snprintf(drr.drr_u.drr_begin.drr_toname,
sizeof (drr.drr_u.drr_begin.drr_toname),
"%s@%s", zhp->zfs_name, tosnap);
"%s@%s", zhp->zfs_name, tosnap) >=
sizeof (drr.drr_u.drr_begin.drr_toname)) {
err = EINVAL;
goto stderr_out;
}
drr.drr_payloadlen = buflen;
err = cksum_and_write(&drr, sizeof (drr), &zc, outfd);
@ -2689,7 +2693,8 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
ENOENT);
if (stream_avl != NULL) {
char *snapname;
char *snapname = NULL;
nvlist_t *lookup = NULL;
nvlist_t *fs = fsavl_find(stream_avl, drrb->drr_toguid,
&snapname);
nvlist_t *props;
@ -2710,6 +2715,11 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
nvlist_free(props);
if (ret != 0)
return (-1);
if (0 == nvlist_lookup_nvlist(fs, "snapprops", &lookup)) {
VERIFY(0 == nvlist_lookup_nvlist(lookup,
snapname, &snapprops_nvlist));
}
}
cp = NULL;


@ -883,7 +883,12 @@ Default value: \fB10\fR.
Minimum asynchronous write I/Os active to each device.
See the section "ZFS I/O SCHEDULER".
.sp
Default value: \fB1\fR.
Lower values are associated with better latency on rotational media but poorer
resilver performance. The default value of 2 was chosen as a compromise. A
value of 3 has been shown to improve resilver performance further at a cost of
further increasing latency.
.sp
Default value: \fB2\fR.
.RE
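Usage note, not part of the change: like other ZFS module parameters this tunable can normally be adjusted at runtime through /sys/module/zfs/parameters/zfs_vdev_async_write_min_active, or set persistently with an "options zfs zfs_vdev_async_write_min_active=3" line in a modprobe configuration file, assuming the standard module parameter mechanisms.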
.sp


@ -630,7 +630,7 @@ avl_insert_here(
void
avl_add(avl_tree_t *tree, void *new_node)
{
avl_index_t where;
avl_index_t where = 0;
/*
* This is unfortunate. We want to call panic() here, even for


@ -5475,11 +5475,12 @@ arc_init(void)
* If it has been set by a module parameter, take that.
* Otherwise, use a percentage of physical memory defined by
* zfs_dirty_data_max_percent (default 10%) with a cap at
* zfs_dirty_data_max_max (default 25% of physical memory).
* zfs_dirty_data_max_max (default 4G or 25% of physical memory).
*/
if (zfs_dirty_data_max_max == 0)
zfs_dirty_data_max_max = (uint64_t)physmem * PAGESIZE *
zfs_dirty_data_max_max_percent / 100;
zfs_dirty_data_max_max = MIN(4ULL * 1024 * 1024 * 1024,
(uint64_t)physmem * PAGESIZE *
zfs_dirty_data_max_max_percent / 100);
if (zfs_dirty_data_max == 0) {
zfs_dirty_data_max = (uint64_t)physmem * PAGESIZE *


@ -49,6 +49,7 @@
#ifdef _KERNEL
#include <sys/vmsystm.h>
#include <sys/zfs_znode.h>
#include <linux/kmap_compat.h>
#endif
/*
@ -1056,6 +1057,7 @@ dmu_bio_copy(void *arg_buf, int size, struct bio *bio, size_t bio_offset)
char *bv_buf;
int tocpy, bv_len, bv_offset;
int offset = 0;
void *paddr;
bio_for_each_segment4(bv, bvp, bio, iter) {
@ -1080,14 +1082,15 @@ dmu_bio_copy(void *arg_buf, int size, struct bio *bio, size_t bio_offset)
tocpy = MIN(bv_len, size - offset);
ASSERT3S(tocpy, >=, 0);
bv_buf = page_address(bvp->bv_page) + bv_offset;
ASSERT3P(bv_buf, !=, NULL);
paddr = zfs_kmap_atomic(bvp->bv_page, KM_USER0);
bv_buf = paddr + bv_offset;
ASSERT3P(paddr, !=, NULL);
if (bio_data_dir(bio) == WRITE)
memcpy(arg_buf + offset, bv_buf, tocpy);
else
memcpy(bv_buf, arg_buf + offset, tocpy);
zfs_kunmap_atomic(paddr, KM_USER0);
offset += tocpy;
}
out:


@ -69,7 +69,7 @@ typedef struct dump_bytes_io {
} dump_bytes_io_t;
static void
dump_bytes_strategy(void *arg)
dump_bytes_cb(void *arg)
{
dump_bytes_io_t *dbi = (dump_bytes_io_t *)arg;
dmu_sendarg_t *dsp = dbi->dbi_dsp;
@ -96,6 +96,9 @@ dump_bytes(dmu_sendarg_t *dsp, void *buf, int len)
dbi.dbi_buf = buf;
dbi.dbi_len = len;
#if defined(HAVE_LARGE_STACKS)
dump_bytes_cb(&dbi);
#else
/*
* The vn_rdwr() call is performed in a taskq to ensure that there is
* always enough stack space to write safely to the target filesystem.
@ -103,7 +106,8 @@ dump_bytes(dmu_sendarg_t *dsp, void *buf, int len)
* them and they are used in vdev_file.c for a similar purpose.
*/
spa_taskq_dispatch_sync(dmu_objset_spa(dsp->dsa_os), ZIO_TYPE_FREE,
ZIO_TASKQ_ISSUE, dump_bytes_strategy, &dbi, TQ_SLEEP);
ZIO_TASKQ_ISSUE, dump_bytes_cb, &dbi, TQ_SLEEP);
#endif /* HAVE_LARGE_STACKS */
return (dsp->dsa_err);
}


@ -671,7 +671,11 @@ dsl_dataset_namelen(dsl_dataset_t *ds)
int len;
VERIFY0(dsl_dataset_get_snapname(ds));
mutex_enter(&ds->ds_lock);
len = dsl_dir_namelen(ds->ds_dir) + 1 + strlen(ds->ds_snapname);
len = strlen(ds->ds_snapname);
/* add '@' if ds is a snap */
if (len > 0)
len++;
len += dsl_dir_namelen(ds->ds_dir);
mutex_exit(&ds->ds_lock);
return (len);
}


@ -137,7 +137,7 @@ refcount_add_many(refcount_t *rc, uint64_t number, void *holder)
}
int64_t
refcount_add(refcount_t *rc, void *holder)
zfs_refcount_add(refcount_t *rc, void *holder)
{
return (refcount_add_many(rc, 1, holder));
}


@ -845,7 +845,7 @@ spa_taskqs_init(spa_t *spa, zio_type_t t, zio_taskq_type_t q)
uint_t count = ztip->zti_count;
spa_taskqs_t *tqs = &spa->spa_zio_taskq[t][q];
char name[32];
uint_t i, flags = TASKQ_DYNAMIC;
uint_t i, flags = 0;
boolean_t batch = B_FALSE;
if (mode == ZTI_MODE_NULL) {
@ -863,6 +863,7 @@ spa_taskqs_init(spa_t *spa, zio_type_t t, zio_taskq_type_t q)
case ZTI_MODE_FIXED:
ASSERT3U(value, >=, 1);
value = MAX(value, 1);
flags |= TASKQ_DYNAMIC;
break;
case ZTI_MODE_BATCH:
@ -3862,211 +3863,6 @@ spa_create(const char *pool, nvlist_t *nvroot, nvlist_t *props,
return (0);
}
#ifdef _KERNEL
/*
* Get the root pool information from the root disk, then import the root pool
* during the system boot up time.
*/
extern int vdev_disk_read_rootlabel(char *, char *, nvlist_t **);
static nvlist_t *
spa_generate_rootconf(char *devpath, char *devid, uint64_t *guid)
{
nvlist_t *config;
nvlist_t *nvtop, *nvroot;
uint64_t pgid;
if (vdev_disk_read_rootlabel(devpath, devid, &config) != 0)
return (NULL);
/*
* Add this top-level vdev to the child array.
*/
VERIFY(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvtop) == 0);
VERIFY(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
&pgid) == 0);
VERIFY(nvlist_lookup_uint64(config, ZPOOL_CONFIG_GUID, guid) == 0);
/*
* Put this pool's top-level vdevs into a root vdev.
*/
VERIFY(nvlist_alloc(&nvroot, NV_UNIQUE_NAME, KM_SLEEP) == 0);
VERIFY(nvlist_add_string(nvroot, ZPOOL_CONFIG_TYPE,
VDEV_TYPE_ROOT) == 0);
VERIFY(nvlist_add_uint64(nvroot, ZPOOL_CONFIG_ID, 0ULL) == 0);
VERIFY(nvlist_add_uint64(nvroot, ZPOOL_CONFIG_GUID, pgid) == 0);
VERIFY(nvlist_add_nvlist_array(nvroot, ZPOOL_CONFIG_CHILDREN,
&nvtop, 1) == 0);
/*
* Replace the existing vdev_tree with the new root vdev in
* this pool's configuration (remove the old, add the new).
*/
VERIFY(nvlist_add_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, nvroot) == 0);
nvlist_free(nvroot);
return (config);
}
/*
* Walk the vdev tree and see if we can find a device with "better"
* configuration. A configuration is "better" if the label on that
* device has a more recent txg.
*/
static void
spa_alt_rootvdev(vdev_t *vd, vdev_t **avd, uint64_t *txg)
{
int c;
for (c = 0; c < vd->vdev_children; c++)
spa_alt_rootvdev(vd->vdev_child[c], avd, txg);
if (vd->vdev_ops->vdev_op_leaf) {
nvlist_t *label;
uint64_t label_txg;
if (vdev_disk_read_rootlabel(vd->vdev_physpath, vd->vdev_devid,
&label) != 0)
return;
VERIFY(nvlist_lookup_uint64(label, ZPOOL_CONFIG_POOL_TXG,
&label_txg) == 0);
/*
* Do we have a better boot device?
*/
if (label_txg > *txg) {
*txg = label_txg;
*avd = vd;
}
nvlist_free(label);
}
}
/*
* Import a root pool.
*
* For x86. devpath_list will consist of devid and/or physpath name of
* the vdev (e.g. "id1,sd@SSEAGATE..." or "/pci@1f,0/ide@d/disk@0,0:a").
* The GRUB "findroot" command will return the vdev we should boot.
*
* For Sparc, devpath_list consists the physpath name of the booting device
* no matter the rootpool is a single device pool or a mirrored pool.
* e.g.
* "/pci@1f,0/ide@d/disk@0,0:a"
*/
int
spa_import_rootpool(char *devpath, char *devid)
{
spa_t *spa;
vdev_t *rvd, *bvd, *avd = NULL;
nvlist_t *config, *nvtop;
uint64_t guid, txg;
char *pname;
int error;
/*
* Read the label from the boot device and generate a configuration.
*/
config = spa_generate_rootconf(devpath, devid, &guid);
#if defined(_OBP) && defined(_KERNEL)
if (config == NULL) {
if (strstr(devpath, "/iscsi/ssd") != NULL) {
/* iscsi boot */
get_iscsi_bootpath_phy(devpath);
config = spa_generate_rootconf(devpath, devid, &guid);
}
}
#endif
if (config == NULL) {
cmn_err(CE_NOTE, "Cannot read the pool label from '%s'",
devpath);
return (SET_ERROR(EIO));
}
VERIFY(nvlist_lookup_string(config, ZPOOL_CONFIG_POOL_NAME,
&pname) == 0);
VERIFY(nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_TXG, &txg) == 0);
mutex_enter(&spa_namespace_lock);
if ((spa = spa_lookup(pname)) != NULL) {
/*
* Remove the existing root pool from the namespace so that we
* can replace it with the correct config we just read in.
*/
spa_remove(spa);
}
spa = spa_add(pname, config, NULL);
spa->spa_is_root = B_TRUE;
spa->spa_import_flags = ZFS_IMPORT_VERBATIM;
/*
* Build up a vdev tree based on the boot device's label config.
*/
VERIFY(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvtop) == 0);
spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER);
error = spa_config_parse(spa, &rvd, nvtop, NULL, 0,
VDEV_ALLOC_ROOTPOOL);
spa_config_exit(spa, SCL_ALL, FTAG);
if (error) {
mutex_exit(&spa_namespace_lock);
nvlist_free(config);
cmn_err(CE_NOTE, "Can not parse the config for pool '%s'",
pname);
return (error);
}
/*
* Get the boot vdev.
*/
if ((bvd = vdev_lookup_by_guid(rvd, guid)) == NULL) {
cmn_err(CE_NOTE, "Can not find the boot vdev for guid %llu",
(u_longlong_t)guid);
error = SET_ERROR(ENOENT);
goto out;
}
/*
* Determine if there is a better boot device.
*/
avd = bvd;
spa_alt_rootvdev(rvd, &avd, &txg);
if (avd != bvd) {
cmn_err(CE_NOTE, "The boot device is 'degraded'. Please "
"try booting from '%s'", avd->vdev_path);
error = SET_ERROR(EINVAL);
goto out;
}
/*
* If the boot device is part of a spare vdev then ensure that
* we're booting off the active spare.
*/
if (bvd->vdev_parent->vdev_ops == &vdev_spare_ops &&
!bvd->vdev_isspare) {
cmn_err(CE_NOTE, "The boot device is currently spared. Please "
"try booting from '%s'",
bvd->vdev_parent->
vdev_child[bvd->vdev_parent->vdev_children - 1]->vdev_path);
error = SET_ERROR(EINVAL);
goto out;
}
error = 0;
out:
spa_config_enter(spa, SCL_ALL, FTAG, RW_WRITER);
vdev_free(rvd);
spa_config_exit(spa, SCL_ALL, FTAG);
mutex_exit(&spa_namespace_lock);
nvlist_free(config);
return (error);
}
#endif
/*
* Import a non-root pool into the system.
*/
@ -4166,12 +3962,6 @@ spa_import(char *pool, nvlist_t *config, nvlist_t *props, uint64_t flags)
VERIFY(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvroot) == 0);
if (error == 0)
error = spa_validate_aux(spa, nvroot, -1ULL,
VDEV_ALLOC_SPARE);
if (error == 0)
error = spa_validate_aux(spa, nvroot, -1ULL,
VDEV_ALLOC_L2CACHE);
spa_config_exit(spa, SCL_ALL, FTAG);
if (props != NULL)
@ -6780,7 +6570,6 @@ EXPORT_SYMBOL(spa_open);
EXPORT_SYMBOL(spa_open_rewind);
EXPORT_SYMBOL(spa_get_stats);
EXPORT_SYMBOL(spa_create);
EXPORT_SYMBOL(spa_import_rootpool);
EXPORT_SYMBOL(spa_import);
EXPORT_SYMBOL(spa_tryimport);
EXPORT_SYMBOL(spa_destroy);


@ -1799,6 +1799,9 @@ vdev_dtl_should_excise(vdev_t *vd)
ASSERT0(scn->scn_phys.scn_errors);
ASSERT0(vd->vdev_children);
if (vd->vdev_state < VDEV_STATE_DEGRADED)
return (B_FALSE);
if (vd->vdev_resilver_txg == 0 ||
range_tree_space(vd->vdev_dtl[DTL_MISSING]) == 0)
return (B_TRUE);


@ -41,10 +41,8 @@ static void *zfs_vdev_holder = VDEV_HOLDER;
* Virtual device vector for disks.
*/
typedef struct dio_request {
struct completion dr_comp; /* Completion for sync IO */
zio_t *dr_zio; /* Parent ZIO */
atomic_t dr_ref; /* References */
int dr_wait; /* Wait for IO */
int dr_error; /* Bio error */
int dr_bio_count; /* Count of bio's */
struct bio *dr_bio[0]; /* Attached bio's */
@ -363,7 +361,6 @@ vdev_disk_dio_alloc(int bio_count)
dr = kmem_zalloc(sizeof (dio_request_t) +
sizeof (struct bio *) * bio_count, KM_SLEEP);
if (dr) {
init_completion(&dr->dr_comp);
atomic_set(&dr->dr_ref, 0);
dr->dr_bio_count = bio_count;
dr->dr_error = 0;
@ -426,7 +423,6 @@ BIO_END_IO_PROTO(vdev_disk_physio_completion, bio, error)
{
dio_request_t *dr = bio->bi_private;
int rc;
int wait;
if (dr->dr_error == 0) {
#ifdef HAVE_1ARG_BIO_END_IO_T
@ -439,13 +435,8 @@ BIO_END_IO_PROTO(vdev_disk_physio_completion, bio, error)
#endif
}
wait = dr->dr_wait;
/* Drop reference aquired by __vdev_disk_physio */
rc = vdev_disk_dio_put(dr);
/* Wake up synchronous waiter this is the last outstanding bio */
if (wait && rc == 1)
complete(&dr->dr_comp);
}
static inline unsigned long
@ -494,11 +485,6 @@ bio_map(struct bio *bio, void *bio_ptr, unsigned int bio_size)
return (bio_size);
}
#ifndef bio_set_op_attrs
#define bio_set_op_attrs(bio, rw, flags) \
do { (bio)->bi_rw |= (rw)|(flags); } while (0)
#endif
static inline void
vdev_submit_bio_impl(struct bio *bio)
{
@ -527,13 +513,16 @@ vdev_submit_bio(struct bio *bio)
static int
__vdev_disk_physio(struct block_device *bdev, zio_t *zio, caddr_t kbuf_ptr,
size_t kbuf_size, uint64_t kbuf_offset, int rw, int flags, int wait)
size_t kbuf_size, uint64_t kbuf_offset, int rw, int flags)
{
dio_request_t *dr;
caddr_t bio_ptr;
uint64_t bio_offset;
int bio_size, bio_count = 16;
int i = 0, error = 0;
#if defined(HAVE_BLK_QUEUE_HAVE_BLK_PLUG)
struct blk_plug plug;
#endif
ASSERT3U(kbuf_offset + kbuf_size, <=, bdev->bd_inode->i_size);
@ -546,7 +535,6 @@ retry:
bio_set_flags_failfast(bdev, &flags);
dr->dr_zio = zio;
dr->dr_wait = wait;
/*
* When the IO size exceeds the maximum bio size for the request
@ -605,39 +593,26 @@ retry:
if (zio)
zio->io_delay = jiffies_64;
#if defined(HAVE_BLK_QUEUE_HAVE_BLK_PLUG)
if (dr->dr_bio_count > 1)
blk_start_plug(&plug);
#endif
/* Submit all bio's associated with this dio */
for (i = 0; i < dr->dr_bio_count; i++)
if (dr->dr_bio[i])
vdev_submit_bio(dr->dr_bio[i]);
/*
* On synchronous blocking requests we wait for all bio the completion
* callbacks to run. We will be woken when the last callback runs
* for this dio. We are responsible for putting the last dio_request
* reference will in turn put back the last bio references. The
* only synchronous consumer is vdev_disk_read_rootlabel() all other
* IO originating from vdev_disk_io_start() is asynchronous.
*/
if (wait) {
wait_for_completion(&dr->dr_comp);
error = dr->dr_error;
ASSERT3S(atomic_read(&dr->dr_ref), ==, 1);
}
#if defined(HAVE_BLK_QUEUE_HAVE_BLK_PLUG)
if (dr->dr_bio_count > 1)
blk_finish_plug(&plug);
#endif
(void) vdev_disk_dio_put(dr);
return (error);
}
int
vdev_disk_physio(struct block_device *bdev, caddr_t kbuf,
size_t size, uint64_t offset, int rw, int flags)
{
bio_set_flags_failfast(bdev, &flags);
return (__vdev_disk_physio(bdev, NULL, kbuf, size, offset, rw, flags,
1));
}
BIO_END_IO_PROTO(vdev_disk_io_flush_completion, bio, rc)
{
zio_t *zio = bio->bi_private;
@ -676,7 +651,7 @@ vdev_disk_io_flush(struct block_device *bdev, zio_t *zio)
bio->bi_private = zio;
bio->bi_bdev = bdev;
zio->io_delay = jiffies_64;
bio_set_op_attrs(bio, 0, VDEV_WRITE_FLUSH_FUA);
bio_set_flush(bio);
vdev_submit_bio(bio);
invalidate_bdev(bdev);
@ -688,7 +663,6 @@ vdev_disk_io_start(zio_t *zio)
{
vdev_t *v = zio->io_vd;
vdev_disk_t *vd = v->vdev_tsd;
zio_priority_t pri = zio->io_priority;
int rw, flags, error;
switch (zio->io_type) {
@ -729,18 +703,24 @@ vdev_disk_io_start(zio_t *zio)
return;
case ZIO_TYPE_WRITE:
rw = WRITE;
if ((pri == ZIO_PRIORITY_SYNC_WRITE) && (v->vdev_nonrot))
flags = WRITE_SYNC;
else
#if defined(HAVE_BLK_QUEUE_HAVE_BIO_RW_UNPLUG)
flags = (1 << BIO_RW_UNPLUG);
#elif defined(REQ_UNPLUG)
flags = REQ_UNPLUG;
#else
flags = 0;
#endif
break;
case ZIO_TYPE_READ:
rw = READ;
if ((pri == ZIO_PRIORITY_SYNC_READ) && (v->vdev_nonrot))
flags = READ_SYNC;
else
#if defined(HAVE_BLK_QUEUE_HAVE_BIO_RW_UNPLUG)
flags = (1 << BIO_RW_UNPLUG);
#elif defined(REQ_UNPLUG)
flags = REQ_UNPLUG;
#else
flags = 0;
#endif
break;
default:
@ -750,7 +730,7 @@ vdev_disk_io_start(zio_t *zio)
}
error = __vdev_disk_physio(vd->vd_bdev, zio, zio->io_data,
zio->io_size, zio->io_offset, rw, flags, 0);
zio->io_size, zio->io_offset, rw, flags);
if (error) {
zio->io_error = error;
zio_interrupt(zio);
@ -820,69 +800,5 @@ vdev_ops_t vdev_disk_ops = {
B_TRUE /* leaf vdev */
};
/*
* Given the root disk device devid or pathname, read the label from
* the device, and construct a configuration nvlist.
*/
int
vdev_disk_read_rootlabel(char *devpath, char *devid, nvlist_t **config)
{
struct block_device *bdev;
vdev_label_t *label;
uint64_t s, size;
int i;
bdev = vdev_bdev_open(devpath, vdev_bdev_mode(FREAD), zfs_vdev_holder);
if (IS_ERR(bdev))
return (-PTR_ERR(bdev));
s = bdev_capacity(bdev);
if (s == 0) {
vdev_bdev_close(bdev, vdev_bdev_mode(FREAD));
return (EIO);
}
size = P2ALIGN_TYPED(s, sizeof (vdev_label_t), uint64_t);
label = vmem_alloc(sizeof (vdev_label_t), KM_SLEEP);
for (i = 0; i < VDEV_LABELS; i++) {
uint64_t offset, state, txg = 0;
/* read vdev label */
offset = vdev_label_offset(size, i, 0);
if (vdev_disk_physio(bdev, (caddr_t)label,
VDEV_SKIP_SIZE + VDEV_PHYS_SIZE, offset, READ,
REQ_SYNC) != 0)
continue;
if (nvlist_unpack(label->vl_vdev_phys.vp_nvlist,
sizeof (label->vl_vdev_phys.vp_nvlist), config, 0) != 0) {
*config = NULL;
continue;
}
if (nvlist_lookup_uint64(*config, ZPOOL_CONFIG_POOL_STATE,
&state) != 0 || state >= POOL_STATE_DESTROYED) {
nvlist_free(*config);
*config = NULL;
continue;
}
if (nvlist_lookup_uint64(*config, ZPOOL_CONFIG_POOL_TXG,
&txg) != 0 || txg == 0) {
nvlist_free(*config);
*config = NULL;
continue;
}
break;
}
vmem_free(label, sizeof (vdev_label_t));
vdev_bdev_close(bdev, vdev_bdev_mode(FREAD));
return (0);
}
module_param(zfs_vdev_scheduler, charp, 0644);
MODULE_PARM_DESC(zfs_vdev_scheduler, "I/O scheduler");


@ -24,7 +24,7 @@
*/
/*
* Copyright (c) 2012, 2014 by Delphix. All rights reserved.
* Copyright (c) 2012, 2017 by Delphix. All rights reserved.
*/
#include <sys/zfs_context.h>
@ -146,7 +146,7 @@ uint32_t zfs_vdev_sync_write_min_active = 10;
uint32_t zfs_vdev_sync_write_max_active = 10;
uint32_t zfs_vdev_async_read_min_active = 1;
uint32_t zfs_vdev_async_read_max_active = 3;
uint32_t zfs_vdev_async_write_min_active = 1;
uint32_t zfs_vdev_async_write_min_active = 2;
uint32_t zfs_vdev_async_write_max_active = 10;
uint32_t zfs_vdev_scrub_min_active = 1;
uint32_t zfs_vdev_scrub_max_active = 2;
@ -545,7 +545,7 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio)
/*
* Walk backwards through sufficiently contiguous I/Os
* recording the last non-option I/O.
* recording the last non-optional I/O.
*/
while ((dio = AVL_PREV(t, first)) != NULL &&
(dio->io_flags & ZIO_FLAG_AGG_INHERIT) == flags &&
@ -567,10 +567,14 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio)
/*
* Walk forward through sufficiently contiguous I/Os.
* The aggregation limit does not apply to optional i/os, so that
* we can issue contiguous writes even if they are larger than the
* aggregation limit.
*/
while ((dio = AVL_NEXT(t, last)) != NULL &&
(dio->io_flags & ZIO_FLAG_AGG_INHERIT) == flags &&
IO_SPAN(first, dio) <= zfs_vdev_aggregation_limit &&
(IO_SPAN(first, dio) <= zfs_vdev_aggregation_limit ||
(dio->io_flags & ZIO_FLAG_OPTIONAL)) &&
IO_GAP(last, dio) <= maxgap) {
last = dio;
if (!(last->io_flags & ZIO_FLAG_OPTIONAL))
@ -605,6 +609,7 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio)
dio = AVL_NEXT(t, last);
dio->io_flags &= ~ZIO_FLAG_OPTIONAL;
} else {
/* do not include the optional i/o */
while (last != mandatory && last != first) {
ASSERT(last->io_flags & ZIO_FLAG_OPTIONAL);
last = AVL_PREV(t, last);
@ -616,7 +621,6 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio)
return (NULL);
size = IO_SPAN(first, last);
ASSERT3U(size, <=, zfs_vdev_aggregation_limit);
buf = zio_buf_alloc_flags(size, KM_NOSLEEP);
if (buf == NULL)

View File

@ -1457,7 +1457,7 @@ zfs_aclset_common(znode_t *zp, zfs_acl_t *aclp, cred_t *cr, dmu_tx_t *tx)
if (ace_trivial_common(aclp, 0, zfs_ace_walk) == 0)
zp->z_pflags |= ZFS_ACL_TRIVIAL;
zfs_tstamp_update_setup(zp, STATE_CHANGED, NULL, ctime, B_TRUE);
zfs_tstamp_update_setup(zp, STATE_CHANGED, NULL, ctime);
return (sa_bulk_update(zp->z_sa_hdl, bulk, count, tx));
}

View File

@ -455,7 +455,7 @@ static struct inode *
zfsctl_inode_alloc(zfs_sb_t *zsb, uint64_t id,
const struct file_operations *fops, const struct inode_operations *ops)
{
struct timespec now = current_fs_time(zsb->z_sb);
struct timespec now;
struct inode *ip;
znode_t *zp;
@ -463,6 +463,7 @@ zfsctl_inode_alloc(zfs_sb_t *zsb, uint64_t id,
if (ip == NULL)
return (NULL);
now = current_time(ip);
zp = ITOZ(ip);
ASSERT3P(zp->z_dirlocks, ==, NULL);
ASSERT3P(zp->z_acl_cached, ==, NULL);
@ -478,8 +479,6 @@ zfsctl_inode_alloc(zfs_sb_t *zsb, uint64_t id,
zp->z_mapcnt = 0;
zp->z_gen = 0;
zp->z_size = 0;
zp->z_atime[0] = 0;
zp->z_atime[1] = 0;
zp->z_links = 0;
zp->z_pflags = 0;
zp->z_uid = 0;
@ -500,6 +499,9 @@ zfsctl_inode_alloc(zfs_sb_t *zsb, uint64_t id,
ip->i_ctime = now;
ip->i_fop = fops;
ip->i_op = ops;
#if defined(IOP_XATTR)
ip->i_opflags &= ~IOP_XATTR;
#endif
if (insert_inode_locked(ip)) {
unlock_new_inode(ip);
@ -1009,16 +1011,11 @@ out:
* best effort. In the case where it does fail, perhaps because
* it's in use, the unmount will fail harmlessly.
*/
#define SET_UNMOUNT_CMD \
"exec 0</dev/null " \
" 1>/dev/null " \
" 2>/dev/null; " \
"umount -t zfs -n %s'%s'"
int
zfsctl_snapshot_unmount(char *snapname, int flags)
{
char *argv[] = { "/bin/sh", "-c", NULL, NULL };
char *argv[] = { "/usr/bin/env", "umount", "-t", "zfs", "-n", NULL,
NULL };
char *envp[] = { NULL };
zfs_snapentry_t *se;
int error;
@ -1030,12 +1027,12 @@ zfsctl_snapshot_unmount(char *snapname, int flags)
}
rw_exit(&zfs_snapshot_lock);
argv[2] = kmem_asprintf(SET_UNMOUNT_CMD,
flags & MNT_FORCE ? "-f " : "", se->se_path);
zfsctl_snapshot_rele(se);
if (flags & MNT_FORCE)
argv[4] = "-fn";
argv[5] = se->se_path;
dprintf("unmount; path=%s\n", se->se_path);
error = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
strfree(argv[2]);
zfsctl_snapshot_rele(se);
/*
@ -1050,11 +1047,6 @@ zfsctl_snapshot_unmount(char *snapname, int flags)
}
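zfsctl_snapshot_unmount() above now hands call_usermodehelper() an explicit argv vector instead of a shell one-liner, so the snapshot path never needs shell quoting. A minimal userspace sketch of the same pattern, assuming fork(2)/execv(3); the helper path, mount options, and snapshot path below are illustrative, not taken from the ZFS code:
/*
 * Hedged sketch: run umount with an explicit argv, no shell involved.
 * The binary path, options, and snapshot path are illustrative.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
static int
run_umount(const char *path, int force)
{
	char *argv[] = { "/usr/bin/env", "umount", "-t", "zfs",
	    force ? "-fn" : "-n", (char *)path, NULL };
	pid_t pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0) {
		(void) execv(argv[0], argv);
		_exit(127);	/* exec failed */
	}
	int status;
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
int
main(void)
{
	/* Hypothetical snapshot path; fails harmlessly if it is not mounted. */
	printf("umount exit status: %d\n",
	    run_umount("/tank/.zfs/snapshot/today", 0));
	return (0);
}
Because no shell parses the command line, a snapshot name containing spaces or quotes is passed through unchanged, which is the point of dropping SET_UNMOUNT_CMD.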
#define MOUNT_BUSY 0x80 /* Mount failed due to EBUSY (from mntent.h) */
#define SET_MOUNT_CMD \
"exec 0</dev/null " \
" 1>/dev/null " \
" 2>/dev/null; " \
"mount -t zfs -n '%s' '%s'"
int
zfsctl_snapshot_mount(struct path *path, int flags)
@ -1065,7 +1057,8 @@ zfsctl_snapshot_mount(struct path *path, int flags)
zfs_sb_t *snap_zsb;
zfs_snapentry_t *se;
char *full_name, *full_path;
char *argv[] = { "/bin/sh", "-c", NULL, NULL };
char *argv[] = { "/usr/bin/env", "mount", "-t", "zfs", "-n", NULL, NULL,
NULL };
char *envp[] = { NULL };
int error;
struct path spath;
@ -1110,9 +1103,9 @@ zfsctl_snapshot_mount(struct path *path, int flags)
* value from call_usermodehelper() will be ((exitcode << 8) + signal).
*/
dprintf("mount; name=%s path=%s\n", full_name, full_path);
argv[2] = kmem_asprintf(SET_MOUNT_CMD, full_name, full_path);
argv[5] = full_name;
argv[6] = full_path;
error = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
strfree(argv[2]);
if (error) {
if (!(error & MOUNT_BUSY << 8)) {
cmn_err(CE_WARN, "Unable to automount %s/%s: %d",

View File

@ -760,7 +760,7 @@ zfs_link_create(zfs_dirlock_t *dl, znode_t *zp, dmu_tx_t *tx, int flag)
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL,
ctime, sizeof (ctime));
zfs_tstamp_update_setup(zp, STATE_CHANGED, mtime,
ctime, B_TRUE);
ctime);
}
error = sa_bulk_update(zp->z_sa_hdl, bulk, count, tx);
ASSERT(error == 0);
@ -781,7 +781,7 @@ zfs_link_create(zfs_dirlock_t *dl, znode_t *zp, dmu_tx_t *tx, int flag)
ctime, sizeof (ctime));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_FLAGS(zsb), NULL,
&dzp->z_pflags, sizeof (dzp->z_pflags));
zfs_tstamp_update_setup(dzp, CONTENT_MODIFIED, mtime, ctime, B_TRUE);
zfs_tstamp_update_setup(dzp, CONTENT_MODIFIED, mtime, ctime);
error = sa_bulk_update(dzp->z_sa_hdl, bulk, count, tx);
ASSERT(error == 0);
mutex_exit(&dzp->z_lock);
@ -876,8 +876,8 @@ zfs_link_destroy(zfs_dirlock_t *dl, znode_t *zp, dmu_tx_t *tx, int flag,
NULL, &ctime, sizeof (ctime));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_FLAGS(zsb),
NULL, &zp->z_pflags, sizeof (zp->z_pflags));
zfs_tstamp_update_setup(zp, STATE_CHANGED, mtime, ctime,
B_TRUE);
zfs_tstamp_update_setup(zp, STATE_CHANGED, mtime,
ctime);
}
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_LINKS(zsb),
NULL, &zp->z_links, sizeof (zp->z_links));
@ -904,7 +904,7 @@ zfs_link_destroy(zfs_dirlock_t *dl, znode_t *zp, dmu_tx_t *tx, int flag,
NULL, mtime, sizeof (mtime));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_FLAGS(zsb),
NULL, &dzp->z_pflags, sizeof (dzp->z_pflags));
zfs_tstamp_update_setup(dzp, CONTENT_MODIFIED, mtime, ctime, B_TRUE);
zfs_tstamp_update_setup(dzp, CONTENT_MODIFIED, mtime, ctime);
error = sa_bulk_update(dzp->z_sa_hdl, bulk, count, tx);
ASSERT(error == 0);
mutex_exit(&dzp->z_lock);

View File

@ -277,7 +277,7 @@ zfs_sa_upgrade(sa_handle_t *hdl, dmu_tx_t *tx)
sa_bulk_attr_t *bulk, *sa_attrs;
zfs_acl_locator_cb_t locate = { 0 };
uint64_t uid, gid, mode, rdev, xattr, parent;
uint64_t crtime[2], mtime[2], ctime[2];
uint64_t crtime[2], mtime[2], ctime[2], atime[2];
zfs_acl_phys_t znode_acl;
char scanstamp[AV_SCANSTAMP_SZ];
boolean_t drop_lock = B_FALSE;
@ -309,6 +309,7 @@ zfs_sa_upgrade(sa_handle_t *hdl, dmu_tx_t *tx)
/* First do a bulk query of the attributes that aren't cached */
bulk = kmem_alloc(sizeof (sa_bulk_attr_t) * 20, KM_SLEEP);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ATIME(zsb), NULL, &atime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_MTIME(zsb), NULL, &mtime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL, &ctime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CRTIME(zsb), NULL, &crtime, 16);
@ -344,7 +345,7 @@ zfs_sa_upgrade(sa_handle_t *hdl, dmu_tx_t *tx)
SA_ADD_BULK_ATTR(sa_attrs, count, SA_ZPL_FLAGS(zsb), NULL,
&zp->z_pflags, 8);
SA_ADD_BULK_ATTR(sa_attrs, count, SA_ZPL_ATIME(zsb), NULL,
zp->z_atime, 16);
&atime, 16);
SA_ADD_BULK_ATTR(sa_attrs, count, SA_ZPL_MTIME(zsb), NULL,
&mtime, 16);
SA_ADD_BULK_ATTR(sa_attrs, count, SA_ZPL_CTIME(zsb), NULL,

View File

@ -699,20 +699,18 @@ zfs_sb_create(const char *osname, zfs_mntopts_t *zmo, zfs_sb_t **zsbp)
zsb = kmem_zalloc(sizeof (zfs_sb_t), KM_SLEEP);
/*
* Optional temporary mount options, free'd in zfs_sb_free().
*/
zsb->z_mntopts = (zmo ? zmo : zfs_mntopts_alloc());
/*
* We claim to always be readonly so we can open snapshots;
* other ZPL code will prevent us from writing to snapshots.
*/
error = dmu_objset_own(osname, DMU_OST_ZFS, B_TRUE, zsb, &os);
if (error) {
kmem_free(zsb, sizeof (zfs_sb_t));
return (error);
}
/*
* Optional temporary mount options, free'd in zfs_sb_free().
*/
zsb->z_mntopts = (zmo ? zmo : zfs_mntopts_alloc());
if (error)
goto out_zmo;
/*
* Initialize the zfs-specific filesystem structure.
@ -840,8 +838,9 @@ zfs_sb_create(const char *osname, zfs_mntopts_t *zmo, zfs_sb_t **zsbp)
out:
dmu_objset_disown(os, zsb);
out_zmo:
*zsbp = NULL;
zfs_mntopts_free(zsb->z_mntopts);
kmem_free(zsb, sizeof (zfs_sb_t));
return (error);
}
@ -1404,13 +1403,13 @@ zfs_domount(struct super_block *sb, zfs_mntopts_t *zmo, int silent)
sb->s_time_gran = 1;
sb->s_blocksize = recordsize;
sb->s_blocksize_bits = ilog2(recordsize);
zsb->z_bdi.ra_pages = 0;
sb->s_bdi = &zsb->z_bdi;
error = -zpl_bdi_setup_and_register(&zsb->z_bdi, "zfs");
error = -zpl_bdi_setup(sb, "zfs");
if (error)
goto out;
sb->s_bdi->ra_pages = 0;
/* Set callback operations for the file system. */
sb->s_op = &zpl_super_operations;
sb->s_xattr = zpl_xattr_handlers;
@ -1506,7 +1505,7 @@ zfs_umount(struct super_block *sb)
arc_remove_prune_callback(zsb->z_arc_prune);
VERIFY(zfs_sb_teardown(zsb, B_TRUE) == 0);
os = zsb->z_os;
bdi_destroy(sb->s_bdi);
zpl_bdi_destroy(sb);
/*
* z_os will be NULL if there was an error in
@ -1879,7 +1878,10 @@ zfs_init(void)
void
zfs_fini(void)
{
taskq_wait_outstanding(system_taskq, 0);
/*
* We don't use taskq_wait_outstanding() here because zpl_posix_acl_free()
* might add more tasks to the queue.
*/
taskq_wait(system_taskq);
unregister_filesystem(&zpl_fs_type);
zfs_znode_fini();
zfsctl_fini();

View File

@ -550,7 +550,6 @@ zfs_read(struct inode *ip, uio_t *uio, int ioflag, cred_t *cr)
out:
zfs_range_unlock(rl);
ZFS_ACCESSTIME_STAMP(zsb, zp);
ZFS_EXIT(zsb);
return (error);
}
@ -865,8 +864,7 @@ zfs_write(struct inode *ip, uio_t *uio, int ioflag, cred_t *cr)
}
mutex_exit(&zp->z_acl_lock);
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime,
B_TRUE);
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime);
/*
* Update the file size (zp_size) if it has changed;
@ -1604,13 +1602,13 @@ top:
error = dmu_tx_assign(tx, waited ? TXG_WAITED : TXG_NOWAIT);
if (error) {
zfs_dirent_unlock(dl);
iput(ip);
if (xzp)
iput(ZTOI(xzp));
if (error == ERESTART) {
waited = B_TRUE;
dmu_tx_wait(tx);
dmu_tx_abort(tx);
iput(ip);
if (xzp)
iput(ZTOI(xzp));
goto top;
}
#ifdef HAVE_PN_UTILS
@ -1618,6 +1616,9 @@ top:
pn_free(realnmp);
#endif /* HAVE_PN_UTILS */
dmu_tx_abort(tx);
iput(ip);
if (xzp)
iput(ZTOI(xzp));
ZFS_EXIT(zsb);
return (error);
}
@ -1946,14 +1947,15 @@ top:
rw_exit(&zp->z_parent_lock);
rw_exit(&zp->z_name_lock);
zfs_dirent_unlock(dl);
iput(ip);
if (error == ERESTART) {
waited = B_TRUE;
dmu_tx_wait(tx);
dmu_tx_abort(tx);
iput(ip);
goto top;
}
dmu_tx_abort(tx);
iput(ip);
ZFS_EXIT(zsb);
return (error);
}
@ -2140,9 +2142,6 @@ update:
zap_cursor_fini(&zc);
if (error == ENOENT)
error = 0;
ZFS_ACCESSTIME_STAMP(zsb, zp);
out:
ZFS_EXIT(zsb);
@ -2195,11 +2194,11 @@ zfs_getattr(struct inode *ip, vattr_t *vap, int flags, cred_t *cr)
zfs_sb_t *zsb = ITOZSB(ip);
int error = 0;
uint64_t links;
uint64_t mtime[2], ctime[2];
uint64_t atime[2], mtime[2], ctime[2];
xvattr_t *xvap = (xvattr_t *)vap; /* vap may be an xvattr_t * */
xoptattr_t *xoap = NULL;
boolean_t skipaclchk = (flags & ATTR_NOACLCHECK) ? B_TRUE : B_FALSE;
sa_bulk_attr_t bulk[2];
sa_bulk_attr_t bulk[3];
int count = 0;
ZFS_ENTER(zsb);
@ -2207,6 +2206,7 @@ zfs_getattr(struct inode *ip, vattr_t *vap, int flags, cred_t *cr)
zfs_fuid_map_ids(zp, cr, &vap->va_uid, &vap->va_gid);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ATIME(zsb), NULL, &atime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_MTIME(zsb), NULL, &mtime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL, &ctime, 16);
@ -2355,7 +2355,7 @@ zfs_getattr(struct inode *ip, vattr_t *vap, int flags, cred_t *cr)
}
}
ZFS_TIME_DECODE(&vap->va_atime, zp->z_atime);
ZFS_TIME_DECODE(&vap->va_atime, atime);
ZFS_TIME_DECODE(&vap->va_mtime, mtime);
ZFS_TIME_DECODE(&vap->va_ctime, ctime);
@ -2402,7 +2402,6 @@ zfs_getattr_fast(struct inode *ip, struct kstat *sp)
mutex_enter(&zp->z_lock);
generic_fillattr(ip, sp);
ZFS_TIME_DECODE(&sp->atime, zp->z_atime);
sa_object_size(zp->z_sa_hdl, &blksize, &nblocks);
sp->blksize = blksize;
@ -2466,7 +2465,7 @@ zfs_setattr(struct inode *ip, vattr_t *vap, int flags, cred_t *cr)
uint64_t new_mode;
uint64_t new_uid, new_gid;
uint64_t xattr_obj;
uint64_t mtime[2], ctime[2];
uint64_t mtime[2], ctime[2], atime[2];
znode_t *attrzp;
int need_policy = FALSE;
int err, err2;
@ -2939,10 +2938,11 @@ top:
}
if (mask & ATTR_ATIME) {
ZFS_TIME_ENCODE(&vap->va_atime, zp->z_atime);
if ((mask & ATTR_ATIME) || zp->z_atime_dirty) {
zp->z_atime_dirty = 0;
ZFS_TIME_ENCODE(&ip->i_atime, atime);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ATIME(zsb), NULL,
&zp->z_atime, sizeof (zp->z_atime));
&atime, sizeof (atime));
}
if (mask & ATTR_MTIME) {
@ -2957,19 +2957,17 @@ top:
NULL, mtime, sizeof (mtime));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL,
&ctime, sizeof (ctime));
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime,
B_TRUE);
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime);
} else if (mask != 0) {
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL,
&ctime, sizeof (ctime));
zfs_tstamp_update_setup(zp, STATE_CHANGED, mtime, ctime,
B_TRUE);
zfs_tstamp_update_setup(zp, STATE_CHANGED, mtime, ctime);
if (attrzp) {
SA_ADD_BULK_ATTR(xattr_bulk, xattr_count,
SA_ZPL_CTIME(zsb), NULL,
&ctime, sizeof (ctime));
zfs_tstamp_update_setup(attrzp, STATE_CHANGED,
mtime, ctime, B_TRUE);
mtime, ctime);
}
}
/*
@ -3031,8 +3029,6 @@ out:
ASSERT(err2 == 0);
}
if (attrzp)
iput(ZTOI(attrzp));
if (aclp)
zfs_acl_free(aclp);
@ -3043,11 +3039,15 @@ out:
if (err) {
dmu_tx_abort(tx);
if (attrzp)
iput(ZTOI(attrzp));
if (err == ERESTART)
goto top;
} else {
err2 = sa_bulk_update(zp->z_sa_hdl, bulk, count, tx);
dmu_tx_commit(tx);
if (attrzp)
iput(ZTOI(attrzp));
zfs_inode_update(zp);
}
@ -3080,7 +3080,7 @@ zfs_rename_unlock(zfs_zlock_t **zlpp)
while ((zl = *zlpp) != NULL) {
if (zl->zl_znode != NULL)
iput(ZTOI(zl->zl_znode));
zfs_iput_async(ZTOI(zl->zl_znode));
rw_exit(zl->zl_rwlock);
*zlpp = zl->zl_next;
kmem_free(zl, sizeof (*zl));
@ -3417,16 +3417,19 @@ top:
if (sdzp == tdzp)
rw_exit(&sdzp->z_name_lock);
iput(ZTOI(szp));
if (tzp)
iput(ZTOI(tzp));
if (error == ERESTART) {
waited = B_TRUE;
dmu_tx_wait(tx);
dmu_tx_abort(tx);
iput(ZTOI(szp));
if (tzp)
iput(ZTOI(tzp));
goto top;
}
dmu_tx_abort(tx);
iput(ZTOI(szp));
if (tzp)
iput(ZTOI(tzp));
ZFS_EXIT(zsb);
return (error);
}
@ -3690,7 +3693,6 @@ zfs_readlink(struct inode *ip, uio_t *uio, cred_t *cr)
error = zfs_sa_readlink(zp, uio);
mutex_exit(&zp->z_lock);
ZFS_ACCESSTIME_STAMP(zsb, zp);
ZFS_EXIT(zsb);
return (error);
}
@ -4056,7 +4058,7 @@ zfs_dirty_inode(struct inode *ip, int flags)
dmu_tx_t *tx;
uint64_t mode, atime[2], mtime[2], ctime[2];
sa_bulk_attr_t bulk[4];
int error;
int error = 0;
int cnt = 0;
if (zfs_is_readonly(zsb) || dmu_objset_is_snapshot(zsb->z_os))
@ -4065,6 +4067,20 @@ zfs_dirty_inode(struct inode *ip, int flags)
ZFS_ENTER(zsb);
ZFS_VERIFY_ZP(zp);
#ifdef I_DIRTY_TIME
/*
* This is the lazytime semantic introduced in Linux 4.0.
* This flag is only passed in from update_time() when lazytime is enabled.
* (Note: I_DIRTY_SYNC will also be set when lazytime is not in effect.)
* Fortunately mtime and ctime are managed within ZFS itself, so we
* only need to dirty atime.
*/
if (flags == I_DIRTY_TIME) {
zp->z_atime_dirty = 1;
goto out;
}
#endif
tx = dmu_tx_create(zsb->z_os);
dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
@ -4077,6 +4093,8 @@ zfs_dirty_inode(struct inode *ip, int flags)
}
mutex_enter(&zp->z_lock);
zp->z_atime_dirty = 0;
SA_ADD_BULK_ATTR(bulk, cnt, SA_ZPL_MODE(zsb), NULL, &mode, 8);
SA_ADD_BULK_ATTR(bulk, cnt, SA_ZPL_ATIME(zsb), NULL, &atime, 16);
SA_ADD_BULK_ATTR(bulk, cnt, SA_ZPL_MTIME(zsb), NULL, &mtime, 16);
@ -4089,7 +4107,6 @@ zfs_dirty_inode(struct inode *ip, int flags)
mode = ip->i_mode;
zp->z_mode = mode;
zp->z_atime_dirty = 0;
error = sa_bulk_update(zp->z_sa_hdl, bulk, cnt, tx);
mutex_exit(&zp->z_lock);
@ -4107,6 +4124,7 @@ zfs_inactive(struct inode *ip)
{
znode_t *zp = ITOZ(ip);
zfs_sb_t *zsb = ITOZSB(ip);
uint64_t atime[2];
int error;
int need_unlock = 0;
@ -4130,9 +4148,10 @@ zfs_inactive(struct inode *ip)
if (error) {
dmu_tx_abort(tx);
} else {
ZFS_TIME_ENCODE(&ip->i_atime, atime);
mutex_enter(&zp->z_lock);
(void) sa_update(zp->z_sa_hdl, SA_ZPL_ATIME(zsb),
(void *)&zp->z_atime, sizeof (zp->z_atime), tx);
(void *)&atime, sizeof (atime), tx);
zp->z_atime_dirty = 0;
mutex_exit(&zp->z_lock);
dmu_tx_commit(tx);
@ -4241,9 +4260,6 @@ zfs_getpage(struct inode *ip, struct page *pl[], int nr_pages)
err = zfs_fillpage(ip, pl, nr_pages);
if (!err)
ZFS_ACCESSTIME_STAMP(zsb, zp);
ZFS_EXIT(zsb);
return (err);
}

View File

@ -482,6 +482,90 @@ zfs_inode_set_ops(zfs_sb_t *zsb, struct inode *ip)
}
}
void
zfs_set_inode_flags(znode_t *zp, struct inode *ip)
{
/*
* Linux and Solaris have different sets of file attributes, so we
* restrict this conversion to the intersection of the two.
*/
if (zp->z_pflags & ZFS_IMMUTABLE)
ip->i_flags |= S_IMMUTABLE;
else
ip->i_flags &= ~S_IMMUTABLE;
if (zp->z_pflags & ZFS_APPENDONLY)
ip->i_flags |= S_APPEND;
else
ip->i_flags &= ~S_APPEND;
}
/*
* Update the embedded inode given the znode. We should work toward
* eliminating this function as soon as possible by removing values
* which are duplicated between the znode and inode. If the generic
* inode has the correct field it should be used, and the ZFS code
* updated to access the inode. This can be done incrementally.
*/
static void
zfs_inode_update_impl(znode_t *zp, boolean_t new)
{
zfs_sb_t *zsb;
struct inode *ip;
uint32_t blksize;
u_longlong_t i_blocks;
uint64_t atime[2], mtime[2], ctime[2];
ASSERT(zp != NULL);
zsb = ZTOZSB(zp);
ip = ZTOI(zp);
/* Skip .zfs control nodes which do not exist on disk. */
if (zfsctl_is_node(ip))
return;
sa_lookup(zp->z_sa_hdl, SA_ZPL_ATIME(zsb), &atime, 16);
sa_lookup(zp->z_sa_hdl, SA_ZPL_MTIME(zsb), &mtime, 16);
sa_lookup(zp->z_sa_hdl, SA_ZPL_CTIME(zsb), &ctime, 16);
dmu_object_size_from_db(sa_get_db(zp->z_sa_hdl), &blksize, &i_blocks);
spin_lock(&ip->i_lock);
ip->i_generation = zp->z_gen;
ip->i_uid = SUID_TO_KUID(zp->z_uid);
ip->i_gid = SGID_TO_KGID(zp->z_gid);
set_nlink(ip, zp->z_links);
ip->i_mode = zp->z_mode;
zfs_set_inode_flags(zp, ip);
ip->i_blkbits = SPA_MINBLOCKSHIFT;
ip->i_blocks = i_blocks;
/*
* Only read atime from the SA if this is a newly created inode (or a
* rezget); otherwise i_atime might be dirty.
*/
if (new)
ZFS_TIME_DECODE(&ip->i_atime, atime);
ZFS_TIME_DECODE(&ip->i_mtime, mtime);
ZFS_TIME_DECODE(&ip->i_ctime, ctime);
i_size_write(ip, zp->z_size);
spin_unlock(&ip->i_lock);
}
static void
zfs_inode_update_new(znode_t *zp)
{
zfs_inode_update_impl(zp, B_TRUE);
}
void
zfs_inode_update(znode_t *zp)
{
zfs_inode_update_impl(zp, B_FALSE);
}
/*
* Construct a znode+inode and initialize.
*
@ -497,7 +581,7 @@ zfs_znode_alloc(zfs_sb_t *zsb, dmu_buf_t *db, int blksz,
struct inode *ip;
uint64_t mode;
uint64_t parent;
sa_bulk_attr_t bulk[9];
sa_bulk_attr_t bulk[8];
int count = 0;
ASSERT(zsb != NULL);
@ -536,8 +620,6 @@ zfs_znode_alloc(zfs_sb_t *zsb, dmu_buf_t *db, int blksz,
&zp->z_pflags, 8);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_PARENT(zsb), NULL,
&parent, 8);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ATIME(zsb), NULL,
&zp->z_atime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_UID(zsb), NULL, &zp->z_uid, 8);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_GID(zsb), NULL, &zp->z_gid, 8);
@ -551,7 +633,7 @@ zfs_znode_alloc(zfs_sb_t *zsb, dmu_buf_t *db, int blksz,
zp->z_mode = mode;
ip->i_ino = obj;
zfs_inode_update(zp);
zfs_inode_update_new(zp);
zfs_inode_set_ops(zsb, ip);
/*
@ -578,71 +660,6 @@ error:
return (NULL);
}
void
zfs_set_inode_flags(znode_t *zp, struct inode *ip)
{
/*
* Linux and Solaris have different sets of file attributes, so we
* restrict this conversion to the intersection of the two.
*/
if (zp->z_pflags & ZFS_IMMUTABLE)
ip->i_flags |= S_IMMUTABLE;
else
ip->i_flags &= ~S_IMMUTABLE;
if (zp->z_pflags & ZFS_APPENDONLY)
ip->i_flags |= S_APPEND;
else
ip->i_flags &= ~S_APPEND;
}
/*
* Update the embedded inode given the znode. We should work toward
* eliminating this function as soon as possible by removing values
* which are duplicated between the znode and inode. If the generic
* inode has the correct field it should be used, and the ZFS code
* updated to access the inode. This can be done incrementally.
*/
void
zfs_inode_update(znode_t *zp)
{
zfs_sb_t *zsb;
struct inode *ip;
uint32_t blksize;
uint64_t atime[2], mtime[2], ctime[2];
ASSERT(zp != NULL);
zsb = ZTOZSB(zp);
ip = ZTOI(zp);
/* Skip .zfs control nodes which do not exist on disk. */
if (zfsctl_is_node(ip))
return;
sa_lookup(zp->z_sa_hdl, SA_ZPL_ATIME(zsb), &atime, 16);
sa_lookup(zp->z_sa_hdl, SA_ZPL_MTIME(zsb), &mtime, 16);
sa_lookup(zp->z_sa_hdl, SA_ZPL_CTIME(zsb), &ctime, 16);
spin_lock(&ip->i_lock);
ip->i_generation = zp->z_gen;
ip->i_uid = SUID_TO_KUID(zp->z_uid);
ip->i_gid = SGID_TO_KGID(zp->z_gid);
set_nlink(ip, zp->z_links);
ip->i_mode = zp->z_mode;
zfs_set_inode_flags(zp, ip);
ip->i_blkbits = SPA_MINBLOCKSHIFT;
dmu_object_size_from_db(sa_get_db(zp->z_sa_hdl), &blksize,
(u_longlong_t *)&ip->i_blocks);
ZFS_TIME_DECODE(&ip->i_atime, atime);
ZFS_TIME_DECODE(&ip->i_mtime, mtime);
ZFS_TIME_DECODE(&ip->i_ctime, ctime);
i_size_write(ip, zp->z_size);
spin_unlock(&ip->i_lock);
}
/*
* Safely mark an inode dirty. Inodes which are part of a read-only
* file system or snapshot may not be dirtied.
@ -1123,7 +1140,7 @@ zfs_rezget(znode_t *zp)
dmu_buf_t *db;
uint64_t obj_num = zp->z_id;
uint64_t mode;
sa_bulk_attr_t bulk[8];
sa_bulk_attr_t bulk[7];
int err;
int count = 0;
uint64_t gen;
@ -1183,8 +1200,6 @@ zfs_rezget(znode_t *zp)
&zp->z_links, sizeof (zp->z_links));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_FLAGS(zsb), NULL,
&zp->z_pflags, sizeof (zp->z_pflags));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_ATIME(zsb), NULL,
&zp->z_atime, sizeof (zp->z_atime));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_UID(zsb), NULL,
&zp->z_uid, sizeof (zp->z_uid));
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_GID(zsb), NULL,
@ -1208,7 +1223,8 @@ zfs_rezget(znode_t *zp)
zp->z_unlinked = (zp->z_links == 0);
zp->z_blksz = doi.doi_data_block_size;
zfs_inode_update(zp);
zp->z_atime_dirty = 0;
zfs_inode_update_new(zp);
zfs_znode_hold_exit(zsb, zh);
@ -1279,78 +1295,28 @@ zfs_compare_timespec(struct timespec *t1, struct timespec *t2)
return (t1->tv_nsec - t2->tv_nsec);
}
/*
* Determine whether the znode's atime must be updated. The logic mostly
* duplicates the Linux kernel's relatime_need_update() functionality.
* This function is only called if the underlying filesystem actually has
* atime updates enabled.
*/
static inline boolean_t
zfs_atime_need_update(znode_t *zp, timestruc_t *now)
{
if (!ZTOZSB(zp)->z_relatime)
return (B_TRUE);
/*
* In relatime mode, only update the atime if the previous atime
* is earlier than either the ctime or mtime or if at least a day
* has passed since the last update of atime.
*/
if (zfs_compare_timespec(&ZTOI(zp)->i_mtime, &ZTOI(zp)->i_atime) >= 0)
return (B_TRUE);
if (zfs_compare_timespec(&ZTOI(zp)->i_ctime, &ZTOI(zp)->i_atime) >= 0)
return (B_TRUE);
if ((long)now->tv_sec - ZTOI(zp)->i_atime.tv_sec >= 24*60*60)
return (B_TRUE);
return (B_FALSE);
}
/*
* Prepare to update znode time stamps.
*
* IN: zp - znode requiring timestamp update
* flag - ATTR_MTIME, ATTR_CTIME, ATTR_ATIME flags
* have_tx - true if the caller is creating a new txg
* flag - ATTR_MTIME, ATTR_CTIME flags
*
* OUT: zp - new atime (via underlying inode's i_atime)
* OUT: zp - z_seq
* mtime - new mtime
* ctime - new ctime
*
* NOTE: The arguments are somewhat redundant. The following condition
* is always true:
*
* have_tx == !(flag & ATTR_ATIME)
* Note: We don't update atime here, because we rely on the Linux VFS to
* handle atime updates.
*/
void
zfs_tstamp_update_setup(znode_t *zp, uint_t flag, uint64_t mtime[2],
uint64_t ctime[2], boolean_t have_tx)
uint64_t ctime[2])
{
timestruc_t now;
ASSERT(have_tx == !(flag & ATTR_ATIME));
gethrestime(&now);
/*
* NOTE: The following test intentionally does not update z_atime_dirty
* in the case where an ATIME update has been requested but for which
* the update is omitted due to relatime logic. The rationale being
* that if the flag was set somewhere else, we should leave it alone
* here.
*/
if (flag & ATTR_ATIME) {
if (zfs_atime_need_update(zp, &now)) {
ZFS_TIME_ENCODE(&now, zp->z_atime);
ZTOI(zp)->i_atime.tv_sec = zp->z_atime[0];
ZTOI(zp)->i_atime.tv_nsec = zp->z_atime[1];
zp->z_atime_dirty = 1;
}
} else {
zp->z_atime_dirty = 0;
zp->z_seq++;
}
if (flag & ATTR_MTIME) {
ZFS_TIME_ENCODE(&now, mtime);
@ -1722,7 +1688,7 @@ log:
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_CTIME(zsb), NULL, ctime, 16);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_FLAGS(zsb),
NULL, &zp->z_pflags, 8);
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime, B_TRUE);
zfs_tstamp_update_setup(zp, CONTENT_MODIFIED, mtime, ctime);
error = sa_bulk_update(zp->z_sa_hdl, bulk, count, tx);
ASSERT(error == 0);

View File

@ -139,10 +139,10 @@ zio_init(void)
if (arc_watch && !IS_P2ALIGNED(size, PAGESIZE))
continue;
#endif
if (size <= 4 * SPA_MINBLOCKSIZE) {
if (size < PAGESIZE) {
align = SPA_MINBLOCKSIZE;
} else if (IS_P2ALIGNED(size, p2 >> 2)) {
align = MIN(p2 >> 2, PAGESIZE);
align = PAGESIZE;
}
if (align != 0) {
@ -1415,6 +1415,31 @@ zio_execute(zio_t *zio)
spl_fstrans_unmark(cookie);
}
/*
* Used to determine whether the stack in the current context is large
* enough to allow zio_execute() to be called recursively.  A minimum
* stack size of 16K is required to avoid needing to re-dispatch the zio.
*/
boolean_t
zio_execute_stack_check(zio_t *zio)
{
#if !defined(HAVE_LARGE_STACKS)
dsl_pool_t *dp = spa_get_dsl(zio->io_spa);
/* Executing in txg_sync_thread() context. */
if (dp && curthread == dp->dp_tx.tx_sync_thread)
return (B_TRUE);
/* Pool initialization outside of zio_taskq context. */
if (dp && spa_is_initializing(dp->dp_spa) &&
!zio_taskq_member(zio, ZIO_TASKQ_ISSUE) &&
!zio_taskq_member(zio, ZIO_TASKQ_ISSUE_HIGH))
return (B_TRUE);
#endif /* HAVE_LARGE_STACKS */
return (B_FALSE);
}
__attribute__((always_inline))
static inline void
__zio_execute(zio_t *zio)
@ -1424,8 +1449,6 @@ __zio_execute(zio_t *zio)
while (zio->io_stage < ZIO_STAGE_DONE) {
enum zio_stage pipeline = zio->io_pipeline;
enum zio_stage stage = zio->io_stage;
dsl_pool_t *dp;
boolean_t cut;
int rv;
ASSERT(!MUTEX_HELD(&zio->io_lock));
@ -1438,10 +1461,6 @@ __zio_execute(zio_t *zio)
ASSERT(stage <= ZIO_STAGE_DONE);
dp = spa_get_dsl(zio->io_spa);
cut = (stage == ZIO_STAGE_VDEV_IO_START) ?
zio_requeue_io_start_cut_in_line : B_FALSE;
/*
* If we are in interrupt context and this pipeline stage
* will grab a config lock that is held across I/O,
@ -1453,21 +1472,19 @@ __zio_execute(zio_t *zio)
*/
if ((stage & ZIO_BLOCKING_STAGES) && zio->io_vd == NULL &&
zio_taskq_member(zio, ZIO_TASKQ_INTERRUPT)) {
boolean_t cut = (stage == ZIO_STAGE_VDEV_IO_START) ?
zio_requeue_io_start_cut_in_line : B_FALSE;
zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, cut);
return;
}
/*
* If we executing in the context of the tx_sync_thread,
* or we are performing pool initialization outside of a
* zio_taskq[ZIO_TASKQ_ISSUE|ZIO_TASKQ_ISSUE_HIGH] context.
* Then issue the zio asynchronously to minimize stack usage
* for these deep call paths.
* If the current context doesn't have a large enough stack, the zio
* must be issued asynchronously to prevent stack overflow.
*/
if ((dp && curthread == dp->dp_tx.tx_sync_thread) ||
(dp && spa_is_initializing(dp->dp_spa) &&
!zio_taskq_member(zio, ZIO_TASKQ_ISSUE) &&
!zio_taskq_member(zio, ZIO_TASKQ_ISSUE_HIGH))) {
if (zio_execute_stack_check(zio)) {
boolean_t cut = (stage == ZIO_STAGE_VDEV_IO_START) ?
zio_requeue_io_start_cut_in_line : B_FALSE;
zio_taskq_dispatch(zio, ZIO_TASKQ_ISSUE, cut);
return;
}
@ -3455,7 +3472,7 @@ zbookmark_is_before(const dnode_phys_t *dnp, const zbookmark_phys_t *zb1,
if (zb1->zb_object == DMU_META_DNODE_OBJECT) {
uint64_t nextobj = zb1nextL0 *
(dnp->dn_datablkszsec << SPA_MINBLOCKSHIFT) >> DNODE_SHIFT;
(dnp->dn_datablkszsec << (SPA_MINBLOCKSHIFT - DNODE_SHIFT));
return (nextobj <= zb2thisobj);
}
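The zbookmark_is_before() change above folds the right shift into the left shift so the intermediate product of the multiplication stays within 64 bits. A standalone sketch of the effect, using illustrative shift amounts and block ids rather than the real SPA_MINBLOCKSHIFT/DNODE_SHIFT values:
/*
 * Hedged sketch: multiplying first and shifting right afterwards can wrap
 * around 2^64, while folding the right shift into the left shift keeps the
 * intermediate value small.  All numbers here are illustrative.
 */
#include <stdint.h>
#include <stdio.h>
int
main(void)
{
	uint64_t blkid = 1ULL << 55;	/* hypothetical large L0 block id */
	uint64_t secs = 1ULL << 8;	/* hypothetical block size in sectors */
	unsigned lshift = 9, rshift = 9;
	/* Multiply first, shift later: the 2^72 product wraps to 2^8. */
	uint64_t wrapped = (blkid * (secs << lshift)) >> rshift;
	/* Shift first, then multiply: the product fits in 64 bits. */
	uint64_t correct = blkid * (secs << (lshift - rshift));
	printf("%llu != %llu\n", (unsigned long long)wrapped,
	    (unsigned long long)correct);
	return (0);
}
With these inputs the first expression wraps to 0 while the second yields 2^63.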

View File

@ -100,16 +100,17 @@ zpl_root_readdir(struct file *filp, void *dirent, filldir_t filldir)
*/
/* ARGSUSED */
static int
zpl_root_getattr(struct vfsmount *mnt, struct dentry *dentry,
struct kstat *stat)
zpl_root_getattr_impl(const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int query_flags)
{
int error;
struct inode *ip = path->dentry->d_inode;
error = simple_getattr(mnt, dentry, stat);
stat->atime = CURRENT_TIME;
generic_fillattr(ip, stat);
stat->atime = current_time(ip);
return (error);
return (0);
}
ZPL_GETATTR_WRAPPER(zpl_root_getattr);
static struct dentry *
#ifdef HAVE_LOOKUP_NAMEIDATA
@ -301,13 +302,17 @@ zpl_snapdir_readdir(struct file *filp, void *dirent, filldir_t filldir)
}
#endif /* HAVE_VFS_ITERATE */
int
zpl_snapdir_rename(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry)
static int
zpl_snapdir_rename2(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry, unsigned int flags)
{
cred_t *cr = CRED();
int error;
/* We probably don't want to support renameat2(2) in ctldir */
if (flags)
return (-EINVAL);
crhold(cr);
error = -zfsctl_snapdir_rename(sdip, dname(sdentry),
tdip, dname(tdentry), cr, 0);
@ -317,6 +322,15 @@ zpl_snapdir_rename(struct inode *sdip, struct dentry *sdentry,
return (error);
}
#ifndef HAVE_RENAME_WANTS_FLAGS
static int
zpl_snapdir_rename(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry)
{
return (zpl_snapdir_rename2(sdip, sdentry, tdip, tdentry, 0));
}
#endif
static int
zpl_snapdir_rmdir(struct inode *dip, struct dentry *dentry)
{
@ -362,21 +376,22 @@ zpl_snapdir_mkdir(struct inode *dip, struct dentry *dentry, zpl_umode_t mode)
*/
/* ARGSUSED */
static int
zpl_snapdir_getattr(struct vfsmount *mnt, struct dentry *dentry,
struct kstat *stat)
zpl_snapdir_getattr_impl(const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int query_flags)
{
zfs_sb_t *zsb = ITOZSB(dentry->d_inode);
int error;
struct inode *ip = path->dentry->d_inode;
zfs_sb_t *zsb = ITOZSB(path->dentry->d_inode);
ZFS_ENTER(zsb);
error = simple_getattr(mnt, dentry, stat);
generic_fillattr(path->dentry->d_inode, stat);
stat->nlink = stat->size = 2;
stat->ctime = stat->mtime = dmu_objset_snap_cmtime(zsb->z_os);
stat->atime = CURRENT_TIME;
stat->atime = current_time(ip);
ZFS_EXIT(zsb);
return (error);
return (0);
}
ZPL_GETATTR_WRAPPER(zpl_snapdir_getattr);
/*
* The '.zfs/snapshot' directory file operations. These mainly control
@ -405,7 +420,11 @@ const struct file_operations zpl_fops_snapdir = {
const struct inode_operations zpl_ops_snapdir = {
.lookup = zpl_snapdir_lookup,
.getattr = zpl_snapdir_getattr,
#ifdef HAVE_RENAME_WANTS_FLAGS
.rename = zpl_snapdir_rename2,
#else
.rename = zpl_snapdir_rename,
#endif
.rmdir = zpl_snapdir_rmdir,
.mkdir = zpl_snapdir_mkdir,
};
@ -492,10 +511,10 @@ zpl_shares_readdir(struct file *filp, void *dirent, filldir_t filldir)
/* ARGSUSED */
static int
zpl_shares_getattr(struct vfsmount *mnt, struct dentry *dentry,
struct kstat *stat)
zpl_shares_getattr_impl(const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int query_flags)
{
struct inode *ip = dentry->d_inode;
struct inode *ip = path->dentry->d_inode;
zfs_sb_t *zsb = ITOZSB(ip);
znode_t *dzp;
int error;
@ -503,11 +522,11 @@ zpl_shares_getattr(struct vfsmount *mnt, struct dentry *dentry,
ZFS_ENTER(zsb);
if (zsb->z_shares_dir == 0) {
error = simple_getattr(mnt, dentry, stat);
generic_fillattr(path->dentry->d_inode, stat);
stat->nlink = stat->size = 2;
stat->atime = CURRENT_TIME;
stat->atime = current_time(ip);
ZFS_EXIT(zsb);
return (error);
return (0);
}
error = -zfs_zget(zsb, zsb->z_shares_dir, &dzp);
@ -521,6 +540,7 @@ zpl_shares_getattr(struct vfsmount *mnt, struct dentry *dentry,
return (error);
}
ZPL_GETATTR_WRAPPER(zpl_shares_getattr);
/*
* The '.zfs/shares' directory file operations.

View File

@ -131,12 +131,15 @@ zpl_fsync(struct file *filp, struct dentry *dentry, int datasync)
return (error);
}
#ifdef HAVE_FILE_AIO_FSYNC
static int
zpl_aio_fsync(struct kiocb *kiocb, int datasync)
{
struct file *filp = kiocb->ki_filp;
return (zpl_fsync(filp, filp->f_path.dentry, datasync));
}
#endif
#elif defined(HAVE_FSYNC_WITHOUT_DENTRY)
/*
* Linux 2.6.35 - 3.0 API,
@ -162,11 +165,14 @@ zpl_fsync(struct file *filp, int datasync)
return (error);
}
#ifdef HAVE_FILE_AIO_FSYNC
static int
zpl_aio_fsync(struct kiocb *kiocb, int datasync)
{
return (zpl_fsync(kiocb->ki_filp, datasync));
}
#endif
#elif defined(HAVE_FSYNC_RANGE)
/*
* Linux 3.1 - 3.x API,
@ -197,11 +203,14 @@ zpl_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
return (error);
}
#ifdef HAVE_FILE_AIO_FSYNC
static int
zpl_aio_fsync(struct kiocb *kiocb, int datasync)
{
return (zpl_fsync(kiocb->ki_filp, kiocb->ki_pos, -1, datasync));
}
#endif
#else
#error "Unsupported fops->fsync() implementation"
#endif
@ -250,20 +259,6 @@ zpl_read_common(struct inode *ip, const char *buf, size_t len, loff_t *ppos,
flags, cr, 0));
}
static ssize_t
zpl_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos)
{
cred_t *cr = CRED();
ssize_t read;
crhold(cr);
read = zpl_read_common(filp->f_mapping->host, buf, len, ppos,
UIO_USERSPACE, filp->f_flags, cr);
crfree(cr);
return (read);
}
static ssize_t
zpl_iter_read_common(struct kiocb *kiocb, const struct iovec *iovp,
unsigned long nr_segs, size_t count, uio_seg_t seg, size_t skip)
@ -277,6 +272,7 @@ zpl_iter_read_common(struct kiocb *kiocb, const struct iovec *iovp,
nr_segs, &kiocb->ki_pos, seg, filp->f_flags, cr, skip);
crfree(cr);
file_accessed(filp);
return (read);
}
@ -301,7 +297,14 @@ static ssize_t
zpl_aio_read(struct kiocb *kiocb, const struct iovec *iovp,
unsigned long nr_segs, loff_t pos)
{
return (zpl_iter_read_common(kiocb, iovp, nr_segs, kiocb->ki_nbytes,
ssize_t ret;
size_t count;
ret = generic_segment_checks(iovp, &nr_segs, &count, VERIFY_WRITE);
if (ret)
return (ret);
return (zpl_iter_read_common(kiocb, iovp, nr_segs, count,
UIO_USERSPACE, 0));
}
#endif /* HAVE_VFS_RW_ITERATE */
@ -339,6 +342,7 @@ zpl_write_common_iovec(struct inode *ip, const struct iovec *iovp, size_t count,
return (wrote);
}
inline ssize_t
zpl_write_common(struct inode *ip, const char *buf, size_t len, loff_t *ppos,
uio_seg_t segment, int flags, cred_t *cr)
@ -352,20 +356,6 @@ zpl_write_common(struct inode *ip, const char *buf, size_t len, loff_t *ppos,
flags, cr, 0));
}
static ssize_t
zpl_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos)
{
cred_t *cr = CRED();
ssize_t wrote;
crhold(cr);
wrote = zpl_write_common(filp->f_mapping->host, buf, len, ppos,
UIO_USERSPACE, filp->f_flags, cr);
crfree(cr);
return (wrote);
}
static ssize_t
zpl_iter_write_common(struct kiocb *kiocb, const struct iovec *iovp,
unsigned long nr_segs, size_t count, uio_seg_t seg, size_t skip)
@ -386,16 +376,42 @@ zpl_iter_write_common(struct kiocb *kiocb, const struct iovec *iovp,
static ssize_t
zpl_iter_write(struct kiocb *kiocb, struct iov_iter *from)
{
size_t count;
ssize_t ret;
uio_seg_t seg = UIO_USERSPACE;
#ifndef HAVE_GENERIC_WRITE_CHECKS_KIOCB
struct file *file = kiocb->ki_filp;
struct address_space *mapping = file->f_mapping;
struct inode *ip = mapping->host;
int isblk = S_ISBLK(ip->i_mode);
count = iov_iter_count(from);
ret = generic_write_checks(file, &kiocb->ki_pos, &count, isblk);
if (ret)
return (ret);
#else
/*
* XXX - ideally this check should be in the same lock region as the
* write operations, so that there's no TOCTTOU race when doing an
* append while someone else grows the file.
*/
ret = generic_write_checks(kiocb, from);
if (ret <= 0)
return (ret);
count = ret;
#endif
if (from->type & ITER_KVEC)
seg = UIO_SYSSPACE;
if (from->type & ITER_BVEC)
seg = UIO_BVEC;
ret = zpl_iter_write_common(kiocb, from->iov, from->nr_segs,
iov_iter_count(from), seg, from->iov_offset);
count, seg, from->iov_offset);
if (ret > 0)
iov_iter_advance(from, ret);
return (ret);
}
#else
@ -403,7 +419,22 @@ static ssize_t
zpl_aio_write(struct kiocb *kiocb, const struct iovec *iovp,
unsigned long nr_segs, loff_t pos)
{
return (zpl_iter_write_common(kiocb, iovp, nr_segs, kiocb->ki_nbytes,
struct file *file = kiocb->ki_filp;
struct address_space *mapping = file->f_mapping;
struct inode *ip = mapping->host;
int isblk = S_ISBLK(ip->i_mode);
size_t count;
ssize_t ret;
ret = generic_segment_checks(iovp, &nr_segs, &count, VERIFY_READ);
if (ret)
return (ret);
ret = generic_write_checks(file, &pos, &count, isblk);
if (ret)
return (ret);
return (zpl_iter_write_common(kiocb, iovp, nr_segs, count,
UIO_USERSPACE, 0));
}
#endif /* HAVE_VFS_RW_ITERATE */
@ -649,8 +680,6 @@ zpl_fallocate_common(struct inode *ip, int mode, loff_t offset, loff_t len)
if (mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
return (error);
crhold(cr);
if (offset < 0 || len <= 0)
return (-EINVAL);
@ -669,6 +698,7 @@ zpl_fallocate_common(struct inode *ip, int mode, loff_t offset, loff_t len)
bf.l_len = len;
bf.l_pid = 0;
crhold(cr);
cookie = spl_fstrans_mark();
error = -zfs_space(ip, F_FREESP, &bf, FWRITE, offset, cr);
spl_fstrans_unmark(cookie);
@ -728,8 +758,7 @@ zpl_ioctl_getflags(struct file *filp, void __user *arg)
* is outside of our jurisdiction.
*/
#define fchange(f0, f1, b0, b1) ((((f0) & (b0)) == (b0)) != \
(((b1) & (f1)) == (f1)))
#define fchange(f0, f1, b0, b1) (!((f0) & (b0)) != !((f1) & (b1)))
static int
zpl_ioctl_setflags(struct file *filp, void __user *arg)
@ -827,18 +856,24 @@ const struct file_operations zpl_file_operations = {
.open = zpl_open,
.release = zpl_release,
.llseek = zpl_llseek,
.read = zpl_read,
.write = zpl_write,
#ifdef HAVE_VFS_RW_ITERATE
#ifdef HAVE_NEW_SYNC_READ
.read = new_sync_read,
.write = new_sync_write,
#endif
.read_iter = zpl_iter_read,
.write_iter = zpl_iter_write,
#else
.read = do_sync_read,
.write = do_sync_write,
.aio_read = zpl_aio_read,
.aio_write = zpl_aio_write,
#endif
.mmap = zpl_mmap,
.fsync = zpl_fsync,
#ifdef HAVE_FILE_AIO_FSYNC
.aio_fsync = zpl_aio_fsync,
#endif
#ifdef HAVE_FILE_FALLOCATE
.fallocate = zpl_fallocate,
#endif /* HAVE_FILE_FALLOCATE */
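The rewritten fchange() macro above evaluates to true exactly when the state of flag bit b0 in the old flag word f0 differs from the state of bit b1 in the new flag word f1, even though the two words use different bit encodings. A minimal sketch of that behaviour, using made-up bit values rather than the real Linux FS_*_FL / ZFS flag constants:
/*
 * Hedged sketch of the corrected fchange() semantics: nonzero exactly when
 * bit b0 of f0 and bit b1 of f1 disagree.  OLD_BIT/NEW_BIT are made up.
 */
#include <assert.h>
#define fchange(f0, f1, b0, b1) (!((f0) & (b0)) != !((f1) & (b1)))
#define OLD_BIT	0x04	/* the flag's bit in the old-style word */
#define NEW_BIT	0x10	/* the same flag's bit in the new-style word */
int
main(void)
{
	/* Clear in both words: no change. */
	assert(fchange(0x00, 0x00, OLD_BIT, NEW_BIT) == 0);
	/* Set in both words: no change, even though the raw values differ. */
	assert(fchange(OLD_BIT, NEW_BIT, OLD_BIT, NEW_BIT) == 0);
	/* Set in one word but not the other: a change in either direction. */
	assert(fchange(0x00, NEW_BIT, OLD_BIT, NEW_BIT) == 1);
	assert(fchange(OLD_BIT, 0x00, OLD_BIT, NEW_BIT) == 1);
	return (0);
}
The double logical negation normalizes each masked value to 0 or 1 before comparing.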

View File

@ -50,7 +50,7 @@ zpl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
int zfs_flags = 0;
zfs_sb_t *zsb = dentry->d_sb->s_fs_info;
if (dlen(dentry) > ZFS_MAXNAMELEN)
if (dlen(dentry) >= ZAP_MAXNAMELEN)
return (ERR_PTR(-ENAMETOOLONG));
crhold(cr);
@ -102,9 +102,13 @@ zpl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
struct dentry *new_dentry;
struct qstr ci_name;
if (strcmp(dname(dentry), pn.pn_buf) == 0) {
new_dentry = d_splice_alias(ip, dentry);
} else {
ci_name.name = pn.pn_buf;
ci_name.len = strlen(pn.pn_buf);
new_dentry = d_add_ci(dentry, ip, &ci_name);
}
kmem_free(pn.pn_buf, ZFS_MAXNAMELEN);
return (new_dentry);
} else {
@ -298,18 +302,25 @@ zpl_rmdir(struct inode * dir, struct dentry *dentry)
}
static int
zpl_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
zpl_getattr_impl(const struct path *path, struct kstat *stat, u32 request_mask,
unsigned int query_flags)
{
int error;
fstrans_cookie_t cookie;
cookie = spl_fstrans_mark();
error = -zfs_getattr_fast(dentry->d_inode, stat);
/*
* XXX request_mask and query_flags currently ignored.
*/
error = -zfs_getattr_fast(path->dentry->d_inode, stat);
spl_fstrans_unmark(cookie);
ASSERT3S(error, <=, 0);
return (error);
}
ZPL_GETATTR_WRAPPER(zpl_getattr);
static int
zpl_setattr(struct dentry *dentry, struct iattr *ia)
@ -320,7 +331,7 @@ zpl_setattr(struct dentry *dentry, struct iattr *ia)
int error;
fstrans_cookie_t cookie;
error = inode_change_ok(ip, ia);
error = setattr_prepare(dentry, ia);
if (error)
return (error);
@ -335,6 +346,9 @@ zpl_setattr(struct dentry *dentry, struct iattr *ia)
vap->va_mtime = ia->ia_mtime;
vap->va_ctime = ia->ia_ctime;
if (vap->va_mask & ATTR_ATIME)
ip->i_atime = ia->ia_atime;
cookie = spl_fstrans_mark();
error = -zfs_setattr(ip, vap, 0, cr);
if (!error && (ia->ia_valid & ATTR_MODE))
@ -349,13 +363,17 @@ zpl_setattr(struct dentry *dentry, struct iattr *ia)
}
static int
zpl_rename(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry)
zpl_rename2(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry, unsigned int flags)
{
cred_t *cr = CRED();
int error;
fstrans_cookie_t cookie;
/* We don't have renameat2(2) support */
if (flags)
return (-EINVAL);
crhold(cr);
cookie = spl_fstrans_mark();
error = -zfs_rename(sdip, dname(sdentry), tdip, dname(tdentry), cr, 0);
@ -366,6 +384,15 @@ zpl_rename(struct inode *sdip, struct dentry *sdentry,
return (error);
}
#ifndef HAVE_RENAME_WANTS_FLAGS
static int
zpl_rename(struct inode *sdip, struct dentry *sdentry,
struct inode *tdip, struct dentry *tdentry)
{
return (zpl_rename2(sdip, sdentry, tdip, tdentry, 0));
}
#endif
static int
zpl_symlink(struct inode *dir, struct dentry *dentry, const char *name)
{
@ -530,7 +557,7 @@ zpl_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
return (-EMLINK);
crhold(cr);
ip->i_ctime = CURRENT_TIME_SEC;
ip->i_ctime = current_time(ip);
igrab(ip); /* Use ihold() if available */
cookie = spl_fstrans_mark();
@ -642,19 +669,13 @@ zpl_revalidate(struct dentry *dentry, unsigned int flags)
}
const struct inode_operations zpl_inode_operations = {
.create = zpl_create,
.link = zpl_link,
.unlink = zpl_unlink,
.symlink = zpl_symlink,
.mkdir = zpl_mkdir,
.rmdir = zpl_rmdir,
.mknod = zpl_mknod,
.rename = zpl_rename,
.setattr = zpl_setattr,
.getattr = zpl_getattr,
#ifdef HAVE_GENERIC_SETXATTR
.setxattr = generic_setxattr,
.getxattr = generic_getxattr,
.removexattr = generic_removexattr,
#endif
.listxattr = zpl_xattr_list,
#ifdef HAVE_INODE_TRUNCATE_RANGE
.truncate_range = zpl_truncate_range,
@ -663,6 +684,9 @@ const struct inode_operations zpl_inode_operations = {
.fallocate = zpl_fallocate,
#endif /* HAVE_INODE_FALLOCATE */
#if defined(CONFIG_FS_POSIX_ACL)
#if defined(HAVE_SET_ACL)
.set_acl = zpl_set_acl,
#endif
#if defined(HAVE_GET_ACL)
.get_acl = zpl_get_acl,
#elif defined(HAVE_CHECK_ACL)
@ -682,14 +706,23 @@ const struct inode_operations zpl_dir_inode_operations = {
.mkdir = zpl_mkdir,
.rmdir = zpl_rmdir,
.mknod = zpl_mknod,
#ifdef HAVE_RENAME_WANTS_FLAGS
.rename = zpl_rename2,
#else
.rename = zpl_rename,
#endif
.setattr = zpl_setattr,
.getattr = zpl_getattr,
#ifdef HAVE_GENERIC_SETXATTR
.setxattr = generic_setxattr,
.getxattr = generic_getxattr,
.removexattr = generic_removexattr,
#endif
.listxattr = zpl_xattr_list,
#if defined(CONFIG_FS_POSIX_ACL)
#if defined(HAVE_SET_ACL)
.set_acl = zpl_set_acl,
#endif
#if defined(HAVE_GET_ACL)
.get_acl = zpl_get_acl,
#elif defined(HAVE_CHECK_ACL)
@ -701,7 +734,9 @@ const struct inode_operations zpl_dir_inode_operations = {
};
const struct inode_operations zpl_symlink_inode_operations = {
#ifdef HAVE_GENERIC_READLINK
.readlink = generic_readlink,
#endif
#if defined(HAVE_GET_LINK_DELAYED) || defined(HAVE_GET_LINK_COOKIE)
.get_link = zpl_get_link,
#elif defined(HAVE_FOLLOW_LINK_COOKIE) || defined(HAVE_FOLLOW_LINK_NAMEIDATA)
@ -712,20 +747,27 @@ const struct inode_operations zpl_symlink_inode_operations = {
#endif
.setattr = zpl_setattr,
.getattr = zpl_getattr,
#ifdef HAVE_GENERIC_SETXATTR
.setxattr = generic_setxattr,
.getxattr = generic_getxattr,
.removexattr = generic_removexattr,
#endif
.listxattr = zpl_xattr_list,
};
const struct inode_operations zpl_special_inode_operations = {
.setattr = zpl_setattr,
.getattr = zpl_getattr,
#ifdef HAVE_GENERIC_SETXATTR
.setxattr = generic_setxattr,
.getxattr = generic_getxattr,
.removexattr = generic_removexattr,
#endif
.listxattr = zpl_xattr_list,
#if defined(CONFIG_FS_POSIX_ACL)
#if defined(HAVE_SET_ACL)
.set_acl = zpl_set_acl,
#endif
#if defined(HAVE_GET_ACL)
.get_acl = zpl_get_acl,
#elif defined(HAVE_CHECK_ACL)

View File

@ -936,9 +936,8 @@ xattr_handler_t zpl_xattr_security_handler = {
*/
#ifdef CONFIG_FS_POSIX_ACL
int
zpl_set_acl(struct inode *ip, int type, struct posix_acl *acl)
zpl_set_acl(struct inode *ip, struct posix_acl *acl, int type)
{
struct super_block *sb = ITOZSB(ip)->z_sb;
char *name, *value = NULL;
int error = 0;
size_t size = 0;
@ -964,7 +963,7 @@ zpl_set_acl(struct inode *ip, int type, struct posix_acl *acl)
*/
if (ip->i_mode != mode) {
ip->i_mode = mode;
ip->i_ctime = current_fs_time(sb);
ip->i_ctime = current_time(ip);
zfs_mark_inode_dirty(ip);
}
@ -1130,7 +1129,7 @@ zpl_init_acl(struct inode *ip, struct inode *dir)
if (!acl) {
ip->i_mode &= ~current_umask();
ip->i_ctime = current_fs_time(ITOZSB(ip)->z_sb);
ip->i_ctime = current_time(ip);
zfs_mark_inode_dirty(ip);
return (0);
}
@ -1140,7 +1139,7 @@ zpl_init_acl(struct inode *ip, struct inode *dir)
umode_t mode;
if (S_ISDIR(ip->i_mode)) {
error = zpl_set_acl(ip, ACL_TYPE_DEFAULT, acl);
error = zpl_set_acl(ip, acl, ACL_TYPE_DEFAULT);
if (error)
goto out;
}
@ -1151,7 +1150,7 @@ zpl_init_acl(struct inode *ip, struct inode *dir)
ip->i_mode = mode;
zfs_mark_inode_dirty(ip);
if (error > 0)
error = zpl_set_acl(ip, ACL_TYPE_ACCESS, acl);
error = zpl_set_acl(ip, acl, ACL_TYPE_ACCESS);
}
}
out:
@ -1178,7 +1177,7 @@ zpl_chmod_acl(struct inode *ip)
error = __posix_acl_chmod(&acl, GFP_KERNEL, ip->i_mode);
if (!error)
error = zpl_set_acl(ip, ACL_TYPE_ACCESS, acl);
error = zpl_set_acl(ip, acl, ACL_TYPE_ACCESS);
zpl_posix_acl_release(acl);
@ -1308,7 +1307,7 @@ __zpl_xattr_acl_set_access(struct inode *ip, const char *name,
acl = NULL;
}
error = zpl_set_acl(ip, type, acl);
error = zpl_set_acl(ip, acl, type);
zpl_posix_acl_release(acl);
return (error);
@ -1348,7 +1347,7 @@ __zpl_xattr_acl_set_default(struct inode *ip, const char *name,
acl = NULL;
}
error = zpl_set_acl(ip, type, acl);
error = zpl_set_acl(ip, acl, type);
zpl_posix_acl_release(acl);
return (error);
@ -1441,3 +1440,103 @@ zpl_xattr_handler(const char *name)
return (NULL);
}
#if !defined(HAVE_POSIX_ACL_RELEASE) || defined(HAVE_POSIX_ACL_RELEASE_GPL_ONLY)
struct acl_rel_struct {
struct acl_rel_struct *next;
struct posix_acl *acl;
clock_t time;
};
#define ACL_REL_GRACE (60*HZ)
#define ACL_REL_WINDOW (1*HZ)
#define ACL_REL_SCHED (ACL_REL_GRACE+ACL_REL_WINDOW)
/*
* Lockless multi-producer single-consumer fifo list.
* Nodes are added to tail and removed from head. Tail pointer is our
* synchronization point. It always points to the next pointer of the last
* node, or to head if the list is empty.
*/
static struct acl_rel_struct *acl_rel_head = NULL;
static struct acl_rel_struct **acl_rel_tail = &acl_rel_head;
static void
zpl_posix_acl_free(void *arg)
{
struct acl_rel_struct *freelist = NULL;
struct acl_rel_struct *a;
clock_t new_time;
boolean_t refire = B_FALSE;
ASSERT3P(acl_rel_head, !=, NULL);
while (acl_rel_head) {
a = acl_rel_head;
if (ddi_get_lbolt() - a->time >= ACL_REL_GRACE) {
/*
* If a is the last node we need to reset tail, but we
* need to use cmpxchg to make sure it is still the
* last node.
*/
if (acl_rel_tail == &a->next) {
acl_rel_head = NULL;
if (cmpxchg(&acl_rel_tail, &a->next,
&acl_rel_head) == &a->next) {
ASSERT3P(a->next, ==, NULL);
a->next = freelist;
freelist = a;
break;
}
}
/*
* a is not the last node; make sure the next pointer is set
* by the adder, then advance the head.
*/
while (ACCESS_ONCE(a->next) == NULL)
cpu_relax();
acl_rel_head = a->next;
a->next = freelist;
freelist = a;
} else {
/*
* a is still in its grace period.  We are responsible for
* rescheduling the free task, since the adder will only do
* so if the list is empty.
*/
new_time = a->time + ACL_REL_SCHED;
refire = B_TRUE;
break;
}
}
if (refire)
taskq_dispatch_delay(system_taskq, zpl_posix_acl_free, NULL,
TQ_SLEEP, new_time);
while (freelist) {
a = freelist;
freelist = a->next;
kfree(a->acl);
kmem_free(a, sizeof (struct acl_rel_struct));
}
}
void
zpl_posix_acl_release_impl(struct posix_acl *acl)
{
struct acl_rel_struct *a, **prev;
a = kmem_alloc(sizeof (struct acl_rel_struct), KM_SLEEP);
a->next = NULL;
a->acl = acl;
a->time = ddi_get_lbolt();
/* Atomically point the tail at us and get the previous tail. */
prev = xchg(&acl_rel_tail, &a->next);
ASSERT3P(*prev, ==, NULL);
*prev = a;
/* if it was empty before, schedule the free task */
if (prev == &acl_rel_head)
taskq_dispatch_delay(system_taskq, zpl_posix_acl_free, NULL,
TQ_SLEEP, ddi_get_lbolt() + ACL_REL_SCHED);
}
#endif
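The delayed-free list above is a lockless multi-producer, single-consumer queue whose tail pointer is the synchronization point. A userspace analog of the producer-side push, assuming C11 <stdatomic.h> in place of the kernel's xchg(); the type and function names are illustrative:
/*
 * Hedged userspace analog of the push in zpl_posix_acl_release_impl():
 * atomically swing the tail to the new node's next field, then publish the
 * node through the previous tail slot.  Not the kernel implementation.
 */
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
struct rel_node {
	struct rel_node *next;
	int payload;
};
static struct rel_node *rel_head = NULL;
static _Atomic(struct rel_node **) rel_tail = &rel_head;
static void
rel_push(struct rel_node *n)
{
	struct rel_node **prev;
	n->next = NULL;
	/* Claim the tail slot; later producers will chain behind us. */
	prev = atomic_exchange(&rel_tail, &n->next);
	/* Publish the node; the consumer waits until this store is visible. */
	*prev = n;
}
int
main(void)
{
	struct rel_node a = { .payload = 1 };
	struct rel_node b = { .payload = 2 };
	rel_push(&a);
	rel_push(&b);
	printf("%d %d\n", rel_head->payload, rel_head->next->payload);
	return (0);
}
A producer only ever touches the tail and its own node, so no lock is needed; the single consumer walks from the head, much as zpl_posix_acl_free() walks acl_rel_head above.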

View File

@ -174,7 +174,7 @@ zvol_is_zvol(const char *device)
struct block_device *bdev;
unsigned int major;
bdev = lookup_bdev(device);
bdev = vdev_lookup_bdev(device);
if (IS_ERR(bdev))
return (B_FALSE);
@ -1615,14 +1615,12 @@ zvol_rename_minors_impl(const char *oldname, const char *newname)
{
zvol_state_t *zv, *zv_next;
int oldnamelen, newnamelen;
char *name;
if (zvol_inhibit_dev)
return;
oldnamelen = strlen(oldname);
newnamelen = strlen(newname);
name = kmem_alloc(MAXNAMELEN, KM_SLEEP);
mutex_enter(&zvol_state_lock);
@ -1638,16 +1636,15 @@ zvol_rename_minors_impl(const char *oldname, const char *newname)
} else if (strncmp(zv->zv_name, oldname, oldnamelen) == 0 &&
(zv->zv_name[oldnamelen] == '/' ||
zv->zv_name[oldnamelen] == '@')) {
snprintf(name, MAXNAMELEN, "%s%c%s", newname,
char *name = kmem_asprintf("%s%c%s", newname,
zv->zv_name[oldnamelen],
zv->zv_name + oldnamelen + 1);
zvol_rename_minor(zv, name);
kmem_free(name, strlen(name) + 1);
}
}
mutex_exit(&zvol_state_lock);
kmem_free(name, MAXNAMELEN);
}
typedef struct zvol_snapdev_cb_arg {

View File

@ -186,6 +186,69 @@ chmod u+x ${RPM_BUILD_ROOT}%{kmodinstdir_prefix}/*/extra/*/*/*
rm -rf $RPM_BUILD_ROOT
%changelog
* Mon Jul 10 2017 Tony Hutter <hutter2@llnl.gov> - 0.6.5.11-1
- Linux 4.12 compat: super_setup_bdi_name() - add missing code zfsonlinux/zfs#6089 zfsonlinux/zfs#6324
- Musl libc fixes zfsonlinux/zfs#6310
- Increase zfs_vdev_async_write_min_active to 2 zfsonlinux/zfs#5926
- Fix int overflow in zbookmark_is_before()
- Fix RHEL 7.4 bio_set_op_attrs build error zfsonlinux/zfs#6234 zfsonlinux/zfs#6271
- GCC 7.1 fixes zfsonlinux/zfs#6253
- Remove complicated libspl assert wrappers zfsonlinux/zfs#4449
- Compatibility with glibc-2.23 zfsonlinux/zfs#6132
- glibc 2.5 compat: use correct header for makedev() et al. zfsonlinux/zfs#5945
* Mon Jun 12 2017 Tony Hutter <hutter2@llnl.gov> - 0.6.5.10-1
- OpenZFS 8005 - poor performance of 1MB writes on certain RAID-Z configurations zfsonlinux/zfs#5931
- Add MS_MANDLOCK mount failure message zfsonlinux/zfs#4729 zfsonlinux/zfs#6199
- Fix import of wrong spare/l2 device when path changes zfsonlinux/zfs#6158
- Fix import finding spare/l2cache when path changes zfsonlinux/zfs#6158
- Linux 4.9 compat: fix zfs_ctldir xattr handling zfsonlinux/zfs#6189
- Linux 4.12 compat: fix super_setup_bdi_name() call zfsonlinux/zfs#6147
- Linux 4.12 compat: CURRENT_TIME removed zfsonlinux/zfs#6114
- Linux 4.12 compat: super_setup_bdi_name() zfsonlinux/zfs#6089
- Limit zfs_dirty_data_max_max to 4G zfsonlinux/zfs#6072 zfsonlinux/zfs#6081
- OpenZFS 8166 - zpool scrub thinks it repaired offline device zfsonlinux/zfs#5806 zfsonlinux/zfs#6103
- vdev_id: fix failure due to multipath -l bug zfsonlinux/zfs#6039
- Guarantee PAGESIZE alignment for large zio buffers zfsonlinux/zfs#6084
- Fix harmless "BARRIER is deprecated" kernel warning on Centos 6.8 zfsonlinux/zfs#5739 zfsonlinux/zfs#5828
- Add kmap_atomic in dmu_bio_copy
- zdb: segfault in dump_bpobj_subobjs() zfsonlinux/zfs#3905
- Fix atomic_sub_64() i386 assembly implementation zfsonlinux/zfs#5671 zfsonlinux/zfs#5717
- Fix loop device becomes read-only zfsonlinux/zfs#5776 zfsonlinux/zfs#5855
- Allow ZVOL bookmarks to be listed recursively zfsonlinux/zfs#4503 zfsonlinux/zfs#5072
- Fix zfs-mount.service failure on boot zfsonlinux/zfs#5719
- Fix iput() calls within a tx zfsonlinux/zfs#5758
- Fix off by one in zpl_lookup zfsonlinux/zfs#5768
- Linux 4.11 compat: iops.getattr and friends zfsonlinux/zfs#5875
- Linux 4.11 compat: avoid refcount_t name conflict zfsonlinux/zfs#5823 zfsonlinux/zfs#5842
* Fri Feb 3 2017 Brian Behlendorf <behlendorf1@llnl.gov> - 0.6.5.9-1
- Use large stacks when available zfsonlinux/zfs#4059
- Use set_cached_acl() and forget_cached_acl() when possible zfsonlinux/zfs#5378
- Fix batch free zpl_posix_acl_release zfsonlinux/zfs#5340 zfsonlinux/zfs#5353
- Fix zfsctl_snapshot_{,un}mount() issues zfsonlinux/zfs#5250
- Fix systemd services configuration through preset file zfsonlinux/zfs#5356
- Fix RLIMIT_FSIZE enforcement zfsonlinux/zfs#5587 zfsonlinux/zfs#5673 zfsonlinux/zfs#5720 zfsonlinux/zfs#5726
- Fix leak on zfs_sb_create() failure zfsonlinux/zfs#5490 zfsonlinux/zfs#5496
- Fix zpl_fallocate_common() cred leak zfsonlinux/zfs#5244 zfsonlinux/zfs#5330
- Fix fchange in zpl_ioctl_setflags() zfsonlinux/zfs#5486
- Fix wrong operator in xvattr.h zfsonlinux/zfs#5486
- Fix counting '@' in dataset namelen zfsonlinux/zfs#5432 zfsonlinux/zfs#5456
- Fix dmu_object_size_from_db() call under spinlock zfsonlinux/zfs#3858
- Fix lookup_bdev() on Ubuntu zfsonlinux/zfs#5336
- Fix receiving custom snapshot properties zfsonlinux/zfs#5189
- Fix bio merging w/noop scheduler zfsonlinux/zfs#5181
- Fix sync behavior for disk vdevs zfsonlinux/zfs#4858
- Fix uninitialized variable in avl_add() zfsonlinux/zfs#3609
- Fix tq_lock contention by making write taskq non-dynamic zfsonlinux/zfs#5236
- Fix atime handling (relatime, lazytime) zfsonlinux/zfs#4482
- Linux 4.10 compat: BIO flag changes zfsonlinux/zfs#5499
- Linux 4.9 compat: inode_change_ok() renamed setattr_prepare() zfsonlinux/zfs#5307
- Linux 4.9 compat: remove iops->{set,get,remove}xattr zfsonlinux/zfs#5307
- Linux 4.9 compat: iops->rename() wants flags zfsonlinux/zfs#5307
- Linux 4.9 compat: file_operations->aio_fsync removal zfsonlinux/zfs#5393
- Linux 4.9 compat: Remove dir inode operations from zpl_inode_operations zfsonlinux/zfs#5307
- Linux 4.7 compat: Fix deadlock during lookup on case-insensitive zfsonlinux/zfs#5124 zfsonlinux/zfs#5141 zfsonlinux/zfs#5147 zfsonlinux/zfs#5148
- Linux 3.14 compat: assign inode->set_acl zfsonlinux/zfs#5371 zfsonlinux/zfs#5375
- Linux 2.6.32 compat: Reorder HAVE_BIO_RW_* checks zfsonlinux/zfs#4951 zfsonlinux/zfs#4959
- Remove dead root pool import code zfsonlinux/zfs#4951
* Fri Sep 9 2016 Ned Bass <bass6@llnl.gov> - 0.6.5.8-1
- Linux 4.6, 4.7 and 4.8 compatibility zfsonlinux/spl#549 zfsonlinux/spl#563 zfsonlinux/spl#565 zfsonlinux/spl#566 zfsonlinux/zfs#4664 zfsonlinux/zfs#4665 zfsonlinux/zfs#4717 zfsonlinux/zfs#4726 zfsonlinux/zfs#4892 zfsonlinux/zfs#4899 zfsonlinux/zfs#4922 zfsonlinux/zfs#4944 zfsonlinux/zfs#4946 zfsonlinux/zfs#4951
- Fix new tunable to ignore hole_birth, enabled by default zfsonlinux/zfs#4833

View File

@ -40,6 +40,7 @@
# Generic enable switch for systemd
%if %{with systemd}
%define _systemd 1
%define systemd_svcs zfs-import-cache.service zfs-import-scan.service zfs-mount.service zfs-share.service zfs-zed.service zfs.target
%endif
# RHEL >= 7 comes with systemd
@ -240,7 +241,7 @@ find %{?buildroot}%{_libdir} -name '*.la' -exec rm -f {} \;
%post
%if 0%{?_systemd}
%systemd_post zfs.target
%systemd_post %{systemd_svcs}
%else
if [ -x /sbin/chkconfig ]; then
/sbin/chkconfig --add zfs-import
@ -253,7 +254,7 @@ exit 0
%preun
%if 0%{?_systemd}
%systemd_preun zfs.target
%systemd_preun %{systemd_svcs}
%else
if [ $1 -eq 0 ] && [ -x /sbin/chkconfig ]; then
/sbin/chkconfig --del zfs-import
@ -266,7 +267,7 @@ exit 0
%postun
%if 0%{?_systemd}
%systemd_postun zfs.target
%systemd_postun %{systemd_svcs}
%endif
%files
@ -327,6 +328,69 @@ exit 0
%endif
%changelog
* Mon Jul 10 2017 Tony Hutter <hutter2@llnl.gov> - 0.6.5.11-1
- Linux 4.12 compat: super_setup_bdi_name() - add missing code zfsonlinux/zfs#6089 zfsonlinux/zfs#6324
- Musl libc fixes zfsonlinux/zfs#6310
- Increase zfs_vdev_async_write_min_active to 2 zfsonlinux/zfs#5926
- Fix int overflow in zbookmark_is_before()
- Fix RHEL 7.4 bio_set_op_attrs build error zfsonlinux/zfs#6234 zfsonlinux/zfs#6271
- GCC 7.1 fixes zfsonlinux/zfs#6253
- Remove complicated libspl assert wrappers zfsonlinux/zfs#4449
- Compatibility with glibc-2.23 zfsonlinux/zfs#6132
- glibc 2.5 compat: use correct header for makedev() et al. zfsonlinux/zfs#5945
* Mon Jun 12 2017 Tony Hutter <hutter2@llnl.gov> - 0.6.5.10-1
- OpenZFS 8005 - poor performance of 1MB writes on certain RAID-Z configurations zfsonlinux/zfs#5931
- Add MS_MANDLOCK mount failure message zfsonlinux/zfs#4729 zfsonlinux/zfs#6199
- Fix import of wrong spare/l2 device when path changes zfsonlinux/zfs#6158
- Fix import finding spare/l2cache when path changes zfsonlinux/zfs#6158
- Linux 4.9 compat: fix zfs_ctldir xattr handling zfsonlinux/zfs#6189
- Linux 4.12 compat: fix super_setup_bdi_name() call zfsonlinux/zfs#6147
- Linux 4.12 compat: CURRENT_TIME removed zfsonlinux/zfs#6114
- Linux 4.12 compat: super_setup_bdi_name() zfsonlinux/zfs#6089
- Limit zfs_dirty_data_max_max to 4G zfsonlinux/zfs#6072 zfsonlinux/zfs#6081
- OpenZFS 8166 - zpool scrub thinks it repaired offline device zfsonlinux/zfs#5806 zfsonlinux/zfs#6103
- vdev_id: fix failure due to multipath -l bug zfsonlinux/zfs#6039
- Guarantee PAGESIZE alignment for large zio buffers zfsonlinux/zfs#6084
- Fix harmless "BARRIER is deprecated" kernel warning on Centos 6.8 zfsonlinux/zfs#5739 zfsonlinux/zfs#5828
- Add kmap_atomic in dmu_bio_copy
- zdb: segfault in dump_bpobj_subobjs() zfsonlinux/zfs#3905
- Fix atomic_sub_64() i386 assembly implementation zfsonlinux/zfs#5671 zfsonlinux/zfs#5717
- Fix loop device becomes read-only zfsonlinux/zfs#5776 zfsonlinux/zfs#5855
- Allow ZVOL bookmarks to be listed recursively zfsonlinux/zfs#4503 zfsonlinux/zfs#5072
- Fix zfs-mount.service failure on boot zfsonlinux/zfs#5719
- Fix iput() calls within a tx zfsonlinux/zfs#5758
- Fix off by one in zpl_lookup zfsonlinux/zfs#5768
- Linux 4.11 compat: iops.getattr and friends zfsonlinux/zfs#5875
- Linux 4.11 compat: avoid refcount_t name conflict zfsonlinux/zfs#5823 zfsonlinux/zfs#5842
* Fri Feb 3 2017 Brian Behlendorf <behlendorf1@llnl.gov> - 0.6.5.9-1
- Use large stacks when available zfsonlinux/zfs#4059
- Use set_cached_acl() and forget_cached_acl() when possible zfsonlinux/zfs#5378
- Fix batch free zpl_posix_acl_release zfsonlinux/zfs#5340 zfsonlinux/zfs#5353
- Fix zfsctl_snapshot_{,un}mount() issues zfsonlinux/zfs#5250
- Fix systemd services configuration through preset file zfsonlinux/zfs#5356
- Fix RLIMIT_FSIZE enforcement zfsonlinux/zfs#5587 zfsonlinux/zfs#5673 zfsonlinux/zfs#5720 zfsonlinux/zfs#5726
- Fix leak on zfs_sb_create() failure zfsonlinux/zfs#5490 zfsonlinux/zfs#5496
- Fix zpl_fallocate_common() cred leak zfsonlinux/zfs#5244 zfsonlinux/zfs#5330
- Fix fchange in zpl_ioctl_setflags() zfsonlinux/zfs#5486
- Fix wrong operator in xvattr.h zfsonlinux/zfs#5486
- Fix counting '@' in dataset namelen zfsonlinux/zfs#5432 zfsonlinux/zfs#5456
- Fix dmu_object_size_from_db() call under spinlock zfsonlinux/zfs#3858
- Fix lookup_bdev() on Ubuntu zfsonlinux/zfs#5336
- Fix receiving custom snapshot properties zfsonlinux/zfs#5189
- Fix bio merging w/noop scheduler zfsonlinux/zfs#5181
- Fix sync behavior for disk vdevs zfsonlinux/zfs#4858
- Fix uninitialized variable in avl_add() zfsonlinux/zfs#3609
- Fix tq_lock contention by making write taskq non-dynamic zfsonlinux/zfs#5236
- Fix atime handling (relatime, lazytime) zfsonlinux/zfs#4482
- Linux 4.10 compat: BIO flag changes zfsonlinux/zfs#5499
- Linux 4.9 compat: inode_change_ok() renamed setattr_prepare() zfsonlinux/zfs#5307
- Linux 4.9 compat: remove iops->{set,get,remove}xattr zfsonlinux/zfs#5307
- Linux 4.9 compat: iops->rename() wants flags zfsonlinux/zfs#5307
- Linux 4.9 compat: file_operations->aio_fsync removal zfsonlinux/zfs#5393
- Linux 4.9 compat: Remove dir inode operations from zpl_inode_operations zfsonlinux/zfs#5307
- Linux 4.7 compat: Fix deadlock during lookup on case-insensitive zfsonlinux/zfs#5124 zfsonlinux/zfs#5141 zfsonlinux/zfs#5147 zfsonlinux/zfs#5148
- Linux 3.14 compat: assign inode->set_acl zfsonlinux/zfs#5371 zfsonlinux/zfs#5375
- Linux 2.6.32 compat: Reorder HAVE_BIO_RW_* checks zfsonlinux/zfs#4951 zfsonlinux/zfs#4959
- Remove dead root pool import code zfsonlinux/zfs#4951
* Fri Sep 9 2016 Ned Bass <bass6@llnl.gov> - 0.6.5.8-1
- Linux 4.6, 4.7 and 4.8 compatibility zfsonlinux/spl#549 zfsonlinux/spl#563 zfsonlinux/spl#565 zfsonlinux/spl#566 zfsonlinux/zfs#4664 zfsonlinux/zfs#4665 zfsonlinux/zfs#4717 zfsonlinux/zfs#4726 zfsonlinux/zfs#4892 zfsonlinux/zfs#4899 zfsonlinux/zfs#4922 zfsonlinux/zfs#4944 zfsonlinux/zfs#4946 zfsonlinux/zfs#4951
- Fix new tunable to ignore hole_birth, enabled by default zfsonlinux/zfs#4833