When spare or l2cache device path changes, zpool import will not fix up
their paths like normal vdev. The issue is that when you supply a pool
name argument to zpool import, it will use it to filter out device which
doesn't have the pool name in the label. Since spare and l2cache device
never have that in the label, they'll always get filtered out.
We fix this by making sure we never filter out a spare or l2cache
device.
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#6158
Also enable lazytime in mount.zfs
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4482
The variable snapprops_nvlist was never initialized, so properties
were not applied to the received snapshot.
Additionally, add zfs_receive_013_pos.ksh script to ZFS test suite to exercise
'zfs receive' functionality for user properties.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#4338
If the loop index i comes to (ZFS_GET_NCOLS - 1), the cbp->cb_columns[i + 1]
actually read the data of cbp->cb_colwidths[0], which means the array
subscript is above array bounds.
Luckily the cbp->cb_colwidths[0] is always 0 and it seems we haven't
looped enough times to exceed the array bounds so far, but it's really
a secluded risk someday.
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5003
zfs_get_name() expects a parameter of type zfs_handle_t *zhp , but
gets an invalid parameter type of zfs_handle_t **zhp actually in
libzfs_dataset_cmp(), which may trigger a coredump if called.
libzfs_dataset_cmp() working normally so far, just because all the
callers only give datasets of type ZFS_TYPE_FILESYSTEM to it, we
compared their mountpoint and return, luckily.
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4919
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking
for trouble. We should check every dataset on import, using a 1024 byte
buffer and checking each time to see if the dataset's new name is longer
than 256 bytes.
OpenZFS-issue: https://www.illumos.org/issues/6876
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ca8674e
4953 zfs rename <snapshot> need not involve libshare
4954 "zfs create" need not involve libshare if we are not sharing
4955 libshare's get_zfs_dataset need not sort the datasets
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Approved by: Garrett D'Amore <garrett@damore.org>
References:
https://www.illumos.org/issues/4953https://www.illumos.org/issues/4954https://www.illumos.org/issues/4955https://github.com/illumos/illumos-gate/commit/33cde0d
Porting notes:
- Dropped qsort libshare_zfs.c hunk, no equivalent ZoL code.
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#4219
When ZFS partitions a block device it must wait for udev to create
both a device node and all the device symlinks. This process takes
a variable length of time and depends on factors such how many links
must be created, the complexity of the rules, etc. Complicating
the situation further it is not uncommon for udev to create and
then remove a link multiple times while processing the udev rules.
Given the above, the existing scheme of waiting for an expected
partition to appear by name isn't 100% reliable. At this point
udev may still remove and recreate think link resulting in the
kernel modules being unable to open the device.
In order to address this the zpool_label_disk_wait() function
has been updated to use libudev. Until the registered system
device acknowledges that it in fully initialized the function
will wait. Once fully initialized all device links are checked
and allowed to settle for 50ms. This makes it far more likely
that all the device nodes will exist when the kernel modules
need to open them.
For systems without libudev an alternate zpool_label_disk_wait()
was updated to include a settle time. In addition, the kernel
modules were updated to include retry logic for this ENOENT case.
Due to the improved checks in the utilities it is unlikely this
logic will be invoked. However, if the rare event it is needed
it will prevent a failure.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes#4523Closes#3708Closes#4077Closes#4144Closes#4214Closes#4517
When importing a pool using the blkid cache only the device
node path was added to the list of known paths for a device.
This results in 'zpool import' always using the sdX names
in preference to the 'path' name stored in the label.
To fix the issue the blkid import path has been updated to
add both the 'path', 'devid', and 'devname' names from the
label to the known paths. A sanity check is done to ensure
these paths do refer to the same device identified by blkid.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#4523Closes#3043
When partitioning a device a name may be specified for each partition.
Internally zfs doesn't use this partition name for anything so it
has always just been set to "zfs".
However this isn't optimal because udev will create symlinks using
this name in /dev/disk/by-partlabel/. If the name isn't unique
then all the links cannot be created.
Therefore a random 64-bit value has been added to the partition
label, i.e "zfs-1234567890abcdef". Additional information could
be encoded here but since partitions may be reused that might
result in confusion and it was decided against.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes#4517
When extracting tokens from the string strtok(2) is allowed to modify
the passed buffer. Therefore the zfs_strcmp_pathname() function must
make a copy of the passed string before passing it to strtok(3).
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@intel.com>
Closes#4312
The following options have been added to the zpool add, iostat,
list, status, and split subcommands. The default behavior was
not modified, from zfs(8).
-g Display vdev GUIDs instead of the normal short
device names. These GUIDs can be used in-place of
device names for the zpool detach/off‐
line/remove/replace commands.
-L Display real paths for vdevs resolving all symbolic
links. This can be used to lookup the current block
device name regardless of the /dev/disk/ path used
to open it.
-p Display full paths for vdevs instead of only the
last component of the path. This can be used in
conjunction with the -L flag.
This behavior may also be enabled using the following environment
variables.
ZPOOL_VDEV_NAME_GUID
ZPOOL_VDEV_NAME_FOLLOW_LINKS
ZPOOL_VDEV_NAME_PATH
This change is based on worked originally started by Richard Yao
to add a -g option. Then extended by @ilovezfs to add a -L option
for openzfsonosx. Those changes have been merged, re-factored,
a -p option added and extended to all relevant zpool subcommands.
Original-patch-by: Richard Yao <ryao@gentoo.org>
Extended-by: ilovezfs <ilovezfs@icloud.com>
Extended-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ilovezfs <ilovezfs@icloud.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2011Closes#4341
As of gcc 5.1.1 20150618 (Red Hat 5.1.1-4) the -Werror=maybe-uninitialized
check detects that 'snapname' in recv_incremental_replication() may not be
initialized. Explicitly initialize the variable to resolved the warning.
libzfs_sendrecv.c: In function ‘recv_incremental_replication’:
libzfs_sendrecv.c:2019:2: error: ‘snapname’ may be used uninitialized in
(void) snprintf(buf, sizeof (buf), "%s@%s", fsname, snapname);
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
ZFS incorrectly uses directory-based extended attributes even when
xattr=sa is specified as a dataset property or mount option. Support to
honor temporary mount options including "xattr" was added in commit
0282c4137e. There are two issues with the
mount option handling:
* Libzfs has historically included "xattr" in its list of default mount
options. This overrides the dataset property, so the dataset is always
configured to use directory-based xattrs even when the xattr dataset
property is set to off or sa. Address this by removing "xattr" from
the set of default mount options in libzfs.
* There was no way to enable system attribute-based extended attributes
using temporary mount options. Add the mount options "saxattr" and
"dirxattr" which enable the xattr behavior their names suggest. This
approach has the advantages of mirroring the valid xattr dataset
property values and following existing conventions for mount option
names.
Issue #3787
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Add the required kernel side infrastructure to parse arbitrary
mount options. This enables us to support temporary mount
options in largely the same way it is handled on other platforms.
See the 'Temporary Mount Point Properties' section of zfs(8)
for complete details.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#985Closes#3351
Since ZoL allows large blocks to be used by volumes, unlike upstream
illumos, the feature flag must be checked prior to volume creation.
This is critical because unlike filesystems, volumes will create a
object which uses large blocks as part of the create. Therefore, it
cannot be safely checked in zfs_check_settable() after the dataset
can been created.
In addition this patch updates the relevant error messages to use
zfs_nicenum() to print the maximum blocksize.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3591
Build products from an out of tree build should be written
relative to the build directory. Sources should be referred
to by their locations in the source directory.
This is accomplished by adding the 'src' and 'obj' variables
for the module Makefile.am, using relative paths to reference
source files, and by setting VPATH when source files are not
co-located with the Makefile. This enables the following:
$ mkdir build
$ cd build
$ ../configure \
--with-spl=$HOME/src/git/spl/ \
--with-spl-obj=$HOME/src/git/spl/build
$ make -s
This change also has the advantage of resolving the following
warning which is generated by modern versions of automake.
Makefile.am:00: warning: source file 'xxx' is in a subdirectory,
Makefile.am:00: but option 'subdir-objects' is disabled
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#1082
We fail to make partitions on xvd (Xen Virtual Block) devices. This also
causes debug builds of zpool create to return an error when given xen
virtual block devices. These devices should be given the same treatment
as vd (KVM Virtual Block) devices, so we adjust the relevant code paths.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3576
All fprintf() error messages are moved out of the libzfs_init()
library function where they never belonged in the first place. A
libzfs_error_init() function is added to provide useful error
messages for the most common causes of failure.
Additionally, in libzfs_run_process() the 'rc' variable was renamed
to 'error' for consistency with the rest of the code base.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
While module loading itself is synchronous the creation of the /dev/zfs
device is not. This is because /dev/zfs is typically created by a udev
rule after the module is registered and presented to user space through
sysfs. This small window between module loading and device creation
can result in spurious failures of libzfs_init().
This patch closes that race by extending libzfs_init() so it can detect
that the modules are loaded and only if required wait for the /dev/zfs
device to be created. This allows scripts to reliably use the following
shell construct without the need for additional error handling.
$ /sbin/modprobe zfs && /sbin/zpool import -a
To minimize the potential time waiting in libzfs_init() a strategy
similar to adaptive mutexes is employed. The function will busy-wait
for up to 10ms based on the expectation that the modules were just
loaded and therefore the /dev/zfs will be created imminently. If it
takes longer than this it will fall back to polling for up to 10 seconds.
This behavior can be customized to some degree by setting the following
new environment variables. This functionality is provided for backwards
compatibility with existing scripts which depend on the module auto-load
behavior. By default module auto-loading is now disabled.
* ZFS_MODULE_LOADING="YES|yes|ON|on" - Attempt to load modules.
* ZFS_MODULE_TIMEOUT="<seconds>" - Seconds to wait for /dev/zfs
The zfs-import-* systemd service files have been updated to call
'/sbin/modprobe zfs' so they no longer rely on the legacy auto-loading
behavior.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes#2556
Prefixing an octal value with a leading zero is the standard way
to disambiguate it. This change only impacts the `zfs diff` output
and is therefore very limited in scope.
Signed-off-by: Hajo M<C3><B6>ller <dasjoe@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3417
5027 zfs large block support
Reviewed by: Alek Pinchuk <pinchuk.alek@gmail.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Reviewed by: Saso Kiselkov <skiselkov.ml@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
References:
https://www.illumos.org/issues/5027https://github.com/illumos/illumos-gate/commit/b515258
Porting Notes:
* Included in this patch is a tiny ISP2() cleanup in zio_init() from
Illumos 5255.
* Unlike the upstream Illumos commit this patch does not impose an
arbitrary 128K block size limit on volumes. Volumes, like filesystems,
are limited by the zfs_max_recordsize=1M module option.
* By default the maximum record size is limited to 1M by the module
option zfs_max_recordsize. This value may be safely increased up to
16M which is the largest block size supported by the on-disk format.
At the moment, 1M blocks clearly offer a significant performance
improvement but the benefits of going beyond this for the majority
of workloads are less clear.
* The illumos version of this patch increased DMU_MAX_ACCESS to 32M.
This was determined not to be large enough when using 16M blocks
because the zfs_make_xattrdir() function will fail (EFBIG) when
assigning a TX. This was immediately observed under Linux because
all newly created files must have a security xattr created and
that was failing. Therefore, we've set DMU_MAX_ACCESS to 64M.
* On 32-bit platforms a hard limit of 1M is set for blocks due
to the limited virtual address space. We should be able to relax
this one the ABD patches are merged.
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#354
The umem_alloc_aligned() function should not assume that a 'void *'
type is 64-bit. It will not be on 32-bit platforms. Rather than
complicating the ASSERT to handle this it is simply removed.
Additionally, the '%lu' format specifier should not be assumed to
imply a 64-bit value. Fix this by using the 'llu' format specifier
which will always be atleast 64-bit and explicitly casing the
variable to an u_longlong_t. This issue is handled the same way
in many other parts of the code.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
3897 zfs filesystem and snapshot limits
Author: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Christopher Siden <christopher.siden@delphix.com>
References:
https://www.illumos.org/issues/3897https://github.com/illumos/illumos-gate/commit/a2afb61
Porting Notes:
dsl_dataset_snapshot_check(): reduce stack usage using kmem_alloc().
Ported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
As of this commit, volumes do not possess the version property in
any existing OpenZFS implementation. The zpool upgrade code, however,
uses zfs_prop_get_int() to fetch the version property of all children in
a pool. The semantics of the function, however, demand that it only be
used for known valid properties so it returns a garbage value for volumes.
This patch causes the version of a volume to appear to callers using
the zfs_prop_get_int() API to be that of the default ZPL version for
the implementation. In the future, should volumes gain the property,
its actual value will be used.
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Fixes#3313
Make the 'zpool import' command honor the overlay property to allow
filesystems to be mounted on a non-empty directory. As it stands now
this property is only checked by the 'zfs mount' command. Move the
check into 'zfs_mount()` in libzpool so the property is honored for all
callers.
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3227
When using 'zpool import' to scan for available pools prefer vdev names
which reference vdevs with more valid labels. There should be two labels
at the start of the device and two labels at the end of the device. If
labels are missing then the device has been damaged or is in some other
way incomplete. Preferring names with fully intact labels helps weed out
bad paths and improves the likelihood of being able to import the pool.
This behavior only applies when scanning /dev/ for valid pools. If a
cache file exists the pools described by the cache file will be used.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Closes#3145Closes#2844Closes#3107
I noticed when reviewing documentation that it is possible for user
space to use fctnl(fd, F_SETPIPE_SZ, (unsigned long) size) to change
the kernel pipe buffer size on Linux to increase the pipe size up to
the value specified in /proc/sys/fs/pipe-max-size. There are users using
mbuffer to improve zfs recv performance when piping over the network, so
it seems advantageous to integrate such functionality directly into the
zfs recv tool. This avoids the addition of two buffers and two copies
(one for the buffer mbuffer adds and another for the additional pipe),
so it should be more efficient.
This could have been made configurable and/or this could have changed
the value back to the original after we were done with the file
descriptor, but I do not see a strong case for doing either, so I
went with a simple implementation.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1161
The function zfs_nicenum that converts number to human-readable output
uses a index to a string of letters. This patch limits the index to
the length of the string.
Signed-off-by: Christer Ekholm <che@chrekh.se>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3122
In order to accelerate zfs receive operations in the face of many
property-containing snapshots, commit 0574855 changed the header nvlist
("fss") of a send stream to exclude snapshots which aren't part of the
stream. This, however, would cause zfs receive -F to erroneously remove
snapshots; it would remove any snapshot which wasn't listed in the header
nvlist.
This patch restores the full list of snapshots in fss[<id>[snaps]] but
still suppresses the properties of non-sent snapshots and also removes a
consistency check in which an error is raised if a listed snapshot does
not have any properties in fss[<id>[snapprops]].
The 0574855 commit also introduced a bug in which zfs send -p of a
complete stream (zfs send -p pool/fs@snap) would exclude the snapshot
properties in fss[<id>[snapprops]]. This patch detects the last snapshot
in a series when no "from" snapshot has been specified and includes its
properties.
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2907
If spl_hostid is set via module parameter, it's likely different from
gethostid(). Therefore, the userspace tool should read it first before
falling back to gethostid().
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#3034
When importing pools with zpool import -aN there is inconsistent
behavior between '-d /dev/disk/by-id' (or another path) and
'-c /etc/zfs/zpool.cache'.
The difference in behavior is caused by zpool_find_import_cached()
returning an empty nvlist_t when there are no pools to import but
zpool_find_import_impl() returns NULL for the same situation. The
behavior of zpool_find_import_cached() is arguably more correct
because it allows returning NULL to be used for an error case and
not an empty set.
This change resolves the issue by updating get_configs() such that
it returns an empty set instead of NULL when no config is found.
The updated behavior will now always return 0 for this case.
$ zpool import -aN; echo $?
no pools available to import
0
$ zpool import -aN -d /var/tmp/; echo $?
no pools available to import
0
$ zpool import -aN -c /etc/zfs/zpool.cache; echo $?
no pools available to import
0
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2080
... as opposed to sending properties of all snapshots of the relevant
filesystem. The previous behavior results in properties being set on
all snapshots on the receiving side, which is quite slow.
Behavior of zfs send -R is not changed.
References:
http://thread.gmane.org/gmane.comp.file-systems.openzfs.devel/346
Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2729
Issue #2210
FreeBSD PR kern/172259: Fixes zfs receive errors caused by snapshot
replication being processed in a random order instead of creation
order.
Eliminates needless filesystem renames caused by removed parent
snapshots which subsequently causes many more errors.
PR: kern/172259
Submitted by: Steven Hartland
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks
References:
https://github.com/freebsd/freebsd/commit/4995789
Porting notes:
Minor whitespace fixes were made to conform with style requirements:
lib/libzfs/libzfs_sendrecv.c: 2269: indent by spaces instead of tabs
lib/libzfs/libzfs_sendrecv.c: 2270: indent by spaces instead of tabs
Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2729
Creating virtual machines that have their rootfs on ZFS on hosts that
have their rootfs on ZFS causes SPA namespace collisions when the
standard name rpool is used. The solution is either to give each guest
pool a name unique to the host, which is not always desireable, or boot
a VM environment containing an ISO image to install it, which is
cumbersome.
26b42f3f9d introduced `zpool import -t
...` to simplify situations where a host must access a guest's pool when
there is a SPA namespace conflict. We build upon that to introduce
`zpool import -t tname ...`. That allows us to create a pool whose
in-core name is tname, but whose on-disk name is the normal name
specified.
This simplifies the creation of machine images that use a rootfs on ZFS.
That benefits not only real world deployments, but also ZFSOnLinux
development by decreasing the time needed to perform rootfs on ZFS
experiments.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2417
5116 zpool history -i goes into infinite loop
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Richard Elling <richard.elling@gmail.com>
Reviewed by: Boris Protopopov <boris.protopopov@me.com>
Approved by: Dan McDonald <danmcd@omniti.com>
References:
https://www.illumos.org/issues/5116https://github.com/illumos/illumos-gate/commit/3339867
Ported by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2715
The 'zpool list -v' command displays lots of info but excludes the
capacity of each disk. This should be added.
5147 zpool list -v should show individual disk capacity
Reviewed by: Adam Leventhal <adam.leventhal@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Matthew Ahrens <matthew.ahrens@delphix.com>
Reviewed by: Richard Elling <richard.elling@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
References:
https://www.illumos.org/issues/5147https://github.com/illumos/illumos-gate/commit/7a09f97
Ported by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2688
When com.delphix:spacemap_histogram is disabled, the value of
fragmentation was printing as 18446744073709551615 (UINT64_MAX),
when it should print as '-'.
The issue was caused by a small mistake during the merge of
"4980 metaslabs should have a fragmentation metric."
upstream: https://github.com/illumos/illumos-gate/commit/2e4c998
ZoL: https://github.com/zfsonlinux/zfs/commit/f3a7f66
The problem is in zpool_get_prop_literal, where the handling of the
pool property ZPOOL_PROP_FRAGMENTATION was added to wrong the
section. In particular, ZPOOL_PROP_FRAGMENTATION should not be in
the section where zpool_get_state(zhp) == POOL_STATE_UNAVAIL, but
lower down after it's already been determined that the pool is in
fact available, which is where upstream illumos correctly has had
it.
Thanks to lundman for helping to track down this bug.
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2664
Providing a pkg-config file makes is easy for 3rd party applications
to link against the libzfs libraries. It also allows the libzfs
developers to modify the list of required libraries and cflags
without breaking existing applications.
The following example illustrates how pkg-config can be used:
cc `pkg-config --cflags --libs libzfs` -o myapp myapp.c
/*
* myapp.c
*/
void main()
{
libzfs_handle_t *hdl;
hdl = libzfs_init();
if (hdl)
libzfs_fini(hdl);
}
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #585
4976 zfs should only avoid writing to a failing non-redundant top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature is enabled
4983 need to collect metaslab information via mdb
4984 device selection should use fragmentation metric
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <adam.leventhal@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
References:
https://www.illumos.org/issues/4976https://www.illumos.org/issues/4978https://www.illumos.org/issues/4979https://www.illumos.org/issues/4980https://www.illumos.org/issues/4981https://www.illumos.org/issues/4982https://www.illumos.org/issues/4983https://www.illumos.org/issues/4984https://github.com/illumos/illumos-gate/commit/2e4c998
Notes:
The "zdb -M" option has been re-tasked to display the new metaslab
fragmentation metric and the new "zdb -I" option is used to control
the maximum number of in-flight I/Os.
The new fragmentation metric is derived from the space map histogram
which has been rolled up to the vdev and pool level and is presented
to the user via "zpool list".
Add a number of module parameters related to the new metaslab weighting
logic.
Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2595
4914 zfs on-disk bookmark structure should be named *_phys_t
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Saso Kiselkov <skiselkov.ml@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
References:
https://www.illumos.org/issues/4914https://github.com/illumos/illumos-gate/commit/7802d7b
Porting notes:
There were a number of zfsonlinux-specific uses of zbookmark_t which
needed to be updated. This should reduce the likelihood of further
problems like issue #2094 from occurring.
Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2558
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
References:
https://www.illumos.org/issues/4390https://github.com/illumos/illumos-gate/commit/7fd05ac
Porting notes:
Previous stack-reduction efforts in traverse_visitb() caused a fair
number of un-mergable pieces of code. This patch should reduce its
stack footprint a bit more.
The new local bptree_entry_phys_t in bptree_add() is dynamically-allocated
using kmem_zalloc() for the purpose of stack reduction.
The new global zfs_free_leak_on_eio has been defined as an integer
rather than a boolean_t as was the case with the related zfs_recover
global. Also, zfs_free_leak_on_eio's definition has been inserted into
zfs_debug.c for consistency with the existing definition of zfs_recover.
Illumos placed it in spa_misc.c.
Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#2545