2012-12-13 23:24:15 +00:00
|
|
|
'\" te
|
Illumos #4101, #4102, #4103, #4105, #4106
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <seb@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Prior to this patch, space_maps were preferred solely based on the
amount of free space left in each. Unfortunately, this heuristic didn't
contain any information about the make-up of that free space, which
meant we could keep preferring and loading a highly fragmented space map
that wouldn't actually have enough contiguous space to satisfy the
allocation; then unloading that space_map and repeating the process.
This change modifies the space_map's to store additional information
about the contiguous space in the space_map, so that we can use this
information to make a better decision about which space_map to load.
This requires reallocating all space_map objects to increase their
bonus buffer size sizes enough to fit the new metadata.
The above feature can be enabled via a new feature flag introduced by
this change: com.delphix:spacemap_histogram
In addition to the above, this patch allows the space_map block size to
be increase. Currently the block size is set to be 4K in size, which has
certain implications including the following:
* 4K sector devices will not see any compression benefit
* large space_maps require more metadata on-disk
* large space_maps require more time to load (typically random reads)
Now the space_map block size can adjust as needed up to the maximum size
set via the space_map_max_blksz variable.
A bug was fixed which resulted in potentially leaking an object when
removing a mirrored log device. The previous logic for vdev_remove() did
not deal with removing top-level vdevs that are interior vdevs (i.e.
mirror) correctly. The problem would occur when removing a mirrored log
device, and result in the DTL space map object being leaked; because
top-level vdevs don't have DTL space map objects associated with them.
References:
https://www.illumos.org/issues/4101
https://www.illumos.org/issues/4102
https://www.illumos.org/issues/4103
https://www.illumos.org/issues/4105
https://www.illumos.org/issues/4106
https://github.com/illumos/illumos-gate/commit/0713e23
Porting notes:
A handful of kmem_alloc() calls were converted to kmem_zalloc(). Also,
the KM_PUSHPAGE and TQ_PUSHPAGE flags were used as necessary.
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2488
2013-10-01 21:25:53 +00:00
|
|
|
.\" Copyright (c) 2013 by Delphix. All rights reserved.
|
2013-01-23 09:54:30 +00:00
|
|
|
.\" Copyright (c) 2013 by Saso Kiselkov. All rights reserved.
|
2012-12-13 23:24:15 +00:00
|
|
|
.\" The contents of this file are subject to the terms of the Common Development
|
|
|
|
.\" and Distribution License (the "License"). You may not use this file except
|
|
|
|
.\" in compliance with the License. You can obtain a copy of the license at
|
|
|
|
.\" usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing.
|
|
|
|
.\"
|
|
|
|
.\" See the License for the specific language governing permissions and
|
|
|
|
.\" limitations under the License. When distributing Covered Code, include this
|
|
|
|
.\" CDDL HEADER in each file and include the License file at
|
|
|
|
.\" usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this
|
|
|
|
.\" CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your
|
|
|
|
.\" own identifying information:
|
|
|
|
.\" Portions Copyright [yyyy] [name of copyright owner]
|
Illumos #4101, #4102, #4103, #4105, #4106
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <seb@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Prior to this patch, space_maps were preferred solely based on the
amount of free space left in each. Unfortunately, this heuristic didn't
contain any information about the make-up of that free space, which
meant we could keep preferring and loading a highly fragmented space map
that wouldn't actually have enough contiguous space to satisfy the
allocation; then unloading that space_map and repeating the process.
This change modifies the space_map's to store additional information
about the contiguous space in the space_map, so that we can use this
information to make a better decision about which space_map to load.
This requires reallocating all space_map objects to increase their
bonus buffer size sizes enough to fit the new metadata.
The above feature can be enabled via a new feature flag introduced by
this change: com.delphix:spacemap_histogram
In addition to the above, this patch allows the space_map block size to
be increase. Currently the block size is set to be 4K in size, which has
certain implications including the following:
* 4K sector devices will not see any compression benefit
* large space_maps require more metadata on-disk
* large space_maps require more time to load (typically random reads)
Now the space_map block size can adjust as needed up to the maximum size
set via the space_map_max_blksz variable.
A bug was fixed which resulted in potentially leaking an object when
removing a mirrored log device. The previous logic for vdev_remove() did
not deal with removing top-level vdevs that are interior vdevs (i.e.
mirror) correctly. The problem would occur when removing a mirrored log
device, and result in the DTL space map object being leaked; because
top-level vdevs don't have DTL space map objects associated with them.
References:
https://www.illumos.org/issues/4101
https://www.illumos.org/issues/4102
https://www.illumos.org/issues/4103
https://www.illumos.org/issues/4105
https://www.illumos.org/issues/4106
https://github.com/illumos/illumos-gate/commit/0713e23
Porting notes:
A handful of kmem_alloc() calls were converted to kmem_zalloc(). Also,
the KM_PUSHPAGE and TQ_PUSHPAGE flags were used as necessary.
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2488
2013-10-01 21:25:53 +00:00
|
|
|
.TH ZPOOL-FEATURES 5 "Aug 27, 2013"
|
2012-12-13 23:24:15 +00:00
|
|
|
.SH NAME
|
|
|
|
zpool\-features \- ZFS pool feature descriptions
|
|
|
|
.SH DESCRIPTION
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
ZFS pool on\-disk format versions are specified via "features" which replace
|
|
|
|
the old on\-disk format numbers (the last supported on\-disk format number is
|
2012-12-14 23:00:45 +00:00
|
|
|
28). To enable a feature on a pool use the \fBupgrade\fR subcommand of the
|
2013-02-04 20:35:25 +00:00
|
|
|
\fBzpool\fR(8) command, or set the \fBfeature@\fR\fIfeature_name\fR property
|
2012-12-14 23:00:45 +00:00
|
|
|
to \fBenabled\fR.
|
2012-12-13 23:24:15 +00:00
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
The pool format does not affect file system version compatibility or the ability
|
|
|
|
to send file systems between pools.
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Since most features can be enabled independently of each other the on\-disk
|
|
|
|
format of the pool is specified by the set of all features marked as
|
|
|
|
\fBactive\fR on the pool. If the pool was created by another software version
|
|
|
|
this set may include unsupported features.
|
|
|
|
.SS "Identifying features"
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Every feature has a guid of the form \fIcom.example:feature_name\fR. The reverse
|
|
|
|
DNS name ensures that the feature's guid is unique across all ZFS
|
|
|
|
implementations. When unsupported features are encountered on a pool they will
|
|
|
|
be identified by their guids. Refer to the documentation for the ZFS
|
|
|
|
implementation that created the pool for information about those features.
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Each supported feature also has a short name. By convention a feature's short
|
|
|
|
name is the portion of its guid which follows the ':' (e.g.
|
|
|
|
\fIcom.example:feature_name\fR would have the short name \fIfeature_name\fR),
|
|
|
|
however a feature's short name may differ across ZFS implementations if
|
|
|
|
following the convention would result in name conflicts.
|
|
|
|
.SS "Feature states"
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Features can be in one of three states:
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBactive\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 12n
|
|
|
|
This feature's on\-disk format changes are in effect on the pool. Support for
|
|
|
|
this feature is required to import the pool in read\-write mode. If this
|
|
|
|
feature is not read-only compatible, support is also required to import the pool
|
|
|
|
in read\-only mode (see "Read\-only compatibility").
|
|
|
|
.RE
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBenabled\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 12n
|
|
|
|
An administrator has marked this feature as enabled on the pool, but the
|
|
|
|
feature's on\-disk format changes have not been made yet. The pool can still be
|
|
|
|
imported by software that does not support this feature, but changes may be made
|
|
|
|
to the on\-disk format at any time which will move the feature to the
|
|
|
|
\fBactive\fR state. Some features may support returning to the \fBenabled\fR
|
|
|
|
state after becoming \fBactive\fR. See feature\-specific documentation for
|
|
|
|
details.
|
|
|
|
.RE
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fBdisabled\fR
|
|
|
|
.ad
|
|
|
|
.RS 12n
|
|
|
|
This feature's on\-disk format changes have not been made and will not be made
|
|
|
|
unless an administrator moves the feature to the \fBenabled\fR state. Features
|
|
|
|
cannot be disabled once they have been enabled.
|
|
|
|
.RE
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
The state of supported features is exposed through pool properties of the form
|
|
|
|
\fIfeature@short_name\fR.
|
|
|
|
.SS "Read\-only compatibility"
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Some features may make on\-disk format changes that do not interfere with other
|
|
|
|
software's ability to read from the pool. These features are referred to as
|
|
|
|
"read\-only compatible". If all unsupported features on a pool are read\-only
|
|
|
|
compatible, the pool can be imported in read\-only mode by setting the
|
2013-02-04 20:35:25 +00:00
|
|
|
\fBreadonly\fR property during import (see \fBzpool\fR(8) for details on
|
2012-12-13 23:24:15 +00:00
|
|
|
importing pools).
|
|
|
|
.SS "Unsupported features"
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
For each unsupported feature enabled on an imported pool a pool property
|
|
|
|
named \fIunsupported@feature_guid\fR will indicate why the import was allowed
|
|
|
|
despite the unsupported feature. Possible values for this property are:
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBinactive\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 12n
|
|
|
|
The feature is in the \fBenabled\fR state and therefore the pool's on\-disk
|
|
|
|
format is still compatible with software that does not support this feature.
|
|
|
|
.RE
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBreadonly\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 12n
|
|
|
|
The feature is read\-only compatible and the pool has been imported in
|
|
|
|
read\-only mode.
|
|
|
|
.RE
|
|
|
|
|
|
|
|
.SS "Feature dependencies"
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
Some features depend on other features being enabled in order to function
|
|
|
|
properly. Enabling a feature will automatically enable any features it
|
|
|
|
depends on.
|
|
|
|
.SH FEATURES
|
|
|
|
.sp
|
|
|
|
.LP
|
|
|
|
The following features are supported on this system:
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBasync_destroy\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 4n
|
|
|
|
.TS
|
|
|
|
l l .
|
|
|
|
GUID com.delphix:async_destroy
|
|
|
|
READ\-ONLY COMPATIBLE yes
|
|
|
|
DEPENDENCIES none
|
|
|
|
.TE
|
|
|
|
|
|
|
|
Destroying a file system requires traversing all of its data in order to
|
|
|
|
return its used space to the pool. Without \fBasync_destroy\fR the file system
|
|
|
|
is not fully removed until all space has been reclaimed. If the destroy
|
|
|
|
operation is interrupted by a reboot or power outage the next attempt to open
|
|
|
|
the pool will need to complete the destroy operation synchronously.
|
|
|
|
|
|
|
|
When \fBasync_destroy\fR is enabled the file system's data will be reclaimed
|
|
|
|
by a background process, allowing the destroy operation to complete without
|
|
|
|
traversing the entire file system. The background process is able to resume
|
|
|
|
interrupted destroys after the pool has been opened, eliminating the need
|
|
|
|
to finish interrupted destroys as part of the open operation. The amount
|
|
|
|
of space remaining to be reclaimed by the background process is available
|
|
|
|
through the \fBfreeing\fR property.
|
|
|
|
|
|
|
|
This feature is only \fBactive\fR while \fBfreeing\fR is non\-zero.
|
|
|
|
.RE
|
2012-12-23 23:57:14 +00:00
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBempty_bpobj\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 4n
|
|
|
|
.TS
|
|
|
|
l l .
|
|
|
|
GUID com.delphix:empty_bpobj
|
|
|
|
READ\-ONLY COMPATIBLE yes
|
|
|
|
DEPENDENCIES none
|
|
|
|
.TE
|
|
|
|
|
|
|
|
This feature increases the performance of creating and using a large
|
|
|
|
number of snapshots of a single filesystem or volume, and also reduces
|
|
|
|
the disk space required.
|
|
|
|
|
|
|
|
When there are many snapshots, each snapshot uses many Block Pointer
|
|
|
|
Objects (bpobj's) to track blocks associated with that snapshot.
|
|
|
|
However, in common use cases, most of these bpobj's are empty. This
|
|
|
|
feature allows us to create each bpobj on-demand, thus eliminating the
|
|
|
|
empty bpobjs.
|
|
|
|
|
|
|
|
This feature is \fBactive\fR while there are any filesystems, volumes,
|
|
|
|
or snapshots which were created after enabling this feature.
|
|
|
|
.RE
|
|
|
|
|
2013-01-23 09:54:30 +00:00
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBlz4_compress\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 4n
|
|
|
|
.TS
|
|
|
|
l l .
|
|
|
|
GUID org.illumos:lz4_compress
|
|
|
|
READ\-ONLY COMPATIBLE no
|
|
|
|
DEPENDENCIES none
|
|
|
|
.TE
|
|
|
|
|
|
|
|
\fBlz4\fR is a high-performance real-time compression algorithm that
|
|
|
|
features significantly faster compression and decompression as well as a
|
|
|
|
higher compression ratio than the older \fBlzjb\fR compression.
|
|
|
|
Typically, \fBlz4\fR compression is approximately 50% faster on
|
|
|
|
compressible data and 200% faster on incompressible data than
|
|
|
|
\fBlzjb\fR. It is also approximately 80% faster on decompression, while
|
|
|
|
giving approximately 10% better compression ratio.
|
|
|
|
|
|
|
|
When the \fBlz4_compress\fR feature is set to \fBenabled\fR, the
|
|
|
|
administrator can turn on \fBlz4\fR compression on any dataset on the
|
2013-02-04 20:35:25 +00:00
|
|
|
pool using the \fBzfs\fR(8) command. Please note that doing so will
|
2013-01-23 09:54:30 +00:00
|
|
|
immediately activate the \fBlz4_compress\fR feature on the underlying
|
|
|
|
pool (even before any data is written). Since this feature is not
|
|
|
|
read-only compatible, this operation will render the pool unimportable
|
|
|
|
on systems without support for the \fBlz4_compress\fR feature. At the
|
|
|
|
moment, this operation cannot be reversed. Booting off of
|
|
|
|
\fBlz4\fR-compressed root pools is supported.
|
Illumos #4101, #4102, #4103, #4105, #4106
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <seb@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Prior to this patch, space_maps were preferred solely based on the
amount of free space left in each. Unfortunately, this heuristic didn't
contain any information about the make-up of that free space, which
meant we could keep preferring and loading a highly fragmented space map
that wouldn't actually have enough contiguous space to satisfy the
allocation; then unloading that space_map and repeating the process.
This change modifies the space_map's to store additional information
about the contiguous space in the space_map, so that we can use this
information to make a better decision about which space_map to load.
This requires reallocating all space_map objects to increase their
bonus buffer size sizes enough to fit the new metadata.
The above feature can be enabled via a new feature flag introduced by
this change: com.delphix:spacemap_histogram
In addition to the above, this patch allows the space_map block size to
be increase. Currently the block size is set to be 4K in size, which has
certain implications including the following:
* 4K sector devices will not see any compression benefit
* large space_maps require more metadata on-disk
* large space_maps require more time to load (typically random reads)
Now the space_map block size can adjust as needed up to the maximum size
set via the space_map_max_blksz variable.
A bug was fixed which resulted in potentially leaking an object when
removing a mirrored log device. The previous logic for vdev_remove() did
not deal with removing top-level vdevs that are interior vdevs (i.e.
mirror) correctly. The problem would occur when removing a mirrored log
device, and result in the DTL space map object being leaked; because
top-level vdevs don't have DTL space map objects associated with them.
References:
https://www.illumos.org/issues/4101
https://www.illumos.org/issues/4102
https://www.illumos.org/issues/4103
https://www.illumos.org/issues/4105
https://www.illumos.org/issues/4106
https://github.com/illumos/illumos-gate/commit/0713e23
Porting notes:
A handful of kmem_alloc() calls were converted to kmem_zalloc(). Also,
the KM_PUSHPAGE and TQ_PUSHPAGE flags were used as necessary.
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2488
2013-10-01 21:25:53 +00:00
|
|
|
.RE
|
|
|
|
|
|
|
|
.sp
|
|
|
|
.ne 2
|
|
|
|
.na
|
|
|
|
\fB\fBspacemap_histogram\fR\fR
|
|
|
|
.ad
|
|
|
|
.RS 4n
|
|
|
|
.TS
|
|
|
|
l l .
|
|
|
|
GUID com.delphix:spacemap_histogram
|
|
|
|
READ\-ONLY COMPATIBLE yes
|
|
|
|
DEPENDENCIES none
|
|
|
|
.TE
|
|
|
|
|
|
|
|
This features allows ZFS to maintain more information about how free space
|
|
|
|
is organized within the pool. If this feature is \fBenabled\fR, ZFS will
|
|
|
|
set this feature to \fBactive\fR when a new space map object is created or
|
|
|
|
an existing space map is upgraded to the new format. Once the feature is
|
|
|
|
\fBactive\fR, it will remain in that state until the pool is destroyed.
|
2013-01-23 09:54:30 +00:00
|
|
|
|
|
|
|
.RE
|
|
|
|
|
2012-12-13 23:24:15 +00:00
|
|
|
.SH "SEE ALSO"
|
2013-02-04 20:35:25 +00:00
|
|
|
\fBzpool\fR(8)
|