.\"
.\" CDDL HEADER START
.\"
.\" The contents of this file are subject to the terms of the
.\" Common Development and Distribution License (the "License").
.\" You may not use this file except in compliance with the License.
.\"
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
.\" or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions
.\" and limitations under the License.
.\"
.\" When distributing Covered Code, include this CDDL HEADER in each
.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
.\" If applicable, add the following below this CDDL HEADER, with the
.\" fields enclosed by brackets "[]" replaced with your own identifying
.\" information: Portions Copyright [yyyy] [name of copyright owner]
.\"
.\" CDDL HEADER END
.\"
.\"
.\" Copyright (c) 2007, Sun Microsystems, Inc. All Rights Reserved.
.\" Copyright (c) 2012, 2018 by Delphix. All rights reserved.
.\" Copyright (c) 2012 Cyril Plisko. All Rights Reserved.
.\" Copyright (c) 2017 Datto Inc.
.\" Copyright (c) 2018 George Melikov. All Rights Reserved.
.\" Copyright 2017 Nexenta Systems, Inc.
.\" Copyright (c) 2017 Open-E, Inc. All Rights Reserved.
.\"
.Dd August 9, 2019
.Dt ZPOOL 8
.Os Linux
.Sh NAME
.Nm zpool
.Nd configure ZFS storage pools
.Sh SYNOPSIS
.Nm
.Fl ?V
.Nm
.Cm version
.Nm
.Cm <subcommand>
.Op Ar <args>
.Sh DESCRIPTION
The
.Nm
command configures ZFS storage pools.
A storage pool is a collection of devices that provides physical storage and
data replication for ZFS datasets.
All datasets within a storage pool share the same space.
See
.Xr zfs 8
for information on managing datasets.
.Pp
For an overview of creating and managing ZFS storage pools see the
.Xr zpoolconcepts 8
manual page.
.Sh SUBCOMMANDS
All subcommands that modify state are logged persistently to the pool in their
original form.
.Pp
The
.Nm
command provides subcommands to create and destroy storage pools, add capacity
to storage pools, and provide information about the storage pools.
The following subcommands are supported:
.Bl -tag -width Ds
.It Xo
.Nm
.Fl ?
.Xc
Displays a help message.
.It Xo
.Nm
.Fl V, -version
.Xc
An alias for the
.Nm zpool Cm version
subcommand.
.It Xo
.Nm
.Cm version
.Xc
Displays the software version of the
.Nm
userland utility and the zfs kernel module.
.El
.Ss Creation
.Bl -tag -width Ds
.It Xr zpool-create 8
Creates a new storage pool containing the virtual devices specified on the
command line.
.It Xr zpool-initialize 8
Begins initializing by writing to all unallocated regions on the specified
devices, or all eligible devices in the pool if no individual devices are
specified.
.El
.Ss Destruction
.Bl -tag -width Ds
.It Xr zpool-destroy 8
Destroys the given pool, freeing up any devices for other use.
.It Xr zpool-labelclear 8
Removes ZFS label information from the specified
.Ar device .
.El
.Ss Virtual Devices
.Bl -tag -width Ds
.It Xo
.Xr zpool-attach 8 /
.Xr zpool-detach 8
.Xc
Increases or decreases redundancy by
.Cm attach Ns -ing or
.Cm detach Ns -ing a device on an existing vdev (virtual device).
.It Xo
.Xr zpool-add 8 /
.Xr zpool-remove 8
.Xc
Adds the specified virtual devices to the given pool,
or removes the specified device from the pool.
.It Xr zpool-replace 8
Replaces an existing device (which may be faulted) with a new one.
.It Xr zpool-split 8
Creates a new pool by splitting all mirrors in an existing pool (which decreases its redundancy).
.El
.Ss Properties
Available pool properties are listed in the
.Xr zpoolprops 8
manual page.
.Bl -tag -width Ds
.It Xr zpool-list 8
Lists the given pools along with a health status and space usage.
.It Xo
.Xr zpool-get 8 /
.Xr zpool-set 8
.Xc
Retrieves the given list of properties
.Po
or all properties if
.Sy all
is used
.Pc
for the specified storage pool(s), or sets the given property on the specified
pool.
.El
.Ss Monitoring
.Bl -tag -width Ds
.It Xr zpool-status 8
Displays the detailed health status for the given pools.
.It Xr zpool-iostat 8
Displays logical I/O statistics for the given pools/vdevs. Physical I/Os may
be observed via
.Xr iostat 1 .
.It Xr zpool-events 8
Lists all recent events generated by the ZFS kernel modules. These events
are consumed by the
.Xr zed 8
and used to automate administrative tasks such as replacing a failed device
with a hot spare. For more information about the subclasses and event payloads
that can be generated see the
.Xr zfs-events 5
man page.
.It Xr zpool-history 8
Displays the command history of the specified pool(s) or all pools if no pool is
specified.
.El
.Ss Maintenance
.Bl -tag -width Ds
.It Xr zpool-scrub 8
Begins a scrub or resumes a paused scrub.
.It Xr zpool-checkpoint 8
Checkpoints the current state of
.Ar pool ,
which can be later restored by
.Nm zpool Cm import --rewind-to-checkpoint .
.It Xr zpool-trim 8
Initiates an immediate on-demand TRIM operation for all of the free space in
a pool. This operation informs the underlying storage devices of all blocks
in the pool which are no longer allocated and allows thinly provisioned
devices to reclaim the space.
.It Xr zpool-sync 8
This command forces all in-core dirty data to be written to the primary
pool storage and not the ZIL. It will also update administrative
information including quota reporting. Without arguments,
.Sy zpool sync
will sync all pools on the system. Otherwise, it will sync only the
specified pool(s).
.It Xr zpool-upgrade 8
Manages the on-disk format version of storage pools.
.It Xr zpool-wait 8
Waits until all background activity of the given types has ceased in the given
pool.
.El
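.Pp
As a minimal sketch of how the checkpoint subcommand fits together with
.Nm zpool Cm import --rewind-to-checkpoint ,
the following sequence takes a checkpoint, exports the pool, and then rewinds
to the checkpoint on import.
The pool name
.Em tank
is an assumption used only for illustration.
Rewinding discards every change made after the checkpoint, so consult
.Xr zpool-checkpoint 8
and
.Xr zpool-import 8
before relying on it.
.Bd -literal
# zpool checkpoint tank
# zpool export tank
# zpool import --rewind-to-checkpoint tank
.Ed
.Pp
If the checkpoint is no longer needed, it can instead be discarded:
.Bd -literal
# zpool checkpoint -d tank
.Ed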
.Ss Fault Resolution
.Bl -tag -width Ds
.It Xo
.Xr zpool-offline 8 /
.Xr zpool-online 8
.Xc
Takes the specified physical device offline or brings it online.
.It Xr zpool-resilver 8
Starts a resilver. If an existing resilver is already running it will be
restarted from the beginning.
.It Xr zpool-reopen 8
Reopens all the vdevs associated with the pool.
.It Xr zpool-clear 8
Clears device errors in a pool.
.El
.Ss Import & Export
.Bl -tag -width Ds
.It Xr zpool-import 8
Makes disks containing ZFS storage pools available for use on the system.
.It Xr zpool-export 8
Exports the given pools from the system.
.It Xr zpool-reguid 8
Generates a new unique identifier for the pool.
.El
.Sh EXIT STATUS
The following exit values are returned:
.Bl -tag -width Ds
.It Sy 0
Successful completion.
.It Sy 1
An error occurred.
.It Sy 2
Invalid command line options were specified.
.El
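.Pp
As a minimal sketch, a shell script can branch on these values.
The pool name
.Em tank
and the messages printed are assumptions used only for illustration.
.Bd -literal
zpool status tank > /dev/null 2>&1
case $? in
    0) echo "zpool succeeded" ;;
    1) echo "zpool reported an error" ;;
    2) echo "invalid command line options" ;;
esac
.Ed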
.Sh EXAMPLES
.Bl -tag -width Ds
.It Sy Example 1 No Creating a RAID-Z Storage Pool
The following command creates a pool with a single raidz root vdev that
consists of six disks.
.Bd -literal
# zpool create tank raidz sda sdb sdc sdd sde sdf
.Ed
.It Sy Example 2 No Creating a Mirrored Storage Pool
The following command creates a pool with two mirrors, where each mirror
contains two disks.
.Bd -literal
# zpool create tank mirror sda sdb mirror sdc sdd
.Ed
.It Sy Example 3 No Creating a ZFS Storage Pool by Using Partitions
The following command creates an unmirrored pool using two disk partitions.
.Bd -literal
# zpool create tank sda1 sdb2
.Ed
.It Sy Example 4 No Creating a ZFS Storage Pool by Using Files
The following command creates an unmirrored pool using files.
While not recommended, a pool based on files can be useful for experimental
purposes.
.Bd -literal
# zpool create tank /path/to/file/a /path/to/file/b
.Ed
.It Sy Example 5 No Adding a Mirror to a ZFS Storage Pool
The following command adds two mirrored disks to the pool
.Em tank ,
assuming the pool is already made up of two-way mirrors.
The additional space is immediately available to any datasets within the pool.
.Bd -literal
# zpool add tank mirror sda sdb
.Ed
.It Sy Example 6 No Listing Available ZFS Storage Pools
The following command lists all available pools on the system.
In this case, the pool
.Em zion
is faulted due to a missing device.
The results from this command are similar to the following:
.Bd -literal
# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  19.9G  8.43G  11.4G         -    33%    42%  1.00x  ONLINE  -
tank   61.5G  20.0G  41.5G         -    48%    32%  1.00x  ONLINE  -
zion       -      -      -         -      -      -      -  FAULTED -
.Ed
.It Sy Example 7 No Destroying a ZFS Storage Pool
The following command destroys the pool
.Em tank
and any datasets contained within.
.Bd -literal
# zpool destroy -f tank
.Ed
.It Sy Example 8 No Exporting a ZFS Storage Pool
The following command exports the devices in pool
.Em tank
so that they can be relocated or later imported.
.Bd -literal
# zpool export tank
.Ed
.It Sy Example 9 No Importing a ZFS Storage Pool
The following command displays available pools, and then imports the pool
.Em tank
for use on the system.
The results from this command are similar to the following:
.Bd -literal
# zpool import
  pool: tank
    id: 15451357997522795478
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        tank        ONLINE
          mirror    ONLINE
            sda     ONLINE
            sdb     ONLINE

# zpool import tank
.Ed
.It Sy Example 10 No Upgrading All ZFS Storage Pools to the Current Version
The following command upgrades all ZFS Storage pools to the current version of
the software.
.Bd -literal
# zpool upgrade -a
This system is currently running ZFS version 2.
.Ed
.It Sy Example 11 No Managing Hot Spares
The following command creates a new pool with an available hot spare:
.Bd -literal
# zpool create tank mirror sda sdb spare sdc
.Ed
.Pp
If one of the disks were to fail, the pool would be reduced to the degraded
state.
The failed device can be replaced using the following command:
.Bd -literal
# zpool replace tank sda sdd
.Ed
.Pp
Once the data has been resilvered, the spare is automatically removed and is
made available for use should another device fail.
The hot spare can be permanently removed from the pool using the following
command:
.Bd -literal
# zpool remove tank sdc
.Ed
.It Sy Example 12 No Creating a ZFS Pool with Mirrored Separate Intent Logs
The following command creates a ZFS storage pool consisting of two, two-way
mirrors and mirrored log devices:
.Bd -literal
# zpool create pool mirror sda sdb mirror sdc sdd log mirror \\
sde sdf
.Ed
.It Sy Example 13 No Adding Cache Devices to a ZFS Pool
The following command adds two disks for use as cache devices to a ZFS storage
pool:
.Bd -literal
# zpool add pool cache sdc sdd
.Ed
.Pp
Once added, the cache devices gradually fill with content from main memory.
Depending on the size of your cache devices, it could take over an hour for
them to fill.
Capacity and reads can be monitored using the
.Cm iostat
subcommand as follows:
.Bd -literal
# zpool iostat -v pool 5
.Ed
.It Sy Example 14 No Removing a Mirrored Top-Level (Log or Data) Device
The following commands remove the mirrored log device
.Sy mirror-2
and mirrored top-level data device
.Sy mirror-1 .
.Pp
Given this configuration:
.Bd -literal
  pool: tank
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
        logs
          mirror-2  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
.Ed
.Pp
The command to remove the mirrored log
.Sy mirror-2
is:
.Bd -literal
# zpool remove tank mirror-2
.Ed
.Pp
The command to remove the mirrored data
.Sy mirror-1
is:
.Bd -literal
# zpool remove tank mirror-1
.Ed
.It Sy Example 15 No Displaying Expanded Space on a Device
The following command displays the detailed information for the pool
.Em data .
This pool consists of a single raidz vdev where one of its devices
increased its capacity by 10GB.
In this example, the pool will not be able to utilize this extra capacity until
all the devices under the raidz vdev have been expanded.
.Bd -literal
# zpool list -v data
NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data        23.9G  14.6G  9.30G         -    48%    61%  1.00x  ONLINE  -
  raidz1    23.9G  14.6G  9.30G         -    48%
    sda         -      -      -         -      -
    sdb         -      -      -       10G      -
    sdc         -      -      -         -      -
.Ed
.It Sy Example 16 No Adding Output Columns
Additional columns can be added to the
.Nm zpool Cm status
and
.Nm zpool Cm iostat
output with the
.Fl c
option.
.Bd -literal
# zpool status -c vendor,model,size
NAME          STATE   READ WRITE CKSUM  vendor   model         size
tank          ONLINE     0     0     0
  mirror-0    ONLINE     0     0     0
    U1        ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
    U10       ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
    U11       ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
    U12       ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
    U13       ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
    U14       ONLINE     0     0     0  SEAGATE  ST8000NM0075  7.3T
# zpool iostat -vc slaves
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write  slaves
----------  -----  -----  -----  -----  -----  -----  ---------
tank        20.4G  7.23T     26    152  20.7M  21.6M
  mirror    20.4G  7.23T     26    152  20.7M  21.6M
    U1          -      -      0     31  1.46K  20.6M  sdb sdff
    U10         -      -      0      1  3.77K  13.3K  sdas sdgw
    U11         -      -      0      1   288K  13.3K  sdat sdgx
    U12         -      -      0      1  78.4K  13.3K  sdau sdgy
    U13         -      -      0      1   128K  13.3K  sdav sdgz
    U14         -      -      0      1  63.2K  13.3K  sdfk sdg
.Ed
.El
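.Pp
As a supplement to Example 6, the following is a minimal sketch of checking
pool health from a script.
It assumes the
.Fl H
(scripted mode) and
.Fl o
options of
.Nm zpool Cm list
and relies on pool names containing no whitespace.
With the pools shown in Example 6, the output would be similar to the
following:
.Bd -literal
# zpool list -H -o name,health | awk '$2 != "ONLINE" { print $1 }'
zion
.Ed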
.Sh ENVIRONMENT VARIABLES
.Bl -tag -width "ZFS_ABORT"
.It Ev ZFS_ABORT
Cause
.Nm zpool
to dump core on exit for the purposes of running
.Sy ::findleaks .
.El
.Bl -tag -width "ZPOOL_IMPORT_PATH"
.It Ev ZPOOL_IMPORT_PATH
The search path for devices or files to use with the pool.
This is a colon-separated list of directories in which
.Nm zpool
looks for device nodes and files.
Similar to the
.Fl d
option in
.Nm zpool import .
.El
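.Pp
As a minimal sketch, assuming a Bourne-style shell, the variable can be set
for a single invocation.
The two directories are illustrative and may not exist on every system:
.Bd -literal
# ZPOOL_IMPORT_PATH=/dev/disk/by-id:/dev/disk/by-path zpool import
.Ed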
.Bl -tag -width "ZPOOL_IMPORT_UDEV_TIMEOUT_MS"
.It Ev ZPOOL_IMPORT_UDEV_TIMEOUT_MS
The maximum time in milliseconds that
.Nm zpool import
will wait for an expected device to be available.
.El
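.Pp
For instance, the following sketch limits the wait to five seconds.
The value
.Sy 5000
and the pool name
.Em tank
are illustrative only:
.Bd -literal
# ZPOOL_IMPORT_UDEV_TIMEOUT_MS=5000 zpool import tank
.Ed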
.Bl -tag -width "ZPOOL_VDEV_NAME_GUID"
.It Ev ZPOOL_VDEV_NAME_GUID
Cause
.Nm zpool
subcommands to output vdev GUIDs by default.
This behavior is identical to the
.Nm zpool status -g
command line option.
.El
.Bl -tag -width "ZPOOL_VDEV_NAME_FOLLOW_LINKS"
.It Ev ZPOOL_VDEV_NAME_FOLLOW_LINKS
Cause
.Nm zpool
subcommands to follow links for vdev names by default.
This behavior is identical to the
.Nm zpool status -L
command line option.
.El
.Bl -tag -width "ZPOOL_VDEV_NAME_PATH"
.It Ev ZPOOL_VDEV_NAME_PATH
Cause
.Nm zpool
subcommands to output full vdev path names by default.
This behavior is identical to the
.Nm zpool status -p
command line option.
.El
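.Pp
The
.Ev ZPOOL_VDEV_NAME_GUID ,
.Ev ZPOOL_VDEV_NAME_FOLLOW_LINKS ,
and
.Ev ZPOOL_VDEV_NAME_PATH
variables can be set for a single invocation, as in the following sketch.
The value
.Sy 1
is an assumption, since the descriptions above only state that the variable is
set, and the pool name
.Em tank
is illustrative:
.Bd -literal
# ZPOOL_VDEV_NAME_PATH=1 zpool status tank
.Ed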
.Bl -tag -width "ZFS_VDEV_DEVID_OPT_OUT"
.It Ev ZFS_VDEV_DEVID_OPT_OUT
Older ZFS on Linux implementations had issues when attempting to display pool
config VDEV names if a
.Sy devid
NVP value is present in the pool's config.
.Pp
For example, a pool that originated on an illumos platform would have a devid
value in the config and
.Nm zpool status
would fail when listing the config.
This would also be true for future Linux-based pools.
.Pp
A pool can be stripped of any
.Sy devid
values on import, or prevented from adding them on
.Nm zpool create
or
.Nm zpool add
by setting
.Sy ZFS_VDEV_DEVID_OPT_OUT .
.El
.Bl -tag -width "ZPOOL_SCRIPTS_AS_ROOT"
.It Ev ZPOOL_SCRIPTS_AS_ROOT
Allow a privileged user to run
.Nm zpool status/iostat
with the
.Fl c
option.
Normally, only unprivileged users are allowed to run
.Fl c .
.El
.Bl -tag -width "ZPOOL_SCRIPTS_PATH"
.It Ev ZPOOL_SCRIPTS_PATH
The search path for scripts when running
.Nm zpool status/iostat
with the
.Fl c
option.
This is a colon-separated list of directories and overrides the default
.Pa ~/.zpool.d
and
.Pa /etc/zfs/zpool.d
search paths.
.El
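.Pp
A minimal sketch of overriding the search path for one invocation.
The directory
.Pa /usr/local/etc/zpool.d
is hypothetical and is assumed to contain scripts named
.Sy vendor ,
.Sy model ,
and
.Sy size :
.Bd -literal
# ZPOOL_SCRIPTS_PATH=/usr/local/etc/zpool.d zpool status -c vendor,model,size
.Ed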
.Bl -tag -width "ZPOOL_SCRIPTS_ENABLED"
.It Ev ZPOOL_SCRIPTS_ENABLED
Allow a user to run
.Nm zpool status/iostat
with the
.Fl c
option.
If
.Sy ZPOOL_SCRIPTS_ENABLED
is not set, it is assumed that the user is allowed to run
.Nm zpool status/iostat -c .
.El
.Sh INTERFACE STABILITY
.Sy Evolving
.Sh SEE ALSO
.Xr zpoolconcepts 8 ,
.Xr zpoolprops 8 ,
.Xr zfs-events 5 ,
.Xr zfs-module-parameters 5 ,
.Xr zpool-features 5 ,
.Xr zed 8 ,
.Xr zfs 8