From 075ad350f8a68c13db36f4a8769517dbec817e6d Mon Sep 17 00:00:00 2001 From: George Melikov Date: Thu, 21 May 2020 21:11:38 +0300 Subject: [PATCH] Redirect all pages to new documentation resource Signed-off-by: George Melikov --- Admin-Documentation.md | 9 +- Async-Write.md | 34 +- Buildbot-Options.md | 179 +- Building-ZFS.md | 151 +- Checksums.md | 59 +- Custom-Packages.md | 129 +- Debian-Buster-Encrypted-Root-on-ZFS.md | 34 +- Debian-Buster-Root-on-ZFS.md | 744 +-- Debian-GNU-Linux-initrd-documentation.md | 71 +- Debian-Stretch-Root-on-ZFS.md | 703 +- Debian.md | 42 +- Debugging.md | 1 - Developer-Resources.md | 17 +- FAQ.md | 418 +- Fedora.md | 45 +- Getting-Started.md | 17 +- Git-and-GitHub-for-beginners.md | 147 +- ...U-Linux-to-a-Native-ZFS-Root-Filesystem.md | 4 +- Home.md | 7 +- License.md | 6 +- Mailing-Lists.md | 17 +- OpenZFS-Patches.md | 200 +- OpenZFS-Tracking.md | 4 +- OpenZFS-exceptions.md | 18 +- Project-and-Community.md | 17 +- RHEL-and-CentOS.md | 109 +- Signing-Keys.md | 57 +- Troubleshooting.md | 67 +- Ubuntu-16.04-Root-on-ZFS.md | 605 +- Ubuntu-18.04-Root-on-ZFS.md | 735 +- Ubuntu.md | 10 +- Workflow-Accept-PR.md | 8 +- Workflow-Close-PR.md | 2 +- Workflow-Commit-Often.md | 14 +- Workflow-Commit.md | 33 +- Workflow-Conflicts.md | 2 +- Workflow-Create-Branch.md | 23 +- Workflow-Create-Github-Account.md | 15 +- Workflow-Create-Test.md | 2 +- Workflow-Delete-Branch.md | 6 +- Workflow-Generate-PR.md | 2 +- Workflow-Get-Source.md | 32 +- Workflow-Install-Git.md | 38 +- Workflow-Large-Features.md | 2 +- Workflow-Merge-PR.md | 6 +- Workflow-Rebase.md | 19 +- Workflow-Squash.md | 2 +- Workflow-Test.md | 60 +- Workflow-Update-PR.md | 2 +- Workflow-Zloop-Debugging.md | 2 +- ZFS-Transaction-Delay.md | 99 +- ZFS-on-Linux-Module-Parameters.md | 5902 +---------------- ZIO-Scheduler.md | 75 +- _Footer.md | 2 +- _Sidebar.md | 50 - dRAID-HOWTO.md | 290 +- hole_birth-FAQ.md | 25 +- 57 files changed, 101 insertions(+), 11268 deletions(-) delete mode 100644 Debugging.md delete mode 100644 _Sidebar.md diff --git a/Admin-Documentation.md b/Admin-Documentation.md index 6148d09..15ef6e7 100644 --- a/Admin-Documentation.md +++ b/Admin-Documentation.md @@ -1,7 +1,4 @@ -* [Aaron Toponce's ZFS on Linux User Guide][zol-guide] -* [OpenZFS System Administration][openzfs-docs] -* [Oracle Solaris ZFS Administration Guide][solaris-docs] -[zol-guide]: https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/ -[openzfs-docs]: http://open-zfs.org/wiki/System_Administration -[solaris-docs]: http://docs.oracle.com/cd/E19253-01/819-5461/ \ No newline at end of file +This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Admin%20Documentation.html + +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Async-Write.md b/Async-Write.md index 14d803c..6c4b15b 100644 --- a/Async-Write.md +++ b/Async-Write.md @@ -1,33 +1,3 @@ -### Async Writes +This page was moved to: https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/Async%20Write.html -The number of concurrent operations issued for the async write I/O class -follows a piece-wise linear function defined by a few adjustable points. 
- -``` - | o---------| <-- zfs_vdev_async_write_max_active - ^ | /^ | - | | / | | -active | / | | - I/O | / | | -count | / | | - | / | | - |-------o | | <-- zfs_vdev_async_write_min_active - 0|_______^______|_________| - 0% | | 100% of zfs_dirty_data_max - | | - | `-- zfs_vdev_async_write_active_max_dirty_percent - `--------- zfs_vdev_async_write_active_min_dirty_percent -``` - -Until the amount of dirty data exceeds a minimum percentage of the dirty -data allowed in the pool, the I/O scheduler will limit the number of -concurrent operations to the minimum. As that threshold is crossed, the -number of concurrent operations issued increases linearly to the maximum at -the specified maximum percentage of the dirty data allowed in the pool. - -Ideally, the amount of dirty data on a busy pool will stay in the sloped -part of the function between zfs_vdev_async_write_active_min_dirty_percent -and zfs_vdev_async_write_active_max_dirty_percent. If it exceeds the -maximum percentage, this indicates that the rate of incoming data is -greater than the rate that the backend storage can handle. In this case, we -must further throttle incoming writes, as described in the next section. +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Buildbot-Options.md b/Buildbot-Options.md index bfe22a3..3adb024 100644 --- a/Buildbot-Options.md +++ b/Buildbot-Options.md @@ -1,178 +1,3 @@ -There are a number of ways to control the ZFS Buildbot at a commit level. This page -provides a summary of various options that the ZFS Buildbot supports and how it impacts -testing. More detailed information regarding its implementation can be found at the -[ZFS Buildbot Github page](https://github.com/zfsonlinux/zfs-buildbot). +This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/Buildbot%20Options.html -## Choosing Builders -By default, all commits in your ZFS pull request are compiled by the BUILD -builders. Additionally, the top commit of your ZFS pull request is tested by -TEST builders. However, there is the option to override which types of builder -should be used on a per commit basis. In this case, you can add -`Requires-builders: ` to your -commit message. A comma separated list of options can be -provided. Supported options are: - -* `all`: This commit should be built by all available builders -* `none`: This commit should not be built by any builders -* `style`: This commit should be built by STYLE builders -* `build`: This commit should be built by all BUILD builders -* `arch`: This commit should be built by BUILD builders tagged as 'Architectures' -* `distro`: This commit should be built by BUILD builders tagged as 'Distributions' -* `test`: This commit should be built and tested by the TEST builders (excluding the Coverage TEST builders) -* `perf`: This commit should be built and tested by the PERF builders -* `coverage` : This commit should be built and tested by the Coverage TEST builders -* `unstable` : This commit should be built and tested by the Unstable TEST builders (currently only the Fedora Rawhide TEST builder) - -A couple of examples on how to use `Requires-builders:` in commit messages can be found below. - -### Preventing a commit from being built and tested. -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Requires-builders: none -``` - -### Submitting a commit to STYLE and TEST builders only. 
-``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Requires-builders: style test -``` - -## Requiring SPL Versions -Currently, the ZFS Buildbot attempts to choose the correct SPL branch to build -based on a pull request's base branch. In the cases where a specific SPL version -needs to be built, the ZFS buildbot supports specifying an SPL version for pull -request testing. By opening a pull request against ZFS and adding `Requires-spl:` -in a commit message, you can instruct the buildbot to use a specific SPL version. -Below are examples of a commit messages that specify the SPL version. - -### Build SPL from a specific pull request -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Requires-spl: refs/pull/123/head -``` - -### Build SPL branch `spl-branch-name` from `zfsonlinux/spl` repository -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Requires-spl: spl-branch-name -``` - -## Requiring Kernel Version -Currently, Kernel.org builders will clone and build the master branch of Linux. -In cases where a specific version of the Linux kernel needs to be built, the ZFS -buildbot supports specifying the Linux kernel to be built via commit message. -By opening a pull request against ZFS and adding `Requires-kernel:` in a commit -message, you can instruct the buildbot to use a specific Linux kernel. -Below is an example commit message that specifies a specific Linux kernel tag. - -### Build Linux Kernel Version 4.14 -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Requires-kernel: v4.14 -``` - -## Build Steps Overrides -Each builder will execute or skip build steps based on its default -preferences. In some scenarios, it might be possible to skip various build -steps. The ZFS buildbot supports overriding the defaults of all builders -in a commit message. The list of available overrides are: - -* `Build-linux: `: All builders should build Linux for this commit -* `Build-lustre: `: All builders should build Lustre for this commit -* `Build-spl: `: All builders should build the SPL for this commit -* `Build-zfs: `: All builders should build ZFS for this commit -* `Built-in: `: All Linux builds should build in SPL and ZFS -* `Check-lint: `: All builders should perform lint checks for this commit -* `Configure-lustre: `: Provide `` as configure flags when building Lustre -* `Configure-spl: `: Provide `` as configure flags when building the SPL -* `Configure-zfs: `: Provide `` as configure flags when building ZFS - -A couple of examples on how to use overrides in commit messages can be found below. - -### Skip building the SPL and build Lustre without ldiskfs -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Build-lustre: Yes -Configure-lustre: --disable-ldiskfs -Build-spl: No -``` - -### Build ZFS Only -``` -This is a commit message - -This text is part of the commit message body. - -Signed-off-by: Contributor -Build-lustre: No -Build-spl: No -``` - -## Configuring Tests with the TEST File -At the top level of the ZFS source tree, there is the [`TEST` -file](https://github.com/zfsonlinux/zfs/blob/master/TEST) which contains variables -that control if and how a specific test should run. Below is a list of each variable -and a brief description of what each variable controls. 
- -* `TEST_PREPARE_WATCHDOG` - Enables the Linux kernel watchdog -* `TEST_PREPARE_SHARES` - Start NFS and Samba servers -* `TEST_SPLAT_SKIP` - Determines if `splat` testing is skipped -* `TEST_SPLAT_OPTIONS` - Command line options to provide to `splat` -* `TEST_ZTEST_SKIP` - Determines if `ztest` testing is skipped -* `TEST_ZTEST_TIMEOUT` - The length of time `ztest` should run -* `TEST_ZTEST_DIR` - Directory where `ztest` will create vdevs -* `TEST_ZTEST_OPTIONS` - Options to pass to `ztest` -* `TEST_ZTEST_CORE_DIR` - Directory for `ztest` to store core dumps -* `TEST_ZIMPORT_SKIP` - Determines if `zimport` testing is skipped -* `TEST_ZIMPORT_DIR` - Directory used during `zimport` -* `TEST_ZIMPORT_VERSIONS` - Source versions to test -* `TEST_ZIMPORT_POOLS` - Names of the pools for `zimport` to use for testing -* `TEST_ZIMPORT_OPTIONS` - Command line options to provide to `zimport` -* `TEST_XFSTESTS_SKIP` - Determines if `xfstest` testing is skipped -* `TEST_XFSTESTS_URL` - URL to download `xfstest` from -* `TEST_XFSTESTS_VER` - Name of the tarball to download from `TEST_XFSTESTS_URL` -* `TEST_XFSTESTS_POOL` - Name of pool to create and used by `xfstest` -* `TEST_XFSTESTS_FS` - Name of dataset for use by `xfstest` -* `TEST_XFSTESTS_VDEV` - Name of the vdev used by `xfstest` -* `TEST_XFSTESTS_OPTIONS` - Command line options to provide to `xfstest` -* `TEST_ZFSTESTS_SKIP` - Determines if `zfs-tests` testing is skipped -* `TEST_ZFSTESTS_DIR` - Directory to store files and loopback devices -* `TEST_ZFSTESTS_DISKS` - Space delimited list of disks that `zfs-tests` is allowed to use -* `TEST_ZFSTESTS_DISKSIZE` - File size of file based vdevs used by `zfs-tests` -* `TEST_ZFSTESTS_ITERS` - Number of times `test-runner` should execute its set of tests -* `TEST_ZFSTESTS_OPTIONS` - Options to provide `zfs-tests` -* `TEST_ZFSTESTS_RUNFILE` - The runfile to use when running `zfs-tests` -* `TEST_ZFSTESTS_TAGS` - List of tags to provide to `test-runner` -* `TEST_ZFSSTRESS_SKIP` - Determines if `zfsstress` testing is skipped -* `TEST_ZFSSTRESS_URL` - URL to download `zfsstress` from -* `TEST_ZFSSTRESS_VER` - Name of the tarball to download from `TEST_ZFSSTRESS_URL` -* `TEST_ZFSSTRESS_RUNTIME` - Duration to run `runstress.sh` -* `TEST_ZFSSTRESS_POOL` - Name of pool to create and use for `zfsstress` testing -* `TEST_ZFSSTRESS_FS` - Name of dataset for use during `zfsstress` tests -* `TEST_ZFSSTRESS_FSOPT` - File system options to provide to `zfsstress` -* `TEST_ZFSSTRESS_VDEV` - Directory to store vdevs for use during `zfsstress` tests -* `TEST_ZFSSTRESS_OPTIONS` - Command line options to provide to `runstress.sh` \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Building-ZFS.md b/Building-ZFS.md index e1c3eca..a70e396 100644 --- a/Building-ZFS.md +++ b/Building-ZFS.md @@ -1,150 +1,3 @@ -### GitHub Repositories +This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html -The official source for ZFS on Linux is maintained at GitHub by the [zfsonlinux][zol-org] organization. The project consists of two primary git repositories named [spl][spl-repo] and [zfs][zfs-repo], both are required to build ZFS on Linux. - -**NOTE:** The SPL was merged in to the [zfs][zfs-repo] repository, the last major release with a separate SPL is `0.7`. - -* **SPL**: The SPL is thin shim layer which is responsible for implementing the fundamental interfaces required by OpenZFS. 
It's this layer which allows OpenZFS to be used across multiple platforms. - -* **ZFS**: The ZFS repository contains a copy of the upstream OpenZFS code which has been adapted and extended for Linux. The vast majority of the core OpenZFS code is self-contained and can be used without modification. - -### Installing Dependencies - -The first thing you'll need to do is prepare your environment by installing a full development tool chain. In addition, development headers for both the kernel and the following libraries must be available. It is important to note that if the development kernel headers for the currently running kernel aren't installed, the modules won't compile properly. - -The following dependencies should be installed to build the latest ZFS 0.8 release. - -* **RHEL/CentOS 7**: -```sh -sudo yum install epel-release gcc make autoconf automake libtool rpm-build dkms libtirpc-devel libblkid-devel libuuid-devel libudev-devel openssl-devel zlib-devel libaio-devel libattr-devel elfutils-libelf-devel kernel-devel-$(uname -r) python python2-devel python-setuptools python-cffi libffi-devel -``` - -* **RHEL/CentOS 8, Fedora**: -```sh -sudo dnf install gcc make autoconf automake libtool rpm-build dkms libtirpc-devel libblkid-devel libuuid-devel libudev-devel openssl-devel zlib-devel libaio-devel libattr-devel elfutils-libelf-devel kernel-devel-$(uname -r) python3 python3-devel python3-setuptools python3-cffi libffi-devel -``` - -* **Debian, Ubuntu**: -```sh -sudo apt install build-essential autoconf automake libtool gawk alien fakeroot dkms libblkid-dev uuid-dev libudev-dev libssl-dev zlib1g-dev libaio-dev libattr1-dev libelf-dev linux-headers-$(uname -r) python3 python3-dev python3-setuptools python3-cffi libffi-dev -``` - -### Build Options - -There are two options for building ZFS on Linux, the correct one largely depends on your requirements. - -* **Packages**: Often it can be useful to build custom packages from git which can be installed on a system. This is the best way to perform integration testing with systemd, dracut, and udev. The downside to using packages it is greatly increases the time required to build, install, and test a change. - -* **In-tree**: Development can be done entirely in the SPL and ZFS source trees. This speeds up development by allowing developers to rapidly iterate on a patch. When working in-tree developers can leverage incremental builds, load/unload kernel modules, execute utilities, and verify all their changes with the ZFS Test Suite. - -The remainder of this page focuses on the **in-tree** option which is the recommended method of development for the majority of changes. See the [[custom-packages]] page for additional information on building custom packages. - -### Developing In-Tree - -#### Clone from GitHub - -Start by cloning the SPL and ZFS repositories from GitHub. The repositories have a **master** branch for development and a series of **\*-release** branches for tagged releases. After checking out the repository your clone will default to the master branch. Tagged releases may be built by checking out spl/zfs-x.y.z tags with matching version numbers or matching release branches. Avoid using mismatched versions, this can result build failures due to interface changes. - -**NOTE:** SPL was merged in to the [zfs][zfs-repo] repository, last release with separate SPL is `0.7`. 
-``` -git clone https://github.com/zfsonlinux/zfs -``` - -If you need 0.7 release or older: -``` -git clone https://github.com/zfsonlinux/spl -``` - -#### Configure and Build - -For developers working on a change always create a new topic branch based off of master. This will make it easy to open a pull request with your change latter. The master branch is kept stable with extensive [regression testing][buildbot] of every pull request before and after it's merged. Every effort is made to catch defects as early as possible and to keep them out of the tree. Developers should be comfortable frequently rebasing their work against the latest master branch. - -If you want to build 0.7 release or older, you should compile SPL first: - -``` -cd ./spl -git checkout master -sh autogen.sh -./configure -make -s -j$(nproc) -``` - -In this example we'll use the master branch and walk through a stock **in-tree** build, so we don't need to build SPL separately. Start by checking out the desired branch then build the ZFS and SPL source in the tradition autotools fashion. - -``` -cd ./zfs -git checkout master -sh autogen.sh -./configure -make -s -j$(nproc) -``` - -**tip:** `--with-linux=PATH` and `--with-linux-obj=PATH` can be passed to configure to specify a kernel installed in a non-default location. This option is also supported when building ZFS. -**tip:** `--enable-debug` can be set to enable all ASSERTs and additional correctness tests. This option is also supported when building ZFS. -**tip:** for version `<=0.7` `--with-spl=PATH` and `--with-spl-obj=PATH`, where `PATH` is a full path, can be passed to configure if it is unable to locate the SPL. - -**Optional** Build packages - -``` -make deb #example for Debian/Ubuntu -``` - -#### Install - -You can run `zfs-tests.sh` without installing ZFS, see below. If you have reason to install ZFS after building it, pay attention to how your distribution handles kernel modules. -On Ubuntu, for example, the modules from this repository install in the `extra` kernel module path, which is not in the standard `depmod` search path. Therefore, for the duration of your testing, edit `/etc/depmod.d/ubuntu.conf` and add `extra` to the beginning of the search path. - -You may then install using `sudo make install; sudo ldconfig; sudo depmod`. You'd uninstall with `sudo make uninstall; sudo ldconfig; sudo depmod`. - -#### Running zloop.sh and zfs-tests.sh - -If you wish to run the ZFS Test Suite (ZTS), then `ksh` and a few additional utilities must be installed. - -* **RHEL/CentOS 7:** -```sh -sudo yum install ksh bc fio acl sysstat mdadm lsscsi parted attr dbench nfs-utils samba rng-tools pax perf -``` - -* **RHEL/CentOS 8, Fedora:** -```sh -sudo dnf install ksh bc fio acl sysstat mdadm lsscsi parted attr dbench nfs-utils samba rng-tools pax perf -``` - -* **Debian, Ubuntu:** -```sh -sudo apt install ksh bc fio acl sysstat mdadm lsscsi parted attr dbench nfs-kernel-server samba rng-tools pax linux-tools-common selinux-utils quota -``` - -There are a few helper scripts provided in the top-level scripts directory designed to aid developers working with in-tree builds. - -* **zfs-helper.sh:** Certain functionality (i.e. /dev/zvol/) depends on the ZFS provided udev helper scripts being installed on the system. This script can be used to create symlinks on the system from the installation location to the in-tree helper. These links must be in place to successfully run the ZFS Test Suite. The **-i** and **-r** options can be used to install and remove the symlinks. 
- -``` -sudo ./scripts/zfs-helpers.sh -i -``` - -* **zfs.sh:** The freshly built kernel modules can be loaded using `zfs.sh`. This script can latter be used to unload the kernel modules with the **-u** option. - -``` -sudo ./scripts/zfs.sh -``` - -* **zloop.sh:** A wrapper to run ztest repeatedly with randomized arguments. The ztest command is a user space stress test designed to detect correctness issues by concurrently running a random set of test cases. If a crash is encountered, the ztest logs, any associated vdev files, and core file (if one exists) are collected and moved to the output directory for analysis. - -``` -sudo ./scripts/zloop.sh -``` - -* **zfs-tests.sh:** A wrapper which can be used to launch the ZFS Test Suite. Three loopback devices are created on top of sparse files located in `/var/tmp/` and used for the regression test. Detailed directions for the ZFS Test Suite can be found in the [README][zts-readme] located in the top-level tests directory. - -``` - ./scripts/zfs-tests.sh -vx -``` - -**tip:** The **delegate** tests will be skipped unless group read permission is set on the zfs directory and its parents. - -[zol-org]: https://github.com/zfsonlinux/ -[spl-repo]: https://github.com/zfsonlinux/spl -[zfs-repo]: https://github.com/zfsonlinux/zfs -[buildbot]: http://build.zfsonlinux.org/ -[zts-readme]: https://github.com/zfsonlinux/zfs/tree/master/tests +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Checksums.md b/Checksums.md index 9a67aa5..2000f7e 100644 --- a/Checksums.md +++ b/Checksums.md @@ -1,58 +1,3 @@ -### Checksums and Their Use in ZFS +This page was moved to: https://openzfs.github.io/openzfs-docs/Basics%20concepts/Checksums.html -End-to-end checksums are a key feature of ZFS and an important differentiator -for ZFS over other RAID implementations and filesystems. -Advantages of end-to-end checksums include: -+ detects data corruption upon reading from media -+ blocks that are detected as corrupt are automatically repaired if possible, by -using the RAID protection in suitably configured pools, or redundant copies (see -the zfs `copies` property) -+ periodic scrubs can check data to detect and repair latent media degradation -(bit rot) and corruption from other sources -+ checksums on ZFS replication streams, `zfs send` and `zfs receive`, ensure the -data received is not corrupted by intervening storage or transport mechanisms - -#### Checksum Algorithms - -The checksum algorithms in ZFS can be changed for datasets (filesystems or -volumes). The checksum algorithm used for each block is stored in the block -pointer (metadata). The block checksum is calculated when the block is written, -so changing the algorithm only affects writes occurring after the change. - -The checksum algorithm for a dataset can be changed by setting the `checksum` -property: -```bash -zfs set checksum=sha256 pool_name/dataset_name -``` - -| Checksum | Ok for dedup and nopwrite? | Compatible with other ZFS implementations? 
| Notes -|---|---|---|--- -| on | see notes | yes | `on` is a short hand for `fletcher4` for non-deduped datasets and `sha256` for deduped datasets -| off | no | yes | Do not do use `off` -| fletcher2 | no | yes | Deprecated implementation of Fletcher checksum, use `fletcher4` instead -| fletcher4 | no | yes | Fletcher algorithm, also used for `zfs send` streams -| sha256 | yes | yes | Default for deduped datasets -| noparity | no | yes | Do not use `noparity` -| sha512 | yes | requires pool feature `org.illumos:sha512` | salted `sha512` currently not supported for any filesystem on the boot pools -| skein | yes | requires pool feature `org.illumos:skein` | salted `skein` currently not supported for any filesystem on the boot pools -| edonr | yes | requires pool feature `org.illumos:edonr` | salted `edonr` currently not supported for any filesystem on the boot pools - -#### Checksum Accelerators -ZFS has the ability to offload checksum operations to the Intel QuickAssist -Technology (QAT) adapters. - -#### Checksum Microbenchmarks -Some ZFS features use microbenchmarks when the `zfs.ko` kernel module is loaded -to determine the optimal algorithm for checksums. The results of the microbenchmarks -are observable in the `/proc/spl/kstat/zfs` directory. The winning algorithm is -reported as the "fastest" and becomes the default. The default can be overridden -by setting zfs module parameters. - -| Checksum | Results Filename | `zfs` module parameter -|---|---|--- -| Fletcher4 | /proc/spl/kstat/zfs/fletcher_4_bench | zfs_fletcher_4_impl - -#### Disabling Checksums -While it may be tempting to disable checksums to improve CPU performance, it is -widely considered by the ZFS community to be an extrodinarily bad idea. Don't -disable checksums. +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Custom-Packages.md b/Custom-Packages.md index c4f4b0b..3d8dcb3 100644 --- a/Custom-Packages.md +++ b/Custom-Packages.md @@ -1,128 +1,3 @@ -The following instructions assume you are building from an official [release tarball][release] (version 0.8.0 or newer) or directly from the [git repository][git]. Most users should not need to do this and should preferentially use the distribution packages. As a general rule the distribution packages will be more tightly integrated, widely tested, and better supported. However, if your distribution of choice doesn't provide packages, or you're a developer and want to roll your own, here's how to do it. +This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/Custom%20Packages.html -The first thing to be aware of is that the build system is capable of generating several different types of packages. Which type of package you choose depends on what's supported on your platform and exactly what your needs are. - -* **DKMS** packages contain only the source code and scripts for rebuilding the kernel modules. When the DKMS package is installed kernel modules will be built for all available kernels. Additionally, when the kernel is upgraded new kernel modules will be automatically built for that kernel. This is particularly convenient for desktop systems which receive frequent kernel updates. The downside is that because the DKMS packages build the kernel modules from source a full development environment is required which may not be appropriate for large deployments. - -* **kmods** packages are binary kernel modules which are compiled against a specific version of the kernel. 
This means that if you update the kernel you must compile and install a new kmod package. If you don't frequently update your kernel, or if you're managing a large number of systems, then kmod packages are a good choice. - -* **kABI-tracking kmod** Packages are similar to standard binary kmods and may be used with Enterprise Linux distributions like Red Hat and CentOS. These distributions provide a stable kABI (Kernel Application Binary Interface) which allows the same binary modules to be used with new versions of the distribution provided kernel. - -By default the build system will generate user packages and both DKMS and kmod style kernel packages if possible. The user packages can be used with either set of kernel packages and do not need to be rebuilt when the kernel is updated. You can also streamline the build process by building only the DKMS or kmod packages as shown below. - -Be aware that when building directly from a git repository you must first run the *autogen.sh* script to create the *configure* script. This will require installing the GNU autotools packages for your distribution. To perform any of the builds, you must install all the necessary development tools and headers for your distribution. - -It is important to note that if the development kernel headers for the currently running kernel aren't installed, the modules won't compile properly. - -* [Red Hat, CentOS and Fedora](#red-hat-centos-and-fedora) -* [Debian and Ubuntu](#debian-and-ubuntu) - -## RHEL, CentOS and Fedora - -Make sure that the required packages are installed to build the latest ZFS 0.8 release: - -* **RHEL/CentOS 7**: -```sh -sudo yum install epel-release gcc make autoconf automake libtool rpm-build dkms libtirpc-devel libblkid-devel libuuid-devel libudev-devel openssl-devel zlib-devel libaio-devel libattr-devel elfutils-libelf-devel kernel-devel-$(uname -r) python python2-devel python-setuptools python-cffi libffi-devel -``` - -* **RHEL/CentOS 8, Fedora**: -```sh -sudo dnf install gcc make autoconf automake libtool rpm-build kernel-rpm-macros dkms libtirpc-devel libblkid-devel libuuid-devel libudev-devel openssl-devel zlib-devel libaio-devel libattr-devel elfutils-libelf-devel kernel-devel-$(uname -r) python3 python3-devel python3-setuptools python3-cffi libffi-devel -``` - -[Get the source code](#get-the-source-code). - -### DKMS - -Building rpm-based DKMS and user packages can be done as follows: - -```sh -$ cd zfs -$ ./configure -$ make -j1 rpm-utils rpm-dkms -$ sudo yum localinstall *.$(uname -p).rpm *.noarch.rpm -``` - -### kmod - -The key thing to know when building a kmod package is that a specific Linux kernel must be specified. At configure time the build system will make an educated guess as to which kernel you want to build against. However, if configure is unable to locate your kernel development headers, or you want to build against a different kernel, you must specify the exact path with the *--with-linux* and *--with-linux-obj* options. - -```sh -$ cd zfs -$ ./configure -$ make -j1 rpm-utils rpm-kmod -$ sudo yum localinstall *.$(uname -p).rpm -``` - -### kABI-tracking kmod - -The process for building kABI-tracking kmods is almost identical to for building normal kmods. However, it will only produce binaries which can be used by multiple kernels if the distribution supports a stable kABI. In order to request kABI-tracking package the *--with-spec=redhat* option must be passed to configure. - -**NOTE:** This type of package is not available for Fedora. 
- -```sh -$ cd zfs -$ ./configure --with-spec=redhat -$ make -j1 rpm-utils rpm-kmod -$ sudo yum localinstall *.$(uname -p).rpm -``` - -## Debian and Ubuntu - -Make sure that the required packages are installed: - -```sh -sudo apt install build-essential autoconf automake libtool gawk alien fakeroot dkms libblkid-dev uuid-dev libudev-dev libssl-dev zlib1g-dev libaio-dev libattr1-dev libelf-dev linux-headers-$(uname -r) python3 python3-dev python3-setuptools python3-cffi libffi-dev -``` - -[Get the source code](#get-the-source-code). - -### kmod - -The key thing to know when building a kmod package is that a specific Linux kernel must be specified. At configure time the build system will make an educated guess as to which kernel you want to build against. However, if configure is unable to locate your kernel development headers, or you want to build against a different kernel, you must specify the exact path with the *--with-linux* and *--with-linux-obj* options. - -```sh -$ cd zfs -$ ./configure --enable-systemd -$ make -j1 deb-utils deb-kmod -$ for file in *.deb; do sudo gdebi -q --non-interactive $file; done -``` - -### DKMS - -Building deb-based DKMS and user packages can be done as follows: - -```sh -$ sudo apt-get install dkms -$ cd zfs -$ ./configure --enable-systemd -$ make -j1 deb-utils deb-dkms -$ for file in *.deb; do sudo gdebi -q --non-interactive $file; done -``` - -## Get the Source Code - -### Released Tarball - -The released tarball contains the latest fully tested and released version of ZFS. This is the preferred source code location for use in production systems. If you want to use the official released tarballs, then use the following commands to fetch and prepare the source. - -```sh -$ wget http://archive.zfsonlinux.org/downloads/zfsonlinux/zfs/zfs-x.y.z.tar.gz -$ tar -xzf zfs-x.y.z.tar.gz -``` - -### Git Master Branch - -The Git *master* branch contains the latest version of the software, and will probably contain fixes that, for some reason, weren't included in the released tarball. This is the preferred source code location for developers who intend to modify ZFS. If you would like to use the git version, you can clone it from Github and prepare the source like this. - -```sh -$ git clone https://github.com/zfsonlinux/zfs.git -$ cd zfs -$ ./autogen.sh -``` - -Once the source has been prepared you'll need to decide what kind of packages you're building and jump the to appropriate section above. Note that not all package types are supported for all platforms. - -[release]: https://github.com/zfsonlinux/zfs/releases/latest -[git]: https://github.com/zfsonlinux/zfs +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debian-Buster-Encrypted-Root-on-ZFS.md b/Debian-Buster-Encrypted-Root-on-ZFS.md index c7138ee..ace6405 100644 --- a/Debian-Buster-Encrypted-Root-on-ZFS.md +++ b/Debian-Buster-Encrypted-Root-on-ZFS.md @@ -1,33 +1 @@ -This experimental guide has been made official at [[Debian Buster Root on ZFS]]. 
- -If you have an existing system installed from the experimental guide, adjust your sources: - - vi /etc/apt/sources.list.d/buster-backports.list - deb http://deb.debian.org/debian buster-backports main contrib - deb-src http://deb.debian.org/debian buster-backports main contrib - - vi /etc/apt/preferences.d/90_zfs - Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux zfs-dkms zfs-initramfs zfs-test zfsutils-linux zfs-zed - Pin: release n=buster-backports - Pin-Priority: 990 - -This will allow you to upgrade from the locally-built packages to the official buster-backports packages. - -You should set a root password before upgrading: - - passwd - -Apply updates: - - apt update - apt dist-upgrade - -Reboot: - - reboot - -If the bpool fails to import, then enter the rescue shell (which requires a root password) and run: - - zpool import -f bpool - zpool export bpool - reboot +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debian-Buster-Root-on-ZFS.md b/Debian-Buster-Root-on-ZFS.md index 50b7ae4..df556c5 100644 --- a/Debian-Buster-Root-on-ZFS.md +++ b/Debian-Buster-Root-on-ZFS.md @@ -1,743 +1,3 @@ -### Caution -* This HOWTO uses a whole physical disk. -* Do not use these instructions for dual-booting. -* Backup your data. Any existing data will be lost. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/Debian%20Buster%20Root%20on%20ZFS.html -### System Requirements -* [64-bit Debian GNU/Linux Buster Live CD w/ GUI (e.g. gnome iso)](https://cdimage.debian.org/mirror/cdimage/release/current-live/amd64/iso-hybrid/) -* [A 64-bit kernel is *strongly* encouraged.](https://github.com/zfsonlinux/zfs/wiki/FAQ#32-bit-vs-64-bit-systems) -* Installing on a drive which presents 4KiB logical sectors (a “4Kn” drive) only works with UEFI booting. This not unique to ZFS. [GRUB does not and will not work on 4Kn with legacy (BIOS) booting.](http://savannah.gnu.org/bugs/?46700) - -Computers that have less than 2 GiB of memory run ZFS slowly. 4 GiB of memory is recommended for normal performance in basic workloads. If you wish to use deduplication, you will need [massive amounts of RAM](http://wiki.freebsd.org/ZFSTuningGuide#Deduplication). Enabling deduplication is a permanent change that cannot be easily reverted. - -## Support - -If you need help, reach out to the community using the [zfs-discuss mailing list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists) or IRC at #zfsonlinux on [freenode](https://freenode.net/). If you have a bug report or feature request related to this HOWTO, please [file a new issue](https://github.com/zfsonlinux/zfs/issues/new) and mention @rlaager. - -## Contributing - -Edit permission on this wiki is restricted. Also, GitHub wikis do not support pull requests. However, you can clone the wiki using git. - -1) `git clone https://github.com/zfsonlinux/zfs.wiki.git` -2) Make your changes. -3) Use `git diff > my-changes.patch` to create a patch. (Advanced git users may wish to `git commit` to a branch and `git format-patch`.) -4) [File a new issue](https://github.com/zfsonlinux/zfs/issues/new), mention @rlaager, and attach the patch. - -## Encryption - -This guide supports three different encryption options: unencrypted, LUKS (full-disk encryption), and ZFS native encryption. With any option, all ZFS features are fully available. - -Unencrypted does not encrypt anything, of course. With no encryption happening, this option naturally has the best performance. 
- -LUKS encrypts almost everything: the OS, swap, home directories, and anything else. The only unencrypted data is the bootloader, kernel, and initrd. The system cannot boot without the passphrase being entered at the console. Performance is good, but LUKS sits underneath ZFS, so if multiple disks (mirror or raidz topologies) are used, the data has to be encrypted once per disk. - -ZFS native encryption encrypts the data and most metadata in the root pool. It does not encrypt dataset or snapshot names or properties. The boot pool is not encrypted at all, but it only contains the bootloader, kernel, and initrd. (Unless you put a password in `/etc/fstab`, the initrd is unlikely to contain sensitive data.) The system cannot boot without the passphrase being entered at the console. Performance is good. As the encryption happens in ZFS, even if multiple disks (mirror or raidz topologies) are used, the data only has to be encrypted once. - -## Step 1: Prepare The Install Environment - -1.1 Boot the Debian GNU/Linux Live CD. If prompted, login with the username `user` and password `live`. Connect your system to the Internet as appropriate (e.g. join your WiFi network). - -1.2 Optional: Install and start the OpenSSH server in the Live CD environment: - -If you have a second system, using SSH to access the target system can be convenient. - - sudo apt update - sudo apt install --yes openssh-server - sudo systemctl restart ssh - -**Hint:** You can find your IP address with `ip addr show scope global | grep inet`. Then, from your main machine, connect with `ssh user@IP`. - -1.3 Become root: - - sudo -i - -1.4 Setup and update the repositories: - - echo deb http://deb.debian.org/debian buster contrib >> /etc/apt/sources.list - echo deb http://deb.debian.org/debian buster-backports main contrib >> /etc/apt/sources.list - apt update - -1.5 Install ZFS in the Live CD environment: - - apt install --yes debootstrap gdisk dkms dpkg-dev linux-headers-$(uname -r) - apt install --yes -t buster-backports --no-install-recommends zfs-dkms - modprobe zfs - apt install --yes -t buster-backports zfsutils-linux - -* The dkms dependency is installed manually just so it comes from buster and not buster-backports. This is not critical. -* We need to get the module built and loaded before installing zfsutils-linux or [zfs-mount.service will fail to start](https://github.com/zfsonlinux/zfs/issues/9599). - -## Step 2: Disk Formatting - -2.1 Set a variable with the disk name: - - DISK=/dev/disk/by-id/scsi-SATA_disk1 - -Always use the long `/dev/disk/by-id/*` aliases with ZFS. Using the `/dev/sd*` device nodes directly can cause sporadic import failures, especially on systems that have more than one storage pool. - -**Hints:** -* `ls -la /dev/disk/by-id` will list the aliases. -* Are you doing this in a virtual machine? If your virtual disk is missing from `/dev/disk/by-id`, use `/dev/vda` if you are using KVM with virtio; otherwise, read the [troubleshooting](#troubleshooting) section. 
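Before destroying anything, it can be worth confirming that the alias really resolves to the disk you intend to use. This is an optional, read-only sanity check and is not part of the original steps:

    ls -la /dev/disk/by-id | grep "$(basename $DISK)"
    readlink -f $DISK
    lsblk -o NAME,SIZE,SERIAL,MODEL "$(readlink -f $DISK)"

The first command lists the alias and its partition aliases, the second prints the underlying `/dev/sd*` (or `/dev/nvme*`) node, and the third shows the size, serial number, and model so you can match it against the physical drive.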
- -2.2 If you are re-using a disk, clear it as necessary: - -If the disk was previously used in an MD array, zero the superblock: - - apt install --yes mdadm - mdadm --zero-superblock --force $DISK - -Clear the partition table: - - sgdisk --zap-all $DISK - -2.3 Partition your disk(s): - -Run this if you need legacy (BIOS) booting: - - sgdisk -a1 -n1:24K:+1000K -t1:EF02 $DISK - -Run this for UEFI booting (for use now or in the future): - - sgdisk -n2:1M:+512M -t2:EF00 $DISK - -Run this for the boot pool: - - sgdisk -n3:0:+1G -t3:BF01 $DISK - -Choose one of the following options: - -2.3a Unencrypted or ZFS native encryption: - - sgdisk -n4:0:0 -t4:BF01 $DISK - -2.3b LUKS: - - sgdisk -n4:0:0 -t4:8300 $DISK - -If you are creating a mirror or raidz topology, repeat the partitioning commands for all the disks which will be part of the pool. - -2.4 Create the boot pool: - - zpool create -o ashift=12 -d \ - -o feature@async_destroy=enabled \ - -o feature@bookmarks=enabled \ - -o feature@embedded_data=enabled \ - -o feature@empty_bpobj=enabled \ - -o feature@enabled_txg=enabled \ - -o feature@extensible_dataset=enabled \ - -o feature@filesystem_limits=enabled \ - -o feature@hole_birth=enabled \ - -o feature@large_blocks=enabled \ - -o feature@lz4_compress=enabled \ - -o feature@spacemap_histogram=enabled \ - -o feature@userobj_accounting=enabled \ - -o feature@zpool_checkpoint=enabled \ - -o feature@spacemap_v2=enabled \ - -o feature@project_quota=enabled \ - -o feature@resilver_defer=enabled \ - -o feature@allocation_classes=enabled \ - -O acltype=posixacl -O canmount=off -O compression=lz4 -O devices=off \ - -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt bpool ${DISK}-part3 - -You should not need to customize any of the options for the boot pool. - -GRUB does not support all of the zpool features. See `spa_feature_names` in [grub-core/fs/zfs/zfs.c](http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/fs/zfs/zfs.c#n276). This step creates a separate boot pool for `/boot` with the features limited to only those that GRUB supports, allowing the root pool to use any/all features. Note that GRUB opens the pool read-only, so all read-only compatible features are "supported" by GRUB. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... bpool mirror /dev/disk/by-id/scsi-SATA_disk1-part3 /dev/disk/by-id/scsi-SATA_disk2-part3` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). -* The pool name is arbitrary. If changed, the new name must be used consistently. The `bpool` convention originated in this HOWTO. 
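If you want to confirm which features actually ended up enabled on the new boot pool, you can list its feature properties. This is an optional check, not a required step in the HOWTO:

    zpool get all bpool | grep feature@

Because the pool was created with `-d` (all features disabled) and only the GRUB-compatible features explicitly re-enabled, every feature not listed in the `zpool create` command above should report `disabled`.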
- -2.5 Create the root pool: - -Choose one of the following options: - -2.5a Unencrypted: - - zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt rpool ${DISK}-part4 - -2.5b LUKS: - - apt install --yes cryptsetup - cryptsetup luksFormat -c aes-xts-plain64 -s 512 -h sha256 ${DISK}-part4 - cryptsetup luksOpen ${DISK}-part4 luks1 - zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt rpool /dev/mapper/luks1 - -2.5c ZFS native encryption: - - zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O encryption=aes-256-gcm -O keylocation=prompt -O keyformat=passphrase \ - -O mountpoint=/ -R /mnt rpool ${DISK}-part4 - -* The use of `ashift=12` is recommended here because many drives today have 4KiB (or larger) physical sectors, even though they present 512B logical sectors. Also, a future replacement drive may have 4KiB physical sectors (in which case `ashift=12` is desirable) or 4KiB logical sectors (in which case `ashift=12` is required). -* Setting `-O acltype=posixacl` enables POSIX ACLs globally. If you do not want this, remove that option, but later add `-o acltype=posixacl` (note: lowercase "o") to the `zfs create` for `/var/log`, as [journald requires ACLs](https://askubuntu.com/questions/970886/journalctl-says-failed-to-search-journal-acl-operation-not-supported) -* Setting `normalization=formD` eliminates some corner cases relating to UTF-8 filename normalization. It also implies `utf8only=on`, which means that only UTF-8 filenames are allowed. If you care to support non-UTF-8 filenames, do not use this option. For a discussion of why requiring UTF-8 filenames may be a bad idea, see [The problems with enforced UTF-8 only filenames](http://utcc.utoronto.ca/~cks/space/blog/linux/ForcedUTF8Filenames). -* Setting `relatime=on` is a middle ground between classic POSIX `atime` behavior (with its significant performance impact) and `atime=off` (which provides the best performance by completely disabling atime updates). Since Linux 2.6.30, `relatime` has been the default for other filesystems. See [RedHat's documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/power_management_guide/relatime) for further information. -* Setting `xattr=sa` [vastly improves the performance of extended attributes](https://github.com/zfsonlinux/zfs/commit/82a37189aac955c81a59a5ecc3400475adb56355). Inside ZFS, extended attributes are used to implement POSIX ACLs. Extended attributes can also be used by user-space applications. [They are used by some desktop GUI applications.](https://en.wikipedia.org/wiki/Extended_file_attributes#Linux) [They can be used by Samba to store Windows ACLs and DOS attributes; they are required for a Samba Active Directory domain controller.](https://wiki.samba.org/index.php/Setting_up_a_Share_Using_Windows_ACLs) Note that [`xattr=sa` is Linux-specific.](http://open-zfs.org/wiki/Platform_code_differences) If you move your `xattr=sa` pool to another OpenZFS implementation besides ZFS-on-Linux, extended attributes will not be readable (though your data will be). If portability of extended attributes is important to you, omit the `-O xattr=sa` above. 
Even if you do not want `xattr=sa` for the whole pool, it is probably fine to use it for `/var/log`. -* Make sure to include the `-part4` portion of the drive path. If you forget that, you are specifying the whole disk, which ZFS will then re-partition, and you will lose the bootloader partition(s). -* For LUKS, the key size chosen is 512 bits. However, XTS mode requires two keys, so the LUKS key is split in half. Thus, `-s 512` means AES-256. -* ZFS native encryption uses `aes-256-ccm` by default. [AES-GCM seems to be generally preferred over AES-CCM](https://crypto.stackexchange.com/questions/6842/how-to-choose-between-aes-ccm-and-aes-gcm-for-storage-volume-encryption), [is faster now](https://github.com/zfsonlinux/zfs/pull/9749#issuecomment-569132997), and [will be even faster in the future](https://github.com/zfsonlinux/zfs/pull/9749). -* Your passphrase will likely be the weakest link. Choose wisely. See [section 5 of the cryptsetup FAQ](https://gitlab.com/cryptsetup/cryptsetup/wikis/FrequentlyAskedQuestions#5-security-aspects) for guidance. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part4 /dev/disk/by-id/scsi-SATA_disk2-part4` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). For LUKS, use `/dev/mapper/luks1`, `/dev/mapper/luks2`, etc., which you will have to create using `cryptsetup`. -* The pool name is arbitrary. If changed, the new name must be used consistently. On systems that can automatically install to ZFS, the root pool is named `rpool` by default. - -## Step 3: System Installation - -3.1 Create filesystem datasets to act as containers: - - zfs create -o canmount=off -o mountpoint=none rpool/ROOT - zfs create -o canmount=off -o mountpoint=none bpool/BOOT - -On Solaris systems, the root filesystem is cloned and the suffix is incremented for major system changes through `pkg image-update` or `beadm`. Similar functionality for APT is possible but currently unimplemented. Even without such a tool, it can still be used for manually created clones. - -3.2 Create filesystem datasets for the root and boot filesystems: - - zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian - zfs mount rpool/ROOT/debian - - zfs create -o canmount=noauto -o mountpoint=/boot bpool/BOOT/debian - zfs mount bpool/BOOT/debian - -With ZFS, it is not normally necessary to use a mount command (either `mount` or `zfs mount`). This situation is an exception because of `canmount=noauto`. - -3.3 Create datasets: - - zfs create rpool/home - zfs create -o mountpoint=/root rpool/home/root - zfs create -o canmount=off rpool/var - zfs create -o canmount=off rpool/var/lib - zfs create rpool/var/log - zfs create rpool/var/spool - -The datasets below are optional, depending on your preferences and/or software -choices. 
- -If you wish to exclude these from snapshots: - - zfs create -o com.sun:auto-snapshot=false rpool/var/cache - zfs create -o com.sun:auto-snapshot=false rpool/var/tmp - chmod 1777 /mnt/var/tmp - -If you use /opt on this system: - - zfs create rpool/opt - -If you use /srv on this system: - - zfs create rpool/srv - -If you use /usr/local on this system: - - zfs create -o canmount=off rpool/usr - zfs create rpool/usr/local - -If this system will have games installed: - - zfs create rpool/var/games - -If this system will store local email in /var/mail: - - zfs create rpool/var/mail - -If this system will use Snap packages: - - zfs create rpool/var/snap - -If you use /var/www on this system: - - zfs create rpool/var/www - -If this system will use GNOME: - - zfs create rpool/var/lib/AccountsService - -If this system will use Docker (which manages its own datasets & snapshots): - - zfs create -o com.sun:auto-snapshot=false rpool/var/lib/docker - -If this system will use NFS (locking): - - zfs create -o com.sun:auto-snapshot=false rpool/var/lib/nfs - -A tmpfs is recommended later, but if you want a separate dataset for /tmp: - - zfs create -o com.sun:auto-snapshot=false rpool/tmp - chmod 1777 /mnt/tmp - -The primary goal of this dataset layout is to separate the OS from user data. This allows the root filesystem to be rolled back without rolling back user data such as logs (in `/var/log`). This will be especially important if/when a `beadm` or similar utility is integrated. The `com.sun.auto-snapshot` setting is used by some ZFS snapshot utilities to exclude transient data. - -If you do nothing extra, `/tmp` will be stored as part of the root filesystem. Alternatively, you can create a separate dataset for `/tmp`, as shown above. This keeps the `/tmp` data out of snapshots of your root filesystem. It also allows you to set a quota on `rpool/tmp`, if you want to limit the maximum space used. Otherwise, you can use a tmpfs (RAM filesystem) later. - -3.4 Install the minimal system: - - debootstrap buster /mnt - zfs set devices=off rpool - -The `debootstrap` command leaves the new system in an unconfigured state. An alternative to using `debootstrap` is to copy the entirety of a working system into the new ZFS root. - -## Step 4: System Configuration - -4.1 Configure the hostname (change `HOSTNAME` to the desired hostname). - - echo HOSTNAME > /mnt/etc/hostname - - vi /mnt/etc/hosts - Add a line: - 127.0.1.1 HOSTNAME - or if the system has a real name in DNS: - 127.0.1.1 FQDN HOSTNAME - -**Hint:** Use `nano` if you find `vi` confusing. - -4.2 Configure the network interface: - -Find the interface name: - - ip addr show - -Adjust NAME below to match your interface name: - - vi /mnt/etc/network/interfaces.d/NAME - auto NAME - iface NAME inet dhcp - -Customize this file if the system is not a DHCP client. 
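For example, a static configuration might look like the following. The addresses below are placeholders (RFC 5737 documentation addresses), not values from this guide; substitute the correct ones for your network:

    vi /mnt/etc/network/interfaces.d/NAME
    auto NAME
    iface NAME inet static
        address 192.0.2.10
        netmask 255.255.255.0
        gateway 192.0.2.1

If you also need DNS, either install `resolvconf` and add a `dns-nameservers` line, or edit `/mnt/etc/resolv.conf` directly.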
- -4.3 Configure the package sources: - - vi /mnt/etc/apt/sources.list - deb http://deb.debian.org/debian buster main contrib - deb-src http://deb.debian.org/debian buster main contrib - - vi /mnt/etc/apt/sources.list.d/buster-backports.list - deb http://deb.debian.org/debian buster-backports main contrib - deb-src http://deb.debian.org/debian buster-backports main contrib - - vi /mnt/etc/apt/preferences.d/90_zfs - Package: libnvpair1linux libuutil1linux libzfs2linux libzfslinux-dev libzpool2linux python3-pyzfs pyzfs-doc spl spl-dkms zfs-dkms zfs-dracut zfs-initramfs zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed - Pin: release n=buster-backports - Pin-Priority: 990 - -4.4 Bind the virtual filesystems from the LiveCD environment to the new system and `chroot` into it: - - mount --rbind /dev /mnt/dev - mount --rbind /proc /mnt/proc - mount --rbind /sys /mnt/sys - chroot /mnt /usr/bin/env DISK=$DISK bash --login - -**Note:** This is using `--rbind`, not `--bind`. - -4.5 Configure a basic system environment: - - ln -s /proc/self/mounts /etc/mtab - apt update - - apt install --yes locales - dpkg-reconfigure locales - -Even if you prefer a non-English system language, always ensure that `en_US.UTF-8` is available. - - dpkg-reconfigure tzdata - -4.6 Install ZFS in the chroot environment for the new system: - - apt install --yes dpkg-dev linux-headers-amd64 linux-image-amd64 - apt install --yes zfs-initramfs - -4.7 For LUKS installs only, setup crypttab: - - apt install --yes cryptsetup - - echo luks1 UUID=$(blkid -s UUID -o value ${DISK}-part4) none \ - luks,discard,initramfs > /etc/crypttab - -* The use of `initramfs` is a work-around for [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -**Hint:** If you are creating a mirror or raidz topology, repeat the `/etc/crypttab` entries for `luks2`, etc. adjusting for each disk. - -4.8 Install GRUB - -Choose one of the following options: - -4.8a Install GRUB for legacy (BIOS) booting - - apt install --yes grub-pc - -Install GRUB to the disk(s), not the partition(s). - -4.8b Install GRUB for UEFI booting - - apt install dosfstools - mkdosfs -F 32 -s 1 -n EFI ${DISK}-part2 - mkdir /boot/efi - echo PARTUUID=$(blkid -s PARTUUID -o value ${DISK}-part2) \ - /boot/efi vfat nofail,x-systemd.device-timeout=1 0 1 >> /etc/fstab - mount /boot/efi - apt install --yes grub-efi-amd64 shim-signed - -* The `-s 1` for `mkdosfs` is only necessary for drives which present 4 KiB logical sectors (“4Kn” drives) to meet the minimum cluster size (given the partition size of 512 MiB) for FAT32. It also works fine on drives which present 512 B sectors. - -**Note:** If you are creating a mirror or raidz topology, this step only installs GRUB on the first disk. The other disk(s) will be handled later. - -4.9 Set a root password - - passwd - -4.10 Enable importing bpool - -This ensures that `bpool` is always imported, regardless of whether `/etc/zfs/zpool.cache` exists, whether it is in the cachefile or not, or whether `zfs-import-scan.service` is enabled. 
-``` - vi /etc/systemd/system/zfs-import-bpool.service - [Unit] - DefaultDependencies=no - Before=zfs-import-scan.service - Before=zfs-import-cache.service - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=/sbin/zpool import -N -o cachefile=none bpool - - [Install] - WantedBy=zfs-import.target -``` - - systemctl enable zfs-import-bpool.service - -4.11 Optional (but recommended): Mount a tmpfs to /tmp - -If you chose to create a `/tmp` dataset above, skip this step, as they are mutually exclusive choices. Otherwise, you can put `/tmp` on a tmpfs (RAM filesystem) by enabling the `tmp.mount` unit. - - cp /usr/share/systemd/tmp.mount /etc/systemd/system/ - systemctl enable tmp.mount - -4.12 Optional (but kindly requested): Install popcon - -The `popularity-contest` package reports the list of packages install on your system. Showing that ZFS is popular may be helpful in terms of long-term attention from the distro. - - apt install --yes popularity-contest - -Choose Yes at the prompt. - -## Step 5: GRUB Installation - -5.1 Verify that the ZFS boot filesystem is recognized: - - grub-probe /boot - -5.2 Refresh the initrd files: - - update-initramfs -u -k all - -**Note:** When using LUKS, this will print "WARNING could not determine root device from /etc/fstab". This is because [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -5.3 Workaround GRUB's missing zpool-features support: - - vi /etc/default/grub - Set: GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian" - -5.4 Optional (but highly recommended): Make debugging GRUB easier: - - vi /etc/default/grub - Remove quiet from: GRUB_CMDLINE_LINUX_DEFAULT - Uncomment: GRUB_TERMINAL=console - Save and quit. - -Later, once the system has rebooted twice and you are sure everything is working, you can undo these changes, if desired. - -5.5 Update the boot configuration: - - update-grub - -**Note:** Ignore errors from `osprober`, if present. - -5.6 Install the boot loader - -5.6a For legacy (BIOS) booting, install GRUB to the MBR: - - grub-install $DISK - -Note that you are installing GRUB to the whole disk, not a partition. - -If you are creating a mirror or raidz topology, repeat the `grub-install` command for each disk in the pool. - -5.6b For UEFI booting, install GRUB: - - grub-install --target=x86_64-efi --efi-directory=/boot/efi \ - --bootloader-id=debian --recheck --no-floppy - -It is not necessary to specify the disk here. If you are creating a mirror or raidz topology, the additional disks will be handled later. - -5.7 Verify that the ZFS module is installed: - - ls /boot/grub/*/zfs.mod - -5.8 Fix filesystem mount ordering - -Until there is support for mounting `/boot` in the initramfs, we also need to mount that, because it was marked `canmount=noauto`. Also, with UEFI, we need to ensure it is mounted before its child filesystem `/boot/efi`. - -We need to activate `zfs-mount-generator`. This makes systemd aware of the separate mountpoints, which is important for things like `/var/log` and `/var/tmp`. In turn, `rsyslog.service` depends on `var-log.mount` by way of `local-fs.target` and services using the `PrivateTmp` feature of systemd automatically use `After=var-tmp.mount`. 
- -For UEFI booting, unmount /boot/efi first: - - umount /boot/efi - -Everything else applies to both BIOS and UEFI booting: - - zfs set mountpoint=legacy bpool/BOOT/debian - echo bpool/BOOT/debian /boot zfs \ - nodev,relatime,x-systemd.requires=zfs-import-bpool.service 0 0 >> /etc/fstab - - mkdir /etc/zfs/zfs-list.cache - touch /etc/zfs/zfs-list.cache/rpool - ln -s /usr/lib/zfs-linux/zed.d/history_event-zfs-list-cacher.sh /etc/zfs/zed.d - zed -F & - -Verify that zed updated the cache by making sure this is not empty: - - cat /etc/zfs/zfs-list.cache/rpool - -If it is empty, force a cache update and check again: - - zfs set canmount=noauto rpool/ROOT/debian - -Stop zed: - - fg - Press Ctrl-C. - -Fix the paths to eliminate /mnt: - - sed -Ei "s|/mnt/?|/|" /etc/zfs/zfs-list.cache/rpool - -## Step 6: First Boot - -6.1 Snapshot the initial installation: - - zfs snapshot bpool/BOOT/debian@install - zfs snapshot rpool/ROOT/debian@install - -In the future, you will likely want to take snapshots before each upgrade, and remove old snapshots (including this one) at some point to save space. - -6.2 Exit from the `chroot` environment back to the LiveCD environment: - - exit - -6.3 Run these commands in the LiveCD environment to unmount all filesystems: - - mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - zpool export -a - -6.4 Reboot: - - reboot - -6.5 Wait for the newly installed system to boot normally. Login as root. - -6.6 Create a user account: - - zfs create rpool/home/YOURUSERNAME - adduser YOURUSERNAME - cp -a /etc/skel/. /home/YOURUSERNAME - chown -R YOURUSERNAME:YOURUSERNAME /home/YOURUSERNAME - -6.7 Add your user account to the default set of groups for an administrator: - - usermod -a -G audio,cdrom,dip,floppy,netdev,plugdev,sudo,video YOURUSERNAME - -6.8 Mirror GRUB - -If you installed to multiple disks, install GRUB on the additional disks: - -6.8a For legacy (BIOS) booting: - - dpkg-reconfigure grub-pc - Hit enter until you get to the device selection screen. - Select (using the space bar) all of the disks (not partitions) in your pool. - -6.8b UEFI - - umount /boot/efi - -For the second and subsequent disks (increment debian-2 to -3, etc.): - - dd if=/dev/disk/by-id/scsi-SATA_disk1-part2 \ - of=/dev/disk/by-id/scsi-SATA_disk2-part2 - efibootmgr -c -g -d /dev/disk/by-id/scsi-SATA_disk2 \ - -p 2 -L "debian-2" -l '\EFI\debian\grubx64.efi' - - mount /boot/efi - -## Step 7: (Optional) Configure Swap - -**Caution**: On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in: https://github.com/zfsonlinux/zfs/issues/7734 - -7.1 Create a volume dataset (zvol) for use as a swap device: - - zfs create -V 4G -b $(getconf PAGESIZE) -o compression=zle \ - -o logbias=throughput -o sync=always \ - -o primarycache=metadata -o secondarycache=none \ - -o com.sun:auto-snapshot=false rpool/swap - -You can adjust the size (the `4G` part) to your needs. - -The compression algorithm is set to `zle` because it is the cheapest available algorithm. As this guide recommends `ashift=12` (4 kiB blocks on disk), the common case of a 4 kiB page size means that no compression algorithm can reduce I/O. The exception is all-zero pages, which are dropped by ZFS; but some form of compression has to be enabled to get this behavior. - -7.2 Configure the swap device: - -**Caution**: Always use long `/dev/zvol` aliases in configuration files. 
Never use a short `/dev/zdX` device name. - - mkswap -f /dev/zvol/rpool/swap - echo /dev/zvol/rpool/swap none swap discard 0 0 >> /etc/fstab - echo RESUME=none > /etc/initramfs-tools/conf.d/resume - -The `RESUME=none` is necessary to disable resuming from hibernation. This does not work, as the zvol is not present (because the pool has not yet been imported) at the time the resume script runs. If it is not disabled, the boot process hangs for 30 seconds waiting for the swap zvol to appear. - -7.3 Enable the swap device: - - swapon -av - -## Step 8: Full Software Installation - -8.1 Upgrade the minimal system: - - apt dist-upgrade --yes - -8.2 Install a regular set of software: - - tasksel - -8.3 Optional: Disable log compression: - -As `/var/log` is already compressed by ZFS, logrotate’s compression is going to burn CPU and disk I/O for (in most cases) very little gain. Also, if you are making snapshots of `/var/log`, logrotate’s compression will actually waste space, as the uncompressed data will live on in the snapshot. You can edit the files in `/etc/logrotate.d` by hand to comment out `compress`, or use this loop (copy-and-paste highly recommended): - - for file in /etc/logrotate.d/* ; do - if grep -Eq "(^|[^#y])compress" "$file" ; then - sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$file" - fi - done - -8.4 Reboot: - - reboot - -### Step 9: Final Cleanup - -9.1 Wait for the system to boot normally. Login using the account you created. Ensure the system (including networking) works normally. - -9.2 Optional: Delete the snapshots of the initial installation: - - sudo zfs destroy bpool/BOOT/debian@install - sudo zfs destroy rpool/ROOT/debian@install - -9.3 Optional: Disable the root password - - sudo usermod -p '*' root - -9.4 Optional: Re-enable the graphical boot process: - -If you prefer the graphical boot process, you can re-enable it now. If you are using LUKS, it makes the prompt look nicer. - - sudo vi /etc/default/grub - Add quiet to GRUB_CMDLINE_LINUX_DEFAULT - Comment out GRUB_TERMINAL=console - Save and quit. - - sudo update-grub - -**Note:** Ignore errors from `osprober`, if present. - -9.5 Optional: For LUKS installs only, backup the LUKS header: - - sudo cryptsetup luksHeaderBackup /dev/disk/by-id/scsi-SATA_disk1-part4 \ - --header-backup-file luks1-header.dat - -Store that backup somewhere safe (e.g. cloud storage). It is protected by your LUKS passphrase, but you may wish to use additional encryption. - -**Hint:** If you created a mirror or raidz topology, repeat this for each LUKS volume (`luks2`, etc.). - -## Troubleshooting - -### Rescuing using a Live CD - -Go through [Step 1: Prepare The Install Environment](#step-1-prepare-the-install-environment). - -For LUKS, first unlock the disk(s): - - apt install --yes cryptsetup - cryptsetup luksOpen /dev/disk/by-id/scsi-SATA_disk1-part4 luks1 - Repeat for additional disks, if this is a mirror or raidz topology. - -Mount everything correctly: - - zpool export -a - zpool import -N -R /mnt rpool - zpool import -N -R /mnt bpool - zfs load-key -a - zfs mount rpool/ROOT/debian - zfs mount -a - -If needed, you can chroot into your installed environment: - - mount --rbind /dev /mnt/dev - mount --rbind /proc /mnt/proc - mount --rbind /sys /mnt/sys - chroot /mnt /bin/bash --login - mount /boot - mount -a - -Do whatever you need to do to fix your system. 
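For example, if the problem is a stale boot configuration, a common repair from inside the chroot is to regenerate the initramfs and the GRUB configuration (a sketch only; what you actually need to run depends on what is broken):

    update-initramfs -u -k all
    update-grub

For legacy (BIOS) booting, you may also want to re-run `grub-install` against the whole disk, as in Step 5.6a.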
- -When done, cleanup: - - exit - mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - zpool export -a - reboot - -### MPT2SAS - -Most problem reports for this tutorial involve `mpt2sas` hardware that does slow asynchronous drive initialization, like some IBM M1015 or OEM-branded cards that have been flashed to the reference LSI firmware. - -The basic problem is that disks on these controllers are not visible to the Linux kernel until after the regular system is started, and ZoL does not hotplug pool members. See https://github.com/zfsonlinux/zfs/issues/330. - -Most LSI cards are perfectly compatible with ZoL. If your card has this glitch, try setting ZFS_INITRD_PRE_MOUNTROOT_SLEEP=X in /etc/default/zfs. The system will wait X seconds for all drives to appear before importing the pool. - -### Areca - -Systems that require the `arcsas` blob driver should add it to the `/etc/initramfs-tools/modules` file and run `update-initramfs -u -k all`. - -Upgrade or downgrade the Areca driver if something like `RIP: 0010:[] [] native_read_tsc+0x6/0x20` appears anywhere in kernel log. ZoL is unstable on systems that emit this error message. - -### VMware - -* Set `disk.EnableUUID = "TRUE"` in the vmx file or vsphere configuration. Doing this ensures that `/dev/disk` aliases are created in the guest. - -### QEMU/KVM/XEN - -Set a unique serial number on each virtual disk using libvirt or qemu (e.g. `-drive if=none,id=disk1,file=disk1.qcow2,serial=1234567890`). - -To be able to use UEFI in guests (instead of only BIOS booting), run this on the host: - - sudo apt install ovmf - - sudo vi /etc/libvirt/qemu.conf - Uncomment these lines: - nvram = [ - "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd", - "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.fd", - "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd", - "/usr/share/AAVMF/AAVMF32_CODE.fd:/usr/share/AAVMF/AAVMF32_VARS.fd" - ] - - sudo systemctl restart libvirtd.service +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debian-GNU-Linux-initrd-documentation.md b/Debian-GNU-Linux-initrd-documentation.md index e37de96..9ce1bac 100644 --- a/Debian-GNU-Linux-initrd-documentation.md +++ b/Debian-GNU-Linux-initrd-documentation.md @@ -1,70 +1,3 @@ -# Supported boot parameters -* rollback=\ Do a rollback of specified snapshot. -* zfs_debug=\ Debug the initrd script -* zfs_force=\ Force importing the pool. Should not be necessary. -* zfs=\ Don't try to import ANY pool, mount ANY filesystem or even load the module. -* rpool=\ Use this pool for root pool. -* bootfs=\/\ Use this dataset for root filesystem. -* root=\/\ Use this dataset for root filesystem. -* root=ZFS=\/\ Use this dataset for root filesystem. -* root=zfs:\/\ Use this dataset for root filesystem. -* root=zfs:AUTO Try to detect both pool and rootfs +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/Debian%20GNU%20Linux%20initrd%20documentation.html -In all these cases, \ could also be \@\. - -The reason there are so many supported boot options to get the root filesystem, is that there are a lot of different ways too boot ZFS out there, and I wanted to make sure I supported them all. - -# Pool imports -## Import using /dev/disk/by-* -The initrd will, if the variable USE_DISK_BY_ID is set in the file /etc/default/zfs, to import using the /dev/disk/by-* links. It will try to import in this order: - -1. /dev/disk/by-vdev -2. 
/dev/disk/by-\* -3. /dev - -## Import using cache file -If all of these imports fail (or if USE_DISK_BY_ID is unset), it will then try to import using the cache file. - -## Last ditch attempt at importing -If that ALSO fails, it will try one more time, without any -d or -c options. - -# Booting -## Booting from snapshot: -Enter the snapshot for the root= parameter like in this example: - -``` -linux /ROOT/debian-1@/boot/vmlinuz-3.2.0-4-amd64 root=ZFS=rpool/ROOT/debian-1@some_snapshot ro boot=zfs $bootfs quiet -``` - -This will clone the snapshot rpool/ROOT/debian-1@some_snapshot into the filesystem rpool/ROOT/debian-1_some_snapshot and use that as root filesystem. The original filesystem and snapshot is left alone in this case. - -**BEWARE** that it will first destroy, blindingly, the rpool/ROOT/debian-1_some_snapshot filesystem before trying to clone the snapshot into it again. So if you've booted from the same snapshot previously and done some changes in that root filesystem, they will be undone by the destruction of the filesystem. - -## Snapshot rollback -From version 0.6.4-1-3 it is now also possible to specify rollback=1 to do a rollback of the snapshot instead of cloning it. **BEWARE** that this will destroy _all_ snapshots done after the specified snapshot! - -## Select snapshot dynamically -From version 0.6.4-1-3 it is now also possible to specify a NULL snapshot name (such as root=rpool/ROOT/debian-1@) and if so, the initrd script will discover all snapshots below that filesystem (sans the at), and output a list of snapshot for the user to choose from. - -## Booting from native encrypted filesystem -Although there is currently no support for native encryption in ZFS On Linux, there is a patch floating around 'out there' and the initrd supports loading key and unlock such encrypted filesystem. - -## Separated filesystems -### Descended filesystems -If there are separate filesystems (for example a separate dataset for /usr), the snapshot boot code will try to find the snapshot under each filesystems and clone (or rollback) them. - -Example: - -``` -rpool/ROOT/debian-1@some_snapshot -rpool/ROOT/debian-1/usr@some_snapshot -``` - -These will create the following filesystems respectively (if not doing a rollback): - -``` -rpool/ROOT/debian-1_some_snapshot -rpool/ROOT/debian-1/usr_some_snapshot -``` - -The initrd code will use the mountpoint option (if any) in the original (without the snapshot part) dataset to find _where_ it should mount the dataset. Or it will use the name of the dataset below the root filesystem (rpool/ROOT/debian-1 in this example) for the mount point. \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debian-Stretch-Root-on-ZFS.md b/Debian-Stretch-Root-on-ZFS.md index 795952b..120a15c 100644 --- a/Debian-Stretch-Root-on-ZFS.md +++ b/Debian-Stretch-Root-on-ZFS.md @@ -1,702 +1,3 @@ -### Newer release available -* See [[Debian Buster Root on ZFS]] for new installs. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/Debian%20Stretch%20Root%20on%20ZFS.html -### Caution -* This HOWTO uses a whole physical disk. -* Do not use these instructions for dual-booting. -* Backup your data. Any existing data will be lost. 
- -### System Requirements -* [64-bit Debian GNU/Linux Stretch Live CD](http://cdimage.debian.org/debian-cd/current-live/amd64/iso-hybrid/) -* [A 64-bit kernel is *strongly* encouraged.](https://github.com/zfsonlinux/zfs/wiki/FAQ#32-bit-vs-64-bit-systems) -* Installing on a drive which presents 4KiB logical sectors (a “4Kn” drive) only works with UEFI booting. This not unique to ZFS. [GRUB does not and will not work on 4Kn with legacy (BIOS) booting.](http://savannah.gnu.org/bugs/?46700) - -Computers that have less than 2 GiB of memory run ZFS slowly. 4 GiB of memory is recommended for normal performance in basic workloads. If you wish to use deduplication, you will need [massive amounts of RAM](http://wiki.freebsd.org/ZFSTuningGuide#Deduplication). Enabling deduplication is a permanent change that cannot be easily reverted. - -## Support - -If you need help, reach out to the community using the [zfs-discuss mailing list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists) or IRC at #zfsonlinux on [freenode](https://freenode.net/). If you have a bug report or feature request related to this HOWTO, please [file a new issue](https://github.com/zfsonlinux/zfs/issues/new) and mention @rlaager. - -## Contributing - -Edit permission on this wiki is restricted. Also, GitHub wikis do not support pull requests. However, you can clone the wiki using git. - -1) `git clone https://github.com/zfsonlinux/zfs.wiki.git` -2) Make your changes. -3) Use `git diff > my-changes.patch` to create a patch. (Advanced git users may wish to `git commit` to a branch and `git format-patch`.) -4) [File a new issue](https://github.com/zfsonlinux/zfs/issues/new), mention @rlaager, and attach the patch. - -## Encryption - -This guide supports two different encryption options: unencrypted and LUKS (full-disk encryption). ZFS native encryption has not yet been released. With either option, all ZFS features are fully available. - -Unencrypted does not encrypt anything, of course. With no encryption happening, this option naturally has the best performance. - -LUKS encrypts almost everything: the OS, swap, home directories, and anything else. The only unencrypted data is the bootloader, kernel, and initrd. The system cannot boot without the passphrase being entered at the console. Performance is good, but LUKS sits underneath ZFS, so if multiple disks (mirror or raidz topologies) are used, the data has to be encrypted once per disk. - -## Step 1: Prepare The Install Environment - -1.1 Boot the Debian GNU/Linux Live CD. If prompted, login with the username `user` and password `live`. Connect your system to the Internet as appropriate (e.g. join your WiFi network). - -1.2 Optional: Install and start the OpenSSH server in the Live CD environment: - -If you have a second system, using SSH to access the target system can be convenient. - - $ sudo apt update - $ sudo apt install --yes openssh-server - $ sudo systemctl restart ssh - -**Hint:** You can find your IP address with `ip addr show scope global | grep inet`. Then, from your main machine, connect with `ssh user@IP`. 
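For example (the address below is only a placeholder; substitute whatever `ip addr` reports for your machine):

    $ ip addr show scope global | grep inet
    $ ssh user@192.0.2.10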
- -1.3 Become root: - - $ sudo -i - -1.4 Setup and update the repositories: - - # echo deb http://deb.debian.org/debian stretch contrib >> /etc/apt/sources.list - # echo deb http://deb.debian.org/debian stretch-backports main contrib >> /etc/apt/sources.list - # apt update - -1.5 Install ZFS in the Live CD environment: - - # apt install --yes debootstrap gdisk dkms dpkg-dev linux-headers-$(uname -r) - # apt install --yes -t stretch-backports zfs-dkms - # modprobe zfs - -* The dkms dependency is installed manually just so it comes from stretch and not stretch-backports. This is not critical. - -## Step 2: Disk Formatting - -2.1 If you are re-using a disk, clear it as necessary: - - If the disk was previously used in an MD array, zero the superblock: - # apt install --yes mdadm - # mdadm --zero-superblock --force /dev/disk/by-id/scsi-SATA_disk1 - - Clear the partition table: - # sgdisk --zap-all /dev/disk/by-id/scsi-SATA_disk1 - -2.2 Partition your disk(s): - - Run this if you need legacy (BIOS) booting: - # sgdisk -a1 -n1:24K:+1000K -t1:EF02 /dev/disk/by-id/scsi-SATA_disk1 - - Run this for UEFI booting (for use now or in the future): - # sgdisk -n2:1M:+512M -t2:EF00 /dev/disk/by-id/scsi-SATA_disk1 - - Run this for the boot pool: - # sgdisk -n3:0:+1G -t3:BF01 /dev/disk/by-id/scsi-SATA_disk1 - -Choose one of the following options: - -2.2a Unencrypted: - - # sgdisk -n4:0:0 -t4:BF01 /dev/disk/by-id/scsi-SATA_disk1 - -2.2b LUKS: - - # sgdisk -n4:0:0 -t4:8300 /dev/disk/by-id/scsi-SATA_disk1 - -Always use the long `/dev/disk/by-id/*` aliases with ZFS. Using the `/dev/sd*` device nodes directly can cause sporadic import failures, especially on systems that have more than one storage pool. - -**Hints:** -* `ls -la /dev/disk/by-id` will list the aliases. -* Are you doing this in a virtual machine? If your virtual disk is missing from `/dev/disk/by-id`, use `/dev/vda` if you are using KVM with virtio; otherwise, read the [troubleshooting](#troubleshooting) section. -* If you are creating a mirror or raidz topology, repeat the partitioning commands for all the disks which will be part of the pool. - -2.3 Create the boot pool: - - # zpool create -o ashift=12 -d \ - -o feature@async_destroy=enabled \ - -o feature@bookmarks=enabled \ - -o feature@embedded_data=enabled \ - -o feature@empty_bpobj=enabled \ - -o feature@enabled_txg=enabled \ - -o feature@extensible_dataset=enabled \ - -o feature@filesystem_limits=enabled \ - -o feature@hole_birth=enabled \ - -o feature@large_blocks=enabled \ - -o feature@lz4_compress=enabled \ - -o feature@spacemap_histogram=enabled \ - -o feature@userobj_accounting=enabled \ - -O acltype=posixacl -O canmount=off -O compression=lz4 -O devices=off \ - -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt \ - bpool /dev/disk/by-id/scsi-SATA_disk1-part3 - -You should not need to customize any of the options for the boot pool. - -GRUB does not support all of the zpool features. See `spa_feature_names` in [grub-core/fs/zfs/zfs.c](http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/fs/zfs/zfs.c#n276). This step creates a separate boot pool for `/boot` with the features limited to only those that GRUB supports, allowing the root pool to use any/all features. Note that GRUB opens the pool read-only, so all read-only compatible features are "supported" by GRUB. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... 
bpool mirror /dev/disk/by-id/scsi-SATA_disk1-part3 /dev/disk/by-id/scsi-SATA_disk2-part3` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). -* The pool name is arbitrary. If changed, the new name must be used consistently. The `bpool` convention originated in this HOWTO. - -2.4 Create the root pool: - -Choose one of the following options: - -2.4a Unencrypted: - - # zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt \ - rpool /dev/disk/by-id/scsi-SATA_disk1-part4 - -2.4b LUKS: - - # apt install --yes cryptsetup - # cryptsetup luksFormat -c aes-xts-plain64 -s 512 -h sha256 \ - /dev/disk/by-id/scsi-SATA_disk1-part4 - # cryptsetup luksOpen /dev/disk/by-id/scsi-SATA_disk1-part4 luks1 - # zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt \ - rpool /dev/mapper/luks1 - -* The use of `ashift=12` is recommended here because many drives today have 4KiB (or larger) physical sectors, even though they present 512B logical sectors. Also, a future replacement drive may have 4KiB physical sectors (in which case `ashift=12` is desirable) or 4KiB logical sectors (in which case `ashift=12` is required). -* Setting `-O acltype=posixacl` enables POSIX ACLs globally. If you do not want this, remove that option, but later add `-o acltype=posixacl` (note: lowercase "o") to the `zfs create` for `/var/log`, as [journald requires ACLs](https://askubuntu.com/questions/970886/journalctl-says-failed-to-search-journal-acl-operation-not-supported) -* Setting `normalization=formD` eliminates some corner cases relating to UTF-8 filename normalization. It also implies `utf8only=on`, which means that only UTF-8 filenames are allowed. If you care to support non-UTF-8 filenames, do not use this option. For a discussion of why requiring UTF-8 filenames may be a bad idea, see [The problems with enforced UTF-8 only filenames](http://utcc.utoronto.ca/~cks/space/blog/linux/ForcedUTF8Filenames). -* Setting `relatime=on` is a middle ground between classic POSIX `atime` behavior (with its significant performance impact) and `atime=off` (which provides the best performance by completely disabling atime updates). Since Linux 2.6.30, `relatime` has been the default for other filesystems. See [RedHat's documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/power_management_guide/relatime) for further information. -* Setting `xattr=sa` [vastly improves the performance of extended attributes](https://github.com/zfsonlinux/zfs/commit/82a37189aac955c81a59a5ecc3400475adb56355). Inside ZFS, extended attributes are used to implement POSIX ACLs. Extended attributes can also be used by user-space applications. [They are used by some desktop GUI applications.](https://en.wikipedia.org/wiki/Extended_file_attributes#Linux) [They can be used by Samba to store Windows ACLs and DOS attributes; they are required for a Samba Active Directory domain controller.](https://wiki.samba.org/index.php/Setting_up_a_Share_Using_Windows_ACLs) Note that [`xattr=sa` is Linux-specific.](http://open-zfs.org/wiki/Platform_code_differences) If you move your `xattr=sa` pool to another OpenZFS implementation besides ZFS-on-Linux, extended attributes will not be readable (though your data will be). 
If portability of extended attributes is important to you, omit the `-O xattr=sa` above. Even if you do not want `xattr=sa` for the whole pool, it is probably fine to use it for `/var/log`. -* Make sure to include the `-part4` portion of the drive path. If you forget that, you are specifying the whole disk, which ZFS will then re-partition, and you will lose the bootloader partition(s). -* For LUKS, the key size chosen is 512 bits. However, XTS mode requires two keys, so the LUKS key is split in half. Thus, `-s 512` means AES-256. -* Your passphrase will likely be the weakest link. Choose wisely. See [section 5 of the cryptsetup FAQ](https://gitlab.com/cryptsetup/cryptsetup/wikis/FrequentlyAskedQuestions#5-security-aspects) for guidance. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part4 /dev/disk/by-id/scsi-SATA_disk2-part4` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). For LUKS, use `/dev/mapper/luks1`, `/dev/mapper/luks2`, etc., which you will have to create using `cryptsetup`. -* The pool name is arbitrary. If changed, the new name must be used consistently. On systems that can automatically install to ZFS, the root pool is named `rpool` by default. - -## Step 3: System Installation - -3.1 Create filesystem datasets to act as containers: - - # zfs create -o canmount=off -o mountpoint=none rpool/ROOT - # zfs create -o canmount=off -o mountpoint=none bpool/BOOT - -On Solaris systems, the root filesystem is cloned and the suffix is incremented for major system changes through `pkg image-update` or `beadm`. Similar functionality for APT is possible but currently unimplemented. Even without such a tool, it can still be used for manually created clones. - -3.2 Create filesystem datasets for the root and boot filesystems: - - # zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian - # zfs mount rpool/ROOT/debian - - # zfs create -o canmount=noauto -o mountpoint=/boot bpool/BOOT/debian - # zfs mount bpool/BOOT/debian - -With ZFS, it is not normally necessary to use a mount command (either `mount` or `zfs mount`). This situation is an exception because of `canmount=noauto`. 
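If you want to double-check before continuing, the following should report `canmount=noauto` and `mounted=yes` for both filesystems (dataset names as used in this guide):

    # zfs get canmount,mounted rpool/ROOT/debian bpool/BOOT/debian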
- -3.3 Create datasets: - - # zfs create rpool/home - # zfs create -o mountpoint=/root rpool/home/root - # zfs create -o canmount=off rpool/var - # zfs create -o canmount=off rpool/var/lib - # zfs create rpool/var/log - # zfs create rpool/var/spool - - The datasets below are optional, depending on your preferences and/or - software choices: - - If you wish to exclude these from snapshots: - # zfs create -o com.sun:auto-snapshot=false rpool/var/cache - # zfs create -o com.sun:auto-snapshot=false rpool/var/tmp - # chmod 1777 /mnt/var/tmp - - If you use /opt on this system: - # zfs create rpool/opt - - If you use /srv on this system: - # zfs create rpool/srv - - If you use /usr/local on this system: - # zfs create -o canmount=off rpool/usr - # zfs create rpool/usr/local - - If this system will have games installed: - # zfs create rpool/var/games - - If this system will store local email in /var/mail: - # zfs create rpool/var/mail - - If this system will use Snap packages: - # zfs create rpool/var/snap - - If you use /var/www on this system: - # zfs create rpool/var/www - - If this system will use GNOME: - # zfs create rpool/var/lib/AccountsService - - If this system will use Docker (which manages its own datasets & snapshots): - # zfs create -o com.sun:auto-snapshot=false rpool/var/lib/docker - - If this system will use NFS (locking): - # zfs create -o com.sun:auto-snapshot=false rpool/var/lib/nfs - - A tmpfs is recommended later, but if you want a separate dataset for /tmp: - # zfs create -o com.sun:auto-snapshot=false rpool/tmp - # chmod 1777 /mnt/tmp - -The primary goal of this dataset layout is to separate the OS from user data. This allows the root filesystem to be rolled back without rolling back user data such as logs (in `/var/log`). This will be especially important if/when a `beadm` or similar utility is integrated. The `com.sun.auto-snapshot` setting is used by some ZFS snapshot utilities to exclude transient data. - -If you do nothing extra, `/tmp` will be stored as part of the root filesystem. Alternatively, you can create a separate dataset for `/tmp`, as shown above. This keeps the `/tmp` data out of snapshots of your root filesystem. It also allows you to set a quota on `rpool/tmp`, if you want to limit the maximum space used. Otherwise, you can use a tmpfs (RAM filesystem) later. - -3.4 Install the minimal system: - - # debootstrap stretch /mnt - # zfs set devices=off rpool - -The `debootstrap` command leaves the new system in an unconfigured state. An alternative to using `debootstrap` is to copy the entirety of a working system into the new ZFS root. - -## Step 4: System Configuration - -4.1 Configure the hostname (change `HOSTNAME` to the desired hostname). - - # echo HOSTNAME > /mnt/etc/hostname - - # vi /mnt/etc/hosts - Add a line: - 127.0.1.1 HOSTNAME - or if the system has a real name in DNS: - 127.0.1.1 FQDN HOSTNAME - -**Hint:** Use `nano` if you find `vi` confusing. - -4.2 Configure the network interface: - - Find the interface name: - # ip addr show - - # vi /mnt/etc/network/interfaces.d/NAME - auto NAME - iface NAME inet dhcp - -Customize this file if the system is not a DHCP client. 
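For a static configuration, the stanza would instead look something like this (the addresses are examples only; substitute your interface name and network details):

    # vi /mnt/etc/network/interfaces.d/NAME
    auto NAME
    iface NAME inet static
        address 192.0.2.10
        netmask 255.255.255.0
        gateway 192.0.2.1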
- -4.3 Configure the package sources: - - # vi /mnt/etc/apt/sources.list - deb http://deb.debian.org/debian stretch main contrib - deb-src http://deb.debian.org/debian stretch main contrib - - # vi /mnt/etc/apt/sources.list.d/stretch-backports.list - deb http://deb.debian.org/debian stretch-backports main contrib - deb-src http://deb.debian.org/debian stretch-backports main contrib - - # vi /mnt/etc/apt/preferences.d/90_zfs - Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux spl-dkms zfs-dkms zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed - Pin: release n=stretch-backports - Pin-Priority: 990 - -4.4 Bind the virtual filesystems from the LiveCD environment to the new system and `chroot` into it: - - # mount --rbind /dev /mnt/dev - # mount --rbind /proc /mnt/proc - # mount --rbind /sys /mnt/sys - # chroot /mnt /bin/bash --login - -**Note:** This is using `--rbind`, not `--bind`. - -4.5 Configure a basic system environment: - - # ln -s /proc/self/mounts /etc/mtab - # apt update - - # apt install --yes locales - # dpkg-reconfigure locales - -Even if you prefer a non-English system language, always ensure that `en_US.UTF-8` is available. - - # dpkg-reconfigure tzdata - -4.6 Install ZFS in the chroot environment for the new system: - - # apt install --yes dpkg-dev linux-headers-amd64 linux-image-amd64 - # apt install --yes zfs-initramfs - -4.7 For LUKS installs only, setup crypttab: - - # apt install --yes cryptsetup - - # echo luks1 UUID=$(blkid -s UUID -o value \ - /dev/disk/by-id/scsi-SATA_disk1-part4) none \ - luks,discard,initramfs > /etc/crypttab - -* The use of `initramfs` is a work-around for [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -**Hint:** If you are creating a mirror or raidz topology, repeat the `/etc/crypttab` entries for `luks2`, etc. adjusting for each disk. - -4.8 Install GRUB - -Choose one of the following options: - -4.8a Install GRUB for legacy (BIOS) booting - - # apt install --yes grub-pc - -Install GRUB to the disk(s), not the partition(s). - -4.8b Install GRUB for UEFI booting - - # apt install dosfstools - # mkdosfs -F 32 -s 1 -n EFI /dev/disk/by-id/scsi-SATA_disk1-part2 - # mkdir /boot/efi - # echo PARTUUID=$(blkid -s PARTUUID -o value \ - /dev/disk/by-id/scsi-SATA_disk1-part2) \ - /boot/efi vfat nofail,x-systemd.device-timeout=1 0 1 >> /etc/fstab - # mount /boot/efi - # apt install --yes grub-efi-amd64 shim - -* The `-s 1` for `mkdosfs` is only necessary for drives which present 4 KiB logical sectors (“4Kn” drives) to meet the minimum cluster size (given the partition size of 512 MiB) for FAT32. It also works fine on drives which present 512 B sectors. - -**Note:** If you are creating a mirror or raidz topology, this step only installs GRUB on the first disk. The other disk(s) will be handled later. - -4.9 Set a root password - - # passwd - -4.10 Enable importing bpool - -This ensures that `bpool` is always imported, regardless of whether `/etc/zfs/zpool.cache` exists, whether it is in the cachefile or not, or whether `zfs-import-scan.service` is enabled. 
-``` - # vi /etc/systemd/system/zfs-import-bpool.service - [Unit] - DefaultDependencies=no - Before=zfs-import-scan.service - Before=zfs-import-cache.service - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=/sbin/zpool import -N -o cachefile=none bpool - - [Install] - WantedBy=zfs-import.target - - # systemctl enable zfs-import-bpool.service -``` - -4.11 Optional (but recommended): Mount a tmpfs to /tmp - -If you chose to create a `/tmp` dataset above, skip this step, as they are mutually exclusive choices. Otherwise, you can put `/tmp` on a tmpfs (RAM filesystem) by enabling the `tmp.mount` unit. - - # cp /usr/share/systemd/tmp.mount /etc/systemd/system/ - # systemctl enable tmp.mount - -4.12 Optional (but kindly requested): Install popcon - -The `popularity-contest` package reports the list of packages install on your system. Showing that ZFS is popular may be helpful in terms of long-term attention from the distro. - - # apt install --yes popularity-contest - -Choose Yes at the prompt. - -## Step 5: GRUB Installation - -5.1 Verify that the ZFS boot filesystem is recognized: - - # grub-probe /boot - zfs - -5.2 Refresh the initrd files: - - # update-initramfs -u -k all - update-initramfs: Generating /boot/initrd.img-4.9.0-8-amd64 - -**Note:** When using LUKS, this will print "WARNING could not determine root device from /etc/fstab". This is because [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -5.3 Workaround GRUB's missing zpool-features support: - - # vi /etc/default/grub - Set: GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian" - -5.4 Optional (but highly recommended): Make debugging GRUB easier: - - # vi /etc/default/grub - Remove quiet from: GRUB_CMDLINE_LINUX_DEFAULT - Uncomment: GRUB_TERMINAL=console - Save and quit. - -Later, once the system has rebooted twice and you are sure everything is working, you can undo these changes, if desired. - -5.5 Update the boot configuration: - - # update-grub - Generating grub configuration file ... - Found linux image: /boot/vmlinuz-4.9.0-8-amd64 - Found initrd image: /boot/initrd.img-4.9.0-8-amd64 - done - -**Note:** Ignore errors from `osprober`, if present. - -5.6 Install the boot loader - -5.6a For legacy (BIOS) booting, install GRUB to the MBR: - - # grub-install /dev/disk/by-id/scsi-SATA_disk1 - Installing for i386-pc platform. - Installation finished. No error reported. - -Do not reboot the computer until you get exactly that result message. Note that you are installing GRUB to the whole disk, not a partition. - -If you are creating a mirror or raidz topology, repeat the `grub-install` command for each disk in the pool. - -5.6b For UEFI booting, install GRUB: - - # grub-install --target=x86_64-efi --efi-directory=/boot/efi \ - --bootloader-id=debian --recheck --no-floppy - -5.7 Verify that the ZFS module is installed: - - # ls /boot/grub/*/zfs.mod - -5.8 Fix filesystem mount ordering - -[Until ZFS gains a systemd mount generator](https://github.com/zfsonlinux/zfs/issues/4898), there are races between mounting filesystems and starting certain daemons. In practice, the issues (e.g. [#5754](https://github.com/zfsonlinux/zfs/issues/5754)) seem to be with certain filesystems in `/var`, specifically `/var/log` and `/var/tmp`. Setting these to use `legacy` mounting, and listing them in `/etc/fstab` makes systemd aware that these are separate mountpoints. 
In turn, `rsyslog.service` depends on `var-log.mount` by way of `local-fs.target` and services using the `PrivateTmp` feature of systemd automatically use `After=var-tmp.mount`. - -Until there is support for mounting `/boot` in the initramfs, we also need to mount that, because it was marked `canmount=noauto`. Also, with UEFI, we need to ensure it is mounted before its child filesystem `/boot/efi`. - -`rpool` is guaranteed to be imported by the initramfs, so there is no point in adding `x-systemd.requires=zfs-import.target` to those filesystems. - - - For UEFI booting, unmount /boot/efi first: - # umount /boot/efi - - Everything else applies to both BIOS and UEFI booting: - - # zfs set mountpoint=legacy bpool/BOOT/debian - # echo bpool/BOOT/debian /boot zfs \ - nodev,relatime,x-systemd.requires=zfs-import-bpool.service 0 0 >> /etc/fstab - - # zfs set mountpoint=legacy rpool/var/log - # echo rpool/var/log /var/log zfs nodev,relatime 0 0 >> /etc/fstab - - # zfs set mountpoint=legacy rpool/var/spool - # echo rpool/var/spool /var/spool zfs nodev,relatime 0 0 >> /etc/fstab - - If you created a /var/tmp dataset: - # zfs set mountpoint=legacy rpool/var/tmp - # echo rpool/var/tmp /var/tmp zfs nodev,relatime 0 0 >> /etc/fstab - - If you created a /tmp dataset: - # zfs set mountpoint=legacy rpool/tmp - # echo rpool/tmp /tmp zfs nodev,relatime 0 0 >> /etc/fstab - -## Step 6: First Boot - -6.1 Snapshot the initial installation: - - # zfs snapshot bpool/BOOT/debian@install - # zfs snapshot rpool/ROOT/debian@install - -In the future, you will likely want to take snapshots before each upgrade, and remove old snapshots (including this one) at some point to save space. - -6.2 Exit from the `chroot` environment back to the LiveCD environment: - - # exit - -6.3 Run these commands in the LiveCD environment to unmount all filesystems: - - # mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - # zpool export -a - -6.4 Reboot: - - # reboot - -6.5 Wait for the newly installed system to boot normally. Login as root. - -6.6 Create a user account: - - # zfs create rpool/home/YOURUSERNAME - # adduser YOURUSERNAME - # cp -a /etc/skel/.[!.]* /home/YOURUSERNAME - # chown -R YOURUSERNAME:YOURUSERNAME /home/YOURUSERNAME - -6.7 Add your user account to the default set of groups for an administrator: - - # usermod -a -G audio,cdrom,dip,floppy,netdev,plugdev,sudo,video YOURUSERNAME - -6.8 Mirror GRUB - -If you installed to multiple disks, install GRUB on the additional disks: - -6.8a For legacy (BIOS) booting: - - # dpkg-reconfigure grub-pc - Hit enter until you get to the device selection screen. - Select (using the space bar) all of the disks (not partitions) in your pool. - -6.8b UEFI - - # umount /boot/efi - - For the second and subsequent disks (increment debian-2 to -3, etc.): - # dd if=/dev/disk/by-id/scsi-SATA_disk1-part2 \ - of=/dev/disk/by-id/scsi-SATA_disk2-part2 - # efibootmgr -c -g -d /dev/disk/by-id/scsi-SATA_disk2 \ - -p 2 -L "debian-2" -l '\EFI\debian\grubx64.efi' - - # mount /boot/efi - -## Step 7: (Optional) Configure Swap - -**Caution**: On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. 
This issue is currently being investigated in: https://github.com/zfsonlinux/zfs/issues/7734 - -7.1 Create a volume dataset (zvol) for use as a swap device: - - # zfs create -V 4G -b $(getconf PAGESIZE) -o compression=zle \ - -o logbias=throughput -o sync=always \ - -o primarycache=metadata -o secondarycache=none \ - -o com.sun:auto-snapshot=false rpool/swap - -You can adjust the size (the `4G` part) to your needs. - -The compression algorithm is set to `zle` because it is the cheapest available algorithm. As this guide recommends `ashift=12` (4 kiB blocks on disk), the common case of a 4 kiB page size means that no compression algorithm can reduce I/O. The exception is all-zero pages, which are dropped by ZFS; but some form of compression has to be enabled to get this behavior. - -7.2 Configure the swap device: - -**Caution**: Always use long `/dev/zvol` aliases in configuration files. Never use a short `/dev/zdX` device name. - - # mkswap -f /dev/zvol/rpool/swap - # echo /dev/zvol/rpool/swap none swap discard 0 0 >> /etc/fstab - # echo RESUME=none > /etc/initramfs-tools/conf.d/resume - -The `RESUME=none` is necessary to disable resuming from hibernation. This does not work, as the zvol is not present (because the pool has not yet been imported) at the time the resume script runs. If it is not disabled, the boot process hangs for 30 seconds waiting for the swap zvol to appear. - -7.3 Enable the swap device: - - # swapon -av - -## Step 8: Full Software Installation - -8.1 Upgrade the minimal system: - - # apt dist-upgrade --yes - -8.2 Install a regular set of software: - - # tasksel - -8.3 Optional: Disable log compression: - -As `/var/log` is already compressed by ZFS, logrotate’s compression is going to burn CPU and disk I/O for (in most cases) very little gain. Also, if you are making snapshots of `/var/log`, logrotate’s compression will actually waste space, as the uncompressed data will live on in the snapshot. You can edit the files in `/etc/logrotate.d` by hand to comment out `compress`, or use this loop (copy-and-paste highly recommended): - - # for file in /etc/logrotate.d/* ; do - if grep -Eq "(^|[^#y])compress" "$file" ; then - sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$file" - fi - done - -8.4 Reboot: - - # reboot - -### Step 9: Final Cleanup - -9.1 Wait for the system to boot normally. Login using the account you created. Ensure the system (including networking) works normally. - -9.2 Optional: Delete the snapshots of the initial installation: - - $ sudo zfs destroy bpool/BOOT/debian@install - $ sudo zfs destroy rpool/ROOT/debian@install - -9.3 Optional: Disable the root password - - $ sudo usermod -p '*' root - -9.4 Optional: Re-enable the graphical boot process: - -If you prefer the graphical boot process, you can re-enable it now. If you are using LUKS, it makes the prompt look nicer. - - $ sudo vi /etc/default/grub - Add quiet to GRUB_CMDLINE_LINUX_DEFAULT - Comment out GRUB_TERMINAL=console - Save and quit. - - $ sudo update-grub - -**Note:** Ignore errors from `osprober`, if present. - -9.5 Optional: For LUKS installs only, backup the LUKS header: - - $ sudo cryptsetup luksHeaderBackup /dev/disk/by-id/scsi-SATA_disk1-part4 \ - --header-backup-file luks1-header.dat - -Store that backup somewhere safe (e.g. cloud storage). It is protected by your LUKS passphrase, but you may wish to use additional encryption. - -**Hint:** If you created a mirror or raidz topology, repeat this for each LUKS volume (`luks2`, etc.). 
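For example, for the second disk of a mirror the command would look like this (adjust the device name to match your hardware):

    $ sudo cryptsetup luksHeaderBackup /dev/disk/by-id/scsi-SATA_disk2-part4 \
        --header-backup-file luks2-header.dat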
- -## Troubleshooting - -### Rescuing using a Live CD - -Go through [Step 1: Prepare The Install Environment](#step-1-prepare-the-install-environment). - -This will automatically import your pool. Export it and re-import it to get the mounts right: - - For LUKS, first unlock the disk(s): - # apt install --yes cryptsetup - # cryptsetup luksOpen /dev/disk/by-id/scsi-SATA_disk1-part4 luks1 - Repeat for additional disks, if this is a mirror or raidz topology. - - # zpool export -a - # zpool import -N -R /mnt rpool - # zpool import -N -R /mnt bpool - # zfs mount rpool/ROOT/debian - # zfs mount -a - -If needed, you can chroot into your installed environment: - - # mount --rbind /dev /mnt/dev - # mount --rbind /proc /mnt/proc - # mount --rbind /sys /mnt/sys - # chroot /mnt /bin/bash --login - # mount /boot - # mount -a - -Do whatever you need to do to fix your system. - -When done, cleanup: - - # exit - # mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - # zpool export -a - # reboot - -### MPT2SAS - -Most problem reports for this tutorial involve `mpt2sas` hardware that does slow asynchronous drive initialization, like some IBM M1015 or OEM-branded cards that have been flashed to the reference LSI firmware. - -The basic problem is that disks on these controllers are not visible to the Linux kernel until after the regular system is started, and ZoL does not hotplug pool members. See https://github.com/zfsonlinux/zfs/issues/330. - -Most LSI cards are perfectly compatible with ZoL. If your card has this glitch, try setting ZFS_INITRD_PRE_MOUNTROOT_SLEEP=X in /etc/default/zfs. The system will wait X seconds for all drives to appear before importing the pool. - -### Areca - -Systems that require the `arcsas` blob driver should add it to the `/etc/initramfs-tools/modules` file and run `update-initramfs -u -k all`. - -Upgrade or downgrade the Areca driver if something like `RIP: 0010:[] [] native_read_tsc+0x6/0x20` appears anywhere in kernel log. ZoL is unstable on systems that emit this error message. - -### VMware - -* Set `disk.EnableUUID = "TRUE"` in the vmx file or vsphere configuration. Doing this ensures that `/dev/disk` aliases are created in the guest. - -### QEMU/KVM/XEN - -Set a unique serial number on each virtual disk using libvirt or qemu (e.g. `-drive if=none,id=disk1,file=disk1.qcow2,serial=1234567890`). - -To be able to use UEFI in guests (instead of only BIOS booting), run this on the host: - - $ sudo apt install ovmf - $ sudo vi /etc/libvirt/qemu.conf - Uncomment these lines: - nvram = [ - "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd", - "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd" - ] - $ sudo service libvirt-bin restart +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debian.md b/Debian.md index 48b8525..1a386a9 100644 --- a/Debian.md +++ b/Debian.md @@ -1,42 +1,4 @@ -Offical ZFS on Linux [DKMS](https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support) style packages are available from the [Debian GNU/Linux repository](https://tracker.debian.org/pkg/zfs-linux) for the following configurations. The packages previously hosted at archive.zfsonlinux.org will not be updated and are not recommended for new installations. 
-**Debian Releases:** Stretch (oldstable), Buster (stable), and newer (testing, sid) -**Architectures:** amd64 +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/index.html -# Table of contents -- [Installation](#installation) -- [Related Links](#related-links) - -## Installation -For Debian Buster, ZFS packages are included in the [contrib repository](https://packages.debian.org/source/buster/zfs-linux). - -If you want to boot from ZFS, see [[Debian Buster Root on ZFS]] instead. For troubleshooting existing installations on Stretch, see [[Debian Stretch Root on ZFS]]. - -The [backports repository](https://backports.debian.org/Instructions/) often provides newer releases of ZFS. You can use it as follows: - -Add the backports repository: - - # vi /etc/apt/sources.list.d/buster-backports.list - deb http://deb.debian.org/debian buster-backports main contrib - deb-src http://deb.debian.org/debian buster-backports main contrib - - # vi /etc/apt/preferences.d/90_zfs - Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux spl-dkms zfs-dkms zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed - Pin: release n=buster-backports - Pin-Priority: 990 - -Update the list of packages: - - # apt update - -Install the kernel headers and other dependencies: - - # apt install --yes dpkg-dev linux-headers-$(uname -r) linux-image-amd64 - -Install the zfs packages: - - # apt-get install zfs-dkms zfsutils-linux - -## Related Links -- [[Debian GNU Linux initrd documentation]] -- [[Debian Buster Root on ZFS]] +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Debugging.md b/Debugging.md deleted file mode 100644 index 1daa55e..0000000 --- a/Debugging.md +++ /dev/null @@ -1 +0,0 @@ -The future home for documenting ZFS on Linux development and debugging techniques. 
\ No newline at end of file diff --git a/Developer-Resources.md b/Developer-Resources.md index dce0822..1906979 100644 --- a/Developer-Resources.md +++ b/Developer-Resources.md @@ -1,16 +1,3 @@ -# Developer Resources +This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/index.html -[[Custom Packages]] -[[Building ZFS]] -[Buildbot Status][buildbot-status] -[Buildbot Options][control-buildbot] -[OpenZFS Tracking][openzfs-tracking] -[[OpenZFS Patches]] -[[OpenZFS Exceptions]] -[OpenZFS Documentation][openzfs-devel] -[[Git and GitHub for beginners]] - -[openzfs-devel]: http://open-zfs.org/wiki/Developer_resources -[openzfs-tracking]: http://build.zfsonlinux.org/openzfs-tracking.html -[buildbot-status]: http://build.zfsonlinux.org/tgrid?length=100&branch=master&category=Tests&rev_order=desc -[control-buildbot]: https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/FAQ.md b/FAQ.md index c7f2a32..9ae84e8 100644 --- a/FAQ.md +++ b/FAQ.md @@ -1,418 +1,4 @@ -## Table Of Contents -- [What is ZFS on Linux](#what-is-zfs-on-linux) -- [Hardware Requirements](#hardware-requirements) -- [Do I have to use ECC memory for ZFS?](#do-i-have-to-use-ecc-memory-for-zfs) -- [Installation](#installation) -- [Supported Architectures](#supported-architectures) -- [Supported Kernels](#supported-kernels) -- [32-bit vs 64-bit Systems](#32-bit-vs-64-bit-systems) -- [Booting from ZFS](#booting-from-zfs) -- [Selecting /dev/ names when creating a pool](#selecting-dev-names-when-creating-a-pool) -- [Setting up the /etc/zfs/vdev_id.conf file](#setting-up-the-etczfsvdev_idconf-file) -- [Changing /dev/ names on an existing pool](#changing-dev-names-on-an-existing-pool) -- [The /etc/zfs/zpool.cache file](#the-etczfszpoolcache-file) -- [Generating a new /etc/zfs/zpool.cache file](#generating-a-new-etczfszpoolcache-file) -- [Sending and Receiving Streams](#sending-and-receiving-streams) - * [hole_birth Bugs](#hole_birth-bugs) - * [Sending Large Blocks](#sending-large-blocks) -- [CEPH/ZFS](#cephzfs) - * [ZFS Configuration](#zfs-configuration) - * [CEPH Configuration (ceph.conf}](#ceph-configuration-cephconf) - * [Other General Guidelines](#other-general-guidelines) -- [Performance Considerations](#performance-considerations) -- [Advanced Format Disks](#advanced-format-disks) -- [ZVOL used space larger than expected](#ZVOL-used-space-larger-than-expected) -- [Using a zvol for a swap device](#using-a-zvol-for-a-swap-device) -- [Using ZFS on Xen Hypervisor or Xen Dom0](#using-zfs-on-xen-hypervisor-or-xen-dom0) -- [udisks2 creates /dev/mapper/ entries for zvol](#udisks2-creating-devmapper-entries-for-zvol) -- [Licensing](#licensing) -- [Reporting a problem](#reporting-a-problem) -- [Does ZFS on Linux have a Code of Conduct?](#does-zfs-on-linux-have-a-code-of-conduct) - -## What is ZFS on Linux -The ZFS on Linux project is an implementation of [OpenZFS][OpenZFS] designed to work in a Linux environment. OpenZFS is an outstanding storage platform that encompasses the functionality of traditional filesystems, volume managers, and more, with consistent reliability, functionality and performance across all distributions. Additional information about OpenZFS can be found in the [OpenZFS wikipedia article][wikipedia]. 
+This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html -## Hardware Requirements - -Because ZFS was originally designed for Sun Solaris it was long considered a filesystem for large servers and for companies that could afford the best and most powerful hardware available. But since the porting of ZFS to numerous OpenSource platforms (The BSDs, Illumos and Linux - under the umbrella organization "OpenZFS"), these requirements have been lowered. - -The suggested hardware requirements are: - * ECC memory. This isn't really a requirement, but it's highly recommended. - * 8GB+ of memory for the best performance. It's perfectly possible to run with 2GB or less (and people do), but you'll need more if using deduplication. - -## Do I have to use ECC memory for ZFS? - -Using ECC memory for OpenZFS is strongly recommended for enterprise environments where the strongest data integrity guarantees are required. Without ECC memory rare random bit flips caused by cosmic rays or by faulty memory can go undetected. If this were to occur OpenZFS (or any other filesystem) will write the damaged data to disk and be unable to automatically detect the corruption. - -Unfortunately, ECC memory is not always supported by consumer grade hardware. And even when it is ECC memory will be more expensive. For home users the additional safety brought by ECC memory might not justify the cost. It's up to you to determine what level of protection your data requires. - -## Installation - -ZFS on Linux is available for all major Linux distributions. Refer to the [[getting started]] section of the wiki for links to installations instructions for many popular distributions. If your distribution isn't listed you can always build ZFS on Linux from the latest official [tarball][releases]. - -## Supported Architectures - -ZFS on Linux is regularly compiled for the following architectures: x86_64, x86, aarch64, arm, ppc64, ppc. - -## Supported Kernels - -The [notes][releases] for a given ZFS on Linux release will include a range of supported kernels. Point releases will be tagged as needed in order to support the *stable* kernel available from [kernel.org][kernel]. The oldest supported kernel is 2.6.32 due to its prominence in Enterprise Linux distributions. - -## 32-bit vs 64-bit Systems - -You are **strongly** encouraged to use a 64-bit kernel. ZFS on Linux will build for 32-bit kernels but you may encounter stability problems. - -ZFS was originally developed for the Solaris kernel which differs from the Linux kernel in several significant ways. Perhaps most importantly for ZFS it is common practice in the Solaris kernel to make heavy use of the virtual address space. However, use of the virtual address space is strongly discouraged in the Linux kernel. This is particularly true on 32-bit architectures where the virtual address space is limited to 100M by default. Using the virtual address space on 64-bit Linux kernels is also discouraged but the address space is so much larger than physical memory it is less of an issue. - -If you are bumping up against the virtual memory limit on a 32-bit system you will see the following message in your system logs. You can increase the virtual address size with the boot option `vmalloc=512M`. - -``` -vmap allocation for size 4198400 failed: use vmalloc= to increase size. -``` - -However, even after making this change your system will likely not be entirely stable. 
Proper support for 32-bit systems is contingent upon the OpenZFS code being weaned off its dependence on virtual memory. This will take some time to do correctly but it is planned for OpenZFS. This change is also expected to improve how efficiently OpenZFS manages the ARC cache and allow for tighter integration with the standard Linux page cache. - -## Booting from ZFS - -Booting from ZFS on Linux is possible and many people do it. There are excellent walk throughs available for [[Debian]], [[Ubuntu]] and [Gentoo][gentoo-root]. - -## Selecting /dev/ names when creating a pool - -There are different /dev/ names that can be used when creating a ZFS pool. Each option has advantages and drawbacks, the right choice for your ZFS pool really depends on your requirements. For development and testing using /dev/sdX naming is quick and easy. A typical home server might prefer /dev/disk/by-id/ naming for simplicity and readability. While very large configurations with multiple controllers, enclosures, and switches will likely prefer /dev/disk/by-vdev naming for maximum control. But in the end, how you choose to identify your disks is up to you. - -* **/dev/sdX, /dev/hdX:** Best for development/test pools - * Summary: The top level /dev/ names are the default for consistency with other ZFS implementations. They are available under all Linux distributions and are commonly used. However, because they are not persistent they should only be used with ZFS for development/test pools. - * Benefits:This method is easy for a quick test, the names are short, and they will be available on all Linux distributions. - * Drawbacks:The names are not persistent and will change depending on what order they disks are detected in. Adding or removing hardware for your system can easily cause the names to change. You would then need to remove the zpool.cache file and re-import the pool using the new names. - * Example: `zpool create tank sda sdb` - -* **/dev/disk/by-id/:** Best for small pools (less than 10 disks) - * Summary: This directory contains disk identifiers with more human readable names. The disk identifier usually consists of the interface type, vendor name, model number, device serial number, and partition number. This approach is more user friendly because it simplifies identifying a specific disk. - * Benefits: Nice for small systems with a single disk controller. Because the names are persistent and guaranteed not to change, it doesn't matter how the disks are attached to the system. You can take them all out, randomly mixed them up on the desk, put them back anywhere in the system and your pool will still be automatically imported correctly. - * Drawbacks: Configuring redundancy groups based on physical location becomes difficult and error prone. - * Example: `zpool create tank scsi-SATA_Hitachi_HTS7220071201DP1D10DGG6HMRP` - -* **/dev/disk/by-path/:** Good for large pools (greater than 10 disks) - * Summary: This approach is to use device names which include the physical cable layout in the system, which means that a particular disk is tied to a specific location. The name describes the PCI bus number, as well as enclosure names and port numbers. This allows the most control when configuring a large pool. - * Benefits: Encoding the storage topology in the name is not only helpful for locating a disk in large installations. But it also allows you to explicitly layout your redundancy groups over multiple adapters or enclosures. 
- * Drawbacks: These names are long, cumbersome, and difficult for a human to manage. - * Example: `zpool create tank pci-0000:00:1f.2-scsi-0:0:0:0 pci-0000:00:1f.2-scsi-1:0:0:0` - -* **/dev/disk/by-vdev/:** Best for large pools (greater than 10 disks) - * Summary: This approach provides administrative control over device naming using the configuration file /etc/zfs/vdev_id.conf. Names for disks in JBODs can be generated automatically to reflect their physical location by enclosure IDs and slot numbers. The names can also be manually assigned based on existing udev device links, including those in /dev/disk/by-path or /dev/disk/by-id. This allows you to pick your own unique meaningful names for the disks. These names will be displayed by all the zfs utilities so it can be used to clarify the administration of a large complex pool. See the vdev_id and vdev_id.conf man pages for further details. - * Benefits: The main benefit of this approach is that it allows you to choose meaningful human-readable names. Beyond that, the benefits depend on the naming method employed. If the names are derived from the physical path the benefits of /dev/disk/by-path are realized. On the other hand, aliasing the names based on drive identifiers or WWNs has the same benefits as using /dev/disk/by-id. - * Drawbacks: This method relies on having a /etc/zfs/vdev_id.conf file properly configured for your system. To configure this file please refer to section [Setting up the /etc/zfs/vdev_id.conf file](#setting-up-the-etczfsvdev_idconf-file). As with benefits, the drawbacks of /dev/disk/by-id or /dev/disk/by-path may apply depending on the naming method employed. - * Example: `zpool create tank mirror A1 B1 mirror A2 B2` - -## Setting up the /etc/zfs/vdev_id.conf file - -In order to use /dev/disk/by-vdev/ naming the `/etc/zfs/vdev_id.conf` must be configured. The format of this file is described in the vdev_id.conf man page. Several examples follow. - -A non-multipath configuration with direct-attached SAS enclosures and an arbitrary slot re-mapping. - -``` - multipath no - topology sas_direct - phys_per_port 4 - - # PCI_SLOT HBA PORT CHANNEL NAME - channel 85:00.0 1 A - channel 85:00.0 0 B - - # Linux Mapped - # Slot Slot - slot 0 2 - slot 1 6 - slot 2 0 - slot 3 3 - slot 4 5 - slot 5 7 - slot 6 4 - slot 7 1 -``` - -A SAS-switch topology. Note that the channel keyword takes only two arguments in this example. - -``` - topology sas_switch - - # SWITCH PORT CHANNEL NAME - channel 1 A - channel 2 B - channel 3 C - channel 4 D -``` - -A multipath configuration. Note that channel names have multiple definitions - one per physical path. - -``` - multipath yes - - # PCI_SLOT HBA PORT CHANNEL NAME - channel 85:00.0 1 A - channel 85:00.0 0 B - channel 86:00.0 1 A - channel 86:00.0 0 B -``` - -A configuration using device link aliases. - -``` - # by-vdev - # name fully qualified or base name of device link - alias d1 /dev/disk/by-id/wwn-0x5000c5002de3b9ca - alias d2 wwn-0x5000c5002def789e -``` - -After defining the new disk names run `udevadm trigger` to prompt udev to parse the configuration file. This will result in a new /dev/disk/by-vdev directory which is populated with symlinks to /dev/sdX names. 
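As a quick sanity check (a sketch; the alias names are whatever you defined in your vdev_id.conf), you can list the new directory and confirm each alias points at the expected disk:

```
$ ls -l /dev/disk/by-vdev/
```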
Following the first example above, you could then create the new pool of mirrors with the following command: - -``` -$ zpool create tank \ - mirror A0 B0 mirror A1 B1 mirror A2 B2 mirror A3 B3 \ - mirror A4 B4 mirror A5 B5 mirror A6 B6 mirror A7 B7 - -$ zpool status - pool: tank - state: ONLINE - scan: none requested -config: - - NAME STATE READ WRITE CKSUM - tank ONLINE 0 0 0 - mirror-0 ONLINE 0 0 0 - A0 ONLINE 0 0 0 - B0 ONLINE 0 0 0 - mirror-1 ONLINE 0 0 0 - A1 ONLINE 0 0 0 - B1 ONLINE 0 0 0 - mirror-2 ONLINE 0 0 0 - A2 ONLINE 0 0 0 - B2 ONLINE 0 0 0 - mirror-3 ONLINE 0 0 0 - A3 ONLINE 0 0 0 - B3 ONLINE 0 0 0 - mirror-4 ONLINE 0 0 0 - A4 ONLINE 0 0 0 - B4 ONLINE 0 0 0 - mirror-5 ONLINE 0 0 0 - A5 ONLINE 0 0 0 - B5 ONLINE 0 0 0 - mirror-6 ONLINE 0 0 0 - A6 ONLINE 0 0 0 - B6 ONLINE 0 0 0 - mirror-7 ONLINE 0 0 0 - A7 ONLINE 0 0 0 - B7 ONLINE 0 0 0 - -errors: No known data errors -``` - -## Changing /dev/ names on an existing pool - -Changing the /dev/ names on an existing pool can be done by simply exporting the pool and re-importing it with the -d option to specify which new names should be used. For example, to use the custom names in /dev/disk/by-vdev: - -``` -$ zpool export tank -$ zpool import -d /dev/disk/by-vdev tank -``` - -## The /etc/zfs/zpool.cache file - -Whenever a pool is imported on the system it will be added to the `/etc/zfs/zpool.cache file`. This file stores pool configuration information, such as the device names and pool state. If this file exists when running the `zpool import` command then it will be used to determine the list of pools available for import. When a pool is not listed in the cache file it will need to be detected and imported using the `zpool import -d /dev/disk/by-id` command. - -## Generating a new /etc/zfs/zpool.cache file - -The `/etc/zfs/zpool.cache` file will be automatically updated when your pool configuration is changed. However, if for some reason it becomes stale you can force the generation of a new `/etc/zfs/zpool.cache` file by setting the cachefile property on the pool. - -``` -$ zpool set cachefile=/etc/zfs/zpool.cache tank -``` - -Conversely the cache file can be disabled by setting `cachefile=none`. This is useful for failover configurations where the pool should always be explicitly imported by the failover software. - -``` -$ zpool set cachefile=none tank -``` -## Sending and Receiving Streams - -### hole_birth Bugs - -The hole_birth feature has/had bugs, the result of which is that, if you do a `zfs send -i` (or `-R`, since it uses `-i`) from an affected dataset, the receiver *will not see any checksum or other errors, but will not match the source*. - -ZoL versions 0.6.5.8 and 0.7.0-rc1 (and above) default to ignoring the faulty metadata which causes this issue *on the sender side*. - -For more details, see the [[hole_birth FAQ]]. - -### Sending Large Blocks - -When sending incremental streams which contain large blocks (>128K) the `--large-block` flag must be specified. Inconsist use of the flag between incremental sends can result in files being incorrectly zeroed when they are received. Raw encrypted send/recvs automatically imply the `--large-block` flag and are therefore unaffected. - -For more details, see [issue 6224](https://github.com/zfsonlinux/zfs/issues/6224). - - -## CEPH/ZFS - -There is a lot of tuning that can be done that's dependent on the workload that is being put on CEPH/ZFS, as well as some general guidelines. 
Some are as follows: - -### ZFS Configuration - -The CEPH filestore back-end relies heavily on xattrs; for optimal performance all CEPH workloads will benefit from the following ZFS dataset parameters: -* `xattr=sa` -* `dnodesize=auto` - -Beyond that, rbd/cephfs focused workloads typically benefit from a small recordsize (16K-128K), while objectstore/s3/rados focused workloads benefit from a large recordsize (128K-1M). - -### CEPH Configuration (ceph.conf) - -Additionally, CEPH sets various values internally for handling xattrs based on the underlying filesystem. As CEPH only officially supports/detects XFS and BTRFS, for all other filesystems it falls back to rather [limited "safe" values](https://github.com/ceph/ceph/blob/4fe7e2a458a1521839bc390c2e3233dd809ec3ac/src/common/config_opts.h#L1125-L1148). On newer releases the need for larger xattrs will prevent OSDs from even starting. - -The officially recommended workaround ([see here](http://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/#not-recommended)) has some severe downsides, and more specifically is geared toward filesystems with "limited" xattr support such as ext4. - -ZFS does not have an internal limit on xattr length, so we can treat it similarly to how CEPH treats XFS. We can set overrides for 3 internal values to the same as those used with XFS ([see here](https://github.com/ceph/ceph/blob/9b317f7322848802b3aab9fec3def81dddd4a49b/src/os/filestore/FileStore.cc#L5714-L5737) and [here](https://github.com/ceph/ceph/blob/4fe7e2a458a1521839bc390c2e3233dd809ec3ac/src/common/config_opts.h#L1125-L1148)) and allow it to be used without the severe limitations of the "official" workaround. - -``` -[osd] -filestore_max_inline_xattrs = 10 -filestore_max_inline_xattr_size = 65536 -filestore_max_xattr_value_size = 65536 -``` - -### Other General Guidelines - -* Use a separate journal device. Do not colocate the CEPH journal on a ZFS dataset if at all possible; this will quickly lead to terrible fragmentation, not to mention terrible performance upfront even before fragmentation (the CEPH journal does a dsync for every write). -* Use a SLOG device, even with a separate CEPH journal device. For some workloads, skipping SLOG and setting `logbias=throughput` may be acceptable. -* Use a high-quality SLOG/CEPH journal device; a consumer based SSD, or even NVMe, WILL NOT DO (Samsung 830, 840, 850, etc) for a variety of reasons. CEPH will kill them quickly, on top of the performance being quite low in this use. Generally recommended are [Intel DC S3610, S3700, S3710, P3600, P3700], or [Samsung SM853, SM863], or better. -* If using a high-quality SSD or NVMe device (as mentioned above), you CAN share the SLOG and CEPH journal on a single device to good effect. A ratio of 4 HDDs to 1 SSD (Intel DC S3710 200GB), with each SSD partitioned (remember to align!) into 4x10GB (for ZIL/SLOG) + 4x20GB (for CEPH journal), has been reported to work well. - -Again - CEPH + ZFS will KILL a consumer based SSD VERY quickly. Even ignoring the lack of power-loss protection and endurance ratings, you will be very disappointed with the performance of a consumer based SSD under such a workload. - -## Performance Considerations - -To achieve good performance with your pool there are some easy best practices you should follow. Additionally, it should be made clear that the ZFS on Linux implementation has not yet been optimized for performance. As the project matures we can expect performance to improve.
- -* **Evenly balance your disks across controllers:** Often the limiting factor for performance is not the disk but the controller. By balancing your disks evenly across controllers you can often improve throughput. -* **Create your pool using whole disks:** When running zpool create use whole disk names. This will allow ZFS to automatically partition the disk to ensure correct alignment. It will also improve interoperability with other OpenZFS implementations which honor the wholedisk property. -* **Have enough memory:** A minimum of 2GB of memory is recommended for ZFS. Additional memory is strongly recommended when the compression and deduplication features are enabled. -* **Improve performance by setting ashift=12:** You may be able to improve performance for some workloads by setting `ashift=12`. This tuning can only be set when block devices are first added to a pool, such as when the pool is first created or when a new vdev is added to the pool. This tuning parameter can result in a decrease of capacity for RAIDZ configurations. - -## Advanced Format Disks - -Advanced Format (AF) is a new disk format which natively uses a 4,096 byte, instead of 512 byte, sector size. To maintain compatibility with legacy systems many AF disks emulate a sector size of 512 bytes. By default, ZFS will automatically detect the sector size of the drive. This combination can result in poorly aligned disk accesses which will greatly degrade the pool performance. - -Therefore, the ability to set the ashift property has been added to the zpool command. This allows users to explicitly assign the sector size when devices are first added to a pool (typically at pool creation time or when adding a vdev to the pool). The ashift values range from 9 to 16 with the default value 0 meaning that zfs should auto-detect the sector size. This value is actually a bit shift value, so an ashift value for 512 bytes is 9 (2^9 = 512) while the ashift value for 4,096 bytes is 12 (2^12 = 4,096). - -To force the pool to use 4,096 byte sectors at pool creation time, you may run: - -``` -$ zpool create -o ashift=12 tank mirror sda sdb -``` - -To force the pool to use 4,096 byte sectors when adding a vdev to a pool, you may run: - -``` -$ zpool add -o ashift=12 tank mirror sdc sdd -``` - -## ZVOL used space larger than expected - -Depending on the filesystem used on the zvol (e.g. ext4) and the usage (e.g. deletion and creation of many files) the `used` and `referenced` properties reported by the zvol may be larger than the "actual" space that is being used as reported by the consumer. -This can happen due to the way some filesystems work, in which they prefer to allocate files in new untouched blocks rather than the fragmented used blocks marked as free. This forces zfs to reference all blocks that the underlying filesystem has ever touched. -This is in itself not much of a problem, as when the `used` property reaches the configured `volsize` the underlying filesystem will start reusing blocks. But the problem arises if it is desired to snapshot the zvol, as the space referenced by the snapshots will contain the unused blocks. - -This issue can be prevented by using the `fstrim` command to allow the kernel to specify to zfs which blocks are unused. -Executing an `fstrim` command before a snapshot is taken will ensure a minimum snapshot size. -Adding the `discard` option for the mounted ZVOL in `/etc/fstab` effectively enables the Linux kernel to issue the trim commands continuously, without the need to execute fstrim on-demand.
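A minimal sketch of both approaches (the zvol name `tank/vol1`, its mount point `/mnt/vol1`, and the ext4 filesystem are hypothetical; adjust them to your layout):

```
# One-off: discard unused blocks, then snapshot while the zvol is at its smallest
$ fstrim -v /mnt/vol1
$ zfs snapshot tank/vol1@after-trim

# Alternatively, an /etc/fstab entry with the discard option so trimming is continuous
/dev/zvol/tank/vol1  /mnt/vol1  ext4  defaults,discard  0  2
```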
- -## Using a zvol for a swap device - -You may use a zvol as a swap device but you'll need to configure it appropriately. - -**CAUTION:** for now swap on zvol may lead to deadlock, in this case please send your logs [here](https://github.com/zfsonlinux/zfs/issues/7734). - -* Set the volume block size to match your system's page size. This tuning prevents ZFS from having to perform read-modify-write operations on a larger block while the system is already low on memory. -* Set the `logbias=throughput` and `sync=always` properties. Data written to the volume will be flushed immediately to disk, freeing up memory as quickly as possible. -* Set `primarycache=metadata` to avoid keeping swap data in RAM via the ARC. -* Disable automatic snapshots of the swap device. - -``` -$ zfs create -V 4G -b $(getconf PAGESIZE) \ -    -o logbias=throughput -o sync=always \ -    -o primarycache=metadata \ -    -o com.sun:auto-snapshot=false rpool/swap -``` - -## Using ZFS on Xen Hypervisor or Xen Dom0 - -It is usually recommended to keep virtual machine storage and hypervisor pools quite separate, although a few people have managed to successfully deploy and run ZFS on Linux using the same machine configured as Dom0. There are a few caveats: - - * Set a fair amount of memory in grub.conf, dedicated to Dom0. - * dom0_mem=16384M,max:16384M - * Allocate no more than 30-40% of Dom0's memory to ZFS in `/etc/modprobe.d/zfs.conf`. - * options zfs zfs_arc_max=6442450944 - * Disable Xen's auto-ballooning in `/etc/xen/xl.conf` - * Watch out for any Xen bugs, such as [this one][xen-bug] related to ballooning - -## udisks2 creating /dev/mapper/ entries for zvol - -To prevent udisks2 from creating /dev/mapper entries that must be manually removed or maintained during zvol remove / rename, create a udev rule such as `/etc/udev/rules.d/80-udisks2-ignore-zfs.rules` with the following contents: - -``` -ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_FS_TYPE}=="zfs_member", ENV{ID_PART_ENTRY_TYPE}=="6a898cc3-1dd2-11b2-99a6-080020736631", ENV{UDISKS_IGNORE}="1" -``` - -## Licensing - -ZFS is licensed under the Common Development and Distribution License ([CDDL][cddl]), and the Linux kernel is licensed under the GNU General Public License Version 2 ([GPLv2][gpl]). While both are free open source licenses, they are restrictive licenses. The combination of them causes problems because it prevents using pieces of code exclusively available under one license with pieces of code exclusively available under the other in the same binary. In the case of the kernel, this prevents us from distributing ZFS on Linux as part of the kernel binary. However, there is nothing in either license that prevents distributing it in the form of a binary module or in the form of source code. - -Additional reading and opinions: - -* [Software Freedom Law Center][lawcenter] -* [Software Freedom Conservancy][conservancy] -* [Free Software Foundation][fsf] -* [Encouraging closed source modules][networkworld] - -## Reporting a problem - -You can open a new issue and search existing issues using the public [issue tracker][issues]. The issue tracker is used to organize outstanding bug reports, feature requests, and other development tasks. Anyone may post comments after signing up for a GitHub account. - -Please make sure that what you're actually seeing is a bug and not a support issue. If in doubt, please ask on the mailing list first, and if you're then asked to file an issue, do so.
- -When opening a new issue include this information at the top of the issue: - -* What distribution you're using and the version. -* What spl/zfs packages you're using and the version. -* Describe the problem you're observing. -* Describe how to reproduce the problem. -* Including any warning/errors/backtraces from the system logs. - -When a new issue is opened it's not uncommon for a developer to request additional information about the problem. In general, the more detail you share about a problem the quicker a developer can resolve it. For example, providing a simple test case is always exceptionally helpful. Be prepared to work with the developer looking in to your bug in order to get it resolved. They may ask for information like: - -* Your pool configuration as reported by `zdb` or `zpool status`. -* Your hardware configuration, such as - * Number of CPUs. - * Amount of memory. - * Whether your system has ECC memory. - * Whether it is running under a VMM/Hypervisor. - * Kernel version. - * Values of the spl/zfs module parameters. -* Stack traces which may be logged to `dmesg`. - -## Does ZFS on Linux have a Code of Conduct? - -Yes, the ZFS on Linux community has a code of conduct. See the [Code of Conduct][CoC] for details. - -[OpenZFS]: http://open-zfs.org/wiki/Main_Page -[wikipedia]: https://en.wikipedia.org/wiki/OpenZFS -[releases]: https://github.com/zfsonlinux/zfs/releases -[kernel]: https://www.kernel.org/ -[gentoo-root]: https://github.com/pendor/gentoo-zfs-install/tree/master/install -[xen-bug]: https://github.com/zfsonlinux/zfs/issues/1067 -[cddl]: http://hub.opensolaris.org/bin/view/Main/opensolaris_license -[gpl]: http://www.gnu.org/licenses/gpl2.html -[lawcenter]: https://www.softwarefreedom.org/resources/2016/linux-kernel-cddl.html -[conservancy]: https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/ -[fsf]: https://www.fsf.org/licensing/zfs-and-linux -[networkworld]: http://www.networkworld.com/article/2301697/smb/encouraging-closed-source-modules-part-1--copyright-and-software.html -[issues]: https://github.com/zfsonlinux/zfs/issues -[CoC]: http://open-zfs.org/wiki/Code_of_Conduct +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Fedora.md b/Fedora.md index ccae958..217f365 100644 --- a/Fedora.md +++ b/Fedora.md @@ -1,44 +1,3 @@ -Only [DKMS][dkms] style packages can be provided for Fedora from the official zfsonlinux.org repository. This is because Fedora is a fast moving distribution which does not provide a stable kABI. These packages track the official ZFS on Linux tags and are updated as new versions are released. Packages are available for the following configurations: +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Fedora.html -**Fedora Releases:** 30, 31, 32 -**Architectures:** x86_64 - -To simplify installation a zfs-release package is provided which includes a zfs.repo configuration file and the ZFS on Linux public signing key. All official ZFS on Linux packages are signed using this key, and by default both yum and dnf will verify a package's signature before allowing it be to installed. Users are strongly encouraged to verify the authenticity of the ZFS on Linux public key using the fingerprint listed here. 
- -**Location:** /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -**Fedora 30 Package:** http://download.zfsonlinux.org/fedora/zfs-release.fc30.noarch.rpm -**Fedora 31 Package:** http://download.zfsonlinux.org/fedora/zfs-release.fc31.noarch.rpm -**Fedora 32 Package:** http://download.zfsonlinux.org/fedora/zfs-release.fc32.noarch.rpm -**Download from:** [pgp.mit.edu][pubkey] -**Fingerprint:** C93A FFFD 9F3F 7B03 C310 CEB6 A9D5 A1C0 F14A B620 - -```sh -$ sudo dnf install http://download.zfsonlinux.org/fedora/zfs-release$(rpm -E %dist).noarch.rpm -$ gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -pub 2048R/F14AB620 2013-03-21 ZFS on Linux - Key fingerprint = C93A FFFD 9F3F 7B03 C310 CEB6 A9D5 A1C0 F14A B620 - sub 2048R/99685629 2013-03-21 -``` - -The ZFS on Linux packages should be installed with `dnf` on Fedora. Note that it is important to make sure that the matching *kernel-devel* package is installed for the running kernel since DKMS requires it to build ZFS. - -```sh -$ sudo dnf install kernel-devel zfs -``` - -If the Fedora provided *zfs-fuse* package is already installed on the system. Then the `dnf swap` command should be used to replace the existing fuse packages with the ZFS on Linux packages. - -```sh -$ sudo dnf swap zfs-fuse zfs -``` - -## Testing Repositories - -In addition to the primary *zfs* repository a *zfs-testing* repository is available. This repository, which is disabled by default, contains the latest version of ZFS on Linux which is under active development. These packages are made available in order to get feedback from users regarding the functionality and stability of upcoming releases. These packages **should not** be used on production systems. Packages from the testing repository can be installed as follows. - -``` -$ sudo dnf --enablerepo=zfs-testing install kernel-devel zfs -``` - -[dkms]: https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support -[pubkey]: http://pgp.mit.edu/pks/lookup?search=0xF14AB620&op=index&fingerprint=on \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Getting-Started.md b/Getting-Started.md index 63cbcc5..178965a 100644 --- a/Getting-Started.md +++ b/Getting-Started.md @@ -1,16 +1,3 @@ -To get started with OpenZFS refer to the provided documentation for your distribution. It will cover the recommended installation method and any distribution specific information. First time OpenZFS users are encouraged to check out Aaron Toponce's [excellent documentation][docs]. 
+This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html -[ArchLinux][arch] -[[Debian]] -[[Fedora]] -[FreeBSD][freebsd] -[Gentoo][gentoo] -[openSUSE][opensuse] -[[RHEL and CentOS]] -[[Ubuntu]] - -[arch]: https://wiki.archlinux.org/index.php/ZFS -[freebsd]: https://zfsonfreebsd.github.io/ZoF/ -[gentoo]: https://wiki.gentoo.org/wiki/ZFS -[opensuse]: https://software.opensuse.org/package/zfs -[docs]: https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/ +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Git-and-GitHub-for-beginners.md b/Git-and-GitHub-for-beginners.md index 1144902..4273128 100644 --- a/Git-and-GitHub-for-beginners.md +++ b/Git-and-GitHub-for-beginners.md @@ -1,146 +1,3 @@ -# Git and GitHub for beginners (ZoL edition) +This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/Git%20and%20GitHub%20for%20beginners.html -This is a very basic rundown of how to use Git and GitHub to make changes. - -Recommended reading: [ZFS on Linux CONTRIBUTING.md](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md) - -# First time setup - -If you've never used Git before, you'll need a little setup to start things off. - -``` -git config --global user.name "My Name" -git config --global user.email myemail@noreply.non -``` - -# Cloning the initial repository - -The easiest way to get started is to click the fork icon at the top of the main repository page. From there you need to download a copy of the forked repository to your computer: - -``` -git clone https://github.com//zfs.git -``` - -This sets the "origin" repository to your fork. This will come in handy -when creating pull requests. To make pulling from the "upstream" repository -as changes are made, it is very useful to establish the upstream repository -as another remote (man git-remote): - -``` -cd zfs -git remote add upstream https://github.com/zfsonlinux/zfs.git -``` - -# Preparing and making changes - -In order to make changes it is recommended to make a branch, this lets you work on several unrelated changes at once. It is also not recommended to make changes to the master branch unless you own the repository. - -``` -git checkout -b my-new-branch -``` - -From here you can make your changes and move on to the next step. - -Recommended reading: [C Style and Coding Standards for SunOS](https://www.cis.upenn.edu/~lee/06cse480/data/cstyle.ms.pdf), [ZFS on Linux Developer Resources](https://github.com/zfsonlinux/zfs/wiki/Developer-Resources), [OpenZFS Developer Resources](http://open-zfs.org/wiki/Developer_resources) - -# Testing your patches before pushing - -Before committing and pushing, you may want to test your patches. There are several tests you can run against your branch such as style checking, and functional tests. All pull requests go through these tests before being pushed to the main repository, however testing locally takes the load off the build/test servers. This step is optional but highly recommended, however the test suite should be run on a virtual machine or a host that currently does not use ZFS. You may need to install `shellcheck` and `flake8` to run the `checkstyle` correctly. 
- -``` -sh autogen.sh -./configure -make checkstyle -``` - -Recommended reading: [Building ZFS](https://github.com/zfsonlinux/zfs/wiki/Building-ZFS), [ZFS Test Suite README](https://github.com/zfsonlinux/zfs/blob/master/tests/README.md) - -# Committing your changes to be pushed - -When you are done making changes to your branch there are a few more steps before you can make a pull request. - -``` -git commit --all --signoff -``` - -This command opens an editor and adds all unstaged files from your branch. Here you need to describe your change and add a few things: - -``` - -# Please enter the commit message for your changes. Lines starting -# with '#' will be ignored, and an empty message aborts the commit. -# On branch my-new-branch -# Changes to be committed: -# (use "git reset HEAD ..." to unstage) -# -# modified: hello.c -# -``` - -The first thing we need to add is the commit message. This is what is displayed on the git log, and should be a short description of the change. By style guidelines, this has to be less than 72 characters in length. - -Underneath the commit message you can add a more descriptive text to your commit. The lines in this section have to be less than 72 characters. - -When you are done, the commit should look like this: - -``` -Add hello command - -This is a test commit with a descriptive commit message. -This message can be more than one line as shown here. - -Signed-off-by: My Name -Closes #9998 -Issue #9999 -# Please enter the commit message for your changes. Lines starting -# with '#' will be ignored, and an empty message aborts the commit. -# On branch my-new-branch -# Changes to be committed: -# (use "git reset HEAD ..." to unstage) -# -# modified: hello.c -# -``` - -You can also reference issues and pull requests if you are filing a pull request for an existing issue as shown above. Save and exit the editor when you are done. - -# Pushing and creating the pull request - -Home stretch. You've made your change and made the commit. Now it's time to push it. - -``` -git push --set-upstream origin my-new-branch -``` - -This should ask you for your github credentials and upload your changes to your repository. - -The last step is to either go to your repository or the upstream repository on GitHub and you should see a button for making a new pull request for your recently committed branch. - -# Correcting issues with your pull request - -Sometimes things don't always go as planned and you may need to update your pull request with a correction to either your commit message, or your changes. This can be accomplished by re-pushing your branch. If you need to make code changes or `git add` a file, you can do those now, along with the following: - -``` -git commit --amend -git push --force -``` - -This will return you to the commit editor screen, and push your changes over top of the old ones. Do note that this will restart the process of any build/test servers currently running and excessively pushing can cause delays in processing of all pull requests. - -# Maintaining your repository - -When you wish to make changes in the future you will want to have an up-to-date copy of the upstream repository to make your changes on. Here is how you keep updated: - -``` -git checkout master -git pull upstream master -git push origin master -``` - -This will make sure you are on the master branch of the repository, grab the changes from upstream, then push them back to your repository. 
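For instance (the branch name is hypothetical), once your local master is current you can base the next piece of work on it, just as in the branching step earlier:

```
git checkout -b my-next-branch master
```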
- -# Final words - -This is a very basic introduction to Git and GitHub, but should get you on your way to contributing to many open source projects. Not all projects have style requirements and some may have different processes to getting changes committed so please refer to their documentation to see if you need to do anything different. One topic we have not touched on is the `git rebase` command which is a little more advanced for this wiki article. - -Additional resources: [Github Help](https://help.github.com/), [Atlassian Git Tutorials](https://www.atlassian.com/git/tutorials) \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/HOWTO-install-Debian-GNU-Linux-to-a-Native-ZFS-Root-Filesystem.md b/HOWTO-install-Debian-GNU-Linux-to-a-Native-ZFS-Root-Filesystem.md index 1845d61..2a0755e 100644 --- a/HOWTO-install-Debian-GNU-Linux-to-a-Native-ZFS-Root-Filesystem.md +++ b/HOWTO-install-Debian-GNU-Linux-to-a-Native-ZFS-Root-Filesystem.md @@ -1 +1,3 @@ -This page has moved to [[Debian Jessie Root on ZFS]]. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/index.html + +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Home.md b/Home.md index 22e41b1..568e829 100644 --- a/Home.md +++ b/Home.md @@ -1,8 +1,3 @@

[[/img/480px-Open-ZFS-Secondary-Logo-Colour-halfsize.png|alt=openzfs]]

-Welcome to the OpenZFS GitHub wiki. This wiki provides documentation for users and developers working -with (or contributing to) the OpenZFS project. New users or system administrators should refer to the documentation for their favorite platform to get started. - -| [[Getting Started]] | [[Project and Community]] | [[Developer Resources]] | -|------------------------------|-------------------------------|------------------------ | -| How to get started with OpenZFS on your favorite platform | About the project and how to contribute | Technical documentation discussing the OpenZFS implementation | \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/License.md b/License.md index 32ae74a..455a0b4 100644 --- a/License.md +++ b/License.md @@ -1,5 +1,3 @@ -[![Creative Commons License](https://i.creativecommons.org/l/by-sa/3.0/88x31.png)][license] +This page was moved to: https://openzfs.github.io/openzfs-docs/License.html -Wiki content is licensed under a [Creative Commons Attribution-ShareAlike license][license] unless otherwise noted. - -[license]: http://creativecommons.org/licenses/by-sa/3.0/ +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Mailing-Lists.md b/Mailing-Lists.md index 0ca1e3c..aeaa3c1 100644 --- a/Mailing-Lists.md +++ b/Mailing-Lists.md @@ -1,15 +1,4 @@ -|                         List                         | Description | List Archive | -|--------------------------|-------------|:------------:| -| [zfs-announce@list.zfsonlinux.org][zfs-ann] | A low-traffic list for announcements such as new releases | [archive][zfs-ann-archive] | -| [zfs-discuss@list.zfsonlinux.org][zfs-discuss] | A user discussion list for issues related to functionality and usability | [archive][zfs-discuss-archive] | -| [zfs-devel@list.zfsonlinux.org][zfs-devel] | A development list for developers to discuss technical issues | [archive][zfs-devel-archive] | -| [developer@open-zfs.org][open-zfs] | A platform-independent mailing list for ZFS developers to review ZFS code and architecture changes from all platforms | [archive][open-zfs-archive] | -[zfs-ann]: https://zfsonlinux.topicbox.com/groups/zfs-announce -[zfs-ann-archive]: https://zfsonlinux.topicbox.com/groups/zfs-announce -[zfs-discuss]: https://zfsonlinux.topicbox.com/groups/zfs-discuss -[zfs-discuss-archive]: https://zfsonlinux.topicbox.com/groups/zfs-discuss -[zfs-devel]: https://zfsonlinux.topicbox.com/groups/zfs-devel -[zfs-devel-archive]: https://zfsonlinux.topicbox.com/groups/zfs-devel -[open-zfs]: http://open-zfs.org/wiki/Mailing_list -[open-zfs-archive]: https://openzfs.topicbox.com/groups/developer +This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html + +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/OpenZFS-Patches.md b/OpenZFS-Patches.md index 104e833..343b97a 100644 --- a/OpenZFS-Patches.md +++ b/OpenZFS-Patches.md @@ -1,199 +1,3 @@ -The ZFS on Linux project is an adaptation of the upstream [OpenZFS repository][openzfs-repo] designed to work in a Linux environment. This upstream repository acts as a location where new features, bug fixes, and performance improvements from all the OpenZFS platforms can be integrated. Each platform is responsible for tracking the OpenZFS repository and merging the relevant improvements back in to their release. 
+This page was moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/OpenZFS%20Patches.html -For the ZFS on Linux project this tracking is managed through an [OpenZFS tracking](http://build.zfsonlinux.org/openzfs-tracking.html) page. The page is updated regularly and shows a list of OpenZFS commits and their status in regard to the ZFS on Linux master branch. - -This page describes the process of applying outstanding OpenZFS commits to ZFS on Linux and submitting those changes for inclusion. As a developer this is a great way to familiarize yourself with ZFS on Linux and to begin quickly making a valuable contribution to the project. The following guide assumes you have a [github account][github-account], are familiar with git, and are used to developing in a Linux environment. - -## Porting OpenZFS changes to ZFS on Linux - -### Setup the Environment - -**Clone the source.** Start by making a local clone of the [spl][spl-repo] and [zfs][zfs-repo] repositories. - -``` -$ git clone -o zfsonlinux https://github.com/zfsonlinux/spl.git -$ git clone -o zfsonlinux https://github.com/zfsonlinux/zfs.git -``` - -**Add remote repositories.** Using the GitHub web interface [fork][github-fork] the [zfs][zfs-repo] repository in to your personal GitHub account. Add your new zfs fork and the [openzfs][openzfs-repo] repository as remotes and then fetch both repositories. The OpenZFS repository is large and the initial fetch may take some time over a slow connection. - -``` -$ cd zfs -$ git remote add git@github.com:/zfs.git -$ git remote add openzfs https://github.com/openzfs/openzfs.git -$ git fetch --all -``` - -**Build the source.** Compile the spl and zfs master branches. These branches are always kept stable and this is a useful verification that you have a full build environment installed and all the required dependencies are available. This may also speed up the compile time latter for small patches where incremental builds are an option. - -``` -$ cd ../spl -$ sh autogen.sh && ./configure --enable-debug && make -s -j$(nproc) -$ -$ cd ../zfs -$ sh autogen.sh && ./configure --enable-debug && make -s -j$(nproc) -``` - -### Pick a patch - -Consult the [OpenZFS tracking](http://build.zfsonlinux.org/openzfs-tracking.html) page and select a patch which has not yet been applied. For your first patch you will want to select a small patch to familiarize yourself with the process. - -### Porting a Patch - -There are 2 methods: -- [cherry-pick (easier)](#cherry-pick) -- [manual merge](#manual-merge) - -Please read about [manual merge](#manual-merge) first to learn the whole process. - -#### Cherry-pick - -You can start to [cherry-pick](https://git-scm.com/docs/git-cherry-pick) by your own, but we have made a special [script](https://github.com/zfsonlinux/zfs-buildbot/blob/master/scripts/openzfs-merge.sh), which tries to [cherry-pick](https://git-scm.com/docs/git-cherry-pick) the patch automatically and generates the description. - -0) Prepare environment: - -Mandatory git settings (add to `~/.gitconfig`): -``` -[merge] - renameLimit = 999999 -[user] - email = mail@yourmail.com - name = Your Name -``` - -Download the script: -``` -wget https://raw.githubusercontent.com/zfsonlinux/zfs-buildbot/master/scripts/openzfs-merge.sh -``` - -1) Run: -``` -./openzfs-merge.sh -d path_to_zfs_folder -c openzfs_commit_hash -``` -This command will fetch all repositories, create a new branch `autoport-ozXXXX` (XXXX - OpenZFS issue number), try to cherry-pick, compile and check cstyle on success. 
- -If it succeeds without any merge conflicts - go to `autoport-ozXXXX` branch, it will have ready to pull commit. Congratulations, you can go to step 7! - -Otherwise you should go to step 2. - -2) Resolve all merge conflicts manually. Easy method - install [Meld](http://meldmerge.org/) or any other diff tool and run `git mergetool`. - -3) Check all compile and cstyle errors (See [Testing a patch](#testing-a-patch)). - -4) Commit your changes with any description. - -5) Update commit description (last commit will be changed): -``` -./openzfs-merge.sh -d path_to_zfs_folder -g openzfs_commit_hash -``` - -6) Add any porting notes (if you have modified something): `git commit --amend` - -7) Push your commit to github: `git push autoport-ozXXXX` - -8) Create a pull request to ZoL master branch. - -9) Go to [Testing a patch](#testing-a-patch) section. - -#### Manual merge - -**Create a new branch.** It is important to create a new branch for every commit you port to ZFS on Linux. This will allow you to easily submit your work as a GitHub pull request and it makes it possible to work on multiple OpenZFS changes concurrently. All development branches need to be based off of the ZFS master branch and it's helpful to name the branches after the issue number you're working on. - -``` -$ git checkout -b openzfs- master -``` - -**Generate a patch.** One of the first things you'll notice about the ZFS on Linux repository is that it is laid out differently than the OpenZFS repository. Organizationally it is much flatter, this is possible because it only contains the code for OpenZFS not an entire OS. That means that in order to apply a patch from OpenZFS the path names in the patch must be changed. A script called zfs2zol-patch.sed has been provided to perform this translation. Use the `git format-patch` command and this script to generate a patch. - -``` -$ git format-patch --stdout ^.. | \ - ./scripts/zfs2zol-patch.sed >openzfs-.diff -``` - -**Apply the patch.** In many cases the generated patch will apply cleanly to the repository. However, it's important to keep in mind the zfs2zol-patch.sed script only translates the paths. There are often additional reasons why a patch might not apply. In some cases hunks of the patch may not be applicable to Linux and should be dropped. In other cases a patch may depend on other changes which must be applied first. The changes may also conflict with Linux specific modifications. In all of these cases the patch will need to be manually modified to apply cleanly while preserving the its original intent. - -``` -$ git am ./openzfs-.diff -``` - -**Update the commit message.** By using `git format-patch` to generate the patch and then `git am` to apply it the original comment and authorship will be preserved. However, due to the formatting of the OpenZFS commit you will likely find that the entire commit comment has been squashed in to the subject line. Use `git commit --amend` to cleanup the comment and be careful to follow [these standard guidelines][guidelines]. - -The summary line of an OpenZFS commit is often very long and you should truncate it to 50 characters. This is useful because it preserves the correct formatting of `git log --pretty=oneline` command. Make sure to leave a blank line between the summary and body of the commit. Then include the full OpenZFS commit message wrapping any lines which exceed 72 characters. Finally, add a `Ported-by` tag with your contact information and both a `OpenZFS-issue` and `OpenZFS-commit` tag with appropriate links. 
You'll want to verify your commit contains all of the following information: - - * The subject line from the original OpenZFS patch in the form: "OpenZFS \ - short description". - * The original patch authorship should be preserved. - * The OpenZFS commit message. - * The following tags: - * **Authored by:** Original patch author - * **Reviewed by:** All OpenZFS reviewers from the original patch. - * **Approved by:** All OpenZFS reviewers from the original patch. - * **Ported-by:** Your name and email address. - * **OpenZFS-issue:** https ://www.illumos.org/issues/issue - * **OpenZFS-commit:** https ://github.com/openzfs/openzfs/commit/hash - * **Porting Notes:** An optional section describing any changes required when porting. - -For example, OpenZFS issue 6873 was [applied to Linux][zol-6873] from this upstream [OpenZFS commit][openzfs-6873]. - -``` -OpenZFS 6873 - zfs_destroy_snaps_nvl leaks errlist - -Authored by: Chris Williamson -Reviewed by: Matthew Ahrens -Reviewed by: Paul Dagnelie -Ported-by: Denys Rtveliashvili - -lzc_destroy_snaps() returns an nvlist in errlist. -zfs_destroy_snaps_nvl() should nvlist_free() it before returning. - -OpenZFS-issue: https://www.illumos.org/issues/6873 -OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ee06391 -``` - -### Testing a Patch - -**Build the source.** Verify the patched source compiles without errors and all warnings are resolved. - -``` -$ make -s -j$(nproc) -``` - -**Run the style checker.** Verify the patched source passes the style checker, the command should return without printing any output. - -``` -$ make cstyle -``` - -**Open a Pull Request.** When your patch builds cleanly and passes the style checks [open a new pull request][github-pr]. The pull request will be queued for [automated testing][buildbot]. As part of the testing the change is built for a wide range of Linux distributions and a battery of functional and stress tests are run to detect regressions. - -``` -$ git push openzfs- -``` - -**Fix any issues.** Testing takes approximately 2 hours to fully complete and the results are posted in the GitHub [pull request][openzfs-pr]. All the tests are expected to pass and you should investigate and resolve any test failures. The [test scripts][buildbot-scripts] are all available and designed to run locally in order reproduce an issue. Once you've resolved the issue force update the pull request to trigger a new round of testing. Iterate until all the tests are passing. - -``` -# Fix issue, amend commit, force update branch. -$ git commit --amend -$ git push --force openzfs- -``` - -### Merging the Patch - -**Review.** Lastly one of the ZFS on Linux maintainers will make a final review of the patch and may request additional changes. Once the maintainer is happy with the final version of the patch they will add their signed-off-by, merge it to the master branch, mark it complete on the tracking page, and thank you for your contribution to the project! - -## Porting ZFS on Linux changes to OpenZFS - -Often an issue will be first fixed in ZFS on Linux or a new feature developed. Changes which are not Linux specific should be submitted upstream to the OpenZFS GitHub repository for review. The process for this is described in the [OpenZFS README][openzfs-repo]. 
- -[github-account]: https://help.github.com/articles/signing-up-for-a-new-github-account/ -[github-pr]: https://help.github.com/articles/creating-a-pull-request/ -[github-fork]: https://help.github.com/articles/fork-a-repo/ -[buildbot]: https://github.com/zfsonlinux/zfs-buildbot/ -[buildbot-scripts]: https://github.com/zfsonlinux/zfs-buildbot/tree/master/scripts -[spl-repo]: https://github.com/zfsonlinux/spl -[zfs-repo]: https://github.com/zfsonlinux/zfs -[openzfs-repo]: https://github.com/openzfs/openzfs/ -[openzfs-6873]: https://github.com/openzfs/openzfs/commit/ee06391 -[zol-6873]: https://github.com/zfsonlinux/zfs/commit/b3744ae -[openzfs-pr]: https://github.com/zfsonlinux/zfs/pull/4594 -[guidelines]: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/OpenZFS-Tracking.md b/OpenZFS-Tracking.md index ca149dc..082c904 100644 --- a/OpenZFS-Tracking.md +++ b/OpenZFS-Tracking.md @@ -1 +1,3 @@ -This page is obsolete, use http://build.zfsonlinux.org/openzfs-tracking.html \ No newline at end of file +This page is obsolete, use http://build.zfsonlinux.org/openzfs-tracking.html + +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/OpenZFS-exceptions.md b/OpenZFS-exceptions.md index f531db1..d4b0ab2 100644 --- a/OpenZFS-exceptions.md +++ b/OpenZFS-exceptions.md @@ -1,16 +1,20 @@ +**This page will be moved to: https://openzfs.github.io/openzfs-docs/Developer%20Resources/OpenZFS%20Exceptions.html** + +**DON'T EDIT THIS PAGE!** + Commit exceptions used to explicitly reference a given Linux commit. These exceptions are useful for a variety of reasons. **This page is used to generate [OpenZFS Tracking](http://build.zfsonlinux.org/openzfs-tracking.html) page.** #### Format: -- `|-|` - -The OpenZFS commit isn't applicable to Linux, or +- `|-|` - +The OpenZFS commit isn't applicable to Linux, or the OpenZFS -> ZFS on Linux commit matching is unable to associate the related commits due to lack of information (denoted by a -). -- `||` - +- `||` - The fix was merged to Linux prior to their being an OpenZFS issue. -- `|!|` - +- `|!|` - The commit is applicable but not applied for the reason described in the comment. OpenZFS issue id | status/ZFS commit | comment @@ -43,7 +47,7 @@ OpenZFS issue id | status/ZFS commit | comment 8942|650258d7| 8941|390d679a| 8858|- |Not applicable to Linux -8856|- |Not applicable to Linux due to Encryption (b525630) +8856|- |Not applicable to Linux due to Encryption (b525630) 8809|! |Adding libfakekernel needs to be done by refactoring existing code. 8713|871e0732| 8661|1ce23dca| @@ -92,7 +96,7 @@ OpenZFS issue id | status/ZFS commit | comment 7542|- |The Linux libshare code differs significantly from the upstream OpenZFS code. Since this change doesn't address a Linux specific issue it doesn't need to be ported. The eventual plan is to retire all of the existing libshare code and use the ZED to more flexibly control filesystem sharing. 7512|- |None of the illumos build system is used under Linux. 7497|- |DTrace is isn't readily available under Linux. -7446|! |Need to assess applicability to Linux. +7446|! |Need to assess applicability to Linux. 7430|68cbd56| 7402|690fe64| 7345|058ac9b| @@ -137,7 +141,7 @@ OpenZFS issue id | status/ZFS commit | comment 6249|6bb24f4| 6248|6bb24f4| 6220|- |The b_thawed debug code was unused under Linux and removed. 
-6209|- |The Linux user space mutex implementation is based on phtread primitives. +6209|- |The Linux user space mutex implementation is based on phtread primitives. 6095|f866a4ea| 6091|c11f100| 5984|480f626| diff --git a/Project-and-Community.md b/Project-and-Community.md index ca0c0c2..5d433a8 100644 --- a/Project-and-Community.md +++ b/Project-and-Community.md @@ -1,16 +1,3 @@ -OpenZFS is storage software which combines the functionality of traditional filesystems, volume manager, and more. OpenZFS includes protection against data corruption, support for high storage capacities, efficient data compression, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, remote replication with ZFS send and receive, and RAID-Z. +This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/index.html -OpenZFS brings together developers from the illumos, Linux, FreeBSD and OS X platforms, and a wide range of companies -- both online and at the annual OpenZFS Developer Summit. High-level goals of the project include raising awareness of the quality, utility and availability of open-source implementations of ZFS, encouraging open communication about ongoing efforts toward improving open-source variants of ZFS, and ensuring consistent reliability, functionality and performance of all distributions of ZFS. - -[Admin Documentation][admin-docs] -[[FAQ]] -[[Mailing Lists]] -[Releases][releases] -[Issue Tracker][issues] -[Roadmap][roadmap] -[[Signing Keys]] - -[admin-docs]: https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/ -[issues]: https://github.com/zfsonlinux/zfs/issues -[roadmap]: https://github.com/zfsonlinux/zfs/milestones -[releases]: https://github.com/zfsonlinux/zfs/releases +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/RHEL-and-CentOS.md b/RHEL-and-CentOS.md index 695d381..cb4f444 100644 --- a/RHEL-and-CentOS.md +++ b/RHEL-and-CentOS.md @@ -1,108 +1,3 @@ -[kABI-tracking kmod][kmod] or [DKMS][dkms] style packages are provided for RHEL / CentOS based distributions from the official zfsonlinux.org repository. These packages track the official ZFS on Linux tags and are updated as new versions are released. Packages are available for the following configurations: +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/RHEL%20and%20CentOS.html -**EL Releases:** 6.x, 7.x, 8.x -**Architectures:** x86_64 - -To simplify installation a zfs-release package is provided which includes a zfs.repo configuration file and the ZFS on Linux public signing key. All official ZFS on Linux packages are signed using this key, and by default yum will verify a package's signature before allowing it be to installed. Users are strongly encouraged to verify the authenticity of the ZFS on Linux public key using the fingerprint listed here. 
- -**Location:** /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -**EL6 Package:** http://download.zfsonlinux.org/epel/zfs-release.el6.noarch.rpm -**EL7.5 Package:** http://download.zfsonlinux.org/epel/zfs-release.el7_5.noarch.rpm -**EL7.6 Package:** http://download.zfsonlinux.org/epel/zfs-release.el7_6.noarch.rpm -**EL7.7 Package:** http://download.zfsonlinux.org/epel/zfs-release.el7_7.noarch.rpm -**EL7.8 Package:** http://download.zfsonlinux.org/epel/zfs-release.el7_8.noarch.rpm -**EL8.0 Package:** http://download.zfsonlinux.org/epel/zfs-release.el8_0.noarch.rpm -**EL8.1 Package:** http://download.zfsonlinux.org/epel/zfs-release.el8_1.noarch.rpm -**Note:** Starting with EL7.7 **zfs-0.8** will become the default, EL7.6 and older will continue to track the **zfs-0.7** point releases. - -**Download from:** [pgp.mit.edu][pubkey] -**Fingerprint:** C93A FFFD 9F3F 7B03 C310 CEB6 A9D5 A1C0 F14A B620 - -``` -$ sudo yum install http://download.zfsonlinux.org/epel/zfs-release..noarch.rpm -$ gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -pub 2048R/F14AB620 2013-03-21 ZFS on Linux - Key fingerprint = C93A FFFD 9F3F 7B03 C310 CEB6 A9D5 A1C0 F14A B620 - sub 2048R/99685629 2013-03-21 -``` - -After installing the zfs-release package and verifying the public key users can opt to install ether the kABI-tracking kmod or DKMS style packages. For most users the kABI-tracking kmod packages are recommended in order to avoid needing to rebuild ZFS for every kernel update. DKMS packages are recommended for users running a non-distribution kernel or for users who wish to apply local customizations to ZFS on Linux. - -## kABI-tracking kmod - -By default the zfs-release package is configured to install DKMS style packages so they will work with a wide range of kernels. In order to install the kABI-tracking kmods the default repository in the */etc/yum.repos.d/zfs.repo* file must be switch from *zfs* to *zfs-kmod*. Keep in mind that the kABI-tracking kmods are only verified to work with the distribution provided kernel. - -```diff -# /etc/yum.repos.d/zfs.repo - [zfs] - name=ZFS on Linux for EL 7 - dkms - baseurl=http://download.zfsonlinux.org/epel/7/$basearch/ --enabled=1 -+enabled=0 - metadata_expire=7d - gpgcheck=1 - gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -@@ -9,7 +9,7 @@ - [zfs-kmod] - name=ZFS on Linux for EL 7 - kmod - baseurl=http://download.zfsonlinux.org/epel/7/kmod/$basearch/ --enabled=0 -+enabled=1 - metadata_expire=7d - gpgcheck=1 - gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux -``` - -The ZFS on Linux packages can now be installed using yum. - -``` -$ sudo yum install zfs -``` - -## DKMS - -To install DKMS style packages issue the following yum commands. First add the [EPEL repository](https://fedoraproject.org/wiki/EPEL) which provides DKMS by installing the *epel-release* package, then the *kernel-devel* and *zfs* packages. Note that it is important to make sure that the matching *kernel-devel* package is installed for the running kernel since DKMS requires it to build ZFS. - -``` -$ sudo yum install epel-release -$ sudo yum install "kernel-devel-uname-r == $(uname -r)" zfs -``` - -## Important Notices - -### RHEL/CentOS 7.x kmod package upgrade - -When updating to a new RHEL/CentOS 7.x release the existing kmod packages will not work due to upstream kABI changes in the kernel. 
After upgrading to 7.x users must uninstall ZFS and then reinstall it as described in the [kABI-tracking kmod](https://github.com/zfsonlinux/zfs/wiki/RHEL-%26-CentOS/#kabi-tracking-kmod) section. Compatible kmod packages will be installed from the matching CentOS 7.x repository. - -``` -$ sudo yum remove zfs zfs-kmod spl spl-kmod libzfs2 libnvpair1 libuutil1 libzpool2 zfs-release -$ sudo yum install http://download.zfsonlinux.org/epel/zfs-release.el7_6.noarch.rpm -$ sudo yum autoremove -$ sudo yum clean metadata -$ sudo yum install zfs -``` - -### Switching from DKMS to kABI-tracking kmod - -When switching from DKMS to kABI-tracking kmods first uninstall the existing DKMS packages. This should remove the kernel modules for all installed kernels but in practice it's not always perfectly reliable. Therefore, it's recommended that you manually remove any remaining ZFS kernel modules as shown. At this point the kABI-tracking kmods can be installed as described in the section above. - -``` -$ sudo yum remove zfs zfs-kmod spl spl-kmod libzfs2 libnvpair1 libuutil1 libzpool2 zfs-release - -$ sudo find /lib/modules/ \( -name "splat.ko" -or -name "zcommon.ko" \ --or -name "zpios.ko" -or -name "spl.ko" -or -name "zavl.ko" -or \ --name "zfs.ko" -or -name "znvpair.ko" -or -name "zunicode.ko" \) \ --exec /bin/rm {} \; -``` - -## Testing Repositories - -In addition to the primary *zfs* repository a *zfs-testing* repository is available. This repository, which is disabled by default, contains the latest version of ZFS on Linux which is under active development. These packages are made available in order to get feedback from users regarding the functionality and stability of upcoming releases. These packages **should not** be used on production systems. Packages from the testing repository can be installed as follows. - -``` -$ sudo yum --enablerepo=zfs-testing install kernel-devel zfs -``` - -[kmod]: http://elrepoproject.blogspot.com/2016/02/kabi-tracking-kmod-packages.html -[dkms]: https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support -[pubkey]: http://pgp.mit.edu/pks/lookup?search=0xF14AB620&op=index&fingerprint=on \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Signing-Keys.md b/Signing-Keys.md index f428cbb..52c006e 100644 --- a/Signing-Keys.md +++ b/Signing-Keys.md @@ -1,57 +1,4 @@ -All tagged ZFS on Linux [releases][releases] are signed by the official maintainer for that branch. These signatures are automatically verified by GitHub and can be checked locally by downloading the maintainers public key. -## Maintainers +This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Signing%20Keys.html -### Release branch (spl/zfs-*-release) - -**Maintainer:** [Ned Bass][nedbass] -**Download:** [pgp.mit.edu][nedbass-pubkey] -**Key ID:** C77B9667 -**Fingerprint:** 29D5 610E AE29 41E3 55A2 FE8A B974 67AA C77B 9667 - -**Maintainer:** [Tony Hutter][tonyhutter] -**Download:** [pgp.mit.edu][tonyhutter-pubkey] -**Key ID:** D4598027 -**Fingerprint:** 4F3B A9AB 6D1F 8D68 3DC2 DFB5 6AD8 60EE D459 8027 - -### Master branch (master) - -**Maintainer:** [Brian Behlendorf][behlendorf] -**Download:** [pgp.mit.edu][behlendorf-pubkey] -**Key ID:** C6AF658B -**Fingerprint:** C33D F142 657E D1F7 C328 A296 0AB9 E991 C6AF 658B - -## Checking the Signature of a Git Tag - -First import the public key listed above in to your key ring. 
- -``` -$ gpg --keyserver pgp.mit.edu --recv C6AF658B -gpg: requesting key C6AF658B from hkp server pgp.mit.edu -gpg: key C6AF658B: "Brian Behlendorf " not changed -gpg: Total number processed: 1 -gpg: unchanged: 1 -``` - -After the pubic key is imported the signature of a git tag can be verified as shown. - -``` -$ git tag --verify zfs-0.6.5 -object 7a27ad00ae142b38d4aef8cc0af7a72b4c0e44fe -type commit -tag zfs-0.6.5 -tagger Brian Behlendorf 1441996302 -0700 - -ZFS Version 0.6.5 -gpg: Signature made Fri 11 Sep 2015 11:31:42 AM PDT using DSA key ID C6AF658B -gpg: Good signature from "Brian Behlendorf " -gpg: aka "Brian Behlendorf (LLNL) " -``` - -[nedbass]: https://github.com/nedbass -[nedbass-pubkey]: http://pgp.mit.edu/pks/lookup?op=vindex&search=0xB97467AAC77B9667&fingerprint=on -[tonyhutter]: https://github.com/tonyhutter -[tonyhutter-pubkey]: http://pgp.mit.edu/pks/lookup?op=vindex&search=0x6ad860eed4598027&fingerprint=on -[behlendorf]: https://github.com/behlendorf -[behlendorf-pubkey]: http://pgp.mit.edu/pks/lookup?op=vindex&search=0x0AB9E991C6AF658B&fingerprint=on -[releases]: https://github.com/zfsonlinux/zfs/releases \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Troubleshooting.md b/Troubleshooting.md index 980de4b..7c0467e 100644 --- a/Troubleshooting.md +++ b/Troubleshooting.md @@ -1,66 +1,3 @@ -# DRAFT -This page contains tips for troubleshooting ZFS on Linux and what info developers might want for bug triage. +This page was moved to: https://openzfs.github.io/openzfs-docs/Basics%20concepts/Troubleshooting.html -- [About Log Files](#about-log-files) - - [Generic Kernel Log](#generic-kernel-log) - - [ZFS Kernel Module Debug Messages](#zfs-kernel-module-debug-messages) -- [Unkillable Process](#unkillable-process) -- [ZFS Events](#zfs-events) - -*** -## About Log Files -Log files can be very useful for troubleshooting. In some cases, interesting information is stored in multiple log files that are correlated to system events. - -Pro tip: logging infrastructure tools like _elasticsearch_, _fluentd_, _influxdb_, or _splunk_ can simplify log analysis and event correlation. - -### Generic Kernel Log -Typically, Linux kernel log messages are available from `dmesg -T`, `/var/log/syslog`, or where kernel log messages are sent (eg by `rsyslogd`). - -### ZFS Kernel Module Debug Messages -The ZFS kernel modules use an internal log buffer for detailed logging information. -This log information is available in the pseudo file `/proc/spl/kstat/zfs/dbgmsg` for ZFS builds where ZFS module parameter [zfs_dbgmsg_enable = 1](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_dbgmsg_enable) - -*** -## Unkillable Process -Symptom: `zfs` or `zpool` command appear hung, does not return, and is not killable - -Likely cause: kernel thread hung or panic - -Log files of interest: [Generic Kernel Log](#generic-kernel-log), [ZFS Kernel Module Debug Messages](#zfs-kernel-module-debug-messages) - -Important information: if a kernel thread is stuck, then a backtrace of the stuck thread can be in the logs. -In some cases, the stuck thread is not logged until the deadman timer expires. See also [debug tunables](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#debug) - -*** -## ZFS Events -ZFS uses an event-based messaging interface for communication of important events to -other consumers running on the system. 
The ZFS Event Daemon (zed) is a userland daemon that -listens for these events and processes them. zed is extensible so you can write shell scripts -or other programs that subscribe to events and take action. For example, the script usually -installed at `/etc/zfs/zed.d/all-syslog.sh` writes a formatted event message to `syslog.` -See the man page for `zed(8)` for more information. - -A history of events is also available via the `zpool events` command. This history begins at -ZFS kernel module load and includes events from any pool. These events are stored in RAM and -limited in count to a value determined by the kernel tunable [zfs_event_len_max](https://github.com/zfsonlinux/zfs/wiki/ZFS-on-Linux-Module-Parameters#zfs_zevent_len_max). -`zed` has an internal throttling mechanism to prevent overconsumption of system resources -processing ZFS events. - -More detailed information about events is observable using `zpool events -v` -The contents of the verbose events is subject to change, based on the event and information -available at the time of the event. - -Each event has a class identifier used for filtering event types. Commonly seen events are -those related to pool management with class `sysevent.fs.zfs.*` including import, export, -configuration updates, and `zpool history` updates. - -Events related to errors are reported as class `ereport.*` These can be invaluable for -troubleshooting. Some faults can cause multiple ereports as various layers of the software -deal with the fault. For example, on a simple pool without parity protection, a faulty -disk could cause an `ereport.io` during a read from the disk that results in an -`erport.fs.zfs.checksum` at the pool level. These events are also reflected by the error -counters observed in `zpool status` -If you see checksum or read/write errors in `zpool status` then there should be one or more -corresponding ereports in the `zpool events` output. - -# DRAFT +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Ubuntu-16.04-Root-on-ZFS.md b/Ubuntu-16.04-Root-on-ZFS.md index 5562d25..2b8edfd 100644 --- a/Ubuntu-16.04-Root-on-ZFS.md +++ b/Ubuntu-16.04-Root-on-ZFS.md @@ -1,604 +1,3 @@ -### Newer release available -* See [[Ubuntu 18.04 Root on ZFS]] for new installs. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2016.04%20Root%20on%20ZFS.html -### Caution -* This HOWTO uses a whole physical disk. -* Do not use these instructions for dual-booting. -* Backup your data. Any existing data will be lost. - -### System Requirements -* [64-bit Ubuntu 16.04.5 ("Xenial") Desktop CD](http://releases.ubuntu.com/16.04/ubuntu-16.04.5-desktop-amd64.iso) (*not* the server image) -* [A 64-bit kernel is *strongly* encouraged.](https://github.com/zfsonlinux/zfs/wiki/FAQ#32-bit-vs-64-bit-systems) -* A drive which presents 512B logical sectors. Installing on a drive which presents 4KiB logical sectors (a “4Kn” drive) should work with UEFI partitioning, but this has not been tested. - -Computers that have less than 2 GiB of memory run ZFS slowly. 4 GiB of memory is recommended for normal performance in basic workloads. If you wish to use deduplication, you will need [massive amounts of RAM](http://wiki.freebsd.org/ZFSTuningGuide#Deduplication). Enabling deduplication is a permanent change that cannot be easily reverted. 
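-
-**Hint:** If you are considering deduplication on a pool you build later, it is worth estimating the cost before enabling it. As a rough, non-authoritative sketch (it assumes an existing, imported pool, here called `rpool`; nothing in this guide requires this step), `zdb -S` simulates deduplication without changing anything on disk:
-
-    # zdb -S rpool
-
-The simulated DDT histogram shows how many unique blocks the pool contains; a commonly cited rule of thumb is to budget roughly 300 bytes of RAM per unique block, which adds up quickly on large pools.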
- -## Support - -If you need help, reach out to the community using the [zfs-discuss mailing list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists) or IRC at #zfsonlinux on [freenode](https://freenode.net/). If you have a bug report or feature request related to this HOWTO, please [file a new issue](https://github.com/zfsonlinux/zfs/issues/new) and mention @rlaager. - -## Encryption - -This guide supports the three different Ubuntu encryption options: unencrypted, LUKS (full-disk encryption), and eCryptfs (home directory encryption). - -Unencrypted does not encrypt anything, of course. All ZFS features are fully available. With no encryption happening, this option naturally has the best performance. - -LUKS encrypts almost everything: the OS, swap, home directories, and anything else. The only unencrypted data is the bootloader, kernel, and initrd. The system cannot boot without the passphrase being entered at the console. All ZFS features are fully available. Performance is good, but LUKS sits underneath ZFS, so if multiple disks (mirror or raidz configurations) are used, the data has to be encrypted once per disk. - -eCryptfs protects the contents of the specified home directories. This guide also recommends encrypted swap when using eCryptfs. Other operating system directories, which may contain sensitive data, logs, and/or configuration information, are not encrypted. ZFS compression is useless on the encrypted home directories. ZFS snapshots are not automatically and transparently mounted when using eCryptfs, and manually mounting them requires serious knowledge of eCryptfs administrative commands. eCryptfs sits above ZFS, so the encryption only happens once, regardless of the number of disks in the pool. The performance of eCryptfs may be lower than LUKS in single-disk scenarios. - -If you want encryption, LUKS is recommended. - -## Step 1: Prepare The Install Environment - -1.1 Boot the Ubuntu Live CD. Select Try Ubuntu. Connect your system to the Internet as appropriate (e.g. join your WiFi network). Open a terminal (press Ctrl-Alt-T). - -1.2 Setup and update the repositories: - - $ sudo apt-add-repository universe - $ sudo apt update - -1.3 Optional: Start the OpenSSH server in the Live CD environment: - -If you have a second system, using SSH to access the target system can be convenient. - - $ passwd - There is no current password; hit enter at that prompt. - $ sudo apt --yes install openssh-server - -**Hint:** You can find your IP address with `ip addr show scope global | grep inet`. Then, from your main machine, connect with `ssh ubuntu@IP`. - -1.4 Become root: - - $ sudo -i - -1.5 Install ZFS in the Live CD environment: - - # apt install --yes debootstrap gdisk zfs-initramfs - -**Note:** You can ignore the two error lines about "AppStream". They are harmless. 
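-
-**Hint:** Before moving on to disk formatting, you may want to confirm that the ZFS kernel module and userland tools are usable in the Live CD environment. This is only a sanity check, not part of the install procedure, and the commands below change nothing on disk:
-
-    # modprobe zfs
-    # lsmod | grep zfs
-    # zpool status
-
-`zpool status` should simply report that no pools are available at this point. If the module fails to load, resolve that before partitioning any disks.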
- -## Step 2: Disk Formatting - -2.1 If you are re-using a disk, clear it as necessary: - - If the disk was previously used in an MD array, zero the superblock: - # apt install --yes mdadm - # mdadm --zero-superblock --force /dev/disk/by-id/scsi-SATA_disk1 - - Clear the partition table: - # sgdisk --zap-all /dev/disk/by-id/scsi-SATA_disk1 - -2.2 Partition your disk: - - Run this if you need legacy (BIOS) booting: - # sgdisk -a1 -n2:34:2047 -t2:EF02 /dev/disk/by-id/scsi-SATA_disk1 - - Run this for UEFI booting (for use now or in the future): - # sgdisk -n3:1M:+512M -t3:EF00 /dev/disk/by-id/scsi-SATA_disk1 - -Choose one of the following options: - -2.2a Unencrypted or eCryptfs: - - # sgdisk -n1:0:0 -t1:BF01 /dev/disk/by-id/scsi-SATA_disk1 - -2.2b LUKS: - - # sgdisk -n4:0:+512M -t4:8300 /dev/disk/by-id/scsi-SATA_disk1 - # sgdisk -n1:0:0 -t1:8300 /dev/disk/by-id/scsi-SATA_disk1 - -Always use the long `/dev/disk/by-id/*` aliases with ZFS. Using the `/dev/sd*` device nodes directly can cause sporadic import failures, especially on systems that have more than one storage pool. - -**Hints:** -* `ls -la /dev/disk/by-id` will list the aliases. -* Are you doing this in a virtual machine? If your virtual disk is missing from `/dev/disk/by-id`, use `/dev/vda` if you are using KVM with virtio; otherwise, read the [troubleshooting](https://github.com/zfsonlinux/zfs/wiki/Ubuntu-16.04-Root-on-ZFS#troubleshooting) section. - -2.3 Create the root pool: - -Choose one of the following options: - -2.3a Unencrypted or eCryptfs: - - # zpool create -o ashift=12 \ - -O atime=off -O canmount=off -O compression=lz4 -O normalization=formD \ - -O mountpoint=/ -R /mnt \ - rpool /dev/disk/by-id/scsi-SATA_disk1-part1 - -2.3b LUKS: - - # cryptsetup luksFormat -c aes-xts-plain64 -s 256 -h sha256 \ - /dev/disk/by-id/scsi-SATA_disk1-part1 - # cryptsetup luksOpen /dev/disk/by-id/scsi-SATA_disk1-part1 luks1 - # zpool create -o ashift=12 \ - -O atime=off -O canmount=off -O compression=lz4 -O normalization=formD \ - -O mountpoint=/ -R /mnt \ - rpool /dev/mapper/luks1 - -**Notes:** -* The use of `ashift=12` is recommended here because many drives today have 4KiB (or larger) physical sectors, even though they present 512B logical sectors. Also, a future replacement drive may have 4KiB physical sectors (in which case `ashift=12` is desirable) or 4KiB logical sectors (in which case `ashift=12` is required). -* Setting `normalization=formD` eliminates some corner cases relating to UTF-8 filename normalization. It also implies `utf8only=on`, which means that only UTF-8 filenames are allowed. If you care to support non-UTF-8 filenames, do not use this option. For a discussion of why requiring UTF-8 filenames may be a bad idea, see [The problems with enforced UTF-8 only filenames](http://utcc.utoronto.ca/~cks/space/blog/linux/ForcedUTF8Filenames). -* Make sure to include the `-part1` portion of the drive path. If you forget that, you are specifying the whole disk, which ZFS will then re-partition, and you will lose the bootloader partition(s). -* For LUKS, the key size chosen is 256 bits. However, XTS mode requires two keys, so the LUKS key is split in half. Thus, `-s 256` means AES-128, which is the LUKS and Ubuntu default. -* Your passphrase will likely be the weakest link. Choose wisely. See [section 5 of the cryptsetup FAQ](https://gitlab.com/cryptsetup/cryptsetup/wikis/FrequentlyAskedQuestions#5-security-aspects) for guidance. - -**Hints:** -* The root pool does not have to be a single disk; it can have a mirror or raidz topology. 
In that case, repeat the partitioning commands for all the disks which will be part of the pool. Then, create the pool using `zpool create ... rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part1 /dev/disk/by-id/scsi-SATA_disk2-part1` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). -* The pool name is arbitrary. On systems that can automatically install to ZFS, the root pool is named `rpool` by default. If you work with multiple systems, it might be wise to use `hostname`, `hostname0`, or `hostname-1` instead. - -## Step 3: System Installation - -3.1 Create a filesystem dataset to act as a container: - - # zfs create -o canmount=off -o mountpoint=none rpool/ROOT - -On Solaris systems, the root filesystem is cloned and the suffix is incremented for major system changes through `pkg image-update` or `beadm`. Similar functionality for APT is possible but currently unimplemented. Even without such a tool, it can still be used for manually created clones. - -3.2 Create a filesystem dataset for the root filesystem of the Ubuntu system: - - # zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/ubuntu - # zfs mount rpool/ROOT/ubuntu - -With ZFS, it is not normally necessary to use a mount command (either `mount` or `zfs mount`). This situation is an exception because of `canmount=noauto`. - -3.3 Create datasets: - - # zfs create -o setuid=off rpool/home - # zfs create -o mountpoint=/root rpool/home/root - # zfs create -o canmount=off -o setuid=off -o exec=off rpool/var - # zfs create -o com.sun:auto-snapshot=false rpool/var/cache - # zfs create rpool/var/log - # zfs create rpool/var/spool - # zfs create -o com.sun:auto-snapshot=false -o exec=on rpool/var/tmp - - If you use /srv on this system: - # zfs create rpool/srv - - If this system will have games installed: - # zfs create rpool/var/games - - If this system will store local email in /var/mail: - # zfs create rpool/var/mail - - If this system will use NFS (locking): - # zfs create -o com.sun:auto-snapshot=false \ - -o mountpoint=/var/lib/nfs rpool/var/nfs - -The primary goal of this dataset layout is to separate the OS from user data. This allows the root filesystem to be rolled back without rolling back user data such as logs (in `/var/log`). This will be especially important if/when a `beadm` or similar utility is integrated. Since we are creating multiple datasets anyway, it is trivial to add some restrictions (for extra security) at the same time. The `com.sun.auto-snapshot` setting is used by some ZFS snapshot utilities to exclude transient data. - -3.4 For LUKS installs only: - - # mke2fs -t ext2 /dev/disk/by-id/scsi-SATA_disk1-part4 - # mkdir /mnt/boot - # mount /dev/disk/by-id/scsi-SATA_disk1-part4 /mnt/boot - -3.5 Install the minimal system: - - # chmod 1777 /mnt/var/tmp - # debootstrap xenial /mnt - # zfs set devices=off rpool - -The `debootstrap` command leaves the new system in an unconfigured state. An alternative to using `debootstrap` is to copy the entirety of a working system into the new ZFS root. - -## Step 4: System Configuration - -4.1 Configure the hostname (change `HOSTNAME` to the desired hostname). - - # echo HOSTNAME > /mnt/etc/hostname - - # vi /mnt/etc/hosts - Add a line: - 127.0.1.1 HOSTNAME - or if the system has a real name in DNS: - 127.0.1.1 FQDN HOSTNAME - -**Hint:** Use `nano` if you find `vi` confusing. 
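-
-**Hint:** The `hosts` entry can also be appended without an editor. A minimal sketch, with `HOSTNAME` standing in for the name chosen above (use the `FQDN HOSTNAME` form instead if the system has a real name in DNS):
-
-    # echo "127.0.1.1       HOSTNAME" >> /mnt/etc/hosts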
- -4.2 Configure the network interface: - - Find the interface name: - # ip addr show - - # vi /mnt/etc/network/interfaces.d/NAME - auto NAME - iface NAME inet dhcp - -Customize this file if the system is not a DHCP client. - -4.3 Configure the package sources: - - # vi /mnt/etc/apt/sources.list - deb http://archive.ubuntu.com/ubuntu xenial main universe - deb-src http://archive.ubuntu.com/ubuntu xenial main universe - - deb http://security.ubuntu.com/ubuntu xenial-security main universe - deb-src http://security.ubuntu.com/ubuntu xenial-security main universe - - deb http://archive.ubuntu.com/ubuntu xenial-updates main universe - deb-src http://archive.ubuntu.com/ubuntu xenial-updates main universe - -4.4 Bind the virtual filesystems from the LiveCD environment to the new system and `chroot` into it: - - # mount --rbind /dev /mnt/dev - # mount --rbind /proc /mnt/proc - # mount --rbind /sys /mnt/sys - # chroot /mnt /bin/bash --login - -**Note:** This is using `--rbind`, not `--bind`. - -4.5 Configure a basic system environment: - - # locale-gen en_US.UTF-8 - -Even if you prefer a non-English system language, always ensure that `en_US.UTF-8` is available. - - # echo LANG=en_US.UTF-8 > /etc/default/locale - - # dpkg-reconfigure tzdata - - # ln -s /proc/self/mounts /etc/mtab - # apt update - # apt install --yes ubuntu-minimal - - If you prefer nano over vi, install it: - # apt install --yes nano - -4.6 Install ZFS in the chroot environment for the new system: - - # apt install --yes --no-install-recommends linux-image-generic - # apt install --yes zfs-initramfs - -4.7 For LUKS installs only: - - # echo UUID=$(blkid -s UUID -o value \ - /dev/disk/by-id/scsi-SATA_disk1-part4) \ - /boot ext2 defaults 0 2 >> /etc/fstab - - # apt install --yes cryptsetup - - # echo luks1 UUID=$(blkid -s UUID -o value \ - /dev/disk/by-id/scsi-SATA_disk1-part1) none \ - luks,discard,initramfs > /etc/crypttab - - # vi /etc/udev/rules.d/99-local-crypt.rules - ENV{DM_NAME}!="", SYMLINK+="$env{DM_NAME}" - ENV{DM_NAME}!="", SYMLINK+="dm-name-$env{DM_NAME}" - - # ln -s /dev/mapper/luks1 /dev/luks1 - -**Notes:** -* The use of `initramfs` is a work-around for [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). -* The 99-local-crypt.rules file and symlink in /dev are a work-around for [grub-probe assuming all devices are in /dev](https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1527727). - -4.8 Install GRUB - -Choose one of the following options: - -4.8a Install GRUB for legacy (MBR) booting - - # apt install --yes grub-pc - -Install GRUB to the disk(s), not the partition(s). - -4.8b Install GRUB for UEFI booting - - # apt install dosfstools - # mkdosfs -F 32 -n EFI /dev/disk/by-id/scsi-SATA_disk1-part3 - # mkdir /boot/efi - # echo PARTUUID=$(blkid -s PARTUUID -o value \ - /dev/disk/by-id/scsi-SATA_disk1-part3) \ - /boot/efi vfat nofail,x-systemd.device-timeout=1 0 1 >> /etc/fstab - # mount /boot/efi - # apt install --yes grub-efi-amd64 - -4.9 Setup system groups: - - # addgroup --system lpadmin - # addgroup --system sambashare - -4.10 Set a root password - - # passwd - -4.11 Fix filesystem mount ordering - -[Until ZFS gains a systemd mount generator](https://github.com/zfsonlinux/zfs/issues/4898), there are races between mounting filesystems and starting certain daemons. In practice, the issues (e.g. [#5754](https://github.com/zfsonlinux/zfs/issues/5754)) seem to be with certain filesystems in `/var`, specifically `/var/log` and `/var/tmp`. 
Setting these to use `legacy` mounting, and listing them in `/etc/fstab` makes systemd aware that these are separate mountpoints. In turn, `rsyslog.service` depends on `var-log.mount` by way of `local-fs.target` and services using the `PrivateTmp` feature of systemd automatically use `After=var-tmp.mount`. - - # zfs set mountpoint=legacy rpool/var/log - # zfs set mountpoint=legacy rpool/var/tmp - # cat >> /etc/fstab << EOF - rpool/var/log /var/log zfs defaults 0 0 - rpool/var/tmp /var/tmp zfs defaults 0 0 - EOF - -## Step 5: GRUB Installation - -5.1 Verify that the ZFS root filesystem is recognized: - - # grub-probe / - zfs - -**Note:** GRUB uses `zpool status` in order to determine the location of devices. [grub-probe assumes all devices are in /dev](https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1527727). The `zfs-initramfs` package [ships udev rules that create symlinks](https://packages.ubuntu.com/xenial-updates/all/zfs-initramfs/filelist) to [work around the problem](https://bugs.launchpad.net/ubuntu/+source/zfs-initramfs/+bug/1530953), but [there have still been reports of problems](https://github.com/zfsonlinux/grub/issues/5#issuecomment-249427634). If this happens, you will get an error saying `grub-probe: error: failed to get canonical path` and should run the following: - - # export ZPOOL_VDEV_NAME_PATH=YES - -5.2 Refresh the initrd files: - - # update-initramfs -c -k all - update-initramfs: Generating /boot/initrd.img-4.4.0-21-generic - -**Note:** When using LUKS, this will print "WARNING could not determine root device from /etc/fstab". This is because [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -5.3 Optional (but highly recommended): Make debugging GRUB easier: - - # vi /etc/default/grub - Comment out: GRUB_HIDDEN_TIMEOUT=0 - Remove quiet and splash from: GRUB_CMDLINE_LINUX_DEFAULT - Uncomment: GRUB_TERMINAL=console - Save and quit. - -Later, once the system has rebooted twice and you are sure everything is working, you can undo these changes, if desired. - -5.4 Update the boot configuration: - - # update-grub - Generating grub configuration file ... - Found linux image: /boot/vmlinuz-4.4.0-21-generic - Found initrd image: /boot/initrd.img-4.4.0-21-generic - done - -5.5 Install the boot loader - -5.5a For legacy (MBR) booting, install GRUB to the MBR: - - # grub-install /dev/disk/by-id/scsi-SATA_disk1 - Installing for i386-pc platform. - Installation finished. No error reported. - -Do not reboot the computer until you get exactly that result message. Note that you are installing GRUB to the whole disk, not a partition. - -If you are creating a mirror, repeat the grub-install command for each disk in the pool. - -5.5b For UEFI booting, install GRUB: - - # grub-install --target=x86_64-efi --efi-directory=/boot/efi \ - --bootloader-id=ubuntu --recheck --no-floppy - -5.6 Verify that the ZFS module is installed: - - # ls /boot/grub/*/zfs.mod - -## Step 6: First Boot - -6.1 Snapshot the initial installation: - - # zfs snapshot rpool/ROOT/ubuntu@install - -In the future, you will likely want to take snapshots before each upgrade, and remove old snapshots (including this one) at some point to save space. 
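-
-**Hint:** As one possible sketch of that housekeeping (nothing below is required by this guide), you might take a dated snapshot before each upgrade and prune old ones once you are confident the system still works:
-
-    # zfs snapshot rpool/ROOT/ubuntu@$(date +%Y%m%d)
-    # zfs list -t snapshot -r rpool/ROOT/ubuntu
-    # zfs destroy rpool/ROOT/ubuntu@install
-
-The snapshot name after `@` is arbitrary; only the dataset name (`rpool/ROOT/ubuntu`) has to match the layout created in this guide.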
- -6.2 Exit from the `chroot` environment back to the LiveCD environment: - - # exit - -6.3 Run these commands in the LiveCD environment to unmount all filesystems: - - # mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - # zpool export rpool - -6.4 Reboot: - - # reboot - -6.5 Wait for the newly installed system to boot normally. Login as root. - -6.6 Create a user account: - -Choose one of the following options: - -6.6a Unencrypted or LUKS: - - # zfs create rpool/home/YOURUSERNAME - # adduser YOURUSERNAME - # cp -a /etc/skel/.[!.]* /home/YOURUSERNAME - # chown -R YOURUSERNAME:YOURUSERNAME /home/YOURUSERNAME - -6.6b eCryptfs: - - # apt install ecryptfs-utils - - # zfs create -o compression=off -o mountpoint=/home/.ecryptfs/YOURUSERNAME \ - rpool/home/temp-YOURUSERNAME - # adduser --encrypt-home YOURUSERNAME - # zfs rename rpool/home/temp-YOURUSERNAME rpool/home/YOURUSERNAME - -The temporary name for the dataset is required to work-around [a bug in ecryptfs-setup-private](https://bugs.launchpad.net/ubuntu/+source/ecryptfs-utils/+bug/1574174). Otherwise, it will fail with an error saying the home directory is already mounted; that check is not specific enough in the pattern it uses. - -**Note:** Automatically mounted snapshots (i.e. the `.zfs/snapshots` directory) will not work through eCryptfs. You can do another eCryptfs mount manually if you need to access files in a snapshot. A script to automate the mounting should be possible, but has not yet been implemented. - -6.7 Add your user account to the default set of groups for an administrator: - - # usermod -a -G adm,cdrom,dip,lpadmin,plugdev,sambashare,sudo YOURUSERNAME - -6.8 Mirror GRUB - -If you installed to multiple disks, install GRUB on the additional disks: - -6.8a For legacy (MBR) booting: - - # dpkg-reconfigure grub-pc - Hit enter until you get to the device selection screen. - Select (using the space bar) all of the disks (not partitions) in your pool. - -6.8b UEFI - - # umount /boot/efi - - For the second and subsequent disks (increment ubuntu-2 to -3, etc.): - # dd if=/dev/disk/by-id/scsi-SATA_disk1-part3 \ - of=/dev/disk/by-id/scsi-SATA_disk2-part3 - # efibootmgr -c -g -d /dev/disk/by-id/scsi-SATA_disk2 \ - -p 3 -L "ubuntu-2" -l '\EFI\Ubuntu\grubx64.efi' - - # mount /boot/efi - -## Step 7: Configure Swap - -7.1 Create a volume dataset (zvol) for use as a swap device: - - # zfs create -V 4G -b $(getconf PAGESIZE) -o compression=zle \ - -o logbias=throughput -o sync=always \ - -o primarycache=metadata -o secondarycache=none \ - -o com.sun:auto-snapshot=false rpool/swap - -You can adjust the size (the `4G` part) to your needs. - -The compression algorithm is set to `zle` because it is the cheapest available algorithm. As this guide recommends `ashift=12` (4 kiB blocks on disk), the common case of a 4 kiB page size means that no compression algorithm can reduce I/O. The exception is all-zero pages, which are dropped by ZFS; but some form of compression has to be enabled to get this behavior. - -7.2 Configure the swap device: - -Choose one of the following options: - -7.2a Unencrypted or LUKS: - -**Caution**: Always use long `/dev/zvol` aliases in configuration files. Never use a short `/dev/zdX` device name. 
- - # mkswap -f /dev/zvol/rpool/swap - # echo /dev/zvol/rpool/swap none swap defaults 0 0 >> /etc/fstab - -7.2b eCryptfs: - - # apt install cryptsetup - # echo cryptswap1 /dev/zvol/rpool/swap /dev/urandom \ - swap,cipher=aes-xts-plain64:sha256,size=256 >> /etc/crypttab - # systemctl daemon-reload - # systemctl start systemd-cryptsetup@cryptswap1.service - # echo /dev/mapper/cryptswap1 none swap defaults 0 0 >> /etc/fstab - -7.3 Enable the swap device: - - # swapon -av - -## Step 8: Full Software Installation - -8.1 Upgrade the minimal system: - - # apt dist-upgrade --yes - -8.2 Install a regular set of software: - -Choose one of the following options: - -8.2a Install a command-line environment only: - - # apt install --yes ubuntu-standard - -8.2b Install a full GUI environment: - - # apt install --yes ubuntu-desktop - -**Hint**: If you are installing a full GUI environment, you will likely want to manage your network with NetworkManager. In that case, `rm /etc/network/interfaces.d/eth0`. - -8.3 Optional: Disable log compression: - -As `/var/log` is already compressed by ZFS, logrotate’s compression is going to burn CPU and disk I/O for (in most cases) very little gain. Also, if you are making snapshots of `/var/log`, logrotate’s compression will actually waste space, as the uncompressed data will live on in the snapshot. You can edit the files in `/etc/logrotate.d` by hand to comment out `compress`, or use this loop (copy-and-paste highly recommended): - - # for file in /etc/logrotate.d/* ; do - if grep -Eq "(^|[^#y])compress" "$file" ; then - sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$file" - fi - done - -8.4 Reboot: - - # reboot - -### Step 9: Final Cleanup - -9.1 Wait for the system to boot normally. Login using the account you created. Ensure the system (including networking) works normally. - -9.2 Optional: Delete the snapshot of the initial installation: - - $ sudo zfs destroy rpool/ROOT/ubuntu@install - -9.3 Optional: Disable the root password - - $ sudo usermod -p '*' root - -9.4 Optional: - -If you prefer the graphical boot process, you can re-enable it now. If you are using LUKS, it makes the prompt look nicer. - - $ sudo vi /etc/default/grub - Uncomment GRUB_HIDDEN_TIMEOUT=0 - Add quiet and splash to GRUB_CMDLINE_LINUX_DEFAULT - Comment out GRUB_TERMINAL=console - Save and quit. - - $ sudo update-grub - -## Troubleshooting - -### Rescuing using a Live CD - -Boot the Live CD and open a terminal. - -Become root and install the ZFS utilities: - - $ sudo -i - # apt update - # apt install --yes zfsutils-linux - -This will automatically import your pool. Export it and re-import it to get the mounts right: - - # zpool export -a - # zpool import -N -R /mnt rpool - # zfs mount rpool/ROOT/ubuntu - # zfs mount -a - -If needed, you can chroot into your installed environment: - - # mount --rbind /dev /mnt/dev - # mount --rbind /proc /mnt/proc - # mount --rbind /sys /mnt/sys - # chroot /mnt /bin/bash --login - -Do whatever you need to do to fix your system. - -When done, cleanup: - - # mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - # zpool export rpool - # reboot - -### MPT2SAS - -Most problem reports for this tutorial involve `mpt2sas` hardware that does slow asynchronous drive initialization, like some IBM M1015 or OEM-branded cards that have been flashed to the reference LSI firmware. - -The basic problem is that disks on these controllers are not visible to the Linux kernel until after the regular system is started, and ZoL does not hotplug pool members. 
See https://github.com/zfsonlinux/zfs/issues/330. - -Most LSI cards are perfectly compatible with ZoL. If your card has this glitch, try setting rootdelay=X in GRUB_CMDLINE_LINUX. The system will wait up to X seconds for all drives to appear before importing the pool. - -### Areca - -Systems that require the `arcsas` blob driver should add it to the `/etc/initramfs-tools/modules` file and run `update-initramfs -c -k all`. - -Upgrade or downgrade the Areca driver if something like `RIP: 0010:[] [] native_read_tsc+0x6/0x20` appears anywhere in kernel log. ZoL is unstable on systems that emit this error message. - -### VMware - -* Set `disk.EnableUUID = "TRUE"` in the vmx file or vsphere configuration. Doing this ensures that `/dev/disk` aliases are created in the guest. - -### QEMU/KVM/XEN - -Set a unique serial number on each virtual disk using libvirt or qemu (e.g. `-drive if=none,id=disk1,file=disk1.qcow2,serial=1234567890`). - -To be able to use UEFI in guests (instead of only BIOS booting), run this on the host: - - $ sudo apt install ovmf - $ sudo vi /etc/libvirt/qemu.conf - Uncomment these lines: - nvram = [ - "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd", - "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd" - ] - $ sudo service libvirt-bin restart +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Ubuntu-18.04-Root-on-ZFS.md b/Ubuntu-18.04-Root-on-ZFS.md index 543020b..1220f8e 100644 --- a/Ubuntu-18.04-Root-on-ZFS.md +++ b/Ubuntu-18.04-Root-on-ZFS.md @@ -1,734 +1,3 @@ -### Caution -* This HOWTO uses a whole physical disk. -* Do not use these instructions for dual-booting. -* Backup your data. Any existing data will be lost. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2018.04%20Root%20on%20ZFS.html -### System Requirements -* [Ubuntu 18.04.3 ("Bionic") Desktop CD](http://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso) (*not* any server images) -* Installing on a drive which presents 4KiB logical sectors (a “4Kn” drive) only works with UEFI booting. This not unique to ZFS. [GRUB does not and will not work on 4Kn with legacy (BIOS) booting.](http://savannah.gnu.org/bugs/?46700) - -Computers that have less than 2 GiB of memory run ZFS slowly. 4 GiB of memory is recommended for normal performance in basic workloads. If you wish to use deduplication, you will need [massive amounts of RAM](http://wiki.freebsd.org/ZFSTuningGuide#Deduplication). Enabling deduplication is a permanent change that cannot be easily reverted. - -## Support - -If you need help, reach out to the community using the [zfs-discuss mailing list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists) or IRC at #zfsonlinux on [freenode](https://freenode.net/). If you have a bug report or feature request related to this HOWTO, please [file a new issue](https://github.com/zfsonlinux/zfs/issues/new) and mention @rlaager. - -## Contributing - -Edit permission on this wiki is restricted. Also, GitHub wikis do not support pull requests. However, you can clone the wiki using git. - -1) `git clone https://github.com/zfsonlinux/zfs.wiki.git` -2) Make your changes. -3) Use `git diff > my-changes.patch` to create a patch. (Advanced git users may wish to `git commit` to a branch and `git format-patch`.) -4) [File a new issue](https://github.com/zfsonlinux/zfs/issues/new), mention @rlaager, and attach the patch. 
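-
-As a sketch of the branch-and-format-patch route mentioned in step 3 (the branch name and commit message below are only examples):
-
-    git clone https://github.com/zfsonlinux/zfs.wiki.git
-    cd zfs.wiki
-    git checkout -b my-doc-fix
-    # ...edit the affected page(s), then:
-    git commit -am "Describe your change"
-    git format-patch master
-
-`git format-patch` writes one `.patch` file per commit to the current directory; attach those to the issue instead of the `git diff` output.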
- -## Encryption - -This guide supports two different encryption options: unencrypted and LUKS (full-disk encryption). ZFS native encryption has not yet been released. With either option, all ZFS features are fully available. - -Unencrypted does not encrypt anything, of course. With no encryption happening, this option naturally has the best performance. - -LUKS encrypts almost everything: the OS, swap, home directories, and anything else. The only unencrypted data is the bootloader, kernel, and initrd. The system cannot boot without the passphrase being entered at the console. Performance is good, but LUKS sits underneath ZFS, so if multiple disks (mirror or raidz topologies) are used, the data has to be encrypted once per disk. - -## Step 1: Prepare The Install Environment - -1.1 Boot the Ubuntu Live CD. Select Try Ubuntu. Connect your system to the Internet as appropriate (e.g. join your WiFi network). Open a terminal (press Ctrl-Alt-T). - -1.2 Setup and update the repositories: - - sudo apt-add-repository universe - sudo apt update - -1.3 Optional: Install and start the OpenSSH server in the Live CD environment: - -If you have a second system, using SSH to access the target system can be convenient. - - passwd - There is no current password; hit enter at that prompt. - sudo apt install --yes openssh-server - -**Hint:** You can find your IP address with `ip addr show scope global | grep inet`. Then, from your main machine, connect with `ssh ubuntu@IP`. - -1.4 Become root: - - sudo -i - -1.5 Install ZFS in the Live CD environment: - - apt install --yes debootstrap gdisk zfs-initramfs - -## Step 2: Disk Formatting - -2.1 Set a variable with the disk name: - - DISK=/dev/disk/by-id/scsi-SATA_disk1 - -Always use the long `/dev/disk/by-id/*` aliases with ZFS. Using the `/dev/sd*` device nodes directly can cause sporadic import failures, especially on systems that have more than one storage pool. - -**Hints:** -* `ls -la /dev/disk/by-id` will list the aliases. -* Are you doing this in a virtual machine? If your virtual disk is missing from `/dev/disk/by-id`, use `/dev/vda` if you are using KVM with virtio; otherwise, read the [troubleshooting](#troubleshooting) section. - -2.2 If you are re-using a disk, clear it as necessary: - -If the disk was previously used in an MD array, zero the superblock: - - apt install --yes mdadm - mdadm --zero-superblock --force $DISK - -Clear the partition table: - - sgdisk --zap-all $DISK - -2.3 Partition your disk(s): - -Run this if you need legacy (BIOS) booting: - - sgdisk -a1 -n1:24K:+1000K -t1:EF02 $DISK - -Run this for UEFI booting (for use now or in the future): - - sgdisk -n2:1M:+512M -t2:EF00 $DISK - -Run this for the boot pool: - - sgdisk -n3:0:+1G -t3:BF01 $DISK - -Choose one of the following options: - -2.3a Unencrypted: - - sgdisk -n4:0:0 -t4:BF01 $DISK - -2.3b LUKS: - - sgdisk -n4:0:0 -t4:8300 $DISK - -If you are creating a mirror or raidz topology, repeat the partitioning commands for all the disks which will be part of the pool. 
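-
-For example, a two-disk mirror could be partitioned with a short loop. This is only a sketch: it assumes the unencrypted layout (partition 4 type `BF01`; use `8300` for LUKS), it assumes you want both the legacy-boot and UEFI partitions, and the disk names are placeholders for your own `/dev/disk/by-id` aliases. A separate variable (`D`) is used so that `$DISK` keeps pointing at the disk you set in step 2.1:
-
-    for D in /dev/disk/by-id/scsi-SATA_disk1 /dev/disk/by-id/scsi-SATA_disk2 ; do
-        sgdisk --zap-all $D
-        sgdisk -a1 -n1:24K:+1000K -t1:EF02 $D
-        sgdisk -n2:1M:+512M -t2:EF00 $D
-        sgdisk -n3:0:+1G -t3:BF01 $D
-        sgdisk -n4:0:0 -t4:BF01 $D
-    done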
- -2.4 Create the boot pool: - - zpool create -o ashift=12 -d \ - -o feature@async_destroy=enabled \ - -o feature@bookmarks=enabled \ - -o feature@embedded_data=enabled \ - -o feature@empty_bpobj=enabled \ - -o feature@enabled_txg=enabled \ - -o feature@extensible_dataset=enabled \ - -o feature@filesystem_limits=enabled \ - -o feature@hole_birth=enabled \ - -o feature@large_blocks=enabled \ - -o feature@lz4_compress=enabled \ - -o feature@spacemap_histogram=enabled \ - -o feature@userobj_accounting=enabled \ - -O acltype=posixacl -O canmount=off -O compression=lz4 -O devices=off \ - -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt bpool ${DISK}-part3 - -You should not need to customize any of the options for the boot pool. - -GRUB does not support all of the zpool features. See `spa_feature_names` in [grub-core/fs/zfs/zfs.c](http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/fs/zfs/zfs.c#n276). This step creates a separate boot pool for `/boot` with the features limited to only those that GRUB supports, allowing the root pool to use any/all features. Note that GRUB opens the pool read-only, so all read-only compatible features are "supported" by GRUB. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... bpool mirror /dev/disk/by-id/scsi-SATA_disk1-part3 /dev/disk/by-id/scsi-SATA_disk2-part3` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). -* The pool name is arbitrary. If changed, the new name must be used consistently. The `bpool` convention originated in this HOWTO. - -2.5 Create the root pool: - -Choose one of the following options: - -2.5a Unencrypted: - - zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt rpool ${DISK}-part4 - -2.5b LUKS: - - cryptsetup luksFormat -c aes-xts-plain64 -s 512 -h sha256 ${DISK}-part4 - cryptsetup luksOpen ${DISK}-part4 luks1 - zpool create -o ashift=12 \ - -O acltype=posixacl -O canmount=off -O compression=lz4 \ - -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa \ - -O mountpoint=/ -R /mnt rpool /dev/mapper/luks1 - -* The use of `ashift=12` is recommended here because many drives today have 4KiB (or larger) physical sectors, even though they present 512B logical sectors. Also, a future replacement drive may have 4KiB physical sectors (in which case `ashift=12` is desirable) or 4KiB logical sectors (in which case `ashift=12` is required). -* Setting `-O acltype=posixacl` enables POSIX ACLs globally. If you do not want this, remove that option, but later add `-o acltype=posixacl` (note: lowercase "o") to the `zfs create` for `/var/log`, as [journald requires ACLs](https://askubuntu.com/questions/970886/journalctl-says-failed-to-search-journal-acl-operation-not-supported) -* Setting `normalization=formD` eliminates some corner cases relating to UTF-8 filename normalization. It also implies `utf8only=on`, which means that only UTF-8 filenames are allowed. If you care to support non-UTF-8 filenames, do not use this option. For a discussion of why requiring UTF-8 filenames may be a bad idea, see [The problems with enforced UTF-8 only filenames](http://utcc.utoronto.ca/~cks/space/blog/linux/ForcedUTF8Filenames). 
-* Setting `relatime=on` is a middle ground between classic POSIX `atime` behavior (with its significant performance impact) and `atime=off` (which provides the best performance by completely disabling atime updates). Since Linux 2.6.30, `relatime` has been the default for other filesystems. See [RedHat's documentation](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/power_management_guide/relatime) for further information. -* Setting `xattr=sa` [vastly improves the performance of extended attributes](https://github.com/zfsonlinux/zfs/commit/82a37189aac955c81a59a5ecc3400475adb56355). Inside ZFS, extended attributes are used to implement POSIX ACLs. Extended attributes can also be used by user-space applications. [They are used by some desktop GUI applications.](https://en.wikipedia.org/wiki/Extended_file_attributes#Linux) [They can be used by Samba to store Windows ACLs and DOS attributes; they are required for a Samba Active Directory domain controller.](https://wiki.samba.org/index.php/Setting_up_a_Share_Using_Windows_ACLs) Note that [`xattr=sa` is Linux-specific.](http://open-zfs.org/wiki/Platform_code_differences) If you move your `xattr=sa` pool to another OpenZFS implementation besides ZFS-on-Linux, extended attributes will not be readable (though your data will be). If portability of extended attributes is important to you, omit the `-O xattr=sa` above. Even if you do not want `xattr=sa` for the whole pool, it is probably fine to use it for `/var/log`. -* Make sure to include the `-part4` portion of the drive path. If you forget that, you are specifying the whole disk, which ZFS will then re-partition, and you will lose the bootloader partition(s). -* For LUKS, the key size chosen is 512 bits. However, XTS mode requires two keys, so the LUKS key is split in half. Thus, `-s 512` means AES-256. -* Your passphrase will likely be the weakest link. Choose wisely. See [section 5 of the cryptsetup FAQ](https://gitlab.com/cryptsetup/cryptsetup/wikis/FrequentlyAskedQuestions#5-security-aspects) for guidance. - -**Hints:** -* If you are creating a mirror or raidz topology, create the pool using `zpool create ... rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part4 /dev/disk/by-id/scsi-SATA_disk2-part4` (or replace `mirror` with `raidz`, `raidz2`, or `raidz3` and list the partitions from additional disks). For LUKS, use `/dev/mapper/luks1`, `/dev/mapper/luks2`, etc., which you will have to create using `cryptsetup`. -* The pool name is arbitrary. If changed, the new name must be used consistently. On systems that can automatically install to ZFS, the root pool is named `rpool` by default. - -## Step 3: System Installation - -3.1 Create filesystem datasets to act as containers: - - zfs create -o canmount=off -o mountpoint=none rpool/ROOT - zfs create -o canmount=off -o mountpoint=none bpool/BOOT - -On Solaris systems, the root filesystem is cloned and the suffix is incremented for major system changes through `pkg image-update` or `beadm`. Similar functionality for APT is possible but currently unimplemented. Even without such a tool, it can still be used for manually created clones. - -3.2 Create filesystem datasets for the root and boot filesystems: - - zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/ubuntu - zfs mount rpool/ROOT/ubuntu - - zfs create -o canmount=noauto -o mountpoint=/boot bpool/BOOT/ubuntu - zfs mount bpool/BOOT/ubuntu - -With ZFS, it is not normally necessary to use a mount command (either `mount` or `zfs mount`). 
This situation is an exception because of `canmount=noauto`. - -3.3 Create datasets: - - zfs create rpool/home - zfs create -o mountpoint=/root rpool/home/root - zfs create -o canmount=off rpool/var - zfs create -o canmount=off rpool/var/lib - zfs create rpool/var/log - zfs create rpool/var/spool - -The datasets below are optional, depending on your preferences and/or software -choices. - -If you wish to exclude these from snapshots: - - zfs create -o com.sun:auto-snapshot=false rpool/var/cache - zfs create -o com.sun:auto-snapshot=false rpool/var/tmp - chmod 1777 /mnt/var/tmp - -If you use /opt on this system: - - zfs create rpool/opt - -If you use /srv on this system: - - zfs create rpool/srv - -If you use /usr/local on this system: - - zfs create -o canmount=off rpool/usr - zfs create rpool/usr/local - -If this system will have games installed: - - zfs create rpool/var/games - -If this system will store local email in /var/mail: - - zfs create rpool/var/mail - -If this system will use Snap packages: - - zfs create rpool/var/snap - -If you use /var/www on this system: - - zfs create rpool/var/www - -If this system will use GNOME: - - zfs create rpool/var/lib/AccountsService - -If this system will use Docker (which manages its own datasets & snapshots): - - zfs create -o com.sun:auto-snapshot=false rpool/var/lib/docker - -If this system will use NFS (locking): - - zfs create -o com.sun:auto-snapshot=false rpool/var/lib/nfs - -A tmpfs is recommended later, but if you want a separate dataset for /tmp: - - zfs create -o com.sun:auto-snapshot=false rpool/tmp - chmod 1777 /mnt/tmp - -The primary goal of this dataset layout is to separate the OS from user data. This allows the root filesystem to be rolled back without rolling back user data such as logs (in `/var/log`). This will be especially important if/when a `beadm` or similar utility is integrated. The `com.sun.auto-snapshot` setting is used by some ZFS snapshot utilities to exclude transient data. - -If you do nothing extra, `/tmp` will be stored as part of the root filesystem. Alternatively, you can create a separate dataset for `/tmp`, as shown above. This keeps the `/tmp` data out of snapshots of your root filesystem. It also allows you to set a quota on `rpool/tmp`, if you want to limit the maximum space used. Otherwise, you can use a tmpfs (RAM filesystem) later. - -3.4 Install the minimal system: - - debootstrap bionic /mnt - zfs set devices=off rpool - -The `debootstrap` command leaves the new system in an unconfigured state. An alternative to using `debootstrap` is to copy the entirety of a working system into the new ZFS root. - -## Step 4: System Configuration - -4.1 Configure the hostname (change `HOSTNAME` to the desired hostname). - - echo HOSTNAME > /mnt/etc/hostname - - vi /mnt/etc/hosts - Add a line: - 127.0.1.1 HOSTNAME - or if the system has a real name in DNS: - 127.0.1.1 FQDN HOSTNAME - -**Hint:** Use `nano` if you find `vi` confusing. - -4.2 Configure the network interface: - -Find the interface name: - - ip addr show - -Adjust NAME below to match your interface name: - - vi /mnt/etc/netplan/01-netcfg.yaml - network: - version: 2 - ethernets: - NAME: - dhcp4: true - -Customize this file if the system is not a DHCP client. 
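-
-For a statically addressed system, the same file might look roughly like the following. This is only an illustrative sketch; the interface name, address, gateway, and DNS server are placeholders for your own values:
-
-    vi /mnt/etc/netplan/01-netcfg.yaml
-      network:
-        version: 2
-        ethernets:
-          NAME:
-            dhcp4: false
-            addresses: [192.168.1.50/24]
-            gateway4: 192.168.1.1
-            nameservers:
-              addresses: [192.168.1.1]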
- -4.3 Configure the package sources: - - vi /mnt/etc/apt/sources.list - deb http://archive.ubuntu.com/ubuntu bionic main universe - deb-src http://archive.ubuntu.com/ubuntu bionic main universe - - deb http://security.ubuntu.com/ubuntu bionic-security main universe - deb-src http://security.ubuntu.com/ubuntu bionic-security main universe - - deb http://archive.ubuntu.com/ubuntu bionic-updates main universe - deb-src http://archive.ubuntu.com/ubuntu bionic-updates main universe - -4.4 Bind the virtual filesystems from the LiveCD environment to the new system and `chroot` into it: - - mount --rbind /dev /mnt/dev - mount --rbind /proc /mnt/proc - mount --rbind /sys /mnt/sys - chroot /mnt /usr/bin/env DISK=$DISK bash --login - -**Note:** This is using `--rbind`, not `--bind`. - -4.5 Configure a basic system environment: - - ln -s /proc/self/mounts /etc/mtab - apt update - - dpkg-reconfigure locales - -Even if you prefer a non-English system language, always ensure that `en_US.UTF-8` is available. - - dpkg-reconfigure tzdata - -If you prefer nano over vi, install it: - - apt install --yes nano - -4.6 Install ZFS in the chroot environment for the new system: - - apt install --yes --no-install-recommends linux-image-generic - apt install --yes zfs-initramfs - -**Hint:** For the HWE kernel, install `linux-image-generic-hwe-18.04` instead of `linux-image-generic`. - -4.7 For LUKS installs only, setup crypttab: - - apt install --yes cryptsetup - - echo luks1 UUID=$(blkid -s UUID -o value ${DISK}-part4) none \ - luks,discard,initramfs > /etc/crypttab - -* The use of `initramfs` is a work-around for [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -**Hint:** If you are creating a mirror or raidz topology, repeat the `/etc/crypttab` entries for `luks2`, etc. adjusting for each disk. - -4.8 Install GRUB - -Choose one of the following options: - -4.8a Install GRUB for legacy (BIOS) booting - - apt install --yes grub-pc - -Install GRUB to the disk(s), not the partition(s). - -4.8b Install GRUB for UEFI booting - - apt install dosfstools - mkdosfs -F 32 -s 1 -n EFI ${DISK}-part2 - mkdir /boot/efi - echo PARTUUID=$(blkid -s PARTUUID -o value ${DISK}-part2) \ - /boot/efi vfat nofail,x-systemd.device-timeout=1 0 1 >> /etc/fstab - mount /boot/efi - apt install --yes grub-efi-amd64-signed shim-signed - -* The `-s 1` for `mkdosfs` is only necessary for drives which present 4 KiB logical sectors (“4Kn” drives) to meet the minimum cluster size (given the partition size of 512 MiB) for FAT32. It also works fine on drives which present 512 B sectors. - -**Note:** If you are creating a mirror or raidz topology, this step only installs GRUB on the first disk. The other disk(s) will be handled later. - -4.9 Set a root password - - passwd - -4.10 Enable importing bpool - -This ensures that `bpool` is always imported, regardless of whether `/etc/zfs/zpool.cache` exists, whether it is in the cachefile or not, or whether `zfs-import-scan.service` is enabled. 
-``` - vi /etc/systemd/system/zfs-import-bpool.service - [Unit] - DefaultDependencies=no - Before=zfs-import-scan.service - Before=zfs-import-cache.service - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=/sbin/zpool import -N -o cachefile=none bpool - - [Install] - WantedBy=zfs-import.target -``` - - systemctl enable zfs-import-bpool.service - -4.11 Optional (but recommended): Mount a tmpfs to /tmp - -If you chose to create a `/tmp` dataset above, skip this step, as they are mutually exclusive choices. Otherwise, you can put `/tmp` on a tmpfs (RAM filesystem) by enabling the `tmp.mount` unit. - - cp /usr/share/systemd/tmp.mount /etc/systemd/system/ - systemctl enable tmp.mount - -4.12 Setup system groups: - - addgroup --system lpadmin - addgroup --system sambashare - -## Step 5: GRUB Installation - -5.1 Verify that the ZFS boot filesystem is recognized: - - grub-probe /boot - -5.2 Refresh the initrd files: - - update-initramfs -u -k all - -**Note:** When using LUKS, this will print "WARNING could not determine root device from /etc/fstab". This is because [cryptsetup does not support ZFS](https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1612906). - -5.3 Workaround GRUB's missing zpool-features support: - - vi /etc/default/grub - Set: GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/ubuntu" - -5.4 Optional (but highly recommended): Make debugging GRUB easier: - - vi /etc/default/grub - Comment out: GRUB_TIMEOUT_STYLE=hidden - Set: GRUB_TIMEOUT=5 - Below GRUB_TIMEOUT, add: GRUB_RECORDFAIL_TIMEOUT=5 - Remove quiet and splash from: GRUB_CMDLINE_LINUX_DEFAULT - Uncomment: GRUB_TERMINAL=console - Save and quit. - -Later, once the system has rebooted twice and you are sure everything is working, you can undo these changes, if desired. - -5.5 Update the boot configuration: - - update-grub - -**Note:** Ignore errors from `osprober`, if present. - -5.6 Install the boot loader - -5.6a For legacy (BIOS) booting, install GRUB to the MBR: - - grub-install $DISK - -Note that you are installing GRUB to the whole disk, not a partition. - -If you are creating a mirror or raidz topology, repeat the `grub-install` command for each disk in the pool. - -5.6b For UEFI booting, install GRUB: - - grub-install --target=x86_64-efi --efi-directory=/boot/efi \ - --bootloader-id=ubuntu --recheck --no-floppy - -It is not necessary to specify the disk here. If you are creating a mirror or raidz topology, the additional disks will be handled later. - -5.7 Verify that the ZFS module is installed: - - ls /boot/grub/*/zfs.mod - -5.8 Fix filesystem mount ordering - -[Until ZFS gains a systemd mount generator](https://github.com/zfsonlinux/zfs/issues/4898), there are races between mounting filesystems and starting certain daemons. In practice, the issues (e.g. [#5754](https://github.com/zfsonlinux/zfs/issues/5754)) seem to be with certain filesystems in `/var`, specifically `/var/log` and `/var/tmp`. Setting these to use `legacy` mounting, and listing them in `/etc/fstab` makes systemd aware that these are separate mountpoints. In turn, `rsyslog.service` depends on `var-log.mount` by way of `local-fs.target` and services using the `PrivateTmp` feature of systemd automatically use `After=var-tmp.mount`. - -Until there is support for mounting `/boot` in the initramfs, we also need to mount that, because it was marked `canmount=noauto`. Also, with UEFI, we need to ensure it is mounted before its child filesystem `/boot/efi`. 
- -`rpool` is guaranteed to be imported by the initramfs, so there is no point in adding `x-systemd.requires=zfs-import.target` to those filesystems. - - -For UEFI booting, unmount /boot/efi first: - - umount /boot/efi - -Everything else applies to both BIOS and UEFI booting: - - zfs set mountpoint=legacy bpool/BOOT/ubuntu - echo bpool/BOOT/ubuntu /boot zfs \ - nodev,relatime,x-systemd.requires=zfs-import-bpool.service 0 0 >> /etc/fstab - - zfs set mountpoint=legacy rpool/var/log - echo rpool/var/log /var/log zfs nodev,relatime 0 0 >> /etc/fstab - - zfs set mountpoint=legacy rpool/var/spool - echo rpool/var/spool /var/spool zfs nodev,relatime 0 0 >> /etc/fstab - -If you created a /var/tmp dataset: - - zfs set mountpoint=legacy rpool/var/tmp - echo rpool/var/tmp /var/tmp zfs nodev,relatime 0 0 >> /etc/fstab - -If you created a /tmp dataset: - - zfs set mountpoint=legacy rpool/tmp - echo rpool/tmp /tmp zfs nodev,relatime 0 0 >> /etc/fstab - -## Step 6: First Boot - -6.1 Snapshot the initial installation: - - zfs snapshot bpool/BOOT/ubuntu@install - zfs snapshot rpool/ROOT/ubuntu@install - -In the future, you will likely want to take snapshots before each upgrade, and remove old snapshots (including this one) at some point to save space. - -6.2 Exit from the `chroot` environment back to the LiveCD environment: - - exit - -6.3 Run these commands in the LiveCD environment to unmount all filesystems: - - mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - zpool export -a - -6.4 Reboot: - - reboot - -6.5 Wait for the newly installed system to boot normally. Login as root. - -6.6 Create a user account: - - zfs create rpool/home/YOURUSERNAME - adduser YOURUSERNAME - cp -a /etc/skel/. /home/YOURUSERNAME - chown -R YOURUSERNAME:YOURUSERNAME /home/YOURUSERNAME - -6.7 Add your user account to the default set of groups for an administrator: - - usermod -a -G adm,cdrom,dip,lpadmin,plugdev,sambashare,sudo YOURUSERNAME - -6.8 Mirror GRUB - -If you installed to multiple disks, install GRUB on the additional disks: - -6.8a For legacy (BIOS) booting: - - dpkg-reconfigure grub-pc - Hit enter until you get to the device selection screen. - Select (using the space bar) all of the disks (not partitions) in your pool. - -6.8b UEFI - - umount /boot/efi - -For the second and subsequent disks (increment ubuntu-2 to -3, etc.): - - dd if=/dev/disk/by-id/scsi-SATA_disk1-part2 \ - of=/dev/disk/by-id/scsi-SATA_disk2-part2 - efibootmgr -c -g -d /dev/disk/by-id/scsi-SATA_disk2 \ - -p 2 -L "ubuntu-2" -l '\EFI\ubuntu\shimx64.efi' - - mount /boot/efi - -## Step 7: (Optional) Configure Swap - -**Caution**: On systems with extremely high memory pressure, using a zvol for swap can result in lockup, regardless of how much swap is still available. This issue is currently being investigated in: https://github.com/zfsonlinux/zfs/issues/7734 - -7.1 Create a volume dataset (zvol) for use as a swap device: - - zfs create -V 4G -b $(getconf PAGESIZE) -o compression=zle \ - -o logbias=throughput -o sync=always \ - -o primarycache=metadata -o secondarycache=none \ - -o com.sun:auto-snapshot=false rpool/swap - -You can adjust the size (the `4G` part) to your needs. - -The compression algorithm is set to `zle` because it is the cheapest available algorithm. As this guide recommends `ashift=12` (4 kiB blocks on disk), the common case of a 4 kiB page size means that no compression algorithm can reduce I/O. 
The exception is all-zero pages, which are dropped by ZFS; but some form of compression has to be enabled to get this behavior. - -7.2 Configure the swap device: - -**Caution**: Always use long `/dev/zvol` aliases in configuration files. Never use a short `/dev/zdX` device name. - - mkswap -f /dev/zvol/rpool/swap - echo /dev/zvol/rpool/swap none swap discard 0 0 >> /etc/fstab - echo RESUME=none > /etc/initramfs-tools/conf.d/resume - -The `RESUME=none` is necessary to disable resuming from hibernation. This does not work, as the zvol is not present (because the pool has not yet been imported) at the time the resume script runs. If it is not disabled, the boot process hangs for 30 seconds waiting for the swap zvol to appear. - -7.3 Enable the swap device: - - swapon -av - -## Step 8: Full Software Installation - -8.1 Upgrade the minimal system: - - apt dist-upgrade --yes - -8.2 Install a regular set of software: - -Choose one of the following options: - -8.2a Install a command-line environment only: - - apt install --yes ubuntu-standard - -8.2b Install a full GUI environment: - - apt install --yes ubuntu-desktop - vi /etc/gdm3/custom.conf - In the [daemon] section, add: InitialSetupEnable=false - -**Hint**: If you are installing a full GUI environment, you will likely want to manage your network with NetworkManager: - - vi /etc/netplan/01-netcfg.yaml - network: - version: 2 - renderer: NetworkManager - -8.3 Optional: Disable log compression: - -As `/var/log` is already compressed by ZFS, logrotate’s compression is going to burn CPU and disk I/O for (in most cases) very little gain. Also, if you are making snapshots of `/var/log`, logrotate’s compression will actually waste space, as the uncompressed data will live on in the snapshot. You can edit the files in `/etc/logrotate.d` by hand to comment out `compress`, or use this loop (copy-and-paste highly recommended): - - for file in /etc/logrotate.d/* ; do - if grep -Eq "(^|[^#y])compress" "$file" ; then - sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$file" - fi - done - -8.4 Reboot: - - reboot - -### Step 9: Final Cleanup - -9.1 Wait for the system to boot normally. Login using the account you created. Ensure the system (including networking) works normally. - -9.2 Optional: Delete the snapshots of the initial installation: - - sudo zfs destroy bpool/BOOT/ubuntu@install - sudo zfs destroy rpool/ROOT/ubuntu@install - -9.3 Optional: Disable the root password - - sudo usermod -p '*' root - -9.4 Optional: Re-enable the graphical boot process: - -If you prefer the graphical boot process, you can re-enable it now. If you are using LUKS, it makes the prompt look nicer. - - sudo vi /etc/default/grub - Uncomment: GRUB_TIMEOUT_STYLE=hidden - Add quiet and splash to: GRUB_CMDLINE_LINUX_DEFAULT - Comment out: GRUB_TERMINAL=console - Save and quit. - - sudo update-grub - -**Note:** Ignore errors from `osprober`, if present. - -9.5 Optional: For LUKS installs only, backup the LUKS header: - - sudo cryptsetup luksHeaderBackup /dev/disk/by-id/scsi-SATA_disk1-part4 \ - --header-backup-file luks1-header.dat - -Store that backup somewhere safe (e.g. cloud storage). It is protected by your LUKS passphrase, but you may wish to use additional encryption. - -**Hint:** If you created a mirror or raidz topology, repeat this for each LUKS volume (`luks2`, etc.). - -## Troubleshooting - -### Rescuing using a Live CD - -Go through [Step 1: Prepare The Install Environment](#step-1-prepare-the-install-environment). 
- -For LUKS, first unlock the disk(s): - - cryptsetup luksOpen /dev/disk/by-id/scsi-SATA_disk1-part4 luks1 - Repeat for additional disks, if this is a mirror or raidz topology. - -Mount everything correctly: - - zpool export -a - zpool import -N -R /mnt rpool - zpool import -N -R /mnt bpool - zfs mount rpool/ROOT/ubuntu - zfs mount -a - -If needed, you can chroot into your installed environment: - - mount --rbind /dev /mnt/dev - mount --rbind /proc /mnt/proc - mount --rbind /sys /mnt/sys - chroot /mnt /bin/bash --login - mount /boot - mount -a - -Do whatever you need to do to fix your system. - -When done, cleanup: - - exit - mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} - zpool export -a - reboot - -### MPT2SAS - -Most problem reports for this tutorial involve `mpt2sas` hardware that does slow asynchronous drive initialization, like some IBM M1015 or OEM-branded cards that have been flashed to the reference LSI firmware. - -The basic problem is that disks on these controllers are not visible to the Linux kernel until after the regular system is started, and ZoL does not hotplug pool members. See https://github.com/zfsonlinux/zfs/issues/330. - -Most LSI cards are perfectly compatible with ZoL. If your card has this glitch, try setting ZFS_INITRD_PRE_MOUNTROOT_SLEEP=X in /etc/default/zfs. The system will wait X seconds for all drives to appear before importing the pool. - -### Areca - -Systems that require the `arcsas` blob driver should add it to the `/etc/initramfs-tools/modules` file and run `update-initramfs -u -k all`. - -Upgrade or downgrade the Areca driver if something like `RIP: 0010:[] [] native_read_tsc+0x6/0x20` appears anywhere in kernel log. ZoL is unstable on systems that emit this error message. - -### VMware - -* Set `disk.EnableUUID = "TRUE"` in the vmx file or vsphere configuration. Doing this ensures that `/dev/disk` aliases are created in the guest. - -### QEMU/KVM/XEN - -Set a unique serial number on each virtual disk using libvirt or qemu (e.g. `-drive if=none,id=disk1,file=disk1.qcow2,serial=1234567890`). - -To be able to use UEFI in guests (instead of only BIOS booting), run this on the host: - - sudo apt install ovmf - - sudo vi /etc/libvirt/qemu.conf - Uncomment these lines: - nvram = [ - "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd", - "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd" - ] - - sudo service libvirt-bin restart +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Ubuntu.md b/Ubuntu.md index f70dcc0..8029ea0 100644 --- a/Ubuntu.md +++ b/Ubuntu.md @@ -1,9 +1,3 @@ -ZFS packages are [provided by the distribution][ubuntu-wiki]. +This page was moved to: https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/index.html -If you want to use ZFS as your root filesystem, see these instructions: -* [[Ubuntu 18.04 Root on ZFS]] - -For troubleshooting existing installations, see: -* 16.04: [[Ubuntu 16.04 Root on ZFS]] - -[ubuntu-wiki]: https://wiki.ubuntu.com/Kernel/Reference/ZFS +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Accept-PR.md b/Workflow-Accept-PR.md index b590623..ace6405 100644 --- a/Workflow-Accept-PR.md +++ b/Workflow-Accept-PR.md @@ -1,7 +1 @@ -# Accept a PR - -After a PR is generated, it is available to be commented upon by project members. They may request additional changes, please work with them. 
- -In addition, project members may accept PRs; this is not an automatic process. By convention, PRs aren't accepted for at least a day, to allow all members a chance to comment. - -After a PR has been accepted, it is available to be merged. \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Close-PR.md b/Workflow-Close-PR.md index 4816934..ace6405 100644 --- a/Workflow-Close-PR.md +++ b/Workflow-Close-PR.md @@ -1 +1 @@ -# Close a PR \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Commit-Often.md b/Workflow-Commit-Often.md index a20e49e..ace6405 100644 --- a/Workflow-Commit-Often.md +++ b/Workflow-Commit-Often.md @@ -1,13 +1 @@ -# Commit Often - -When writing complex code, it is strongly suggested that developers save their changes, and commit those changes to their local repository, on a frequent basis. In general, this means every hour or two, or when a specific milestone is hit in the development. This allows you to easily *checkpoint* your work. - -Details of this process can be found in the [Commit the changes][W-commit] page. - -In addition, it is suggested that the changes be pushed to your forked Github repository with the `git push` command at least every day, as a backup. Changes should also be pushed prior to running a test, in case your system crashes. This project works with kernel software. A crash while testing development software could easily cause loss of data. - -For developers who want to keep their development branches clean, it might be useful to [*squash*][W-squash] commits from time to time, even before you're ready to [create a PR][W-create-PR]. - -[W-commit]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Commit -[W-squash]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Squash -[W-create-PR]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Create-PR \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Commit.md b/Workflow-Commit.md index 92d67ce..ace6405 100644 --- a/Workflow-Commit.md +++ b/Workflow-Commit.md @@ -1,32 +1 @@ -# Commit the Changes - -In order for your changes to be merged into the ZFS on Linux project, you must first send the changes made in your *topic* branch to your *local* repository. This can be done with the `git commit -sa`. If there are any new files, they will be reported as *untracked*, and they will not be created in the *local* repository. To add newly created files to the *local* repository, use the `git add (file-name) ...` command. - -The `-s` option adds a *signed off by* line to the commit. This *signed off by* line is required for the ZFS on Linux project. It performs the following functions: -* It is an acceptance of the [License Terms][license] of the project. -* It is the developer's certification that they have the right to submit the patch for inclusion into the code base. -* It indicates agreement to the [Developer's Certificate of Origin][COA]. - -The `-a` option causes all modified files in the current branch to be *staged* prior to performing the commit. A list of the modified files in the *local* branch can be created by the use of the `git status` command. 
If there are files that have been modified that shouldn't be part of the commit, they can either be rolled back in the current branch, or the files can be manually staged with the `git add (file-name) ...` command, and the `git commit -s` command can be run without the `-a` option. - -When you run the `git commit` command, an editor will appear to allow you to enter the commit messages. The following requirements apply to a commit message: -* The first line is a title for the commit, and must be bo longer than 50 characters. -* The second line should be blank, separating the title of the commit message from the body of the commit message. -* There may be one or more lines in the commit message describing the reason for the changes (the body of the commit message). These lines must be no longer than 72 characters, and may contain blank lines. - * If the commit closes an Issue, there should be a line in the body with the string `Closes`, followed by the issue number. If multiple issues are closed, multiple lines should be used. -* After the body of the commit message, there should be a blank line. This separates the body from the *signed off by* line. -* The *signed off by* line should have been created by the `git commit -s` command. If not, the line has the following format: - * The string "Signed-off-by:" - * The name of the developer. Please do not use any no pseudonyms or make any anonymous contributions. - * The email address of the developer, enclosed by angle brackets ("<>"). - * An example of this is `Signed-off-by: Random Developer ` -* If the commit changes only documentation, the line `Requires-builders: style` may be included in the body. This will cause only the *style* testing to be run. This can save a significant amount of time when Github runs the automated testing. For information on other testing options, please see the [Buildbot options][buildbot-options] page. - -For more information about writing commit messages, please visit [How to Write a Git Commit Message][writing-commit-message]. - -After the changes have been committed to your *local* repository, they should be pushed to your *forked* repository. This is done with the `git push` command. - -[license]: https://github.com/zfsonlinux/zfs/blob/master/COPYRIGHT -[COA]: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin -[buildbot-options]: https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options -[writing-commit-message]: https://chris.beams.io/posts/git-commit/ \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Conflicts.md b/Workflow-Conflicts.md index ba8e43d..ace6405 100644 --- a/Workflow-Conflicts.md +++ b/Workflow-Conflicts.md @@ -1 +1 @@ -# Fix Conflicts \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Create-Branch.md b/Workflow-Create-Branch.md index c8fd73d..ace6405 100644 --- a/Workflow-Create-Branch.md +++ b/Workflow-Create-Branch.md @@ -1,22 +1 @@ -# Create a Branch - -With small projects, it's possible to develop code as commits directly on the *master* branch. In the ZFS-on-Linux project, that sort of development would create havoc and make it difficult to open a PR or rebase the code. For this reason, development in the ZFS-on-Linux project is done on *topic* branches. 
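
Before creating the branch with the commands that follow, it can be useful to confirm which branch is currently checked out. This quick check is an addition to the original page; `--show-current` requires Git 2.22 or newer (use `git status -sb` on older versions):
```
$ git branch --show-current
```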
- -The following commands will perform the required functions: -``` -$ cd zfs -$ git fetch upstream master -$ git checkout master -$ git merge upstream/master -$ git branch (topic-branch-name) -$ git checkout (topic-branch-name) -``` - -1. Navigate to your *local* repository. -1. Fetch the updates from the *upstream* repository. -1. Set the current branch to *master*. -1. Merge the fetched updates into the *local* repository. -1. Create a new *topic* branch on the updated *master* branch. The name of the branch should be either the name of the feature (preferred for development of features) or an indication of the issue being worked on (preferred for bug fixes). -1. Set the current branch to the newly created *topic* branch. - -**Pro Tip**: The `git checkout -b (topic-branch-name)` command can be used to create and checkout a new branch with one command. \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Create-Github-Account.md b/Workflow-Create-Github-Account.md index 6278242..ace6405 100644 --- a/Workflow-Create-Github-Account.md +++ b/Workflow-Create-Github-Account.md @@ -1,14 +1 @@ -# Create a Github Account - -This page goes over how to create a Github account. There are no special settings needed to use your Github account on the [ZFS on Linux Project][zol]. - -Github did an excellent job of documenting how to create an account. The following link provides everything you need to know to get your Github account up and running. - -https://help.github.com/articles/signing-up-for-a-new-github-account/ - -In addition, the following articles might be useful: -* https://help.github.com/articles/keeping-your-account-and-data-secure/ -* https://help.github.com/articles/securing-your-account-with-two-factor-authentication-2fa/ -* https://help.github.com/articles/adding-a-fallback-authentication-method-with-recover-accounts-elsewhere/ - -[zol]: https://github.com/zfsonlinux \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Create-Test.md b/Workflow-Create-Test.md index b124357..ace6405 100644 --- a/Workflow-Create-Test.md +++ b/Workflow-Create-Test.md @@ -1 +1 @@ -# Create a New Test \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Delete-Branch.md b/Workflow-Delete-Branch.md index 82c6e0d..ace6405 100644 --- a/Workflow-Delete-Branch.md +++ b/Workflow-Delete-Branch.md @@ -1,5 +1 @@ -# Delete a Branch - -When a commit has been accepted and merged into the main ZFS repository, the developer's topic branch should be deleted. This is also appropriate if the developer abandons the change, and could be appropriate if they change the direction of the change. - -To delete a topic branch, navigate to the base directory of your local Git repository and use the `git branch -d (branch-name)` command. The name of the branch should be the same as the branch that was created. 
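
As an illustration only (the branch name below is hypothetical), the full cleanup for a merged topic branch might look like the following; the last command additionally removes the branch from your *forked* repository and only applies if you pushed it there:
```
$ git checkout master
$ git branch -d my-topic-branch
$ git push origin --delete my-topic-branch
```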
\ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Generate-PR.md b/Workflow-Generate-PR.md index 6c3e445..ace6405 100644 --- a/Workflow-Generate-PR.md +++ b/Workflow-Generate-PR.md @@ -1 +1 @@ -# Generate a PR \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Get-Source.md b/Workflow-Get-Source.md index aa5665b..ace6405 100644 --- a/Workflow-Get-Source.md +++ b/Workflow-Get-Source.md @@ -1,31 +1 @@ - - -# Get the Source Code - -This document goes over how a developer can get the ZFS source code for the purpose of making changes to it. For other purposes, please see the [Get the Source Code][get-source] page. - -The Git *master* branch contains the latest version of the software, including changes that weren't included in the released tarball. This is the preferred source code location and procedure for ZFS development. If you would like to do development work for the [ZFS on Linux Project][zol], you can fork the Github repository and prepare the source by using the following process. - -1. Go to the [ZFS on Linux Project][zol] and fork both the ZFS and SPL repositories. This will create two new repositories (your *forked* repositories) under your account. Detailed instructions can be found at https://help.github.com/articles/fork-a-repo/. -1. Clone both of these repositories onto your development system. This will create your *local* repositories. As an example, if your Github account is *newzfsdeveloper*, the commands to clone the repositories would be: -``` -$ mkdir zfs-on-linux -$ cd zfs-on-linux -$ git clone https://github.com/newzfsdeveloper/spl.git -$ git clone https://github.com/newzfsdeveloper/zfs.git -``` -3. Enter the following commands to make the necessary linkage to the *upstream master* repositories and prepare the source to be compiled: -``` -$ cd spl -$ git remote add upstream https://github.com/zfsonlinux/spl.git -$ ./autogen.sh -$ cd ../zfs -$ git remote add upstream https://github.com/zfsonlinux/zfs.git -$ ./autogen.sh -cd .. -``` - -The `./autogen.sh` script generates the build files. If the build system is updated by any developer, these scripts need to be run again. - -[zol]: https://github.com/zfsonlinux -[get-source]: https://github.com/zfsonlinux/zfs/wiki/Get-the-Source-Code \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Install-Git.md b/Workflow-Install-Git.md index 2aa060d..ace6405 100644 --- a/Workflow-Install-Git.md +++ b/Workflow-Install-Git.md @@ -1,37 +1 @@ -# Install Git - -To work with the ZFS software on Github, it's necessary to install the Git software on your computer and set it up. This page covers that process for some common Linux operating systems. Other Linux operating systems should be similar. - -## Install the Software Package - -The first step is to actually install the Git software package. This package can be found in the repositories used by most Linux distributions. If your distribution isn't listed here, or you'd like to install from source, please have a look in the [official Git documentation][git-install-linux]. 
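
Whichever distribution you use, the installation can be confirmed afterwards by asking Git for its version. This quick check is an addition to the original page:
```
$ git --version
```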
- -### Red Hat and CentOS - -``` -# yum install git -``` - -### Fedora - -``` -$ sudo dnf install git -``` - -### Debian and Ubuntu - -``` -$ sudo apt install git -``` - -## Configuring Git - -Your user name and email address must be set within Git before you can make commits to the ZFS project. In addition, your preferred text editor should be set to whatever you would like to use. - -``` -$ git config --global user.name "John Doe" -$ git config --global user.email johndoe@example.com -$ git config --global core.editor emacs -``` - -[git-install-linux]: https://git-scm.com/download/linux \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Large-Features.md b/Workflow-Large-Features.md index 2d5b8db..ace6405 100644 --- a/Workflow-Large-Features.md +++ b/Workflow-Large-Features.md @@ -1 +1 @@ -# Adding Large Features \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Merge-PR.md b/Workflow-Merge-PR.md index f613ce4..ace6405 100644 --- a/Workflow-Merge-PR.md +++ b/Workflow-Merge-PR.md @@ -1,5 +1 @@ -# Merge a PR - -Once all the feedback has been addressed, the PR will be merged into the *master* branch by a member with write permission (most members don't have this permission). - -After the PR has been merged, it is eligible to be added to the *release* branch. \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Rebase.md b/Workflow-Rebase.md index 929ccee..ace6405 100644 --- a/Workflow-Rebase.md +++ b/Workflow-Rebase.md @@ -1,18 +1 @@ -# Rebase the Update - -Updates to the ZFS on Linux project should always be based on the current *master* branch. This makes them easier to merge into the repository. - -There are two steps in the rebase process. The first step is to update the *local master* branch from the *upstream master* repository. This can be done by entering the following commands: - -``` -$ git fetch upstream master -$ git checkout master -$ git merge upstream/master -``` - -The second step is to perform the actual rebase of the updates. This is done by entering the command `git rebase upstream/master`. If there are any conflicts between the updates in your *local* branch and the updates in the *upstream master* branch, you will be informed of them, and allowed to correct them (see the [Conflicts][W-conflicts] page). - -This would also be a good time to [*squash*][W-squash] your commits. - -[W-conflicts]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Conflicts -[W-squash]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Squash \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Squash.md b/Workflow-Squash.md index 4e4aacf..ace6405 100644 --- a/Workflow-Squash.md +++ b/Workflow-Squash.md @@ -1 +1 @@ -# Squash the Commits \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Test.md b/Workflow-Test.md index 4e46dc7..ace6405 100644 --- a/Workflow-Test.md +++ b/Workflow-Test.md @@ -1,59 +1 @@ -# Testing Changes to ZFS - -The code in the ZFS on Linux project is quite complex. A minor error in a change could easily introduce new bugs into the software, causing unforeseeable problems. 
In an attempt to avoid this, the ZTS (ZFS Test Suite) was developed. This test suite is run against multiple architectures and distributions by the Github system when a PR (Pull Request) is submitted. - -A subset of the full test suite can be run by the developer to perform a preliminary verification of the changes in their *local* repository. - -## Style Testing - -The first part of the testing is to verify that the software meets the project's style guidelines. To verify that the code meets those guidelines, run ```make checkstyle``` from the *local* repository. - -## Basic Functionality Testing - -The second part of the testing is to verify basic functionality. This is to ensure that the changes made don't break previous functionality. - -There are a few helper scripts provided in the top-level scripts directory designed to aid developers working with in-tree builds. - -* **zfs-helper.sh:** Certain functionality (i.e. /dev/zvol/) depends on the ZFS provided udev helper scripts being installed on the system. This script can be used to create symlinks on the system from the installation location to the in-tree helper. These links must be in place to successfully run the ZFS Test Suite. The `-i` and `-r` options can be used to install and remove the symlinks. - -``` -$ sudo ./scripts/zfs-helpers.sh -i -``` - -* **zfs.sh:** The freshly built kernel modules from the *local* repository can be loaded using `zfs.sh`. This script will load those modules, **even if there are ZFS modules loaded** from another location, which could cause long-term problems if any of the non-testing file-systems on the system use ZFS. - -This script can latter be used to unload the kernel modules with the `-u` option. - -``` -$ sudo ./scripts/zfs.sh -``` - -* **zfs-tests.sh:** A wrapper which can be used to launch the ZFS Test Suite. Three loopback devices are created on top of sparse files located in `/var/tmp/` and used for the regression test. Detailed directions for the running the ZTS can be found in the [ZTS Readme][zts-readme] file. - -**WARNING**: This script should **only** be run on a development system. It makes configuration changes to the system to run the tests, and it *tries* to remove those changes after completion, but the change removal could fail, and dynamic canges of this nature are usually undesirable on a production system. For more information on the changes made, please see the [ZTS Readme][zts-readme] file. - -``` -$ sudo ./scripts/zfs-tests.sh -vx -``` - -**tip:** The **delegate** tests will be skipped unless group read permission is set on the zfs directory and its parents. - -* **zloop.sh:** A wrapper to run ztest repeatedly with randomized arguments. The ztest command is a user space stress test designed to detect correctness issues by concurrently running a random set of test cases. If a crash is encountered, the ztest logs, any associated vdev files, and core file (if one exists) are collected and moved to the output directory for analysis. - -If there are any failures in this test, please see the [zloop debugging][W-zloop] page. - -``` -$ sudo ./scripts/zloop.sh -``` - -## Change Testing - -Finally, it's necessary to verify that the changes made actually do what they were intended to do. The extent of the testing would depend on the complexity of the changes. - -After the changes are tested, if the testing can be automated for addition to ZTS, a [new test][W-create-test] should be created. This test should be part of the PR that resolves the issue or adds the feature. 
If the festure is split into multiple PRs, some testing should be included in the first, with additions to the test as required. - -It should be noted that if the change adds too many lines of code that don't get tested by ZTS, the change will not pass testing. - -[zts-readme]: https://github.com/zfsonlinux/zfs/tree/master/tests -[W-zloop]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Zloop-Debugging -[W-create-test]: https://github.com/zfsonlinux/zfs/wiki/Workflow-Create-Test \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Update-PR.md b/Workflow-Update-PR.md index 0cff931..ace6405 100644 --- a/Workflow-Update-PR.md +++ b/Workflow-Update-PR.md @@ -1 +1 @@ -# Update a PR \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/Workflow-Zloop-Debugging.md b/Workflow-Zloop-Debugging.md index bb7858f..ace6405 100644 --- a/Workflow-Zloop-Debugging.md +++ b/Workflow-Zloop-Debugging.md @@ -1 +1 @@ -# Debugging *Zloop* Failures \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/ZFS-Transaction-Delay.md b/ZFS-Transaction-Delay.md index ff7312a..e289cab 100644 --- a/ZFS-Transaction-Delay.md +++ b/ZFS-Transaction-Delay.md @@ -1,98 +1,3 @@ -### ZFS Transaction Delay +This page was moved to: https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZFS%20Transaction%20Delay.html -ZFS write operations are delayed when the -backend storage isn't able to accommodate the rate of incoming writes. -This delay process is known as the ZFS write throttle. - -If there is already a write transaction waiting, the delay is relative to -when that transaction will finish waiting. Thus the calculated delay time -is independent of the number of threads concurrently executing -transactions. - -If there is only one waiter, the delay is relative to when the transaction -started, rather than the current time. This credits the transaction for -"time already served." For example, if a write transaction requires reading -indirect blocks first, then the delay is counted at the start of the -transaction, just prior to the indirect block reads. - -The minimum time for a transaction to take is calculated as: -``` -min_time = zfs_delay_scale * (dirty - min) / (max - dirty) -min_time is then capped at 100 milliseconds -``` - -The delay has two degrees of freedom that can be adjusted via tunables: -1. The percentage of dirty data at which we start to delay is defined by -zfs_delay_min_dirty_percent. This is typically be at or above -zfs_vdev_async_write_active_max_dirty_percent so delays occur -after writing at full speed has failed to keep up with the incoming write -rate. -2. The scale of the curve is defined by zfs_delay_scale. Roughly speaking, -this variable determines the amount of delay at the midpoint of the curve. 
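
As a worked example, assume the default `zfs_delay_scale` of 500,000 ns (an assumption; verify the value on your own system), take "min" to be the dirty-data level at which delays begin (zfs_delay_min_dirty_percent of zfs_dirty_data_max), and take "max" to be zfs_dirty_data_max. When the amount of dirty data sits exactly halfway between the two, the fraction equals 1 and the delay is simply zfs_delay_scale, which is the midpoint marked in the curves below:

```
min_time = zfs_delay_scale * (dirty - min) / (max - dirty)
         = 500,000 ns * 1            (when dirty - min == max - dirty)
         = 500 us, i.e. roughly 2000 IOPS at the midpoint
```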
- -``` -delay - 10ms +-------------------------------------------------------------*+ - | *| - 9ms + *+ - | *| - 8ms + *+ - | * | - 7ms + * + - | * | - 6ms + * + - | * | - 5ms + * + - | * | - 4ms + * + - | * | - 3ms + * + - | * | - 2ms + (midpoint) * + - | | ** | - 1ms + v *** + - | zfs_delay_scale ----------> ******** | - 0 +-------------------------------------*********----------------+ - 0% <- zfs_dirty_data_max -> 100% -``` - -Note that since the delay is added to the outstanding time remaining on the -most recent transaction, the delay is effectively the inverse of IOPS. -Here the midpoint of 500 microseconds translates to 2000 IOPS. -The shape of the curve was chosen such that small changes in the amount of -accumulated dirty data in the first 3/4 of the curve yield relatively small -differences in the amount of delay. - -The effects can be easier to understand when the amount of delay is -represented on a log scale: -``` -delay -100ms +-------------------------------------------------------------++ - + + - | | - + *+ - 10ms + *+ - + ** + - | (midpoint) ** | - + | ** + - 1ms + v **** + - + zfs_delay_scale ----------> ***** + - | **** | - + **** + -100us + ** + - + * + - | * | - + * + - 10us + * + - + + - | | - + + - +--------------------------------------------------------------+ - 0% <- zfs_dirty_data_max -> 100% -``` -Note here that only as the amount of dirty data approaches its limit does -the delay start to increase rapidly. The goal of a properly tuned system -should be to keep the amount of dirty data out of that range by first -ensuring that the appropriate limits are set for the I/O scheduler to reach -optimal throughput on the backend storage, and then by changing the value -of zfs_delay_scale to increase the steepness of the curve. +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/ZFS-on-Linux-Module-Parameters.md b/ZFS-on-Linux-Module-Parameters.md index a78e4a9..34b623c 100644 --- a/ZFS-on-Linux-Module-Parameters.md +++ b/ZFS-on-Linux-Module-Parameters.md @@ -1,5901 +1,3 @@ -# ZFS on Linux Module Parameters +This page was moved to: https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZFS%20on%20Linux%20Module%20Parameters.html -Most of the ZFS kernel module parameters are accessible in the SysFS -`/sys/module/zfs/paramaters` directory. Current value can be observed by - -```shell -cat /sys/module/zfs/parameters/PARAMETER -``` - -Many of these can be changed by writing new values. These are denoted by -Change|Dynamic in the PARAMETER details below. - -```shell -echo NEWVALUE >> /sys/module/zfs/parameters/PARAMETER -``` -If the parameter is not dynamically adjustable, an error can occur and the -value will not be set. It can be helpful to check the permissions for the -PARAMETER file in SysFS. - -In some cases, the parameter must be set prior to loading the kernel modules -or it is desired to have the parameters set automatically at boot time. For -many distros, this can be accomplished by creating a file named -`/etc/modprobe.d/zfs.conf` containing a text line for each module parameter -using the format: - -``` -# change PARAMETER for workload XZY to solve problem PROBLEM_DESCRIPTION -# changed by YOUR_NAME on DATE -options zfs PARAMETER=VALUE -``` - -Some parameters related to ZFS operations are located in module parameters -other than in the `zfs` kernel module. These are documented in the individual -parameter description. 
-Unless otherwise noted, the tunable applies to the `zfs` kernel module. -For example, the `icp` kernel module parameters are visible -in the `/sys/module/icp/parameters` directory and can be set by default at boot -time by changing the `/etc/modprobe.d/icp.conf` file. - -See the man page for _modprobe.d_ for more information. - -## zfs-module-parameters Manual Page -The _zfs-module-parameters(5)_ man page contains brief descriptions of the -module parameters. Alas, man pages are not as suitable for quick reference -as wiki pages. This wiki page is intended to be a better cross-reference and -capture some of the wisdom of ZFS developers and practitioners. - -## ZFS Module Parameters -The ZFS kernel module, `zfs.ko`, parameters are detailed below. - -To observe the list of parameters along with a short synopsis of each -parameter, use the `modinfo` command: -```bash -modinfo zfs -``` -## Tags -The list of parameters is quite large and resists hierarchical representation. -To assist in quickly finding relevant information quickly, each module parameter -has a "Tags" row with keywords for frequent searches. - -## Tags -#### ABD - * [zfs_abd_scatter_enabled](#zfs_abd_scatter_enabled) - * [zfs_abd_scatter_max_order](#zfs_abd_scatter_max_order) - * [zfs_compressed_arc_enabled](#zfs_compressed_arc_enabled) -#### allocation - * [dmu_object_alloc_chunk_shift](#dmu_object_alloc_chunk_shift) - * [metaslab_aliquot](#metaslab_aliquot) - * [metaslab_bias_enabled](#metaslab_bias_enabled) - * [metaslab_debug_load](#metaslab_debug_load) - * [metaslab_debug_unload](#metaslab_debug_unload) - * [metaslab_force_ganging](#metaslab_force_ganging) - * [metaslab_fragmentation_factor_enabled](#metaslab_fragmentation_factor_enabled) - * [zfs_metaslab_fragmentation_threshold](#zfs_metaslab_fragmentation_threshold) - * [metaslab_lba_weighting_enabled](#metaslab_lba_weighting_enabled) - * [metaslab_preload_enabled](#metaslab_preload_enabled) - * [zfs_metaslab_segment_weight_enabled](#zfs_metaslab_segment_weight_enabled) - * [zfs_metaslab_switch_threshold](#zfs_metaslab_switch_threshold) - * [metaslabs_per_vdev](#metaslabs_per_vdev) - * [zfs_mg_fragmentation_threshold](#zfs_mg_fragmentation_threshold) - * [zfs_mg_noalloc_threshold](#zfs_mg_noalloc_threshold) - * [spa_asize_inflation](#spa_asize_inflation) - * [spa_load_verify_data](#spa_load_verify_data) - * [spa_slop_shift](#spa_slop_shift) - * [zfs_vdev_default_ms_count](#zfs_vdev_default_ms_count) -#### ARC - * [zfs_abd_scatter_min_size](#zfs_abd_scatter_min_size) - * [zfs_arc_average_blocksize](#zfs_arc_average_blocksize) - * [zfs_arc_dnode_limit](#zfs_arc_dnode_limit) - * [zfs_arc_dnode_limit_percent](#zfs_arc_dnode_limit_percent) - * [zfs_arc_dnode_reduce_percent](#zfs_arc_dnode_reduce_percent) - * [zfs_arc_evict_batch_limit](#zfs_arc_evict_batch_limit) - * [zfs_arc_grow_retry](#zfs_arc_grow_retry) - * [zfs_arc_lotsfree_percent](#zfs_arc_lotsfree_percent) - * [zfs_arc_max](#zfs_arc_max) - * [zfs_arc_meta_adjust_restarts](#zfs_arc_meta_adjust_restarts) - * [zfs_arc_meta_limit](#zfs_arc_meta_limit) - * [zfs_arc_meta_limit_percent](#zfs_arc_meta_limit_percent) - * [zfs_arc_meta_min](#zfs_arc_meta_min) - * [zfs_arc_meta_prune](#zfs_arc_meta_prune) - * [zfs_arc_meta_strategy](#zfs_arc_meta_strategy) - * [zfs_arc_min](#zfs_arc_min) - * [zfs_arc_min_prefetch_lifespan](#zfs_arc_min_prefetch_lifespan) - * [zfs_arc_min_prefetch_ms](#zfs_arc_min_prefetch_ms) - * [zfs_arc_min_prescient_prefetch_ms](#zfs_arc_min_prescient_prefetch_ms) - * 
[zfs_arc_overflow_shift](#zfs_arc_overflow_shift) - * [zfs_arc_p_dampener_disable](#zfs_arc_p_dampener_disable) - * [zfs_arc_p_min_shift](#zfs_arc_p_min_shift) - * [zfs_arc_pc_percent](#zfs_arc_pc_percent) - * [zfs_arc_shrink_shift](#zfs_arc_shrink_shift) - * [zfs_arc_sys_free](#zfs_arc_sys_free) - * [dbuf_cache_max_bytes](#dbuf_cache_max_bytes) - * [dbuf_cache_shift](#dbuf_cache_shift) - * [dbuf_metadata_cache_shift](#dbuf_metadata_cache_shift) - * [zfs_disable_dup_eviction](#zfs_disable_dup_eviction) - * [l2arc_feed_again](#l2arc_feed_again) - * [l2arc_feed_min_ms](#l2arc_feed_min_ms) - * [l2arc_feed_secs](#l2arc_feed_secs) - * [l2arc_headroom](#l2arc_headroom) - * [l2arc_headroom_boost](#l2arc_headroom_boost) - * [l2arc_nocompress](#l2arc_nocompress) - * [l2arc_noprefetch](#l2arc_noprefetch) - * [l2arc_norw](#l2arc_norw) - * [l2arc_write_boost](#l2arc_write_boost) - * [l2arc_write_max](#l2arc_write_max) - * [zfs_multilist_num_sublists](#zfs_multilist_num_sublists) - * [spa_load_verify_shift](#spa_load_verify_shift) -#### channel_programs - * [zfs_lua_max_instrlimit](#zfs_lua_max_instrlimit) - * [zfs_lua_max_memlimit](#zfs_lua_max_memlimit) -#### checkpoint - * [zfs_spa_discard_memory_limit](#zfs_spa_discard_memory_limit) -#### checksum - * [zfs_checksums_per_second](#zfs_checksums_per_second) - * [zfs_fletcher_4_impl](#zfs_fletcher_4_impl) - * [zfs_nopwrite_enabled](#zfs_nopwrite_enabled) - * [zfs_qat_checksum_disable](#zfs_qat_checksum_disable) -#### compression - * [zfs_compressed_arc_enabled](#zfs_compressed_arc_enabled) - * [zfs_qat_compress_disable](#zfs_qat_compress_disable) - * [zfs_qat_disable](#zfs_qat_disable) -#### CPU - * [zfs_fletcher_4_impl](#zfs_fletcher_4_impl) - * [zfs_mdcomp_disable](#zfs_mdcomp_disable) - * [spl_kmem_cache_kmem_threads](#spl_kmem_cache_kmem_threads) - * [spl_kmem_cache_magazine_size](#spl_kmem_cache_magazine_size) - * [spl_taskq_thread_bind](#spl_taskq_thread_bind) - * [spl_taskq_thread_priority](#spl_taskq_thread_priority) - * [spl_taskq_thread_sequential](#spl_taskq_thread_sequential) - * [zfs_vdev_raidz_impl](#zfs_vdev_raidz_impl) -#### dataset - * [zfs_max_dataset_nesting](#zfs_max_dataset_nesting) -#### dbuf_cache - * [dbuf_cache_hiwater_pct](#dbuf_cache_hiwater_pct) - * [dbuf_cache_lowater_pct](#dbuf_cache_lowater_pct) - * [dbuf_cache_max_bytes](#dbuf_cache_max_bytes) - * [dbuf_cache_max_bytes](#dbuf_cache_max_bytes) - * [dbuf_cache_max_shift](#dbuf_cache_max_shift) - * [dbuf_cache_shift](#dbuf_cache_shift) - * [dbuf_metadata_cache_max_bytes](#dbuf_metadata_cache_max_bytes) - * [dbuf_metadata_cache_shift](#dbuf_metadata_cache_shift) -#### debug - * [zfs_dbgmsg_enable](#zfs_dbgmsg_enable) - * [zfs_dbgmsg_maxsize](#zfs_dbgmsg_maxsize) - * [zfs_dbuf_state_index](#zfs_dbuf_state_index) - * [zfs_deadman_checktime_ms](#zfs_deadman_checktime_ms) - * [zfs_deadman_enabled](#zfs_deadman_enabled) - * [zfs_deadman_failmode](#zfs_deadman_failmode) - * [zfs_deadman_synctime_ms](#zfs_deadman_synctime_ms) - * [zfs_deadman_ziotime_ms](#zfs_deadman_ziotime_ms) - * [zfs_flags](#zfs_flags) - * [zfs_free_leak_on_eio](#zfs_free_leak_on_eio) - * [zfs_nopwrite_enabled](#zfs_nopwrite_enabled) - * [zfs_object_mutex_size](#zfs_object_mutex_size) - * [zfs_read_history](#zfs_read_history) - * [zfs_read_history_hits](#zfs_read_history_hits) - * [spl_panic_halt](#spl_panic_halt) - * [zfs_txg_history](#zfs_txg_history) - * [zfs_zevent_cols](#zfs_zevent_cols) - * [zfs_zevent_console](#zfs_zevent_console) - * [zfs_zevent_len_max](#zfs_zevent_len_max) - * 
[zil_replay_disable](#zil_replay_disable) - * [zio_deadman_log_all](#zio_deadman_log_all) - * [zio_decompress_fail_fraction](#zio_decompress_fail_fraction) - * [zio_delay_max](#zio_delay_max) -#### dedup - * [zfs_ddt_data_is_special](#zfs_ddt_data_is_special) - * [zfs_disable_dup_eviction](#zfs_disable_dup_eviction) -#### delay - * [zfs_delays_per_second](#zfs_delays_per_second) -#### delete - * [zfs_async_block_max_blocks](#zfs_async_block_max_blocks) - * [zfs_delete_blocks](#zfs_delete_blocks) - * [zfs_free_bpobj_enabled](#zfs_free_bpobj_enabled) - * [zfs_free_max_blocks](#zfs_free_max_blocks) - * [zfs_free_min_time_ms](#zfs_free_min_time_ms) - * [zfs_obsolete_min_time_ms](#zfs_obsolete_min_time_ms) - * [zfs_per_txg_dirty_frees_percent](#zfs_per_txg_dirty_frees_percent) -#### discard - * [zvol_max_discard_blocks](#zvol_max_discard_blocks) -#### disks - * [zfs_nocacheflush](#zfs_nocacheflush) - * [zil_nocacheflush](#zil_nocacheflush) -#### DMU - * [zfs_async_block_max_blocks](#zfs_async_block_max_blocks) - * [dmu_object_alloc_chunk_shift](#dmu_object_alloc_chunk_shift) - * [zfs_dmu_offset_next_sync](#zfs_dmu_offset_next_sync) -#### encryption - * [icp_aes_impl](#icp_aes_impl) - * [icp_gcm_impl](#icp_gcm_impl) - * [zfs_key_max_salt_uses](#zfs_key_max_salt_uses) - * [zfs_qat_encrypt_disable](#zfs_qat_encrypt_disable) -#### filesystem - * [zfs_admin_snapshot](#zfs_admin_snapshot) - * [zfs_delete_blocks](#zfs_delete_blocks) - * [zfs_expire_snapshot](#zfs_expire_snapshot) - * [zfs_free_max_blocks](#zfs_free_max_blocks) - * [zfs_max_recordsize](#zfs_max_recordsize) - * [zfs_read_chunk_size](#zfs_read_chunk_size) -#### fragmentation - * [zfs_metaslab_fragmentation_threshold](#zfs_metaslab_fragmentation_threshold) - * [zfs_mg_fragmentation_threshold](#zfs_mg_fragmentation_threshold) - * [zfs_mg_noalloc_threshold](#zfs_mg_noalloc_threshold) -#### HDD - * [metaslab_lba_weighting_enabled](#metaslab_lba_weighting_enabled) - * [zfs_vdev_mirror_rotating_inc](#zfs_vdev_mirror_rotating_inc) - * [zfs_vdev_mirror_rotating_seek_inc](#zfs_vdev_mirror_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_seek_offset](#zfs_vdev_mirror_rotating_seek_offset) -#### hostid - * [spl_hostid](#spl_hostid) - * [spl_hostid_path](#spl_hostid_path) -#### import - * [zfs_autoimport_disable](#zfs_autoimport_disable) - * [zfs_max_missing_tvds](#zfs_max_missing_tvds) - * [zfs_multihost_fail_intervals](#zfs_multihost_fail_intervals) - * [zfs_multihost_history](#zfs_multihost_history) - * [zfs_multihost_import_intervals](#zfs_multihost_import_intervals) - * [zfs_multihost_interval](#zfs_multihost_interval) - * [zfs_recover](#zfs_recover) - * [spa_config_path](#spa_config_path) - * [spa_load_print_vdev_tree](#spa_load_print_vdev_tree) - * [spa_load_verify_maxinflight](#spa_load_verify_maxinflight) - * [spa_load_verify_metadata](#spa_load_verify_metadata) - * [spa_load_verify_shift](#spa_load_verify_shift) - * [zvol_inhibit_dev](#zvol_inhibit_dev) -#### L2ARC - * [l2arc_feed_again](#l2arc_feed_again) - * [l2arc_feed_min_ms](#l2arc_feed_min_ms) - * [l2arc_feed_secs](#l2arc_feed_secs) - * [l2arc_headroom](#l2arc_headroom) - * [l2arc_headroom_boost](#l2arc_headroom_boost) - * [l2arc_nocompress](#l2arc_nocompress) - * [l2arc_noprefetch](#l2arc_noprefetch) - * [l2arc_norw](#l2arc_norw) - * [l2arc_write_boost](#l2arc_write_boost) - * [l2arc_write_max](#l2arc_write_max) -#### memory - * [zfs_abd_scatter_enabled](#zfs_abd_scatter_enabled) - * [zfs_abd_scatter_max_order](#zfs_abd_scatter_max_order) - * 
[zfs_arc_average_blocksize](#zfs_arc_average_blocksize) - * [zfs_arc_grow_retry](#zfs_arc_grow_retry) - * [zfs_arc_lotsfree_percent](#zfs_arc_lotsfree_percent) - * [zfs_arc_max](#zfs_arc_max) - * [zfs_arc_pc_percent](#zfs_arc_pc_percent) - * [zfs_arc_shrink_shift](#zfs_arc_shrink_shift) - * [zfs_arc_sys_free](#zfs_arc_sys_free) - * [zfs_dedup_prefetch](#zfs_dedup_prefetch) - * [zfs_max_recordsize](#zfs_max_recordsize) - * [metaslab_debug_load](#metaslab_debug_load) - * [metaslab_debug_unload](#metaslab_debug_unload) - * [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact) - * [zfs_scan_strict_mem_lim](#zfs_scan_strict_mem_lim) - * [spl_kmem_alloc_max](#spl_kmem_alloc_max) - * [spl_kmem_alloc_warn](#spl_kmem_alloc_warn) - * [spl_kmem_cache_expire](#spl_kmem_cache_expire) - * [spl_kmem_cache_kmem_limit](#spl_kmem_cache_kmem_limit) - * [spl_kmem_cache_kmem_threads](#spl_kmem_cache_kmem_threads) - * [spl_kmem_cache_magazine_size](#spl_kmem_cache_magazine_size) - * [spl_kmem_cache_max_size](#spl_kmem_cache_max_size) - * [spl_kmem_cache_obj_per_slab](#spl_kmem_cache_obj_per_slab) - * [spl_kmem_cache_obj_per_slab_min](#spl_kmem_cache_obj_per_slab_min) - * [spl_kmem_cache_reclaim](#spl_kmem_cache_reclaim) - * [spl_kmem_cache_slab_limit](#spl_kmem_cache_slab_limit) -#### metadata - * [zfs_mdcomp_disable](#zfs_mdcomp_disable) -#### metaslab - * [metaslab_aliquot](#metaslab_aliquot) - * [metaslab_bias_enabled](#metaslab_bias_enabled) - * [metaslab_debug_load](#metaslab_debug_load) - * [metaslab_debug_unload](#metaslab_debug_unload) - * [metaslab_fragmentation_factor_enabled](#metaslab_fragmentation_factor_enabled) - * [metaslab_lba_weighting_enabled](#metaslab_lba_weighting_enabled) - * [metaslab_preload_enabled](#metaslab_preload_enabled) - * [zfs_metaslab_segment_weight_enabled](#zfs_metaslab_segment_weight_enabled) - * [zfs_metaslab_switch_threshold](#zfs_metaslab_switch_threshold) - * [metaslabs_per_vdev](#metaslabs_per_vdev) - * [zfs_vdev_min_ms_count](#zfs_vdev_min_ms_count) - * [zfs_vdev_ms_count_limit](#zfs_vdev_ms_count_limit) -#### mirror - * [zfs_vdev_mirror_non_rotating_inc](#zfs_vdev_mirror_non_rotating_inc) - * [zfs_vdev_mirror_non_rotating_seek_inc](#zfs_vdev_mirror_non_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_inc](#zfs_vdev_mirror_rotating_inc) - * [zfs_vdev_mirror_rotating_seek_inc](#zfs_vdev_mirror_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_seek_offset](#zfs_vdev_mirror_rotating_seek_offset) -#### MMP - * [zfs_multihost_fail_intervals](#zfs_multihost_fail_intervals) - * [zfs_multihost_history](#zfs_multihost_history) - * [zfs_multihost_import_intervals](#zfs_multihost_import_intervals) - * [zfs_multihost_interval](#zfs_multihost_interval) - * [spl_hostid](#spl_hostid) - * [spl_hostid_path](#spl_hostid_path) -#### panic - * [spl_panic_halt](#spl_panic_halt) -#### prefetch - * [zfs_arc_min_prefetch_ms](#zfs_arc_min_prefetch_ms) - * [zfs_arc_min_prescient_prefetch_ms](#zfs_arc_min_prescient_prefetch_ms) - * [zfs_dedup_prefetch](#zfs_dedup_prefetch) - * [l2arc_noprefetch](#l2arc_noprefetch) - * [zfs_no_scrub_prefetch](#zfs_no_scrub_prefetch) - * [zfs_pd_bytes_max](#zfs_pd_bytes_max) - * [zfs_prefetch_disable](#zfs_prefetch_disable) - * [zfetch_array_rd_sz](#zfetch_array_rd_sz) - * [zfetch_max_distance](#zfetch_max_distance) - * [zfetch_max_streams](#zfetch_max_streams) - * [zfetch_min_sec_reap](#zfetch_min_sec_reap) - * [zvol_prefetch_bytes](#zvol_prefetch_bytes) -#### QAT - * [zfs_qat_checksum_disable](#zfs_qat_checksum_disable) - * 
[zfs_qat_compress_disable](#zfs_qat_compress_disable) - * [zfs_qat_disable](#zfs_qat_disable) - * [zfs_qat_encrypt_disable](#zfs_qat_encrypt_disable) -#### raidz - * [zfs_vdev_raidz_impl](#zfs_vdev_raidz_impl) -#### receive - * [zfs_disable_ivset_guid_check](#zfs_disable_ivset_guid_check) - * [zfs_recv_queue_length](#zfs_recv_queue_length) -#### remove - * [zfs_obsolete_min_time_ms](#zfs_obsolete_min_time_ms) - * [zfs_remove_max_segment](#zfs_remove_max_segment) -#### resilver - * [zfs_resilver_delay](#zfs_resilver_delay) - * [zfs_resilver_disable_defer](#zfs_resilver_disable_defer) - * [zfs_resilver_min_time_ms](#zfs_resilver_min_time_ms) - * [zfs_scan_checkpoint_intval](#zfs_scan_checkpoint_intval) - * [zfs_scan_fill_weight](#zfs_scan_fill_weight) - * [zfs_scan_idle](#zfs_scan_idle) - * [zfs_scan_ignore_errors](#zfs_scan_ignore_errors) - * [zfs_scan_issue_strategy](#zfs_scan_issue_strategy) - * [zfs_scan_legacy](#zfs_scan_legacy) - * [zfs_scan_max_ext_gap](#zfs_scan_max_ext_gap) - * [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact) - * [zfs_scan_mem_lim_soft_fact](#zfs_scan_mem_lim_soft_fact) - * [zfs_scan_strict_mem_lim](#zfs_scan_strict_mem_lim) - * [zfs_scan_suspend_progress](#zfs_scan_suspend_progress) - * [zfs_scan_vdev_limit](#zfs_scan_vdev_limit) - * [zfs_top_maxinflight](#zfs_top_maxinflight) - * [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) - * [zfs_vdev_scrub_min_active](#zfs_vdev_scrub_min_active) -#### scrub - * [zfs_no_scrub_io](#zfs_no_scrub_io) - * [zfs_no_scrub_prefetch](#zfs_no_scrub_prefetch) - * [zfs_scan_checkpoint_intval](#zfs_scan_checkpoint_intval) - * [zfs_scan_fill_weight](#zfs_scan_fill_weight) - * [zfs_scan_idle](#zfs_scan_idle) - * [zfs_scan_issue_strategy](#zfs_scan_issue_strategy) - * [zfs_scan_legacy](#zfs_scan_legacy) - * [zfs_scan_max_ext_gap](#zfs_scan_max_ext_gap) - * [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact) - * [zfs_scan_mem_lim_soft_fact](#zfs_scan_mem_lim_soft_fact) - * [zfs_scan_min_time_ms](#zfs_scan_min_time_ms) - * [zfs_scan_strict_mem_lim](#zfs_scan_strict_mem_lim) - * [zfs_scan_suspend_progress](#zfs_scan_suspend_progress) - * [zfs_scan_vdev_limit](#zfs_scan_vdev_limit) - * [zfs_scrub_delay](#zfs_scrub_delay) - * [zfs_scrub_min_time_ms](#zfs_scrub_min_time_ms) - * [zfs_top_maxinflight](#zfs_top_maxinflight) - * [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) - * [zfs_vdev_scrub_min_active](#zfs_vdev_scrub_min_active) -#### send - * [ignore_hole_birth](#ignore_hole_birth) - * [zfs_override_estimate_recordsize](#zfs_override_estimate_recordsize) - * [zfs_pd_bytes_max](#zfs_pd_bytes_max) - * [zfs_send_corrupt_data](#zfs_send_corrupt_data) - * [zfs_send_queue_length](#zfs_send_queue_length) - * [zfs_send_unmodified_spill_blocks](#zfs_send_unmodified_spill_blocks) -#### snapshot - * [zfs_admin_snapshot](#zfs_admin_snapshot) - * [zfs_expire_snapshot](#zfs_expire_snapshot) -#### SPA - * [spa_asize_inflation](#spa_asize_inflation) - * [spa_load_print_vdev_tree](#spa_load_print_vdev_tree) - * [spa_load_verify_data](#spa_load_verify_data) - * [spa_load_verify_shift](#spa_load_verify_shift) - * [spa_slop_shift](#spa_slop_shift) - * [zfs_sync_pass_deferred_free](#zfs_sync_pass_deferred_free) - * [zfs_sync_pass_dont_compress](#zfs_sync_pass_dont_compress) - * [zfs_sync_pass_rewrite](#zfs_sync_pass_rewrite) - * [zfs_sync_taskq_batch_pct](#zfs_sync_taskq_batch_pct) - * [zfs_txg_timeout](#zfs_txg_timeout) -#### special_vdev - * [zfs_ddt_data_is_special](#zfs_ddt_data_is_special) - * 
[zfs_special_class_metadata_reserve_pct](#zfs_special_class_metadata_reserve_pct) - * [zfs_user_indirect_is_special](#zfs_user_indirect_is_special) -#### SSD - * [metaslab_lba_weighting_enabled](#metaslab_lba_weighting_enabled) - * [zfs_vdev_mirror_non_rotating_inc](#zfs_vdev_mirror_non_rotating_inc) - * [zfs_vdev_mirror_non_rotating_seek_inc](#zfs_vdev_mirror_non_rotating_seek_inc) -#### taskq - * [spl_max_show_tasks](#spl_max_show_tasks) - * [spl_taskq_kick](#spl_taskq_kick) - * [spl_taskq_thread_bind](#spl_taskq_thread_bind) - * [spl_taskq_thread_dynamic](#spl_taskq_thread_dynamic) - * [spl_taskq_thread_priority](#spl_taskq_thread_priority) - * [spl_taskq_thread_sequential](#spl_taskq_thread_sequential) - * [zfs_zil_clean_taskq_nthr_pct](#zfs_zil_clean_taskq_nthr_pct) - * [zio_taskq_batch_pct](#zio_taskq_batch_pct) -#### trim - * [zfs_trim_extent_bytes_max](#zfs_trim_extent_bytes_max) - * [zfs_trim_extent_bytes_min](#zfs_trim_extent_bytes_min) - * [zfs_trim_metaslab_skip](#zfs_trim_metaslab_skip) - * [zfs_trim_queue_limit](#zfs_trim_queue_limit) - * [zfs_trim_txg_batch](#zfs_trim_txg_batch) - * [zfs_vdev_aggregate_trim](#zfs_vdev_aggregate_trim) -#### vdev - * [zfs_checksum_events_per_second](#zfs_checksum_events_per_second) - * [metaslab_aliquot](#metaslab_aliquot) - * [metaslab_bias_enabled](#metaslab_bias_enabled) - * [zfs_metaslab_fragmentation_threshold](#zfs_metaslab_fragmentation_threshold) - * [metaslabs_per_vdev](#metaslabs_per_vdev) - * [zfs_mg_fragmentation_threshold](#zfs_mg_fragmentation_threshold) - * [zfs_mg_noalloc_threshold](#zfs_mg_noalloc_threshold) - * [zfs_multihost_interval](#zfs_multihost_interval) - * [zfs_scan_vdev_limit](#zfs_scan_vdev_limit) - * [zfs_slow_io_events_per_second](#zfs_slow_io_events_per_second) - * [zfs_vdev_aggregate_trim](#zfs_vdev_aggregate_trim) - * [zfs_vdev_aggregation_limit](#zfs_vdev_aggregation_limit) - * [zfs_vdev_aggregation_limit_non_rotating](#zfs_vdev_aggregation_limit_non_rotating) - * [zfs_vdev_async_read_max_active](#zfs_vdev_async_read_max_active) - * [zfs_vdev_async_read_min_active](#zfs_vdev_async_read_min_active) - * [zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent) - * [zfs_vdev_async_write_active_min_dirty_percent](#zfs_vdev_async_write_active_min_dirty_percent) - * [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) - * [zfs_vdev_async_write_min_active](#zfs_vdev_async_write_min_active) - * [zfs_vdev_cache_bshift](#zfs_vdev_cache_bshift) - * [zfs_vdev_cache_max](#zfs_vdev_cache_max) - * [zfs_vdev_cache_size](#zfs_vdev_cache_size) - * [zfs_vdev_initializing_max_active](#zfs_vdev_initializing_max_active) - * [zfs_vdev_initializing_min_active](#zfs_vdev_initializing_min_active) - * [zfs_vdev_max_active](#zfs_vdev_max_active) - * [zfs_vdev_min_ms_count](#zfs_vdev_min_ms_count) - * [zfs_vdev_mirror_non_rotating_inc](#zfs_vdev_mirror_non_rotating_inc) - * [zfs_vdev_mirror_non_rotating_seek_inc](#zfs_vdev_mirror_non_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_inc](#zfs_vdev_mirror_rotating_inc) - * [zfs_vdev_mirror_rotating_seek_inc](#zfs_vdev_mirror_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_seek_offset](#zfs_vdev_mirror_rotating_seek_offset) - * [zfs_vdev_ms_count_limit](#zfs_vdev_ms_count_limit) - * [zfs_vdev_queue_depth_pct](#zfs_vdev_queue_depth_pct) - * [zfs_vdev_raidz_impl](#zfs_vdev_raidz_impl) - * [zfs_vdev_read_gap_limit](#zfs_vdev_read_gap_limit) - * [zfs_vdev_removal_max_active](#zfs_vdev_removal_max_active) - * 
[zfs_vdev_removal_min_active](#zfs_vdev_removal_min_active) - * [zfs_vdev_scheduler](#zfs_vdev_scheduler) - * [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) - * [zfs_vdev_scrub_min_active](#zfs_vdev_scrub_min_active) - * [zfs_vdev_sync_read_max_active](#zfs_vdev_sync_read_max_active) - * [zfs_vdev_sync_read_min_active](#zfs_vdev_sync_read_min_active) - * [zfs_vdev_sync_write_max_active](#zfs_vdev_sync_write_max_active) - * [zfs_vdev_sync_write_min_active](#zfs_vdev_sync_write_min_active) - * [zfs_vdev_trim_max_active](#zfs_vdev_trim_max_active) - * [zfs_vdev_trim_min_active](#zfs_vdev_trim_min_active) - * [vdev_validate_skip](#vdev_validate_skip) - * [zfs_vdev_write_gap_limit](#zfs_vdev_write_gap_limit) - * [zio_dva_throttle_enabled](#zio_dva_throttle_enabled) - * [zio_slow_io_ms](#zio_slow_io_ms) -#### vdev_cache - * [zfs_vdev_cache_bshift](#zfs_vdev_cache_bshift) - * [zfs_vdev_cache_max](#zfs_vdev_cache_max) - * [zfs_vdev_cache_size](#zfs_vdev_cache_size) -#### vdev_initialize - * [zfs_initialize_value](#zfs_initialize_value) -#### vdev_removal - * [zfs_condense_indirect_commit_entry_delay_ms](#zfs_condense_indirect_commit_entry_delay_ms) - * [zfs_condense_indirect_vdevs_enable](#zfs_condense_indirect_vdevs_enable) - * [zfs_condense_max_obsolete_bytes](#zfs_condense_max_obsolete_bytes) - * [zfs_condense_min_mapping_bytes](#zfs_condense_min_mapping_bytes) - * [zfs_reconstruct_indirect_combinations_max](#zfs_reconstruct_indirect_combinations_max) - * [zfs_removal_ignore_errors](#zfs_removal_ignore_errors) - * [zfs_removal_suspend_progress](#zfs_removal_suspend_progress) - * [vdev_removal_max_span](#vdev_removal_max_span) -#### volume - * [zfs_max_recordsize](#zfs_max_recordsize) - * [zvol_inhibit_dev](#zvol_inhibit_dev) - * [zvol_major](#zvol_major) - * [zvol_max_discard_blocks](#zvol_max_discard_blocks) - * [zvol_prefetch_bytes](#zvol_prefetch_bytes) - * [zvol_request_sync](#zvol_request_sync) - * [zvol_threads](#zvol_threads) - * [zvol_volmode](#zvol_volmode) -#### write_throttle - * [zfs_delay_min_dirty_percent](#zfs_delay_min_dirty_percent) - * [zfs_delay_scale](#zfs_delay_scale) - * [zfs_dirty_data_max](#zfs_dirty_data_max) - * [zfs_dirty_data_max_max](#zfs_dirty_data_max_max) - * [zfs_dirty_data_max_max_percent](#zfs_dirty_data_max_max_percent) - * [zfs_dirty_data_max_percent](#zfs_dirty_data_max_percent) - * [zfs_dirty_data_sync](#zfs_dirty_data_sync) - * [zfs_dirty_data_sync_percent](#zfs_dirty_data_sync_percent) -#### zed - * [zfs_checksums_per_second](#zfs_checksums_per_second) - * [zfs_delays_per_second](#zfs_delays_per_second) - * [zio_slow_io_ms](#zio_slow_io_ms) -#### ZIL - * [zfs_commit_timeout_pct](#zfs_commit_timeout_pct) - * [zfs_immediate_write_sz](#zfs_immediate_write_sz) - * [zfs_zil_clean_taskq_maxalloc](#zfs_zil_clean_taskq_maxalloc) - * [zfs_zil_clean_taskq_minalloc](#zfs_zil_clean_taskq_minalloc) - * [zfs_zil_clean_taskq_nthr_pct](#zfs_zil_clean_taskq_nthr_pct) - * [zil_nocacheflush](#zil_nocacheflush) - * [zil_replay_disable](#zil_replay_disable) - * [zil_slog_bulk](#zil_slog_bulk) -#### ZIO_scheduler - * [zfs_dirty_data_sync](#zfs_dirty_data_sync) - * [zfs_dirty_data_sync_percent](#zfs_dirty_data_sync_percent) - * [zfs_resilver_delay](#zfs_resilver_delay) - * [zfs_scan_idle](#zfs_scan_idle) - * [zfs_scrub_delay](#zfs_scrub_delay) - * [zfs_top_maxinflight](#zfs_top_maxinflight) - * [zfs_txg_timeout](#zfs_txg_timeout) - * [zfs_vdev_aggregate_trim](#zfs_vdev_aggregate_trim) - * [zfs_vdev_aggregation_limit](#zfs_vdev_aggregation_limit) - * 
[zfs_vdev_aggregation_limit_non_rotating](#zfs_vdev_aggregation_limit_non_rotating) - * [zfs_vdev_async_read_max_active](#zfs_vdev_async_read_max_active) - * [zfs_vdev_async_read_min_active](#zfs_vdev_async_read_min_active) - * [zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent) - * [zfs_vdev_async_write_active_min_dirty_percent](#zfs_vdev_async_write_active_min_dirty_percent) - * [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) - * [zfs_vdev_async_write_min_active](#zfs_vdev_async_write_min_active) - * [zfs_vdev_initializing_max_active](#zfs_vdev_initializing_max_active) - * [zfs_vdev_initializing_min_active](#zfs_vdev_initializing_min_active) - * [zfs_vdev_max_active](#zfs_vdev_max_active) - * [zfs_vdev_queue_depth_pct](#zfs_vdev_queue_depth_pct) - * [zfs_vdev_read_gap_limit](#zfs_vdev_read_gap_limit) - * [zfs_vdev_removal_max_active](#zfs_vdev_removal_max_active) - * [zfs_vdev_removal_min_active](#zfs_vdev_removal_min_active) - * [zfs_vdev_scheduler](#zfs_vdev_scheduler) - * [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) - * [zfs_vdev_scrub_min_active](#zfs_vdev_scrub_min_active) - * [zfs_vdev_sync_read_max_active](#zfs_vdev_sync_read_max_active) - * [zfs_vdev_sync_read_min_active](#zfs_vdev_sync_read_min_active) - * [zfs_vdev_sync_write_max_active](#zfs_vdev_sync_write_max_active) - * [zfs_vdev_sync_write_min_active](#zfs_vdev_sync_write_min_active) - * [zfs_vdev_trim_max_active](#zfs_vdev_trim_max_active) - * [zfs_vdev_trim_min_active](#zfs_vdev_trim_min_active) - * [zfs_vdev_write_gap_limit](#zfs_vdev_write_gap_limit) - * [zio_dva_throttle_enabled](#zio_dva_throttle_enabled) - * [zio_requeue_io_start_cut_in_line](#zio_requeue_io_start_cut_in_line) - * [zio_taskq_batch_pct](#zio_taskq_batch_pct) - -## Index - * [zfs_abd_scatter_enabled](#zfs_abd_scatter_enabled) - * [zfs_abd_scatter_max_order](#zfs_abd_scatter_max_order) - * [zfs_abd_scatter_min_size](#zfs_abd_scatter_min_size) - * [zfs_admin_snapshot](#zfs_admin_snapshot) - * [zfs_arc_average_blocksize](#zfs_arc_average_blocksize) - * [zfs_arc_dnode_limit](#zfs_arc_dnode_limit) - * [zfs_arc_dnode_limit_percent](#zfs_arc_dnode_limit_percent) - * [zfs_arc_dnode_reduce_percent](#zfs_arc_dnode_reduce_percent) - * [zfs_arc_evict_batch_limit](#zfs_arc_evict_batch_limit) - * [zfs_arc_grow_retry](#zfs_arc_grow_retry) - * [zfs_arc_lotsfree_percent](#zfs_arc_lotsfree_percent) - * [zfs_arc_max](#zfs_arc_max) - * [zfs_arc_meta_adjust_restarts](#zfs_arc_meta_adjust_restarts) - * [zfs_arc_meta_limit](#zfs_arc_meta_limit) - * [zfs_arc_meta_limit_percent](#zfs_arc_meta_limit_percent) - * [zfs_arc_meta_min](#zfs_arc_meta_min) - * [zfs_arc_meta_prune](#zfs_arc_meta_prune) - * [zfs_arc_meta_strategy](#zfs_arc_meta_strategy) - * [zfs_arc_min](#zfs_arc_min) - * [zfs_arc_min_prefetch_lifespan](#zfs_arc_min_prefetch_lifespan) - * [zfs_arc_min_prefetch_ms](#zfs_arc_min_prefetch_ms) - * [zfs_arc_min_prescient_prefetch_ms](#zfs_arc_min_prescient_prefetch_ms) - * [zfs_arc_overflow_shift](#zfs_arc_overflow_shift) - * [zfs_arc_p_dampener_disable](#zfs_arc_p_dampener_disable) - * [zfs_arc_p_min_shift](#zfs_arc_p_min_shift) - * [zfs_arc_pc_percent](#zfs_arc_pc_percent) - * [zfs_arc_shrink_shift](#zfs_arc_shrink_shift) - * [zfs_arc_sys_free](#zfs_arc_sys_free) - * [zfs_async_block_max_blocks](#zfs_async_block_max_blocks) - * [zfs_autoimport_disable](#zfs_autoimport_disable) - * [zfs_checksum_events_per_second](#zfs_checksum_events_per_second) - * 
[zfs_checksums_per_second](#zfs_checksums_per_second) - * [zfs_commit_timeout_pct](#zfs_commit_timeout_pct) - * [zfs_compressed_arc_enabled](#zfs_compressed_arc_enabled) - * [zfs_condense_indirect_commit_entry_delay_ms](#zfs_condense_indirect_commit_entry_delay_ms) - * [zfs_condense_indirect_vdevs_enable](#zfs_condense_indirect_vdevs_enable) - * [zfs_condense_max_obsolete_bytes](#zfs_condense_max_obsolete_bytes) - * [zfs_condense_min_mapping_bytes](#zfs_condense_min_mapping_bytes) - * [zfs_dbgmsg_enable](#zfs_dbgmsg_enable) - * [zfs_dbgmsg_maxsize](#zfs_dbgmsg_maxsize) - * [dbuf_cache_hiwater_pct](#dbuf_cache_hiwater_pct) - * [dbuf_cache_lowater_pct](#dbuf_cache_lowater_pct) - * [dbuf_cache_max_bytes](#dbuf_cache_max_bytes) - * [dbuf_cache_max_shift](#dbuf_cache_max_shift) - * [dbuf_cache_shift](#dbuf_cache_shift) - * [dbuf_metadata_cache_max_bytes](#dbuf_metadata_cache_max_bytes) - * [dbuf_metadata_cache_shift](#dbuf_metadata_cache_shift) - * [zfs_dbuf_state_index](#zfs_dbuf_state_index) - * [zfs_ddt_data_is_special](#zfs_ddt_data_is_special) - * [zfs_deadman_checktime_ms](#zfs_deadman_checktime_ms) - * [zfs_deadman_enabled](#zfs_deadman_enabled) - * [zfs_deadman_failmode](#zfs_deadman_failmode) - * [zfs_deadman_synctime_ms](#zfs_deadman_synctime_ms) - * [zfs_deadman_ziotime_ms](#zfs_deadman_ziotime_ms) - * [zfs_dedup_prefetch](#zfs_dedup_prefetch) - * [zfs_delay_min_dirty_percent](#zfs_delay_min_dirty_percent) - * [zfs_delay_scale](#zfs_delay_scale) - * [zfs_delays_per_second](#zfs_delays_per_second) - * [zfs_delete_blocks](#zfs_delete_blocks) - * [zfs_dirty_data_max](#zfs_dirty_data_max) - * [zfs_dirty_data_max_max](#zfs_dirty_data_max_max) - * [zfs_dirty_data_max_max_percent](#zfs_dirty_data_max_max_percent) - * [zfs_dirty_data_max_percent](#zfs_dirty_data_max_percent) - * [zfs_dirty_data_sync](#zfs_dirty_data_sync) - * [zfs_dirty_data_sync_percent](#zfs_dirty_data_sync_percent) - * [zfs_disable_dup_eviction](#zfs_disable_dup_eviction) - * [zfs_disable_ivset_guid_check](#zfs_disable_ivset_guid_check) - * [dmu_object_alloc_chunk_shift](#dmu_object_alloc_chunk_shift) - * [zfs_dmu_offset_next_sync](#zfs_dmu_offset_next_sync) - * [zfs_expire_snapshot](#zfs_expire_snapshot) - * [zfs_flags](#zfs_flags) - * [zfs_fletcher_4_impl](#zfs_fletcher_4_impl) - * [zfs_free_bpobj_enabled](#zfs_free_bpobj_enabled) - * [zfs_free_leak_on_eio](#zfs_free_leak_on_eio) - * [zfs_free_max_blocks](#zfs_free_max_blocks) - * [zfs_free_min_time_ms](#zfs_free_min_time_ms) - * [icp_aes_impl](#icp_aes_impl) - * [icp_gcm_impl](#icp_gcm_impl) - * [ignore_hole_birth](#ignore_hole_birth) - * [zfs_immediate_write_sz](#zfs_immediate_write_sz) - * [zfs_initialize_value](#zfs_initialize_value) - * [zfs_key_max_salt_uses](#zfs_key_max_salt_uses) - * [l2arc_feed_again](#l2arc_feed_again) - * [l2arc_feed_min_ms](#l2arc_feed_min_ms) - * [l2arc_feed_secs](#l2arc_feed_secs) - * [l2arc_headroom](#l2arc_headroom) - * [l2arc_headroom_boost](#l2arc_headroom_boost) - * [l2arc_nocompress](#l2arc_nocompress) - * [l2arc_noprefetch](#l2arc_noprefetch) - * [l2arc_norw](#l2arc_norw) - * [l2arc_write_boost](#l2arc_write_boost) - * [l2arc_write_max](#l2arc_write_max) - * [zfs_lua_max_instrlimit](#zfs_lua_max_instrlimit) - * [zfs_lua_max_memlimit](#zfs_lua_max_memlimit) - * [zfs_max_dataset_nesting](#zfs_max_dataset_nesting) - * [zfs_max_missing_tvds](#zfs_max_missing_tvds) - * [zfs_max_recordsize](#zfs_max_recordsize) - * [zfs_mdcomp_disable](#zfs_mdcomp_disable) - * [metaslab_aliquot](#metaslab_aliquot) - * 
[metaslab_bias_enabled](#metaslab_bias_enabled) - * [metaslab_debug_load](#metaslab_debug_load) - * [metaslab_debug_unload](#metaslab_debug_unload) - * [metaslab_force_ganging](#metaslab_force_ganging) - * [metaslab_fragmentation_factor_enabled](#metaslab_fragmentation_factor_enabled) - * [zfs_metaslab_fragmentation_threshold](#zfs_metaslab_fragmentation_threshold) - * [metaslab_lba_weighting_enabled](#metaslab_lba_weighting_enabled) - * [metaslab_preload_enabled](#metaslab_preload_enabled) - * [zfs_metaslab_segment_weight_enabled](#zfs_metaslab_segment_weight_enabled) - * [zfs_metaslab_switch_threshold](#zfs_metaslab_switch_threshold) - * [metaslabs_per_vdev](#metaslabs_per_vdev) - * [zfs_mg_fragmentation_threshold](#zfs_mg_fragmentation_threshold) - * [zfs_mg_noalloc_threshold](#zfs_mg_noalloc_threshold) - * [zfs_multihost_fail_intervals](#zfs_multihost_fail_intervals) - * [zfs_multihost_history](#zfs_multihost_history) - * [zfs_multihost_import_intervals](#zfs_multihost_import_intervals) - * [zfs_multihost_interval](#zfs_multihost_interval) - * [zfs_multilist_num_sublists](#zfs_multilist_num_sublists) - * [zfs_no_scrub_io](#zfs_no_scrub_io) - * [zfs_no_scrub_prefetch](#zfs_no_scrub_prefetch) - * [zfs_nocacheflush](#zfs_nocacheflush) - * [zfs_nopwrite_enabled](#zfs_nopwrite_enabled) - * [zfs_object_mutex_size](#zfs_object_mutex_size) - * [zfs_obsolete_min_time_ms](#zfs_obsolete_min_time_ms) - * [zfs_override_estimate_recordsize](#zfs_override_estimate_recordsize) - * [zfs_pd_bytes_max](#zfs_pd_bytes_max) - * [zfs_per_txg_dirty_frees_percent](#zfs_per_txg_dirty_frees_percent) - * [zfs_prefetch_disable](#zfs_prefetch_disable) - * [zfs_qat_checksum_disable](#zfs_qat_checksum_disable) - * [zfs_qat_compress_disable](#zfs_qat_compress_disable) - * [zfs_qat_disable](#zfs_qat_disable) - * [zfs_qat_encrypt_disable](#zfs_qat_encrypt_disable) - * [zfs_read_chunk_size](#zfs_read_chunk_size) - * [zfs_read_history](#zfs_read_history) - * [zfs_read_history_hits](#zfs_read_history_hits) - * [zfs_reconstruct_indirect_combinations_max](#zfs_reconstruct_indirect_combinations_max) - * [zfs_recover](#zfs_recover) - * [zfs_recv_queue_length](#zfs_recv_queue_length) - * [zfs_removal_ignore_errors](#zfs_removal_ignore_errors) - * [zfs_removal_suspend_progress](#zfs_removal_suspend_progress) - * [zfs_remove_max_segment](#zfs_remove_max_segment) - * [zfs_resilver_delay](#zfs_resilver_delay) - * [zfs_resilver_disable_defer](#zfs_resilver_disable_defer) - * [zfs_resilver_min_time_ms](#zfs_resilver_min_time_ms) - * [zfs_scan_checkpoint_intval](#zfs_scan_checkpoint_intval) - * [zfs_scan_fill_weight](#zfs_scan_fill_weight) - * [zfs_scan_idle](#zfs_scan_idle) - * [zfs_scan_ignore_errors](#zfs_scan_ignore_errors) - * [zfs_scan_issue_strategy](#zfs_scan_issue_strategy) - * [zfs_scan_legacy](#zfs_scan_legacy) - * [zfs_scan_max_ext_gap](#zfs_scan_max_ext_gap) - * [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact) - * [zfs_scan_mem_lim_soft_fact](#zfs_scan_mem_lim_soft_fact) - * [zfs_scan_min_time_ms](#zfs_scan_min_time_ms) - * [zfs_scan_strict_mem_lim](#zfs_scan_strict_mem_lim) - * [zfs_scan_suspend_progress](#zfs_scan_suspend_progress) - * [zfs_scan_vdev_limit](#zfs_scan_vdev_limit) - * [zfs_scrub_delay](#zfs_scrub_delay) - * [zfs_scrub_min_time_ms](#zfs_scrub_min_time_ms) - * [zfs_send_corrupt_data](#zfs_send_corrupt_data) - * [send_holes_without_birth_time](#send_holes_without_birth_time) - * [zfs_send_queue_length](#zfs_send_queue_length) - * [zfs_send_unmodified_spill_blocks](#zfs_send_unmodified_spill_blocks) - * 
[zfs_slow_io_events_per_second](#zfs_slow_io_events_per_second) - * [spa_asize_inflation](#spa_asize_inflation) - * [spa_config_path](#spa_config_path) - * [zfs_spa_discard_memory_limit](#zfs_spa_discard_memory_limit) - * [spa_load_print_vdev_tree](#spa_load_print_vdev_tree) - * [spa_load_verify_data](#spa_load_verify_data) - * [spa_load_verify_maxinflight](#spa_load_verify_maxinflight) - * [spa_load_verify_metadata](#spa_load_verify_metadata) - * [spa_load_verify_shift](#spa_load_verify_shift) - * [spa_slop_shift](#spa_slop_shift) - * [zfs_special_class_metadata_reserve_pct](#zfs_special_class_metadata_reserve_pct) - * [spl_hostid](#spl_hostid) - * [spl_hostid_path](#spl_hostid_path) - * [spl_kmem_alloc_max](#spl_kmem_alloc_max) - * [spl_kmem_alloc_warn](#spl_kmem_alloc_warn) - * [spl_kmem_cache_expire](#spl_kmem_cache_expire) - * [spl_kmem_cache_kmem_limit](#spl_kmem_cache_kmem_limit) - * [spl_kmem_cache_kmem_threads](#spl_kmem_cache_kmem_threads) - * [spl_kmem_cache_magazine_size](#spl_kmem_cache_magazine_size) - * [spl_kmem_cache_max_size](#spl_kmem_cache_max_size) - * [spl_kmem_cache_obj_per_slab](#spl_kmem_cache_obj_per_slab) - * [spl_kmem_cache_obj_per_slab_min](#spl_kmem_cache_obj_per_slab_min) - * [spl_kmem_cache_reclaim](#spl_kmem_cache_reclaim) - * [spl_kmem_cache_slab_limit](#spl_kmem_cache_slab_limit) - * [spl_max_show_tasks](#spl_max_show_tasks) - * [spl_panic_halt](#spl_panic_halt) - * [spl_taskq_kick](#spl_taskq_kick) - * [spl_taskq_thread_bind](#spl_taskq_thread_bind) - * [spl_taskq_thread_dynamic](#spl_taskq_thread_dynamic) - * [spl_taskq_thread_priority](#spl_taskq_thread_priority) - * [spl_taskq_thread_sequential](#spl_taskq_thread_sequential) - * [zfs_sync_pass_deferred_free](#zfs_sync_pass_deferred_free) - * [zfs_sync_pass_dont_compress](#zfs_sync_pass_dont_compress) - * [zfs_sync_pass_rewrite](#zfs_sync_pass_rewrite) - * [zfs_sync_taskq_batch_pct](#zfs_sync_taskq_batch_pct) - * [zfs_top_maxinflight](#zfs_top_maxinflight) - * [zfs_trim_extent_bytes_max](#zfs_trim_extent_bytes_max) - * [zfs_trim_extent_bytes_min](#zfs_trim_extent_bytes_min) - * [zfs_trim_metaslab_skip](#zfs_trim_metaslab_skip) - * [zfs_trim_queue_limit](#zfs_trim_queue_limit) - * [zfs_trim_txg_batch](#zfs_trim_txg_batch) - * [zfs_txg_history](#zfs_txg_history) - * [zfs_txg_timeout](#zfs_txg_timeout) - * [zfs_unlink_suspend_progress](#zfs_unlink_suspend_progress) - * [zfs_user_indirect_is_special](#zfs_user_indirect_is_special) - * [zfs_vdev_aggregate_trim](#zfs_vdev_aggregate_trim) - * [zfs_vdev_aggregation_limit](#zfs_vdev_aggregation_limit) - * [zfs_vdev_aggregation_limit_non_rotating](#zfs_vdev_aggregation_limit_non_rotating) - * [zfs_vdev_async_read_max_active](#zfs_vdev_async_read_max_active) - * [zfs_vdev_async_read_min_active](#zfs_vdev_async_read_min_active) - * [zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent) - * [zfs_vdev_async_write_active_min_dirty_percent](#zfs_vdev_async_write_active_min_dirty_percent) - * [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) - * [zfs_vdev_async_write_min_active](#zfs_vdev_async_write_min_active) - * [zfs_vdev_cache_bshift](#zfs_vdev_cache_bshift) - * [zfs_vdev_cache_max](#zfs_vdev_cache_max) - * [zfs_vdev_cache_size](#zfs_vdev_cache_size) - * [zfs_vdev_default_ms_count](#zfs_vdev_default_ms_count) - * [zfs_vdev_initializing_max_active](#zfs_vdev_initializing_max_active) - * [zfs_vdev_initializing_min_active](#zfs_vdev_initializing_min_active) - * [zfs_vdev_max_active](#zfs_vdev_max_active) - * 
[zfs_vdev_min_ms_count](#zfs_vdev_min_ms_count) - * [zfs_vdev_mirror_non_rotating_inc](#zfs_vdev_mirror_non_rotating_inc) - * [zfs_vdev_mirror_non_rotating_seek_inc](#zfs_vdev_mirror_non_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_inc](#zfs_vdev_mirror_rotating_inc) - * [zfs_vdev_mirror_rotating_seek_inc](#zfs_vdev_mirror_rotating_seek_inc) - * [zfs_vdev_mirror_rotating_seek_offset](#zfs_vdev_mirror_rotating_seek_offset) - * [zfs_vdev_ms_count_limit](#zfs_vdev_ms_count_limit) - * [zfs_vdev_queue_depth_pct](#zfs_vdev_queue_depth_pct) - * [zfs_vdev_raidz_impl](#zfs_vdev_raidz_impl) - * [zfs_vdev_read_gap_limit](#zfs_vdev_read_gap_limit) - * [zfs_vdev_removal_max_active](#zfs_vdev_removal_max_active) - * [vdev_removal_max_span](#vdev_removal_max_span) - * [zfs_vdev_removal_min_active](#zfs_vdev_removal_min_active) - * [zfs_vdev_scheduler](#zfs_vdev_scheduler) - * [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) - * [zfs_vdev_scrub_min_active](#zfs_vdev_scrub_min_active) - * [zfs_vdev_sync_read_max_active](#zfs_vdev_sync_read_max_active) - * [zfs_vdev_sync_read_min_active](#zfs_vdev_sync_read_min_active) - * [zfs_vdev_sync_write_max_active](#zfs_vdev_sync_write_max_active) - * [zfs_vdev_sync_write_min_active](#zfs_vdev_sync_write_min_active) - * [zfs_vdev_trim_max_active](#zfs_vdev_trim_max_active) - * [zfs_vdev_trim_min_active](#zfs_vdev_trim_min_active) - * [vdev_validate_skip](#vdev_validate_skip) - * [zfs_vdev_write_gap_limit](#zfs_vdev_write_gap_limit) - * [zfs_zevent_cols](#zfs_zevent_cols) - * [zfs_zevent_console](#zfs_zevent_console) - * [zfs_zevent_len_max](#zfs_zevent_len_max) - * [zfetch_array_rd_sz](#zfetch_array_rd_sz) - * [zfetch_max_distance](#zfetch_max_distance) - * [zfetch_max_streams](#zfetch_max_streams) - * [zfetch_min_sec_reap](#zfetch_min_sec_reap) - * [zfs_zil_clean_taskq_maxalloc](#zfs_zil_clean_taskq_maxalloc) - * [zfs_zil_clean_taskq_minalloc](#zfs_zil_clean_taskq_minalloc) - * [zfs_zil_clean_taskq_nthr_pct](#zfs_zil_clean_taskq_nthr_pct) - * [zil_nocacheflush](#zil_nocacheflush) - * [zil_replay_disable](#zil_replay_disable) - * [zil_slog_bulk](#zil_slog_bulk) - * [zio_deadman_log_all](#zio_deadman_log_all) - * [zio_decompress_fail_fraction](#zio_decompress_fail_fraction) - * [zio_delay_max](#zio_delay_max) - * [zio_dva_throttle_enabled](#zio_dva_throttle_enabled) - * [zio_requeue_io_start_cut_in_line](#zio_requeue_io_start_cut_in_line) - * [zio_slow_io_ms](#zio_slow_io_ms) - * [zio_taskq_batch_pct](#zio_taskq_batch_pct) - * [zvol_inhibit_dev](#zvol_inhibit_dev) - * [zvol_major](#zvol_major) - * [zvol_max_discard_blocks](#zvol_max_discard_blocks) - * [zvol_prefetch_bytes](#zvol_prefetch_bytes) - * [zvol_request_sync](#zvol_request_sync) - * [zvol_threads](#zvol_threads) - * [zvol_volmode](#zvol_volmode) - -# ZFS Module Parameters - -### ignore_hole_birth -When set, the hole_birth optimization will not be used and all holes will -always be sent by `zfs send` In the source code, ignore_hole_birth is an -alias for and SysFS PARAMETER for [send_holes_without_birth_time](#send_holes_without_birth_time). - -| ignore_hole_birth | Notes -|---|--- -| Tags | [send](#send) -| When to change | Enable if you suspect your datasets are affected by a bug in hole_birth during `zfs send` operations -| Data Type | boolean -| Range | 0=disabled, 1=enabled -| Default | 1 (hole birth optimization is ignored) -| Change | Dynamic -| Versions Affected | TBD - -### l2arc_feed_again -Turbo L2ARC cache warm-up. 
When the L2ARC is cold the fill interval will be -set to aggressively fill as fast as possible. - -| l2arc_feed_again | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | If cache devices exist and it is desired to fill them as fast as possible -| Data Type | boolean -| Range | 0=disabled, 1=enabled -| Default | 1 -| Change | Dynamic -| Versions Affected | TBD - -### l2arc_feed_min_ms -Minimum time period for aggressively feeding the L2ARC. The L2ARC feed thread -wakes up once per second (see [l2arc_feed_secs](#l2arc_feed_secs)) to look for data to feed into -the L2ARC. `l2arc_feed_min_ms` only affects the turbo L2ARC cache warm-up and -allows the aggressiveness to be adjusted. - -| l2arc_feed_min_ms | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | If cache devices exist and [l2arc_feed_again](#l2arc_feed_again) and the feed is too aggressive, then this tunable can be adjusted to reduce the impact of the fill -| Data Type | uint64 -| Units | milliseconds -| Range | 0 to (1000 * l2arc_feed_secs) -| Default | 200 -| Change | Dynamic -| Versions Affected | 0.6 and later - -### l2arc_feed_secs -Seconds between waking the L2ARC feed thread. One feed thread works for all cache devices in turn. - -If the pool that owns a cache device is imported readonly, then the feed thread is delayed 5 * [l2arc_feed_secs](#l2arc_feed_secs) before moving onto the next cache device. If multiple pools are imported with cache devices and one pool with cache is imported readonly, the L2ARC feed rate to all caches can be slowed. - -| l2arc_feed_secs | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | Do not change -| Data Type | uint64 -| Units | seconds -| Range | 1 to UINT64_MAX -| Default | 1 -| Change | Dynamic -| Versions Affected | 0.6 and later - -### l2arc_headroom -How far through the ARC lists to search for L2ARC cacheable content, expressed -as a multiplier of [l2arc_write_max](#l2arc_write_max) - -| l2arc_headroom | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | If the rate of change in the ARC is faster than the overall L2ARC feed rate, then increasing l2arc_headroom can increase L2ARC efficiency. Setting the value too large can cause the L2ARC feed thread to consume more CPU time looking for data to feed. -| Data Type | uint64 -| Units | unit -| Range | 0 to UINT64_MAX -| Default | 2 -| Change | Dynamic -| Versions Affected | 0.6 and later - -### l2arc_headroom_boost -Percentage scale for [l2arc_headroom](#l2arc_headroom) when L2ARC contents are being successfully -compressed before writing. - -| l2arc_headroom_boost | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | If average compression efficiency is greater than 2:1, then increasing [l2arc_headroom_boost](#l2arc_headroom_boost) can increase the L2ARC feed rate -| Data Type | uint64 -| Units | percent -| Range | 100 to UINT64_MAX, when set to 100, the L2ARC headroom boost feature is effectively disabled -| Default | 200 -| Change | Dynamic -| Versions Affected | all - -### l2arc_nocompress -Disable writing compressed data to cache devices. Disabling allows the legacy -behavior of writing decompressed data to cache devices. 
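-
-Most of the L2ARC feed tunables above (`l2arc_feed_min_ms`, `l2arc_feed_secs`,
-`l2arc_headroom`, `l2arc_headroom_boost`) are marked Dynamic and can be read or
-changed at runtime through sysfs. A minimal sketch, assuming a standard ZFS on
-Linux install that exposes parameters under `/sys/module/zfs/parameters` (run
-as root; the value written is illustrative only):
-
-```sh
-# Show the current L2ARC feed and headroom settings
-grep . /sys/module/zfs/parameters/l2arc_feed_* \
-       /sys/module/zfs/parameters/l2arc_headroom*
-
-# Example only: search twice as far through the ARC lists per feed pass
-echo 4 > /sys/module/zfs/parameters/l2arc_headroom
-```
-
-Changes made this way take effect immediately but do not survive a reboot.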
- -| l2arc_nocompress | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | When testing compressed L2ARC feature -| Data Type | boolean -| Range | 0=store compressed blocks in cache device, 1=store uncompressed blocks in cache device -| Default | 0 -| Change | Dynamic -| Versions Affected | deprecated in v0.7.0 by new compressed ARC design - -### l2arc_noprefetch -Disables writing prefetched, but unused, buffers to cache devices. - -| l2arc_noprefetch | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc), [prefetch](#prefetch) -| When to change | Setting to 0 can increase L2ARC hit rates for workloads where the ARC is too small for a read workload that benefits from prefetching. Also, if the main pool devices are very slow, setting to 0 can improve some workloads such as backups. -| Data Type | boolean -| Range | 0=write prefetched but unused buffers to cache devices, 1=do not write prefetched but unused buffers to cache devices -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.0 and later - -### l2arc_norw -Disables writing to cache devices while they are being read. - -| l2arc_norw | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | In the early days of SSDs, some devices did not perform well when reading and writing simultaneously. Modern SSDs do not have these issues. -| Data Type | boolean -| Range | 0=read and write simultaneously, 1=avoid writes when reading for antique SSDs -| Default | 0 -| Change | Dynamic -| Versions Affected | all - -### l2arc_write_boost -Until the ARC fills, increases the L2ARC fill rate [l2arc_write_max](#l2arc_write_max) by -`l2arc_write_boost`. - -| l2arc_write_boost | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | To fill the cache devices more aggressively after pool import. -| Data Type | uint64 -| Units | bytes -| Range | 0 to UINT64_MAX -| Default | 8,388,608 -| Change | Dynamic -| Versions Affected | all - -### l2arc_write_max -Maximum number of bytes to be written to each cache device for each L2ARC feed -thread interval (see [l2arc_feed_secs](#l2arc_feed_secs)). The actual limit can be adjusted by -[l2arc_write_boost](#l2arc_write_boost). By default [l2arc_feed_secs](#l2arc_feed_secs) is 1 second, delivering a -maximum write workload to cache devices of 8 MiB/sec. - -| l2arc_write_max | Notes -|---|--- -| Tags | [ARC](#arc), [L2ARC](#l2arc) -| When to change | If the cache devices can sustain the write workload, increasing the rate of cache device fill when workloads generate new data at a rate higher than l2arc_write_max can increase L2ARC hit rate -| Data Type | uint64 -| Units | bytes -| Range | 1 to UINT64_MAX -| Default | 8,388,608 -| Change | Dynamic -| Versions Affected | all - -### metaslab_aliquot -Sets the metaslab granularity. Nominally, ZFS will try to allocate this amount -of data to a top-level vdev before moving on to the next top-level vdev. -This is roughly similar to what would be referred to as the "stripe size" in -traditional RAID arrays. - -When tuning for HDDs, it can be more efficient to have a few larger, sequential -writes to a device rather than switching to the next device. Monitoring the -size of contiguous writes to the disks relative to the write throughput can be -used to determine if increasing `metaslab_aliquot` can help. For modern devices, -it is unlikely that decreasing `metaslab_aliquot` from the default will help. - -If there is only one top-level vdev, this tunable is not used. 
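-
-If experimentation shows that a different `metaslab_aliquot` (or any other
-tunable on this page) helps a workload, the value can be made persistent across
-reboots through a modprobe options file. A hedged sketch, assuming the
-conventional `/etc/modprobe.d/zfs.conf` location and a purely illustrative
-value:
-
-```sh
-# Persist the setting; it is applied the next time the zfs module loads
-cat >> /etc/modprobe.d/zfs.conf <<'EOF'
-options zfs metaslab_aliquot=1048576
-EOF
-```
-
-Parameters whose Change row reads "Prior to zfs module load" can only be set
-this way (or on the kernel command line), not through sysfs on a running
-system.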
- -| metaslab_aliquot | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab), [vdev](#vdev) -| When to change | If write performance increases as devices more efficiently write larger, contiguous blocks -| Data Type | uint64 -| Units | bytes -| Range | 0 to UINT64_MAX -| Default | 524,288 -| Change | Dynamic -| Versions Affected | all - -### metaslab_bias_enabled -Enables metaslab group biasing based on a top-level vdev's utilization -relative to the pool. Nominally, all top-level devs are the same size and the -allocation is spread evenly. When the top-level vdevs are not of the same size, -for example if a new (empty) top-level is added to the pool, this allows the -new top-level vdev to get a larger portion of new allocations. - -| metaslab_bias_enabled | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab), [vdev](#vdev) -| When to change | If a new top-level vdev is added and you do not want to bias new allocations to the new top-level vdev -| Data Type | boolean -| Range | 0=spread evenly across top-level vdevs, 1=bias spread to favor less full top-level vdevs -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_metaslab_segment_weight_enabled -Enables metaslab allocation based on largest free segment rather than total -amount of free space. The goal is to avoid metaslabs that exhibit free space -fragmentation: when there is a lot of small free spaces, but few larger free -spaces. - -If `zfs_metaslab_segment_weight_enabled` is enabled, then -[metaslab_fragmentation_factor_enabled](#metaslab_fragmentation_factor_enabled) is ignored. - -| zfs_metaslab_segment_weight_enabled | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab) -| When to change | When testing allocation and fragmentation -| Data Type | boolean -| Range | 0=do not consider metaslab fragmentation, 1=avoid metaslabs where free space is highly fragmented -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_metaslab_switch_threshold -When using segment-based metaslab selection -(see [zfs_metaslab_segment_weight_enabled](#zfs_metaslab_segment_weight_enabled)), continue allocating -from the active metaslab until `zfs_metaslab_switch_threshold` -worth of free space buckets have been exhausted. - -| zfs_metaslab_switch_threshold | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab) -| When to change | When testing allocation and fragmentation -| Data Type | uint64 -| Units | free spaces -| Range | 0 to UINT64_MAX -| Default | 2 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### metaslab_debug_load -When enabled, all metaslabs are loaded into memory during pool import. -Nominally, metaslab space map information is loaded and unloaded as needed -(see [metaslab_debug_unload](#metaslab_debug_unload)) - -It is difficult to predict how much RAM is required to store a space map. -An empty or completely full metaslab has a small space map. However, a highly -fragmented space map can consume significantly more memory. - -Enabling `metaslab_debug_load` can increase pool import time. 
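-
-Before enabling `metaslab_debug_load`, it can be helpful to confirm which
-metaslab tunables the loaded module actually exposes and what the current
-setting is; a small sketch (paths assumed, not specific to any release):
-
-```sh
-# List every metaslab-related tunable exposed by the loaded module
-ls /sys/module/zfs/parameters/ | grep metaslab
-
-# Current value (0 = load space maps on demand)
-cat /sys/module/zfs/parameters/metaslab_debug_load
-```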
- -| metaslab_debug_load | Notes -|---|--- -| Tags | [allocation](#allocation), [memory](#memory), [metaslab](#metaslab) -| When to change | When RAM is plentiful and pool import time is not a consideration -| Data Type | boolean -| Range | 0=do not load all metaslab info at pool import, 1=dynamically load metaslab info as needed -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### metaslab_debug_unload -When enabled, prevents metaslab information from being dynamically unloaded from RAM. -Nominally, metaslab space map information is loaded and unloaded as needed -(see [metaslab_debug_load](#metaslab_debug_load)) - -It is difficult to predict how much RAM is required to store a space map. -An empty or completely full metaslab has a small space map. However, a highly -fragmented space map can consume significantly more memory. - -Enabling `metaslab_debug_unload` consumes RAM that would otherwise be freed. - -| metaslab_debug_unload | Notes -|---|--- -| Tags | [allocation](#allocation), [memory](#memory), [metaslab](#metaslab) -| When to change | When RAM is plentiful and the penalty for dynamically reloading metaslab info from the pool is high -| Data Type | boolean -| Range | 0=dynamically unload metaslab info, 1=unload metaslab info only upon pool export -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### metaslab_fragmentation_factor_enabled -Enable use of the fragmentation metric in computing metaslab weights. - -In version v0.7.0, if [zfs_metaslab_segment_weight_enabled](#zfs_metaslab_segment_weight_enabled) is enabled, then -`metaslab_fragmentation_factor_enabled` is ignored. - -| metaslab_fragmentation_factor_enabled | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab) -| When to change | To test metaslab fragmentation -| Data Type | boolean -| Range | 0=do not consider metaslab free space fragmentation, 1=try to avoid fragmented metaslabs -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### metaslabs_per_vdev -When a vdev is added, it will be divided into approximately, but no more than, -this number of metaslabs. - -| metaslabs_per_vdev | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab), [vdev](#vdev) -| When to change | When testing metaslab allocation -| Data Type | uint64 -| Units | metaslabs -| Range | 16 to UINT64_MAX -| Default | 200 -| Change | Prior to pool creation or adding new top-level vdevs -| Versions Affected | all - -### metaslab_preload_enabled -Enable metaslab group preloading. Each top-level vdev has a metaslab group. -By default, up to 3 copies of metadata can exist and are distributed across multiple -top-level vdevs. `metaslab_preload_enabled` allows the corresponding metaslabs to be -preloaded, thus improving allocation efficiency. - -| metaslab_preload_enabled | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab) -| When to change | When testing metaslab allocation -| Data Type | boolean -| Range | 0=do not preload metaslab info, 1=preload up to 3 metaslabs -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### metaslab_lba_weighting_enabled -Modern HDDs have uniform bit density and constant angular velocity. -Therefore, the outer recording zones are faster (higher bandwidth) -than the inner zones by the ratio of outer to inner track diameter. -The difference in bandwidth can be 2:1, and is often available in the HDD -detailed specifications or drive manual. 
For HDDs when -`metaslab_lba_weighting_enabled` is true, write allocation preference is given -to the metaslabs representing the outer recording zones. Thus the allocation -to metaslabs prefers faster bandwidth over free space. - -If the devices are not rotational, yet misrepresent themselves to the OS as -rotational, then disabling `metaslab_lba_weighting_enabled` can result in more -even, free-space-based allocation. - -| metaslab_lba_weighting_enabled | Notes -|---|--- -| Tags | [allocation](#allocation), [metaslab](#metaslab), [HDD](#hdd), [SSD](#ssd) -| When to change | disable if using only SSDs and version v0.6.4 or earlier -| Data Type | boolean -| Range | 0=do not use LBA weighting, 1=use LBA weighting -| Default | 1 -| Change | Dynamic -| Verfication | The rotational setting described by a block device in sysfs by observing `/sys/block/DISK_NAME/queue/rotational` -| Versions Affected | prior to v0.6.5, the check for non-rotation media did not exist - -### spa_config_path -By default, the `zpool import` command searches for pool information in -the `zpool.cache` file. If the pool to be imported has an entry -in `zpool.cache` then the devices do not have to be scanned to determine if -they are pool members. The path to the cache file is spa_config_path. - -For more information on `zpool import` and the `-o cachefile` and -`-d` options, see the man page for zpool(8) - -See also [zfs_autoimport_disable](#zfs_autoimport_disable) - -| spa_config_path | Notes -|---|--- -| Tags | [import](#import) -| When to change | If creating a non-standard distribution and the cachefile property is inconvenient -| Data Type | string -| Default | `/etc/zfs/zpool.cache` -| Change | Dynamic, applies only to the next invocation of `zpool import` -| Versions Affected | all - -### spa_asize_inflation -Multiplication factor used to estimate actual disk consumption from the -size of data being written. The default value is a worst case estimate, -but lower values may be valid for a given pool depending on its -configuration. Pool administrators who understand the factors involved -may wish to specify a more realistic inflation factor, particularly if -they operate close to quota or capacity limits. - -The worst case space requirement for allocation is single-sector -max-parity RAIDZ blocks, in which case the space requirement is exactly -4 times the size, accounting for a maximum of 3 parity blocks. -This is added to the maximum number of ZFS `copies` parameter (copies max=3). -Additional space is required if the block could impact deduplication -tables. Altogether, the worst case is 24. - -If the estimation is not correct, then quotas or out-of-space conditions can -lead to optimistic expectations of the ability to allocate. Applications are -typically not prepared to deal with such failures and can misbehave. - -| spa_asize_inflation | Notes -|---|--- -| Tags | [allocation](#allocation), [SPA](#spa) -| When to change | If the allocation requirements for the workload are well known and quotas are used -| Data Type | uint64 -| Units | unit -| Range | 1 to 24 -| Default | 24 -| Change | Dynamic -| Versions Affected | v0.6.3 and later - -### spa_load_verify_data -An extreme rewind import (see `zpool import -X`) normally performs a -full traversal of all blocks in the pool for verification. If this parameter -is set to 0, the traversal skips non-metadata blocks. It can be toggled -once the import has started to stop or start the traversal of non-metadata -blocks. 
See also [spa_load_verify_metadata](#spa_load_verify_metadata). - -| spa_load_verify_data | Notes -|---|--- -| Tags | [allocation](#allocation), [SPA](#spa) -| When to change | At the risk of data integrity, to speed extreme import of large pool -| Data Type | boolean -| Range | 0=do not verify data upon pool import, 1=verify pool data upon import -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### spa_load_verify_metadata -An extreme rewind import (see `zpool import -X`) normally performs a -full traversal of all blocks in the pool for verification. If this parameter -is set to 0, the traversal is not performed. It can be toggled once the -import has started to stop or start the traversal. See [spa_load_verify_data](#spa_load_verify_data) - -| spa_load_verify_metadata | Notes -|---|--- -| Tags | [import](#import) -| When to change | At the risk of data integrity, to speed extreme import of large pool -| Data Type | boolean -| Range | 0=do not verify metadata upon pool import, 1=verify pool metadata upon import -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### spa_load_verify_maxinflight -Maximum number of concurrent I/Os during the data verification performed -during an extreme rewind import (see `zpool import -X`) - -| spa_load_verify_maxinflight | Notes -|---|--- -| Tags | [import](#import) -| When to change | During an extreme rewind import, to match the concurrent I/O capabilities of the pool devices -| Data Type | int -| Units | I/Os -| Range | 1 to MAX_INT -| Default | 10,000 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### spa_slop_shift -Normally, the last 3.2% (1/(2^`spa_slop_shift`)) of pool space -is reserved to ensure the pool doesn't run completely out of space, -due to unaccounted changes (e.g. to the MOS). -This also limits the worst-case time to allocate space. -When less than this amount of free space exists, most ZPL operations -(e.g. write, create) return error:no space (ENOSPC). - -Changing spa_slop_shift affects the currently loaded ZFS module and -all imported pools. spa_slop_shift is not stored on disk. Beware when -importing full pools on systems with larger spa_slop_shift can lead to -over-full conditions. - -The minimum SPA slop space is limited to 128 MiB. - -| spa_slop_shift | Notes -|---|--- -| Tags | [allocation](#allocation), [SPA](#spa) -| When to change | For large pools, when 3.2% may be too conservative and more usable space is desired, consider increasing `spa_slop_shift` -| Data Type | int -| Units | shift -| Range | 1 to MAX_INT, however the practical upper limit is 15 for a system with 4TB of RAM -| Default | 5 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfetch_array_rd_sz -If prefetching is enabled, do not prefetch blocks larger than `zfetch_array_rd_sz` size. - -| zfetch_array_rd_sz | Notes -|---|--- -| Tags | [prefetch](#prefetch) -| When to change | To allow prefetching when using large block sizes -| Data Type | unsigned long -| Units | bytes -| Range | 0 to MAX_ULONG -| Default | 1,048,576 (1 MiB) -| Change | Dynamic -| Versions Affected | all - -### zfetch_max_distance -Limits the maximum number of bytes to prefetch per stream. 
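-
-For streaming reads of large blocks (for example a 1 MiB `recordsize`), the
-default 8 MiB distance covers only a few blocks per stream, so workloads with
-high prefetch hit ratios may benefit from a larger value, as the table below
-notes. A sketch with an illustrative value:
-
-```sh
-# Example only: allow up to 64 MiB of readahead per prefetch stream
-echo 67108864 > /sys/module/zfs/parameters/zfetch_max_distance
-```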
- -| zfetch_max_distance | Notes -|---|--- -| Tags | [prefetch](#prefetch) -| When to change | Consider increasing read workloads that use large blocks and exhibit high prefetch hit ratios -| Data Type | uint -| Units | bytes -| Range | 0 to UINT_MAX -| Default | 8,388,608 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### zfetch_max_streams -Maximum number of prefetch streams per file. - -For version v0.7.0 and later, when prefetching small files the number of -prefetch streams is automatically reduced below to prevent the -streams from overlapping. - -| zfetch_max_streams | Notes -|---|--- -| Tags | [prefetch](#prefetch) -| When to change | If the workload benefits from prefetching and has more than `zfetch_max_streams` concurrent reader threads -| Data Type | uint -| Units | streams -| Range | 1 to MAX_UINT -| Default | 8 -| Change | Dynamic -| Versions Affected | all - -### zfetch_min_sec_reap -Prefetch streams that have been accessed in `zfetch_min_sec_reap` seconds are -automatically stopped. - -| zfetch_min_sec_reap | Notes -|---|--- -| Tags | [prefetch](#prefetch) -| When to change | To test prefetch efficiency -| Data Type | uint -| Units | seconds -| Range | 0 to MAX_UINT -| Default | 2 -| Change | Dynamic -| Versions Affected | all - -### zfs_arc_dnode_limit_percent -Percentage of ARC metadata space that can be used for dnodes. - -The value calculated for `zfs_arc_dnode_limit_percent` can be overridden by -[zfs_arc_dnode_limit](#zfs_arc_dnode_limit). - -| zfs_arc_dnode_limit_percent | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Consider increasing if `arc_prune` is using excessive system time and `/proc/spl/kstat/zfs/arcstats` shows `arc_dnode_size` is near or over `arc_dnode_limit` -| Data Type | int -| Units | percent of arc_meta_limit -| Range | 0 to 100 -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_arc_dnode_limit -When the number of bytes consumed by dnodes in the ARC exceeds -`zfs_arc_dnode_limit` bytes, demand for new metadata can take from the space -consumed by dnodes. - -The default value 0, indicates that a percent which is based on -[zfs_arc_dnode_limit_percent](#zfs_arc_dnode_limit_percent) of the ARC meta buffers that may be used for dnodes. - -`zfs_arc_dnode_limit` is similar to [zfs_arc_meta_prune](#zfs_arc_meta_prune) which serves a similar -purpose for metadata. - -| zfs_arc_dnode_limit | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Consider increasing if `arc_prune` is using excessive system time and `/proc/spl/kstat/zfs/arcstats` shows `arc_dnode_size` is near or over `arc_dnode_limit` -| Data Type | uint64 -| Units | bytes -| Range | 0 to MAX_UINT64 -| Default | 0 (uses [zfs_arc_dnode_limit_percent](#zfs_arc_dnode_limit_percent)) -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_arc_dnode_reduce_percent -Percentage of ARC dnodes to try to evict in response to demand for -non-metadata when the number of bytes consumed by dnodes exceeds -[zfs_arc_dnode_limit](#zfs_arc_dnode_limit). - -| zfs_arc_dnode_reduce_percent | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Testing dnode cache efficiency -| Data Type | uint64 -| Units | percent of size of dnode space used above [zfs_arc_dnode_limit](#zfs_arc_dnode_limit) -| Range | 0 to 100 -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_arc_average_blocksize -The ARC's buffer hash table is sized based on the assumption of an average -block size of `zfs_arc_average_blocksize`. 
The default of 8 KiB uses -approximately 1 MiB of hash table per 1 GiB of physical memory with -8-byte pointers. - -| zfs_arc_average_blocksize | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | For workloads where the known average blocksize is larger, increasing `zfs_arc_average_blocksize` can reduce memory usage -| Data Type | int -| Units | bytes -| Range | 512 to 16,777,216 -| Default | 8,192 -| Change | Prior to zfs module load -| Versions Affected | all - -### zfs_arc_evict_batch_limit -Number ARC headers to evict per sublist before proceeding to another sublist. -This batch-style operation prevents entire sublists from being evicted at once -but comes at a cost of additional unlocking and locking. - -| zfs_arc_evict_batch_limit | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Testing ARC multilist features -| Data Type | int -| Units | count of ARC headers -| Range | 1 to INT_MAX -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_grow_retry -When the ARC is shrunk due to memory demand, do not retry growing the ARC -for `zfs_arc_grow_retry` seconds. This operates as a damper to prevent -oscillating grow/shrink cycles when there is memory pressure. - -If `zfs_arc_grow_retry` = 0, the internal default of 5 seconds is used. - -| zfs_arc_grow_retry | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | TBD -| Data Type | int -| Units | seconds -| Range | 1 to MAX_INT -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_lotsfree_percent -Throttle ARC memory consumption, effectively throttling I/O, when free -system memory drops below this percentage of total system memory. Setting -`zfs_arc_lotsfree_percent` to 0 disables the throttle. - -The arcstat_memory_throttle_count counter in `/proc/spl/kstat/arcstats` -can indicate throttle activity. - -| zfs_arc_lotsfree_percent | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | TBD -| Data Type | int -| Units | percent -| Range | 0 to 100 -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_max -Maximum size of ARC in bytes. If set to 0 then the maximum ARC size is set -to 1/2 of system RAM. - -`zfs_arc_max` can be changed dynamically with some caveats. It cannot be set back -to 0 while running and reducing it below the current ARC size will not cause -the ARC to shrink without memory pressure to induce shrinking. - -| zfs_arc_max | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | Reduce if ARC competes too much with other applications, increase if ZFS is the primary application and can use more RAM -| Data Type | uint64 -| Units | bytes -| Range | 67,108,864 to RAM size in bytes -| Default | 0 (uses default of RAM size in bytes / 2) -| Change | Dynamic (see description above) -| Verification | `c` column in `arcstats.py` or `/proc/spl/kstat/zfs/arcstats` entry `c_max` -| Versions Affected | all - -### zfs_arc_meta_adjust_restarts -The number of restart passes to make while scanning the ARC attempting -the free buffers in order to stay below the [zfs_arc_meta_limit](#zfs_arc_meta_limit). 
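-
-The most frequently adjusted parameter in the group above is `zfs_arc_max`,
-and the caveats about shrinking a live ARC apply. A minimal sketch that uses
-the verification hint from its table (the 8 GiB cap is illustrative only):
-
-```sh
-# Example only: cap the ARC at 8 GiB
-echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
-
-# Confirm the new target maximum was accepted (c_max, in bytes)
-awk '$1 == "c_max" {print $3}' /proc/spl/kstat/zfs/arcstats
-```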
- -| zfs_arc_meta_adjust_restarts | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Testing ARC metadata adjustment feature -| Data Type | int -| Units | restarts -| Range | 0 to INT_MAX -| Default | 4,096 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_meta_limit -Sets the maximum allowed size metadata buffers in the ARC. -When [zfs_arc_meta_limit](#zfs_arc_meta_limit) is reached metadata buffers are reclaimed, even if -the overall `c_max` has not been reached. - -In version v0.7.0, with a default value = 0, `zfs_arc_meta_limit_percent` is -used to set `arc_meta_limit` - -| zfs_arc_meta_limit | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | For workloads where the metadata to data ratio in the ARC can be changed to improve ARC hit rates -| Data Type | uint64 -| Units | bytes -| Range | 0 to `c_max` -| Default | 0 -| Change | Dynamic, except that it cannot be set back to 0 for a specific percent of the ARC; it must be set to an explicit value -| Verification | `/proc/spl/kstat/zfs/arcstats` entry `arc_meta_limit` -| Versions Affected | all - -### zfs_arc_meta_limit_percent -Sets the limit to ARC metadata, `arc_meta_limit`, as a percentage of -the maximum size target of the ARC, `c_max` - -Prior to version v0.7.0, the [zfs_arc_meta_limit](#zfs_arc_meta_limit) was used to set the limit as a -fixed size. `zfs_arc_meta_limit_percent` provides a more convenient interface -for setting the limit. - -| zfs_arc_meta_limit_percent | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | For workloads where the metadata to data ratio in the ARC can be changed to improve ARC hit rates -| Data Type | uint64 -| Units | percent of `c_max` -| Range | 0 to 100 -| Default | 75 -| Change | Dynamic -| Verification | `/proc/spl/kstat/zfs/arcstats` entry `arc_meta_limit` -| Versions Affected | v0.7.0 and later - -### zfs_arc_meta_min -The minimum allowed size in bytes that metadata buffers may consume in -the ARC. This value defaults to 0 which disables a floor on the amount -of the ARC devoted meta data. - -When evicting data from the ARC, if the `metadata_size` is less than -`arc_meta_min` then data is evicted instead of metadata. - -| zfs_arc_meta_min | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | -| Data Type | uint64 -| Units | bytes -| Range | 16,777,216 to `c_max` -| Default | 0 (use internal default 16 MiB) -| Change | Dynamic -| Verification | `/proc/spl/kstat/zfs/arcstats` entry `arc_meta_min` -| Versions Affected | all - -### zfs_arc_meta_prune -`zfs_arc_meta_prune` sets the number of dentries and znodes to be scanned looking -for entries which can be dropped. -This provides a mechanism to ensure the ARC can -honor the `arc_meta_limit and` reclaim otherwise pinned ARC buffers. -Pruning may be required when the ARC size drops to -`arc_meta_limit` because dentries and znodes can pin buffers in the ARC. -Increasing this value will cause to dentry and znode caches -to be pruned more aggressively and the arc_prune thread becomes more active. -Setting `zfs_arc_meta_prune` to 0 will disable pruning. - -| zfs_arc_meta_prune | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | TBD -| Data Type | uint64 -| Units | entries -| Range | 0 to INT_MAX -| Default | 10,000 -| Change | Dynamic -! Verification | Prune activity is counted by the `/proc/spl/kstat/zfs/arcstats` entry `arc_prune` -| Versions Affected | v0.6.5 and later - -### zfs_arc_meta_strategy -Defines the strategy for ARC metadata eviction (meta reclaim strategy). 
-A value of 0 (META_ONLY) will evict only the ARC metadata. -A value of 1 (BALANCED) indicates that additional data may be evicted -if required in order to evict the requested amount of metadata. - -| zfs_arc_meta_strategy | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Testing ARC metadata eviction -| Data Type | int -| Units | enum -| Range | 0=evict metadata only, 1=also evict data buffers if they can free metadata buffers for eviction -| Default | 1 (BALANCED) -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_min -Minimum ARC size limit. When the ARC is asked to shrink, it will stop shrinking -at `c_min` as tuned by `zfs_arc_min`. - -| zfs_arc_min | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | If the primary focus of the system is ZFS, then increasing can ensure the ARC gets a minimum amount of RAM -| Data Type | uint64 -| Units | bytes -| Range | 33,554,432 to `c_max` -| Default | For kernel: greater of 33,554,432 (32 MiB) and memory size / 32. For user-land: greater of 33,554,432 (32 MiB) and `c_max` / 2. -| Change | Dynamic -| Verification | `/proc/spl/kstat/zfs/arcstats` entry `c_min` -| Versions Affected | all - -### zfs_arc_min_prefetch_ms -Minimum time prefetched blocks are locked in the ARC. - -A value of 0 represents the default of 1 second. However, once changed, -dynamically setting to 0 will not return to the default. - -| zfs_arc_min_prefetch_ms | Notes -|---|--- -| Tags | [ARC](#arc), [prefetch](#prefetch) -| When to change | TBD -| Data Type | int -| Units | milliseconds -| Range | 1 to INT_MAX -| Default | 0 (use internal default of 1000 ms) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_arc_min_prescient_prefetch_ms -Minimum time "prescient prefetched" blocks are locked in the ARC. -These blocks are meant to be prefetched fairly aggresively ahead of -the code that may use them. - -A value of 0 represents the default of 6 seconds. However, once changed, -dynamically setting to 0 will not return to the default. - -| zfs_arc_min_prescient_prefetch_ms | Notes -|---|--- -| Tags | [ARC](#arc), [prefetch](#prefetch) -| When to change | TBD -| Data Type | int -| Units | milliseconds -| Range | 1 to INT_MAX -| Default | 0 (use internal default of 6000 ms) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_multilist_num_sublists -To allow more fine-grained locking, each ARC state contains a series -of lists (sublists) for both data and metadata objects. -Locking is performed at the sublist level. -This parameters controls the number of sublists per ARC state, and also -applies to other uses of the multilist data structure. - -| zfs_multilist_num_sublists | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | TBD -| Data Type | int -| Units | lists -| Range | 1 to INT_MAX -| Default | 0 (internal value is greater of number of online CPUs or 4) -| Change | Prior to zfs module load -| Versions Affected | v0.7.0 and later - -### zfs_arc_overflow_shift -The ARC size is considered to be overflowing if it exceeds the current -ARC target size (`/proc/spl/kstat/zfs/arcstats` entry `c`) by a -threshold determined by `zfs_arc_overflow_shift`. -The threshold is calculated as a fraction of c using the formula: -(ARC target size) `c >> zfs_arc_overflow_shift` - -The default value of 8 causes the ARC to be considered to be overflowing -if it exceeds the target size by 1/256th (0.3%) of the target size. 
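-
-As a concrete illustration of that formula, the overflow headroom can be
-computed from the live ARC target size; the sketch below assumes the usual
-three-column `arcstats` layout (name, type, data):
-
-```sh
-# Overflow headroom = c >> zfs_arc_overflow_shift (c / 256 with the default shift of 8)
-awk '$1 == "c" {print "target:", $3, " headroom:", int($3 / 256)}' \
-    /proc/spl/kstat/zfs/arcstats
-```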
- -When the ARC is overflowing, new buffer allocations are stalled until -the reclaim thread catches up and the overflow condition no longer exists. - -| zfs_arc_overflow_shift | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | TBD -| Data Type | int -| Units | shift -| Range | 1 to INT_MAX -| Default | 8 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_arc_p_min_shift -arc_p_min_shift is used to shift of ARC target size -(`/proc/spl/kstat/zfs/arcstats` entry `c`) for calculating -both minimum and maximum most recently used (MRU) target size -(`/proc/spl/kstat/zfs/arcstats` entry `p`) - -A value of 0 represents the default setting of `arc_p_min_shift` = 4. -However, once changed, dynamically setting `zfs_arc_p_min_shift` to 0 will -not return to the default. - -| zfs_arc_p_min_shift | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | TBD -| Data Type | int -| Units | shift -| Range | 1 to INT_MAX -| Default | 0 (internal default = 4) -| Change | Dynamic -| Verification | Observe changes to `/proc/spl/kstat/zfs/arcstats` entry `p` -| Versions Affected | all - -### zfs_arc_p_dampener_disable -When data is being added to the ghost lists, the MRU target size is adjusted. -The amount of adjustment is based on the ratio of the MRU/MFU sizes. -When enabled, the ratio is capped to 10, avoiding large adjustments. - -| zfs_arc_p_dampener_disable | Notes -|---|--- -| Tags | [ARC](#arc) -| When to change | Testing ARC ghost list behaviour -| Data Type | boolean -| Range | 0=avoid large adjustments, 1=permit large adjustments -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_arc_shrink_shift -`arc_shrink_shift` is used to adjust the ARC target sizes when large reduction -is required. The current ARC target size, `c`, and MRU size `p` can -be reduced by by the current `size >> arc_shrink_shift`. For the default value -of 7, this reduces the target by approximately 0.8%. - -A value of 0 represents the default setting of arc_shrink_shift = 7. -However, once changed, dynamically setting arc_shrink_shift to 0 will -not return to the default. - -| zfs_arc_shrink_shift | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | During memory shortfall, reducing `zfs_arc_shrink_shift` increases the rate of ARC shrinkage -| Data Type | int -| Units | shift -| Range | 1 to INT_MAX -| Default | 0 (`arc_shrink_shift` = 7) -| Change | Dynamic -| Versions Affected | all - -### zfs_arc_pc_percent -`zfs_arc_pc_percent` allows ZFS arc to play more nicely with the kernel's LRU -pagecache. It can guarantee that the arc size won't collapse under scanning -pressure on the pagecache, yet still allows arc to be reclaimed down to -zfs_arc_min if necessary. This value is specified as percent of pagecache -size (as measured by `NR_FILE_PAGES`) where that percent may exceed 100. This -only operates during memory pressure/reclaim. - -| zfs_arc_pc_percent | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | When using file systems under memory shortfall, if the page scanner causes the ARC to shrink too fast, then adjusting `zfs_arc_pc_percent` can reduce the shrink rate -| Data Type | int -| Units | percent -| Range | 0 to 100 -| Default | 0 (disabled) -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_arc_sys_free -`zfs_arc_sys_free` is the target number of bytes the ARC should leave as -free memory on the system. -Defaults to the larger of 1/64 of physical memory or 512K. 
Setting this -option to a non-zero value will override the default. - -A value of 0 represents the default setting of larger of 1/64 of physical -memory or 512 KiB. However, once changed, dynamically setting -zfs_arc_sys_free to 0 will not return to the default. - -| zfs_arc_sys_free | Notes -|---|--- -| Tags | [ARC](#arc), [memory](#memory) -| When to change | Change if more free memory is desired as a margin against memory demand by applications -| Data Type | ulong -| Units | bytes -| Range | 0 to ULONG_MAX -| Default | 0 (default to larger of 1/64 of physical memory or 512 KiB) -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_autoimport_disable -Disable reading zpool.cache file (see [spa_config_path](#spa_config_path)) when loading the zfs -module. - -| zfs_autoimport_disable | Notes -|---|--- -| Tags | [import](#import) -| When to change | Leave as default so that zfs behaves as other Linux kernel modules -| Data Type | boolean -| Range | 0=read `zpool.cache` at module load, 1=do not read `zpool.cache` at module load -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_commit_timeout_pct -`zfs_commit_timeout_pct` controls the amount of time that a log (ZIL) write -block (lwb) remains "open" when it isn't "full" and it has a thread waiting -to commit to stable storage. -The timeout is scaled based on a percentage of the last lwb -latency to avoid significantly impacting the latency of each individual -intent log transaction (itx). - -| zfs_commit_timeout_pct | Notes -|---|--- -| Tags | [ZIL](#zil) -| When to change | TBD -| Data Type | int -| Units | percent -| Range | 1 to 100 -| Default | 5 -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_dbgmsg_enable -Internally ZFS keeps a small log to facilitate debugging. -The contents of the log are in the `/proc/spl/kstat/zfs/dbgmsg` file. -Writing 0 to `/proc/spl/kstat/zfs/dbgmsg` file clears the log. - -See also [zfs_dbgmsg_maxsize](#zfs_dbgmsg_maxsize) - -| zfs_dbgmsg_enable | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | To view ZFS internal debug log -| Data Type | boolean -| Range | 0=do not log debug messages, 1=log debug messages -| Default | 0 (1 for debug builds) -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_dbgmsg_maxsize -The `/proc/spl/kstat/zfs/dbgmsg` file size limit is set by -zfs_dbgmsg_maxsize. - -See also zfs_dbgmsg_enable - -| zfs_dbgmsg_maxsize | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | TBD -| Data Type | int -| Units | bytes -| Range | 0 to INT_MAX -| Default | 4 MiB -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_dbuf_state_index -The `zfs_dbuf_state_index` feature is currently unused. It is normally used -for controlling values in the `/proc/spl/kstat/zfs/dbufs` file. - -| zfs_dbuf_state_index | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | Do not change -| Data Type | int -| Units | TBD -| Range | TBD -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_deadman_enabled -When a pool sync operation takes longer than zfs_deadman_synctime_ms -milliseconds, a "slow spa_sync" message is logged to the debug log -(see [zfs_dbgmsg_enable](#zfs_dbgmsg_enable)). 
If `zfs_deadman_enabled` is -set to 1, then all pending IO operations are also checked and if any haven't -completed within zfs_deadman_synctime_ms milliseconds, a "SLOW IO" message -is logged to the debug log and a "deadman" system event (see zpool events -command) with the details of the hung IO is posted. - -| zfs_deadman_enabled | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | To disable logging of slow I/O -| Data Type | boolean -| Range | 0=do not log slow I/O, 1=log slow I/O -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_deadman_checktime_ms -Once a pool sync operation has taken longer than -[zfs_deadman_synctime_ms](#zfs_deadman_synctime_ms) milliseconds, continue to check for slow -operations every [zfs_deadman_checktime_ms](#zfs_deadman_synctime_ms) milliseconds. - -| zfs_deadman_checktime_ms | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | When debugging slow I/O -| Data Type | ulong -| Units | milliseconds -| Range | 1 to ULONG_MAX -| Default | 60,000 (1 minute) -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_deadman_ziotime_ms -When an individual I/O takes longer than `zfs_deadman_ziotime_ms` milliseconds, -then the operation is considered to be "hung". If [zfs_deadman_enabled](#zfs_deadman_enabled) -is set then the deadman behaviour is invoked as described by the -[zfs_deadman_failmode](#zfs_deadman_failmode) option. - -| zfs_deadman_ziotime_ms | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | Testing ABD features -| Data Type | ulong -| Units | milliseconds -| Range | 1 to ULONG_MAX -| Default | 300,000 (5 minutes) -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_deadman_synctime_ms -The I/O deadman timer expiration time has two meanings -1. determines when the `spa_deadman()` logic should fire, indicating the txg sync -has not completed in a timely manner -2. determines if an I/O is considered "hung" - -In version v0.8.0, any I/O that has not completed in `zfs_deadman_synctime_ms` -is considered "hung" resulting in one of three behaviors controlled by the -[zfs_deadman_failmode](#zfs_deadman_failmode) parameter. - -`zfs_deadman_synctime_ms` takes effect if [zfs_deadman_enabled](#zfs_deadman_enabled) = 1. - -| zfs_deadman_synctime_ms | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | When debugging slow I/O -| Data Type | ulong -| Units | milliseconds -| Range | 1 to ULONG_MAX -| Default | 600,000 (10 minutes) -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_deadman_failmode -zfs_deadman_failmode controls the behavior of the I/O deadman timer when it -detects a "hung" I/O. Valid values are: - * wait - Wait for the "hung" I/O (default) - * continue - Attempt to recover from a "hung" I/O - * panic - Panic the system - -| zfs_deadman_failmode | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | In some cluster cases, panic can be appropriate -| Data Type | string -| Range | _wait_, _continue_, or _panic_ -| Default | wait -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_dedup_prefetch -ZFS can prefetch deduplication table (DDT) entries. `zfs_dedup_prefetch` allows -DDT prefetches to be enabled. 
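
As a minimal sketch of how this, like most dynamic parameters on this page, can be inspected and changed at runtime (assuming the zfs module is loaded and exposes its tunables under `/sys/module/zfs/parameters`):

```
# Show the current setting (0=do not prefetch, 1=prefetch DDT entries)
cat /sys/module/zfs/parameters/zfs_dedup_prefetch

# Enable DDT prefetch on a running system (requires root)
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_dedup_prefetch
```

The change takes effect immediately and lasts only until the module is reloaded or the system is rebooted.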
- -| zfs_dedup_prefetch | Notes -|---|--- -| Tags | [prefetch](#prefetch), [memory](#memory) -| When to change | For systems with limited RAM using the dedup feature, disabling deduplication table prefetch can reduce memory pressure -| Data Type | boolean -| Range | 0=do not prefetch, 1=prefetch dedup table entries -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_delete_blocks -`zfs_delete_blocks` defines a large file for the purposes of delete. -Files containing more than `zfs_delete_blocks` will be deleted asynchronously -while smaller files are deleted synchronously. -Decreasing this value reduces the time spent in an `unlink(2)` system call at -the expense of a longer delay before the freed space is available. - -The `zfs_delete_blocks` value is specified in blocks, not bytes. The size of -blocks can vary and is ultimately limited by the filesystem's recordsize -property. - -| zfs_delete_blocks | Notes -|---|--- -| Tags | [filesystem](#filesystem), [delete](#delete) -| When to change | If applications delete large files and blocking on `unlink(2)` is not desired -| Data Type | ulong -| Units | blocks -| Range | 1 to ULONG_MAX -| Default | 20,480 -| Change | Dynamic -| Versions Affected | all - -### zfs_delay_min_dirty_percent -The ZFS write throttle begins to delay each transaction when the amount of -dirty data reaches the threshold `zfs_delay_min_dirty_percent` of -[zfs_dirty_data_max](#zfs_dirty_data_max). -This value should be >= [zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent). - -| zfs_delay_min_dirty_percent | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | int -| Units | percent -| Range | 0 to 100 -| Default | 60 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_delay_scale -`zfs_delay_scale` controls how quickly the ZFS write throttle transaction -delay approaches infinity. -Larger values cause longer delays for a given amount of dirty data. - -For the smoothest delay, this value should be about 1 billion divided -by the maximum number of write operations per second the pool can sustain. -The throttle will smoothly handle between 10x and 1/10th `zfs_delay_scale`. - -Note: `zfs_delay_scale` * [zfs_dirty_data_max](#zfs_dirty_data_max) must be < 2^64. - -| zfs_delay_scale | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | ulong -| Units | scalar (nanoseconds) -| Range | 0 to ULONG_MAX -| Default | 500,000 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_dirty_data_max -`zfs_dirty_data_max` is the ZFS write throttle dirty space limit. -Once this limit is exceeded, new writes are delayed until space is freed by -writes being committed to the pool. - -zfs_dirty_data_max takes precedence over [zfs_dirty_data_max_percent](#zfs_dirty_data_max_percent). - -| zfs_dirty_data_max | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | ulong -| Units | bytes -| Range | 1 to [zfs_dirty_data_max_max](#zfs_dirty_data_max_max) -| Default | 10% of physical RAM -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_dirty_data_max_percent -`zfs_dirty_data_max_percent` is an alternative method of specifying -[zfs_dirty_data_max](#zfs_dirty_data_max), the ZFS write throttle dirty space limit. 
-Once this limit is exceeded, new writes are delayed until space is freed by -writes being committed to the pool. - -[zfs_dirty_data_max](#zfs_dirty_data_max) takes precedence over `zfs_dirty_data_max_percent`. - -| zfs_dirty_data_max_percent | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | int -| Units | percent -| Range | 1 to 100 -| Default | 10% of physical RAM -| Change | Prior to zfs module load or a memory hot plug event -| Versions Affected | v0.6.4 and later - -### zfs_dirty_data_max_max -`zfs_dirty_data_max_max` is the maximum allowable value of -[zfs_dirty_data_max](#zfs_dirty_data_max). - -`zfs_dirty_data_max_max` takes precedence over [zfs_dirty_data_max_max_percent](#zfs_dirty_data_max_max_percent). - -| zfs_dirty_data_max_max | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | ulong -| Units | bytes -| Range | 1 to physical RAM size -| Default | 25% of physical RAM -| Change | Prior to zfs module load -| Versions Affected | v0.6.4 and later - -### zfs_dirty_data_max_max_percent -`zfs_dirty_data_max_max_percent` an alternative to [zfs_dirty_data_max_max](#zfs_dirty_data_max_max) -for setting the maximum allowable value of [zfs_dirty_data_max](#zfs_dirty_data_max) - -[zfs_dirty_data_max_max](#zfs_dirty_data_max_max) takes precedence over `zfs_dirty_data_max_max_percent` - -| zfs_dirty_data_max_max_percent | Notes -|---|--- -| Tags | [write_throttle](#write_throttle) -| When to change | See section "ZFS TRANSACTION DELAY" -| Data Type | int -| Units | percent -| Range | 1 to 100 -| Default | 25% of physical RAM -| Change | Prior to zfs module load -| Versions Affected | v0.6.4 and later - -### zfs_dirty_data_sync -When there is at least `zfs_dirty_data_sync` dirty data, a transaction group -sync is started. This allows a transaction group sync to occur more frequently -than the transaction group timeout interval (see [zfs_txg_timeout](#zfs_txg_timeout)) -when there is dirty data to be written. - -| zfs_dirty_data_sync | Notes -|---|--- -| Tags | [write_throttle](#write_throttle), [ZIO_scheduler](#ZIO_scheduler) -| When to change | TBD -| Data Type | ulong -| Units | bytes -| Range | 1 to ULONG_MAX -| Default | 67,108,864 (64 MiB) -| Change | Dynamic -| Versions Affected | v0.6.4 through v0.8.x, deprecation planned for v2 - -### zfs_dirty_data_sync_percent -When there is at least `zfs_dirty_data_sync_percent` of [zfs_dirty_data_max](#zfs_dirty_data_max) -dirty data, a transaction group sync is started. -This allows a transaction group sync to occur more frequently -than the transaction group timeout interval (see [zfs_txg_timeout](#zfs_txg_timeout)) -when there is dirty data to be written. - -| zfs_dirty_data_sync_percent | Notes -|---|--- -| Tags | [write_throttle](#write_throttle), [ZIO_scheduler](#ZIO_scheduler) -| When to change | TBD -| Data Type | int -| Units | percent -| Range | 1 to [zfs_vdev_async_write_active_min_dirty_percent](#zfs_vdev_async_write_active_min_dirty_percent) -| Default | 20 -| Change | Dynamic -| Versions Affected | planned for v2, deprecates [zfs_dirty_data_sync](#zfs_dirty_data_sync) - -### zfs_fletcher_4_impl -Fletcher-4 is the default checksum algorithm for metadata and data. -When the zfs kernel module is loaded, a set of microbenchmarks are run to -determine the fastest algorithm for the current hardware. 
The -`zfs_fletcher_4_impl` parameter allows a specific implementation to be -specified other than the default (fastest). -Selectors other than _fastest_ and _scalar_ require instruction -set extensions to be available and will only appear if ZFS detects their -presence. The _scalar_ implementation works on all processors. - -The results of the microbenchmark are visible in the -`/proc/spl/kstat/zfs/fletcher_4_bench` file. -Larger numbers indicate better performance. -Since ZFS is processor endian-independent, the microbenchmark is run -against both big and little-endian transformation. - -| zfs_fletcher_4_impl | Notes -|---|--- -| Tags | [CPU](#cpu), [checksum](#checksum) -| When to change | Testing Fletcher-4 algorithms -| Data Type | string -| Range | _fastest_, _scalar_, _superscalar_, _superscalar4_, _sse2_, _ssse3_, _avx2_, _avx512f_, or _aarch64_neon_ depending on hardware support -| Default | fastest -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_free_bpobj_enabled -The processing of the free_bpobj object can be enabled by -`zfs_free_bpobj_enabled` - -| zfs_free_bpobj_enabled | Notes -|---|--- -| Tags | [delete](#delete) -| When to change | If there's a problem with processing free_bpobj (e.g. i/o error or bug) -| Data Type | boolean -| Range | 0=do not process free_bpobj objects, 1=process free_bpobj objects -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_free_max_blocks -`zfs_free_max_blocks` sets the maximum number of blocks to be freed in a single -transaction group (txg). For workloads that delete (free) large numbers of -blocks in a short period of time, the processing of the frees can negatively -impact other operations, including txg commits. `zfs_free_max_blocks` acts as a -limit to reduce the impact. - -| zfs_free_max_blocks | Notes -|---|--- -| Tags | [filesystem](#filesystem), [delete](#delete) -| When to change | For workloads that delete large files, `zfs_free_max_blocks` can be adjusted to meet performance requirements while reducing the impacts of deletion -| Data Type | ulong -| Units | blocks -| Range | 1 to ULONG_MAX -| Default | 100,000 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_vdev_async_read_max_active -Maximum asynchronous read I/Os active to each device. - -| zfs_vdev_async_read_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 3 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_async_read_min_active -Minimum asynchronous read I/Os active to each device. - -| zfs_vdev_async_read_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to ([zfs_vdev_async_read_max_active](#zfs_vdev_async_read_max_active) - 1) -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_async_write_active_max_dirty_percent -When the amount of dirty data exceeds the threshold -`zfs_vdev_async_write_active_max_dirty_percent` of [zfs_dirty_data_max](#zfs_dirty_data_max) -dirty data, then [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) is used to -limit active async writes. 
-If the dirty data is between -[zfs_vdev_async_write_active_min_dirty_percent](#zfs_vdev_async_write_active_min_dirty_percent) -and `zfs_vdev_async_write_active_max_dirty_percent`, the active I/O limit is -linearly interpolated between [zfs_vdev_async_write_min_active](#zfs_vdev_async_write_min_active) -and [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) - -| zfs_vdev_async_write_active_max_dirty_percent | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | int -| Units | percent of [zfs_dirty_data_max](#zfs_dirty_data_max) -| Range | 0 to 100 -| Default | 60 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_async_write_active_min_dirty_percent -If the amount of dirty data is between -`zfs_vdev_async_write_active_min_dirty_percent` -and [zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent) -of [zfs_dirty_data_max](#zfs_dirty_data_max), -the active I/O limit is linearly interpolated between -[zfs_vdev_async_write_min_active](#zfs_vdev_async_write_min_active) and -[zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) - -| zfs_vdev_async_write_active_min_dirty_percent | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | int -| Units | percent of zfs_dirty_data_max -| Range | 0 to ([zfs_vdev_async_write_active_max_dirty_percent](#zfs_vdev_async_write_active_max_dirty_percent) - 1) -| Default | 30 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_async_write_max_active -`zfs_vdev_async_write_max_active` sets the maximum asynchronous -write I/Os active to each device. - -| zfs_vdev_async_write_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_async_write_min_active -`zfs_vdev_async_write_min_active` sets the minimum asynchronous write I/Os active to each device. - -Lower values are associated with better latency on rotational media but poorer -resilver performance. The default value of 2 was chosen as a compromise. A -value of 3 has been shown to improve resilver performance further at a cost of -further increasing latency. - -| zfs_vdev_async_write_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) -| Default | 1 for v0.6.x, 2 for v0.7.0 and later -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_max_active -The maximum number of I/Os active to each device. Ideally, -`zfs_vdev_max_active` >= the sum of each queue's max_active. - -Once queued to the device, the ZFS I/O scheduler is no longer able to -prioritize I/O operations. The underlying device drivers have their -own scheduler and queue depth limits. 
Values larger than the device's maximum -queue depth can have the affect of increased latency as the I/Os are queued in -the intervening device driver layers. - -| zfs_vdev_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | sum of each queue's min_active to UINT32_MAX -| Default | 1,000 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_scrub_max_active -`zfs_vdev_scrub_max_active` sets the maximum scrub or scan -read I/Os active to each device. - -| zfs_vdev_scrub_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler), [scrub](#scrub), [resilver](#resilver) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 2 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_scrub_min_active -`zfs_vdev_scrub_min_active` sets the minimum scrub or scan read I/Os active -to each device. - -| zfs_vdev_scrub_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler), [scrub](#scrub), [resilver](#resilver) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_sync_read_max_active -Maximum synchronous read I/Os active to each device. - -| zfs_vdev_sync_read_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_sync_read_min_active -`zfs_vdev_sync_read_min_active` sets the minimum synchronous read I/Os -active to each device. - -| zfs_vdev_sync_read_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_sync_read_max_active](#zfs_vdev_sync_read_max_active) -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_sync_write_max_active -`zfs_vdev_sync_write_max_active` sets the maximum synchronous write I/Os active -to each device. - -| zfs_vdev_sync_write_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_sync_write_min_active -`zfs_vdev_sync_write_min_active` sets the minimum synchronous write I/Os -active to each device. 
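
To see how this limit fits alongside the other per-class queue limits described above, all of the active I/O tunables can be listed together. This is only a sketch and assumes the sysfs parameter directory of a loaded zfs module:

```
# Print every ZIO scheduler active I/O limit (per-class min/max and the
# overall zfs_vdev_max_active) together with its current value
grep . /sys/module/zfs/parameters/zfs_vdev_*_active
```

As noted under [zfs_vdev_max_active](#zfs_vdev_max_active), the per-class maximums should ideally sum to no more than `zfs_vdev_max_active`.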
- -| zfs_vdev_sync_write_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_sync_write_max_active](#zfs_vdev_sync_write_max_active) -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_vdev_queue_depth_pct -Maximum number of queued allocations per top-level vdev expressed as -a percentage of [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active). -This allows the system to detect devices that are more capable of handling allocations -and to allocate more blocks to those devices. It also allows for dynamic -allocation distribution when devices are imbalanced as fuller devices -will tend to be slower than empty devices. Once the queue depth -reaches (`zfs_vdev_queue_depth_pct` * [zfs_vdev_async_write_max_active](#zfs_vdev_async_write_max_active) / 100) -then allocator will stop allocating blocks on that top-level device and -switch to the next. - -See also [zio_dva_throttle_enabled](#zio_dva_throttle_enabled) - -| zfs_vdev_queue_depth_pct | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to UINT32_MAX -| Default | 1,000 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_disable_dup_eviction -Disable duplicate buffer eviction from ARC. - -| zfs_disable_dup_eviction | Notes -|---|--- -| Tags | [ARC](#arc), [dedup](#dedup) -| When to change | TBD -| Data Type | boolean -| Range | 0=duplicate buffers can be evicted, 1=do not evict duplicate buffers -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.5, deprecated in v0.7.0 - -### zfs_expire_snapshot -Snapshots of filesystems are normally automounted under the filesystem's -`.zfs/snapshot` subdirectory. When not in use, snapshots are unmounted -after zfs_expire_snapshot seconds. - -| zfs_expire_snapshot | Notes -|---|--- -| Tags | [filesystem](#filesystem), [snapshot](#snapshot) -| When to change | TBD -| Data Type | int -| Units | seconds -| Range | 0 disables automatic unmounting, maximum time is INT_MAX -| Default | 300 -| Change | Dynamic -| Versions Affected | v0.6.1 and later - -### zfs_admin_snapshot -Allow the creation, removal, or renaming of entries in the `.zfs/snapshot` -subdirectory to cause the creation, destruction, or renaming of snapshots. -When enabled this functionality works both locally and over NFS exports -which have the "no_root_squash" option set. 
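
As an illustration of the behaviour described above, with `zfs_admin_snapshot` enabled (the dataset `tank/home` and its mountpoint are hypothetical):

```
# Create a snapshot named "before-upgrade" by creating a directory
mkdir /tank/home/.zfs/snapshot/before-upgrade

# Rename the snapshot
mv /tank/home/.zfs/snapshot/before-upgrade /tank/home/.zfs/snapshot/pre-upgrade

# Destroy the snapshot by removing the directory
rmdir /tank/home/.zfs/snapshot/pre-upgrade
```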
- -| zfs_admin_snapshot | Notes -|---|--- -| Tags | [filesystem](#filesystem), [snapshot](#snapshot) -| When to change | TBD -| Data Type | boolean -| Range | 0=do not allow snapshot manipulation via the filesystem, 1=allow snapshot manipulation via the filesystem -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_flags -Set additional debugging flags (see [zfs_dbgmsg_enable](#zfs_dbgmsg_enable)) - -| flag value | symbolic name | description -|---|---|--- -| 0x1 | ZFS_DEBUG_DPRINTF | Enable dprintf entries in the debug log -| 0x2 | ZFS_DEBUG_DBUF_VERIFY | Enable extra dnode verifications -| 0x4 | ZFS_DEBUG_DNODE_VERIFY | Enable extra dnode verifications -| 0x8 | ZFS_DEBUG_SNAPNAMES | Enable snapshot name verification -| 0x10 | ZFS_DEBUG_MODIFY | Check for illegally modified ARC buffers -| 0x20 | ZFS_DEBUG_SPA | Enable spa_dbgmsg entries in the debug log -| 0x40 | ZFS_DEBUG_ZIO_FREE | Enable verification of block frees -| 0x80 | ZFS_DEBUG_HISTOGRAM_VERIFY | Enable extra spacemap histogram verifications -| 0x100 | ZFS_DEBUG_METASLAB_VERIFY | Verify space accounting on disk matches in-core range_trees -| 0x200 | ZFS_DEBUG_SET_ERROR | Enable SET_ERROR and dprintf entries in the debug log - -| zfs_flags | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | When debugging ZFS -| Data Type | int -| Default | 0 no debug flags set, for debug builds: all except ZFS_DEBUG_DPRINTF and ZFS_DEBUG_SPA -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_free_leak_on_eio -If destroy encounters an I/O error (EIO) while reading metadata (eg indirect -blocks), space referenced by the missing metadata cannot be freed. -Normally, this causes the background destroy to become "stalled", as the -destroy is unable to make forward progress. While in this stalled state, -all remaining space to free from the error-encountering filesystem is -temporarily leaked. Set `zfs_free_leak_on_eio = 1` to ignore the EIO, -permanently leak the space from indirect blocks that can not be read, -and continue to free everything else that it can. - -The default, stalling behavior is useful if the storage partially -fails (eg some but not all I/Os fail), and then later recovers. In -this case, we will be able to continue pool operations while it is -partially failed, and when it recovers, we can continue to free the -space, with no leaks. However, note that this case is rare. - -Typically pools either: -1. fail completely (but perhaps temporarily (eg a top-level vdev going offline) - -2. have localized, permanent errors (eg disk returns the wrong data due to bit -flip or firmware bug) - -In case (1), the `zfs_free_leak_on_eio` setting does not matter because the -pool will be suspended and the sync thread will not be able to make -forward progress. In case (2), because the error is -permanent, the best effort do is leak the minimum amount of space. -Therefore, it is reasonable for `zfs_free_leak_on_eio` be set, but by default -the more conservative approach is taken, so that there is no -possibility of leaking space in the "partial temporary" failure case. 
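
A hedged sketch of how a stalled background destroy is typically examined before resorting to this tunable; the pool name `tank` is hypothetical:

```
# Space still to be reclaimed by a background (async) destroy; if this value
# stops shrinking across txg commits, the destroy may be stalled
zpool get freeing tank

# Check the pool for reported errors
zpool status -v tank

# Last resort: ignore the EIO and permanently leak the unreadable space
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_free_leak_on_eio
```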
- -| zfs_free_leak_on_eio | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | When debugging I/O errors during destroy -| Data Type | boolean -| Range | 0=normal behavior, 1=ignore error and permanently leak space -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.5 and later - -### zfs_free_min_time_ms -During a `zfs destroy` operation using `feature@async_destroy` a -minimum of `zfs_free_min_time_ms` time will be spent working on freeing blocks -per txg commit. - -| zfs_free_min_time_ms | Notes -|---|--- -| Tags | [delete](#delete) -| When to change | TBD -| Data Type | int -| Units | milliseconds -| Range | 1 to (zfs_txg_timeout * 1000) -| Default | 1,000 -| Change | Dynamic -| Versions Affected | v0.6.0 and later - -### zfs_immediate_write_sz -If a pool does not have a log device, data blocks equal to or larger than -`zfs_immediate_write_sz` are treated as if the dataset being written to had -the property setting `logbias=throughput` - -Terminology note: `logbias=throughput` writes the blocks in "indirect mode" -to the ZIL where the data is written to the pool and a pointer to the data -is written to the ZIL. - -| zfs_immediate_write_sz | Notes -|---|--- -| Tags | [ZIL](#zil) -| When to change | TBD -| Data Type | long -| Units | bytes -| Range | 512 to 16,777,216 (valid block sizes) -| Default | 32,768 (32 KiB) -| Change | Dynamic -| Verification | Data blocks that exceed `zfs_immediate_write_sz` or are written as `logbias=throughput` increment the `zil_itx_indirect_count` entry in `/proc/spl/kstat/zfs/zil` -| Versions Affected | all - -### zfs_max_recordsize -ZFS supports logical record (block) sizes from 512 bytes to 16 MiB. -The benefits of larger blocks, and thus larger average I/O sizes, can be -weighed against the cost of copy-on-write of large block to modify one byte. -Additionally, very large blocks can have a negative impact on both I/O latency -at the device level and the memory allocator. The `zfs_max_recordsize` -parameter limits the upper bound of the dataset volblocksize and recordsize -properties. - -Larger blocks can be created by enabling `zpool` `large_blocks` feature and -changing this `zfs_max_recordsize`. Pools with larger blocks can always be -imported and used, regardless of the value of `zfs_max_recordsize`. - -For 32-bit systems, `zfs_max_recordsize` also limits the size of kernel virtual -memory caches used in the ZFS I/O pipeline (`zio_buf_*` and `zio_data_buf_*`). - -See also the `zpool` `large_blocks` feature. - -| zfs_max_recordsize | Notes -|---|--- -| Tags | [filesystem](#filesystem), [memory](#memory), [volume](#volume) -| When to change | To create datasets with larger volblocksize or recordsize -| Data Type | int -| Units | bytes -| Range | 512 to 16,777,216 (valid block sizes) -| Default | 1,048,576 -| Change | Dynamic, set prior to creating volumes or changing filesystem recordsize -| Versions Affected | v0.6.5 and later - -### zfs_mdcomp_disable -`zfs_mdcomp_disable` allows metadata compression to be disabled. - -| zfs_mdcomp_disable | Notes -|---|--- -| Tags | [CPU](#cpu), [metadata](#metadata) -| When to change | When CPU cycles cost less than I/O -| Data Type | boolean -| Range | 0=compress metadata, 1=do not compress metadata -| Default | 0 -| Change | Dynamic -| Versions Affected | from v0.6.0 to v0.8.0 - -### zfs_metaslab_fragmentation_threshold -Allow metaslabs to keep their active state as long as their fragmentation -percentage is less than or equal to this value. 
When writing, an active -metaslab whose fragmentation percentage exceeds -`zfs_metaslab_fragmentation_threshold` is avoided allowing metaslabs with less -fragmentation to be preferred. - -Metaslab fragmentation is used to calculate the overall pool `fragmentation` -property value. However, individual metaslab fragmentation levels are -observable using the `zdb` with the `-mm` option. - -`zfs_metaslab_fragmentation_threshold` works at the metaslab level and each -top-level vdev has approximately [metaslabs_per_vdev](#metaslabs_per_vdev) metaslabs. -See also [zfs_mg_fragmentation_threshold](#zfs_mg_fragmentation_threshold) - -| zfs_metaslab_fragmentation_threshold | Notes -|---|--- -| Tags | [allocation](#allocation), [fragmentation](#fragmentation), [vdev](#vdev) -| When to change | Testing metaslab allocation -| Data Type | int -| Units | percent -| Range | 1 to 100 -| Default | 70 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_mg_fragmentation_threshold -Metaslab groups (top-level vdevs) are considered eligible for allocations -if their fragmentation percentage metric is less than or equal to -`zfs_mg_fragmentation_threshold`. If a metaslab group exceeds this threshold -then it will be skipped unless all metaslab groups within the metaslab class -have also crossed the `zfs_mg_fragmentation_threshold` threshold. - -| zfs_mg_fragmentation_threshold | Notes -|---|--- -| Tags | [allocation](#allocation), [fragmentation](#fragmentation), [vdev](#vdev) -| When to change | Testing metaslab allocation -| Data Type | int -| Units | percent -| Range | 1 to 100 -| Default | 85 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_mg_noalloc_threshold -Metaslab groups (top-level vdevs) with free space percentage greater than -`zfs_mg_noalloc_threshold` are eligible for new allocations. -If a metaslab group's free space is less than or equal to the -threshold, the allocator avoids allocating to that group -unless all groups in the pool have reached the threshold. Once all -metaslab groups have reached the threshold, all metaslab groups are allowed -to accept allocations. The default value of 0 disables the feature and causes -all metaslab groups to be eligible for allocations. - -This parameter allows one to deal with pools having heavily imbalanced -vdevs such as would be the case when a new vdev has been added. -Setting the threshold to a non-zero percentage will stop allocations -from being made to vdevs that aren't filled to the specified percentage -and allow lesser filled vdevs to acquire more allocations than they -otherwise would under the older `zfs_mg_alloc_failures` facility. - -| zfs_mg_noalloc_threshold | Notes -|---|--- -| Tags | [allocation](#allocation), [fragmentation](#fragmentation), [vdev](#vdev) -| When to change | To force rebalancing as top-level vdevs are added or expanded -| Data Type | int -| Units | percent -| Range | 0 to 100 -| Default | 0 (disabled) -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_multihost_history -The pool `multihost` multimodifier protection (MMP) subsystem can record -historical updates in the `/proc/spl/kstat/zfs/POOL_NAME/multihost` file -for debugging purposes. -The number of lines of history is determined by zfs_multihost_history. 
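
For example, history collection can be enabled at runtime and the resulting kstat inspected; the pool name `tank` is hypothetical and replaces `POOL_NAME` in the path above:

```
# Keep the last 100 MMP updates (0, the default, disables collection)
echo 100 | sudo tee /sys/module/zfs/parameters/zfs_multihost_history

# Inspect the recorded multihost writes for the pool "tank"
cat /proc/spl/kstat/zfs/tank/multihost
```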
- -| zfs_multihost_history | Notes -|---|--- -| Tags | [MMP](#mmp), [import](#import) -| When to change | When testing multihost feature -| Data Type | int -| Units | lines -| Range | 0 to INT_MAX -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_multihost_interval -`zfs_multihost_interval` controls the frequency of multihost writes performed -by the pool multihost multimodifier protection (MMP) subsystem. -The multihost write period is (`zfs_multihost_interval` / number of leaf-vdevs) -milliseconds. -Thus on average a multihost write will be issued for each leaf vdev every -`zfs_multihost_interval` milliseconds. In practice, the observed period can -vary with the I/O load and this observed value is the delay which is stored in -the uberblock. - -On import the multihost activity check waits a minimum amount of time -determined by (`zfs_multihost_interval` * [zfs_multihost_import_intervals](#zfs_multihost_import_intervals)) -with a lower bound of 1 second. -The activity check time may be further extended if the value of mmp delay -found in the best uberblock indicates actual multihost updates happened at -longer intervals than `zfs_multihost_interval` - -Note: the multihost protection feature applies to storage devices that can be -shared between multiple systems. - -| zfs_multihost_interval | Notes -|---|--- -| Tags | [MMP](#mmp), [import](#import), [vdev](#vdev) -| When to change | To optimize pool import time against possibility of simultaneous import by another system -| Data Type | ulong -| Units | milliseconds -| Range | 100 to ULONG_MAX -| Default | 1000 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_multihost_import_intervals -`zfs_multihost_import_intervals` controls the duration of the activity test on -pool import for the multihost multimodifier protection (MMP) subsystem. -The activity test can be expected to take a minimum time of -(`zfs_multihost_import_interval`s * [zfs_multihost_interval](#zfs_multihost_interval) * `random(25%)`) -milliseconds. The random period of up to 25% improves simultaneous import -detection. For example, if two hosts are rebooted at the same time and -automatically attempt to import the pool, then is is highly probable that -one host will win. - -Smaller values of `zfs_multihost_import_intervals` reduces the -import time but increases the risk of failing to detect an active pool. -The total activity check time is never allowed to drop below one second. - -Note: the multihost protection feature applies to storage devices that can be -shared between multiple systems. - -| zfs_multihost_import_intervals | Notes -|---|--- -| Tags | [MMP](#mmp), [import](#import) -| When to change | TBD -| Data Type | uint -| Units | intervals -| Range | 1 to UINT_MAX -| Default | 10 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_multihost_fail_intervals -`zfs_multihost_fail_intervals` controls the behavior of the pool when -write failures are detected in the multihost multimodifier protection (MMP) -subsystem. - -If `zfs_multihost_fail_intervals = 0` then multihost write failures are ignored. -The write failures are reported to the ZFS event daemon (`zed`) which -can take action such as suspending the pool or offlining a device. - -If `zfs_multihost_fail_intervals > 0` then sequential multihost write failures -will cause the pool to be suspended. 
This occurs when -(`zfs_multihost_fail_intervals` * [zfs_multihost_interval](#zfs_multihost_interval)) -milliseconds have passed since the last successful multihost write. -This guarantees the activity test will see multihost writes if the pool is -attempted to be imported by another system. - -| zfs_multihost_fail_intervals | Notes -|---|--- -| Tags | [MMP](#mmp), [import](#import) -| When to change | TBD -| Data Type | uint -| Units | intervals -| Range | 0 to UINT_MAX -| Default | 5 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_delays_per_second -The ZFS Event Daemon (zed) processes events from ZFS. However, it can be -overwhelmed by high rates of error reports which can be generated by failing, -high-performance devices. `zfs_delays_per_second` limits the rate of -delay events reported to zed. - -| zfs_delays_per_second | Notes -|---|--- -| Tags | [zed](#zed), [delay](#delay) -| When to change | If processing delay events at a higher rate is desired -| Data Type | uint -| Units | events per second -| Range | 0 to UINT_MAX -| Default | 20 -| Change | Dynamic -| Versions Affected | v0.7.7 and later - -### zfs_checksums_per_second -The ZFS Event Daemon (zed) processes events from ZFS. However, it can be -overwhelmed by high rates of error reports which can be generated by failing, -high-performance devices. `zfs_checksums_per_second` limits the rate of -checksum events reported to zed. - -Note: do not set this value lower than the SERD limit for `checksum` in zed. -By default, `checksum_N` = 10 and `checksum_T` = 10 minutes, resulting in a -practical lower limit of 1. - -| zfs_checksums_per_second | Notes -|---|--- -| Tags | [zed](#zed), [checksum](#checksum) -| When to change | If processing checksum error events at a higher rate is desired -| Data Type | uint -| Units | events per second -| Range | 0 to UINT_MAX -| Default | 20 -| Change | Dynamic -| Versions Affected | v0.7.7 and later - -### zfs_no_scrub_io -When `zfs_no_scrub_io = 1` scrubs do not actually scrub data and -simply doing a metadata crawl of the pool instead. - -| zfs_no_scrub_io | Notes -|---|--- -| Tags | [scrub](#scrub) -| When to change | Testing scrub feature -| Data Type | boolean -| Range | 0=perform scrub I/O, 1=do not perform scrub I/O -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.0 and later - -### zfs_no_scrub_prefetch -When `zfs_no_scrub_prefetch = 1`, prefetch is disabled for scrub I/Os. - -| zfs_no_scrub_prefetch | Notes -|---|--- -| Tags | [prefetch](#prefetch), [scrub](#scrub) -| When to change | Testing scrub feature -| Data Type | boolean -| Range | 0=prefetch scrub I/Os, 1=do not prefetch scrub I/Os -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.4 and later - -### zfs_nocacheflush -ZFS uses barriers (volatile cache flush commands) to ensure data is committed to -permanent media by devices. This ensures consistent on-media state for devices -where caches are volatile (eg HDDs). - -For devices with nonvolatile caches, the cache flush operation can be a no-op. -However, in some RAID arrays, cache flushes can cause the entire cache to be -flushed to the backing devices. - -To ensure on-media consistency, keep cache flush enabled. 
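
If every vdev in the system really does sit behind nonvolatile (power-protected) write cache, the flush can be disabled persistently. This is only a sketch of the usual modprobe configuration, not a recommendation:

```
# /etc/modprobe.d/zfs.conf
# WARNING: only safe when all devices have nonvolatile write caches;
# otherwise a power loss can leave recently written data inconsistent.
options zfs zfs_nocacheflush=1
```

Because the parameter is dynamic, it can also be toggled at runtime through `/sys/module/zfs/parameters/zfs_nocacheflush` for testing.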
- -| zfs_nocacheflush | Notes -|---|--- -| Tags | [disks](#disks) -| When to change | If the storage device has nonvolatile cache, then disabling cache flush can save the cost of occasional cache flush comamnds -| Data Type | boolean -| Range | 0=send cache flush commands, 1=do not send cache flush commands -| Default | 0 -| Change | Dynamic -| Versions Affected | all - -### zfs_nopwrite_enabled -The NOP-write feature is enabled by default when a crytographically-secure -checksum algorithm is in use by the dataset. `zfs_nopwrite_enabled` allows the -NOP-write feature to be completely disabled. - -| zfs_nopwrite_enabled | Notes -|---|--- -| Tags | [checksum](#checksum), [debug](#debug) -| When to change | TBD -| Data Type | boolean -| Range | 0=disable NOP-write feature, 1=enable NOP-write feature -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.6.0 and later - -### zfs_dmu_offset_next_sync -`zfs_dmu_offset_next_sync` enables forcing txg sync to find holes. -This causes ZFS to act like older versions when `SEEK_HOLE` or `SEEK_DATA` flags -are used: when a dirty dnode causes txgs to be synced so the previous data -can be found. - -| zfs_dmu_offset_next_sync | Notes -|---|--- -| Tags | [DMU](#dmu) -| When to change | TBD -| Data Type | boolean -| Range | 0=do not force txg sync to find holes, 1=force txg sync to find holes -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_pd_bytes_max -`zfs_pd_bytes_max` limits the number of bytes prefetched during a pool traversal -(eg `zfs send` or other data crawling operations). These prefetches are -referred to as "prescient prefetches" and are always 100% hit rate. -The traversal operations do not use the default data or metadata prefetcher. - -| zfs_pd_bytes_max | Notes -|---|--- -| Tags | [prefetch](#prefetch), [send](#send) -| When to change | TBD -| Data Type | int32 -| Units | bytes -| Range | 0 to INT32_MAX -| Default | 52,428,800 (50 MiB) -| Change | Dynamic -| Versions Affected | TBD - -### zfs_per_txg_dirty_frees_percent -`zfs_per_txg_dirty_frees_percent` as a percentage of [zfs_dirty_data_max](#zfs_dirty_data_max) -controls the percentage of dirtied blocks from frees in one txg. -After the threshold is crossed, additional dirty blocks from frees -wait until the next txg. -Thus, when deleting large files, filling consecutive txgs with deletes/frees, -does not throttle other, perhaps more important, writes. - -A side effect of this throttle can impact `zfs receive` workloads that contain a -large number of frees and the [ignore_hole_birth](#ignore_hole_birth) optimization is -disabled. The symptom is that the receive workload causes an increase -in the frequency of txg commits. The frequency of txg commits is observable via the -`otime` column of `/proc/spl/kstat/zfs/POOLNAME/txgs`. Since txg commits also flush data -from volatile caches in HDDs to media, HDD performance can be negatively impacted. -Also, since the frees do not consume much bandwidth over the pipe, the pipe can appear to stall. -Thus the overall progress of receives is slower than expected. - -A value of zero will disable this throttle. - -| zfs_per_txg_dirty_frees_percent | Notes -|---|--- -| Tags | [delete](#delete) -| When to change | For `zfs receive` workloads, consider increasing or disabling. 
See section [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | ulong -| Units | percent -| Range | 0 to 100 -| Default | 30 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_prefetch_disable -`zfs_prefetch_disable` controls the predictive prefetcher. - -Note that it leaves "prescient" prefetch (eg prefetch for `zfs send`) intact -(see [zfs_pd_bytes_max](#zfs_pd_bytes_max)) - -| zfs_prefetch_disable | Notes -|---|--- -| Tags | [prefetch](#prefetch) -| When to change | In some case where the workload is completely random reads, overall performance can be better if prefetch is disabled -| Data Type | boolean -| Range | 0=prefetch enabled, 1=prefetch disabled -| Default | 0 -| Change | Dynamic -| Verification | prefetch efficacy is observed by `arcstat`, `arc_summary`, and the relevant entries in `/proc/spl/kstat/zfs/arcstats` -| Versions Affected | all - -### zfs_read_chunk_size -`zfs_read_chunk_size` is the limit for ZFS filesystem reads. If an application -issues a `read()` larger than `zfs_read_chunk_size`, then the `read()` is divided -into multiple operations no larger than `zfs_read_chunk_size` - -| zfs_read_chunk_size | Notes -|---|--- -| Tags | [filesystem](#filesystem) -| When to change | TBD -| Data Type | ulong -| Units | bytes -| Range | 512 to ULONG_MAX -| Default | 1,048,576 -| Change | Dynamic -| Versions Affected | all - -### zfs_read_history -Historical statistics for the last `zfs_read_history` reads are available in -`/proc/spl/kstat/zfs/POOL_NAME/reads` - -| zfs_read_history | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | To observe read operation details -| Data Type | int -| Units | lines -| Range | 0 to INT_MAX -| Default | 0 -| Change | Dynamic -| Versions Affected | all - -### zfs_read_history_hits -When [zfs_read_history](#zfs_read_history)` > 0`, zfs_read_history_hits controls whether ARC hits are -displayed in the read history file, `/proc/spl/kstat/zfs/POOL_NAME/reads` - -| zfs_read_history_hits | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | To observe read operation details with ARC hits -| Data Type | boolean -| Range | 0=do not include data for ARC hits, 1=include ARC hit data -| Default | 0 -| Change | Dynamic -| Versions Affected | all - -### zfs_recover -`zfs_recover` can be set to true (1) to attempt to recover from -otherwise-fatal errors, typically caused by on-disk corruption. -When set, calls to `zfs_panic_recover()` will turn into warning messages -rather than calling `panic()` - -| zfs_recover | Notes -|---|--- -| Tags | [import](#import) -| When to change | zfs_recover should only be used as a last resort, as it typically results in leaked space, or worse -| Data Type | boolean -| Range | 0=normal operation, 1=attempt recovery zpool import -| Default | 0 -| Change | Dynamic -| Verification | check output of `dmesg` and other logs for details -| Versions Affected | v0.6.4 or later - -### zfs_resilver_min_time_ms -Resilvers are processed by the sync thread in syncing context. While -resilvering, ZFS spends at least `zfs_resilver_min_time_ms` time working on a -resilver between txg commits. - -The [zfs_txg_timeout](#zfs_txg_timeout) tunable sets a nominal timeout value -for the txg commits. By default, this timeout is 5 seconds and the `zfs_resilver_min_time_ms` -is 3 seconds. However, many variables contribute to changing the actual txg times. 
-The measured txg interval is observed as the `otime` column (in nanoseconds) in -the `/proc/spl/kstat/zfs/POOL_NAME/txgs` file. - -See also [zfs_txg_timeout](#zfs_txg_timeout) and [zfs_scan_min_time_ms](#zfs_scan_min_time_ms) - -| zfs_resilver_min_time_ms | Notes -|---|--- -| Tags | [resilver](#resilver) -| When to change | In some resilvering cases, increasing `zfs_resilver_min_time_ms` can result in faster completion -| Data Type | int -| Units | milliseconds -| Range | 1 to [zfs_txg_timeout](#zfs_txg_timeout) converted to milliseconds -| Default | 3,000 -| Change | Dynamic -| Versions Affected | all - -### zfs_scan_min_time_ms -Scrubs are processed by the sync thread in syncing context. While -scrubbing, ZFS spends at least `zfs_scan_min_time_ms` time working on a -scrub between txg commits. - -See also [zfs_txg_timeout](#zfs_txg_timeout) and [zfs_resilver_min_time_ms](#zfs_resilver_min_time_ms) - -| zfs_scan_min_time_ms | Notes -|---|--- -| Tags | [scrub](#scrub) -| When to change | In some scrub cases, increasing `zfs_scan_min_time_ms` can result in faster completion -| Data Type | int -| Units | milliseconds -| Range | 1 to [zfs_txg_timeout](#zfs_txg_timeout) converted to milliseconds -| Default | 1,000 -| Change | Dynamic -| Versions Affected | all - -### zfs_scan_checkpoint_intval -To preserve progress across reboots the sequential scan algorithm periodically -needs to stop metadata scanning and issue all the verifications I/Os to disk -every `zfs_scan_checkpoint_intval` seconds. - -| zfs_scan_checkpoint_intval | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | TBD -| Data Type | int -| Units | seconds -| Range | 1 to INT_MAX -| Default | 7,200 (2 hours) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_scan_fill_weight -This tunable affects how scrub and resilver I/O segments are ordered. A higher -number indicates that we care more about how filled in a segment is, while a -lower number indicates we care more about the size of the extent without -considering the gaps within a segment. - -| zfs_scan_fill_weight | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | Testing sequential scrub and resilver -| Data Type | int -| Units | scalar -| Range | 0 to INT_MAX -| Default | 3 -| Change | Prior to zfs module load -| Versions Affected | v0.8.0 and later - -### zfs_scan_issue_strategy -`zfs_scan_issue_strategy` controls the order of data verification while scrubbing or -resilvering. - -| value | description -|---|--- -| 0 | fs will use strategy 1 during normal verification and strategy 2 while taking a checkpoint -| 1 | data is verified as sequentially as possible, given the amount of memory reserved for scrubbing (see [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact)). This can improve scrub performance if the pool's data is heavily fragmented. -| 2 | the largest mostly-contiguous chunk of found data is verified first. By deferring scrubbing of small segments, we may later find adjacent data to coalesce and increase the segment size. - - -| zfs_scan_issue_strategy | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | TBD -| Data Type | enum -| Range | 0 to 2 -| Default | 0 -| Change | Dynamic -| Versions Affected | TBD - -### zfs_scan_legacy -Setting `zfs_scan_legacy = 1` enables the legacy scan and scrub behavior -instead of the newer sequential behavior. 
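
For example, to fall back to the legacy behaviour before starting a scrub on a hypothetical pool `tank` (note, per the table below, that changing the value back to 0 does not affect a scrub or resilver already in progress):

```
# Select the legacy scan algorithm, then start and monitor a scrub
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_scan_legacy
zpool scrub tank
zpool status tank
```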
- -| zfs_scan_legacy | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | In some cases, the new scan mode can consumer more memory as it collects and sorts I/Os; using the legacy algorithm can be more memory efficient at the expense of HDD read efficiency -| Data Type | boolean -| Range | 0=use new method: scrubs and resilvers will gather metadata in memory before issuing sequential I/O, 1=use legacy algorithm will be used where I/O is initiated as soon as it is discovered -| Default | 0 -| Change | Dynamic, however changing to 0 does not affect in-progress scrubs or resilvers -| Versions Affected | v0.8.0 and later - -### zfs_scan_max_ext_gap -`zfs_scan_max_ext_gap` limits the largest gap in bytes between scrub and -resilver I/Os that will still be considered sequential for sorting purposes. - -| zfs_scan_max_ext_gap | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | TBD -| Data Type | ulong -| Units | bytes -| Range | 512 to ULONG_MAX -| Default | 2,097,152 (2 MiB) -| Change | Dynamic, however changing to 0 does not affect in-progress scrubs or resilvers -| Versions Affected | v0.8.0 and later - -### zfs_scan_mem_lim_fact -`zfs_scan_mem_lim_fact` limits the maximum fraction of RAM used for I/O sorting -by sequential scan algorithm. -When the limit is reached scanning metadata is stopped and -data verification I/O is started. -Data verification I/O continues until the memory used by the sorting -algorithm drops below below [zfs_scan_mem_lim_soft_fact](#zfs_scan_mem_lim_soft_fact) - -Memory used by the sequential scan algorithm can be observed as the kmem sio_cache. -This is visible from procfs as `grep sio_cache /proc/slabinfo` and can be monitored -using slab-monitoring tools such as `slabtop` - -| zfs_scan_mem_lim_fact | Notes -|---|--- -| Tags | [memory](#memory), [resilver](#resilver), [scrub](#scrub) -| When to change | TBD -| Data Type | int -| Units | divisor of physical RAM -| Range | TBD -| Default | 20 (physical RAM / 20 or 5%) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_scan_mem_lim_soft_fact -`zfs_scan_mem_lim_soft_fact` sets the fraction of the hard limit, -[zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact), used to determined the RAM soft limit -for I/O sorting by the sequential scan algorithm. -After [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact) has been reached, metadata scanning is stopped -until the RAM usage drops below `zfs_scan_mem_lim_soft_fact` - -| zfs_scan_mem_lim_soft_fact | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | TBD -| Data Type | int -| Units | divisor of (physical RAM / [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact)) -| Range | 1 to INT_MAX -| Default | 20 (for default [zfs_scan_mem_lim_fact](#zfs_scan_mem_lim_fact), 0.25% of physical RAM) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_scan_vdev_limit -`zfs_scan_vdev_limit` is the maximum amount of data that can be concurrently -issued at once for scrubs and resilvers per leaf vdev. -`zfs_scan_vdev_limit` attempts to strike a balance between keeping the leaf -vdev queues full of I/Os while not overflowing the queues causing high latency -resulting in long txg sync times. -While `zfs_scan_vdev_limit` represents a bandwidth limit, the existing I/O -limit of [zfs_vdev_scrub_max_active](#zfs_vdev_scrub_max_active) remains in effect, too. 
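
A sketch of raising the limit during a maintenance window and observing the effect; the pool name `tank` is hypothetical and the 8 MiB value is only an example:

```
# Allow up to 8 MiB of scan I/O in flight per leaf vdev
echo 8388608 | sudo tee /sys/module/zfs/parameters/zfs_scan_vdev_limit

# Watch per-vdev scrub/resilver throughput
zpool iostat -v tank 5

# Observe memory used by the sequential scan's sorting code (sio_cache)
grep sio_cache /proc/slabinfo
```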
- -| zfs_scan_vdev_limit | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub), [vdev](#vdev) -| When to change | TBD -| Data Type | ulong -| Units | bytes -| Range | 512 to ULONG_MAX -| Default | 4,194,304 (4 MiB) -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_send_corrupt_data -`zfs_send_corrupt_data` enables `zfs send` to send of corrupt data by -ignoring read and checksum errors. The corrupted or unreadable blocks are -replaced with the value `0x2f5baddb10c` (ZFS bad block) - -| zfs_send_corrupt_data | Notes -|---|--- -| Tags | [send](#send) -| When to change | When data corruption exists and an attempt to recover at least some data via `zfs send` is needed -| Data Type | boolean -| Range | 0=do not send corrupt data, 1=replace corrupt data with cookie -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.6.0 and later - -### zfs_sync_pass_deferred_free -The SPA sync process is performed in multiple passes. Once the pass number -reaches `zfs_sync_pass_deferred_free`, frees are no long processed and must wait -for the next SPA sync. - -The `zfs_sync_pass_deferred_free` value is expected to be removed as a tunable -once the optimal value is determined during field testing. - -The `zfs_sync_pass_deferred_free` pass must be greater than 1 to ensure that -regular blocks are not deferred. - -| zfs_sync_pass_deferred_free | Notes -|---|--- -| Tags | [SPA](#spa) -| When to change | Testing SPA sync process -| Data Type | int -| Units | SPA sync passes -| Range | 1 to INT_MAX -| Default | 2 -| Change | Dynamic -| Versions Affected | all - -### zfs_sync_pass_dont_compress -The SPA sync process is performed in multiple passes. Once the pass number -reaches `zfs_sync_pass_dont_compress`, data block compression is no longer -processed and must wait for the next SPA sync. - -The `zfs_sync_pass_dont_compress` value is expected to be removed as a tunable -once the optimal value is determined during field testing. - -| zfs_sync_pass_dont_compress | Notes -|---|--- -| Tags | [SPA](#spa) -| When to change | Testing SPA sync process -| Data Type | int -| Units | SPA sync passes -| Range | 1 to INT_MAX -| Default | 5 -| Change | Dynamic -| Versions Affected | all - -### zfs_sync_pass_rewrite -The SPA sync process is performed in multiple passes. Once the pass number -reaches `zfs_sync_pass_rewrite`, blocks can be split into gang blocks. - -The `zfs_sync_pass_rewrite` value is expected to be removed as a tunable -once the optimal value is determined during field testing. - -| zfs_sync_pass_rewrite | Notes -|---|--- -| Tags | [SPA](#spa) -| When to change | Testing SPA sync process -| Data Type | int -| Units | SPA sync passes -| Range | 1 to INT_MAX -| Default | 2 -| Change | Dynamic -| Versions Affected | all - -### zfs_sync_taskq_batch_pct -`zfs_sync_taskq_batch_pct` controls the number of threads used by the -DSL pool sync taskq, `dp_sync_taskq` - -| zfs_sync_taskq_batch_pct | Notes -|---|--- -| Tags | [SPA](#spa) -| When to change | to adjust the number of `dp_sync_taskq` threads -| Data Type | int -| Units | percent of number of online CPUs -| Range | 1 to 100 -| Default | 75 -| Change | Prior to zfs module load -| Versions Affected | v0.7.0 and later - -### zfs_txg_history -Historical statistics for the last `zfs_txg_history` txg commits are available -in `/proc/spl/kstat/zfs/POOL_NAME/txgs` - -The work required to measure the txg commit (SPA statistics) is low. -However, for debugging purposes, it can be useful to observe the SPA -statistics. 
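
For example, to record the last 100 txg commits and compute the average commit interval from the `otime` column (nanoseconds, as described above); the pool name `tank` is hypothetical:

```
# Retain statistics for the last 100 txg commits (dynamic)
echo 100 | sudo tee /sys/module/zfs/parameters/zfs_txg_history

# Average otime over the recorded txgs, reported in milliseconds
awk '!c { for (i = 1; i <= NF; i++) if ($i == "otime") c = i; next }
     { sum += $c; n++ }
     END { if (n) printf "average txg interval: %.1f ms over %d txgs\n", sum / n / 1e6, n }' \
    /proc/spl/kstat/zfs/tank/txgs
```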
- -| zfs_txg_history | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | To observe details of SPA sync behavior. -| Data Type | int -| Units | lines -| Range | 0 to INT_MAX -| Default | 0 for version v0.6.0 to v0.7.6, 100 for version v0.8.0 -| Change | Dynamic -| Versions Affected | all - -### zfs_txg_timeout -The open txg is committed to the pool periodically (SPA sync) and -`zfs_txg_timeout` represents the default target upper limit. - -txg commits can occur more frequently and a rapid rate of txg commits often -indicates a busy write workload, quota limits reached, or the free space is -critically low. - -Many variables contribute to changing the actual txg times. -txg commits can also take longer than `zfs_txg_timeout` if the ZFS write throttle -is not properly tuned or the time to sync is otherwise delayed (eg slow device). -Shorter txg commit intervals can occur due to [zfs_dirty_data_sync](#zfs_dirty_data_sync) -for write-intensive workloads. -The measured txg interval is observed as the `otime` column (in nanoseconds) in -the `/proc/spl/kstat/zfs/POOL_NAME/txgs` file. - -See also [zfs_dirty_data_sync](#zfs_dirty_data_sync) and -[zfs_txg_history](#zfs_txg_history) - -| zfs_txg_timeout | Notes -|---|--- -| Tags | [SPA](#spa), [ZIO_scheduler](#zio_scheduler) -| When to change | To optimize the work done by txg commit relative to the pool requirements. See also section [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | int -| Units | seconds -| Range | 1 to INT_MAX -| Default | 5 -| Change | Dynamic -| Versions Affected | all - -### zfs_vdev_aggregation_limit -To reduce IOPs, small, adjacent I/Os can be aggregated (coalesced) into a -large I/O. -For reads, aggregations occur across small adjacency gaps. -For writes, aggregation can occur at the ZFS or disk level. -`zfs_vdev_aggregation_limit` is the upper bound on the size of the larger, -aggregated I/O. - -Setting `zfs_vdev_aggregation_limit = 0` effectively disables aggregation by ZFS. -However, the block device scheduler can still merge (aggregate) I/Os. Also, many -devices, such as modern HDDs, contain schedulers that can aggregate I/Os. - -In general, I/O aggregation can improve performance for devices, such as HDDs, -where ordering I/O operations for contiguous LBAs is a benefit. For random access -devices, such as SSDs, aggregation might not improve performance relative to the -CPU cycles needed to aggregate. 
For devices that represent themselves as having -no rotation, the [zfs_vdev_aggregation_limit_non_rotating](#zfs_vdev_aggregation_limit_non_rotating) -parameter is used instead of `zfs_vdev_aggregation_limit` - -| zfs_vdev_aggregation_limit | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | If the workload does not benefit from aggregation, the `zfs_vdev_aggregation_limit` can be reduced to avoid aggregation attempts -| Data Type | int -| Units | bytes -| Range | 0 to 1,048,576 (default) or 16,777,216 (if `zpool` `large_blocks` feature is enabled) -| Default | 1,048,576, or 131,072 for > [dbuf_cache_max_shift](#dbuf_cache_max_shift) and the default -`dbuf_cache_max_bytes` - -| dbuf_cache_max_bytes | Notes -|---|--- -| Tags | [dbuf_cache](#dbuf_cache) -| When to change | Testing dbuf cache algorithms -| Data Type | ulong -| Units | bytes -| Range | 16,777,216 to ULONG_MAX -| Default | 104,857,600 (100 MiB) -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### dbuf_cache_max_shift -The [dbuf_cache_max_bytes](#dbuf_cache_max_bytes) minimum is the lesser of -[dbuf_cache_max_bytes](#dbuf_cache_max_bytes) -and the current ARC target size (`c`) >> `dbuf_cache_max_shift` - -| dbuf_cache_max_shift | Notes -|---|--- -| Tags | [dbuf_cache](#dbuf_cache) -| When to change | Testing dbuf cache algorithms -| Data Type | int -| Units | shift -| Range | 1 to 63 -| Default | 5 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### dmu_object_alloc_chunk_shift -Each of the concurrent object allocators grabs -`2^dmu_object_alloc_chunk_shift` dnode slots at a time. -The default is to grab 128 slots, or 4 blocks worth. -This default value was experimentally determined to be the lowest value -that eliminates the measurable effect of lock contention in the DMU object -allocation code path. - -| dmu_object_alloc_chunk_shift | Notes -|---|--- -| Tags | [allocation](#allocation), [DMU](#dmu) -| When to change | If the workload creates many files concurrently on a system with many CPUs, then increasing `dmu_object_alloc_chunk_shift` can decrease lock contention -| Data Type | int -| Units | shift -| Range | 7 to 9 -| Default | 7 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### send_holes_without_birth_time -Alias for [ignore_hole_birth](#ignore_hole_birth) - -### zfs_abd_scatter_enabled -`zfs_abd_scatter_enabled` controls the ARC Buffer Data (ABD) scatter/gather -feature. - -When disabled, the legacy behaviour is selected using linear buffers. -For linear buffers, all the data in the ABD is stored in one contiguous -buffer in memory (from a `zio_[data_]buf_*` kmem cache). - -When enabled (default), the data in the ABD is split into -equal-sized chunks (from the `abd_chunk_cache` kmem_cache), with pointers to -the chunks recorded in an array at the end of the ABD structure. This allows -more efficient memory allocation for buffers, especially when large -recordsizes are used. - -| zfs_abd_scatter_enabled | Notes -|---|--- -| Tags | [ABD](#abd), [memory](#memory) -| When to change | Testing ABD -| Data Type | boolean -| Range | 0=use linear allocation only, 1=allow scatter/gather -| Default | 1 -| Change | Dynamic -| Verification | ABD statistics are observable in `/proc/spl/kstat/zfs/abdstats`. 
Slab allocations are observable in `/proc/slabinfo` -| Versions Affected | v0.7.0 and later - -### zfs_abd_scatter_max_order -`zfs_abd_scatter_max_order` sets the maximum order for physical page allocation -when ABD is enabled (see [zfs_abd_scatter_enabled](#zfs_abd_scatter_enabled)) - -See also Buddy Memory Allocation in the Linux kernel documentation. - -| zfs_abd_scatter_max_order | Notes -|---|--- -| Tags | [ABD](#abd), [memory](#memory) -| When to change | Testing ABD features -| Data Type | int -| Units | orders -| Range | 1 to 10 (upper limit is hardware-dependent) -| Default | 10 -| Change | Dynamic -| Verification | ABD statistics are observable in `/proc/spl/kstat/zfs/abdstats` -| Versions Affected | v0.7.0 and later - -### zfs_compressed_arc_enabled -When compression is enabled for a dataset, later reads of the data can store -the blocks in ARC in their on-disk, compressed state. This can increse the -effective size of the ARC, as counted in blocks, and thus improve the ARC hit -ratio. - -| zfs_compressed_arc_enabled | Notes -|---|--- -| Tags | [ABD](#abd), [compression](#compression) -| When to change | Testing ARC compression feature -| Data Type | boolean -| Range | 0=compressed ARC disabled (legacy behaviour), 1=compress ARC data -| Default | 1 -| Change | Dynamic -| Verification | raw ARC statistics are observable in `/proc/spl/kstat/zfs/arcstats` and ARC hit ratios can be observed using `arcstat` -| Versions Affected | v0.7.0 and later - -### zfs_key_max_salt_uses -For encrypted datasets, the salt is regenerated every `zfs_key_max_salt_uses` -blocks. This automatic regeneration reduces the probability of collisions -due to the Birthday problem. When set to the default (400,000,000) the -probability of collision is approximately 1 in 1 trillion. - -| zfs_key_max_salt_uses | Notes -|---|--- -| Tags | [encryption](#encryption) -| When to change | Testing encryption features -| Data Type | ulong -| Units | blocks encrypted -| Range | 1 to ULONG_MAX -| Default | 400,000,000 -| Change | Dynamic -| Versions Affected | v0.8.0 and later - -### zfs_object_mutex_size -`zfs_object_mutex_size` facilitates resizing the the -per-dataset znode mutex array for testing deadlocks therein. - -| zfs_object_mutex_size | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | Testing znode mutex array deadlocks -| Data Type | uint -| Units | orders -| Range | 1 to UINT_MAX -| Default | 64 -| Change | Dynamic -| Versions Affected | v0.7.0 and later - -### zfs_scan_strict_mem_lim -When scrubbing or resilvering, by default, ZFS checks to ensure it is not -over the hard memory limit before each txg commit. -If finer-grained control of this is needed `zfs_scan_strict_mem_lim` can be -set to 1 to enable checking before scanning each block. - -| zfs_scan_strict_mem_lim | Notes -|---|--- -| Tags | [memory](#memory), [resilver](#resilver), [scrub](#scrub) -| When to change | Do not change -| Data Type | boolean -| Range | 0=normal scan behaviour, 1=check hard memory limit strictly during scan -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.8.0 - -### zfs_send_queue_length -`zfs_send_queue_length` is the maximum number of bytes allowed in the zfs send queue. 
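
A sketch of checking and raising the queue before sending a dataset that uses a 16 MiB
recordsize (the 32 MiB figure is illustrative and simply satisfies the "at least twice the
largest block size" rule from the table below):

```
# current send queue size, in bytes
cat /sys/module/zfs/parameters/zfs_send_queue_length

# illustration: allow 32 MiB in the send queue
echo 33554432 > /sys/module/zfs/parameters/zfs_send_queue_length
```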
- -| zfs_send_queue_length| Notes -|---|--- -| Tags | [send](#send) -| When to change | When using the largest recordsize or volblocksize (16 MiB), increasing can improve send efficiency -| Data Type | int -| Units | bytes -| Range | Must be at least twice the maximum recordsize or volblocksize in use -| Default | 16,777,216 bytes (16 MiB) -| Change | Dynamic -| Versions Affected | v0.8.1 - -### zfs_recv_queue_length -`zfs_recv_queue_length` is the maximum number of bytes allowed in the zfs receive queue. - - -| zfs_recv_queue_length | Notes -|---|--- -| Tags | [receive](#receive) -| When to change | When using the largest recordsize or volblocksize (16 MiB), increasing can improve receive efficiency -| Data Type | int -| Units | bytes -| Range | Must be at least twice the maximum recordsize or volblocksize in use -| Default | 16,777,216 bytes (16 MiB) -| Change | Dynamic -| Versions Affected | v0.8.1 - -### zfs_arc_min_prefetch_lifespan -`arc_min_prefetch_lifespan` is the minimum time for a prefetched block to remain in ARC before -it is eligible for eviction. - -| zfs_arc_min_prefetch_lifespan | Notes -|---|--- -| Tags | [ARC](#ARC) -| When to change | TBD -| Data Type | int -| Units | clock ticks -| Range | 0 = use default value -| Default | 1 second (as expressed in clock ticks) -| Change | Dynamic -| Versions Affected | v0.7.0 - -### zfs_scan_ignore_errors -`zfs_scan_ignore_errors` allows errors discovered during scrub or resilver to be -ignored. This can be tuned as a workaround to remove the dirty time list (DTL) -when completing a pool scan. It is intended to be used during pool repair or -recovery to prevent resilvering when the pool is imported. - -| zfs_scan_ignore_errors | Notes -|---|--- -| Tags | [resilver](#resilver) -| When to change | See description above -| Data Type | boolean -| Range | 0 = do not ignore errors, 1 = ignore errors during pool scrub or resilver -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.8.1 - -### zfs_top_maxinflight -`zfs_top_maxinflight` is used to limit the maximum number of I/Os queued to top-level -vdevs during scrub or resilver operations. The actual top-level vdev limit is calculated -by multiplying the number of child vdevs by `zfs_top_maxinflight` This limit is an -additional cap over and above the scan limits - -| zfs_top_maxinflight | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub), [ZIO_scheduler](#zio_scheduler) -| When to change | for modern ZFS versions, the ZIO scheduler limits usually take precedence -| Data Type | int -| Units | I/O operations -| Range | 1 to MAX_INT -| Default | 32 -| Change | Dynamic -| Versions Affected | v0.6.0 - -### zfs_resilver_delay -`zfs_resilver_delay` sets a time-based delay for resilver I/Os. This delay is -in addition to the ZIO scheduler's treatement of scrub workloads. See also -[zfs_scan_idle](#zfs_scan_idle) - -| zfs_resilver_delay | Notes -|---|--- -| Tags | [resilver](#resilver), [ZIO_scheduler](#zio_scheduler) -| When to change | increasing can reduce impact of resilver workload on dynamic workloads -| Data Type | int -| Units | clock ticks -| Range | 0 to MAX_INT -| Default | 2 -| Change | Dynamic -| Versions Affected | v0.6.0 - -### zfs_scrub_delay -`zfs_scrub_delay` sets a time-based delay for scrub I/Os. This delay is -in addition to the ZIO scheduler's treatment of scrub workloads. 
See also -[zfs_scan_idle](#zfs_scan_idle) - -| zfs_scrub_delay | Notes -|---|--- -| Tags | [scrub](#scrub), [ZIO_scheduler](#zio_scheduler) -| When to change | increasing can reduce impact of scrub workload on dynamic workloads -| Data Type | int -| Units | clock ticks -| Range | 0 to MAX_INT -| Default | 4 -| Change | Dynamic -| Versions Affected | v0.6.0 - -### zfs_scan_idle -When a non-scan I/O has occurred in the past `zfs_scan_idle` clock ticks, then -[zfs_resilver_delay](#zfs_resilver_delay) or [zfs_scrub_delay](#zfs_scrub_delay) -are enabled. - -| zfs_scan_idle | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub), [ZIO_scheduler](#zio_scheduler) -| When to change | as part of a resilver/scrub tuning effort -| Data Type | int -| Units | clock ticks -| Range | 0 to MAX_INT -| Default | 50 -| Change | Dynamic -| Versions Affected | v0.6.0 - -### icp_aes_impl -By default, ZFS will choose the highest performance, hardware-optimized implementation of the -AES encryption algorithm. The `icp_aes_impl` tunable overrides this automatic choice. - -Note: `icp_aes_impl` is set in the `icp` kernel module, not the `zfs` kernel module. - -To observe the available options `cat /sys/module/icp/parameters/icp_aes_impl` -The default option is shown in brackets '[]' - -| icp_aes_impl | Notes -|---|--- -| Tags | [encryption](#encryption) -| Kernel module | icp -| When to change | debugging ZFS encryption on hardware -| Data Type | string -| Range | varies by hardware -| Default | automatic, depends on the hardware -| Change | dynamic -| Versions Affected | planned for v2 - -### icp_gcm_impl -By default, ZFS will choose the highest performance, hardware-optimized implementation of the -GCM encryption algorithm. The `icp_gcm_impl` tunable overrides this automatic choice. - -Note: `icp_gcm_impl` is set in the `icp` kernel module, not the `zfs` kernel module. - -To observe the available options `cat /sys/module/icp/parameters/icp_gcm_impl` -The default option is shown in brackets '[]' - -| icp_gcm_impl | Notes -|---|--- -| Tags | [encryption](#encryption) -| Kernel module | icp -| When to change | debugging ZFS encryption on hardware -| Data Type | string -| Range | varies by hardware -| Default | automatic, depends on the hardware -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_abd_scatter_min_size -`zfs_abd_scatter_min_size` changes the ARC buffer data (ABD) allocator's threshold -for using linear or page-based scatter buffers. Allocations smaller than `zfs_abd_scatter_min_size` -use linear ABDs. - -Scatter ABD's use at least one page each, so sub-page allocations waste some space -when allocated as scatter allocations. For example, 2KB scatter allocation wastes -half of each page. -Using linear ABD's for small allocations results in slabs containing many allocations. -This can improve memory efficiency, at the expense of more work for ARC evictions -attempting to free pages, because all the buffers on one slab -need to be freed in order to free the slab and its underlying pages. - -Typically, 512B and 1KB kmem caches have 16 buffers per slab, so it's possible -for them to actually waste more memory than scatter allocations: -* one page per buf = wasting 3/4 or 7/8 -* one buf per slab = wasting 15/16 - -Spill blocks are typically 512B and are heavily used on systems running _selinux_ -with the default dnode size and the `xattr=sa` property set. - -By default, linear allocations for 512B and 1KB, and scatter allocations for -larger (>= 1.5KB) allocation requests. 
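
On releases that carry this tunable, the split between linear and scatter buffers can be
watched in the ABD kstats; a rough sketch:

```
# current threshold, in bytes (allocations below it use linear ABDs)
cat /sys/module/zfs/parameters/zfs_abd_scatter_min_size

# counts and sizes of linear vs scatter buffers currently allocated
grep -E 'linear|scatter' /proc/spl/kstat/zfs/abdstats
```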
- -| zfs_abd_scatter_min_size | Notes -|---|--- -| Tags | [ARC](#ARC) -| When to change | debugging memory allocation, especially for large pages -| Data Type | int -| Units | bytes -| Range | 0 to MAX_INT -| Default | 1536 (512B and 1KB allocations will be linear) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_unlink_suspend_progress -`zfs_unlink_suspend_progress` changes the policy for removing pending unlinks. -When enabled, files will not be asynchronously removed from the list of pending -unlinks and the space they consume will be leaked. Once this option has been -disabled and the dataset is remounted, the pending unlinks will be processed -and the freed space returned to the pool. - -| zfs_unlink_suspend_progress | Notes -|---|--- -| Tags | -| When to change | used by the ZFS test suite (ZTS) to facilitate testing -| Data Type | boolean -| Range | 0 = use async unlink removal, 1 = do not async unlink thus leaking space -| Default | 0 -| Change | prior to dataset mount -| Versions Affected | planned for v2 - -### spa_load_verify_shift -`spa_load_verify_shift` sets the fraction of ARC that can be used by inflight I/Os when -verifying the pool during import. This value is a "shift" representing the fraction -of ARC target size (`grep -w c /proc/spl/kstat/zfs/arcstats`). The ARC target size is -shifted to the right. Thus a value of '2' results in the fraction = 1/4, while a value -of '4' results in the fraction = 1/8. - -For large memory machines, pool import can consume large amounts of ARC: much larger than -the value of maxinflight. This can result in [spa_load_verify_maxinflight](#spa_load_verify_maxinflight) -having a value of 0 causing the system to hang. -Setting `spa_load_verify_shift` can reduce this limit and allow importing without hanging. - -| spa_load_verify_shift | Notes -|---|--- -| Tags | [import](#import), [ARC](#ARC), [SPA](#SPA) -| When to change | troubleshooting pool import on large memory machines -| Data Type | int -| Units | shift -| Range | 1 to MAX_INT -| Default | 4 -| Change | prior to importing a pool -| Versions Affected | planned for v2 - -### spa_load_print_vdev_tree -`spa_load_print_vdev_tree` enables printing of the attempted pool import's vdev tree to -kernel message to the ZFS debug message log `/proc/spl/kstat/zfs/dbgmsg` -Both the provided vdev tree and MOS vdev tree are printed, which can be useful -for debugging problems with the zpool `cachefile` - -| spa_load_print_vdev_tree | Notes -|---|--- -| Tags | [import](#import), [SPA](#SPA) -| When to change | troubleshooting pool import failures -| Data Type | boolean -| Range | 0 = do not print pool configuration in logs, 1 = print pool configuration in logs -| Default | 0 -| Change | prior to pool import -| Versions Affected | planned for v2 - -### zfs_max_missing_tvds -When importing a pool in readonly mode (`zpool import -o readonly=on ...`) -then up to `zfs_max_missing_tvds` top-level vdevs can be missing, but the -import can attempt to progress. - -Note: This is strictly intended for advanced pool recovery cases since -missing data is almost inevitable. Pools with missing devices can only be imported -read-only for safety reasons, and the pool's `failmode` property is automatically -set to `continue` - -The expected use case is to recover pool data immediately after accidentally adding a -non-protected vdev to a protected pool. - -* With 1 missing top-level vdev, ZFS should be able to import the pool and mount all - datasets. 
User data that was not modified after the missing device has been - added should be recoverable. Thus snapshots created prior to the - addition of that device should be completely intact. - -* With 2 missing top-level vdevs, some datasets may fail to mount since there are - dataset statistics that are stored as regular metadata. Some data might be - recoverable if those vdevs were added recently. - -* With 3 or more top-level missing vdevs, the pool is severely damaged and MOS entries - may be missing entirely. Chances of data recovery are very low. Note that - there are also risks of performing an inadvertent rewind as we might be - missing all the vdevs with the latest uberblocks. - -| zfs_max_missing_tvds | Notes -|---|--- -| Tags | [import](#import) -| When to change | troubleshooting pools with missing devices -| Data Type | int -| Units | missing top-level vdevs -| Range | 0 to MAX_INT -| Default | 0 -| Change | prior to pool import -| Versions Affected | planned for v2 - -### dbuf_metadata_cache_shift -`dbuf_metadata_cache_shift` sets the size of the dbuf metadata cache -as a fraction of ARC target size. This is an alternate method for setting dbuf metadata -cache size than [dbuf_metadata_cache_max_bytes](#dbuf_metadata_cache_max_bytes). - -[dbuf_metadata_cache_max_bytes](#dbuf_metadata_cache_max_bytes) overrides `dbuf_metadata_cache_shift` - -This value is a "shift" representing the fraction -of ARC target size (`grep -w c /proc/spl/kstat/zfs/arcstats`). The ARC target size is -shifted to the right. Thus a value of '2' results in the fraction = 1/4, while a value -of '6' results in the fraction = 1/64. - -| dbuf_metadata_cache_shift | Notes -|---|--- -| Tags | [ARC](#ARC), [dbuf_cache](#dbuf_cache) -| When to change | -| Data Type | int -| Units | shift -| Range | practical range is ([dbuf_cache_shift](#dbuf_cache_shift) + 1) to MAX_INT -| Default | 6 -| Change | Dynamic -| Versions Affected | planned for v2 - -### dbuf_metadata_cache_max_bytes -`dbuf_metadata_cache_max_bytes` sets the size of the dbuf metadata cache -as a number of bytes. This is an alternate method for setting dbuf metadata -cache size than [dbuf_metadata_cache_shift](#dbuf_metadata_cache_shift) - -[dbuf_metadata_cache_max_bytes](#dbuf_metadata_cache_max_bytes) overrides `dbuf_metadata_cache_shift` - -| dbuf_metadata_cache_max_bytes | Notes -|---|--- -| Tags | [dbuf_cache](#dbuf_cache) -| When to change | -| Data Type | int -| Units | bytes -| Range | 0 = use [dbuf_metadata_cache_shift](#dbuf_metadata_cache_shift) to ARC `c_max` -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### dbuf_cache_shift -`dbuf_cache_shift` sets the size of the dbuf cache as a fraction of ARC target size. -This is an alternate method for setting dbuf -cache size than [dbuf_cache_max_bytes](#dbuf_cache_max_bytes). - -[dbuf_cache_max_bytes](#dbuf_cache_max_bytes) overrides `dbuf_cache_shift` - -This value is a "shift" representing the fraction -of ARC target size (`grep -w c /proc/spl/kstat/zfs/arcstats`). The ARC target size is -shifted to the right. Thus a value of '2' results in the fraction = 1/4, while a value -of '5' results in the fraction = 1/32. 
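
As a worked example (a sketch that assumes the arcstats kstat layout quoted above and a
module exposing this tunable), the cache size implied by the shift can be computed from the
ARC target size `c`:

```
# ARC target size in bytes
c=$(awk '$1 == "c" {print $3}' /proc/spl/kstat/zfs/arcstats)

# dbuf cache size = c >> dbuf_cache_shift (the default shift of 5 gives 1/32 of c)
dshift=$(cat /sys/module/zfs/parameters/dbuf_cache_shift)
echo $(( c >> dshift ))
```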
- -Performance tuning of dbuf cache can be monitored using: - * `dbufstat` command - * [node_exporter](https://github.com/prometheus/node_exporter) ZFS module for prometheus environments - * [telegraf](https://github.com/influxdata/telegraf) ZFS plugin for general-purpose metric collection - * `/proc/spl/kstat/zfs/dbufstats` kstat - -| dbuf_cache_shift | Notes -|---|--- -| Tags | [ARC](#ARC), [dbuf_cache](#dbuf_cache) -| When to change | to improve performance of read-intensive channel programs -| Data Type | int -| Units | shift -| Range | 5 to MAX_INT -| Default | 5 -| Change | Dynamic -| Versions Affected | planned for v2 - -### dbuf_cache_max_bytes -`dbuf_cache_max_bytes` sets the size of the dbuf cache in bytes. -This is an alternate method for setting dbuf cache size than -[dbuf_cache_shift](#dbuf_cache_shift) - -Performance tuning of dbuf cache can be monitored using: - * `dbufstat` command - * [node_exporter](https://github.com/prometheus/node_exporter) ZFS module for prometheus environments - * [telegraf](https://github.com/influxdata/telegraf) ZFS plugin for general-purpose metric collection - * `/proc/spl/kstat/zfs/dbufstats` kstat - -| dbuf_cache_max_bytes | Notes -|---|--- -| Tags | [ARC](#ARC), [dbuf_cache](#dbuf_cache) -| When to change | -| Data Type | int -| Units | bytes -| Range | 0 = use [dbuf_cache_shift](#dbuf_cache_shift) to ARC `c_max` -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### metaslab_force_ganging -When testing allocation code, `metaslab_force_ganging` forces blocks above the specified size to be ganged. - -| metaslab_force_ganging | Notes -|---|--- -| Tags | [allocation](#allocation) -| When to change | for development testing purposes only -| Data Type | ulong -| Units | bytes -| Range | SPA_MINBLOCKSIZE to (SPA_MAXBLOCKSIZE + 1) -| Default | SPA_MAXBLOCKSIZE + 1 (16,777,217 bytes) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_default_ms_count -When adding a top-level vdev, `zfs_vdev_default_ms_count` is the target number of metaslabs. - -| zfs_vdev_default_ms_count | Notes -|---|--- -| Tags | [allocation](#allocation) -| When to change | for development testing purposes only -| Data Type | int -| Range | 16 to MAX_INT -| Default | 200 -| Change | prior to creating a pool or adding a top-level vdev -| Versions Affected | planned for v2 - -### vdev_removal_max_span -During top-level vdev removal, chunks of data are copied from the vdev -which may include free space in order to trade bandwidth for IOPS. -`vdev_removal_max_span` sets the maximum span of free space -included as unnecessary data in a chunk of copied data. - -| vdev_removal_max_span | Notes -|---|--- -| Tags | [vdev_removal](#vdev_removal) -| When to change | TBD -| Data Type | int -| Units | bytes -| Range | 0 to MAX_INT -| Default | 32,768 (32 MiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_removal_ignore_errors -When removing a device, `zfs_removal_ignore_errors` controls the process for -handling hard I/O errors. When set, if a device encounters -a hard IO error during the removal process the removal will not be cancelled. -This can result in a normally recoverable block becoming permanently damaged -and is not recommended. This should only be used as a last resort when the -pool cannot be returned to a healthy state prior to removing the device. 
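
Strictly as a last-resort sketch (the pool name `tank` and device `sdb` are placeholders,
and the caveat above applies), the flag would be set only for the duration of the retried
removal:

```
# WARNING: hard errors hit during the removal will be ignored
echo 1 > /sys/module/zfs/parameters/zfs_removal_ignore_errors
zpool remove tank sdb
# restore the default once the removal is under way
echo 0 > /sys/module/zfs/parameters/zfs_removal_ignore_errors
```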

| zfs_removal_ignore_errors | Notes
|---|---
| Tags | [vdev_removal](#vdev_removal)
| When to change | See description for caveat
| Data Type | boolean
| Range | during device removal: 0 = hard errors are not ignored, 1 = hard errors are ignored
| Default | 0
| Change | Dynamic
| Versions Affected | planned for v2

### zfs_removal_suspend_progress
`zfs_removal_suspend_progress` is used during automated testing of the ZFS code to
increase test coverage.

| zfs_removal_suspend_progress | Notes
|---|---
| Tags | [vdev_removal](#vdev_removal)
| When to change | do not change
| Data Type | boolean
| Range | 0 = do not suspend during vdev removal, 1 = suspend during vdev removal
| Default | 0
| Change | Dynamic
| Versions Affected | planned for v2

### zfs_condense_indirect_commit_entry_delay_ms
During vdev removal, the vdev indirection layer sleeps for `zfs_condense_indirect_commit_entry_delay_ms`
milliseconds during mapping generation. This parameter is used during automated testing of the
ZFS code to improve test coverage.

| zfs_condense_indirect_commit_entry_delay_ms | Notes
|---|---
| Tags | [vdev_removal](#vdev_removal)
| When to change | do not change
| Data Type | int
| Units | milliseconds
| Range | 0 to MAX_INT
| Default | 0
| Change | Dynamic
| Versions Affected | planned for v2

### zfs_condense_indirect_vdevs_enable
During vdev removal, the condensing process attempts to save memory by removing obsolete mappings.
`zfs_condense_indirect_vdevs_enable` enables condensing indirect vdev mappings.
When set, ZFS attempts to condense indirect vdev mappings if the mapping uses more than
[zfs_condense_min_mapping_bytes](#zfs_condense_min_mapping_bytes) bytes of memory and
if the obsolete space map object uses more than
[zfs_condense_max_obsolete_bytes](#zfs_condense_max_obsolete_bytes) bytes on disk.

| zfs_condense_indirect_vdevs_enable | Notes
|---|---
| Tags | [vdev_removal](#vdev_removal)
| When to change | TBD
| Data Type | boolean
| Range | 0 = do not save memory, 1 = save memory by condensing obsolete mappings after vdev removal
| Default | 1
| Change | Dynamic
| Versions Affected | planned for v2

### zfs_condense_max_obsolete_bytes
After vdev removal, `zfs_condense_max_obsolete_bytes` sets the limit for beginning the condensing
process. Condensing begins if the obsolete space map takes up more than `zfs_condense_max_obsolete_bytes`
of space on disk (logically). The default of 1 GiB is small enough relative to a typical pool that the
space consumed by the obsolete space map is minimal.

See also [zfs_condense_indirect_vdevs_enable](#zfs_condense_indirect_vdevs_enable)

| zfs_condense_max_obsolete_bytes | Notes
|---|---
| Tags | [vdev_removal](#vdev_removal)
| When to change | do not change
| Data Type | ulong
| Units | bytes
| Range | 0 to MAX_ULONG
| Default | 1,073,741,824 (1 GiB)
| Change | Dynamic
| Versions Affected | planned for v2

### zfs_condense_min_mapping_bytes
After vdev removal, `zfs_condense_min_mapping_bytes` is the lower limit for determining when
to condense the in-memory obsolete space map. The condensing process will not continue unless a minimum of
`zfs_condense_min_mapping_bytes` of memory can be freed.
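
For reference, a sketch of inspecting the condense-related tunables together (assuming the
standard module parameter paths):

```
# enable flag plus the memory and on-disk thresholds, printed as path:value
grep . /sys/module/zfs/parameters/zfs_condense_*
```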
- -See also [zfs_condense_indirect_vdevs_enable](#zfs_condense_indirect_vdevs_enable) - -| zfs_condense_min_mapping_bytes | Notes -|---|--- -| Tags | [vdev_removal](#vdev_removal) -| When to change | do not change -| Data Type | ulong -| Units | bytes -| Range | 0 to MAX_ULONG -| Default | 128 KiB -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_initializing_max_active -`zfs_vdev_initializing_max_active` sets the maximum initializing I/Os active to each device. - -| zfs_vdev_initializing_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_initializing_min_active -`zfs_vdev_initializing_min_active` sets the minimum initializing I/Os active to each device. - -| zfs_vdev_initializing_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_initializing_max_active](#zfs_vdev_initializing_max_active) -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_removal_max_active -`zfs_vdev_removal_max_active` sets the maximum top-level vdev removal I/Os active to each device. - -| zfs_vdev_removal_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 2 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_removal_min_active -`zfs_vdev_removal_min_active` sets the minimum top-level vdev removal I/Os active to each device. - -| zfs_vdev_removal_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_removal_max_active](#zfs_vdev_removal_max_active) -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_trim_max_active -`zfs_vdev_trim_max_active` sets the maximum trim I/Os active to each device. - -| zfs_vdev_trim_max_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_max_active](#zfs_vdev_max_active) -| Default | 2 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_trim_min_active -`zfs_vdev_trim_min_active` sets the minimum trim I/Os active to each device. 
- -| zfs_vdev_trim_min_active | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | See [ZFS I/O Scheduler](https://github.com/zfsonlinux/zfs/wiki/ZIO-Scheduler) -| Data Type | uint32 -| Units | I/O operations -| Range | 1 to [zfs_vdev_trim_max_active](#zfs_vdev_trim_max_active) -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_initialize_value -When initializing a vdev, ZFS writes patterns of `zfs_initialize_value` bytes to the device. - -| zfs_initialize_value | Notes -|---|--- -| Tags | [vdev_initialize](#vdev_initialize) -| When to change | when debugging initialization code -| Data Type | uint32 or uint64 -| Default | 0xdeadbeef for 32-bit systems, 0xdeadbeefdeadbeee for 64-bit systems -| Change | prior to running `zpool initialize` -| Versions Affected | planned for v2 - -### zfs_lua_max_instrlimit -`zfs_lua_max_instrlimit` limits the maximum time for a ZFS channel program to run. - -| zfs_lua_max_instrlimit | Notes -|---|--- -| Tags | [channel_programs](#channel_programs) -| When to change | to enforce a CPU usage limit on ZFS channel programs -| Data Type | ulong -| Units | LUA instructions -| Range | 0 to MAX_ULONG -| Default | 100,000,000 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_lua_max_memlimit -'zfs_lua_max_memlimit' is the maximum memory limit for a ZFS channel program. - -| zfs_lua_max_memlimit | Notes -|---|--- -| Tags | [channel_programs](#channel_programs) -| When to change | -| Data Type | ulong -| Units | bytes -| Range | 0 to MAX_ULONG -| Default | 104,857,600 (100 MiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_max_dataset_nesting -`zfs_max_dataset_nesting` limits the depth of nested datasets. -Deeply nested datasets can overflow the stack. The maximum stack depth depends on kernel compilation options, -so it is impractical to predict the possible limits. For kernels compiled with small stack sizes, -`zfs_max_dataset_nesting` may require changes. - -| zfs_max_dataset_nesting | Notes -|---|--- -| Tags | [dataset](#dataset) -| When to change | can be tuned temporarily to fix existing datasets that exceed the predefined limit -| Data Type | int -| Units | datasets -| Range | 0 to MAX_INT -| Default | 50 -| Change | Dynamic, though once on-disk the value for the pool is set -| Versions Affected | planned for v2 - -### zfs_ddt_data_is_special -`zfs_ddt_data_is_special` enables the deduplication table (DDT) to reside on a special top-level vdev. - -| zfs_ddt_data_is_special | Notes -|---|--- -| Tags | [dedup](#dedup), [special_vdev](#special_vdev) -| When to change | when using a special top-level vdev and no dedup top-level vdev and it is desired to store the DDT in the main pool top-level vdevs -| Data Type | boolean -| Range | 0=do not use special vdevs to store DDT, 1=store DDT in special vdevs -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_user_indirect_is_special -If special vdevs are in use, `zfs_user_indirect_is_special` enables user data indirect blocks (a form of metadata) -to be written to the special vdevs. 
- -| zfs_user_indirect_is_special | Notes -|---|--- -| Tags | [special_vdev](#special_vdev) -| When to change | to force user data indirect blocks to remain in the main pool top-level vdevs -| Data Type | boolean -| Range | 0=do not write user indirect blocks to a special vdev, 1=write user indirect blocks to a special vdev -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_reconstruct_indirect_combinations_max -After device removal, if an indirect split block contains more than `zfs_reconstruct_indirect_combinations_max` -many possible unique combinations when being reconstructed, it can be considered too computationally -expensive to check them all. Instead, at most `zfs_reconstruct_indirect_combinations_max` randomly-selected -combinations are attempted each time the block is accessed. This allows all segment -copies to participate fairly in the reconstruction when all combinations -cannot be checked and prevents repeated use of one bad copy. - -| zfs_reconstruct_indirect_combinations_max | Notes -|---|--- -| Tags | [vdev_removal](#vdev_removal) -| When to change | TBD -| Data Type | int -| Units | attempts -| Range | 0=do not limit attempts, 1 to MAX_INT = limit for attempts -| Default | 4096 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_send_unmodified_spill_blocks -`zfs_send_unmodified_spill_blocks` enables sending of unmodified spill blocks in the send stream. -Under certain circumstances, previous versions of ZFS could incorrectly remove the spill block from an -existing object. Including unmodified copies of the spill blocks creates a -backwards compatible stream which will recreate a spill block if it was incorrectly removed. - -| zfs_send_unmodified_spill_blocks | Notes -|---|--- -| Tags | [send](#send) -| When to change | TBD -| Data Type | boolean -| Range | 0=do not send unmodified spill blocks, 1=send unmodified spill blocks -| Default | 1 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_spa_discard_memory_limit -`zfs_spa_discard_memory_limit` sets the limit for maximum memory used for prefetching a -pool's checkpoint space map on each vdev while discarding a pool checkpoint. - -| zfs_spa_discard_memory_limit | Notes -|---|--- -| Tags | [checkpoint](#checkpoint) -| When to change | TBD -| Data Type | int -| Units | bytes -| Range | 0 to MAX_INT -| Default | 16,777,216 (16 MiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_special_class_metadata_reserve_pct -`zfs_special_class_metadata_reserve_pct` sets a threshold for space in special vdevs to be reserved exclusively -for metadata. This prevents small blocks or dedup table from completely consuming a special vdev. - -| zfs_special_class_metadata_reserve_pct | Notes -|---|--- -| Tags | [special_vdev](#special_vdev) -| When to change | TBD -| Data Type | int -| Units | percent -| Range | 0 to 100 -| Default | 25 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_trim_extent_bytes_max -`zfs_trim_extent_bytes_max` sets the maximum size of a trim (aka discard, scsi unmap) command. -Ranges larger than `zfs_trim_extent_bytes_max` are split in to chunks no larger than `zfs_trim_extent_bytes_max` -bytes prior to being issued to the device. -Use `zpool iostat -w` to observe the latency of trim commands. 
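
For example (illustrative only; `tank` is a placeholder), the current extent limits can be
printed alongside the trim latency histograms they influence:

```
# minimum and maximum trim extent sizes, in bytes
grep . /sys/module/zfs/parameters/zfs_trim_extent_bytes_*

# per-vdev latency histograms, including trim, refreshed every 5 seconds
zpool iostat -w tank 5
```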
- -| zfs_trim_extent_bytes_max | Notes -|---|--- -| Tags | [trim](#trim) -| When to change | if the device can efficiently handle larger trim requests -| Data Type | uint -| Units | bytes -| Range | [zfs_trim_extent_bytes_min](#zfs_trim_extent_bytes_min) to MAX_UINT -| Default | 134,217,728 (128 MiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_trim_extent_bytes_min -`zfs_trim_extent_bytes_min` sets the minimum size of trim (aka discard, scsi unmap) commands. -Trim ranges smaller than `zfs_trim_extent_bytes_min` are skipped unless they're part of a larger -range which was broken in to chunks. Some devices have performance degradation during trim operations, -so using a larger `zfs_trim_extent_bytes_min` can reduce the total amount of space trimmed. -Use `zpool iostat -w` to observe the latency of trim commands. - -| zfs_trim_extent_bytes_min | Notes -|---|--- -| Tags | [trim](#trim) -| When to change | when trim is in use and device performance suffers from trimming small allocations -| Data Type | uint -| Units | bytes -| Range | 0=trim all unallocated space, otherwise minimum physical block size to MAX_ -| Default | 32,768 (32 KiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_trim_metaslab_skip -`zfs_trim_metaslab_skip` enables uninitialized metaslabs to be skipped during the trim (aka discard, scsi unmap) -process. `zfs_trim_metaslab_skip` can be useful for pools constructed from large thinly-provisioned devices where trim -operations perform slowly. -As a pool ages an increasing fraction of the pool's metaslabs are initialized, progressively degrading the -usefulness of this option. -This setting is stored when starting a manual trim and persists for the duration of the requested trim. -Use `zpool iostat -w` to observe the latency of trim commands. - -| zfs_trim_metaslab_skip | Notes -|---|--- -| Tags | [trim](#trim) -| When to change | -| Data Type | boolean -| Range | 0=do not skip unitialized metaslabs during trim, 1=skip unitialized metaslabs during trim -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_trim_queue_limit -`zfs_trim_queue_limit` sets the maximum queue depth for leaf vdevs. -See also [zfs_vdev_trim_max_active](#zfs_vdev_trim_max_active) and -[zfs_trim_extent_bytes_max](#zfs_trim_extent_bytes_max) -Use `zpool iostat -q` to observe trim queue depth. - -| zfs_trim_queue_limit | Notes -|---|--- -| Tags | [trim](#trim) -| When to change | to restrict the number of trim commands in the queue -| Data Type | uint -| Units | I/O operations -| Range | 1 to MAX_UINT -| Default | 10 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_trim_txg_batch -`zfs_trim_txg_batch` sets the number of transaction groups worth of frees which should be aggregated -before trim (aka discard, scsi unmap) commands are issued to a device. This setting represents a -trade-off between issuing larger, more efficient trim commands and the -delay before the recently trimmed space is available for use by the device. - -Increasing this value will allow frees to be aggregated for a longer time. -This will result is larger trim operations and potentially increased memory -usage. Decreasing this value will have the opposite effect. The default -value of 32 was empirically determined to be a reasonable compromise. 
- -| zfs_trim_txg_batch | Notes -|---|--- -| Tags | [trim](#trim) -| When to change | TBD -| Data Type | uint -| Units | metaslabs to stride -| Range | 1 to MAX_UINT -| Default | 32 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_aggregate_trim -`zfs_vdev_aggregate_trim` allows trim I/Os to be aggregated. This is normally not helpful because -the extents to be trimmed will have been already been aggregated by the metaslab. - - -| zfs_vdev_aggregate_trim | Notes -|---|--- -| Tags | [trim](#trim), [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | when debugging trim code or trim performance issues -| Data Type | boolean -| Range | 0=do not attempt to aggregate trim commands, 1=attempt to aggregate trim commands -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_aggregation_limit_non_rotating -`zfs_vdev_aggregation_limit_non_rotating` is the equivalent of -[zfs_vdev_aggregation_limit](#zfs_vdev_aggregation_limit) for devices -which represent themselves as non-rotating to the Linux blkdev interfaces. -Such devices have a value of 0 in `/sys/block/DEVICE/queue/rotational` and are expected to be SSDs. - -| zfs_vdev_aggregation_limit_non_rotating | Notes -|---|--- -| Tags | [vdev](#vdev), [ZIO_scheduler](#zio_scheduler) -| When to change | see [zfs_vdev_aggregation_limit](#zfs_vdev_aggregation_limit) -| Data Type | int -| Units | bytes -| Range | 0 to MAX_INT -| Default | 131,072 bytes (128 KiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zil_nocacheflush -ZFS uses barriers (volatile cache flush commands) to ensure data is committed to -permanent media by devices. This ensures consistent on-media state for devices -where caches are volatile (eg HDDs). - -`zil_nocacheflush` disables the cache flush commands that are normally sent to devices by -the ZIL after a log write has completed. - -The difference between `zil_nocacheflush` and [zfs_nocacheflush](#zfs_nocacheflush) is -`zil_nocacheflush` applies to ZIL writes while [zfs_nocacheflush](#zfs_nocacheflush) disables -barrier writes to the pool devices at the end of tranaction group syncs. - -WARNING: setting this can cause ZIL corruption on power loss if the device has a volatile write cache. - - -| zil_nocacheflush | Notes -|---|--- -| Tags | [disks](#disks), [ZIL](#ZIL) -| When to change | If the storage device has nonvolatile cache, then disabling cache flush can save the cost of occasional cache flush comamnds -| Data Type | boolean -| Range | 0=send cache flush commands, 1=do not send cache flush commands -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zio_deadman_log_all -`zio_deadman_log_all` enables debugging messages for all ZFS I/Os, rather than only for leaf -ZFS I/Os for a vdev. This is meant to be used by developers to gain diagnostic information for hang -conditions which don't involve a mutex or other locking primitive. Typically these are conditions where a thread in -the zio pipeline is looping indefinitely. 
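
A minimal sketch of turning this on while investigating a hang (assuming the debug message
buffer below is being populated on the running system):

```
# report deadman events for every zio, not only leaf vdev zios
echo 1 > /sys/module/zfs/parameters/zio_deadman_log_all

# read the ZFS debug message buffer for the resulting reports
cat /proc/spl/kstat/zfs/dbgmsg
```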
- -See also [zfs_dbgmsg_enable](#zfs_dbgmsg_enable) - -| zio_deadman_log_all | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | when debugging ZFS I/O pipeline -| Data Type | boolean -| Range | 0=do not log all deadman events, 1=log all deadman events -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zio_decompress_fail_fraction -If non-zero, `zio_decompress_fail_fraction` represents the denominator of the probability that ZFS -should induce a decompression failure. For instance, for a 5% decompression failure rate, this value -should be set to 20. - -| zio_decompress_fail_fraction | Notes -|---|--- -| Tags | [debug](#debug) -| When to change | when debugging ZFS internal compressed buffer code -| Data Type | ulong -| Units | probability of induced decompression failure is 1/`zio_decompress_fail_fraction` -| Range | 0 = do not induce failures, or 1 to MAX_ULONG -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zio_slow_io_ms -An I/O operation taking more than `zio_slow_io_ms` milliseconds to complete is marked as a slow I/O. -Slow I/O counters can be observed with `zpool status -s`. -Each slow I/O causes a delay zevent, observable using `zpool events`. -See also `zfs-events(5)`. - -| zio_slow_io_ms | Notes -|---|--- -| Tags | [vdev](#vdev), [zed](#zed) -| When to change | when debugging slow devices and the default value is inappropriate -| Data Type | int -| Units | milliseconds -| Range | 0 to MAX_INT -| Default | 30,000 (30 seconds) -| Change | Dynamic -| Versions Affected | planned for v2 - -### vdev_validate_skip -`vdev_validate_skip` disables label validation steps during pool import. -Changing is not recommended unless you know what you are doing and are recovering a damaged label. - -| vdev_validate_skip | Notes -|---|--- -| Tags | [vdev](#vdev) -| When to change | do not change -| Data Type | boolean -| Range | 0=validate labels during pool import, 1=do not validate vdev labels during pool import -| Default | 0 -| Change | prior to pool import -| Versions Affected | planned for v2 - -### zfs_async_block_max_blocks -`zfs_async_block_max_blocks` limits the number of blocks freed in a single transaction group commit. -During deletes of large objects, such as snapshots, the number of freed blocks can cause the DMU -to extend txg sync times well beyond [zfs_txg_timeout](#zfs_txg_timeout). `zfs_async_block_max_blocks` -is used to limit these effects. - -| zfs_async_block_max_blocks | Notes -|---|--- -| Tags | [delete](#delete), [DMU](#DMU) -| When to change | TBD -| Data Type | ulong -| Units | blocks -| Range | 1 to MAX_ULONG -| Default | MAX_ULONG (do not limit) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_checksum_events_per_second -`zfs_checksum_events_per_second` is a rate limit for checksum events. -Note that this should not be set below the `zed` thresholds (currently 10 checksums over 10 sec) -or else `zed` may not trigger any action. - -| zfs_checksum_events_per_second | Notes -|---|--- -| Tags | [vdev](#vdev) -| When to change | TBD -| Data Type | uint -| Units | checksum events -| Range | `zed` threshold to MAX_UINT -| Default | 20 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_disable_ivset_guid_check -`zfs_disable_ivset_guid_check` disables requirement for IVset guids to be present and match when doing a raw -receive of encrypted datasets. Intended for users whose pools were created with -ZFS on Linux pre-release versions and now have compatibility issues. 
- -For a ZFS raw receive, from a send stream created by `zfs send --raw`, the crypt_keydata nvlist includes -a to_ivset_guid to be set on the new snapshot. This value will override the value generated by the snapshot code. -However, this value may not be present, because older implementations of -the raw send code did not include this value. -When `zfs_disable_ivset_guid_check` is enabled, the receive proceeds and a newly-generated value is used. - -| zfs_disable_ivset_guid_check | Notes -|---|--- -| Tags | [receive](#receive) -| When to change | debugging pre-release ZFS raw sends -| Data Type | boolean -| Range | 0=check IVset guid, 1=do not check IVset guid -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_obsolete_min_time_ms -`zfs_obsolete_min_time_ms` is similar to [zfs_free_min_time_ms](#zfs_free_min_time_ms) -and used for cleanup of old indirection records for vdevs removed using the `zpool remove` command. - -| zfs_obsolete_min_time_ms | Notes -|---|--- -| Tags | [delete](#delete), [remove](#remove) -| When to change | TBD -| Data Type | int -| Units | milliseconds -| Range | 0 to MAX_INT -| Default | 500 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_override_estimate_recordsize -`zfs_override_estimate_recordsize` overrides the default logic for estimating block -sizes when doing a zfs send. The default heuristic is that the average block size will be the current recordsize. - -| zfs_override_estimate_recordsize | Notes -|---|--- -| Tags | [send](#send) -| When to change | if most data in your dataset is not of the current recordsize and you require accurate zfs send size estimates -| Data Type | ulong -| Units | bytes -| Range | 0=do not override, 1 to MAX_ULONG -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_remove_max_segment -`zfs_remove_max_segment` sets the largest contiguous segment that ZFS attempts to allocate when removing a vdev. -This can be no larger than 16MB. If there is a performance problem with attempting to allocate large blocks, consider decreasing this. -The value is rounded up to a power-of-2. - -| zfs_remove_max_segment | Notes -|---|--- -| Tags | [remove](#remove) -| When to change | after removing a top-level vdev, consider decreasing if there is a performance degradation when attempting to allocate large blocks -| Data Type | int -| Units | bytes -| Range | maximum of the physical block size of all vdevs in the pool to 16,777,216 bytes (16 MiB) -| Default | 16,777,216 bytes (16 MiB) -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_resilver_disable_defer -`zfs_resilver_disable_defer` disables the `resilver_defer` pool feature. -The `resilver_defer` feature allows ZFS to postpone new resilvers if an existing resilver is in progress. - -| zfs_resilver_disable_defer | Notes -|---|--- -| Tags | [resilver](#resilver) -| When to change | if resilver postponement is not desired due to overall resilver time constraints -| Data Type | boolean -| Range | 0=allow `resilver_defer` to postpone new resilver operations, 1=immediately restart resilver when needed -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_scan_suspend_progress -`zfs_scan_suspend_progress` causes a scrub or resilver scan to freeze without actually pausing. 
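
As a test-only sketch (`tank` is a placeholder), a scrub can be started, frozen while its
state is inspected, and then released:

```
zpool scrub tank
# freeze the scan without reporting it as paused
echo 1 > /sys/module/zfs/parameters/zfs_scan_suspend_progress
zpool status tank
# allow the scan to continue
echo 0 > /sys/module/zfs/parameters/zfs_scan_suspend_progress
```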
- -| zfs_scan_suspend_progress | Notes -|---|--- -| Tags | [resilver](#resilver), [scrub](#scrub) -| When to change | testing or debugging scan code -| Data Type | boolean -| Range | 0=do not freeze scans, 1=freeze scans -| Default | 0 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_scrub_min_time_ms -Scrubs are processed by the sync thread. While scrubbing at least `zfs_scrub_min_time_ms` time is -spent working on a scrub between txg syncs. - -| zfs_scrub_min_time_ms | Notes -|---|--- -| Tags | [scrub](#scrub) -| When to change | -| Data Type | int -| Units | milliseconds -| Range | 1 to ([zfs_txg_timeout](#zfs_txg_timeout) - 1) -| Default | 1,000 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_slow_io_events_per_second -`zfs_slow_io_events_per_second` is a rate limit for slow I/O events. -Note that this should not be set below the `zed` thresholds (currently 10 checksums over 10 sec) -or else `zed` may not trigger any action. - -| zfs_slow_io_events_per_second | Notes -|---|--- -| Tags | [vdev](#vdev) -| When to change | TBD -| Data Type | uint -| Units | slow I/O events -| Range | `zed` threshold to MAX_UINT -| Default | 20 -| Change | Dynamic -| Versions Affected | planned for v2 - -### zfs_vdev_min_ms_count -`zfs_vdev_min_ms_count` is the minimum number of metaslabs to create in a top-level vdev. - -| zfs_vdev_min_ms_count | Notes -|---|--- -| Tags | [metaslab](#metaslab), [vdev](#vdev) -| When to change | TBD -| Data Type | int -| Units | metaslabs -| Range | 16 to [zfs_vdev_ms_count_limit](#zfs_vdev_ms_count_limit) -| Default | 16 -| Change | prior to creating a pool or adding a top-level vdev -| Versions Affected | planned for v2 - -### zfs_vdev_ms_count_limit -`zfs_vdev_ms_count_limit` is the practical upper limit for the number of metaslabs per top-level vdev. - -| zfs_vdev_ms_count_limit | Notes -|---|--- -| Tags | [metaslab](#metaslab), [vdev](#vdev) -| When to change | TBD -| Data Type | int -| Units | metaslabs -| Range | [zfs_vdev_min_ms_count](#zfs_vdev_min_ms_count) to 131,072 -| Default | 131,072 -| Change | prior to creating a pool or adding a top-level vdev -| Versions Affected | planned for v2 - -### spl_hostid -`spl_hostid` is a unique system id number. It orginated in Sun's products where most systems had a -unique id assigned at the factory. This assignment does not exist in modern hardware. -In ZFS, the hostid is stored in the vdev label and can be used to determine if another system had -imported the pool. -When set `spl_hostid` can be used to uniquely identify a system. -By default this value is set to zero which indicates the hostid is disabled. -It can be explicitly enabled by placing a unique non-zero value in the file shown in -[spl_hostid_path](#spl_hostid_path) - -| spl_hostid | Notes -|---|--- -| Tags | [hostid](#hostid), [MMP](#MMP) -| Kernel module | spl -| When to change | to uniquely identify a system when vdevs can be shared across multiple systems -| Data Type | ulong -| Range | 0=ignore hostid, 1 to 4,294,967,295 (32-bits or 0xffffffff) -| Default | 0 -| Change | prior to importing pool -| Versions Affected | v0.6.1 - -### spl_hostid_path -`spl_hostid_path` is the path name for a file that can contain a unique hostid. -For testing purposes, `spl_hostid_path` can be overridden by the ZFS_HOSTID environment variable. 
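
For example, the hostid seen by the spl module can be compared with the backing file and
the system's own notion of its hostid (a sketch using only the standard parameter paths and
the coreutils `hostid` command):

```
# hostid in use by the spl module (0 means disabled)
cat /sys/module/spl/parameters/spl_hostid

# file consulted when spl_hostid is 0, and the system hostid for comparison
cat /sys/module/spl/parameters/spl_hostid_path
hostid
```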
- -| spl_hostid_path | Notes -|---|--- -| Tags | [hostid](#hostid), [MMP](#MMP) -| Kernel module | spl -| When to change | when creating a new ZFS distribution where the default value is inappropriate -| Data Type | string -| Default | "/etc/hostid" -| Change | read-only, can only be changed prior to spl module load -| Versions Affected | v0.6.1 - -### spl_kmem_alloc_max -Large `kmem_alloc()` allocations fail if they exceed KMALLOC_MAX_SIZE, as determined by the kernel source. -Allocations which are marginally smaller than this limit may succeed but -should still be avoided due to the expense of locating a contiguous range -of free pages. Therefore, a maximum kmem size with reasonable safely -margin of 4x is set. `kmem_alloc()` allocations larger than this maximum -will quickly fail. `vmem_alloc()` allocations less than or equal to this -value will use `kmalloc()`, but shift to `vmalloc()` when exceeding this value. - -| spl_kmem_alloc_max | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | bytes -| Range | TBD -| Default | KMALLOC_MAX_SIZE / 4 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_alloc_warn -As a general rule `kmem_alloc()` allocations should be small, preferably -just a few pages since they must by physically contiguous. Therefore, a -rate limited warning is printed to the console for any `kmem_alloc()` -which exceeds the threshold `spl_kmem_alloc_warn` - -The default warning threshold is set to eight pages but capped at 32K to -accommodate systems using large pages. This value was selected to be small -enough to ensure the largest allocations are quickly noticed and fixed. -But large enough to avoid logging any warnings when a allocation size is -larger than optimal but not a serious concern. Since this value is tunable, -developers are encouraged to set it lower when testing so any new largish -allocations are quickly caught. These warnings may be disabled by setting -the threshold to zero. - -| spl_kmem_alloc_warn | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | developers are encouraged lower when testing so any new, large allocations are quickly caught -| Data Type | uint -| Units | bytes -| Range | 0=disable the warnings, -| Default | 32,768 (32 KiB) -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_expire -Cache expiration is part of default illumos cache behavior. The idea is -that objects in magazines which have not been recently accessed should be -returned to the slabs periodically. This is known as cache aging and -when enabled objects will be typically returned after 15 seconds. - -On the other hand Linux slabs are designed to never move objects back to -the slabs unless there is memory pressure. This is possible because under -Linux the cache will be notified when memory is low and objects can be -released. - -By default only the Linux method is enabled. It has been shown to improve -responsiveness on low memory systems and not negatively impact the performance -of systems with more memory. This policy may be changed by setting the -`spl_kmem_cache_expire` bit mask as follows, both policies may be enabled -concurrently. 
- -| spl_kmem_cache_expire | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | bitmask -| Range | 0x01 - Aging (illumos), 0x02 - Low memory (Linux) -| Default | 0x02 -| Change | Dynamic -| Versions Affected | v0.6.1 - -### spl_kmem_cache_kmem_limit -Depending on the size of a memory cache object it may be backed by `kmalloc()` -or `vmalloc()` memory. This is because the size of the required allocation -greatly impacts the best way to allocate the memory. - -When objects are small and only a small number of memory pages need to be -allocated, ideally just one, then `kmalloc()` is very efficient. However, -allocating multiple pages with `kmalloc()` gets increasingly expensive -because the pages must be physically contiguous. - -For this reason we shift to `vmalloc()` for slabs of large objects which -which removes the need for contiguous pages. `vmalloc()` cannot be used in -all cases because there is significant locking overhead involved. This -function takes a single global lock over the entire virtual address range -which serializes all allocations. Using slightly different allocation -functions for small and large objects allows us to handle a wide range of -object sizes. - -The `spl_kmem_cache_kmem_limit` value is used to determine this cutoff -size. One quarter of the kernel's compiled PAGE_SIZE is used as the default value because -[spl_kmem_cache_obj_per_slab](#spl_kmem_cache_obj_per_slab) defaults to 16. -With these default values, at most four contiguous pages are allocated. - -| spl_kmem_cache_kmem_limit | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | pages -| Range | TBD -| Default | PAGE_SIZE / 4 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_max_size -`spl_kmem_cache_max_size` is the maximum size of a kmem cache slab in MiB. -This effectively limits the maximum cache object size to -`spl_kmem_cache_max_size` / [spl_kmem_cache_obj_per_slab](#spl_kmem_cache_obj_per_slab) -Kmem caches may not be created with object sized larger than this limit. - -| spl_kmem_cache_max_size | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | MiB -| Range | TBD -| Default | 4 for 32-bit kernel, 32 for 64-bit kernel -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_obj_per_slab -`spl_kmem_cache_obj_per_slab` is the preferred number of objects per slab in the kmem cache. -In general, a larger value will increase the caches memory footprint while decreasing the time -required to perform an allocation. Conversely, a smaller value will minimize the footprint and -improve cache reclaim time but individual allocations may take longer. - -| spl_kmem_cache_obj_per_slab | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | kmem cache objects -| Range | TBD -| Default | 8 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_obj_per_slab_min -`spl_kmem_cache_obj_per_slab_min` is the minimum number of objects allowed per slab. -Normally slabs will contain [spl_kmem_cache_obj_per_slab](#spl_kmem_cache_obj_per_slab) objects but -for caches that contain very large objects it's desirable to only have a few, or even just one, object per slab. 
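To see how the object-per-slab settings play out on a live system, a minimal sketch follows (paths assume the spl module is loaded; `/proc/spl/kmem/slab` may not be exposed by every build):

```
# Current objects-per-slab target and minimum.
cat /sys/module/spl/parameters/spl_kmem_cache_obj_per_slab
cat /sys/module/spl/parameters/spl_kmem_cache_obj_per_slab_min

# Per-cache slab statistics, where available, show how many objects each
# SPL kmem cache actually packs into a slab.
head -n 20 /proc/spl/kmem/slab
```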
- -| spl_kmem_cache_obj_per_slab_min | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | debugging kmem cache operations -| Data Type | uint -| Units | kmem cache objects -| Range | TBD -| Default | 1 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_reclaim -`spl_kmem_cache_reclaim` prevents Linux from being able to rapidly reclaim all the memory held by the kmem caches. -This may be useful in circumstances where it's preferable that Linux reclaim memory from some other subsystem first. -Setting `spl_kmem_cache_reclaim` increases the likelihood out of memory events on a memory constrained system. - -| spl_kmem_cache_reclaim | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | boolean -| Range | 0=enable rapid memory reclaim from kmem caches, 1=disable rapid memory reclaim from kmem caches -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_slab_limit -For small objects the Linux slab allocator should be used to make the most efficient use of the memory. -However, large objects are not supported by the Linux slab allocator and therefore the SPL implementation is preferred. -`spl_kmem_cache_slab_limit` is used to determine the cutoff between a small and large object. - -Objects of `spl_kmem_cache_slab_limit` or smaller will be allocated using the Linux slab allocator, -large objects use the SPL allocator. A cutoff of 16 KiB was determined to be optimal for architectures -using 4 KiB pages. - -| spl_kmem_cache_slab_limit | Notes -|---|--- -| Tags | [memory](#memory) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | bytes -| Range | TBD -| Default | 16,384 (16 KiB) when kernel PAGE_SIZE = 4KiB, 0 for other PAGE_SIZE values -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_max_show_tasks -`spl_max_show_tasks` is the limit of tasks per pending list in each taskq shown in - `/proc/spl/taskq` and `/proc/spl/taskq-all`. -Reading the ProcFS files walks the lists with lock held and it could cause a lock up if the list -grow too large. If the list is larger than the limit, the string `"(truncated)" is printed. - -| spl_max_show_tasks | Notes -|---|--- -| Tags | [taskq](#taskq) -| Kernel module | spl -| When to change | TBD -| Data Type | uint -| Units | tasks reported -| Range | 0 disables the limit, 1 to MAX_UINT -| Default | 512 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_panic_halt -`spl_panic_halt` enables kernel panic upon assertion failures. -When not enabled, the asserting thread is halted to facilitate further debugging. - -| spl_panic_halt | Notes -|---|--- -| Tags | [debug](#debug), [panic](#panic) -| Kernel module | spl -| When to change | when debugging assertions and kernel core dumps are desired -| Data Type | boolean -| Range | 0=halt thread upon assertion, 1=panic kernel upon assertion -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_taskq_kick -Upon writing a non-zero value to `spl_taskq_kick`, all taskqs are scanned. -If any taskq has a pending task more than 5 seconds old, the taskq spawns more threads. -This can be useful in rare deadlock situations caused by one or more taskqs not spawning a thread when it should. 
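A minimal sketch of how this is typically used when taskqs appear to be stuck (paths assume the spl module is loaded):

```
# Look for taskqs with old pending entries; /proc/spl/taskq lists pending
# work per taskq, subject to the spl_max_show_tasks limit described above.
cat /proc/spl/taskq

# Ask every taskq with stale pending tasks to spawn additional threads.
echo 1 > /sys/module/spl/parameters/spl_taskq_kick
```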
- -| spl_taskq_kick | Notes -|---|--- -| Tags | [taskq](#taskq) -| Kernel module | spl -| When to change | See description above -| Data Type | uint -| Units | N/A -| Default | 0 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_taskq_thread_bind -`spl_taskq_thread_bind` enables binding taskq threads to specific CPUs, distributed evenly over the available CPUs. -By default, this behavior is disabled to allow the Linux scheduler the maximum flexibility to determine -where a thread should run. - -| spl_taskq_thread_bind | Notes -|---|--- -| Tags | [CPU](#CPU), [taskq](#taskq) -| Kernel module | spl -| When to change | when debugging CPU scheduling options -| Data Type | boolean -| Range | 0=taskqs are not bound to specific CPUs, 1=taskqs are bound to CPUs -| Default | 0 -| Change | prior to loading spl kernel module -| Versions Affected | v0.7.0 - -### spl_taskq_thread_dynamic -`spl_taskq_thread_dynamic` enables taskqs to set the TASKQ_DYNAMIC flag will by default create only a single thread. -New threads will be created on demand up to a maximum allowed number to facilitate the completion of -outstanding tasks. Threads which are no longer needed are promptly destroyed. - By default this behavior is enabled but it can be d. - -See also [zfs_zil_clean_taskq_nthr_pct](#zfs_zil_clean_taskq_nthr_pct), [zio_taskq_batch_pct](#zio_taskq_batch_pct) - -| spl_taskq_thread_dynamic | Notes -|---|--- -| Tags | [taskq](#taskq) -| Kernel module | spl -| When to change | disable for performance analysis or troubleshooting -| Data Type | boolean -| Range | 0=taskq threads are not dynamic, 1=taskq threads are dynamically created and destroyed -| Default | 1 -| Change | prior to loading spl kernel module -| Versions Affected | v0.7.0 - -### spl_taskq_thread_priority -`spl_taskq_thread_priority` allows newly created taskq threads to set a non-default scheduler priority. -When enabled the priority specified when a taskq is created will be applied -to all threads created by that taskq. -When disabled all threads will use the default Linux kernel thread priority. - -| spl_taskq_thread_priority | Notes -|---|--- -| Tags | [CPU](#CPU), [taskq](#taskq) -| Kernel module | spl -| When to change | when troubleshooting CPU scheduling-related performance issues -| Data Type | boolean -| Range | 0=taskq threads use the default Linux kernel thread priority, 1= -| Default | 1 -| Change | prior to loading spl kernel module -| Versions Affected | v0.7.0 - -### spl_taskq_thread_sequential -`spl_taskq_thread_sequential` is the number of items a taskq worker thread must handle without interruption -before requesting a new worker thread be spawned. `spl_taskq_thread_sequential` controls -how quickly taskqs ramp up the number of threads processing the queue. -Because Linux thread creation and destruction are relatively inexpensive a -small default value has been selected. Thus threads are created aggressively, which is typically desirable. -Increasing this value results in a slower thread creation rate which may be preferable for some configurations. - -| spl_taskq_thread_sequential | Notes -|---|--- -| Tags | [CPU](#CPU), [taskq](#taskq) -| Kernel module | spl -| When to change | TBD -| Data Type | int -| Units | taskq items -| Range | 1 to MAX_INT -| Default | 4 -| Change | Dynamic -| Versions Affected | v0.7.0 - -### spl_kmem_cache_kmem_threads -`spl_kmem_cache_kmem_threads` shows the current number of `spl_kmem_cache` threads. -This task queue is responsible for allocating new slabs for use by the kmem caches. 
-For the majority of systems and workloads only a small number of threads are required.
-
-| spl_kmem_cache_kmem_threads | Notes
-|---|---
-| Tags | [CPU](#CPU), [memory](#memory)
-| Kernel module | spl
-| When to change | read-only
-| Data Type | int
-| Range | 1 to MAX_INT
-| Units | threads
-| Default | 4
-| Change | read-only, can only be changed prior to spl module load
-| Versions Affected | v0.7.0
-
-### spl_kmem_cache_magazine_size
-`spl_kmem_cache_magazine_size` sets the maximum size of the per-CPU kmem cache magazines.
-Cache magazines are an optimization designed to minimize the cost of
-allocating memory. They do this by keeping a per-cpu cache of recently
-freed objects, which can then be reallocated without taking a lock. This
-can improve performance on highly contended caches. However, because
-objects in magazines will prevent otherwise empty slabs from being
-immediately released, this may not be ideal for low memory machines.
-
-For this reason, spl_kmem_cache_magazine_size can be used to set a maximum
-magazine size. When this value is set to 0 the magazine size will be
-automatically determined based on the object size. Otherwise magazines
-will be limited to 2-256 objects per magazine (i.e. per CPU).
-Magazines cannot be disabled entirely in this implementation.
-
-| spl_kmem_cache_magazine_size | Notes
-|---|---
-| Tags | [CPU](#CPU), [memory](#memory)
-| Kernel module | spl
-| When to change |
-| Data Type | int
-| Units | objects
-| Range | 0=automatically scale magazine size, otherwise 2 to 256
-| Default | 0
-| Change | read-only, can only be changed prior to spl module load
-| Versions Affected | v0.7.0
+[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/)
\ No newline at end of file
diff --git a/ZIO-Scheduler.md b/ZIO-Scheduler.md
index 1214483..53f6cfd 100644
--- a/ZIO-Scheduler.md
+++ b/ZIO-Scheduler.md
@@ -1,74 +1,3 @@
-# ZFS I/O (ZIO) Scheduler
-ZFS issues I/O operations to leaf vdevs (usually devices) to satisfy and
-complete I/Os. The ZIO scheduler determines when and in what order those
-operations are issued. Operations are divided into five I/O classes,
-prioritized in the following order:
+This page was moved to: https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZIO%20Scheduler.html
-
-| Priority | I/O Class | Description
-|---|---|---
-| highest | sync read | most reads
-| | sync write | as defined by application or via 'zfs' 'sync' property
-| | async read | prefetch reads
-| | async write | most writes
-| lowest | scrub read | scan read: includes both scrub and resilver
-
-Each queue defines the minimum and maximum number of concurrent operations
-issued to the device. In addition, the device has an aggregate maximum,
-zfs_vdev_max_active. Note that the sum of the per-queue minimums
-must not exceed the aggregate maximum. If the sum of the per-queue
-maximums exceeds the aggregate maximum, then the number of active I/Os
-may reach zfs_vdev_max_active, in which case no further I/Os are issued
-regardless of whether all per-queue minimums have been met.
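One way to sanity-check these limits on a running system is to read the tunables directly; this is only a sketch, and the paths assume the Linux zfs module is loaded. The table below lists the per-class parameters.

```
# Aggregate cap on concurrent I/Os per vdev.
cat /sys/module/zfs/parameters/zfs_vdev_max_active

# Per-class minimums and maximums; the sum of the minimums should stay
# below the aggregate cap.
grep . /sys/module/zfs/parameters/zfs_vdev_*_min_active \
       /sys/module/zfs/parameters/zfs_vdev_*_max_active
```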
-
-| I/O Class | Min Active Parameter | Max Active Parameter
-|---|---|---
-| sync read | zfs_vdev_sync_read_min_active | zfs_vdev_sync_read_max_active
-| sync write | zfs_vdev_sync_write_min_active | zfs_vdev_sync_write_max_active
-| async read | zfs_vdev_async_read_min_active | zfs_vdev_async_read_max_active
-| async write | zfs_vdev_async_write_min_active | zfs_vdev_async_write_max_active
-| scrub read | zfs_vdev_scrub_min_active | zfs_vdev_scrub_max_active
-
-For many physical devices, throughput increases with the number of
-concurrent operations, but latency typically suffers. Further, physical
-devices typically have a limit at which more concurrent operations have no
-effect on throughput or can actually cause performance to decrease.
-
-The ZIO scheduler selects the next operation to issue by first looking for an
-I/O class whose minimum has not been satisfied. Once all are satisfied and
-the aggregate maximum has not been hit, the scheduler looks for classes
-whose maximum has not been satisfied. Iteration through the I/O classes is
-done in the order specified above. No further operations are issued if the
-aggregate maximum number of concurrent operations has been hit or if there
-are no operations queued for an I/O class that has not hit its maximum.
-Every time an I/O is queued or an operation completes, the I/O scheduler
-looks for new operations to issue.
-
-In general, smaller max_active's will lead to lower latency of synchronous
-operations. Larger max_active's may lead to higher overall throughput,
-depending on underlying storage and the I/O mix.
-
-The ratio of the queues' max_actives determines the balance of performance
-between reads, writes, and scrubs. For example, when there is contention,
-increasing zfs_vdev_scrub_max_active will cause the scrub or resilver to
-complete more quickly, but cause reads and writes to have higher latency and
-lower throughput.
-
-All I/O classes have a fixed maximum number of outstanding operations
-except for the async write class. Asynchronous writes represent the data
-that is committed to stable storage during the syncing stage for
-transaction groups (txgs). Transaction groups enter the syncing state
-periodically so the number of queued async writes quickly bursts up
-and then reduces down to zero. The zfs_txg_timeout tunable (default=5 seconds)
-sets the target interval for txg sync. Thus a burst of async writes every
-5 seconds is a normal ZFS I/O pattern.
-
-Rather than servicing I/Os as quickly as possible, the ZIO scheduler changes
-the maximum number of active async write I/Os according to the amount of
-dirty data in the pool. Since both throughput and latency typically increase
-with the number of concurrent operations issued to physical devices, reducing
-the burstiness in the number of concurrent operations also stabilizes the
-response time of operations from other queues. This is particularly important
-for the sync read and write queues, where the periodic async write bursts of
-the txg sync can lead to device-level contention. In broad strokes, the ZIO
-scheduler issues more concurrent operations from the async write queue as
-there's more dirty data in the pool.
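To observe this behavior, including the periodic async write bursts described above, recent releases can report per-class queue depths; a sketch, where `tank` is a placeholder pool name:

```
# Pending and active I/Os per class and per vdev, sampled every 2 seconds.
zpool iostat -q -v tank 2
```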
+[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/_Footer.md b/_Footer.md index a9e61d9..50271ec 100644 --- a/_Footer.md +++ b/_Footer.md @@ -1 +1 @@ -[[Home]] / [[Project and Community]] / [[Developer Resources]] / [[License]] [![Creative Commons License](https://i.creativecommons.org/l/by-sa/3.0/80x15.png)](http://creativecommons.org/licenses/by-sa/3.0/) +[OpenZFS documentation](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/_Sidebar.md b/_Sidebar.md deleted file mode 100644 index 60b1a7a..0000000 --- a/_Sidebar.md +++ /dev/null @@ -1,50 +0,0 @@ -* [[Home]] -* [[Getting Started]] - * [ArchLinux][arch] - * [[Debian]] - * [[Fedora]] - * [FreeBSD][freebsd] - * [Gentoo][gentoo] - * [openSUSE][opensuse] - * [[RHEL and CentOS]] - * [[Ubuntu]] -* [[Project and Community]] - * [[Admin Documentation]] - * [[FAQ]] - * [[Mailing Lists]] - * [Releases][releases] - * [[Signing Keys]] - * [Issue Tracker][issues] - * [Roadmap][roadmap] -* [[Developer Resources]] - * [[Custom Packages]] - * [[Building ZFS]] - * [Buildbot Status][buildbot-status] - * [Buildbot Issue Tracking][known-zts-failures] - * [Buildbot Options][control-buildbot] - * [OpenZFS Tracking][openzfs-tracking] - * [[OpenZFS Patches]] - * [[OpenZFS Exceptions]] - * [OpenZFS Documentation][openzfs-devel] - * [[Git and GitHub for beginners]] -* Performance and Tuning - * [[ZFS on Linux Module Parameters]] - * [ZFS Transaction Delay and Write Throttle][ZFS-Transaction-Delay] - * [[ZIO Scheduler]] - * [[Checksums]] - * [Asynchronous Writes][Async-Write] - -[arch]: https://wiki.archlinux.org/index.php/ZFS -[gentoo]: https://wiki.gentoo.org/wiki/ZFS -[freebsd]: https://zfsonfreebsd.github.io/ZoF/ -[opensuse]: https://software.opensuse.org/package/zfs -[releases]: https://github.com/zfsonlinux/zfs/releases -[issues]: https://github.com/zfsonlinux/zfs/issues -[roadmap]: https://github.com/zfsonlinux/zfs/milestones -[openzfs-devel]: http://open-zfs.org/wiki/Developer_resources -[openzfs-tracking]: http://build.zfsonlinux.org/openzfs-tracking.html -[buildbot-status]: http://build.zfsonlinux.org/tgrid?length=100&branch=master&category=Platforms&rev_order=desc -[control-buildbot]: https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options -[known-zts-failures]: http://build.zfsonlinux.org/known-issues.html -[ZFS-Transaction-Delay]: https://github.com/zfsonlinux/zfs/wiki/ZFS-Transaction-Delay -[Async-Write]: https://github.com/zfsonlinux/zfs/wiki/Async-Write diff --git a/dRAID-HOWTO.md b/dRAID-HOWTO.md index 1289728..207834d 100644 --- a/dRAID-HOWTO.md +++ b/dRAID-HOWTO.md @@ -1,289 +1,3 @@ -# Introduction +This page was moved to: https://openzfs.github.io/openzfs-docs/Basics%20concepts/dRAID%20Howto.html -## raidz vs draid - -ZFS users are most likely very familiar with raidz already, so a comparison with draid would help. The illustrations below are simplified, but sufficient for the purpose of a comparison. For example, 31 drives can be configured as a zpool of 6 raidz1 vdevs and a hot spare: -![raidz1](https://cloud.githubusercontent.com/assets/6722662/23642396/9790e432-02b7-11e7-8198-ae9f17c61d85.png) - -As shown above, if drive 0 fails and is replaced by the hot spare, only 5 out of the 30 surviving drives will work to resilver: drives 1-4 read, and drive 30 writes. - -The same 30 drives can be configured as 1 draid1 vdev of the same level of redundancy (i.e. 
single parity, 1/4 parity ratio) and single spare capacity:
-![draid1](https://cloud.githubusercontent.com/assets/6722662/23642395/9783ef8e-02b7-11e7-8d7e-31d1053ee4ff.png)
-
-The drives are shuffled in a way that, after drive 0 fails, all 30 surviving drives will work together to restore the lost data/parity:
-* All 30 drives read, because unlike the raidz1 configuration shown above, in the draid1 configuration the neighbor drives of the failed drive 0 (i.e. drives in the same data+parity group) are not fixed.
-* All 30 drives write, because now there is no dedicated spare drive. Instead, spare blocks come from all drives.
-
-To summarize:
-* Normal application IO: draid and raidz are very similar. There's a slight advantage in draid, since there's no dedicated spare drive which is idle when not in use.
-* Restore lost data/parity: for raidz, not all surviving drives will work to rebuild, and in addition it's bounded by the write throughput of a single replacement drive. For draid, the rebuild speed will scale with the total number of drives because all surviving drives will work to rebuild.
-
-The dRAID vdev must shuffle its child drives in a way that, regardless of which drive has failed, the rebuild IO (both read and write) will distribute evenly among all surviving drives, so the rebuild speed will scale. The exact mechanism used by the dRAID vdev driver is beyond the scope of this simple introduction. If interested, please refer to the recommended readings in the next section.
-
-## Recommended Reading
-
-Parity declustering (the fancy term for shuffling drives) has been an active research topic, and many papers have been published in this area. The [Permutation Development Data Layout](http://www.cse.scu.edu/~tschwarz/TechReports/hpca.pdf) is a good paper to begin with. The dRAID vdev driver uses a shuffling algorithm loosely based on the mechanism described in this paper.
-
-# Using dRAID
-
-First get the code [here](https://github.com/openzfs/zfs/pull/10102), build zfs with _configure --enable-debug_, and install. Then load the zfs kernel module with the following options, which help dRAID rebuild performance:
-
-* zfs_vdev_scrub_max_active=10
-* zfs_vdev_async_write_min_active=4
-
-## Create a dRAID vdev
-
-Similar to a raidz vdev, a dRAID vdev can be created using the `zpool create` command:
-
-```
-# zpool create <pool> draid[1,2,3] <vdevs...>
-```
-
-Unlike raidz, additional options may be provided as part of the `draid` vdev type to specify an exact dRAID layout. When unspecified, reasonable defaults will be chosen.
-
-```
-# zpool create <pool> draid[1,2,3][:<groups>g][:<spares>s][:<data>d][:<iterations>] <vdevs...>
-```
-
- * groups - Number of redundancy groups (default: 1 group per 12 vdevs)
 * spares - Number of distributed hot spares (default: 1)
 * data - Number of data devices per group (default: determined by number of groups)
 * iterations - Number of iterations to perform generating a valid dRAID mapping (default 3).
-
-_Notes_:
-* The default values are not set in stone and may change.
-* For the majority of common configurations we intend to provide pre-computed balanced dRAID mappings.
-* When _data_ is specified, then (draid_children - spares) % (parity + data) must equal 0, otherwise the pool creation will fail.
-
-Now the dRAID vdev is online and ready for IO:
-
-```
- pool: tank
- state: ONLINE
-config:
-
- NAME STATE READ WRITE CKSUM
- tank ONLINE 0 0 0
- draid2:4g:2s-0 ONLINE 0 0 0
- L0 ONLINE 0 0 0
- L1 ONLINE 0 0 0
- L2 ONLINE 0 0 0
- L3 ONLINE 0 0 0
- ...
- L50 ONLINE 0 0 0 - L51 ONLINE 0 0 0 - L52 ONLINE 0 0 0 - spares - s0-draid2:4g:2s-0 AVAIL - s1-draid2:4g:2s-0 AVAIL - -errors: No known data errors -``` - -There are two logical hot spare vdevs shown above at the bottom: -* The names begin with a `s-` followed by the name of the parent dRAID vdev. -* These hot spares are logical, made from reserved blocks on all the 53 child drives of the dRAID vdev. -* Unlike traditional hot spares, the distributed spare can only replace a drive in its parent dRAID vdev. - -The dRAID vdev behaves just like a raidz vdev of the same parity level. You can do IO to/from it, scrub it, fail a child drive and it'd operate in degraded mode. - -## Rebuild to distributed spare - -When there's a failed/offline child drive, the dRAID vdev supports a completely new mechanism to reconstruct lost data/parity, in addition to the resilver. First of all, resilver is still supported - if a failed drive is replaced by another physical drive, the resilver process is used to reconstruct lost data/parity to the new replacement drive, which is the same as a resilver in a raidz vdev. - -But if a child drive is replaced with a distributed spare, a new process called rebuild is used instead of resilver: -``` -# zpool offline tank sdo -# zpool replace tank sdo '%draid1-0-s0' -# zpool status - pool: tank - state: DEGRADED -status: One or more devices has been taken offline by the administrator. - Sufficient replicas exist for the pool to continue functioning in a - degraded state. -action: Online the device using 'zpool online' or replace the device with - 'zpool replace'. - scan: rebuilt 2.00G in 0h0m5s with 0 errors on Fri Feb 24 20:37:06 2017 -config: - - NAME STATE READ WRITE CKSUM - tank DEGRADED 0 0 0 - draid1-0 DEGRADED 0 0 0 - sdd ONLINE 0 0 0 - sde ONLINE 0 0 0 - sdf ONLINE 0 0 0 - sdg ONLINE 0 0 0 - sdh ONLINE 0 0 0 - sdu ONLINE 0 0 0 - sdj ONLINE 0 0 0 - sdv ONLINE 0 0 0 - sdl ONLINE 0 0 0 - sdm ONLINE 0 0 0 - sdn ONLINE 0 0 0 - spare-11 DEGRADED 0 0 0 - sdo OFFLINE 0 0 0 - %draid1-0-s0 ONLINE 0 0 0 - sdp ONLINE 0 0 0 - sdq ONLINE 0 0 0 - sdr ONLINE 0 0 0 - sds ONLINE 0 0 0 - sdt ONLINE 0 0 0 - spares - %draid1-0-s0 INUSE currently in use - %draid1-0-s1 AVAIL -``` - -The scan status line of the _zpool status_ output now says _"rebuilt"_ instead of _"resilvered"_, because the lost data/parity was rebuilt to the distributed spare by a brand new process called _"rebuild"_. The main differences from _resilver_ are: -* The rebuild process does not scan the whole block pointer tree. Instead, it only scans the spacemap objects. -* The IO from rebuild is sequential, because it rebuilds metaslabs one by one in sequential order. -* The rebuild process is not limited to block boundaries. For example, if 10 64K blocks are allocated contiguously, then rebuild will fix 640K at one time. So rebuild process will generate larger IOs than resilver. -* For all the benefits above, there is one price to pay. The rebuild process cannot verify block checksums, since it doesn't have block pointers. -* Moreover, the rebuild process requires support from on-disk format, and **only** works on draid and mirror vdevs. Resilver, on the other hand, works with any vdev (including draid). - -Although rebuild process creates larger IOs, the drives will not necessarily see large IO requests. The block device queue parameter _/sys/block/*/queue/max_sectors_kb_ must be tuned accordingly. However, since the rebuild IO is already sequential, the benefits of enabling larger IO requests might be marginal. 
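If you do want to experiment with larger requests during a rebuild, a hedged sketch follows; replace `sdX` with a member disk, and note that 256 KiB is an arbitrary example bounded by the hardware limit:

```
# Hardware ceiling and current per-request size limit, in KiB.
cat /sys/block/sdX/queue/max_hw_sectors_kb
cat /sys/block/sdX/queue/max_sectors_kb

# Raise the limit for this disk (not persistent across reboots).
echo 256 > /sys/block/sdX/queue/max_sectors_kb
```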
- -At this point, redundancy has been fully restored without adding any new drive to the pool. If another drive is offlined, the pool is still able to do IO: -``` -# zpool offline tank sdj -# zpool status - state: DEGRADED -status: One or more devices has been taken offline by the administrator. - Sufficient replicas exist for the pool to continue functioning in a - degraded state. -action: Online the device using 'zpool online' or replace the device with - 'zpool replace'. - scan: rebuilt 2.00G in 0h0m5s with 0 errors on Fri Feb 24 20:37:06 2017 -config: - - NAME STATE READ WRITE CKSUM - tank DEGRADED 0 0 0 - draid1-0 DEGRADED 0 0 0 - sdd ONLINE 0 0 0 - sde ONLINE 0 0 0 - sdf ONLINE 0 0 0 - sdg ONLINE 0 0 0 - sdh ONLINE 0 0 0 - sdu ONLINE 0 0 0 - sdj OFFLINE 0 0 0 - sdv ONLINE 0 0 0 - sdl ONLINE 0 0 0 - sdm ONLINE 0 0 0 - sdn ONLINE 0 0 0 - spare-11 DEGRADED 0 0 0 - sdo OFFLINE 0 0 0 - %draid1-0-s0 ONLINE 0 0 0 - sdp ONLINE 0 0 0 - sdq ONLINE 0 0 0 - sdr ONLINE 0 0 0 - sds ONLINE 0 0 0 - sdt ONLINE 0 0 0 - spares - %draid1-0-s0 INUSE currently in use - %draid1-0-s1 AVAIL -``` - -As shown above, the _draid1-0_ vdev is still in _DEGRADED_ mode although two child drives have failed and it's only single-parity. Since the _%draid1-0-s1_ is still _AVAIL_, full redundancy can be restored by replacing _sdj_ with it, without adding new drive to the pool: -``` -# zpool replace tank sdj '%draid1-0-s1' -# zpool status - state: DEGRADED -status: One or more devices has been taken offline by the administrator. - Sufficient replicas exist for the pool to continue functioning in a - degraded state. -action: Online the device using 'zpool online' or replace the device with - 'zpool replace'. - scan: rebuilt 2.13G in 0h0m5s with 0 errors on Fri Feb 24 23:20:59 2017 -config: - - NAME STATE READ WRITE CKSUM - tank DEGRADED 0 0 0 - draid1-0 DEGRADED 0 0 0 - sdd ONLINE 0 0 0 - sde ONLINE 0 0 0 - sdf ONLINE 0 0 0 - sdg ONLINE 0 0 0 - sdh ONLINE 0 0 0 - sdu ONLINE 0 0 0 - spare-6 DEGRADED 0 0 0 - sdj OFFLINE 0 0 0 - %draid1-0-s1 ONLINE 0 0 0 - sdv ONLINE 0 0 0 - sdl ONLINE 0 0 0 - sdm ONLINE 0 0 0 - sdn ONLINE 0 0 0 - spare-11 DEGRADED 0 0 0 - sdo OFFLINE 0 0 0 - %draid1-0-s0 ONLINE 0 0 0 - sdp ONLINE 0 0 0 - sdq ONLINE 0 0 0 - sdr ONLINE 0 0 0 - sds ONLINE 0 0 0 - sdt ONLINE 0 0 0 - spares - %draid1-0-s0 INUSE currently in use - %draid1-0-s1 INUSE currently in use -``` - -Again, full redundancy has been restored without adding any new drive. If another drive fails, the pool will still be able to handle IO, but there'd be no more distributed spare to rebuild (both are in _INUSE_ state now). At this point, there's no urgency to add a new replacement drive because the pool can survive yet another drive failure. - -### Rebuild for mirror vdev - -The sequential rebuild process also works for the mirror vdev, when a drive is attached to a mirror or a mirror child vdev is replaced. - -By default, rebuild for mirror vdev is turned off. It can be turned on using the zfs module option _spa_rebuild_mirror=1_. - -### Rebuild throttling - -The rebuild process may delay _zio_ by _spa_vdev_scan_delay_ if the draid vdev has seen any important IO in the recent _spa_vdev_scan_idle_ period. But when a dRAID vdev has lost all redundancy, e.g. a draid2 with 2 faulted child drives, the rebuild process will go full speed by ignoring _spa_vdev_scan_delay_ and _spa_vdev_scan_idle_ altogether because the vdev is now in critical state. 
- -After delaying, the rebuild zio is issued using priority _ZIO_PRIORITY_SCRUB_ for reads and _ZIO_PRIORITY_ASYNC_WRITE_ for writes. Therefore the options that control the queuing of these two IO priorities will affect rebuild _zio_ as well, for example _zfs_vdev_scrub_min_active_, _zfs_vdev_scrub_max_active_, _zfs_vdev_async_write_min_active_, and _zfs_vdev_async_write_max_active_. - -## Rebalance - -Distributed spare space can be made available again by simply replacing any failed drive with a new drive. This process is called _rebalance_ which is essentially a _resilver_: -``` -# zpool replace -f tank sdo sdw -# zpool status - state: DEGRADED -status: One or more devices has been taken offline by the administrator. - Sufficient replicas exist for the pool to continue functioning in a - degraded state. -action: Online the device using 'zpool online' or replace the device with - 'zpool replace'. - scan: resilvered 2.21G in 0h0m58s with 0 errors on Fri Feb 24 23:31:45 2017 -config: - - NAME STATE READ WRITE CKSUM - tank DEGRADED 0 0 0 - draid1-0 DEGRADED 0 0 0 - sdd ONLINE 0 0 0 - sde ONLINE 0 0 0 - sdf ONLINE 0 0 0 - sdg ONLINE 0 0 0 - sdh ONLINE 0 0 0 - sdu ONLINE 0 0 0 - spare-6 DEGRADED 0 0 0 - sdj OFFLINE 0 0 0 - %draid1-0-s1 ONLINE 0 0 0 - sdv ONLINE 0 0 0 - sdl ONLINE 0 0 0 - sdm ONLINE 0 0 0 - sdn ONLINE 0 0 0 - sdw ONLINE 0 0 0 - sdp ONLINE 0 0 0 - sdq ONLINE 0 0 0 - sdr ONLINE 0 0 0 - sds ONLINE 0 0 0 - sdt ONLINE 0 0 0 - spares - %draid1-0-s0 AVAIL - %draid1-0-s1 INUSE currently in use -``` - -Note that the scan status now says _"resilvered"_. Also, the state of _%draid1-0-s0_ has become _AVAIL_ again. Since the resilver process checks block checksums, it makes up for the lack of checksum verification during previous rebuild. - -The dRAID1 vdev in this example shuffles three (4 data + 1 parity) redundancy groups to the 17 drives. For any single drive failure, only about 1/3 of the blocks are affected (and should be resilvered/rebuilt). The rebuild process is able to avoid unnecessary work, but the resilver process by default will not. The rebalance (which is essentially resilver) can speed up a lot by setting module option _zfs_no_resilver_skip_ to 0. This feature is turned off by default because of issue https://github.com/zfsonlinux/zfs/issues/5806. - -# Troubleshooting - -Please report bugs to [the dRAID PR](https://github.com/zfsonlinux/zfs/pull/10102), as long as the code is not merged upstream. \ No newline at end of file +[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/) \ No newline at end of file diff --git a/hole_birth-FAQ.md b/hole_birth-FAQ.md index 7159d5a..48e20e3 100644 --- a/hole_birth-FAQ.md +++ b/hole_birth-FAQ.md @@ -1,25 +1,4 @@ -### Short explanation -The hole_birth feature has/had bugs, the result of which is that, if you do a `zfs send -i` (or `-R`, since it uses `-i`) from an affected dataset, the receiver will not see any checksum or other errors, but the resulting destination snapshot will not match the source. -ZoL versions 0.6.5.8 and 0.7.0-rc1 (and above) default to ignoring the faulty metadata which causes this issue *on the sender side*. +This page was moved to: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ%20hole%20birth.html -### FAQ - -#### I have a pool with hole_birth enabled, how do I know if I am affected? 
-It is technically possible to calculate whether you have any affected files, but it requires scraping zdb output for each file in each snapshot in each dataset, which is a combinatoric nightmare. (If you really want it, there is a proof of concept [here](https://github.com/rincebrain/hole_birth_test).)
-
-#### Is there any less painful way to fix this if we have already received an affected snapshot?
-No, the data you need was simply not present in the send stream, unfortunately, and cannot feasibly be rewritten in place.
-
-### Long explanation
-hole_birth is a feature to speed up `zfs send -i`: in particular, ZFS previously did not store metadata on when "holes" (sparse regions) in files were created, so every `zfs send -i` needed to include every hole.
-
-hole_birth, as the name implies, added tracking of the txg (transaction group) in which a hole was created, so that `zfs send -i` could only send holes that had a birth_time between (starting snapshot txg) and (ending snapshot txg), and life was wonderful.
-
-Unfortunately, hole_birth had a number of edge cases where it could "forget" to set the birth_time of holes, causing it to record the birth_time as 0 (the value used prior to hole_birth, and essentially equivalent to "since file creation").
-
-This meant that, when you did a `zfs send -i`, since `zfs send` does not have any knowledge of the surrounding snapshots when sending a given snapshot, it would see the creation txg as 0, conclude "oh, it is 0, I must have already sent this before", and not include it.
-
-This means that, on the receiving side, it does not know those holes should exist, and does not create them. This leads to differences between the source and the destination.
-
-ZoL versions 0.6.5.8 and 0.7.0-rc1 (and above) default to ignoring this metadata and always sending holes with birth_time 0, configurable using the tunable known as `ignore_hole_birth` or `send_holes_without_birth_time`. The latter is what OpenZFS standardized on. ZoL version 0.6.5.8 only has the former, but for any ZoL version with `send_holes_without_birth_time`, they point to the same value, so changing either will work.
+[Go to OpenZFS documentation.](https://openzfs.github.io/openzfs-docs/)
\ No newline at end of file
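For reference, on releases that expose the newer tunable it can be inspected and changed at runtime; a sketch, with the path assuming the Linux zfs module:

```
# 1 (the default on fixed releases) always sends holes with birth_time 0,
# sidestepping the bug described above.
cat /sys/module/zfs/parameters/send_holes_without_birth_time

# Set to 0 only if you are confident the pool never hit the buggy code
# paths and you want the hole_birth optimization back.
echo 0 > /sys/module/zfs/parameters/send_holes_without_birth_time
```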