OpenZFS on Linux and FreeBSD
Go to file
Jason Lee 3070faa798 ZFS Interface for Accelerators (Z.I.A.)
The ZIO pipeline has been modified to allow for external,
alternative implementations of existing operations to be
used. The original ZFS functions remain in the code as
fallback in case the external implementation fails.

Definitions:
    Accelerator - an entity (usually hardware) that is
                  intended to accelerate operations
    Offloader   - synonym of accelerator; used interchangeably
    Data Processing Unit Services Module (DPUSM)
                - https://github.com/hpc/dpusm
                - defines a "provider API" for accelerator
                  vendors to set up
                - defines a "user API" for accelerator consumers
                  to call
                - maintains list of providers and coordinates
                  interactions between providers and consumers.
    Provider    - a DPUSM wrapper for an accelerator's API
    Offload     - moving data from ZFS/memory to the accelerator
    Onload      - the opposite of offload

In order for Z.I.A. to be extensible, it does not directly
communicate with a fixed accelerator. Rather, Z.I.A. acquires
a handle to a DPUSM, which is then used to acquire handles
to providers.

Using ZFS with Z.I.A.:
    1. Build and start the DPUSM
    2. Implement, build, and register a provider with the DPUSM
    3. Reconfigure ZFS with '--with-zia=<DPUSM root>'
    4. Rebuild and start ZFS
    5. Create a zpool
    6. Select the provider
           zpool set zia_provider=<provider name> <zpool>
    7. Select operations to offload
           zpool set zia_<property>=on <zpool>

The operations that have been modified are:
    - compression
        - non-raw-writes only
    - decompression
    - checksum
        - not handling embedded checksums
        - checksum compute and checksum error call the same function
    - raidz
        - generation
        - reconstruction
    - vdev_file
        - open
        - write
        - close
    - vdev_disk
        - open
        - invalidate
        - write
        - flush
        - close

Successful operations do not bring data back into memory after
they complete, allowing for subsequent offloader operations
reuse the data. This results in only one data movement per ZIO
at the beginning of a pipeline that is necessary for getting
data from ZFS to the accelerator.

When errors ocurr and the offloaded data is still accessible,
the offloaded data will be onloaded (or dropped if it still
matches the in-memory copy) for that ZIO pipeline stage and
processed with ZFS. This will cause thrashing if a later
operation offloads data. This should not happen often, as
constant errors (resulting in data movement) is not expected
to be the norm.

Unrecoverable errors such as hardware failures will trigger
pipeline restarts (if necessary) in order to complete the
original ZIO using the software path.

The modifications to ZFS can be thought of as two sets of changes:
    - The ZIO write pipeline
        - compression, checksum, RAIDZ generation, and write
        - Each stage starts by offloading data that was not
          previously offloaded
            - This allows for ZIOs to be offloaded at any point
              in the pipeline
    - Resilver
        - vdev_raidz_io_done (RAIDZ reconstruction, checksum, and
          RAIDZ generation), and write
        - Because the core of resilver is vdev_raidz_io_done, data
          is only offloaded once at the beginning of
          vdev_raidz_io_done
            - Errors cause data to be onloaded, but will not
              re-offload in subsequent steps within resilver
            - Write is a separate ZIO pipeline stage, so it will
              attempt to offload data

The zio_decompress function has been modified to allow for
offloading but the ZIO read pipeline as a whole has not, so it
is not part of the above list.

An example provider implementation can be found in
module/zia-software-provider
    - The provider's "hardware" is actually software - data is
      "offloaded" to memory not owned by ZFS
    - Calls ZFS functions in order to not reimplement operations
    - Has kernel module parameters that can be used to trigger
      ZIA_ACCELERATOR_DOWN states for testing pipeline restarts.

abd_t, raidz_row_t, and vdev_t have each been given an additional
"void *<prefix>_zia_handle" member. These opaque handles point to
data that is located on an offloader. abds are still allocated,
but their payloads are expected to diverge from the offloaded copy
as operations are run.

Encryption and deduplication are disabled for zpools with Z.I.A.
operations enabled

Aggregation is disabled for offloaded abds

RPMs will build with Z.I.A.

Signed-off-by: Jason Lee <jasonlee@lanl.gov>
2024-09-10 15:14:15 +00:00
.github Github workflow: fix typo in `zloop` artifact 2024-08-09 16:49:19 -07:00
cmd spa_prop_get: require caller to supply output nvlist 2024-09-06 08:45:58 -07:00
config ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
contrib Add DDT prune command 2024-09-04 14:17:02 -07:00
etc etc/init.d: decide which variant to use at build time. 2024-04-08 16:52:24 -07:00
include ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
lib ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
man ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
module ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
rpm ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
scripts disable automatic dependency tracking for dkms builds 2024-06-13 18:08:49 -07:00
tests ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
udev udev: correctly handle partition #16 and later 2024-03-21 16:38:24 -07:00
.cirrus.yml CI: add FreeBSD build with Cirrus CI 2023-10-06 08:50:26 -07:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Packaging: Auto-generate changelog during configure (#15528) 2023-11-16 08:58:47 -08:00
.gitmodules .gitmodules: link to openzfs github repository 2021-04-12 09:37:23 -07:00
.mailmap AUTHORS: refresh with recent new contributors (#16362) 2024-07-23 11:47:04 -07:00
AUTHORS AUTHORS: refresh with recent new contributors (#16362) 2024-07-23 11:47:04 -07:00
CODE_OF_CONDUCT.md Documentation corrections 2022-12-22 11:34:28 -08:00
COPYRIGHT Fix typos 2020-06-09 21:24:09 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
META Linux 6.10 compat: META 2024-08-21 17:38:06 -07:00
Makefile.am ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
NEWS Fix NEWS file 2020-08-26 21:44:41 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md FreeBSD: remove support for FreeBSD < 13.0-RELEASE (#16372) 2024-08-05 16:56:45 -07:00
RELEASES.md Add RELEASES.md file 2021-04-02 16:33:40 -07:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
autogen.sh Ubuntu 22.04 integration: ShellCheck 2022-11-18 11:24:48 -08:00
configure.ac Packaging: Auto-generate changelog during configure (#15528) 2023-11-16 08:58:47 -08:00
copy-builtin copy-builtin: add hooks with sed/>> 2022-05-10 10:17:43 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

README.md

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

Installation

Full documentation for installing OpenZFS on your favorite operating system can be found at the Getting Started Page.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.
  • Supported FreeBSD versions are any supported branches and releases starting from 13.0-RELEASE.