zfs/include/sys
Jason Lee 3070faa798 ZFS Interface for Accelerators (Z.I.A.)
The ZIO pipeline has been modified to allow for external,
alternative implementations of existing operations to be
used. The original ZFS functions remain in the code as
fallback in case the external implementation fails.

Definitions:
    Accelerator - an entity (usually hardware) that is
                  intended to accelerate operations
    Offloader   - synonym of accelerator; used interchangeably
    Data Processing Unit Services Module (DPUSM)
                - https://github.com/hpc/dpusm
                - defines a "provider API" for accelerator
                  vendors to set up
                - defines a "user API" for accelerator consumers
                  to call
                - maintains list of providers and coordinates
                  interactions between providers and consumers.
    Provider    - a DPUSM wrapper for an accelerator's API
    Offload     - moving data from ZFS/memory to the accelerator
    Onload      - the opposite of offload

In order for Z.I.A. to be extensible, it does not directly
communicate with a fixed accelerator. Rather, Z.I.A. acquires
a handle to a DPUSM, which is then used to acquire handles
to providers.

Using ZFS with Z.I.A.:
    1. Build and start the DPUSM
    2. Implement, build, and register a provider with the DPUSM
    3. Reconfigure ZFS with '--with-zia=<DPUSM root>'
    4. Rebuild and start ZFS
    5. Create a zpool
    6. Select the provider
           zpool set zia_provider=<provider name> <zpool>
    7. Select operations to offload
           zpool set zia_<property>=on <zpool>

The operations that have been modified are:
    - compression
        - non-raw-writes only
    - decompression
    - checksum
        - not handling embedded checksums
        - checksum compute and checksum error call the same function
    - raidz
        - generation
        - reconstruction
    - vdev_file
        - open
        - write
        - close
    - vdev_disk
        - open
        - invalidate
        - write
        - flush
        - close

Successful operations do not bring data back into memory after
they complete, allowing for subsequent offloader operations
reuse the data. This results in only one data movement per ZIO
at the beginning of a pipeline that is necessary for getting
data from ZFS to the accelerator.

When errors ocurr and the offloaded data is still accessible,
the offloaded data will be onloaded (or dropped if it still
matches the in-memory copy) for that ZIO pipeline stage and
processed with ZFS. This will cause thrashing if a later
operation offloads data. This should not happen often, as
constant errors (resulting in data movement) is not expected
to be the norm.

Unrecoverable errors such as hardware failures will trigger
pipeline restarts (if necessary) in order to complete the
original ZIO using the software path.

The modifications to ZFS can be thought of as two sets of changes:
    - The ZIO write pipeline
        - compression, checksum, RAIDZ generation, and write
        - Each stage starts by offloading data that was not
          previously offloaded
            - This allows for ZIOs to be offloaded at any point
              in the pipeline
    - Resilver
        - vdev_raidz_io_done (RAIDZ reconstruction, checksum, and
          RAIDZ generation), and write
        - Because the core of resilver is vdev_raidz_io_done, data
          is only offloaded once at the beginning of
          vdev_raidz_io_done
            - Errors cause data to be onloaded, but will not
              re-offload in subsequent steps within resilver
            - Write is a separate ZIO pipeline stage, so it will
              attempt to offload data

The zio_decompress function has been modified to allow for
offloading but the ZIO read pipeline as a whole has not, so it
is not part of the above list.

An example provider implementation can be found in
module/zia-software-provider
    - The provider's "hardware" is actually software - data is
      "offloaded" to memory not owned by ZFS
    - Calls ZFS functions in order to not reimplement operations
    - Has kernel module parameters that can be used to trigger
      ZIA_ACCELERATOR_DOWN states for testing pipeline restarts.

abd_t, raidz_row_t, and vdev_t have each been given an additional
"void *<prefix>_zia_handle" member. These opaque handles point to
data that is located on an offloader. abds are still allocated,
but their payloads are expected to diverge from the offloaded copy
as operations are run.

Encryption and deduplication are disabled for zpools with Z.I.A.
operations enabled

Aggregation is disabled for offloaded abds

RPMs will build with Z.I.A.

Signed-off-by: Jason Lee <jasonlee@lanl.gov>
2024-09-10 15:14:15 +00:00
..
crypto icp: remove skein module 2024-05-31 15:13:39 -07:00
fm Add slow disk diagnosis to ZED 2024-02-08 09:19:52 -08:00
fs ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
lua autoconf: single-step includes 2022-05-10 10:18:51 -07:00
sysevent Teach zpool scrub to scrub only blocks in error log 2023-05-18 11:59:42 -07:00
zstd compress: change compression providers API to use ABDs 2024-08-22 16:22:24 -07:00
abd.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
abd_impl.h abd_os: break out platform-specific header parts 2024-08-21 13:37:18 -07:00
aggsum.h More aggsum optimizations 2021-06-07 09:02:47 -07:00
arc.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
arc_impl.h Several improvements to ARC shrinking (#16197) 2024-07-25 10:31:14 -07:00
asm_linkage.h Unify Assembler files between Linux and Windows 2023-01-17 11:09:19 -08:00
avl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
avl_impl.h Make sure avl_tree.avl_pad is not in kernel module (#16280) 2024-07-17 13:54:11 -07:00
bitmap.h Replace dead opensolaris.org license links 2023-03-14 14:44:01 -07:00
bitops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
blake3.h Update BLAKE3 for using the new impl handling 2023-03-02 13:52:27 -08:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bpobj.h Add explicit prefetches to bpobj_iterate(). 2023-07-21 11:50:48 -07:00
bptree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bqueue.h Batch enqueue/dequeue for bqueue 2023-01-10 13:39:22 -08:00
brt.h zdb: include cloned blocks in block statistics 2023-08-01 08:56:30 -07:00
brt_impl.h brt: lift internal definitions into _impl header 2023-11-27 13:34:43 -08:00
btree.h btree: Implement faster binary search algorithm 2023-05-26 10:03:12 -07:00
dataset_kstats.h Update the kstat dataset_name when renaming a zvol 2023-11-07 11:34:50 -08:00
dbuf.h Skip dnode handles use when not needed 2024-07-29 14:48:12 -07:00
ddt.h Add DDT prune command 2024-09-04 14:17:02 -07:00
ddt_impl.h Add DDT prune command 2024-09-04 14:17:02 -07:00
dmu.h ddt: dedup log 2024-08-16 12:03:35 -07:00
dmu_impl.h ZIL: Avoid dbuf_read() before dmu_sync(). 2023-08-11 09:04:08 -07:00
dmu_objset.h Add prefetch property 2023-10-24 11:00:07 -07:00
dmu_recv.h nvpair: Constify string functions 2023-03-14 15:25:50 -07:00
dmu_redact.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_send.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_traverse.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_tx.h Add dmu_tx_hold_append() interface 2023-05-09 09:03:10 -07:00
dmu_zfetch.h Speculative prefetch for reordered requests 2024-04-08 15:13:27 -07:00
dnode.h dnode: allow storage class to be overridden by object type 2024-07-29 17:05:41 -07:00
dsl_bookmark.h Increase limit of redaction list by using spill block 2023-08-26 11:34:43 -07:00
dsl_crypt.h Allow block cloning across encrypted datasets 2023-12-05 11:03:48 -08:00
dsl_dataset.h Revert zfeature_active() to static 2023-02-28 14:03:52 -08:00
dsl_deadlist.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_deleg.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_destroy.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_dir.h Cleanup ->dd_space_towrite should be unsigned 2023-01-20 11:10:15 -08:00
dsl_pool.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
dsl_prop.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_scan.h ddt: add "flat phys" feature 2024-08-16 12:02:39 -07:00
dsl_synctask.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_userhold.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
edonr.h Remove unused constant EdonR256_BLOCK_BITSIZE 2023-03-22 08:39:48 -07:00
efi_partition.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
frame.h Linux 5.10 compat: frame.h renamed objtool.h 2020-11-02 22:01:10 +00:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
metaslab.h Selectable block allocators 2023-09-01 18:00:30 -07:00
metaslab_impl.h Reduce number of metaslab preload taskq threads. 2023-10-06 09:04:00 -07:00
mmp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
mntent.h Expose ZFS dataset case sensitivity setting via sb_opts 2022-07-14 10:38:16 -07:00
mod.h linux: module: weld all but spl.ko into zfs.ko 2022-04-20 13:28:24 -07:00
multilist.h L2ARC: Relax locking during write 2024-04-09 16:23:19 -07:00
nvpair.h nvpair: Use flexible array member for nvpair name strings 2023-03-14 15:25:55 -07:00
nvpair_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
objlist.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
range_tree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
rrwlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa_impl.h Fix sa_add_projid to lookup and update SA_ZPL_DXATTR (avoid DXATTR loss) (#16288) 2024-07-31 18:41:49 -07:00
sha2.h icp: remove unused SHA2 HMAC mechanisms 2024-05-31 15:13:30 -07:00
skein.h icp: remove digest entry points 2024-05-31 15:13:16 -07:00
spa.h spa_prop_get: require caller to supply output nvlist 2024-09-06 08:45:58 -07:00
spa_checkpoint.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_checksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_impl.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
spa_log_spacemap.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_map.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_reftree.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sysevent.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
txg.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
txg_impl.h Properly pad struct tx_cpu to cache line 2023-10-20 11:54:05 -07:00
u8_textprep.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
u8_textprep_data.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uberblock_impl.h vdev probe to slow disk can stall mmp write checker 2024-04-29 14:35:53 -07:00
uio_impl.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
unique.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
uuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev.h RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
vdev_disk.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
vdev_draid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_file.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
vdev_impl.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
vdev_indirect_births.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_initialize.h Add the ability to uninitialize 2023-05-18 10:02:20 -07:00
vdev_raidz.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
vdev_raidz_impl.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
vdev_rebuild.h Do not report bytes skipped by scan as issued. 2023-06-30 08:47:13 -07:00
vdev_removal.h Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
vdev_trim.h Fix short-lived txg caused by autotrim 2023-03-28 08:43:41 -07:00
xvattr.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zap.h ddt: add support for prefetching tables into the ARC 2024-07-26 09:16:18 -07:00
zap_impl.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zap_leaf.h zap_leaf: make l_hash[] variable length to silence UBSAN 2024-04-03 16:38:18 -07:00
zcp.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_set.h Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zfeature.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_acl.h Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
zfs_bootenv.h zfs label bootenv should store data as nvlist 2020-09-15 15:42:27 -07:00
zfs_chksum.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_context.h Linux: Report reclaimable memory to kernel as such (#16385) 2024-07-30 11:40:47 -07:00
zfs_debug.h zdb/ztest: send dbgmsg output to stderr 2024-05-14 09:49:00 -07:00
zfs_delay.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_file.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_fuid.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_impl.h Add generic implementation handling and SHA2 impl 2023-03-02 13:52:21 -08:00
zfs_ioctl.h Parallel pool import 2024-04-22 09:42:38 -07:00
zfs_ioctl_impl.h Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zfs_onexit.h zfs_onexit_add_cb: make action_handle point to a uintptr_t 2022-11-03 09:52:12 -07:00
zfs_project.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_quota.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_racct.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_refcount.h Switch refcount tracking from lists to AVL-trees. 2023-06-14 08:02:27 -07:00
zfs_rlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sa.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_stat.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sysfs.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vfsops.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vnops.h BRT: Fix FICLONE/FICLONERANGE shortened copy 2024-02-05 16:44:45 -08:00
zfs_znode.h ZIL: Cleanup sync and commit handling 2023-10-30 14:51:56 -07:00
zia.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zia_cddl.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zia_private.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zil.h zil: add stats for commit failure/fallback (#16315) 2024-07-25 16:53:59 -07:00
zil_impl.h ZIL: Improve next log block size prediction 2023-12-21 10:54:44 -08:00
zio.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zio_checksum.h Don't emit cksum_{actual_expected} in ereport.fs.zfs.checksum events 2023-07-21 11:49:26 -07:00
zio_compress.h ZFS Interface for Accelerators (Z.I.A.) 2024-09-10 15:14:15 +00:00
zio_crypt.h Enable -Wwrite-strings 2022-06-29 14:08:54 -07:00
zio_impl.h value strings: pretty printers for flags and enums 2024-09-05 13:40:05 -07:00
zio_priority.h value strings: pretty printers for flags and enums 2024-09-05 13:40:05 -07:00
zrlock.h Pack zrlock_t by 8 bytes 2023-01-05 09:31:55 -08:00
zthr.h Avoid memory allocations in the ARC eviction thread 2022-01-21 10:28:13 -08:00
zvol.h zvol: fix delayed update to block device ro entry 2023-10-31 09:50:38 -07:00
zvol_impl.h zvol: ensure device minors are properly cleaned up 2024-08-06 12:08:14 -07:00