OpenZFS on Linux and FreeBSD
Go to file
Rob Norris e35088c93f ZIO: add "vdev tracing" facility; use it for ZIL flushing
A problem with zio_flush() is that it issues a flush ZIO to a top-level
vdev, which then recursively issues child flush ZIOs until the real leaf
devices are flushed. As usual, an error in a child ZIO results in the
parent ZIO erroring too, so if a leaf device has failed, it's flush ZIO
will fail, and so will the entire flush operation.

This didn't matter when we used to ignore flush errors, but now that we
propagate them, the flush error propagates into the ZIL write ZIO. This
causes the ZIL to believe its write failed, and fall back to a full txg
wait. This still provides correct behaviour for zil_commit() callers (eg
fsync()) but it ruins performance.

We cannot simply skip flushing failed vdevs, because the associated
write may have succeeded before the vdev failed, which would give the
appearance the write is fully flushed when it is not. Neither can we
issue a "syncing write" to the device (eg SCSI FUA), as this also
degrades performance.

The answer is that we must bind writes and flushes together in a way
such that we only flush the physical devices that we wrote to.

This adds a "vdev tracing" facility to ZIOs, zio_vdev_trace. When
enabled on a ZIO with ZIO_FLAG_VDEV_TRACE, then upon successful
completion (in the _done handler), zio->io_vdev_trace_tree will have a
list of zio_vdev_trace_t objects that each describe a vdev that was
involved in the successful completion of the ZIO.

A companion function, zio_vdev_trace_flush(), is included, that issues a
flush ZIO to the child vdevs on the given trace tree.
zil_lwb_write_done() is updated to use this to bind ZIL writes and
flushes together.

The tracing facility is similar in many ways to the "deferred flushing"
facility inside the ZIL, to the point where it can replace it. Now, if
the flush should be deferred, the trace records from the writing ZIO are
captured and combined with any captured from previous writes. When its
finally time to issue the flush, we issue it to the entire accumulated
set of traced vdevs.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
2024-08-17 14:43:36 +10:00
.github Github workflow: fix typo in `zloop` artifact 2024-08-09 16:49:19 -07:00
cmd ddt: dedup log 2024-08-16 12:03:35 -07:00
config Linux 6.11: avoid passing "end" sentinel to register_sysctl() 2024-08-13 17:47:22 -07:00
contrib Updating bash completion build file 2024-08-08 15:39:25 -07:00
etc etc/init.d: decide which variant to use at build time. 2024-04-08 16:52:24 -07:00
include ZIO: add "vdev tracing" facility; use it for ZIL flushing 2024-08-17 14:43:36 +10:00
lib ddt: dedup log 2024-08-16 12:03:35 -07:00
man Enable L2 cache of all (MRU+MFU) metadata but MFU data only 2024-08-16 13:34:07 -07:00
module ZIO: add "vdev tracing" facility; use it for ZIL flushing 2024-08-17 14:43:36 +10:00
rpm Linux 6.10 compat: fix rpm-kmod and builtin 2024-08-15 14:00:18 -07:00
scripts disable automatic dependency tracking for dkms builds 2024-06-13 18:08:49 -07:00
tests ZTS: ZIL txg sync fallback test 2024-08-17 14:43:36 +10:00
udev udev: correctly handle partition #16 and later 2024-03-21 16:38:24 -07:00
.cirrus.yml CI: add FreeBSD build with Cirrus CI 2023-10-06 08:50:26 -07:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Packaging: Auto-generate changelog during configure (#15528) 2023-11-16 08:58:47 -08:00
.gitmodules .gitmodules: link to openzfs github repository 2021-04-12 09:37:23 -07:00
.mailmap AUTHORS: refresh with recent new contributors (#16362) 2024-07-23 11:47:04 -07:00
AUTHORS AUTHORS: refresh with recent new contributors (#16362) 2024-07-23 11:47:04 -07:00
CODE_OF_CONDUCT.md Documentation corrections 2022-12-22 11:34:28 -08:00
COPYRIGHT Fix typos 2020-06-09 21:24:09 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
META Linux 6.9 compat: META (#16358) 2024-07-16 16:27:54 -07:00
Makefile.am Process `script` directory for all configs 2022-10-27 16:45:14 -07:00
NEWS Fix NEWS file 2020-08-26 21:44:41 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md FreeBSD: remove support for FreeBSD < 13.0-RELEASE (#16372) 2024-08-05 16:56:45 -07:00
RELEASES.md Add RELEASES.md file 2021-04-02 16:33:40 -07:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
autogen.sh Ubuntu 22.04 integration: ShellCheck 2022-11-18 11:24:48 -08:00
configure.ac Packaging: Auto-generate changelog during configure (#15528) 2023-11-16 08:58:47 -08:00
copy-builtin copy-builtin: add hooks with sed/>> 2022-05-10 10:17:43 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

README.md

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

Installation

Full documentation for installing OpenZFS on your favorite operating system can be found at the Getting Started Page.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.
  • Supported FreeBSD versions are any supported branches and releases starting from 13.0-RELEASE.